Towards the Enrichment of Arabic WordNet with Big Corpora

Georges Lebboss; Gilles Bernard; Noureddine Aliane; Mohammad Hajjar

Topics: Applications: Image Processing and Artificial Vision, Pattern Recognition, Decision Making, Industrial and Real World applications, Financial Applications, Neural Prostheses and Medical Applications, Neural based Data Mining and Complex Information Processing, Neural Network Software and Applications, Applications of Deep Neural networks, Robotics and Control Applications; Learning Paradigms and Algorithms; Self-Organization and Emergence

In Proceedings of the 9th International Joint Conference on Computational Intelligence - Volume 0: IJCCI, 101-109, 2017, Funchal, Madeira, Portugal

Authors: Georges Lebboss ¹ ; Gilles Bernard ¹ ; Noureddine Aliane ¹ and Mohammad Hajjar ²

Affiliations: ¹ LIASD and Paris 8 University, France ; ² Lebanese University and IUT, Lebanon

Keyword(s): Semantic Relations, Semantic Arabic Resources, Arabic WordNet, Synsets, Arabic Corpus, Data Preprocessing, Word Vectors, Word Classification, Self Organizing Maps.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Biomedical Engineering ; Biomedical Signal Processing ; Computational Intelligence ; Health Engineering and Technology Applications ; Human-Computer Interaction ; Learning Paradigms and Algorithms ; Methodologies and Methods ; Neural Networks ; Neurocomputing ; Neurotechnology, Electronics and Informatics ; Pattern Recognition ; Physiological Computing Systems ; Self-Organization and Emergence ; Sensor Networks ; Signal Processing ; Soft Computing ; Theory and Methods

Abstract: This paper presents a method for enriching Arabic WordNet with semantic clusters extracted from a large general corpus. Because Arabic is poor in open digital linguistic resources, we built such a corpus (more than 7.5 billion words) with ad hoc tools. We then applied GraPaVec, a new method for word vectorization that uses automatically generated frequency patterns, as well as the state-of-the-art Word2Vec and GloVe methods. The word vectors were fed to a Self-Organizing Map neural network model, and the resulting clusterings were evaluated against the existing Arabic WordNet synsets (sets of synonymous words). The evaluation yields an F-score of 82.1 % for GraPaVec, 55.1 % for Word2Vec's Skip-gram, 52.2 % for CBOW and 56.6 % for GloVe, which at least shows the value of the context that GraPaVec takes into account. We conclude by discussing parameters and possible biases.
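The abstract does not say how the F-score between clusters and synsets is computed, so the following is only a minimal sketch of one common way to score a clustering against reference synonym sets: a pairwise F-score, where a word pair counts as correct if both words share a cluster and also share a synset. The function names (`same_group_pairs`, `pairwise_f_score`) and the toy English words are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def same_group_pairs(groups):
    """Return the set of unordered word pairs that share at least one group."""
    pairs = set()
    for group in groups:
        # Sorting gives each pair a canonical order, so (a, b) == (b, a).
        for a, b in combinations(sorted(set(group)), 2):
            pairs.add((a, b))
    return pairs


def pairwise_f_score(clusters, synsets):
    """Pairwise precision, recall and F-score of clusters against synsets.

    Assumption: this pairwise scheme is an illustration only; the paper's
    actual evaluation protocol may differ.
    """
    predicted = same_group_pairs(clusters)
    reference = same_group_pairs(synsets)
    if not predicted or not reference:
        return 0.0, 0.0, 0.0
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical toy data: two clusters scored against three reference synsets.
clusters = [["car", "automobile", "bank"], ["river", "stream"]]
synsets = [["car", "automobile"], ["river", "stream"], ["bank", "shore"]]

precision, recall, f_score = pairwise_f_score(clusters, synsets)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_score:.3f}")
```

In this toy case, two of the four predicted pairs are also synset pairs, and two of the three synset pairs are recovered, so precision is 1/2, recall is 2/3 and the F-score is 4/7.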

CC BY-NC-ND 4.0

Paper citation in several formats:

Lebboss, G., Bernard, G., Aliane, N. and Hajjar, M. (2017). Towards the Enrichment of Arabic WordNet with Big Corpora. In Proceedings of the 9th International Joint Conference on Computational Intelligence (IJCCI 2017) - IJCCI; ISBN 978-989-758-274-5; ISSN 2184-3236, SciTePress, pages 101-109. DOI: 10.5220/0006505701010109

@conference{ijcci17,
author={Georges Lebboss and Gilles Bernard and Noureddine Aliane and Mohammad Hajjar},
title={Towards the Enrichment of Arabic WordNet with Big Corpora},
booktitle={Proceedings of the 9th International Joint Conference on Computational Intelligence (IJCCI 2017) - IJCCI},
year={2017},
pages={101-109},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006505701010109},
isbn={978-989-758-274-5},
issn={2184-3236},
}

TY - CONF

JO - Proceedings of the 9th International Joint Conference on Computational Intelligence (IJCCI 2017) - IJCCI
TI - Towards the Enrichment of Arabic WordNet with Big Corpora
SN - 978-989-758-274-5
IS - 2184-3236
AU - Lebboss, G.
AU - Bernard, G.
AU - Aliane, N.
AU - Hajjar, M.
PY - 2017
SP - 101
EP - 109
DO - 10.5220/0006505701010109
PB - SciTePress