Unsupervised Grammatical Pattern Discovery from Arabic Extra Large Corpora

Adelle Abdallah; Hussein Awdeh; Youssef Zaki; Gilles Bernard; Mohammad Hajjar

Topics: Self-Organizing Maps (SOM) and Self-organizing Systems; Stochastic Learning and Statistical Algorithms; Support Vector Machines and Kernel Methods

In Proceedings of the 13th International Joint Conference on Computational Intelligence - Volume 1: NCTA, 211-220, 2021

Authors: Adelle Abdallah ¹; Hussein Awdeh ¹; Youssef Zaki ¹; Gilles Bernard ¹ and Mohammad Hajjar ²

Affiliations: ¹ LIASD Lab, Paris 8 University, 2 rue de la Liberté 93526 Saint-Denis, Cedex, France; ² Faculty of Technology, Lebanese University, Hisbeh Street, Saida, Lebanon

Keyword(s): Arabic Language, Arabic Natural Language Process, Validation Information Retrieval, Silver Standard Corpus.

Abstract: Many methods have been applied to the automatic construction or expansion of lexical semantic resources. Most follow the distributional hypothesis applied to the lexical context of words, eliminating grammatical context (stopwords). This paper shows that grammatical context can yield information about the semantic properties of words, provided the corpus is large enough. To this end, we present an unsupervised pattern-based model that builds semantic word categories from large corpora and is devised for resource-poor languages. We divide the vocabulary into high-frequency and lower-frequency items, and explore the patterns formed by high-frequency items in the neighborhood of lower-frequency words. Word categories are then created by clustering. This is done on a very large Arabic corpus and, for comparison, on a large English corpus; results are evaluated with direct and indirect evaluation methods. We compare the results with state-of-the-art lexical models for performance and computation time.
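
The pipeline outlined in the abstract (split the vocabulary by frequency, collect patterns of high-frequency items around lower-frequency words, then cluster) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the `n_high` and `window` parameters, the `*` wildcard for non-frequent neighbors, and the greedy cosine grouping, which only stands in for whatever clustering method the paper actually uses.

```python
import math
from collections import Counter, defaultdict


def pattern_features(tokens, n_high=4, window=1):
    """Describe each lower-frequency word by the high-frequency patterns around it.

    Hypothetical helper: the n_high most frequent tokens count as
    "high-frequency"; every other token is a lower-frequency word.
    """
    freq = Counter(tokens)
    high = {word for word, _ in freq.most_common(n_high)}
    features = defaultdict(Counter)
    for i, word in enumerate(tokens):
        if word in high:
            continue
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        # Keep high-frequency neighbors, mask the others with a wildcard.
        pattern = (
            tuple(t if t in high else "*" for t in left)
            + ("_",)
            + tuple(t if t in high else "*" for t in right)
        )
        features[word][pattern] += 1
    return dict(features)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_clusters(features, threshold=0.5):
    """Stand-in clustering: join the first cluster whose seed word is similar enough."""
    clusters = []
    for word in sorted(features):
        for cluster in clusters:
            if cosine(features[word], features[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


# Toy usage on a tiny English corpus (illustrative data, not from the paper).
tokens = "the cat is on the mat . the dog is on the rug .".split()
clusters = greedy_clusters(pattern_features(tokens))
print(clusters)
```

On this toy input, words that share the same surrounding high-frequency pattern end up grouped together, which is the intuition behind using grammatical context as a signal; a real corpus would need far more data and a proper clustering algorithm.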

CC BY-NC-ND 4.0

Paper citation in several formats:

Abdallah, A., Awdeh, H., Zaki, Y., Bernard, G. and Hajjar, M. (2021). Unsupervised Grammatical Pattern Discovery from Arabic Extra Large Corpora. In Proceedings of the 13th International Joint Conference on Computational Intelligence (IJCCI 2021) - NCTA; ISBN 978-989-758-534-0; ISSN 2184-3236, SciTePress, pages 211-220. DOI: 10.5220/0010651700003063

@conference{ncta21,
author={Adelle Abdallah and Hussein Awdeh and Youssef Zaki and Gilles Bernard and Mohammad Hajjar},
title={Unsupervised Grammatical Pattern Discovery from Arabic Extra Large Corpora},
booktitle={Proceedings of the 13th International Joint Conference on Computational Intelligence (IJCCI 2021) - NCTA},
year={2021},
pages={211-220},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010651700003063},
isbn={978-989-758-534-0},
issn={2184-3236},
}

TY - CONF
JO - Proceedings of the 13th International Joint Conference on Computational Intelligence (IJCCI 2021) - NCTA
TI - Unsupervised Grammatical Pattern Discovery from Arabic Extra Large Corpora
SN - 978-989-758-534-0
IS - 2184-3236
AU - Abdallah, A.
AU - Awdeh, H.
AU - Zaki, Y.
AU - Bernard, G.
AU - Hajjar, M.
PY - 2021
SP - 211
EP - 220
DO - 10.5220/0010651700003063
PB - SciTePress