7 CONCLUSION AND PERSPECTIVES
In this paper, we have described a formalization of
the enrichment method, which consists of adding
syntactic properties to the existing annotation. The
method is based on three main phases: the
formalization of the enrichment problem, the
induction of the GP, and the regeneration of the ATB
with property annotations. The heart of the
enrichment method lies in the third phase, which
verifies whether each ATB phrase satisfies the GP
properties. The verification result is then used to
enrich the ATB. Our experiments gave good results,
and the enriched ATB contains various properties of
different types.
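The verification step of the third phase can be illustrated with a minimal sketch. It is not the implementation used in this work: the `Phrase` class, the `enrich` function, the grammar encoding, and the category labels are all hypothetical, and the semantics of the four properties shown (linearity, exclusion, obligation, uniqueness) are deliberately simplified.

```python
from dataclasses import dataclass, field

@dataclass
class Phrase:
    """Hypothetical stand-in for one treebank phrase."""
    label: str                      # phrase category, e.g. "NP"
    children: list                  # ordered daughter category labels
    properties: dict = field(default_factory=dict)

def linearity(children, a, b):
    # Simplified: every a precedes every b; vacuously true if either is absent.
    pos_a = [i for i, c in enumerate(children) if c == a]
    pos_b = [i for i, c in enumerate(children) if c == b]
    return not pos_a or not pos_b or max(pos_a) < min(pos_b)

def exclusion(children, a, b):
    # Simplified: a and b must not co-occur in the same phrase.
    return not (a in children and b in children)

def obligation(children, a):
    # Simplified: the obligatory category a must be present.
    return a in children

def uniqueness(children, a):
    # Simplified: a occurs at most once.
    return children.count(a) <= 1

CHECKS = {
    "linearity": linearity,
    "exclusion": exclusion,
    "obligation": obligation,
    "uniqueness": uniqueness,
}

def enrich(phrase, grammar):
    """Check every property the grammar lists for the phrase's category
    and record the outcome on the phrase itself."""
    satisfied, violated = [], []
    for kind, *args in grammar.get(phrase.label, []):
        holds = CHECKS[kind](phrase.children, *args)
        (satisfied if holds else violated).append((kind, *args))
    phrase.properties = {"satisfied": satisfied, "violated": violated}
    return phrase

# Illustrative grammar fragment (assumed, not induced from the ATB).
grammar = {
    "NP": [
        ("linearity", "DET", "NOUN"),
        ("exclusion", "PRON", "NOUN"),
        ("obligation", "NOUN"),
        ("uniqueness", "DET"),
    ]
}

enriched = enrich(Phrase("NP", ["DET", "NOUN", "ADJ"]), grammar)
print(enriched.properties)
```

Storing both the satisfied and the violated properties on each phrase keeps the result of the verification available as annotation, which is the role the verification result plays in the enrichment.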
As perspectives, in order to offer a more precise
representation of the syntactic information in the
ATB, we can enrich or improve the set of relations
present in the induced GP, for example by proposing
an interpretation of the dependency property or by
modifying the description of the obligation and
exclusion properties. In future work, we can optimize
our enrichment method by integrating several control
mechanisms into the determination of syntactic
categories and the verification of their properties.
We can go further by applying our enrichment method
to other annotated corpora obtained from existing
parsers.