TAXONOMY LEARNING FOR THE ROMANIAN LANGUAGE USING SOTA AND WORDNET

Viorica R. Chifu, Ioan Salomie, Emil Şt. Chifu, Corina Grumazescu

Abstract

Ontologies are widely used today in areas such as information retrieval, the Semantic Web, and NLP tasks, as well as for describing specific domains such as certain branches of medicine. While many tools exist for learning domain ontologies for English, few tools and resources are available for learning domain-specific ontologies for Romanian. Moreover, the complexity of Romanian grammar makes processing Romanian text corpora difficult. This paper focuses on building a domain-specific ontology for the Romanian language using machine learning techniques. The taxonomy learning process is based on an unsupervised neural network. The resulting modules are intended to be used for the semantic annotation of traceability services in the meat industry.



Paper Citation


in Harvard Style

Chifu V. R., Salomie I., Chifu E. Şt. and Grumazescu C. (2008). TAXONOMY LEARNING FOR THE ROMANIAN LANGUAGE USING SOTA AND WORDNET. In Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 2: WEBIST, ISBN 978-989-8111-27-2, pages 169-174. DOI: 10.5220/0001521601690174


in Bibtex Style

@conference{webist08,
author={Viorica R. Chifu and Ioan Salomie and Emil Şt. Chifu and Corina Grumazescu},
title={TAXONOMY LEARNING FOR THE ROMANIAN LANGUAGE USING SOTA AND WORDNET},
booktitle={Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 2: WEBIST},
year={2008},
pages={169-174},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001521601690174},
isbn={978-989-8111-27-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 2: WEBIST
TI - TAXONOMY LEARNING FOR THE ROMANIAN LANGUAGE USING SOTA AND WORDNET
SN - 978-989-8111-27-2
AU - Chifu V. R.
AU - Salomie I.
AU - Chifu E. Şt.
AU - Grumazescu C.
PY - 2008
SP - 169
EP - 174
DO - 10.5220/0001521601690174