USING NATURAL LANGUAGE PROCESSING FOR AUTOMATIC EXTRACTION OF ONTOLOGY INSTANCES

Carla Faria, Rosario Girardi, Ivo Serra, Maria Macedo, Djefferson Maranhão

Abstract

Ontologies are used by modern knowledge-based systems to represent and share knowledge about an application domain. Ontology population aims to identify instances of the concepts and relationships of an ontology. Manual population by domain experts and knowledge engineers is an expensive and time-consuming task, so automatic and semi-automatic approaches are needed. This article proposes an initial approach for automatic ontology population from textual sources that uses natural language processing and machine learning techniques. Experiments on a family law corpus were conducted to evaluate it. Initial results are promising and indicate that our approach can extract instances with good effectiveness.



Paper Citation


in Harvard Style

Faria C., Girardi R., Serra I., Macedo M. and Maranhão D. (2010). USING NATURAL LANGUAGE PROCESSING FOR AUTOMATIC EXTRACTION OF ONTOLOGY INSTANCES. In Proceedings of the 12th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-8425-05-8, pages 278-283. DOI: 10.5220/0002904402780283


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - USING NATURAL LANGUAGE PROCESSING FOR AUTOMATIC EXTRACTION OF ONTOLOGY INSTANCES
SN - 978-989-8425-05-8
AU - Faria C.
AU - Girardi R.
AU - Serra I.
AU - Macedo M.
AU - Maranhão D.
PY - 2010
SP - 278
EP - 283
DO - 10.5220/0002904402780283


in Bibtex Style

@conference{iceis10,
author={Carla Faria and Rosario Girardi and Ivo Serra and Maria Macedo and Djefferson Maranhão},
title={USING NATURAL LANGUAGE PROCESSING FOR AUTOMATIC EXTRACTION OF ONTOLOGY INSTANCES},
booktitle={Proceedings of the 12th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2010},
pages={278-283},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002904402780283},
isbn={978-989-8425-05-8},
}