ONTOLOGY BUILDING USING PARALLEL ENUMERATIVE STRUCTURES

Mouna Kamel, Bernard Rothenburger

Abstract

The semantics of a text is carried by both the natural language it contains and its layout. Since ontology building processes have so far taken only plain text into consideration, our aim is to elicit the structure of the text as well. We focus here on parallel enumerative structures because they bear implicit or explicit hierarchical relations, they have salient visual properties, and they are frequently found in corpora. We have defined a process which identifies them in a text, translates them into ontological structures, and finally links these structures to the concepts of an existing ontology. We have assessed this process on encyclopaedic articles from Wikipedia, as they are rich in definitions and statements and contain many enumerations. The many ontological structures obtained are thus used to enrich an ontology which we had built automatically from database specification documents.
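To make the three steps named in the abstract (identify, translate, link) more concrete, here is a minimal Python sketch. It is not the authors' implementation: the bullet-detection regex, the colon-terminated primer heuristic, the naive plural stripping, the "subClassOf" relation label, and all function names are assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass, field

# Assumption: an enumeration item is a line starting with a bullet or a number.
BULLET = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+(.*\S)\s*$")


@dataclass
class Enumeration:
    primer: str                      # the introductory line ending with ":"
    items: list[str] = field(default_factory=list)


def find_enumerations(text: str) -> list[Enumeration]:
    """Step 1 (hypothetical heuristic): a primer line ending with ':'
    followed by consecutive bulleted lines forms one enumeration."""
    enums, current, previous = [], None, ""
    for line in text.splitlines():
        match = BULLET.match(line)
        if match:
            if current is None and previous.endswith(":"):
                current = Enumeration(primer=previous)
                enums.append(current)
            if current is not None:
                current.items.append(match.group(1))
        elif line.strip():
            # Any non-empty, non-bullet line closes the current enumeration.
            current, previous = None, line.strip()
    return enums


def normalize(label: str) -> str:
    """Lowercase, drop a trailing colon, and strip one final 's'.
    This plural handling is deliberately naive (it would damage 'class')."""
    label = label.strip().rstrip(":").strip().lower()
    return label[:-1] if label.endswith("s") else label


def to_candidate_relations(enum: Enumeration) -> list[tuple[str, str, str]]:
    """Step 2: read each item as a candidate child of the primer concept."""
    parent = normalize(enum.primer)
    return [(normalize(item), "subClassOf", parent) for item in enum.items]


def link_to_ontology(relations, known_concepts):
    """Step 3: keep only relations whose parent matches an existing concept."""
    known = {normalize(concept) for concept in known_concepts}
    return [rel for rel in relations if rel[2] in known]


SAMPLE = """A database is an organised collection of data.
Database models:
- relational model
- hierarchical model
- network model
The relational model is the most widely used.
"""

enumerations = find_enumerations(SAMPLE)
relations = to_candidate_relations(enumerations[0])
linked = link_to_ontology(relations, {"Database model"})
```

In this sketch, linking is a plain label match against the existing ontology; a realistic system would need linguistic analysis of the primer and of each item, which the sketch does not attempt.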



Paper Citation


in Harvard Style

Kamel M. and Rothenburger B. (2010). ONTOLOGY BUILDING USING PARALLEL ENUMERATIVE STRUCTURES. In Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2010) ISBN 978-989-8425-29-4, pages 276-281. DOI: 10.5220/0003097602760281


in Bibtex Style

@conference{keod10,
author={Mouna Kamel and Bernard Rothenburger},
title={ONTOLOGY BUILDING USING PARALLEL ENUMERATIVE STRUCTURES},
booktitle={Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2010)},
year={2010},
pages={276-281},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003097602760281},
isbn={978-989-8425-29-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2010)
TI - ONTOLOGY BUILDING USING PARALLEL ENUMERATIVE STRUCTURES
SN - 978-989-8425-29-4
AU - Kamel M.
AU - Rothenburger B.
PY - 2010
SP - 276
EP - 281
DO - 10.5220/0003097602760281