BUILDING TAILORED ONTOLOGIES FROM VERY LARGE KNOWLEDGE RESOURCES

Victoria Nebot, Rafael Berlanga

Abstract

Very large domain knowledge resources are now being developed in domains such as biomedicine. Users and applications can benefit enormously from these repositories in tasks such as visualization, vocabulary homogenization and classification. However, because of their large size and lack of formal semantics, they cannot be properly managed and exploited. Instead, small and useful logic-based ontologies must be derived from these large knowledge resources so that they become manageable and users can benefit from the encoded semantics. In this work we present a novel framework for efficiently indexing these resources and generating ontologies according to user requirements. The generated ontologies are encoded using OWL logic-based axioms, which provides them with reasoning capabilities. The framework relies on an interval labeling scheme that efficiently manages the transitive relationships present in the domain knowledge resources. We have evaluated the proposed framework over the Unified Medical Language System (UMLS). Results show very good performance and scalability, demonstrating the applicability of the proposed framework in real scenarios.
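
To illustrate the general idea behind interval labeling, the following is a minimal sketch and not the framework described in the paper. It assumes the hierarchy is a tree (each concept has a single parent), whereas resources such as UMLS contain concepts with several parents and would need a richer scheme. The function names `label_intervals` and `is_descendant`, as well as the toy concept names, are hypothetical. Each node receives an interval from a depth-first traversal, and a transitive "is below" query reduces to an interval containment test, with no graph traversal at query time.

```python
from collections import defaultdict


def label_intervals(edges, root):
    """Assign each node a (start, end) interval via depth-first traversal.

    edges: iterable of (parent, child) pairs; assumed to form a tree.
    A node's interval spans its own number and those of all its descendants.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    labels = {}
    counter = 0
    start = {root: counter}
    # Iterative DFS: each stack entry is (node, iterator over its children).
    stack = [(root, iter(children[root]))]
    while stack:
        node, pending = stack[-1]
        child = next(pending, None)
        if child is None:
            # All descendants numbered: close the interval for this node.
            stack.pop()
            labels[node] = (start[node], counter)
        else:
            counter += 1
            start[child] = counter
            stack.append((child, iter(children[child])))
    return labels


def is_descendant(labels, a, b):
    """True if a is b or lies below b, i.e. a's interval is inside b's."""
    a_start, a_end = labels[a]
    b_start, b_end = labels[b]
    return b_start <= a_start and a_end <= b_end


# Toy hierarchy (hypothetical concept names, not taken from UMLS).
toy_edges = [
    ("Disease", "Infection"),
    ("Infection", "Viral infection"),
    ("Disease", "Neoplasm"),
]
toy_labels = label_intervals(toy_edges, "Disease")
```

With these labels, `is_descendant(toy_labels, "Viral infection", "Disease")` holds although the two concepts are only transitively related, while `is_descendant(toy_labels, "Neoplasm", "Infection")` does not.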


Paper Citation


in Harvard Style

Nebot V. and Berlanga R. (2009). BUILDING TAILORED ONTOLOGIES FROM VERY LARGE KNOWLEDGE RESOURCES. In Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-8111-85-2, pages 144-151. DOI: 10.5220/0001984001440151


in Bibtex Style

@conference{iceis09,
author={Victoria Nebot and Rafael Berlanga},
title={BUILDING TAILORED ONTOLOGIES FROM VERY LARGE KNOWLEDGE RESOURCES},
booktitle={Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2009},
pages={144-151},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001984001440151},
isbn={978-989-8111-85-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - BUILDING TAILORED ONTOLOGIES FROM VERY LARGE KNOWLEDGE RESOURCES
SN - 978-989-8111-85-2
AU - Nebot V.
AU - Berlanga R.
PY - 2009
SP - 144
EP - 151
DO - 10.5220/0001984001440151