Mapping Text Mining Taxonomies

Katja Pfeifer, Eric Peukert

Abstract

Huge amounts of textual information relevant for market analysis, trend detection, or product monitoring can be found on the Web. To make use of that information, a number of text mining services have been proposed that extract and categorize entities from a given text. Such services have individual strengths and weaknesses, so merging the results of multiple services can improve quality. Because each service categorizes extracted information with its own taxonomy, merging results requires mappings between the service taxonomies. In principle, these mappings can be computed with ontology matching systems. However, the metadata available within most taxonomies is weak, so ontology matching systems currently return insufficient results. In this paper, we propose a novel approach that enriches service taxonomies with instance information, which is crucial for finding mappings. Based on the instances found, we present a novel instance-based matching technique and metric that allow us to automatically identify equal, hierarchical, and associative mappings. These mappings can be used for merging the results of multiple extraction services. We evaluate our matching approach extensively on real-world service taxonomies and compare it to state-of-the-art approaches.
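
The abstract does not spell out the matching metric, so the following is only a minimal sketch of the general idea of instance-based matching, not the authors' method. It assumes that each taxonomy category has already been enriched with a set of instance strings, and it swaps in plain Jaccard overlap and set containment as stand-in measures. The function names, the relation labels, the thresholds, and the sample categories are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def classify(a: Set[str], b: Set[str],
             equal_t: float = 0.8,
             sub_t: float = 0.8,
             assoc_t: float = 0.2) -> str:
    """Classify the relation between two categories by instance overlap.

    Assumption: thresholds are illustrative and not taken from the paper.
    """
    if not a or not b:
        return "none"
    inter = len(a & b)
    jaccard = inter / len(a | b)   # symmetric overlap
    a_in_b = inter / len(a)        # share of a's instances also found in b
    b_in_a = inter / len(b)        # share of b's instances also found in a
    if jaccard >= equal_t:
        return "equal"
    if a_in_b >= sub_t and a_in_b > b_in_a:
        return "narrower"          # a is (mostly) contained in b
    if b_in_a >= sub_t and b_in_a > a_in_b:
        return "broader"           # b is (mostly) contained in a
    if jaccard >= assoc_t:
        return "associated"
    return "none"


def match_taxonomies(tax_a: Dict[str, Set[str]],
                     tax_b: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Compare every category pair and keep the pairs that are related."""
    mappings = []
    for cat_a, inst_a in tax_a.items():
        for cat_b, inst_b in tax_b.items():
            relation = classify(inst_a, inst_b)
            if relation != "none":
                mappings.append((cat_a, cat_b, relation))
    return mappings


# Hypothetical instance-enriched taxonomies of two extraction services.
service_a = {
    "Person": {"Angela Merkel", "Barack Obama", "Tim Cook"},
    "City": {"Berlin", "Paris"},
    "Company": {"Apple", "SAP", "Google"},
}
service_b = {
    "Human": {"Angela Merkel", "Barack Obama", "Tim Cook"},
    "Location": {"Berlin", "Paris", "France", "Germany", "Europe"},
    "Politician": {"Angela Merkel", "Barack Obama"},
    "Brand": {"Apple", "Google", "Nike", "Adidas"},
}

if __name__ == "__main__":
    for mapping in match_taxonomies(service_a, service_b):
        print(mapping)
```

In this sketch the relation is read from the first category's point of view: "narrower" means its instances are largely contained in the second category, "broader" means the reverse. Real service output would be noisier than these hand-made sets, so the thresholds would need tuning on actual data.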


Paper Citation


in Harvard Style

Pfeifer K. and Peukert E. (2013). Mapping Text Mining Taxonomies. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval and the International Conference on Knowledge Management and Information Sharing - Volume 1: KDIR, (IC3K 2013) ISBN 978-989-8565-75-4, pages 5-16. DOI: 10.5220/0004500400050016


in Bibtex Style

@conference{kdir13,
author={Katja Pfeifer and Eric Peukert},
title={Mapping Text Mining Taxonomies},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval and the International Conference on Knowledge Management and Information Sharing - Volume 1: KDIR, (IC3K 2013)},
year={2013},
pages={5-16},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004500400050016},
isbn={978-989-8565-75-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval and the International Conference on Knowledge Management and Information Sharing - Volume 1: KDIR, (IC3K 2013)
TI - Mapping Text Mining Taxonomies
SN - 978-989-8565-75-4
AU - Pfeifer K.
AU - Peukert E.
PY - 2013
SP - 5
EP - 16
DO - 10.5220/0004500400050016