USING LINGUISTIC TECHNIQUES FOR SCHEMA MATCHING

Ozgul Unal, Hamideh Afsarmanesh

Abstract

To deal with semantic and schematic heterogeneity in collaborative networks, the matching components of different database schemas need to be identified and the heterogeneity resolved by creating the corresponding mappings, in a process called schema matching. One important step in this process is identifying the syntactic and semantic similarity among elements from different schemas, usually referred to as Linguistic Matching. This paper focuses on the Linguistic Matching component of SASMINT, a schema matching and integration system. Unlike other systems, which typically use only a limited number of similarity metrics, SASMINT makes effective use of NLP techniques for Linguistic Matching and proposes a weighted combination of several syntactic and semantic similarity metrics. Because it is not easy for the user to determine the weights, SASMINT also provides a novel component called Sampler, which supports the automatic generation of weights.
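The idea of a weighted combination of similarity metrics can be illustrated with a minimal sketch. This is not SASMINT's implementation: the two metrics chosen here (normalized Levenshtein similarity and Jaccard similarity over name tokens), the function names, and the example weights are all assumptions made for illustration, and only syntactic metrics are shown.

```python
from typing import Callable, List, Tuple

Metric = Callable[[str, str], float]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance normalized to a similarity in [0, 1] (case-insensitive)."""
    a, b = a.lower(), b.lower()
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def jaccard_token_similarity(a: str, b: str) -> float:
    """Jaccard similarity over underscore-separated tokens of two names."""
    ta = {t for t in a.lower().split("_") if t}
    tb = {t for t in b.lower().split("_") if t}
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def weighted_similarity(a: str, b: str,
                        metrics: List[Tuple[Metric, float]]) -> float:
    """Weighted average of several metrics; weights need not sum to 1."""
    total = sum(weight for _, weight in metrics)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weight * metric(a, b) for metric, weight in metrics) / total


# Hypothetical weights; the abstract says choosing them by hand is not easy.
EXAMPLE_METRICS = [(levenshtein_similarity, 0.6),
                   (jaccard_token_similarity, 0.4)]

if __name__ == "__main__":
    print(weighted_similarity("customer_name", "name_customer", EXAMPLE_METRICS))
```

In this sketch the weights are fixed by hand, which is exactly the step the abstract describes as difficult for users and which the Sampler component is meant to automate.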



Paper Citation


in Harvard Style

Unal O. and Afsarmanesh H. (2006). USING LINGUISTIC TECHNIQUES FOR SCHEMA MATCHING. In Proceedings of the First International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-972-8865-69-6, pages 115-120. DOI: 10.5220/0001319801150120


in Bibtex Style

@conference{icsoft06,
author={Ozgul Unal and Hamideh Afsarmanesh},
title={USING LINGUISTIC TECHNIQUES FOR SCHEMA MATCHING},
booktitle={Proceedings of the First International Conference on Software and Data Technologies - Volume 2: ICSOFT},
year={2006},
pages={115-120},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001319801150120},
isbn={978-972-8865-69-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the First International Conference on Software and Data Technologies - Volume 2: ICSOFT
TI - USING LINGUISTIC TECHNIQUES FOR SCHEMA MATCHING
SN - 978-972-8865-69-6
AU - Unal O.
AU - Afsarmanesh H.
PY - 2006
SP - 115
EP - 120
DO - 10.5220/0001319801150120