A METHOD FOR EARLY CORRESPONDENCE DISCOVERY USING INSTANCE DATA

Indrakshi Ray, C. J. Michael Geisterfer

Abstract

Most research in database integration has focused on matching schema-level information to determine the correspondences between data concepts in the component databases. Such research relies on the availability of schema experts, schema documentation, and well-designed schemas, items that are often not available. We propose a method of initial instance-based correspondence discovery that greatly reduces the manual effort involved in current integration processes. The gains are accomplished because the proposed method uses only instance data (a body of database knowledge that is always available) to make its initial discoveries.



Paper Citation


in Harvard Style

Ray, I. and Geisterfer, C. J. M. (2007). A METHOD FOR EARLY CORRESPONDENCE DISCOVERY USING INSTANCE DATA. In Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-972-8865-88-7, pages 258-263. DOI: 10.5220/0002350102580263


in Bibtex Style

@conference{iceis07,
author={Indrakshi Ray and C. J. Michael Geisterfer},
title={A METHOD FOR EARLY CORRESPONDENCE DISCOVERY USING INSTANCE DATA},
booktitle={Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2007},
pages={258-263},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002350102580263},
isbn={978-972-8865-88-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A METHOD FOR EARLY CORRESPONDENCE DISCOVERY USING INSTANCE DATA
SN - 978-972-8865-88-7
AU - Ray, I.
AU - Geisterfer, C. J. Michael
PY - 2007
SP - 258
EP - 263
DO - 10.5220/0002350102580263