MATCHING OF ENHANCED XML SCHEMAS WITH A MEASURE OF STRUCTURAL-CONTEXT SIMILARITY

Amar Zerdazi, Myriam Lamolle

Abstract

Schema matching is a critical step in the integration of heterogeneous data sources. Recent integration work has mainly focused on developing matching techniques that find equivalent elements across different XML sources. In this paper we propose a new structural similarity measure, based on the notion of context, between entities of Enhanced XML Schemas (EXS). In our approach, the set of EXS schemas is treated as a federation of XML schemas derived from heterogeneous source schemas (relational, object, XML, etc.) and enriched with semantic metaknowledge. We present the major problems associated with this task, notably with regard to the semantics of schemas, and we propose a structural matching algorithm. The algorithm takes two schema graphs as input and produces as output a mapping between corresponding nodes of the schema graphs. After the algorithm runs, we expect a human to check and adjust the results.
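
The abstract does not spell out the similarity measure, so the following is only an illustrative sketch of what a context-based structural comparison between two schema nodes could look like, not the authors' algorithm. It assumes that schemas are simplified to trees, that a node's context is made of its ancestors, its children, and its leaves, and that the three contexts are compared by Jaccard overlap of element labels with equal weights. The names `Node`, `context_similarity`, and `best_match`, as well as the weights, are hypothetical.

```python
class Node:
    """A schema element in a tree-shaped schema (a simplifying assumption)."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def walk(node):
    """Yield the node and all its descendants in preorder."""
    yield node
    for child in node.children:
        yield from walk(child)


def ancestor_labels(node):
    labels = set()
    while node.parent is not None:
        node = node.parent
        labels.add(node.label)
    return labels


def child_labels(node):
    return {c.label for c in node.children}


def leaf_labels(node):
    """Labels of the leaves strictly below the node."""
    return {n.label for n in walk(node) if n is not node and not n.children}


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty contexts count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def context_similarity(n1, n2, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted mix of ancestor, child, and leaf context overlap (assumed weights)."""
    parts = (
        jaccard(ancestor_labels(n1), ancestor_labels(n2)),
        jaccard(child_labels(n1), child_labels(n2)),
        jaccard(leaf_labels(n1), leaf_labels(n2)),
    )
    return sum(w * p for w, p in zip(weights, parts))


def best_match(node, target_root):
    """Return the target node with the highest context similarity, and its score."""
    candidates = ((t, context_similarity(node, t)) for t in walk(target_root))
    return max(candidates, key=lambda pair: pair[1])


# Two small, made-up schemas that name the same concept differently.
order = Node("order")
customer = Node("customer", order)
Node("name", customer)
Node("address", customer)
Node("item", order)

purchase = Node("purchase")
client = Node("client", purchase)
for label in ("name", "address", "phone"):
    Node(label, client)
Node("item", purchase)

match, score = best_match(customer, purchase)
print(match.label, round(score, 3))
```

Here `customer` is paired with `client` although the two labels differ, because their child and leaf contexts overlap. Sibling leaves with identical contexts would tie under this sketch, which is one reason a proposed mapping still needs the human check mentioned above.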

Paper Citation


in Harvard Style

Zerdazi A. and Lamolle M. (2007). MATCHING OF ENHANCED XML SCHEMAS WITH A MEASURE OF STRUCTURAL-CONTEXT SIMILARITY. In Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-972-8865-77-1, pages 128-133. DOI: 10.5220/0001263801280133


in Bibtex Style

@conference{webist07,
author={Amar Zerdazi and Myriam Lamolle},
title={MATCHING OF ENHANCED XML SCHEMAS WITH A MEASURE OF STRUCTURAL-CONTEXT SIMILARITY},
booktitle={Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2007},
pages={128-133},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001263801280133},
isbn={978-972-8865-77-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - MATCHING OF ENHANCED XML SCHEMAS WITH A MEASURE OF STRUCTURAL-CONTEXT SIMILARITY
SN - 978-972-8865-77-1
AU - Zerdazi A.
AU - Lamolle M.
PY - 2007
SP - 128
EP - 133
DO - 10.5220/0001263801280133