IMPROVING REAL WORLD SCHEMA MATCHING WITH DECOMPOSITION PROCESS

Sana Sellami, Aïcha-Nabila Benharkat, Youssef Amghar, Frédéric Flouvat

Abstract

This paper addresses a difficult problem: matching large XML schemas. Matching at this scale incurs long execution times and also lowers the quality of the matches. We propose an XML schema decomposition approach as a solution to the large schema matching problem. The approach identifies the common structures between and within XML schemas and decomposes the input schemas. Our method uses tree mining techniques to identify these common structures and to select the most relevant sub-parts of large schemas for matching. As shown by our experiments in the e-business domain, the proposed approach improves the performance of schema matching and yields better-quality matches than other existing matching tools.
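
To make the idea of "common structures" concrete, the following is a minimal sketch and not the tree mining algorithm used in the paper. It assumes each schema is reduced to a toy tree of `(element_name, children)` tuples and looks only for identical complete subtrees, ignoring child order. The names `canonical`, `common_structures`, and the sample schemas are hypothetical and chosen for illustration.

```python
from collections import Counter

# Assumption: a schema node is (element_name, [child nodes]),
# a toy stand-in for a parsed XML schema tree.
def canonical(node, seen):
    """Return an order-insensitive string for the subtree rooted at node,
    counting every non-leaf subtree in `seen`."""
    name, children = node
    if not children:
        return name
    key = name + "(" + ",".join(sorted(canonical(c, seen) for c in children)) + ")"
    seen[key] += 1
    return key

def common_structures(schema_a, schema_b):
    """Non-leaf subtrees (as canonical strings) that occur in both schemas."""
    seen_a, seen_b = Counter(), Counter()
    canonical(schema_a, seen_a)
    canonical(schema_b, seen_b)
    return sorted(set(seen_a) & set(seen_b))

# Hypothetical e-business style schemas sharing an Address structure.
address = ("Address", [("Street", []), ("City", []), ("Zip", [])])
order = ("Order", [("Buyer", [("Name", []), address]), ("Item", [])])
invoice = ("Invoice", [("Seller", [address]), ("Total", [])])

print(common_structures(order, invoice))
```

In this simplified setting, a shared subtree such as the address block is a candidate sub-part to match once rather than comparing every element of one schema against every element of the other. A real tree mining technique would also handle partial and approximately similar subtrees, which this sketch does not.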



Paper Citation


in Harvard Style

Sellami S., Benharkat A., Amghar Y. and Flouvat F. (2010). IMPROVING REAL WORLD SCHEMA MATCHING WITH DECOMPOSITION PROCESS. In Proceedings of the 12th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-8425-04-1, pages 151-158. DOI: 10.5220/0002887001510158


in Bibtex Style

@conference{iceis10,
author={Sana Sellami and Aïcha-Nabila Benharkat and Youssef Amghar and Frédéric Flouvat},
title={IMPROVING REAL WORLD SCHEMA MATCHING WITH DECOMPOSITION PROCESS},
booktitle={Proceedings of the 12th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2010},
pages={151-158},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002887001510158},
isbn={978-989-8425-04-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - IMPROVING REAL WORLD SCHEMA MATCHING WITH DECOMPOSITION PROCESS
SN - 978-989-8425-04-1
AU - Sellami S.
AU - Benharkat A.
AU - Amghar Y.
AU - Flouvat F.
PY - 2010
SP - 151
EP - 158
DO - 10.5220/0002887001510158