Integrating Semi-structured Information using Semantic Technologies - An Evaluation of Tools and a Case Study on University Rankings Data

Alejandra Casas-Bayona, Hector G. Ceballos

Abstract

Information integration is not a trivial activity. Information managers face problems such as heterogeneity (in data, schemas, syntax and platforms), distribution and duplication. In this paper we: 1) analyze ontology-based methodologies that provide mediation frameworks for integrating and reconciling information from structured data sources, and 2) propose the use of available semantic technologies for replicating such functionality. Our aim is to provide an agile method for integrating and reconciling information from semi-structured data (spreadsheets) and to determine to what extent available semantic technologies minimize the need for ontological expertise in information integration. We present our findings and lessons learned from a case study on university rankings data.



Paper Citation


in Harvard Style

Casas-Bayona A. and Ceballos H. (2014). Integrating Semi-structured Information using Semantic Technologies - An Evaluation of Tools and a Case Study on University Rankings Data. In Proceedings of 3rd International Conference on Data Management Technologies and Applications - Volume 1: DATA, ISBN 978-989-758-035-2, pages 357-364. DOI: 10.5220/0005004203570364


in Bibtex Style

@conference{data14,
author={Alejandra Casas-Bayona and Hector G. Ceballos},
title={Integrating Semi-structured Information using Semantic Technologies - An Evaluation of Tools and a Case Study on University Rankings Data},
booktitle={Proceedings of 3rd International Conference on Data Management Technologies and Applications - Volume 1: DATA},
year={2014},
pages={357-364},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005004203570364},
isbn={978-989-758-035-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of 3rd International Conference on Data Management Technologies and Applications - Volume 1: DATA
TI - Integrating Semi-structured Information using Semantic Technologies - An Evaluation of Tools and a Case Study on University Rankings Data
SN - 978-989-758-035-2
AU - Casas-Bayona A.
AU - Ceballos H.
PY - 2014
SP - 357
EP - 364
DO - 10.5220/0005004203570364