DATA QUALITY IN XML DATABASES - A Methodology for Semi-structured Database Design Supporting Data Quality Issues

Eugenio Verbo, Ismael Caballero, Eduardo Fernandez-Medina, Mario Piattini



As XML has become widespread as a data exchange technology, the need for a technology that stores semi-structured data more efficiently has grown. Consequently, XML DBs have been created to store large numbers of XML documents. However, as with earlier data models such as the relational model, data quality has frequently been left aside. Since data plays a key role in managing organizational efficiency, its quality should be managed. To provide a basis for data quality management, our proposal adapts an XML DB development methodology so that it focuses on data quality. To do so, we have drawn on some key process areas of a Data Quality Maturity reference model for the definition of information management processes.



Paper Citation

in Harvard Style

Verbo E., Caballero I., Fernandez-Medina E. and Piattini M. (2007). DATA QUALITY IN XML DATABASES - A Methodology for Semi-structured Database Design Supporting Data Quality Issues. In Proceedings of the Second International Conference on Software and Data Technologies - Volume 3: ICSOFT, ISBN 978-989-8111-07-4, pages 117-122. DOI: 10.5220/0001337601170122
