AN ANALYSIS OF RELATIONAL STORAGE STRATEGIES FOR PARTIALLY STRUCTURED XML

Yasser Abdel Kader, Barry Eaglestone, Siobhán North

Abstract

This paper presents a performance analysis of strategies for storing XML data sets in relational databases, focusing on data sets that combine structured and semi-structured data. The analysis demonstrates the advantages of a hybrid approach that combines structure mapping with XML data type instances. However, problems remain with current technology in scaling the approach to large data sets. The analysis also identifies anomalous results and a threshold at which the cost of data shredding outweighs the advantages of structure mapping.
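
As a rough illustration of the hybrid idea named in the abstract, the sketch below shreds the regular fields of a record into relational columns and keeps the irregular subtree as an XML fragment in a single column. It is not the authors' implementation: the sample document, the `article` table, the `notes_xml` column and the `store_hybrid` helper are all hypothetical, and SQLite's `TEXT` type stands in for a native XML data type, which SQLite does not provide.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical record: regular fields (title, year) plus an irregular <notes> subtree.
DOC = """<article id="a1">
  <title>Sample title</title>
  <year>2008</year>
  <notes>Free text with <i>mixed</i> content and <ref>optional</ref> parts.</notes>
</article>"""


def store_hybrid(conn, xml_text):
    """Shred the regular fields into columns; keep the irregular subtree as XML text."""
    root = ET.fromstring(xml_text)
    notes = root.find("notes")
    # Assumption: a TEXT column stands in for an XML data type instance.
    fragment = ET.tostring(notes, encoding="unicode") if notes is not None else None
    conn.execute(
        "INSERT INTO article (doc_id, title, year, notes_xml) VALUES (?, ?, ?, ?)",
        (root.get("id"), root.findtext("title"), int(root.findtext("year")), fragment),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article ("
    "doc_id TEXT PRIMARY KEY, title TEXT, year INTEGER, notes_xml TEXT)"
)
store_hybrid(conn, DOC)

# Structured part: plain SQL over the shredded columns.
title, notes_xml = conn.execute(
    "SELECT title, notes_xml FROM article WHERE year = 2008"
).fetchone()

# Semi-structured part: parse the stored fragment and navigate it.
refs = [r.text for r in ET.fromstring(notes_xml).iter("ref")]
```

The trade-off the abstract points to shows up even in this toy: every shredded field costs parsing and insertion work at load time, while every field left inside the fragment has to be parsed again at query time.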



Paper Citation


in Harvard Style

Abdel Kader Y., Eaglestone B. and North S. (2008). AN ANALYSIS OF RELATIONAL STORAGE STRATEGIES FOR PARTIALLY STRUCTURED XML. In Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8111-26-5, pages 165-170. DOI: 10.5220/0001525601650170


in Bibtex Style

@conference{webist08,
author={Yasser Abdel Kader and Barry Eaglestone and Siobhán North},
title={AN ANALYSIS OF RELATIONAL STORAGE STRATEGIES FOR PARTIALLY STRUCTURED XML},
booktitle={Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2008},
pages={165-170},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001525601650170},
isbn={978-989-8111-26-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - AN ANALYSIS OF RELATIONAL STORAGE STRATEGIES FOR PARTIALLY STRUCTURED XML
SN - 978-989-8111-26-5
AU - Abdel Kader Y.
AU - Eaglestone B.
AU - North S.
PY - 2008
SP - 165
EP - 170
DO - 10.5220/0001525601650170