UNORDERED TREE MATCHING AND TREE PATTERN QUERIES IN XML DATABASES

Yangjun Chen

Abstract

With the growing importance of XML in data exchange, much research has been done on providing flexible query facilities to extract data from structured XML documents. In this paper, we discuss an efficient algorithm for the tree mapping problem in XML databases based on unordered tree matching. Given a target tree T and a pattern tree Q, the algorithm can find all the embeddings of Q in T in O(|D||Q|) time, where D is the largest data stream associated with a node of Q. More importantly, the algorithm is index-oriented: with XB-trees constructed over the data streams, the number of disk accesses can be reduced substantially.
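The abstract does not spell out the algorithm, so the following is only a naive illustration of the problem it addresses. It is not the paper's O(|D||Q|) method and uses neither data streams nor XB-trees. The sketch assumes the usual tree pattern query semantics: labels must match, a pattern edge is either parent-child or ancestor-descendant, sibling order is ignored, and two pattern nodes may map to the same target node. The names `Node`, `child_edge`, `matches_at`, and `find_embedding_roots` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    # Assumption: only meaningful on pattern nodes. True means the edge from
    # the parent is parent-child ("/"); False means ancestor-descendant ("//").
    child_edge: bool = True


def descendants(t: Node) -> Iterator[Node]:
    """Yield every proper descendant of t in document order."""
    for c in t.children:
        yield c
        yield from descendants(c)


def matches_at(q: Node, t: Node) -> bool:
    """True if the pattern subtree rooted at q can be embedded with q mapped to t."""
    if q.label != t.label:
        return False
    for qc in q.children:
        candidates = t.children if qc.child_edge else descendants(t)
        # Each pattern child is checked independently of its siblings,
        # which is what makes the matching unordered.
        if not any(matches_at(qc, c) for c in candidates):
            return False
    return True


def find_embedding_roots(pattern: Node, target: Node) -> List[Node]:
    """Return every target node at which the pattern root can be matched."""
    nodes = [target, *descendants(target)]
    return [t for t in nodes if matches_at(pattern, t)]


# Usage: the target lists children as (c, b) while the pattern asks for (b, c).
target = Node("a", [Node("c"), Node("b", [Node("d")])])
pattern = Node("a", [Node("b"), Node("c")])
roots = find_embedding_roots(pattern, target)
```

This brute-force check can revisit the same pair of pattern node and target node many times, which is the kind of cost that a stream-based, index-oriented algorithm is meant to avoid.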



Paper Citation


in Harvard Style

Chen Y. (2009). UNORDERED TREE MATCHING AND TREE PATTERN QUERIES IN XML DATABASES. In Proceedings of the 4th International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-989-674-010-8, pages 191-198. DOI: 10.5220/0002238801910198


in Bibtex Style

@conference{icsoft09,
author={Yangjun Chen},
title={UNORDERED TREE MATCHING AND TREE PATTERN QUERIES IN XML DATABASES},
booktitle={Proceedings of the 4th International Conference on Software and Data Technologies - Volume 2: ICSOFT},
year={2009},
pages={191-198},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002238801910198},
isbn={978-989-674-010-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Conference on Software and Data Technologies - Volume 2: ICSOFT
TI - UNORDERED TREE MATCHING AND TREE PATTERN QUERIES IN XML DATABASES
SN - 978-989-674-010-8
AU - Chen Y.
PY - 2009
SP - 191
EP - 198
DO - 10.5220/0002238801910198