STACK ENCODING REVISITED

Yangjun Chen

Abstract

The twig join, which finds all occurrences of a twig pattern in an XML database, is a core operation in XML query processing. Many strategies have been proposed for this problem, and they fall roughly into two groups. The first group decomposes a twig pattern (a small tree) into a set of binary relationships between pairs of nodes, such as parent-child and ancestor-descendant relations, and thereby transforms a tree matching problem into a series of simple relation look-ups. The second group decomposes a twig pattern into a set of paths. Among the methods of this kind, the approach based on the so-called stack encoding [N. Bruno, N. Koudas, and D. Srivastava, Holistic Twig Joins: Optimal XML Pattern Matching, in Proc. SIGMOD Int. Conf. on Management of Data, Madison, Wisconsin, June 2002, pp. 310-321] is of particular interest, because it can represent in linear space a number of matching paths that is potentially exponential in the number of query nodes. However, the available procedures for generating such compressed paths involve some redundancy and can be significantly improved. In this paper, we analyze this method and show that the time complexity of path generation in its two main procedures, TwigStack and TwigStackXB, can be reduced from O(m^2 n) to O(mn), where m and n are the sizes of the query tree and the document tree, respectively. Experiments comparing TwigStackXB with our method show that ours needs much less time to generate matching paths.
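
To make the linear-space claim concrete, the following is a minimal sketch of the general linked-stack idea behind stack encoding, for a path query with ancestor-descendant edges only. It is not the improved procedure analyzed in this paper. The entry layout, the function name `expand`, and the sample labels are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical entry layout: (node label, index of the top of the
# parent query node's stack at the moment this entry was pushed).
Entry = Tuple[str, int]


def expand(stacks: List[List[Entry]], level: int, pos: int) -> List[List[str]]:
    """Enumerate every match encoded by entry `pos` of stack `level`."""
    label, parent_top = stacks[level][pos]
    if level == 0:
        # The root stack has no parent, so its pointer is unused.
        return [[label]]
    paths = []
    # With ancestor-descendant edges, the entry pairs with every
    # parent-stack entry at or below the position its pointer names.
    for p in range(parent_top + 1):
        for prefix in expand(stacks, level - 1, p):
            paths.append(prefix + [label])
    return paths


# Assumed document: a single chain a1 > a2 > b1 > b2 > c1.
# Assumed query: //A//B//C, one stack per query node.
stacks = [
    [("a1", -1), ("a2", -1)],  # stack for A
    [("b1", 1), ("b2", 1)],    # stack for B, pointing at the top of A's stack
    [("c1", 1)],               # stack for C, pointing at the top of B's stack
]
paths = expand(stacks, 2, 0)
```

Here five stack entries encode four matches. With k nested candidates for each of m query nodes, the stacks hold km entries, while the number of encoded matches can grow to k^m.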

References

  1. S. Al-Khalifa, H.V. Jagadish, N. Koudas, J.M. Patel, D. Srivastava, and Y. Wu (2002). Structural Joins: A primitive for efficient XML query pattern matching, in Proc. of IEEE Int. Conf. on Data Engineering.
  2. N. Bruno, N. Koudas, and D. Srivastava (2002). Holistic Twig Joins: Optimal XML Pattern Matching, in Proc. SIGMOD Int. Conf. on Management of Data, Madison, Wisconsin, June 2002, pp. 310-321.
  3. D. D. Chamberlin, J. Clark, D. Florescu and M. Stefanescu (1999). XQuery 1.0: An XML Query Language, http://www.w3.org/TR/query-datamodel/.
  4. D. D. Chamberlin, J. Robie and D. Florescu (2000). Quilt: An XML Query Language for Heterogeneous Data Sources, WebDB 2000.
  5. A. Deutsch, M. Fernandez, D. Florescu, A. Levy, D. Suciu (1999). A Query Language for XML, WWW'99.
  6. D. Florescu and D. Kossmann (1999). Storing and Querying XML Data Using an RDBMS, IEEE Data Engineering Bulletin, 22(3):27-34.
  7. J. McHugh, J. Widom (1999). Query optimization for XML, in Proc. of VLDB.
  8. J. Shanmugasundaram, K. Tufte, C. Zhang, G. He, D.J. DeWitt, and J.F. Naughton (1999). Relational databases for querying XML documents: Limitations and opportunities, in Proc. of VLDB.
  9. The Tukwila System (1999), available from U. of Washington http://data.cs.washington.edu/integration/tukwila/.
  10. The Niagara System (2000), available from U. of Wisconsin http://www.cs.wisc.edu/niagara/.
  11. H. Wang, S. Park, W. Fan, and P.S. Yu (2003). ViST: A Dynamic Index Method for Querying XML Data by Tree Structures, SIGMOD Int. Conf. on Management of Data, San Diego, CA., June 2003.
  12. H. Wang and X. Meng (2005). On the Sequencing of Tree Structures for XML Indexing, in Proc. Conf. Data Engineering, Tokyo, Japan, April 2005, pp. 372-385.
  13. World Wide Web Consortium (1999). XML Path Language (XPath), W3C Recommendation, Version 1.0, November 1999. See http://www.w3.org/TR/xpath.
  14. World Wide Web Consortium (2001). XQuery 1.0: An XML Query Language, W3C Recommendation, Version 1.0, Dec. 2001. See http://www.w3.org/TR/xquery.
  15. C. Zhang, J. Naughton, D. DeWitt, Q. Luo, and G. Lohman (2001). On supporting containment queries in relational database management systems, in Proc. of ACM SIGMOD, 2001.


Paper Citation


in Harvard Style

Chen Y. (2007). STACK ENCODING REVISITED. In Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-972-8865-77-1, pages 5-14. DOI: 10.5220/0001260800050014


in Bibtex Style

@conference{webist07,
author={Yangjun Chen},
title={STACK ENCODING REVISITED},
booktitle={Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2007},
pages={5-14},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001260800050014},
isbn={978-972-8865-77-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - STACK ENCODING REVISITED
SN - 978-972-8865-77-1
AU - Chen Y.
PY - 2007
SP - 5
EP - 14
DO - 10.5220/0001260800050014