A Prime Number Approach to Matching an XML Twig Pattern including Parent-Child Edges

Shtwai Alsubai, Siobhán North

Abstract

Twig pattern matching is a core operation in XML query processing because it finds all occurrences of a twig pattern in an XML document. Over the past decade, many algorithms have been proposed to perform twig pattern matching. They rely on labelling schemes to determine the relationships between elements corresponding to query nodes in constant time, which improves processing time. This paper proposes a new algorithm, TwigStackPrime, which improves on TwigStack (Bruno et al., 2002). To reduce the memory consumption and computation overhead of twig pattern matching algorithms when Parent-Child (P-C) edges are involved, TwigStackPrime introduces a new labelling scheme, called Child Prime Labels (CPL), which efficiently filters out a large number of irrelevant elements and avoids unnecessary computations. Extensive performance studies on various real-world and artificial datasets were conducted to demonstrate the improvement of CPL over previous indexing and querying techniques. The experimental results show that the new technique outperforms the previous approaches.
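To make the idea of constant-time label checks concrete, the following is a minimal sketch and not the authors' implementation. It assumes a region-style label (`start`, `end`, `level`) for ancestor and parent tests, and it assumes that a Child Prime Label is the product of the primes assigned to the distinct tags of a node's children, so that a divisibility test tells whether a child with a given tag exists. The abstract does not spell out the definition of CPL, so that reading, together with the names `Node`, `label`, and `has_child_with_tag`, is hypothetical.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)
    start: int = 0   # region label: position where the element opens
    end: int = 0     # region label: position where the element closes
    level: int = 0   # depth in the document tree
    cpl: int = 1     # ASSUMED Child Prime Label: product of child-tag primes


def label(root, primes):
    """Assign region labels and the assumed CPL in one depth-first pass."""
    counter = 0

    def visit(node, level):
        nonlocal counter
        counter += 1
        node.start, node.level = counter, level
        for child in node.children:
            visit(child, level + 1)
        # Distinct child tags only; a leaf gets the empty product, 1.
        node.cpl = prod({primes[c.tag] for c in node.children})
        counter += 1
        node.end = counter

    visit(root, 1)


def is_ancestor(a, d):
    # Constant-time containment test on region labels.
    return a.start < d.start and d.end < a.end


def is_parent(p, c):
    return is_ancestor(p, c) and c.level == p.level + 1


def has_child_with_tag(node, tag, primes):
    # ASSUMPTION: a node has a child tagged `tag` iff its prime divides the CPL.
    return node.cpl % primes[tag] == 0


# Hypothetical document: <a><b><c/></b><d/></a>
primes = {"a": 2, "b": 3, "c": 5, "d": 7}
c = Node("c")
b = Node("b", [c])
d = Node("d")
a = Node("a", [b, d])
label(a, primes)
```

Under these assumptions, an element matched to a query node with a P-C edge to a `c` child can be discarded as soon as `has_child_with_tag` returns false, without scanning the `c` stream. Here `a` contains `c` as a descendant but not as a child, so it would be filtered for such a query.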

References

  1. Alireza Aghili, S., Hua-Gang, L., Agrawal, D., and El Abbadi, A. (2006). TWIX: twig structure and content matching of selective queries using. InfoScale '06: Proceedings of the 1st international conference on, page 42.
  2. Bruno, N., Koudas, N., and Srivastava, D. (2002). Holistic twig joins: optimal XML pattern matching. In Proceedings of the 2002 ACM SIGMOD international conference on Management of data, pages 310-321, Madison, Wisconsin. ACM.
  3. Chen, S., Li, H.-G., Tatemura, J., Hsiung, W.-P., Agrawal, D., and Candan, K. S. (2006). Twig2Stack: bottom-up processing of generalized-tree-pattern queries over XML documents.
  4. Chen, T., Lu, J., and Ling, T. W. (2005). On Boosting Holism in XML Twig Pattern Matching Using Structural Indexing Techniques. Science, pages 455-466.
  5. Choi, B., Mahoui, M., and Wood, D. (2003). On the optimality of holistic algorithms for twig queries. Database and Expert Systems Applications, pages 28-37.
  6. Grimsmo, N., Bjørklund, T. A., and Hetland, M. L. (2010). Fast optimal twig joins. VLDB, 3(1-2):894-905.
  7. Li, J. and Wang, J. (2008). Fast Matching of Twig Patterns. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5181 LNCS:523-536.
  8. Lu, J., Chen, T., and Ling, T. W. T. (2004). Efficient Processing of XML Twig Patterns with Parent Child Edges: A Look-ahead Approach. In Proceedings of the thirteenth ACM international conference on Information and knowledge management, pages 533-542, Washington, D.C., USA. ACM.
  9. Lu, J., Meng, X., and Ling, T. W. (2011). Indexing and querying XML using extended Dewey labeling scheme. Data & Knowledge Engineering, 70(1):35-59.
  10. Qin, L., Yu, J. X., and Ding, B. (2007). TwigList: Make Twig Pattern Matching Fast. In Kotagiri, R., Krishna, P. R., Mohania, M., and Nantajeewarawat, E., editors, Advances in Databases: Concepts, Systems and Applications: 12th International Conference on Database Systems for Advanced Applications, DASFAA 2007, Bangkok, Thailand, April 9-12, 2007. Proceedings, pages 850-862. Springer Berlin Heidelberg, Berlin, Heidelberg.
  11. Wu, H., Lin, C., Ling, T. W., and Lu, J. (2012). Processing XML twig pattern query with wildcards. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7446 LNCS:326-341.
  12. Zhang, C., Naughton, J., DeWitt, D., Luo, Q., and Lohman, G. (2001). On supporting containment queries in relational database management systems. ACM SIGMOD Record, 30:425-436.


Paper Citation


in Harvard Style

Alsubai S. and North S. (2017). A Prime Number Approach to Matching an XML Twig Pattern including Parent-Child Edges. In Proceedings of the 13th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-758-246-2, pages 204-211. DOI: 10.5220/0006225602040211


in Bibtex Style

@conference{webist17,
author={Shtwai Alsubai and Siobhán North},
title={A Prime Number Approach to Matching an XML Twig Pattern including Parent-Child Edges},
booktitle={Proceedings of the 13th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2017},
pages={204-211},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006225602040211},
isbn={978-989-758-246-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - A Prime Number Approach to Matching an XML Twig Pattern including Parent-Child Edges
SN - 978-989-758-246-2
AU - Alsubai S.
AU - North S.
PY - 2017
SP - 204
EP - 211
DO - 10.5220/0006225602040211