10. else call XB-navigation(q);}
end
In the above algorithm, all the entries from the data
streams are visited through XB-trees (see lines 3
and 10), but they are reordered by a global stack ST
so that they are actually handled in postorder (see
lines 4 - 9; also see Algorithm
stream-transformation( ) for comparison). To check the
tree embedding, Algorithm embeddingCheck( ) is
invoked (see line 7), while to navigate an XB-tree,
Algorithm XB-navigation( ) is called (see line 10).
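The stack-based reordering described above can be illustrated with a small sketch. It is not the paper's algorithm (whose lines 4 - 9 are not reproduced here), only a minimal illustration under the assumption that entries arrive in increasing LeftPos order and that each entry carries an interval (left, right) that encloses the intervals of its descendants. The names Entry and postorder are hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Entry(NamedTuple):
    """Hypothetical stream entry: a label plus its (left, right) interval."""
    label: str
    left: int
    right: int


def postorder(entries: Iterable[Entry]) -> Iterator[Entry]:
    """Re-emit entries that arrive in document (preorder) order in postorder.

    Assumption: an entry's interval encloses those of all its descendants,
    and the input is sorted by increasing `left`.
    """
    stack: List[Entry] = []  # plays the role of the global stack ST
    for e in entries:
        # An entry on the stack whose interval ends before e begins
        # cannot have any further descendants, so it is complete.
        while stack and stack[-1].right < e.left:
            yield stack.pop()
        stack.append(e)
    # Whatever remains is a chain of ancestors; pop deepest first.
    while stack:
        yield stack.pop()


# Tree: a(b(c), d(e)), listed in document order.
stream = [
    Entry("a", 1, 10),
    Entry("b", 2, 5),
    Entry("c", 3, 4),
    Entry("d", 6, 9),
    Entry("e", 7, 8),
]
order = [e.label for e in postorder(stream)]
```

In this sketch every entry is pushed and popped exactly once, so the reordering adds only linear overhead to the scan of the streams.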
Procedure XB-navigation(q)
Input: a query node q.
Output: β_q is changed.
begin
1. if q is the first node (in postorder) then drilldown(β_q);
2. else {let q’ be the node just before q (in postorder);
3. if q’ is to the left of q then
4. {if empty(R_q’) ∧ (currL(β_q’) > currR(β_q))
5. then advance(β_q) (*not part of a solution*)
6. else drilldown(β_q);} (*may have a child in some solution*)
7. else (*q is the parent of q’.*)
8. if ¬empty(R_q’) ∨ (currL(β_q’) > currL(β_q) ∧ currL(β_q’) < currR(β_q))
9. then drilldown(β_q)
10. else advance(β_q);
11. }
end
The above procedure controls the navigation of
XB-trees differently from TwigStackXB, for two
reasons. First, we check the tree embedding
bottom-up. Second, we use not only
ancestor-descendant but also left-to-right
relationships to control the XB-tree traversal. The
procedure first examines whether q is the first node
in postorder (see line 1). If so, we drill down the
corresponding XB-tree, since along the branch we
may find entries that are part of a solution.
Otherwise, we check the query node q’ that is the
predecessor of q in postorder. It is either to the left
of q or the right-most child of q. In the former case,
we compare currL(β_q’) and currR(β_q). If
empty(R_q’) and currL(β_q’) > currR(β_q), no entry in
the subtree rooted at the entry pointed to by β_q
can be part of a solution, so β_q is advanced
(see lines 4 - 5). Otherwise, we drill down the
subtree to find entries that might be part of a
solution (see line 6). A similar analysis applies to
lines 7 - 10.
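The decision logic of XB-navigation can be restated as a small function. This is only a sketch of the branching in lines 1 - 10, not an XB-tree implementation: the Cursor class, the Action enum, and the parameters prev_is_left and prev_r_empty (standing for "q’ is to the left of q" and "empty(R_q’)") are hypothetical names introduced for illustration, and the cursor positions are supplied directly rather than read from an index.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    ADVANCE = "advance"      # skip the subtree under the current entry
    DRILLDOWN = "drilldown"  # descend into the subtree


@dataclass
class Cursor:
    """Hypothetical stand-in for a pointer β: the current (L, R) interval."""
    curr_l: int
    curr_r: int


def navigate(
    q: Cursor,
    prev: Optional[Cursor] = None,
    prev_is_left: bool = False,
    prev_r_empty: bool = True,
) -> Action:
    """Return the action XB-navigation would take for query node q.

    prev is the cursor of q', the postorder predecessor of q, or None
    when q is the first node in postorder.
    """
    if prev is None:                       # line 1
        return Action.DRILLDOWN
    if prev_is_left:                       # lines 3 - 6
        if prev_r_empty and prev.curr_l > q.curr_r:
            return Action.ADVANCE          # cannot be part of a solution
        return Action.DRILLDOWN            # may have a child in a solution
    # lines 7 - 10: q is the parent of q'
    if (not prev_r_empty) or (q.curr_l < prev.curr_l < q.curr_r):
        return Action.DRILLDOWN
    return Action.ADVANCE


first = navigate(Cursor(5, 8))
left_skip = navigate(Cursor(5, 8), Cursor(10, 12), prev_is_left=True)
parent_hit = navigate(Cursor(1, 20), Cursor(3, 4), prev_is_left=False)
```

Here left_skip is an advance because currL of q’ (10) lies beyond currR of q (8) while R_q’ is empty, and parent_hit is a drilldown because the current entry of q’ starts inside the interval of the current entry of q.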
Procedure embeddingCheck(q, v)
Input: a query node q; a document tree node v.
Output: a matching subtree T’ of T, D_root and D_output.
begin
1. generate node v;
… … (*same as lines 3 – 29 in tree-matching*)
end
5 CONCLUSIONS
In this paper, a new algorithm is presented to
evaluate twig pattern queries based on unordered
tree matching. The main idea is a process for tree
reconstruction from data streams, during which each
node v that matches a query node is inserted
into a tree structure and associated with a query node
stream QS(v) such that, for each node q in QS(v), T[v]
embeds Q[q]. In particular, by exploiting an important
property of the tree encoding, this process can be
done very efficiently, which enables us to reduce the
time complexity of existing methods such as
Twig²Stack (Chen et al., 2006) and One-Phase
Holistic (Jiang et al., 2007) by one order of
magnitude. Our experiments demonstrate that the
new algorithm is both effective and efficient for the
evaluation of twig pattern queries.
REFERENCES
Abiteboul, S., Buneman, P. and Suciu, D., 1999. Data on
the web: from relations to semistructured data and
XML, Morgan Kaufmann Publishers, Los Altos, CA
94022, USA.
Aghili, A., Li, H., Agrawal, D. and Abbadi, A.E., 2006.
TWIX: Twig structure and content matching of
selective queries using binary labeling, in:
INFOSCALE.
Al-Khalifa, S., Jagadish, H.V., Koudas, N., Patel, J.M.,
Srivastava, D. and Wu, Y., 2002. Structural Joins: A
primitive for efficient XML query pattern matching, in:
Proc. of IEEE Int. Conf. on Data Engineering.
Bruno, N., Koudas, N. and Srivastava, D., 2002. Holistic
Twig Joins: Optimal XML Pattern Matching, in Proc.
SIGMOD Int. Conf. on Management of Data,
Madison, Wisconsin, June 2002, pp. 310-321.
Chamberlin, D.D., Clark, J., Florescu, D. and Stefanescu,
M., 2002. XQuery 1.0: An XML Query Language,
http://www.w3.org/TR/querydatamodel/.
Chamberlin, D.D., Robie, J. and Florescu, D., 2000.
Quilt: An XML Query Language for Heterogeneous
Data Sources, WebDB 2000.
Chen, T., Lu, J. and Ling, T.W., 2005. On Boosting
Holism in XML Twig Pattern Matching, in: Proc.
SIGMOD, pp. 455-466.
Choi, B., Mahoui, M. and Wood, D., 2003. On the
optimality of holistic algorithms for twig queries, in:
Proc. DEXA, pp. 235-244.
Chung, C., Min, J. and Shim, K., 2002. APEX: An
adaptive path index for XML data, ACM SIGMOD.
Chen, S., Li, H-G., Tatemura, J., Hsiung, W-P., Agrawal,
D. and Candan, K.S., 2006. Twig²Stack: Bottom-up
Processing of Generalized-Tree-Pattern Queries over
XML Documents, in: Proc. VLDB, Seoul, Korea, pp.