in general-tree-embedding( ). In F(j), we maintain
all query nodes i such that Q[i] can be
embedded (not only root-preservingly embedded) in
T[j]. Although this requires more time, the
overall time complexity remains unchanged. See line
5, which uses the merge operation first introduced in
(Chen, 2007). The time complexity of
merge(F(j_1), ..., F(j_k)) is bounded by O(k⋅leaf_Q).
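The merge step can be illustrated with a minimal sketch. It assumes, for illustration only, that each F(j_i) is stored as an ascending list of integer query-node identifiers and that merging means a duplicate-free sorted union; the names merge_two and merge are hypothetical and are not taken from (Chen, 2007). Each pairwise step is linear in the lengths of its inputs, so k steps cost O(k⋅leaf_Q) provided the intermediate result never holds more than O(leaf_Q) entries, which is assumed here rather than enforced.

```python
def merge_two(a, b):
    """Duplicate-free union of two ascending integer lists in O(len(a) + len(b))."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i])
            i += 1
        elif a[i] > b[j]:
            out.append(b[j])
            j += 1
        else:
            # Same query node in both lists: keep a single copy.
            out.append(a[i])
            i += 1
            j += 1
    # At most one of these two tails is non-empty.
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def merge(*lists):
    """Fold the child lists F(j_1), ..., F(j_k) into one sorted list."""
    result = []
    for lst in lists:
        result = merge_two(result, lst)
    return result


# Hypothetical child lists of query-node identifiers.
merged = merge([1, 4, 7], [2, 4], [7, 9])
print(merged)
```

A sequential fold is used instead of a heap-based k-way merge because, under the assumed size bound, it avoids any logarithmic factor in k.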
Special attention should be paid to the function
general-node-check( ). It is used to check ∧-nodes in Q.
Since each name node has only one ∧-node as its
child, the checking of name nodes is integrated into
this process to simplify the procedure (see line 2 in
this function).
In the function calculate-H(j, F(j)), we compute H(j)
based on F(j), following exactly the conditions
given above for checking ∨-node containment.
In particular, in the presence of ‘¬’, we
have to check each negative node in N_Q to see
whether it appears in F(j) (see lines 7-9 in this
function). This check needs O(|N_Q|⋅log|F(j)|) time.
So the total time of the algorithm is bounded by
O(|T|⋅leaf_Q + |N_Q|⋅|T|⋅log leaf_Q).
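The negative-node check can be sketched as one binary search per negative node. The sketch assumes that F(j) is kept as an ascending list of integer query-node identifiers and that N_Q is given as a list of such identifiers; the helper names contains and negative_nodes_in are hypothetical and do not appear in calculate-H.

```python
from bisect import bisect_left


def contains(sorted_ids, x):
    """Binary search for x in an ascending list: O(log len(sorted_ids))."""
    k = bisect_left(sorted_ids, x)
    return k < len(sorted_ids) and sorted_ids[k] == x


def negative_nodes_in(negative_nodes, f_j):
    """Return the negative query nodes that occur in F(j).

    One binary search per negative node gives O(|N_Q| * log|F(j)|) in total.
    """
    return [n for n in negative_nodes if contains(f_j, n)]


# Hypothetical data: F(j) = [1, 3, 5, 7], N_Q = [3, 8].
found = negative_nodes_in([3, 8], [1, 3, 5, 7])
print(found)
```

Running this check once per document node gives the |N_Q|⋅|T|⋅log leaf_Q term of the total bound, on the assumption that |F(j)| never exceeds a quantity proportional to leaf_Q.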
5 CONCLUSIONS
In this paper, a new algorithm is proposed to
evaluate twig pattern queries in XML document
databases. The algorithm works in a bottom-up way,
by which an important property of the postorder
numbering is used to avoid join or join-like
operations. The time complexity and the space
complexity of the algorithm are bounded by
O(|T|⋅|Q|) and O(|Q|⋅leaf_T), respectively, where T is
the document tree and Q the twig pattern query, and
leaf_T represents the number of leaf nodes in T.
Experiments comparing our method with some existing
strategies indicate that it is highly promising for
evaluating twig pattern queries.
ACKNOWLEDGEMENTS
The work is supported by NSERC 239074-01
(242523) (Natural Sciences and Engineering Research
Council of Canada).
REFERENCES
A. Aghili, H. Li, D. Agrawal, and A.E. Abbadi (2006).
TWIX: Twig structure and content matching of
selective queries using binary labeling, in:
INFOSCALE, 2006.
N. Bruno, N. Koudas, and D. Srivastava (2002). Holistic
Twig Joins: Optimal XML Pattern Matching, in Proc.
SIGMOD Int. Conf. on Management of Data,
Madison, Wisconsin, June 2002, pp. 310-321.
C. Chung, J. Min, and K. Shim (2002). APEX: An
adaptive path index for XML data,
ACM SIGMOD,
June 2002.
S. Chen et al. (2006). Twig2Stack: Bottom-up Processing
of Generalized-Tree-Pattern Queries over XML
Documents, in Proc. VLDB, Seoul, Korea, Sept. 2006,
pp. 283-323.
Y. Chen (2007). A New Algorithm for Tree Mapping in
XML Databases, in Proc. of the Internet and
Multimedia Systems and Applications Conference
(IMSA 2007), Honolulu, Hawaii, USA.
B.F. Cooper, N. Sample, M. Franklin, A.B. Hjaltason,
and M. Shadmon (2001). A fast index for
semistructured data, in: Proc. VLDB, Sept. 2001, pp.
341-350.
R. Goldman and J. Widom (1997). DataGuides: Enabling
query formulation and optimization in semistructured
databases, in: Proc. VLDB, Aug. 1997, pp. 436-445.
G. Gottlob, C. Koch, and R. Pichler (2005). Efficient
Algorithms for Processing XPath Queries, ACM
Transactions on Database Systems, Vol. 30, No. 2,
June 2005, pp. 444-491.
C.M. Hoffmann and M.J. O’Donnell (1982). Pattern
matching in trees,
J. ACM, 29(1):68-95, 1982.
Q. Li and B. Moon (2001). Indexing and Querying XML
data for regular path expressions, in: Proc. VLDB,
Sept. 2001, pp. 361-370.
J. Lu, T.W. Ling, C.Y. Chan, and T. Chan (2005). From
Region Encoding to Extended Dewey: on Efficient
Processing of XML Twig Pattern Matching, in: Proc.
VLDB, pp. 193-204, 2005.
G. Miklau and D. Suciu (2004). Containment and
Equivalence of a Fragment of XPath, J. ACM,
51(1):2-45, 2004.
H. Wang, S. Park, W. Fan, and P.S. Yu (2003). ViST: A
Dynamic Index Method for Querying XML Data by
Tree Structures, SIGMOD Int. Conf. on Management
of Data, San Diego, CA., June 2003.
H. Wang and X. Meng (2005). On the Sequencing of Tree
Structures for XML Indexing, in Proc. Conf. Data
Engineering, Tokyo, Japan, April, 2005, pp. 372-385.
R. Kaushik, P. Bohannon, J. Naughton, and H. Korth
(2002). Covering indexes for branching path queries,
in: ACM SIGMOD, June 2002.
ICEIS 2008 - International Conference on Enterprise Information Systems