loop schema path, the new sequence of nodes and
the new parent pointer. Note that all the schema
paths, which contain the pointer to the schema path
record, are also aware of this modification. When a
loop is detected, instead of setting p’_i = p_i, p’_i is
set to empty, i.e. if a loop is detected in p_i, p_i will
not contribute to the further computation of schema
paths anymore.
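The loop check described above can be sketched as follows. This is only a minimal illustration under simplifying assumptions, not the implementation used in the paper: the names `Record`, `has_loop` and `extend` are hypothetical, and each record carries a single schema node instead of a set S of node sequences.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Record:
    xp: str    # XPath expression that produced the record (the XP field)
    node: str  # simplified stand-in for the last schema node S(1)[|S(1)|]


Path = Tuple[Record, ...]


def has_loop(path: Path) -> bool:
    """True if an earlier record has the same XP and schema node as the last one."""
    if len(path) < 2:
        return False
    last = path[-1]
    return any(r.xp == last.xp and r.node == last.node for r in path[:-1])


def extend(path: Path, rec: Record) -> Optional[Path]:
    """Append rec; return None (the 'empty' p'_i) if the extension closes a loop."""
    candidate = path + (rec,)
    return None if has_loop(candidate) else candidate
```

A path for which `extend` returns `None` is simply not used as input for the next step, which mirrors the statement that it no longer contributes to the computation.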
The schema paths L(q, fp, -) of a predicate q are
added to the predicate schema paths field of the
record. fp records the context node of the predicate,
so that the schema paths of the predicate are
computed starting from fp. When L(q, fp, -) evaluates
to the empty set, the main schema paths also
evaluate to the empty set. Checking of data-type and
occurrence constraints is presented in
(Groppe J. et al., 2007).
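The treatment of predicates can be illustrated with a small sketch. It rests on assumptions that are not part of the paper: `Record`, `apply_predicate` and `eval_q` are hypothetical names, predicate schema paths are reduced to strings, and the last record of a main path is used directly as the context fp.

```python
from dataclasses import dataclass, replace
from typing import Callable, FrozenSet, Set, Tuple


@dataclass(frozen=True)
class Record:
    xp: str
    f: FrozenSet[str] = frozenset()  # predicate schema paths (simplified to strings)


Path = Tuple[Record, ...]


def apply_predicate(
    main_paths: Set[Path],
    eval_q: Callable[[Record], FrozenSet[str]],
) -> Set[Path]:
    """Attach predicate paths to the last record; drop paths whose predicate is empty."""
    result: Set[Path] = set()
    for p in main_paths:
        fp = p[-1]            # context record from which the predicate is evaluated
        q_paths = eval_q(fp)  # plays the role of L(q, fp, -)
        if not q_paths:
            continue          # empty predicate paths: this main path is discarded
        result.add(p[:-1] + (replace(fp, f=fp.f | q_paths),))
    return result
```

If the predicate yields no schema paths for any main path, the result is the empty set, which corresponds to the case where the main schema paths are computed to an empty set.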
• L(e1|e2, p, -) = L(e1, p, -) ∪ L(e2, p, -)
• L(/e, p, -) = L(e, p1, /), where p1=( </, {(/)}, -, -, -, - > )
• L(e1/e2, p, xp) = {p2 | p2∈L(e2, p1, xp/e1) ∧ p1∈L(e1, p, xp)}
• L(self::n, p, xp) = {p+<xp/self::n, S, self, p[|p|].z, -, -> |
S=p[|p|].S ∧ NT(S(1)[|S(1)|], n)}
• L(child::n, p, xp) = {p+<xp/n, {s}, child, p[|p|], -, -> |
NT(s[|s|], n) ∧ S=p[|p|].S ∧ isiElement(S(1)[|S(1)|]) ∧
((s∈iChild(S(1)[|S(1)|]) ∧ n≠text()) ∨
(s∈iTextChild(S(1)[|S(1)|]) ∧ (n=text() ∨ n=node())))}
• L_r(self::n, p, xp) = {p | NT(S(1)[|S(1)|], n) ∧ S=p[|p|].S}
• L(descendant::n, p, xp) = {p’ | p’∈∪_{i=1}^{∞} L_r(self::n, p’_i, xp) ∧ (
(p’_i=p_i ∧ p_i∈L(child::node(), p_{i-1}, xp) ∧ ∀k∈{1, …, |p_i|-1}: (
p_i[k].XP≠p_i[|p_i|].XP ∨ (S_1(1)[|S_1(1)|]≠S_2(1)[|S_2(1)|] ∧
S_1=p_i[k].S ∧ S_2=p_i[|p_i|].S)) ∧
p_{i-1}∈L(child::node(), p_{i-2}, xp) ∧ … ∧ p_1∈L(child::node(), p, xp))
∨
(p’_i=⊥ ∧ (p_i[k]→<p_i[k].XP, p_i[k].S∪p_i[|p_i|].S, p_i[k].z∪p_i[|p_i|].z,
p_i[k].lp∪{(p_i[|p_i|], p_i[k+1], ..., p_i[|p_i|-1])}, p_i[k].f>) ∧
∃k∈{1, ..., |p_i|-1}: (p_i[k].XP=p_i[|p_i|].XP ∧
S_1(1)[|S_1(1)|]=S_2(1)[|S_2(1)|] ∧ S_1=p_i[k].S ∧ S_2=p_i[|p_i|].S) ∧
p_i∈L(child::node(), p_{i-1}, xp) ∧ p_{i-1}∈L(child::node(), p_{i-2}, xp) ∧ … ∧
p_1∈L(child::node(), p, xp)))}
• L(parent::n, p, xp) = {p + <xp/parent::n, S, parent, Z1.z, -, -> |
S=Z1.S ∧ Z1∈p[|p|].z ∧ NT(S(1)[|S(1)|], n) }
• L(ancestor::n, p, xp) = {p’ | p’∈∪_{i=1}^{∞} L_r(self::n, p’_i, xp) ∧ (
(p’_i=p_i ∧ p_i∈L(parent::node(), p_{i-1}, xp) ∧ ∀k∈{1, …, |p_i|-1}: (
p_i[k].XP≠p_i[|p_i|].XP ∨ (S_1(1)[|S_1(1)|]≠S_2(1)[|S_2(1)|] ∧
S_1=p_i[k].S ∧ S_2=p_i[|p_i|].S)) ∧
p_{i-1}∈L(parent::node(), p_{i-2}, xp) ∧ … ∧ p_1∈L(parent::node(), p, xp))
∨
(p’_i=⊥ ∧ (p_i[k]→<p_i[k].XP, p_i[k].S∪p_i[|p_i|].S, p_i[k].z∪p_i[|p_i|].z,
p_i[k].lp∪{(p_i[|p_i|], p_i[k+1], ..., p_i[|p_i|-1])}, p_i[k].f>) ∧
∃k∈{1, ..., |p_i|-1}: (p_i[k].XP=p_i[|p_i|].XP ∧
S_1(1)[|S_1(1)|]=S_2(1)[|S_2(1)|] ∧ S_1=p_i[k].S ∧ S_2=p_i[|p_i|].S) ∧
p_i∈L(parent::node(), p_{i-1}, xp) ∧ p_{i-1}∈L(parent::node(), p_{i-2}, xp) ∧ … ∧
p_1∈L(parent::node(), p, xp)))}
• L(DoS::n, p, xp) = L(self::n, p, xp) ∪ L(descendant::n, p, xp)
• L(AoS::n, p, xp) = L(self::n, p, xp) ∪ L(ancestor::n, p, xp)
• L(FS::n, p, xp) = {p+<xp/FS::n, {s}, FS, p[|p|].z, -, -> | s∈iFS(s1) ∧
NT(s[|s|], n) ∧ s1∈p[|p|].S}
• L(following::n, p, xp) = L(AoS::node()/FS::node()/DoS::n, p, xp)
• L(PS::n, p, xp) = {p+<xp/PS::n, {s}, PS, p[|p|].z, -, -> | s∈iPS(s1) ∧
NT(s[|s|], n) ∧ s1∈p[|p|].S}
• L(preceding::n, p, xp) = L(AoS::node()/PS::node()/DoS::n, p, xp)
• L(attribute::n, p, xp) = {p+<xp/attribute::n, {s}, attribute, p[|p|], -, -> |
s∈iAttribute(S(1)[|S(1)|]) ∧ NT(s[|s|], n) ∧ S=p[|p|].S}
• L(e[q], p, xp) = {(p’[1], p’[2], …, p’[|p’|-1]) + <p’[|p’|].XP, p’[|p’|].S,
p’[|p’|].z, p’[|p’|].lp, p’[|p’|].f∪L(q, fp, -)> | p’∈L(e, p, xp) ∧ L(q, fp, -)≠∅ ∧
fp=(<-, p’[|p’|].S, self, p’[|p’|].z, -, ->)}
• L(e[q_1]…[q_n], p, xp) = {(p’[1], p’[2], …, p’[|p’|-1]) + <p’[|p’|].XP, p’[|p’|].S,
p’[|p’|].z, p’[|p’|].lp, p’[|p’|].f∪L(q_1, fp, -)∪…∪L(q_n, fp, -)> | p’∈L(e, p, xp) ∧
L(q_1, fp, -)≠∅ ∧ … ∧ L(q_n, fp, -)≠∅ ∧ fp=(<-, p’[|p’|].S, self, p’[|p’|].z, -, ->)}
• L(q_1 and q_2, fp, -) = {(<‘and’, L(q_1, fp, -)∪L(q_2, fp, -)>) |
L(q_1, fp, -)≠∅ ∧ L(q_2, fp, -)≠∅}
• L(q_1 or q_2, fp, -) = {(<‘or’, L(q_1, fp, -)∪L(q_2, fp, -)>) |
L(q_1, fp, -)≠∅ ∨ L(q_2, fp, -)≠∅}
• L(q_1 = q_2, fp, -) = {(<‘=’, L(q_1, fp, -)∪L(q_2, fp, -)>) |
L(q_1, fp, -)≠∅ ∧ L(q_2, fp, -)≠∅}
• L(not(q), fp, -) = {(<‘not’, L(q, fp, -)>)}
• L(q=C, fp, -) = L(q[self::node()=C], fp, -), where q≠self::node()
• L(self::node()=C, fp, -) = {(<self::node()=C>)}
Figure 4: The function L: XPath×schema_path×XPath→Set
(schema_path).
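To make the recursive structure of L concrete, the rules for child::n, e1|e2 and e1/e2 can be sketched on a toy schema graph. Everything specific here is an assumption made for illustration: the `SCHEMA` dictionary, the tuple encoding of expressions, and the reduction of a schema path to a tuple of element names (the xp argument and the record fields are omitted).

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy schema graph: element name -> names of its possible children.
SCHEMA: Dict[str, List[str]] = {"/": ["a"], "a": ["b", "c"], "b": ["c"], "c": []}

Path = Tuple[str, ...]


def L(expr: tuple, p: Path) -> Set[Path]:
    """Evaluate a tiny XPath fragment on SCHEMA, starting from schema path p."""
    kind = expr[0]
    if kind == "child":   # L(child::n, p): extend p by each matching child node
        _, name = expr
        return {p + (s,) for s in SCHEMA[p[-1]] if name in ("*", s)}
    if kind == "union":   # L(e1|e2, p) = L(e1, p) ∪ L(e2, p)
        _, e1, e2 = expr
        return L(e1, p) | L(e2, p)
    if kind == "seq":     # L(e1/e2, p): evaluate e2 from every path produced by e1
        _, e1, e2 = expr
        return {p2 for p1 in L(e1, p) for p2 in L(e2, p1)}
    raise ValueError(f"unsupported expression: {kind}")


# /a/(b|c) evaluated from the root path
paths = L(("seq", ("child", "a"), ("union", ("child", "b"), ("child", "c"))), ("/",))
```

An empty result, for example for a step that names an element not reachable in the schema, indicates that the query cannot select any node in a valid instance document.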
3.3 Analyzing Complexity
Unlike XML instance documents, whose topology is
a tree, an XML Schema definition is a directed
graph. In the directed graph leading to the
worst-case complexity, each node has directed edges
to all nodes. Thus, we assume that in the worst case,
each node in an XML Schema definition S is an
instance node and each node is a succeeding node of
all the nodes. In the worst case, each location step in
an XPath query Q selects all the instance nodes in S.
Let a be the number of location steps in an XPath
query Q. Let N be the number of nodes in an XML
Schema definition S. In the worst case, from each
schema path p, at most O(∑_{k=1}^{N} N!/(N-k)!) schema
paths are computed with length from |p|+1 to |p|+N,
and thus at most
O((∑_{k=1}^{N} N!/(N-k)!)^a) = O((N!∗3)^a) schema paths are
computed, each of which contains at most O(a∗N)
pointers to schema records, for Q. Therefore, the
worst case complexity of our approach in terms of
run time and space is O(a∗N∗(N!∗3)^a).
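The factor 3 in the bound follows from ∑_{k=1}^{N} N!/(N-k)! = N!∗∑_{j=0}^{N-1} 1/j! < e∗N! < 3∗N!. The following sketch checks this numerically for small N and evaluates the stated bound; the function names are hypothetical and serve illustration only.

```python
from math import factorial


def paths_per_step(n: int) -> int:
    """Sum over k = 1..n of n!/(n-k)!: loop-free node sequences of length 1..n."""
    return sum(factorial(n) // factorial(n - k) for k in range(1, n + 1))


def worst_case_bound(a: int, n: int) -> int:
    """The bound a * N * (3 * N!)^a for a location steps and N schema nodes."""
    return a * n * (3 * factorial(n)) ** a


# The per-step count never exceeds 3 * N! for the values checked here.
for n in range(1, 12):
    assert paths_per_step(n) <= 3 * factorial(n)
```

Even for small inputs the bound grows quickly, for instance when comparing `worst_case_bound(2, 3)` with `worst_case_bound(3, 5)`, which is why the typical case is of more practical interest.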
XML Schema definitions that exhibit the worst case
are rare, and worst-case queries are typically not
used in practice. Therefore, it makes sense to
investigate the complexity of our approach in
typical cases.
According to the schema and queries in
(Franceschet, 2005), we assume that the typical
cases are characterized as follows: each node in an
XML Schema definition
S has only a small number
of succeeding nodes compared with the number N of
nodes in
S; for each location step in the XPath query
Q, the number of nodes visited is on the average less
ICEIS 2008 - International Conference on Enterprise Information Systems