(ii) There exists an interval [a, b) in such that [a₂, b₂) is subsumed by [a, b).
Proof. The proposition can be derived from the following two facts:
(1) v is reachable from u by tree edges iff [a₂, b₂) is subsumed by [a₁, b₁).
(2) In terms of Lemma 6, v is reachable from u via non-tree edges iff v⁻ exists and its interval sequence contains an interval [a, b) which subsumes [a₂, b₂). Furthermore, in terms of Lemma 7, [a, b) subsumes [a₂, b₂) iff v* exists and its interval is subsumed by [a, b).
Now we consider nodes c and e in the graph shown in Fig. 10. To check whether node c (labeled ([2, 4), <-, [2, 4)>)) is a descendant of node e (labeled ([6, 9), <5, ->)), we first check whether 2 ∈ [6, 9). Since 2 ∉ [6, 9), we then check whether there is an interval in L(e) = [2, 4)[4, 5)[6, 9)[11, 12) (note that φ(e) = 5) which subsumes [2, 4). Since [2, 4) in L(e) subsumes [2, 4), we know that node c is reachable from node e.
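The two-step check in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: intervals are modeled as (start, end) tuples standing for half-open intervals, and the names contains_point, subsumes, and is_reachable are hypothetical helpers, not part of the paper's algorithm. The second step uses a plain linear scan for readability.

```python
def contains_point(interval, x):
    """True if x lies in the half-open interval [a, b)."""
    a, b = interval
    return a <= x < b


def subsumes(outer, inner):
    """True if the half-open interval outer contains the half-open interval inner."""
    a, b = outer
    a2, b2 = inner
    return a <= a2 and b2 <= b


def is_reachable(interval_u, seq_u, interval_v):
    """Hypothetical two-step test: is v reachable from u?"""
    # Step 1 (tree edges): does the start of v's interval fall inside u's interval?
    if contains_point(interval_u, interval_v[0]):
        return True
    # Step 2 (non-tree edges): does some interval in u's sequence subsume v's interval?
    return any(subsumes(iv, interval_v) for iv in seq_u)


# Values taken from the example: e has interval [6, 9) and
# L(e) = [2, 4)[4, 5)[6, 9)[11, 12); c has interval [2, 4).
e_interval = (6, 9)
L_e = [(2, 4), (4, 5), (6, 9), (11, 12)]
c_interval = (2, 4)

print(is_reachable(e_interval, L_e, c_interval))  # True
```

Step 1 fails because 2 ∉ [6, 9), and step 2 succeeds on the first interval of L(e), which matches the conclusion of the example.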
Finally, we notice that each interval sequence in the core table of G contains only intervals that are not on the same path in T, and that these intervals are in increasing order. Therefore, checking whether a given interval is subsumed by an interval in L(v) for some node v needs only O(log |L(v)|) time. Since |L(v)| is bounded by b, reachability checking requires only O(log b) time.
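The logarithmic bound comes from binary search over the ordered sequence. The sketch below is a minimal illustration, assuming the intervals in a sequence are pairwise disjoint and sorted by start point (which is how we read "not on the same path in T" and "increasingly ordered"). The function name find_subsuming and the parallel lists starts and ends are hypothetical; the search itself relies on bisect_right from Python's standard bisect module.

```python
from bisect import bisect_right


def find_subsuming(starts, ends, query):
    """Return the index of the interval subsuming query, or None.

    starts and ends are parallel lists describing disjoint half-open
    intervals sorted by start point (an assumption of this sketch).
    """
    a2, b2 = query
    # Rightmost interval whose start is <= a2; because the intervals are
    # disjoint, it is the only one that can contain the point a2.
    i = bisect_right(starts, a2) - 1
    if i >= 0 and b2 <= ends[i]:
        return i
    return None


# L(e) = [2, 4)[4, 5)[6, 9)[11, 12) from the example above.
starts = [2, 4, 6, 11]
ends = [4, 5, 9, 12]

print(find_subsuming(starts, ends, (2, 4)))   # 0
print(find_subsuming(starts, ends, (9, 11)))  # None
```

Each query performs one binary search plus one comparison, so its cost is logarithmic in the length of the sequence.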
Proposition 5. Let v and u be two nodes in G. It needs O(log b) time to check whether u is reachable from v via non-tree edges or vice versa.
Proof. See the above analysis.
4 CONCLUSIONS
In this paper, a new approach is proposed to compress transitive closure. Its main idea is to recognize a subset of nodes in G and assign them labels in such a way that reachability via non-tree edges can be determined by checking these labels only. This is achieved by finding a general spanning tree with the least number of leaf nodes, based on which a core label can be established. Using this method, the labeling can be done in O(n + e + t⋅b) time, where t is the number of non-tree edges (edges that do not appear in the general spanning tree T of G) and b is the number of leaf nodes of T. It can be proven that b equals G's width, defined to be the size of a largest node subset U of G such that for every pair of nodes u, v ∈ U, there does not exist a path from u to v or from v to u. The space and query time complexities are bounded by O(n + t⋅b) and O(log b), respectively.
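To make the definition of width concrete, the following brute-force sketch computes it for a toy DAG by trying node subsets from largest to smallest. It illustrates the definition only and is exponential in the number of nodes; it is not the method used in the paper. The graph toy and the function names reachable_sets and width are assumptions made for this example.

```python
from itertools import combinations


def reachable_sets(graph):
    """Map each node to the set of nodes reachable from it by a nonempty path."""
    reach = {}
    for start in graph:
        seen, stack = set(), list(graph[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, ()))
        reach[start] = seen
    return reach


def width(graph):
    """Size of a largest subset in which no node reaches any other node."""
    reach = reachable_sets(graph)
    nodes = list(graph)
    for size in range(len(nodes), 0, -1):
        for subset in combinations(nodes, size):
            if all(v not in reach[u] and u not in reach[v]
                   for u, v in combinations(subset, 2)):
                return size
    return 0


# Hypothetical diamond-shaped DAG: a -> b, a -> c, b -> d, c -> d.
toy = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

print(width(toy))  # 2
```

In the diamond graph, b and c are the only pair of nodes with no path between them in either direction, so the width is 2.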
ICSOFT 2009 - 4th International Conference on Software and Data Technologies