5 CONCLUSION
In this paper, for the variations $\tau_A$ of the edit distance with
$A \in \{\mathrm{ILST}, \mathrm{ACC}, \mathrm{LCA}, \mathrm{LCART}, \mathrm{TOP}\}$,
we have formulated the earth mover's distances $\mathrm{EMD}_A$ based on
$\tau_A$. We have then given experimental results that evaluate
$\mathrm{EMD}_A$ in comparison with $\tau_A$ and, as a result, have
investigated the properties of $\mathrm{EMD}_A$.
It is future work to give experimental results for larger data
(with large degrees) in order to analyze experimentally the
theoretical ratio $O(n \log n / d)$ in Section 4.1. It is also future
work to formulate EMDs for other tractable variations in the Tai
mapping hierarchy (Yoshino and Hirata, 2017).
Concerning Example 1 in Section 4.3 and Statement 1 in Section 4.4,
we have found no trees $T_1$ and $T_2$ such that
$\tau_A(T_1, T_2) < \mathrm{EMD}_A(T_1, T_2)$ except in the case that
$T_1$ is obtained by deleting leaves from $T_2$. Hence, it is future
work to determine whether or not there exist other cases satisfying
$\tau_A(T_1, T_2) < \mathrm{EMD}_A(T_1, T_2)$.
It is future work to analyze the properties of EMDs in Section 4.4
in more detail and to investigate what kinds of data are appropriate
for EMDs. In particular, since the number of signatures can be too
small to formulate EMDs for trees, it is important future work to
investigate appropriate signatures for EMDs for trees.
ACKNOWLEDGEMENTS
This work is partially supported by Grant-in-Aid for Scientific
Research 17H00762, 16H02870, 16H01743 and 15K12102 from the Ministry
of Education, Culture, Sports, Science and Technology, Japan.
REFERENCES
Aratsu, T., Hirata, K., and Kuboyama, T. (2009). Sibling
distance for rooted labeled trees. In JSAI PAKDD ’08
Post-Workshop Proc. (LNAI 5433), pages 99–110.
Chawathe, S. S. (1999). Comparing hierarchical data in
external memory. In Proc. VLDB’99, pages 90–101.
Gollapudi, S. and Panigrahy, R. (2008). The power of two
min-hashes for similarity search among hierarchical
data objects. In Proc. PODS’08, pages 211–219.
Hirata, K., Yamamoto, Y., and Kuboyama, T. (2011).
Improved MAX SNP-hard results for finding an edit
distance between unordered trees. In Proc. CPM’11
(LNCS 6661), pages 402–415.
Jiang, T., Wang, L., and Zhang, K. (1995). Alignment of
trees – an alternative to tree edit. Theoret. Comput.
Sci., 143:137–148.
Kailing, K., Kriegel, H.-P., Schönauer, S., and Seidl, T.
(2004). Efficient similarity search for hierarchical data
in large databases. In Proc. EDBT’04, pages 676–693.
Kawaguchi, T. and Hirata, K. (2017). On earth mover’s
distance based on complete subtrees for rooted labeled
trees. In Proc. SISA’17, pages 225–228.
Kuboyama, T. (2007). Matching and learning in trees. Ph.D
thesis, University of Tokyo.
Li, F., Wang, H., Li, J., and Gao, H. (2013). A survey on
tree edit distance lower bound estimation techniques
for similarity join on XML data. SIGMOD Record,
43:29–39.
Luke, S. and Panait, L. (2001). A survey and comparison
of tree generation algorithms. In Proc. GECCO’01,
pages 81–88.
Rubner, Y., Tomasi, C., and Guibas, L. J. (2007). The earth
mover’s distance as a metric for image retrieval. Int.
J. Comput. Vision, 40:99–121.
Selkow, S. M. (1977). The tree-to-tree editing problem.
Inform. Process. Lett., 6:184–186.
Tai, K.-C. (1979). The tree-to-tree correction problem. J.
ACM, 26:422–433.
Yamamoto, Y., Hirata, K., and Kuboyama, T. (2014).
Tractable and intractable variations of unordered tree edit
distance. Internat. J. Found. Comput. Sci., 25:307–329.
Yoshino, T. and Hirata, K. (2017). Tai mapping hierarchy
for rooted labeled trees through common subforest.
Theory of Comput. Sys., 60:769–787.
Zhang, K. (1995). Algorithms for the constrained editing
distance between ordered labeled trees and related
problems. Pattern Recog., 28:463–474.
Zhang, K. (1996). A constrained edit distance between
unordered labeled trees. Algorithmica, 15:205–222.
Zhang, K. and Jiang, T. (1994). Some MAX SNP-hard results
concerning unordered labeled trees. Inform. Process.
Lett., 49:249–254.
Zhang, K., Wang, J., and Shasha, D. (1996). On the editing
distance between undirected acyclic graphs. Internat.
J. Found. Comput. Sci., 7:43–58.