[Figure 8 plot: cpu-time [sec] (0 to 0.14) versus k (10 to 100),
with series dim12, dim24, dim48, and dim96.]
Figure 8: CPU-time of AESA on each dimensional data.
100 retrievals and 96 dimensions, the improvement
increases to about 12%. Thus, the present method
yields an effective gain even as the number of
dimensions increases. The maximum number of splits
per leaf node was set to 10. The distance-list file
required by the candidate reduction method based on
nearest neighbor objects was 313 MB regardless of
the number of dimensions.
Fig. 8 indicates that although AESA outperforms the
VP-tree in terms of the number of distance
calculations, its retrieval time is longer. A possible
reason is the difference in the number of read
accesses to the distance-list file. AESA must read the
distance-list file at every iteration of the search.
In other words, the file is read as many times as
there are distance calculations, and this is thought
to have a large influence on the retrieval time. The
VP-tree needs to read the distance-list file only when
reducing the objects in a leaf node, so the number of
read accesses is kept to a minimum. Thus, the VP-tree
showed a more significant improvement in retrieval
effectiveness than AESA did.
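The access pattern described above can be illustrated with a minimal
Python sketch of an AESA-style nearest-neighbor search. This is not the
implementation used in the experiments: the names aesa_nn and pair_dist
are hypothetical, an in-memory matrix stands in for the distance-list
file, the data are random points, and only the single nearest neighbor
(k = 1) is retrieved. The sketch counts one row read per pivot, which
shows why the number of reads equals the number of distance calculations.

```python
import math
import random


def aesa_nn(query, objects, dist, pair_dist):
    """AESA-style nearest-neighbor search (illustrative sketch only).

    pair_dist[i][j] is the precomputed distance between objects i and j,
    an in-memory stand-in (assumption) for the distance-list file.
    Returns (best_index, best_distance, n_distance_calcs, n_row_reads).
    """
    n = len(objects)
    alive = set(range(n))
    lower = [0.0] * n  # lower bound on dist(query, objects[i])
    best_i, best_d = -1, math.inf
    n_calcs = 0
    n_reads = 0

    while alive:
        # Use the surviving object with the smallest lower bound as pivot.
        p = min(alive, key=lambda i: lower[i])
        alive.remove(p)
        d_qp = dist(query, objects[p])
        n_calcs += 1
        if d_qp < best_d:
            best_i, best_d = p, d_qp

        # One read of the distance list per pivot, i.e. per calculation.
        row = pair_dist[p]
        n_reads += 1
        for i in list(alive):
            # Triangle inequality: |d(q, p) - d(p, i)| <= d(q, i).
            lower[i] = max(lower[i], abs(d_qp - row[i]))
            if lower[i] > best_d:
                alive.remove(i)  # cannot be closer than the current best

    return best_i, best_d, n_calcs, n_reads


# Hypothetical data: 200 random 12-dimensional points and one query.
random.seed(0)
objects = [tuple(random.random() for _ in range(12)) for _ in range(200)]
pair_dist = [[math.dist(a, b) for b in objects] for a in objects]
query = tuple(random.random() for _ in range(12))

best_i, best_d, n_calcs, n_reads = aesa_nn(query, objects, math.dist, pair_dist)
print(best_i, best_d, n_calcs, n_reads)
```

In this sketch every pivot costs one distance calculation and one row
read, whereas a leaf-level reduction in a tree index would consult the
stored distances only for the objects of the leaves actually visited.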
5 CONCLUSIONS
We have proposed an improvement to the search
algorithm for the leaf nodes of a VP-tree. The results
show that retrieval times were reduced by 5% to 12%
for a task involving the retrieval of similar images.
A topic for future work is the creation of a search
algorithm that permits further reductions in the
number of distance calculations with a smaller index
size.
ACKNOWLEDGEMENTS
This work was supported in part by Grants-in-Aid for
Scientific Research (#21500940, #21300036, and
#20650143) from the Ministry of Education, Science
and Culture, Japan.
REFERENCES
Beckmann, N., Kriegel, H. P., Schneider, R., and Seeger,
B. (1990). The R*-tree: An efficient and robust access
method for points and rectangles. In Proc. of the ACM
SIGMOD ’90, pages 322–331.
Berchtold, S., Keim, D. A., and Kriegel, H. P. (1996). The
X-tree: An index structure for high-dimensional data. In
Proc. of the 22nd VLDB, pages 28–39.
Bozkaya, T. and Ozsoyoglu, M. (1997). Distance-based
indexing for high-dimensional metric spaces. In Proc.
of the ACM SIGMOD, pages 357–368.
Ciaccia, P., Patella, M., and Zezula, P. (1995). M-tree: An
efficient access method for similarity search in metric
spaces. In Proc. of the ACM SIGMOD Int. Conf. on
the Management of Data, pages 71–79.
Corel (2011). Corel image gallery. http://www.corel.co.jp/.
Fu, A. W., Chan, P. M., Cheung, Y. L., and Moon, Y. S.
(2000). Dynamic VP-tree indexing for n-nearest neighbor
search given pair-wise distances. VLDB Journal,
pages 2–8.
Guttman, A. (1984). A dynamic index structure for spatial
searching. In Proc. of the ACM SIGMOD ’84, pages
47–57.
Ioka, M. (1989). A method of defining the similarity of
images on the basis of color information. Technical
Report RT-0030.
Ishikawa, M., Notoya, J., Chen, H., and Ohbo, N. (1999).
A metric index MI-tree. Transactions of Information
Processing Society of Japan, 40(SIG6(TOD3)):104–114.
Katayama, N. and Satoh, S. (1997). SR-tree: An index
structure for nearest neighbor searching of
high-dimensional point data. IEICE Transactions on
Information and Systems, J80-D-I(8):703–717.
Rubner, Y., Tomasi, C., and Guibas, L. J. (1999). The
earth mover’s distance, multi-dimensional scaling,
and color-based image retrieval. In Proc. of the ARPA
Image Understanding Workshop, pages 661–668.
Vidal, R. (1986). An algorithm for finding nearest
neighbours in approximately constant average time. Pattern
Recognition Letters, pages 145–157.
Weber, R., Schek, H. J., and Blott, S. (1998). A quantitative
analysis and performance study for similarity-search
methods in high-dimensional spaces. In Proc. of the
24th VLDB, pages 194–205.
White, D. A. and Jain, R. (1996). Similarity indexing with
SS-tree. In Proc. of the 12th Int. Conf. on Data
Engineering, pages 516–523.
Yianilos, P. N. (1993). Data structures and algorithms for
nearest neighbor search in general metric spaces. In
Proc. of the ACM-SIAM SODA ’93, pages 311–321.
Zezula, P., Amato, G., Dohnal, V., and Batko, M. (2006).
Similarity Search: The Metric Space Approach.
Springer press.
AN IMPROVED METHOD TO SELECT CANDIDATES ON METRIC INDEX VP-TREE