# AN IMPROVED METHOD TO SELECT CANDIDATES ON METRIC INDEX VP-TREE

### Masami Shishibori, Samuel Sangkon Lee, Kenji Kita

#### Abstract

In multimedia databases, an efficient indexing method is essential for fast access. Metric indexing methods can be applied to various distance measures other than the Euclidean distance, so they are more flexible than multi-dimensional indexing methods. We focus on the Vantage Point tree (VP-tree), one of the metric indexing methods. The VP-tree is an efficient metric space indexing method; however, the number of distance calculations at leaf nodes tends to increase. In this paper, we propose an efficient algorithm to reduce the number of distance calculations at the leaf nodes of the VP-tree. The conventional VP-tree uses the triangle inequality at the leaf node to reduce the number of distance calculations, with the vantage point of the VP-tree serving as the reference point of the triangle inequality. The proposed algorithm instead uses the nearest neighbor (NN) point of the query as the reference point. With this method, the selection range given by the triangle inequality becomes smaller, and the number of distance calculations at leaf nodes can be cut down. However, it is impossible to specify the NN point in advance. The method therefore regards the point nearest to the query in the result buffer as the temporary NN point. If a nearer point is found during the retrieval process, the temporary NN point is replaced with the new one. Evaluation experiments using 10,000 images showed that the proposed method cut the search time of the conventional VP-tree by 5% to 12%.
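The leaf-node filtering described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `scan_leaf`, `lower_bound`, and `update_temporary_nn`, the 2-D sample points, and the fixed search radius are all hypothetical. The sketch also assumes that the distances from the reference point to the leaf points are already available. The abstract does not say how they are obtained when the temporary NN point is the reference, so the demo simply computes them up front.

```python
import math


def lower_bound(d_query_ref, d_ref_x):
    """Triangle-inequality lower bound on d(query, x): |d(q, ref) - d(ref, x)|."""
    return abs(d_query_ref - d_ref_x)


def scan_leaf(query, leaf, ref, ref_dists, radius, dist=math.dist):
    """Scan one leaf with `ref` as the reference point of the triangle inequality.

    ref_dists[i] is ASSUMED to already hold d(ref, leaf[i]).
    Returns (hits, computed): the sorted (distance, point) pairs within
    `radius` of `query`, and how many exact d(query, x) calls were made.
    """
    d_query_ref = dist(query, ref)
    hits, computed = [], 0
    for x, d_ref_x in zip(leaf, ref_dists):
        if lower_bound(d_query_ref, d_ref_x) > radius:
            continue  # x cannot lie within radius, so skip the exact distance
        computed += 1
        d = dist(query, x)
        if d <= radius:
            hits.append((d, x))
    return sorted(hits), computed


def update_temporary_nn(current, candidate):
    """Keep the nearer (distance, point) pair as the temporary NN point."""
    if current is None or candidate[0] < current[0]:
        return candidate
    return current


# Hypothetical data: one leaf, a distant vantage point, a fixed search radius.
query = (0.0, 0.0)
leaf = [(1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (-6.0, 0.0), (0.0, 6.0)]
radius = 2.5

vp = (10.0, 0.0)  # conventional reference: the vantage point
hits_vp, n_vp = scan_leaf(query, leaf, vp, [math.dist(vp, x) for x in leaf], radius)

temp_nn = (1.0, 0.0)  # alternative reference: a temporary NN point
hits_nn, n_nn = scan_leaf(query, leaf, temp_nn, [math.dist(temp_nn, x) for x in leaf], radius)

print(n_vp, n_nn)  # 3 2
```

On this toy data both reference points return the same two hits, but the point `(0.0, 6.0)` survives the vantage-point bound and is rejected by the bound taken from the nearer reference, so one exact distance calculation is saved. A real search would also shrink `radius` and call `update_temporary_nn` as nearer points enter the result buffer.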

#### Paper Citation

#### in Harvard Style

Shishibori M., Sangkon Lee S. and Kita K. (2011). **AN IMPROVED METHOD TO SELECT CANDIDATES ON METRIC INDEX VP-TREE**. In *Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)* ISBN 978-989-8425-79-9, pages 306-311. DOI: 10.5220/0003668803140319

#### in Bibtex Style

@conference{kdir11,

author={Masami Shishibori and Samuel Sangkon Lee and Kenji Kita},

title={AN IMPROVED METHOD TO SELECT CANDIDATES ON METRIC INDEX VP-TREE},

booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},

year={2011},

pages={306-311},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0003668803140319},

isbn={978-989-8425-79-9},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)

TI - AN IMPROVED METHOD TO SELECT CANDIDATES ON METRIC INDEX VP-TREE

SN - 978-989-8425-79-9

AU - Shishibori M.

AU - Sangkon Lee S.

AU - Kita K.

PY - 2011

SP - 306

EP - 311

DO - 10.5220/0003668803140319