# FAST APPROXIMATE NEAREST NEIGHBORS WITH AUTOMATIC ALGORITHM CONFIGURATION

### Marius Muja, David G. Lowe

#### Abstract

For many computer vision problems, the most time-consuming component consists of nearest neighbor matching in high-dimensional spaces. There are no known exact algorithms for solving these high-dimensional problems that are faster than linear search. Approximate algorithms are known to provide large speedups with only minor loss in accuracy, but many such algorithms have been published with only minimal guidance on selecting an algorithm and its parameters for any given problem. In this paper, we describe a system that answers the question, “What is the fastest approximate nearest-neighbor algorithm for my data?” Our system will take any given dataset and desired degree of precision and use these to automatically determine the best algorithm and parameter values. We also describe a new algorithm that applies priority search on hierarchical k-means trees, which we have found to provide the best known performance on many datasets. After testing a range of alternatives, we have found that multiple randomized k-d trees provide the best performance for other datasets. We are releasing public domain code that implements these approaches. This library provides about one order of magnitude improvement in query time over the best previously available software and provides fully automated parameter selection.
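To make the idea of priority search on a hierarchical k-means tree concrete, the following is a minimal sketch in plain Python. It is not the released library's code or API: the function names (`build`, `search`) and the parameters (`branching`, `leaf_size`, `iters`, `max_checks`) are hypothetical, and the details (a few Lloyd iterations per node, a single priority queue ordered by distance to cluster centers, a fixed budget of examined points) are assumptions chosen for illustration rather than taken from the abstract.

```python
import heapq
import itertools
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, iters, rng):
    """Run a few Lloyd iterations and return the non-empty clusters."""
    centers = rng.sample(points, k)
    clusters = [points]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        clusters = [c for c in clusters if c]  # drop empty clusters
        centers = [mean(c) for c in clusters]
    return clusters


def build(points, branching=4, leaf_size=8, iters=5, rng=None):
    """Recursively partition the points into a hierarchical k-means tree."""
    if rng is None:
        rng = random.Random(0)
    node = {"center": mean(points), "points": None, "children": None}
    if len(points) > leaf_size:
        clusters = kmeans(points, min(branching, len(points)), iters, rng)
        if len(clusters) > 1:  # guard against degenerate splits (e.g. duplicates)
            node["children"] = [build(c, branching, leaf_size, iters, rng) for c in clusters]
            return node
    node["points"] = points
    return node


def search(root, query, max_checks=32):
    """Approximate nearest neighbor: visit branches in order of center distance."""
    best, best_d = None, float("inf")
    checked = 0
    tie = itertools.count()  # tie-breaker so the heap never compares node dicts
    heap = [(0.0, next(tie), root)]
    while heap and checked < max_checks:
        _, _, node = heapq.heappop(heap)
        # Descend to a leaf, queueing the siblings that were not followed.
        while node["children"] is not None:
            ranked = sorted(node["children"], key=lambda c: sq_dist(query, c["center"]))
            for sibling in ranked[1:]:
                heapq.heappush(heap, (sq_dist(query, sibling["center"]), next(tie), sibling))
            node = ranked[0]
        # Examine every point stored in the leaf.
        for p in node["points"]:
            d = sq_dist(query, p)
            checked += 1
            if d < best_d:
                best, best_d = p, d
    return best, best_d
```

In this sketch, `max_checks` trades precision for speed: a small budget examines only the most promising leaves, while a budget at least as large as the dataset visits every leaf and therefore returns the same answer as linear search.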

#### Paper Citation

#### in Harvard Style

Muja M. and Lowe D. G. (2009). **FAST APPROXIMATE NEAREST NEIGHBORS WITH AUTOMATIC ALGORITHM CONFIGURATION**. In *Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2009)*, ISBN 978-989-8111-69-2, pages 331-340. DOI: 10.5220/0001787803310340

#### in Bibtex Style

@conference{visapp09,

author={Marius Muja and David G. Lowe},

title={FAST APPROXIMATE NEAREST NEIGHBORS WITH AUTOMATIC ALGORITHM CONFIGURATION},

booktitle={Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2009)},

year={2009},

pages={331-340},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0001787803310340},

isbn={978-989-8111-69-2},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2009)

TI - FAST APPROXIMATE NEAREST NEIGHBORS WITH AUTOMATIC ALGORITHM CONFIGURATION

SN - 978-989-8111-69-2

AU - Muja M.

AU - Lowe D. G.

PY - 2009

SP - 331

EP - 340

DO - 10.5220/0001787803310340