Authors:
Naoya Higuchi 1; Yasunobu Imamura 2,5; Vladimir Mic 3; Takeshi Shinohara 4 and Kouichi Hirata 4
Affiliations:
1 Sojo University, 4-22-1 Ikeda, Nishi-ku, Kumamoto City 860-0082, Japan
2 THIRD INC., Shinjuku, Tokyo 160-0004, Japan
3 Aarhus University, Denmark
4 Kyushu Institute of Technology, Kawazu 680-4, Iizuka 820-8502, Japan
5 Gakushuin University, Mejiro 1-5-1, Toshima, Tokyo 171-8588, Japan
Keyword(s):
Similarity Search, Approximate Nearest Neighbor Search, Sketch, Conjunctive Enumeration, Hamming Distance, Asymmetric Distance.
Abstract:
Sketches are compact bit-string representations of points, often employed to speed up searches through dimensionality reduction and data compression. In this paper, we propose a novel sketch enumeration method and demonstrate that it realizes fast filtering for approximate nearest neighbor search in metric spaces. Traditionally, sketches have been prioritized by the Hamming distance between the query’s sketch and the sketches of the points to be searched; recent research has introduced asymmetric distances, which achieve higher recall rates with fewer candidates. In addition, sketch enumeration methods have been proposed that speed up filtering by selecting high-priority solution candidates in order of sketch priority with respect to the given query, without direct sketch comparisons. Our primary goal in this paper is to further accelerate sketch enumeration through parallel processing. While Hamming distance-based enumeration can be parallelized relatively easily, achieving high recall rates with it requires a large number of candidates, so speeding up the filtering alone is insufficient to accelerate the overall similarity search. Therefore, we introduce the conjunctive enumeration method, which concatenates two Hamming distance-based enumerations to approximate asymmetric distance-based enumeration. We then validate the effectiveness of the proposed method through experiments on large-scale public datasets. Our approach offers a significant acceleration effect, thereby enhancing the efficiency of similarity search operations.
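As a rough illustration of the Hamming distance-based enumeration that the abstract takes as its starting point, the following minimal sketch enumerates bit-string sketches in non-decreasing Hamming distance from a query sketch and collects candidates from a bucket table, so that no stored sketch is compared with the query directly. It is not the authors' implementation: the function names, the dictionary-based bucket table, and the integer encoding of sketches are all assumptions made for this example, and the conjunctive and asymmetric distance-based enumerations are not shown.

```python
from itertools import combinations


def hamming_enumeration(query_sketch, width, max_radius):
    """Yield every width-bit sketch within max_radius of query_sketch,
    in non-decreasing Hamming distance (hypothetical helper)."""
    for radius in range(max_radius + 1):
        # Choose which bit positions to flip for this radius.
        for positions in combinations(range(width), radius):
            mask = 0
            for p in positions:
                mask |= 1 << p
            yield query_sketch ^ mask


def filter_by_enumeration(query_sketch, width, buckets, num_candidates):
    """Collect up to num_candidates point ids by visiting buckets in
    enumeration order. `buckets` maps a sketch (int) to a list of point
    ids; this table layout is an assumption for illustration."""
    candidates = []
    for sketch in hamming_enumeration(query_sketch, width, width):
        candidates.extend(buckets.get(sketch, []))
        if len(candidates) >= num_candidates:
            break
    return candidates[:num_candidates]


# Toy data: 3-bit sketches, four points spread over three buckets.
buckets = {0b101: [0], 0b100: [1, 2], 0b010: [3]}
print(filter_by_enumeration(0b101, 3, buckets, 3))  # → [0, 1, 2]
```

Each radius in this sketch is independent of the others, which gives some intuition for why this kind of enumeration lends itself to parallel processing; the candidate count needed for a given recall is the part this simple ordering does not address.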