Nearest-neighbor Search from Large Datasets using Narrow Sketches

Naoya Higuchi; Yasunobu Imamura; Vladimir Mic; Takeshi Shinohara; Kouichi Hirata; Tetsuji Kuboyama

Topics: Data Mining and Algorithms for Big Data; Information Retrieval

In Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods ICPRAM - Volume 1, 401-410, 2022

Authors: Naoya Higuchi ¹ ; Yasunobu Imamura ² ; Vladimir Mic ³ ; Takeshi Shinohara ⁴ ; Kouichi Hirata ⁴ and Tetsuji Kuboyama ⁵

Affiliations: ¹ Sojo University, Ikeda 4-22-1, Kumamoto 860-0082, Japan ; ² THIRD INC., Shinjuku, Tokyo 160-0004, Japan ; ³ Masaryk University, Brno, Czech Republic ; ⁴ Kyushu Institute of Technology, Kawazu 680-4, Iizuka 820-8502, Japan ; ⁵ Gakushuin University, Mejiro 1-5-1, Toshima, Tokyo 171-8588, Japan

Keyword(s): Narrow Sketch, Nearest-neighbor Search, Large Dataset, Sketch Enumeration, Partially Restored Distance.

Abstract: We consider nearest-neighbor search on large-scale, high-dimensional datasets that cannot fit in main memory. Sketches are bit strings that compactly represent data points. Although wide sketches are usually thought to be necessary for high-precision search, we use relatively narrow sketches, such as 22-bit or 24-bit ones, to select a small set of candidates for the search. As the criterion for candidate selection, we use an asymmetric distance between data points and sketches instead of the traditionally used Hamming distance. This distance can be regarded as partially restoring the quantization error. To select candidates quickly, we enumerate sketches efficiently, one by one, in order of the partially restored distance. We demonstrate the effectiveness of the method on two datasets: YFCC100M-HNfc6, consisting of about 100 million 4,096-dimensional image descriptors, and DEEP1B, consisting of 1 billion 96-dimensional vectors. Using a standard desktop computer, we conducted nearest-neighbor searches for queries on datasets stored on an SSD, where vectors are represented by 8-bit integers. The proposed method executes a search in 5.8 seconds on the 400 GB dataset YFCC100M and in 0.24 seconds on the 100 GB dataset DEEP1B, while maintaining a recall of 90%.
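The abstract does not spell out how sketches or the asymmetric distance are computed, so the following is only a minimal sketch of the two-stage idea (select candidates by sketch, then verify with exact distances). It assumes ball-partitioning sketches, where bit i records whether a point lies within a radius of pivot i, and it assumes a score that adds |d(query, pivot_i) - radius_i| for every bit in which the query's sketch and a stored sketch differ. All function names, the pivot choice, and the toy data are hypothetical. The example also scores every stored sketch by brute force, whereas the paper enumerates sketches one by one in score order.

```python
import math
import random


def make_sketch(point, pivots, radii):
    """Assumed construction: bit i is True when the point lies inside ball i."""
    return tuple(math.dist(point, p) <= r for p, r in zip(pivots, radii))


def asymmetric_score(query, sketch, pivots, radii):
    """Compare the raw query (not its sketch) against a stored sketch.

    For each bit where the query falls on the other side of the ball
    boundary, add the query's distance to that boundary. Plain Hamming
    distance would add 1 for each such bit instead.
    """
    score = 0.0
    for bit, p, r in zip(sketch, pivots, radii):
        d = math.dist(query, p)
        if (d <= r) != bit:
            score += abs(d - r)
    return score


def search(query, data, sketches, pivots, radii, n_candidates):
    """Keep the n_candidates best-scoring points, then verify them exactly."""
    order = sorted(
        range(len(data)),
        key=lambda i: asymmetric_score(query, sketches[i], pivots, radii),
    )
    candidates = order[:n_candidates]
    return min(candidates, key=lambda i: math.dist(query, data[i]))


# Toy data: 200 points in 16 dimensions with 8-bit sketches (far smaller
# than the datasets and sketch widths reported in the abstract).
random.seed(0)
data = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
pivots = random.sample(data, 8)
# Radius = an approximately median distance, so each bit splits the data roughly in half.
radii = [sorted(math.dist(x, p) for x in data)[len(data) // 2] for p in pivots]
sketches = [make_sketch(x, pivots, radii) for x in data]

query = list(data[7])
best = search(query, data, sketches, pivots, radii, n_candidates=20)
print(best)
```

Only the candidate set needs exact distance computations, which is why a small candidate set matters when the full vectors live on SSD rather than in main memory.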

CC BY-NC-ND 4.0


Paper citation in several formats:

Higuchi, N., Imamura, Y., Mic, V., Shinohara, T., Hirata, K. and Kuboyama, T. (2022). Nearest-neighbor Search from Large Datasets using Narrow Sketches. In Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods - ICPRAM; ISBN 978-989-758-549-4; ISSN 2184-4313, SciTePress, pages 401-410. DOI: 10.5220/0010817600003122

@conference{icpram22,
author={Naoya Higuchi and Yasunobu Imamura and Vladimir Mic and Takeshi Shinohara and Kouichi Hirata and Tetsuji Kuboyama},
title={Nearest-neighbor Search from Large Datasets using Narrow Sketches},
booktitle={Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods - ICPRAM},
year={2022},
pages={401-410},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010817600003122},
isbn={978-989-758-549-4},
issn={2184-4313},
}

TY - CONF
JO - Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods - ICPRAM
TI - Nearest-neighbor Search from Large Datasets using Narrow Sketches
SN - 978-989-758-549-4
IS - 2184-4313
AU - Higuchi, N.
AU - Imamura, Y.
AU - Mic, V.
AU - Shinohara, T.
AU - Hirata, K.
AU - Kuboyama, T.
PY - 2022
SP - 401
EP - 410
DO - 10.5220/0010817600003122
PB - SciTePress