Table 10: Comparison of indexing structures.
indexing      1     2     3     4     5     6     7
k′ (×10^3)    120   60    25    15    240   25    3
k′: the number of candidates needed to achieve 90% recall for DEEP1B.
Indexing structures: 1. INVADC, 2. IMI, 3. NO-IMI, 4. GNO-IMI, 5. 48-bit, 6. 96-bit, 7. 192-bit.
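As an illustration of how a quantity such as k′ could be measured, the following minimal sketch assumes that, for each query, we already know the position of its true nearest neighbor in the candidate ordering produced by an indexing structure. Under that assumption, k′ is the smallest candidate count that covers the required fraction of queries. The function name `candidates_for_recall` and the sample ranks are hypothetical and are not taken from the experiments.

```python
import math


def candidates_for_recall(nn_ranks, target_recall=0.9):
    """Smallest k' such that at least target_recall of the queries have
    their true nearest neighbor among the first k' candidates.

    nn_ranks[i] is the 1-based position of query i's true nearest
    neighbor in the candidate ordering (an assumed input format).
    """
    if not nn_ranks:
        raise ValueError("nn_ranks must not be empty")
    ordered = sorted(nn_ranks)
    # Number of queries whose nearest neighbor must be among the candidates.
    needed = max(1, math.ceil(target_recall * len(ordered)))
    return ordered[needed - 1]


# Hypothetical ranks for ten queries.
ranks = [3, 1, 40, 7, 2, 15, 5, 1, 9, 120]
k_prime = candidates_for_recall(ranks, 0.9)
```

A smaller k′ means that fewer candidates have to be checked against the original vectors to reach the same recall.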
5 CONCLUSION
Using two datasets, YFCC100M-HNfc6 and DEEP1B, consisting of 100 million and 1 billion vectors, respectively, we have demonstrated that narrow 22-bit and 24-bit sketches provide fast and accurate NN search. We have also described how difficult it is for wide sketches to outperform narrow sketches in search speed. The key to fast search is to retain the ability to narrow down the candidates while also speeding up the filtering.
One of the most important future tasks is to revise the pivot selection to make the sketching more reliable. Since the current version of pivot selection by AIR can handle only narrow sketches, we have to modify it. As we observed in Section 4.4, 96-bit and 192-bit sketches provide high-quality indexing structures even without using AIR. Therefore, we can expect AIR to find indexing structures of higher quality with sketches that are not so wide.
We also plan to build a search engine based on double-filtering with narrow and wide sketches, which can be expected to search even larger datasets than YFCC100M-HNfc6 and DEEP1B.
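The idea of double-filtering can be sketched as follows. This is only a toy illustration under stated assumptions, not the planned search engine: sketches are assumed to be bit strings built by ball partitioning with pivots (a center and a radius per bit), candidates are ranked by the Hamming distance between sketches, and all names (`ball_sketch`, `double_filter_search`, `k1`, `k2`) as well as the data are hypothetical.

```python
import heapq


def squared_euclidean(x, y):
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def ball_sketch(x, pivots):
    """Bit j is 1 if x lies inside the ball of pivot j (assumed construction)."""
    bits = 0
    for j, (center, radius) in enumerate(pivots):
        if squared_euclidean(x, center) <= radius ** 2:
            bits |= 1 << j
    return bits


def hamming(a, b):
    # Sketches are Python ints used as bit strings.
    return bin(a ^ b).count("1")


def double_filter_search(query, data, narrow_pivots, wide_pivots, k1, k2):
    """Return the index of the nearest object among the surviving candidates."""
    q_narrow = ball_sketch(query, narrow_pivots)
    q_wide = ball_sketch(query, wide_pivots)
    narrow = [ball_sketch(x, narrow_pivots) for x in data]
    wide = [ball_sketch(x, wide_pivots) for x in data]

    # Stage 1: cheap filtering of the whole dataset with narrow sketches.
    stage1 = heapq.nsmallest(
        k1, range(len(data)), key=lambda i: hamming(narrow[i], q_narrow))
    # Stage 2: finer filtering of the survivors with wide sketches.
    stage2 = heapq.nsmallest(
        k2, stage1, key=lambda i: hamming(wide[i], q_wide))
    # Final step: exact distances only for the few remaining candidates.
    return min(stage2, key=lambda i: squared_euclidean(data[i], query))


data = [(0, 0), (4, 0), (20, 20), (22, 20), (40, 0), (2, 1)]
narrow_pivots = [((0, 0), 8), ((20, 20), 8)]
wide_pivots = narrow_pivots + [((4, 0), 3), ((0, 0), 3)]
query = (1, 1)
best = double_filter_search(query, data, narrow_pivots, wide_pivots, k1=4, k2=2)
```

In this toy setting the narrow sketches discard distant objects cheaply, the wide sketches separate the remaining candidates more precisely, and the exact distance is computed for only `k2` objects.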
ACKNOWLEDGMENTS
This research was partly supported by ERDF “CyberSecurity, CyberCrime and Critical Information Infrastructures Center of Excellence” (No. CZ.02.1.01/0.0/0.0/16_019/0000822), and also by JSPS KAKENHI Grant Numbers 19H01133, 19K12125, 20H00595, 20H05962, 21H03559, and 20K20509.
ICPRAM 2022 - 11th International Conference on Pattern Recognition Applications and Methods