Indexing High-Dimensional Vector Streams

João Pinheiro, Lucas Borges, Bruno Martins da Silva, Luiz Leme, Marco Casanova

2023

Abstract

This paper addresses the vector stream similarity search problem, defined as: “Given a (high-dimensional) vector q and a time interval T, find a ranked list of vectors, retrieved from a vector stream, that are similar to q and that were received in the time interval T.” The paper first introduces a family of methods, called staged vector stream similarity search methods, or briefly SVS methods, to help solve this problem. SVS methods are continuous in the sense that they do not depend on having the full set of vectors available beforehand, but adapt to the vector stream. The paper then presents experiments to assess the performance of two SVS methods, one based on product quantization, called staged IVFADC, and another based on Hierarchical Navigable Small World graphs, called staged HNSW. The experiments with staged IVFADC use well-known image datasets, while those with staged HNSW use real data. The paper concludes with a brief description of a proof-of-concept implementation of a classified ad retrieval tool that uses staged HNSW.
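To make the problem statement above concrete, the following is a minimal brute-force sketch of the query itself: filter by the time interval T, then rank by similarity to q. It is not the SVS family, staged IVFADC, or staged HNSW described in the paper; it only illustrates the input and output an index for this problem would need to reproduce. The record layout (timestamp, vector), the function names, the parameter k, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical record layout (not from the paper): (timestamp, vector).
Record = Tuple[float, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_stream(
    stream: List[Record],
    q: Sequence[float],
    t_start: float,
    t_end: float,
    k: int,
) -> List[Tuple[float, float, Sequence[float]]]:
    """Exhaustive baseline: return up to k (score, timestamp, vector)
    triples for vectors received in [t_start, t_end], best score first."""
    # Keep only the vectors received inside the time interval T.
    in_window = [(ts, v) for ts, v in stream if t_start <= ts <= t_end]
    # Score every remaining vector against the query (no index involved).
    scored = [(cosine_similarity(q, v), ts, v) for ts, v in in_window]
    # Rank by similarity, highest first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Toy stream: the last vector matches q exactly but falls outside T.
stream = [
    (1.0, [1.0, 0.0]),
    (2.0, [0.0, 1.0]),
    (3.0, [1.0, 1.0]),
    (10.0, [1.0, 0.0]),
]
result = search_stream(stream, q=[1.0, 0.0], t_start=0.0, t_end=5.0, k=2)
```

A scan like this touches every vector in the interval on each query, which is the cost that an approximate index over the stream is meant to avoid.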

Paper Citation


in Harvard Style

Pinheiro J., Borges L., Martins da Silva B., Leme L. and Casanova M. (2023). Indexing High-Dimensional Vector Streams. In Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-648-4, SciTePress, pages 32-43. DOI: 10.5220/0011758900003467


in Bibtex Style

@conference{iceis23,
author={João Pinheiro and Lucas Borges and Bruno Martins da Silva and Luiz Leme and Marco Casanova},
title={Indexing High-Dimensional Vector Streams},
booktitle={Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2023},
pages={32-43},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011758900003467},
isbn={978-989-758-648-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Indexing High-Dimensional Vector Streams
SN - 978-989-758-648-4
AU - Pinheiro J.
AU - Borges L.
AU - Martins da Silva B.
AU - Leme L.
AU - Casanova M.
PY - 2023
SP - 32
EP - 43
DO - 10.5220/0011758900003467
PB - SciTePress