Authors:
William Silva; Igor Eleutério; Larissa Teixeira; Agma J. M. Traina and Caetano Traina Júnior
Affiliation:
Institute of Mathematics and Computer Sciences (ICMC), University of São Paulo, São Carlos, Brazil
Keyword(s):
Similarity Query, NoSQL, Metric Access Methods, Cloud-Based Storage, Billing Reduction.
Abstract:
Several popular cloud NoSQL data stores, such as MongoDB and Firestore, organize data as document collections. However, they provide few resources for querying complex data by similarity. The comparison conditions provided to express queries over documents are based only on identity, containment, or order relationships. Thus, reading through an entire collection is often the only way to execute a similarity query. This can be both computationally and financially expensive, because data storage licenses charge for the number of document reads and writes. This paper presents Similarity-Slim, an innovative extension for NoSQL databases, designed to reduce the financial and computational costs associated with similarity queries. The extension was evaluated on the Firestore repository as a case study, considering three application scenarios: geospatial, image recommendation and medical support systems. Experiments have shown that it can reduce costs by up to 2,800 times and speed up queries by up to 85 times.