is not as expected in most of the cases (especially in
comparison to the NoSQL alternatives).
Our experimentation has also quantified how
query selectivity affects the performance of the dif-
ferent systems. In the exact phrase matching scenario
the systems under comparison responded without any
serious fluctuation in query answering time, but in
the boolean search scenarios significant fluctuations
in query execution were observed. Additionally, most
of the examined systems present shorter query execution
times in the wildcard and boolean search scenarios.
Finally, we demonstrated that PostgreSQL requires less
insertion/indexing time than its competitors.
Overall, we conclude that NoSQL and text data
stores indeed provide a fast and trustworthy alterna-
tive for full-text search that is agnostic to the size
of the database. However, MongoDB is the slow-
est and most sensitive to parameter and query setup
among the NoSQL competitors, whereas PostgreSQL
performs (surprisingly) well in some query scenarios
(mainly wildcard and boolean search) and outperforms
some of the NoSQL competitors, especially for
small and medium database sizes.
4 FUTURE WORK
In the future, we plan to (i) expand our study with
more systems (including CouchDB, Cassandra, Sphinx
and SQL Server), query types (including proxim-
ity, fuzzy and synonyms) and parameter combina-
tions, (ii) incorporate a variety of textual content
(web pages, social media posts, emails, publications,
audio/video transcripts), and (iii) introduce a ma-
chine learning component that will enable the self-
designing of text stores depending on the dataset and
the workload in the spirit of (Chatterjee et al., 2021).
ACKNOWLEDGEMENTS
This work was supported in part by project
ENIRISST+ under grant agreement No. MIS
5047041 from the General Secretary for ERDF & CF,
under Operational Programme Competitiveness, En-
trepreneurship and Innovation 2014-2020 (EPAnEK)
of the Greek Ministry of Economy and Development
(co-financed by Greece and the EU through the Euro-
pean Regional Development Fund).
REFERENCES
AnyTXT (2021). AnyTXT Searcher: Lucene vs
Solr vs ElasticSearch, 2021.
https://anytxt.net/how-to-choose-a-full-text-search-engine/.
Brewer, E. (2012). CAP twelve years later: How the rules
have changed. Computer, 45(2).
Carvalho, I., Sá, F., and Bernardino, J. (2022). NoSQL
Document Databases Assessment: Couchbase, CouchDB,
and MongoDB. In DATA.
Čerešňák, R. and Kvet, M. (2019). Comparison of query
performance in relational a non-relation databases.
TRPRO, 40.
Chatterjee, S., Jagadeesan, M., Qin, W., and Idreos,
S. (2021). Cosine: A cloud-cost optimized self-
designing key-value storage engine. VLDB Endow-
ment, 15(1).
Fraczek, K. and Plechawska-Wojcik, M. (2017). Com-
parative Analysis of Relational and Non-relational
Databases in the Context of Performance in Web Ap-
plications. In BDAS.
Jatana, N., Puri, S., Ahuja, M., Kathuria, I., and Gosain,
D. (2012). A survey and comparison of relational and
non-relational database. IJERT, 1(6).
Li, Y. and Manoharan, S. (2013). A performance compari-
son of SQL and NoSQL databases. In IEEE PACRIM.
Lourenco, J. R., Cabral, B., Carreiro, P., Vieira, M., and
Bernardino, J. (2015). Choosing the right NoSQL
database for the job: a quality attribute evaluation. J
Big Data, 2.
Lucidworks (2019). Full Text Search Engines
vs. DBMS.
https://lucidworks.com/post/full-text-search-engines-vs-dbms/.
McCreary, D. G. and Kelly, A. M. (2013). Finding informa-
tion with NoSQL search. Manning.
Microsoft (2023). Full-text search.
https://docs.microsoft.com/en-us/sql/relational-databases/search/full-text-search?view=sql-server-ver15.
Mohamed, M., Altrafi, O., and Ismail, M. (2014). Rela-
tional vs. nosql databases: A survey. IJCIT, 3(3).
Nayak, A., Poriya, A., and Poojary, D. (2013). Type of
NOSQL databases and its comparison with relational
databases. IJAIS, 5(4).
Sahatqija, K., Ajdari, J., Zenuni, X., Raufi, B., and Is-
maili, F. (2018). Comparison between relational and
NOSQL databases. In IEEE MIPRO.
Schuler, K., Peterson, C., and Vincze, E. (2009). Data Iden-
tification and Search Techniques. Syngress.
Truica, C. O., Radulescu, F., Boicea, A., and Bucur, I.
(2015). Performance evaluation for CRUD opera-
tions in asynchronously replicated document oriented
database. In CSCS.
Tryfonopoulos, C. (2018). A Methodology for the Auto-
matic Creation of Massive Continuous Query Datasets
from Real-Life Corpora. In ICAIT.
Tryfonopoulos, C., Koubarakis, M., and Drougas, Y.
(2009). Information Filtering and Query Indexing for
an Information Retrieval Model. ACM TOIS, 27(2).
Zervakis, L., Tryfonopoulos, C., Skiadopoulos, S., and
Koubarakis, M. (2017). Query Reorganisation Al-
gorithms for Efficient Boolean Information Filtering.
IEEE TKDE, 29(2).