eration. E.g., as data sources join or leave the
federation, new data may appear or disappear,
which brings new knowledge.
• Benchmarks: benchmarks such as PolyBench
(Karimov et al., 2018) should be developed to
evaluate federations, covering store combinations
and the processing of multiple query languages.
5 CONCLUSION
Modern applications require the manipulation of
structured and unstructured data, usually in high vol-
ume, over distributed and heterogeneous data sources.
The pattern “one size fits all” does not hold any-
more. Thus, innovative solutions capable of access-
ing and manipulating data in such an environment are
required.
This work presented the state of the art, detailing
the solutions, their main components, how queries are
specified and executed, and other features. Afterward,
we presented guidelines and challenges the solutions
should address. Researchers and practitioners can use
our findings to focus their work.
As future work, we aim to evaluate the tools in
practice through case studies and experiments, to
identify the extent to which they meet the challenges
and to uncover new open issues. We also intend to
perform a systematic review to provide a broader
analysis that improves on the overview presented in
this work.
REFERENCES
Angles, R. and Gutierrez, C. (2008). Survey of graph
database models. ACM Comp. Surveys, 40(1):1–39.
Bondiombouy, C. and Valduriez, P. (2016). Query process-
ing in multistore systems: an overview. Research Re-
port RR-8890, INRIA.
Elmore, A., Duggan, J., Stonebraker, M., Balazinska, M.,
Cetintemel, U., Gadepally, V., Heer, J., Howe, B.,
Kepner, J., Kraska, T., et al. (2015). A demonstra-
tion of the bigdawg polystore system. Proceedings of
the VLDB Endowment, 8(12):1908–1911.
Guimarães, P. and Pereira, J. (2015). X-ray: Monitoring
and analysis of distributed database queries. In IFIP
International Conference on Distributed Applications
and Interoperable Systems, pages 80–93. Springer.
Gurusamy, V., Kannan, S., and Nandhini, K. (2017). The
Real Time Big Data Processing Framework: Advan-
tages and Limitations. Intl. Journal of Computer Sci-
ences and Engineering (JCSE), 5(12):305–312.
Halperin, D., Teixeira de Almeida, V., Choo, L. L., et al.
(2014). Demonstration of the myria big data manage-
ment service. In Proceedings of the 2014 ACM SIG-
MOD Intl. Conf. on Mngt. of Data, pages 881–884.
Haslhofer, B., Momeni Roochi, E., Schandl, B., and Zan-
der, S. (2011). Europeana rdf store report. Technical
report, University of Vienna.
Hausenblas, M. and Nadeau, J. (2013). Apache drill: in-
teractive ad-hoc analysis at scale. Big data, 1(2):100–
104.
Iancu, B. and Georgescu, T. M. (2018). Saving Large Se-
mantic Data in Cloud: A Survey of the Main DBaaS
Solutions. Informatica Economica, 22(1).
Karimov, J., Rabl, T., and Markl, V. (2018). Polybench: The
first benchmark for polystores. In Technology Confer-
ence on Performance Evaluation and Benchmarking,
pages 24–41. Springer.
Kolev, B., Bondiombouy, C., Valduriez, P., Jiménez-Peris,
R., Pau, R., and Pereira, J. (2016a). The CloudMdsQL
Multistore System. In Proc. of Intl. Conf. on Manage-
ment of Data (SIGMOD’16), pages 2113–2116. ACM.
Kolev, B., Valduriez, P., Bondiombouy, C., Jiménez-Peris,
R., Pau, R., and Pereira, J. (2016b). CloudMd-
sQL: Querying Heterogeneous Cloud Data Stores
with a Common Language. Distributed and parallel
databases, 34(4):463–503.
Lenzerini, M. (2002). Data integration: A theoretical per-
spective. In Proceedings of the Twenty-First ACM
SIGMOD-SIGACT-SIGART Symposium on Principles
of Database Systems, pages 233–246. ACM.
Modoni, G. E., Sacco, M., and Terkaj, W. (2014). A survey
of rdf store solutions. In Intl. Conf. on Engineering,
Technology and Innovation (ICE), pages 1–7. IEEE.
Nayak, A., Poriya, A., and Poojary, D. (2013). Type of
NOSQL databases and its comparison with relational
databases. International Journal of Applied Informa-
tion Systems, 5(4):16–19.
Özsu, M. T. and Valduriez, P. (2020). Principles of distributed
database systems. Springer, 4th edition.
Sheth, A. P. and Larson, J. A. (1990). Federated database
systems for managing distributed, heterogeneous, and
autonomous databases. ACM Computing Surveys
(CSUR), 22(3):183–236.
Stonebraker, M. (2015). The case for polystore.
https://wp.sigmod.org/?p=1629.
Stonebraker, M., Bear, C., Çetintemel, U., Cherniack, M.,
Ge, T., Hachem, N., Harizopoulos, S., Lifter, J.,
Rogers, J., and Zdonik, S. (2007). One size fits all?
part 2: Benchmarking results. In Proc. CIDR.
Tan, R., Chirkova, R., Gadepally, V., and Mattson, T. G.
(2017). Enabling query processing across heteroge-
neous data models: A survey. In IEEE Intl. Conf. on
Big Data (Big Data), pages 3211–3220. IEEE.
Zheng, Z., Wang, P., Liu, J., and Sun, S. (2015). Real-Time
Big Data Processing Framework: Challenges and So-
lutions. Applied Math. & Inf. Sciences, 9(6):3169.
Zulkefli, N. S. S., Rahman, N. A., Bakar, Z. A., Nordin, S.,
Sembok, T. M. T., and Teo, N. H. I. (2013). Evalua-
tion of triple indices in retrieving web documents. In
Intl. Conf. on Advanced Computer Science Applica-
tions and Technologies (ACSAT), pages 525–529.
Modern Federated Database Systems: An Overview