made using features that were chosen through
brainstorming and research, taking into account the
needs that companies have when using these big data
streaming platforms. Flink, Kafka, and Storm were the
platforms that achieved the best results, as they
support more of the features that we analyzed.
As future work, we intend to choose three of the
compared platforms and evaluate them with a
benchmark application. We will survey existing
benchmarks and choose the one best suited to
evaluating the platforms. The evaluation will take
into account the features compared in this work. We
then intend to choose the best platform and use it in
a real environment. An extensive quantitative
(performance) assessment of these systems would also
be worthwhile.