
Table 3: Setting of the performance evaluation of the data stream processing system.

Configuration   Latency improvement (in %)   Throughput improvement (in %)
C4              27.05                        55
C5              28.2                         163.33
increases garbage collection time, extending job duration for fewer executors.
Table 3 shows that doubling cores and RAM reduces latency by roughly 27-28%
and significantly improves throughput, with an increase of up to 163.33% in C5,
highlighting the impact of vertical scaling.
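As a rough illustration of how percentages such as those in Table 3 can be derived, the sketch below applies the standard relative-change formula to a baseline and a scaled measurement. The paper does not state its exact formula, so this is an assumption, and the function names and sample measurements are hypothetical values chosen only to reproduce the C5 figures.

```python
def latency_improvement_pct(baseline: float, scaled: float) -> float:
    """Percentage reduction in latency relative to the baseline (lower latency is better)."""
    if baseline <= 0:
        raise ValueError("baseline latency must be positive")
    return (baseline - scaled) / baseline * 100.0


def throughput_improvement_pct(baseline: float, scaled: float) -> float:
    """Percentage increase in throughput relative to the baseline (higher throughput is better)."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return (scaled - baseline) / baseline * 100.0


# Hypothetical measurements (not taken from the paper), picked so that the
# results match the C5 row of Table 3 after rounding to two decimals.
c5_latency_gain = round(latency_improvement_pct(100.0, 71.8), 2)
c5_throughput_gain = round(throughput_improvement_pct(30.0, 79.0), 2)
print(c5_latency_gain, c5_throughput_gain)
```

Keeping the two directions in separate functions avoids sign mistakes, since an improvement means a decrease for latency but an increase for throughput.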
In summary, the balanced configuration achieves the highest throughput with
the lowest latency, with vertical scaling offering the best performance gains.
Batch interval prioritization depends on application needs, as larger intervals
increase both latency and throughput. With current hardware and a 30-second
batch interval, the system supports 140 users for continuous HAR data processing.
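The relation between batch interval and supported users can be sketched with a simple stability condition: a micro-batch system keeps up only while each batch is processed within one batch interval. The example below assumes a linear cost model (a fixed overhead plus a constant cost per user). Both the model and the numeric parameters are hypothetical and are not measurements from the paper. They are chosen only so that a 30-second interval yields 140 users.

```python
def max_supported_users(batch_interval_ms: int,
                        fixed_overhead_ms: int,
                        per_user_cost_ms: int) -> int:
    """Largest user count whose batch still finishes within one batch interval.

    Assumes processing time = fixed_overhead_ms + per_user_cost_ms * users
    (a hypothetical linear model, not taken from the paper).
    """
    if per_user_cost_ms <= 0:
        raise ValueError("per-user cost must be positive")
    budget_ms = batch_interval_ms - fixed_overhead_ms
    if budget_ms <= 0:
        return 0  # the overhead alone already exceeds the interval
    return budget_ms // per_user_cost_ms


# Hypothetical parameters: 2 s scheduling overhead, 200 ms per user and batch.
users_at_30s = max_supported_users(30_000, 2_000, 200)
print(users_at_30s)
```

Under this model a longer batch interval admits more users per batch, but every record also waits longer before it is processed, which mirrors the latency and throughput trade-off described above.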
The Spark-based data stream processing framework achieves significant
performance improvements, with 28.2% lower latency and 163.33% higher
throughput using 28 cores and 28 GB RAM. Reducing latency in cloud computing
is challenging due to connectivity dependencies, while edge computing (e.g.,
smartphones) simplifies latency reduction. Throughput improvements depend on
edge device capacity but are affected in cloud computing by internet transfer
delays, which can hinder real-time applications.
With online servers offering 64 cores and 64 GB RAM costing around €300 per
month and supporting 300 to 400 users, compute and storage costs remain
significant but manageable. Reducing pre-processing on edge devices while
running intensive analytics on central servers could be a viable future approach.
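The server cost figures above translate into a per-user monthly cost by simple division. The short sketch below makes that arithmetic explicit. It uses only the numbers quoted in the text (around €300 per month, 300 to 400 users), and the function name is hypothetical.

```python
def monthly_cost_per_user(server_cost_eur: float, users: int) -> float:
    """Server rental cost per user and month, in euros."""
    if users <= 0:
        raise ValueError("user count must be positive")
    return server_cost_eur / users


# Bounds implied by the figures quoted in the text.
best_case = monthly_cost_per_user(300.0, 400)   # server fully utilised
worst_case = monthly_cost_per_user(300.0, 300)  # lower end of the user range
print(f"{best_case:.2f} to {worst_case:.2f} EUR per user and month")
```

This estimate covers server rental only. Any further operating costs are outside what the text quantifies.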
ACKNOWLEDGEMENTS
This work is supported by the German Federal Ministry of Education and
Research (BMBF) within the Junior research group "Integration and analysis of
multimodal sensor signals for research into neurological movement disorders"
(MoveGroup) at the University of Lübeck (grant number: 01ZZ2007).
HEALTHINF 2025 - 18th International Conference on Health Informatics