
Figure 9: Latency Over Time for Each Component (During
Fault Injection).
helping to identify performance issues.
By demonstrating the effectiveness of a hybrid
approach that combines core architecture knowledge
with latency anomalies, this work contributes to the
development of more resilient and fault-tolerant
distributed systems. We demonstrated that specific
patterns can be identified, both architecture patterns
such as load balancing and anomaly change patterns
such as gradual or steep increases, and that this type
of knowledge usefully complements other monitoring
insights. We do not claim comprehensive pattern
coverage here; our aim was a proof-of-concept
solution to demonstrate the utility of the approach.
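The distinction between gradual and steep latency increases can be illustrated with a minimal sketch. This is not the implementation used in this work: the function name `classify_latency_change`, the window size, and both thresholds are hypothetical choices made only for illustration.

```python
from statistics import mean


def classify_latency_change(samples, steep_ratio=0.5, drift_ratio=0.2):
    """Label a latency series as 'steep', 'gradual', or 'stable'.

    samples: latency readings (e.g. in ms) in time order, at least 4 points.
    steep_ratio, drift_ratio: hypothetical thresholds relative to the baseline.
    """
    if len(samples) < 4:
        raise ValueError("need at least 4 samples")
    quarter = max(1, len(samples) // 4)
    baseline = mean(samples[:quarter])   # average of the earliest window
    recent = mean(samples[-quarter:])    # average of the latest window
    if baseline <= 0:
        raise ValueError("baseline latency must be positive")
    # Largest rise between consecutive samples, relative to the baseline.
    max_step = max(b - a for a, b in zip(samples, samples[1:])) / baseline
    # Overall rise from the earliest window to the latest window.
    drift = (recent - baseline) / baseline
    if max_step >= steep_ratio:
        return "steep"
    if drift >= drift_ratio:
        return "gradual"
    return "stable"
```

With these assumed thresholds, a series that climbs by a few milliseconds per sample is labelled gradual, while a series that jumps by half the baseline or more between two consecutive samples is labelled steep. A single noisy spike would also be labelled steep, so a practical detector would likely smooth the series first.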
Future research will extend automation, e.g., by
adding remediation processes that complement our
anomaly detection in root cause analyses (Pahl, 2023).
Currently, the core of knowledge mining and anomaly
detection is automated, but not all pattern types are
covered, and the remaining ones still rely on manual
classification.
REFERENCES
Ahmed, M., Mahmood, A., and Hu, J. (2016). A survey
of network anomaly detection techniques. Journal of
Network and Computer Applications.
Azimi, S. and Pahl, C. (2024). Anomaly analytics in
data-driven machine learning applications. International
Journal of Data Science and Analytics, pages 1–26.
Dragoni, N., Lanese, I., Larsen, S., Mazzara, M., Mustafin,
R., and Safina, L. (2017). Microservices: yesterday,
today, and tomorrow. In Present & Ulterior SW Eng.
Fonseca, R., Porter, G., Katz, R. H., Shenker, S., and
Stoica, I. (2007). X-Trace: A pervasive network tracing
framework. In USENIX Symposium.
Forsberg, V. (2019). Automatic anomaly detection and root
cause analysis for microservice clusters. Master’s
thesis, Umeå University.
Ikram, A., Chakraborty, S., Mitra, S., Saini, S. K., Bagchi,
S., and Kocaoglu, M. (2022). Root cause analysis of
failures in microservices through causal discovery. In
International Conference on Software Engineering.
Komosny, D. (2022). General internet service assessment
by latency including partial measurements. PeerJ
Comput Sci, 8:e1072.
Li, Z., Chen, J., Jiao, R., Zhao, N., Wang, Z., Zhang, S.,
Wu, Y., Jiang, L., Yan, L., Wang, Z., Chen, Z., Zhang,
W., Nie, X., Su, K., and Pei, D. (2021). Practical root
cause localization for microservice systems via trace
analysis. In Intl Symp on Quality of Service.
Merkel, D. (2014). Docker: lightweight Linux containers for
consistent development and deployment. Linux Jrnl.
Mohamed, H. and El-Gayar, O. (2021). End-to-end latency
prediction of microservices workflow on Kubernetes:
A comparative evaluation of machine learning models
and resource metrics. In HICSS.
Pahl, C. (2023). Research challenges for machine learning-
constructed software. Service Oriented Computing
and Applications, 17(1):1–4.
Poonam, S. and Sangwan, S. (2020). A comparative study
of various load balancing algorithms in cloud
computing environment. Intl Jrnl of Advanced Research in
Engineering and Technology, 11(12).
Reed, D. P. and Perigo, L. (2024). Measuring ISP
performance in Broadband America: A study of latency
under load. In University of Colorado Boulder.
Samir, A. and Pahl, C. (2020). Detecting and localizing
anomalies in container clusters using Markov models.
Electronics.
Scolati, R., Fronza, I., El Ioini, N., Samir, A., Barzegar,
H. R., and Pahl, C. (2020). A containerized edge cloud
architecture for data stream processing. In Lecture
Notes in Computer Science. Springer.
Sigelman, B. H., Barroso, L. A., Burrows, M., Stephenson,
P., Plakal, M., Beaver, D., Jaspan, S., and Shanbhag,
C. (2010). Dapper, a large-scale distributed systems
tracing infrastructure. Google research.
Sundberg S., Brunstrom A., F.-R. S. and S., C. (2024).
Measuring network latency from a wireless ISP: Variations
within and across subnets. Preprint.
von Leon, D., Miori, L., Sanin, J., El Ioini, N., Helmer, S.,
and Pahl, C. A lightweight container middleware for
edge cloud architectures. Fog and Edge Computing:
Principles and Paradigms.
Wang, R., Qiu, H., Cheng, X., and Liu, X. (2023). Anomaly
detection with a container-based stream processing
framework for industrial internet of things. Journal
of Industrial Information Integration, 35.
Yu, G., Chen, P., Chen, H., Guan, Z., Huang, Z., Jing, L.,
Weng, T., Sun, X., and Li, X. (2024). MicroRank:
End-to-end latency issue localization with extended
spectrum analysis in microservice environments. In
Conference on Computer Communications.
Anomaly Detection for Partially Observable Container Systems Based on Architecture Profiling