
gorithm based on Long Short-Term Memory (LSTM) to improve the end-to-end latency for cloud-native applications.
Libra (Balla et al., 2020) is an adaptive autoscaler that automatically detects the optimal resource set for a single Pod and then manages the horizontal scaling process. Additionally, if the load or the underlying virtualized environment changes, Libra adapts the resource definition of the Pod and adjusts the horizontal scaling process accordingly.
In (Yuan and Liao, 2024), the authors propose a predictive autoscaling Kubernetes operator based on time series forecasting algorithms, aimed at dynamically adjusting the number of running instances in the cluster to optimize resource management. In this work, two robust time series forecasting algorithms, the Holt–Winters method and the Gated Recurrent Unit (GRU) neural network, are employed and dynamically managed.
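As a purely illustrative sketch (not the operator from (Yuan and Liao, 2024)), the following Python fragment shows how a Holt–Winters forecast of the incoming request rate could be turned into a replica count; the per-Pod capacity, the forecast horizon, and the seasonal period are hypothetical assumptions.

```python
# Illustrative only: forecast the request rate with Holt-Winters and size the
# deployment for the expected peak. Capacity, horizon, and seasonality are
# assumptions, not values from (Yuan and Liao, 2024).
import math
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def forecast_replicas(req_per_sec: pd.Series, pod_capacity_rps: float = 50.0,
                      horizon: int = 5) -> int:
    model = ExponentialSmoothing(req_per_sec, trend="add", seasonal="add",
                                 seasonal_periods=24).fit()
    peak = model.forecast(horizon).max()
    # At least one replica, and enough to absorb the forecast peak.
    return max(1, math.ceil(peak / pod_capacity_rps))
```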
Gwydion (Santos et al., 2025) is an autoscaler for microservices-based applications that enables different autoscaling goals through Reinforcement Learning (RL) algorithms. Gwydion is built on the OpenAI Gym library and aims to bridge the gap between RL and autoscaling research by training RL algorithms on real cloud environments with two opposing reward strategies: cost-aware and latency-aware. Gwydion focuses on improving resource usage and reducing service response time by considering microservice interdependencies when scaling horizontally.
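As a hedged illustration of the two opposing reward strategies (the function below is hypothetical and does not reproduce Gwydion's Gym environment), a reward could be shaped as follows:

```python
# Hypothetical reward shaping, not Gwydion's implementation: the cost-aware
# mode penalizes replicas first, the latency-aware mode penalizes response
# time first. SLO threshold and weights are illustrative assumptions.
def reward(mode: str, replicas: int, p95_latency_ms: float,
           slo_ms: float = 200.0, max_replicas: int = 10) -> float:
    if mode == "cost-aware":
        return -(replicas / max_replicas) - (1.0 if p95_latency_ms > slo_ms else 0.0)
    if mode == "latency-aware":
        return -(p95_latency_ms / slo_ms) - 0.1 * (replicas / max_replicas)
    raise ValueError(f"unknown reward mode: {mode}")
```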
In (Pramesti and Kistijantoro, 2022), an autoscaler based on response time prediction is proposed for microservice applications running in Kubernetes environments. The prediction function is built with a machine learning model that uses performance metrics at the microservice and node levels as features. The predicted response time is then used to calculate the number of Pods the application requires to meet the target response time.
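Only as a sketch of the idea (the predictor and its features are hypothetical, not the model from (Pramesti and Kistijantoro, 2022)), the replica count could be obtained by probing a fitted response-time predictor:

```python
# Illustrative sketch: given a fitted regressor predict_rt(metrics, replicas)
# that estimates the response time in milliseconds, return the smallest
# replica count predicted to meet the target. predict_rt is an assumption.
def pods_for_target(predict_rt, metrics: dict, target_ms: float,
                    max_replicas: int = 20) -> int:
    for n in range(1, max_replicas + 1):
        if predict_rt(metrics, n) <= target_ms:
            return n
    return max_replicas  # cap when the target cannot be met
```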
StatuScale (Wen et al., 2024) is a status-aware elastic scaling framework based on a load status detector that selects appropriate elastic scaling strategies for differentiated resource scheduling in vertical scaling. Additionally, StatuScale employs a horizontal scaling controller that uses comprehensive evaluation and resource reduction to manage the number of replicas of each microservice.
6 CONCLUSIONS
In this work, we propose extending the Kubernetes platform with a custom Pod autoscaling strategy aimed at minimizing SLO violations in the response times of containerized applications running in cloud environments, while simultaneously reducing infrastructure costs. Our primary goal is to address the limitations of the Kubernetes Horizontal Pod Autoscaler, which scales Pod replicas based on low-level resource usage metrics. This approach makes it challenging to define scaling targets that are properly correlated with the desired response time SLOs and maximum infrastructure costs. The idea is to propose a Pod autoscaling policy based on high-level metrics, such as actual application response times and infrastructure costs, to more accurately achieve the desired SLO and cost targets.
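For reference, the standard HPA control loop sizes the deployment from the ratio between the observed and the target value of a low-level metric such as average CPU utilization, with no direct knowledge of response times; the sketch below reproduces that rule with illustrative values.

```python
import math

def hpa_desired_replicas(current_replicas: int, current_metric: float,
                         target_metric: float) -> int:
    # Standard HPA rule: scale by the ratio of observed to target metric value.
    return max(1, math.ceil(current_replicas * current_metric / target_metric))

# Example: 4 replicas at 90% average CPU with a 60% target yields 6 replicas,
# regardless of whether the response time SLO is actually being met.
print(hpa_desired_replicas(4, 90, 60))  # 6
```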
For future work, we plan to enhance the efficiency of the proposed autoscaling policy by using AI and time series analysis techniques to identify patterns in user requests and predict their trends. This will enable the development of a proactive autoscaling policy that scales up the number of replicas to ensure improved service performance, while minimizing infrastructure over-provisioning and reducing unnecessary costs.
ACKNOWLEDGEMENTS
This work was partially funded by the European
Union under the Italian National Recovery and Re-
silience Plan (NRRP) of NextGenerationEU, Mission
4 Component C2 Investment 1.1 - Call for tender No.
1409 of 14/09/2022 of Italian Ministry of University
and Research - Project ”Cloud Continuum aimed at
On-Demand Services in Smart Sustainable Environ-
ments” - CUP E53D23016420001.
REFERENCES
Balla, D., Simon, C., and Maliosz, M. (2020). Adaptive scaling of Kubernetes pods. In NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium, pages 1–5.

Calcaterra, D., Di Modica, G., Mazzaglia, P., and Tomarchio, O. (2021). TORCH: a TOSCA-Based Orchestrator of Multi-Cloud Containerised Applications. Journal of Grid Computing, 19(1).

Chen, T., Bahsoon, R., and Yao, X. (2018). A survey and taxonomy of self-aware and self-adaptive cloud autoscaling systems. ACM Comput. Surv., 51(3).

Detti, A., Funari, L., and Petrucci, L. (2023). µbench: An open-source factory of benchmark microservice applications. IEEE Transactions on Parallel and Distributed Systems, 34(3):968–980.

Do, T. V., Do, N. H., Rotter, C., Lakshman, T., Biro, C., and Bérczes, T. (2025). Properties of horizontal pod autoscaling algorithms and application for scaling cloud-