
method approximates the Pareto front for mapping tasks to heterogeneous processors. The Pareto front represents the set of optimal deployment models that balance the optimized objectives while meeting real-time constraints. Empirical evaluation shows that PQP performs favorably compared to genetic algorithms while also providing a solution to the refactoring problem, enabling designers to explore system configurations and adjustments efficiently.
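To make the notion of a Pareto front concrete, the following is a minimal sketch, not the paper's PQP implementation: given candidate task-to-processor mappings scored on two minimized objectives (the names and example values below are assumptions for illustration, e.g. worst-case response time and energy), it keeps only the non-dominated solutions.

```python
def pareto_front(solutions):
    """Return the non-dominated subset of `solutions`.

    Each solution is a tuple of two objective values, both to be
    minimized. A solution is dominated if some other solution is
    no worse in both objectives and differs in at least one.
    """
    front = []
    for s in solutions:
        dominated = any(
            d[0] <= s[0] and d[1] <= s[1] and d != s
            for d in solutions
        )
        if not dominated:
            front.append(s)
    return front


# Hypothetical candidate mappings scored as (response time, energy):
candidates = [(5, 10), (3, 12), (4, 8), (6, 7), (5, 9)]
print(pareto_front(candidates))  # → [(3, 12), (4, 8), (6, 7)]
```

The quadratic scan is deliberately simple; dimension-sweep algorithms compute the same front (and its hypervolume) far more efficiently on large solution sets.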
As future work, we aim to extend PQP's applicability to more diverse case studies, incorporating additional objectives and refactoring scenarios. Additionally, we plan to address the task scheduling process from a multi-objective perspective, aiming to minimize both worst-case response time and energy requirements simultaneously.
Reinforcement Learning for Multi-Objective Task Placement on Heterogeneous Architectures with Real-Time Constraints