
The problem of instability in cascaded controllers
also extends to networked control systems. To take an
example from the setting presented in this paper,
several temperature control units often send flow
setpoints to multiple flow control valves that are
connected to the same hydraulic circuit. The valves are
therefore subject to unknown physical coupling effects,
which render any single-agent control law suboptimal
and potentially destabilizing. It may thus be worthwhile
to learn the physical dependencies between the cascaded
controllers and to leverage this information to
stabilize such controller networks. This points directly
toward multi-agent reinforcement learning. We propose
to continue research on these ideas and to extend the
framework presented in this work accordingly.
ACKNOWLEDGEMENTS
We are indebted to Volkher Scholz for his many
invaluable ideas and insights, which helped shape the
contents of this paper, and for proofreading several
drafts. We further extend our gratitude to Stefan
Mischler for his support and helpful advice, which
enabled us to achieve the presented results. We would
also be remiss not to thank the team at MathWorks, who
assisted us with many clarifications and much support.
Last but not least, we would like to thank our
colleagues for many stimulating and motivating
discussions.
SIMULTECH 2024 - 14th International Conference on Simulation and Modeling Methodologies, Technologies and Applications