Q-Learning Based LQR Occupant-Centric Control of Non-Residential
Buildings
Oumaima Ait-Essi¹, Joseph J. Yamé¹,∗,ᵃ, Hicham Jamouli²,ᵇ and Frédéric Hamelin¹,ᶜ
¹ CRAN, CNRS, UMR 7039, Université de Lorraine, Campus Sciences, Vandoeuvre-lès-Nancy, France
² LISAD, ENSA, Université Ibn Zohr, Agadir, Morocco
{oumaima.ait-essi, joseph.yame, frederic.hamelin}@univ-lorraine.fr, h.jamouli@uiz.ac.ma
Keywords:
Reinforcement Learning, Q-Learning, Data-Based Linear Quadratic Control, HVAC-VAV Systems, Building
Occupants, Occupant-Centric Control.
Abstract:
We propose a novel approach to the control of variable-air-volume (VAV)-HVAC systems for the regulation of
thermal comfort in rooms of a non-residential building where the number of occupants may vary considerably
and randomly during the day. Specifically, we develop a reinforcement learning control algorithm based
on model-free optimal linear quadratic control. We leverage the quality function, the so-called Q-function,
derived from Bellman dynamic programming, to develop a learning control algorithm based solely on system-
generated data including building dynamics and its occupants. Simulations are carried out on a new HVAC-
VAV system installed in a building at the University of Lorraine, demonstrating the potential of the proposed
method for maintaining climatic conditions and the comfort of room occupants while optimizing the airflow
demand of VAV boxes, which is correlated with the energy consumed per room.
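The idea of learning a linear quadratic controller from data through the Q-function can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the algorithm developed in this paper: it assumes a scalar plant x_next = a*x + b*u with stage cost q*x^2 + r*u^2, an initial stabilizing gain k0, and a batch least-squares policy iteration in the style of classical Q-learning for LQR. All names (`q_learning_lqr`, `riccati_gain`, `solve_linear`, and the parameters `a`, `b`, `q`, `r`, `k0`) are hypothetical and chosen for illustration only. The plant coefficients are used solely to generate data and to compute a reference gain for comparison; the learner itself only sees (x, u, cost, x_next) samples.

```python
import random


def solve_linear(A, b):
    """Solve the square system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for c in range(col, n + 1):
                M[row][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def q_learning_lqr(a, b, q, r, k0, iterations=10, samples=60, seed=0):
    """Policy iteration on Q(x, u) = h_xx*x^2 + 2*h_xu*x*u + h_uu*u^2.

    Assumption: k0 must stabilize the plant (|a - b*k0| < 1).
    """
    rng = random.Random(seed)
    k = k0
    for _ in range(iterations):
        rows, costs = [], []
        x = rng.uniform(-1.0, 1.0)
        for _ in range(samples):
            # Current policy u = -k*x plus exploration noise for excitation.
            u = -k * x + rng.uniform(-1.0, 1.0)
            x_next = a * x + b * u  # the plant acts only as a data generator
            u_next = -k * x_next    # action the current policy would take next
            # Bellman equation: Q(x, u) - Q(x_next, u_next) = q*x^2 + r*u^2
            rows.append([x * x - x_next * x_next,
                         2.0 * (x * u - x_next * u_next),
                         u * u - u_next * u_next])
            costs.append(q * x * x + r * u * u)
            x = x_next
        # Policy evaluation: least squares via the 3x3 normal equations.
        gram = [[sum(row[i] * row[j] for row in rows) for j in range(3)]
                for i in range(3)]
        rhs = [sum(row[i] * c for row, c in zip(rows, costs)) for i in range(3)]
        h_xx, h_xu, h_uu = solve_linear(gram, rhs)
        # Policy improvement: minimizing Q over u gives u = -(h_xu / h_uu) * x.
        k = h_xu / h_uu
    return k


def riccati_gain(a, b, q, r, sweeps=1000):
    """Model-based reference gain from iterating the scalar Riccati equation."""
    p = q
    for _ in range(sweeps):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    learned = q_learning_lqr(a=0.9, b=0.5, q=1.0, r=1.0, k0=0.0)
    reference = riccati_gain(a=0.9, b=0.5, q=1.0, r=1.0)
    print("learned gain:", learned, "Riccati gain:", reference)
```

In this noise-free toy setting the learned gain should agree with the Riccati gain to numerical precision. A building application would involve a multivariable state, measured disturbances such as occupancy, and noisy data, none of which this sketch attempts to model.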
1 INTRODUCTION
Energy consumption in buildings accounts for over 36.9% of primary energy consumption, of which
17.2% is accounted for by commercial and non-residential buildings (EIA, 2024). Heating,
ventilation, and air conditioning (HVAC) systems account for 40% of total building energy
consumption. Ensuring occupant comfort while achieving energy savings is a key objective in
optimal building operation, as comfort plays a vital role in human well-being and productivity.
Recent contributions regarding human-building interactions highlight the impact of occupant
information, such as occupancy and behavior, on building energy consumption. The potential for
improving the operation of buildings and their control systems through such human-building
interactions is now well recognized, and has led to occupant-centric control (OCC) as an
important research topic (Soleimanijavid et al., 2024; Ouf et al., 2021; Yu
et al., 2024; Xu et al., 2023; Jia et al., 2017). Although the concept of OCC is not perfectly
defined, it can be categorized in two main ways (Ouf et al., 2021): occupant-centric controls
and occupant behavior-centric controls. In the first sense, OCC deals with the presence or
absence of occupants and with HVAC control based on occupant counts, while in the second sense
it focuses on occupant behaviors and preferences inferred from occupants' interactions with
building systems, e.g., adjusting thermostat setpoints, opening windows, exercising, etc. With
regard to the behavioral aspects of OCC, information and potential characteristics can be
extracted from energy consumption data using machine learning methods, which are subsequently
used to identify and classify typical behavior patterns (Yu et al., 2024). However, occupant
behaviors are highly stochastic and unpredictable, with temporal complexities in the process of
identifying consistent behavioral strategies (Xu et al., 2023). To meet these challenges,
reinforcement learning is increasingly becoming one of the most effective ways of developing
control strategies that take occupant behavior into account to ensure thermal comfort and
optimize building energy consumption (Han et al., 2020; Liu and Gou, 2024; Wang et al., 2023).
In the work presented here, OCC is addressed under its first categorization, as the
presence/absence and the number of occupants in a building have a direct
ᵃ https://orcid.org/0000-0002-4349-6240
ᵇ https://orcid.org/0000-0002-9064-0372
ᶜ https://orcid.org/0000-0002-5535-5680
∗ Corresponding author: joseph.yame@univ-lorraine.fr.
Ait-Essi, O., Yamé, J., Jamouli, H. and Hamelin, F.
Q-Learning Based LQR Occupant-Centric Control of Non-Residential Buildings.
DOI: 10.5220/0013066800003822
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 21st International Conference on Informatics in Control, Automation and Robotics (ICINCO 2024) - Volume 1, pages 494-502
ISBN: 978-989-758-717-7; ISSN: 2184-2809
Proceedings Copyright © 2024 by SCITEPRESS – Science and Technology Publications, Lda.