5 CONCLUSION AND PERSPECTIVES
In this paper, we propose ways to improve the convergence of Reinforcement Learning algorithms for light signal control, with the goal of reducing energy costs while keeping the user satisfied. The setting considered in this work is one in which the user's reactions depend not only on present conditions but on the entire history of signal values. Furthermore, we restrict ourselves to a single light source and to a user whose activity does not change. We present a detailed study to determine which Reinforcement Learning algorithm is most appropriate for the problem at hand; the pursuit algorithm emerges as the best candidate for our problem.
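To make the last point concrete, the core of a pursuit learning automaton can be sketched as follows. This is a minimal illustration, not the controller used in our experiments: the candidate dimming levels and the reward function (a stand-in for the satisfaction/energy trade-off) are hypothetical, and rewards are assumed to lie in [0, 1].

```python
import random

def pursuit(actions, reward_fn, steps=5000, lam=0.01):
    """Minimal pursuit learning automaton (stateless RL).

    actions:   list of candidate actions (e.g. dimming levels)
    reward_fn: maps an action to a reward in [0, 1]
    lam:       rate at which the probability vector is pulled
               toward the action with the best estimated reward
    """
    n = len(actions)
    p = [1.0 / n] * n        # action-selection probabilities
    est = [0.0] * n          # running reward estimates
    counts = [0] * n
    for _ in range(steps):
        i = random.choices(range(n), weights=p)[0]
        r = reward_fn(actions[i])
        counts[i] += 1
        est[i] += (r - est[i]) / counts[i]   # incremental mean
        best = max(range(n), key=lambda j: est[j])
        # pursue the currently best-estimated action:
        # p <- p + lam * (e_best - p), which keeps p a distribution
        p = [pj + lam * ((1.0 if j == best else 0.0) - pj)
             for j, pj in enumerate(p)]
    return p, est

# Hypothetical reward: comfort peaks at level 0.5, energy penalizes high levels.
random.seed(0)
levels = [0.2, 0.5, 0.8, 1.0]
p, est = pursuit(levels, lambda a: max(0.0, 1.0 - abs(a - 0.5) - 0.2 * a))
```

Unlike plain probability-matching schemes, the update moves the whole probability vector toward the unit vector of the best-estimated action, which is what gives the pursuit family its fast convergence on stationary rewards.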
In future work, we plan to take into account the global lighting environment, with several light sources and a user whose activity may vary within the building. We also intend to examine the performance of other variants of the pursuit algorithm (such as the hierarchical pursuit algorithm) and to compare them with other stateless Reinforcement Learning algorithms. This comparison will allow us to draw finer conclusions about the behaviour of the algorithms on our problem, and possibly to circumvent the burden of choosing the best set of actions. We also intend to investigate other variants of the Reinforcement Learning framework, in particular state-based Reinforcement Learning; the main challenges there are choosing which elements are relevant to define the state and controlling the size of the state space. Finally, other areas of building control, such as HVAC control, might also benefit from our work, provided our approaches are extended to account for inertia phenomena and more complex energy consumption models.
Accelerated Variant of Reinforcement Learning Algorithms for Light Control with Non-stationary User Behaviour