agent’s decision to create new and non-trivial routing and spectrum assignment protocols for optical networks.
5 CONCLUSIONS AND FINAL REMARKS
We presented DREAM-ON GYM, a new deep reinforcement learning (DRL) framework for optical networks. The framework allows DRL to be applied in a straightforward and versatile manner to resource allocation problems in optical network architectures, such as routing, spectrum or wavelength allocation, and band or core selection in multiband or multicore architectures. To this end, we provide a set of functions and modules that let agents and environments interact to train the models. The application relies on adapting the Flex Net Sim simulator to train and evaluate the agents. In this way, we reduce the time and complexity of implementing and evaluating DRL for optical network problems.
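To give a concrete flavor of the agent–environment interaction such a framework builds on, the sketch below implements a deliberately tiny single-link spectrum-assignment environment following the Gymnasium-style `reset()`/`step()` interface. Everything here (`ToySpectrumEnv`, its parameters, and the reward scheme) is an illustrative assumption, not the DREAM-ON GYM or Flex Net Sim API.

```python
import random

class ToySpectrumEnv:
    """Tiny single-link spectrum-assignment environment with a
    Gymnasium-style reset()/step() interface. Illustrative only:
    names and reward scheme are not the DREAM-ON GYM API."""

    def __init__(self, num_slots=8, episode_len=20, seed=0):
        self.num_slots = num_slots
        self.episode_len = episode_len
        self.rng = random.Random(seed)

    def reset(self):
        """Start an episode with an empty link; return (observation, info)."""
        self.slots = [0] * self.num_slots   # 0 = free, 1 = occupied
        self.t = 0
        return list(self.slots), {}

    def step(self, action):
        """Try to place the incoming one-slot request at slot `action`.
        Returns (observation, reward, terminated, truncated, info)."""
        self.t += 1
        if self.slots[action] == 0:
            self.slots[action] = 1
            reward = 1.0                    # request allocated
        else:
            reward = -1.0                   # collision: request blocked
        # Active connections depart at random, freeing their slots.
        for i in range(self.num_slots):
            if self.slots[i] and self.rng.random() < 0.3:
                self.slots[i] = 0
        terminated = self.t >= self.episode_len
        return list(self.slots), reward, terminated, False, {}


# Usage: a first-fit "agent" interacting with the environment.
env = ToySpectrumEnv()
obs, _ = env.reset()
done, total = False, 0.0
while not done:
    action = obs.index(0) if 0 in obs else 0   # first free slot, else slot 0
    obs, reward, done, _, _ = env.step(action)
    total += reward
print("episode return:", total)
```

A DRL agent would replace the hand-coded first-fit rule in the loop above with a learned policy, which is precisely the substitution a framework of this kind automates.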
We exemplified the usability of our framework through a path-selection task solved by three different agents in an elastic optical network. This example demonstrated the ease of use of the tool and showed the differences in performance among the three agents. In addition, by creating an app, users can build an application for simple training and evaluation in a given optical network context.
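To illustrate the kind of agent comparison such an example performs, the self-contained sketch below contrasts two simple path-selection policies (uniform random choice vs. least-loaded path) on a toy three-path network and reports their blocking ratios. The traffic model and all function names are illustrative assumptions, not the agents, topology, or metrics used in the paper.

```python
import random

def simulate(policy, num_requests=4000, num_paths=3, capacity=4, seed=1):
    """Toy dynamic-traffic model: each arriving connection picks one of
    `num_paths` candidate paths; a path carries at most `capacity`
    simultaneous connections. Held connections depart at random.
    Purely illustrative; not the paper's simulation model."""
    rng = random.Random(seed)
    load = [0] * num_paths          # active connections per path
    blocked = 0
    for _ in range(num_requests):
        # Random departures before each arrival (25% chance per connection).
        for p in range(num_paths):
            for _ in range(load[p]):
                if rng.random() < 0.25:
                    load[p] -= 1
        p = policy(load, rng)
        if load[p] < capacity:
            load[p] += 1            # connection accepted on path p
        else:
            blocked += 1            # chosen path is full: request blocked
    return blocked / num_requests

def random_agent(load, rng):
    """Pick a candidate path uniformly at random."""
    return rng.randrange(len(load))

def least_loaded_agent(load, rng):
    """Pick the path with the fewest active connections."""
    return min(range(len(load)), key=load.__getitem__)

if __name__ == "__main__":
    print("random:      ", simulate(random_agent))
    print("least loaded:", simulate(least_loaded_agent))
```

Running both policies through the same simulator and comparing blocking ratios mirrors, at toy scale, how trained DRL agents can be benchmarked against each other and against heuristics within a common environment.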
In future work, we will use the framework to support interpretability and generalization of the models while training and evaluating DRL in optical networks, and we will add new capabilities to the framework, such as support for survivability problems.
ACKNOWLEDGEMENTS
Financial support from FONDECYT Iniciación 11220650 is gratefully acknowledged.
REFERENCES
Akiba, T., Sano, S., Yanase, T., Ohta, T., and Koyama, M. (2019). Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2623–2631.
Chen, X., Li, B., Proietti, R., Lu, H., Zhu, Z., and Yoo, S. B. (2019). DeepRMSA: A deep reinforcement learning framework for routing, modulation and spectrum assignment in elastic optical networks. Journal of Lightwave Technology, 37(16):4155–4163.
El Sheikh, N. E. D., Paz, E., Pinto, J., and Beghelli, A. (2021). Multi-band provisioning in dynamic elastic optical networks: a comparative study of a heuristic and a deep reinforcement learning approach. In 2021 International Conference on Optical Network Design and Modeling (ONDM), pages 1–3. IEEE.
Falcón, F., España, G., and Bórquez-Paredes, D. (2021). Flex Net Sim: A lightly manual.
Fan, J., Wang, Z., Xie, Y., and Yang, Z. (2020). A theoretical analysis of deep Q-learning. In Learning for Dynamics and Control, pages 486–489. PMLR.
Hill, A., Raffin, A., Ernestus, M., Gleave, A., Kanervisto, A., Traore, R., Dhariwal, P., Hesse, C., Klimov, O., Nichol, A., Plappert, M., Radford, A., Schulman, J., Sidor, S., and Wu, Y. (2018). Stable Baselines. https://github.com/hill-a/stable-baselines.
Klinkowski, M., Lechowicz, P., and Walkowiak, K. (2018). Survey of resource allocation schemes and algorithms in spectrally-spatially flexible optical networking. Optical Switching and Networking, 27:58–78.
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., and Kavukcuoglu, K. (2016). Asynchronous methods for deep reinforcement learning. In International Conference on Machine Learning, pages 1928–1937. PMLR.
Morales, P., Franco, P., Lozada, A., Jara, N., Calderón, F., Pinto-Ríos, J., and Leiva, A. (2021). Multi-band environments for optical reinforcement learning gym for resource allocation in elastic optical networks. In 2021 International Conference on Optical Network Design and Modeling (ONDM), pages 1–6. IEEE.
Naeem, M., Rizvi, S. T. H., and Coronato, A. (2020). A gentle introduction to reinforcement learning and its application in different fields. IEEE Access, 8:209320–209344.
Natalino, C. and Monti, P. (2020). The Optical RL-Gym: An open-source toolkit for applying reinforcement learning in optical networks. In 2020 22nd International Conference on Transparent Optical Networks (ICTON), pages 1–5. IEEE.
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.
Towers, M., Terry, J. K., Kwiatkowski, A., Balis, J. U., Cola, G. d., Deleu, T., Goulão, M., Kallinteris, A., KG, A., Krimmel, M., Perez-Vicente, R., Pierré, A., Schulhoff, S., Tai, J. J., Shen, A. T. J., and Younis, O. G. (2023). Gymnasium.
Zhang, Y., Xin, J., Li, X., and Huang, S. (2020). Overview on routing and resource allocation based machine learning in optical networks. Optical Fiber Technology, 60:102355.
Zitkovich, M., Saavedra, G., and Bórquez-Paredes, D. (2023). Event-oriented simulation module for dynamic elastic optical networks with space division multiplexing. In Proceedings of the 13th International Conference on Simulation and Modeling Methodologies, Technologies and Applications (SIMULTECH), volume 1, pages 295–302.
SIMULTECH 2024 - 14th International Conference on Simulation and Modeling Methodologies, Technologies and Applications