
7 CONCLUSION
In this paper, we introduced PenGym, a framework for creating real environments to support the training of RL agents for penetration testing purposes. By enabling the execution of real actions with real observations, PenGym eliminates the need for probabilistic modeling of actions, resulting in a more accurate representation of security dynamics compared to simulation-based environments. Since PenGym agents are trained in a cyber range, they experience actual network conditions. This approach potentially enhances the applicability of the trained agents when they are deployed in a real-world infrastructure.
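The agent–environment interaction described above follows the standard Gym-style reset/step loop. The toy stand-in below illustrates that interface only; the environment class, action names, observations, and rewards are hypothetical and do not reflect PenGym's actual API:

```python
# Toy stand-in for a Gym-style pentesting environment (hypothetical;
# not PenGym's actual API). An episode ends when the host is exploited.
import random

class ToyPentestEnv:
    """Minimal episodic environment: scan a host, then exploit it."""
    ACTIONS = ["scan", "exploit"]

    def reset(self):
        # Start a fresh episode and return the initial observation
        self.scanned = False
        self.done = False
        return {"scanned": False}

    def step(self, action):
        # Classic Gym signature: (observation, reward, done, info)
        assert action in self.ACTIONS and not self.done
        if action == "scan":
            self.scanned = True
            return {"scanned": True}, 0.0, False, {}
        # "exploit" succeeds only after a scan revealed the target
        if self.scanned:
            self.done = True
            return {"scanned": True}, 1.0, True, {}
        return {"scanned": False}, -0.1, False, {}

# A random agent interacting with the environment for one episode
env = ToyPentestEnv()
obs = env.reset()
total_reward, done = 0.0, False
while not done:
    action = random.choice(ToyPentestEnv.ACTIONS)
    obs, reward, done, info = env.step(action)
    total_reward += reward
```

In PenGym, each `step()` call executes a real action (e.g., a scan or an exploit) in the cyber range and returns the real observation, rather than sampling an outcome from a probability model.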
The framework has been validated and refined through several experiments, which demonstrated the correctness of the action implementation and showed that PenGym provides reliable results and reasonable execution times for training. Although PenGym training times are longer than those of simulation environments (e.g., 113.12 s versus 45.12 s on average in our experiments with the ‘tiny’ scenario), we consider this overhead reasonable given the use of an actual cyber range and real network operations.
The PenGym framework was released as open source on GitHub (https://github.com/cyb3rlab/PenGym). We plan to add support for more complex environments, enabling users to recreate larger and more realistic scenarios that make possible more comprehensive testing and analysis.
In addition, we aim to improve the cyber range creation method, which is currently a mainly manual process that is time-consuming and cumbersome. Our next goal is to automate cyber range creation by leveraging the functionality of the open-source CyTrONE framework (Beuran et al., 2018), so that the creation process becomes streamlined and more efficient.
REFERENCES
Beuran, R., Tang, D., Pham, C., Chinen, K., Tan, Y., and Shinoda, Y. (2018). Integrated framework for hands-on cybersecurity training: CyTrONE. Computers & Security, 78:43–59.
Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI Gym. arXiv:1606.01540.
Chaudhary, S., O’Brien, A., and Xu, S. (2020). Automated post-breach penetration testing through reinforcement learning. In 2020 IEEE Conference on Communications and Network Security (CNS), pages 1–2. IEEE.
Furfaro, A., Piccolo, A., Parise, A., Argento, L., and Saccà, D. (2018). A cloud-based platform for the emulation of complex cybersecurity scenarios. Future Generation Computer Systems, 89:791–803.
Ghanem, M. C. and Chen, T. M. (2018). Reinforcement learning for intelligent penetration testing. In 2018 Second World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4), pages 185–192.
Li, L., El Rami, J.-P. S., Taylor, A., Rao, J. H., and Kunz, T. (2022). Enabling a network AI gym for autonomous cyber agents. In 2022 International Conference on Computational Science and Computational Intelligence (CSCI), pages 172–177. IEEE.
Lyon, G. (2014). Nmap security scanner. https://nmap.org/.
Maynor, D. (2011). Metasploit toolkit for penetration testing, exploit development, and vulnerability research. Syngress Publishing, Elsevier.
McInerney, D. (2020). Pymetasploit3. https://pypi.org/project/pymetasploit3/.
Microsoft Defender Research Team (2021). CyberBattleSim. https://github.com/microsoft/cyberbattlesim. Created by Christian Seifert, Michael Betser, William Blum, James Bono, Kate Farris, Emily Goren, Justin Grana, Kristian Holsheimer, Brandon Marken, Joshua Neil, Nicole Nichols, Jugal Parikh, Haoran Wei.
Molina-Markham, A., Miniter, C., Powell, B., and Ridley, A. (2021). Network environment design for autonomous cyberdefense. arXiv:2103.07583.
National Vulnerability Database (2021). CVE-2021-4034. https://nvd.nist.gov/vuln/detail/CVE-2021-4034. Accessed: May 2, 2023.
Norman, A. (2021). Python-nmap. https://pypi.org/project/python-nmap/.
Pozdniakov, K., Alonso, E., Stankovic, V., Tam, K., and Jones, K. (2020). Smart security audit: Reinforcement learning with a deep neural network approximator. In 2020 International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), pages 1–8.
Schwartz, J. and Kurniawati, H. (2019). Autonomous penetration testing using reinforcement learning. arXiv:1905.05965.
Standen, M., Lucas, M., Bowman, D., Richer, T. J., Kim, J., and Marriott, D. (2021). CybORG: A gym for the development of autonomous cyber agents. In Proceedings of the 1st International Workshop on Adaptive Cyber Defense.
Sutton, R. S. and Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
The MITRE Corporation (2018). BRAWL. https://github.com/mitre/brawl-public-game-001.
Zhou, S., Liu, J., Hou, D., Zhong, X., and Zhang, Y. (2021). Autonomous penetration testing based on improved deep Q-network. Applied Sciences, 11(19).
PenGym: Pentesting Training Framework for Reinforcement Learning Agents