Iterative Environment Design for Deep Reinforcement Learning Based on Goal-Oriented Specification
Simon Schwan, Sabine Glesner
2025
Abstract
Deep reinforcement learning solves complex control problems but is often challenging for non-experts to apply in practice. Goal-oriented specification allows abstract goals to be defined in a tree and thereby aims to lower the entry barriers to reinforcement learning (RL). However, finding an effective specification and translating it into an RL environment remains difficult. We address this challenge with our idea of iterative environment design and automate the construction of environments from goal trees. We validate our method on four established case studies, and our results show that learning goals by iteratively refining specifications is feasible. In this way, we counteract the common trial-and-error practice in development to accelerate the use of RL in real-world applications.
Paper Citation
in Harvard Style
Schwan S. and Glesner S. (2025). Iterative Environment Design for Deep Reinforcement Learning Based on Goal-Oriented Specification. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 240-251. DOI: 10.5220/0013148500003890
in BibTeX Style
@conference{icaart25,
author={Simon Schwan and Sabine Glesner},
title={Iterative Environment Design for Deep Reinforcement Learning Based on Goal-Oriented Specification},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2025},
pages={240-251},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013148500003890},
isbn={978-989-758-737-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Iterative Environment Design for Deep Reinforcement Learning Based on Goal-Oriented Specification
SN - 978-989-758-737-5
AU - Schwan S.
AU - Glesner S.
PY - 2025
SP - 240
EP - 251
DO - 10.5220/0013148500003890
PB - SciTePress