Iterative Environment Design for Deep Reinforcement Learning Based on Goal-Oriented Specification

Simon Schwan, Sabine Glesner

2025

Abstract

Deep reinforcement learning (RL) solves complex control problems but is often challenging for non-experts to apply in practice. Goal-oriented specification allows abstract goals to be defined in a tree and thereby aims to lower the entry barriers to RL. However, finding an effective specification and translating it into an RL environment is still difficult. We address this challenge with our idea of iterative environment design and automate the construction of environments from goal trees. We validate our method on four established case studies, and our results show that learning goals by iteratively refining specifications is feasible. In this way, we counteract the common trial-and-error practice in development to accelerate the use of RL in real-world applications.
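To make the idea of a goal tree more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that leaf goals are predicates over a state dictionary, that inner goals combine their children with "and" or "or", and that a reward can be read off as the fraction of satisfied leaf goals. The names `Goal`, `reward`, and `example_tree`, as well as the thresholds, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

State = Dict[str, float]


@dataclass
class Goal:
    """A node in a hypothetical goal tree (illustrative only)."""
    name: str
    # Leaf goals carry a predicate over the state; inner goals leave it as None.
    predicate: Optional[Callable[[State], bool]] = None
    children: List["Goal"] = field(default_factory=list)
    mode: str = "and"  # how an inner goal combines its children: "and" or "or"

    def satisfied(self, state: State) -> bool:
        """Evaluate this goal against a state, recursing into sub-goals."""
        if not self.children:
            return self.predicate is not None and self.predicate(state)
        results = [child.satisfied(state) for child in self.children]
        return all(results) if self.mode == "and" else any(results)

    def leaves(self) -> List["Goal"]:
        """Collect all leaf goals below (or at) this node."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def reward(root: Goal, state: State) -> float:
    """Assumed reward shaping: fraction of leaf goals satisfied in the state."""
    leaves = root.leaves()
    return sum(leaf.satisfied(state) for leaf in leaves) / len(leaves)


def example_tree() -> Goal:
    """A made-up balancing task with two sub-goals and arbitrary thresholds."""
    return Goal(
        "balance",
        children=[
            Goal("upright", lambda s: abs(s["angle"]) < 0.2),
            Goal("in_bounds", lambda s: abs(s["position"]) < 2.4),
        ],
    )


if __name__ == "__main__":
    tree = example_tree()
    state = {"angle": 0.5, "position": 0.0}
    print(tree.satisfied(state), reward(tree, state))
```

Under these assumptions, refining the specification amounts to editing the tree (adding, removing, or re-thresholding sub-goals) and regenerating the reward, rather than rewriting environment code by hand.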

Paper Citation


in Harvard Style

Schwan S. and Glesner S. (2025). Iterative Environment Design for Deep Reinforcement Learning Based on Goal-Oriented Specification. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 240-251. DOI: 10.5220/0013148500003890


in Bibtex Style

@conference{icaart25,
author={Simon Schwan and Sabine Glesner},
title={Iterative Environment Design for Deep Reinforcement Learning Based on Goal-Oriented Specification},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2025},
pages={240-251},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013148500003890},
isbn={978-989-758-737-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Iterative Environment Design for Deep Reinforcement Learning Based on Goal-Oriented Specification
SN - 978-989-758-737-5
AU - Schwan S.
AU - Glesner S.
PY - 2025
SP - 240
EP - 251
DO - 10.5220/0013148500003890
PB - SciTePress