Learning to Run a Marathon: Avoid Overfitting to Speed

Krisztián Gábrisch, István Megyeri

2025

Abstract

Reinforcement learning is a rapidly evolving field, with particular emphasis on robustness and the continuous optimization of reward. The models studied here, trained in the OpenAI Gym and MuJoCo environments, aim to make different dummies move in one direction as fast as possible without losing stability. Models are usually trained for a predefined number of steps, which can act as a limiting factor and unexpectedly cap model performance. This iteration limit can contribute to model instability, often leading to model failure and thus hindering the model's ability to collect additional reward. We also observe that models struggle to optimize stability and speed simultaneously. We traced the learning process of the models through twenty checkpoints and defined various metrics to select the models most suitable for our purposes. The model obtained at the last checkpoint does not always perform best, so it is worth monitoring the learning process so that better models can be picked out along the way. Our code and pretrained models are available at https://github.com/szegedai/rl run marathon.
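The idea of choosing among saved checkpoints instead of defaulting to the last one can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `CheckpointEval` record, the `survival_rate` metric, the `min_survival` threshold, and the numbers in the usage example are hypothetical and are not the metrics or results reported in the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CheckpointEval:
    """Hypothetical evaluation summary for one saved checkpoint."""
    step: int             # training step at which the checkpoint was saved
    mean_return: float    # average episode return over evaluation rollouts
    survival_rate: float  # fraction of rollouts that did not end in a fall


def select_checkpoint(
    evals: Sequence[CheckpointEval], min_survival: float = 0.9
) -> CheckpointEval:
    """Pick the highest-return checkpoint among those that are stable enough.

    If no checkpoint reaches min_survival, fall back to the most stable one,
    breaking ties by mean return.
    """
    if not evals:
        raise ValueError("no checkpoints to choose from")
    stable = [e for e in evals if e.survival_rate >= min_survival]
    if stable:
        return max(stable, key=lambda e: e.mean_return)
    return max(evals, key=lambda e: (e.survival_rate, e.mean_return))


# Made-up numbers, for illustration only: the last checkpoint is the
# fastest but falls over in 40% of the evaluation rollouts.
evals = [
    CheckpointEval(step=50_000, mean_return=2100.0, survival_rate=1.0),
    CheckpointEval(step=100_000, mean_return=3400.0, survival_rate=0.95),
    CheckpointEval(step=150_000, mean_return=3900.0, survival_rate=0.60),
]
best = select_checkpoint(evals)
print(best.step)  # 100000
```

With these illustrative values the selector skips the final checkpoint, whose higher return comes at the cost of stability, and returns the intermediate one. A stricter threshold shifts the choice further toward stability.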


Paper Citation


in Harvard Style

Gábrisch K. and Megyeri I. (2025). Learning to Run a Marathon: Avoid Overfitting to Speed. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 829-836. DOI: 10.5220/0013186200003890


in Bibtex Style

@conference{icaart25,
author={Krisztián Gábrisch and István Megyeri},
title={Learning to Run a Marathon: Avoid Overfitting to Speed},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2025},
pages={829-836},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013186200003890},
isbn={978-989-758-737-5},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Learning to Run a Marathon: Avoid Overfitting to Speed
SN - 978-989-758-737-5
AU - Gábrisch K.
AU - Megyeri I.
PY - 2025
SP - 829
EP - 836
DO - 10.5220/0013186200003890
PB - SciTePress