Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning
Alexander Hill, Marc Groefsema, Matthia Sabatelli, Raffaella Carloni, Marco Grzegorczyk
2024
Abstract
This paper proposes a novel method of utilising guide policies in Reinforcement Learning problems: Contextual Online Imitation Learning (COIL). It demonstrates that COIL can offer improved performance over both offline Imitation Learning methods such as Behavioral Cloning and Reinforcement Learning algorithms such as Proximal Policy Optimisation, which do not take advantage of existing guide policies. An important characteristic of COIL is that it can effectively utilise guide policies that exhibit expert behavior in only a strict subset of the state space, making it more flexible than classical methods of Imitation Learning. In particular, guide policies that achieve good performance in sub-tasks can be used to help Reinforcement Learning agents solve more complex tasks. After introducing the theory and motivation behind COIL, this paper tests its effectiveness on the task of mobile-robot navigation in both simulation and real-life lab experiments. In both settings, COIL gives stronger results than offline Imitation Learning, Reinforcement Learning, and the guide policy itself.
Paper Citation
in Harvard Style
Hill A., Groefsema M., Sabatelli M., Carloni R. and Grzegorczyk M. (2024). Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 178-185. DOI: 10.5220/0012312700003636
in Bibtex Style
@conference{icaart24,
author={Alexander Hill and Marc Groefsema and Matthia Sabatelli and Raffaella Carloni and Marco Grzegorczyk},
title={Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2024},
pages={178-185},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012312700003636},
isbn={978-989-758-680-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning
SN - 978-989-758-680-4
AU - Hill A.
AU - Groefsema M.
AU - Sabatelli M.
AU - Carloni R.
AU - Grzegorczyk M.
PY - 2024
SP - 178
EP - 185
DO - 10.5220/0012312700003636
PB - SciTePress