Improving Reward Estimation in Goal-Conditioned Imitation Learning with Counterfactual Data and Structural Causal Models
Mohamed Jabri, Panagiotis Papadakis, Ehsan Abbasnejad, Gilles Coppin, Javen Shi
2023
Abstract
Imitation learning has emerged as a pragmatic alternative to reinforcement learning for teaching agents to execute specific tasks, mitigating the complexity associated with reward engineering. However, the deployment of imitation learning in real-world scenarios is hampered by numerous challenges. Often, the scarcity and expense of demonstration data hinder the effectiveness of imitation learning algorithms. In this paper, we present a novel approach to enhance the sample efficiency of goal-conditioned imitation learning. Leveraging the principles of causality, we harness structural causal models as a formalism to generate counterfactual data. These counterfactual instances are used as additional training data, effectively improving the learning process. By incorporating causal insights, our method demonstrates its ability to improve imitation learning efficiency by capitalizing on generated counterfactual data. Through experiments on simulated robotic manipulation tasks, such as pushing, moving, and sliding objects, we showcase how our approach allows for the learning of better reward functions resulting in improved performance with a limited number of demonstrations, paving the way for a more practical and effective implementation of imitation learning in real-world scenarios.
Paper Citation
in Harvard Style
Jabri M., Papadakis P., Abbasnejad E., Coppin G. and Shi J. (2023). Improving Reward Estimation in Goal-Conditioned Imitation Learning with Counterfactual Data and Structural Causal Models. In Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics - Volume 2: LICARSA; ISBN 978-989-758-670-5, SciTePress, pages 329-337. DOI: 10.5220/0012268200003543
in Bibtex Style
@conference{licarsa23,
author={Mohamed Jabri and Panagiotis Papadakis and Ehsan Abbasnejad and Gilles Coppin and Javen Shi},
title={Improving Reward Estimation in Goal-Conditioned Imitation Learning with Counterfactual Data and Structural Causal Models},
booktitle={Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics - Volume 2: LICARSA},
year={2023},
pages={329--337},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012268200003543},
isbn={978-989-758-670-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics - Volume 2: LICARSA
TI - Improving Reward Estimation in Goal-Conditioned Imitation Learning with Counterfactual Data and Structural Causal Models
SN - 978-989-758-670-5
AU - Jabri M.
AU - Papadakis P.
AU - Abbasnejad E.
AU - Coppin G.
AU - Shi J.
PY - 2023
SP - 329
EP - 337
DO - 10.5220/0012268200003543
PB - SciTePress