Figure 7: Average success rate for different numbers of
demonstrations. Our method is more resilient to a
decrease in the number of demonstrations.
and take a step toward achieving more reliable
imitation learning using causality.
This work was funded in part by the region of Brittany
under the ROGAN project. We are grateful for their
support, which made this research possible.
Improving Reward Estimation in Goal-Conditioned Imitation Learning with Counterfactual Data and Structural Causal Models