Authors:
Anna Nickelson; Nicholas Zerbel; Gaurav Dixit and Kagan Tumer
Affiliation:
Collaborative Robotics and Intelligent Systems (CoRIS) Institute, Oregon State University, Corvallis, OR, U.S.A.
Keyword(s):
Counterfactuals, Quality Diversity, Multi-Objective, Evolutionary Learning.
Abstract:
Success in many real-world problems cannot be adequately defined under a single objective and instead requires multiple, sometimes competing, objectives. To perform well in these environments, autonomous agents need a variety of skills and behaviors to balance these objectives. The combination of Multi-Objective Optimization (MOO) and Quality Diversity (QD) methods, such as in Multi-Objective Map Elites (MOME), aims to provide a set of policies with diverse behaviors that cover multiple objectives. However, MOME is unable to diversify its search across the behavior space, resulting in significantly reduced coverage of the global Pareto front. This paper introduces Counterfactual Behavior Shaping for Multi-Objective Map Elites (C-MOME), a method that superimposes counterfactual agents onto the state space of a learning agent to more richly define the diversity of agent behaviors. Counterfactuals explicitly introduce new forms of diversity in agent behaviors, resulting in C-MOME's effective coverage of behavioral niches; this provides a broader set of Pareto optimal policies. We show that C-MOME covers more than twice as much of the behavior space compared to MOME while increasing the hypervolume of the global Pareto front by over 40%.
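To make the abstract's idea concrete, the following is a minimal Python sketch of a MOME-style evolutionary step extended with a counterfactual behavior descriptor, in the spirit of C-MOME as summarized above. It is not the authors' implementation: the functions evaluate_policy, mutate, and counterfactual_descriptor, the scalar-state distance test, and the binning scheme are all illustrative assumptions.

import random

# Minimal sketch, not the authors' implementation: a MOME-style archive where each
# behavior cell keeps a local Pareto set over multiple objectives, and the behavior
# descriptor is computed from a trajectory augmented with counterfactual agents.
# evaluate_policy, mutate, and the distance/binning scheme are illustrative assumptions.

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def counterfactual_descriptor(trajectory, counterfactual_agents, n_bins=10):
    """Discretize behavior by how often the agent is near each counterfactual agent.
    Assumes scalar states; real descriptors would use domain-specific features."""
    cell = []
    for cf in counterfactual_agents:
        near = sum(1 for s in trajectory if abs(s - cf) < 1.0) / len(trajectory)
        cell.append(min(int(near * n_bins), n_bins - 1))
    return tuple(cell)

def cmome_step(archive, evaluate_policy, mutate, counterfactual_agents):
    """One evolutionary step: mutate a random elite, evaluate it on all objectives,
    and insert it into the local Pareto front of its counterfactual behavior cell."""
    elites = [p for front in archive.values() for (p, _) in front]
    child = mutate(random.choice(elites))
    objectives, trajectory = evaluate_policy(child)
    cell = counterfactual_descriptor(trajectory, counterfactual_agents)
    front = archive.setdefault(cell, [])
    if not any(dominates(o, objectives) for (_, o) in front):
        # Remove dominated members and add the new policy to the cell's Pareto set.
        front[:] = [(p, o) for (p, o) in front if not dominates(objectives, o)]
        front.append((child, objectives))
    return archive

In this sketch the counterfactual agents influence only which archive cell a policy lands in, not its objective values, which is one reading of the abstract's claim that counterfactuals shape behavioral diversity rather than the objectives themselves.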