Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning
Gavin Rens
2025
Abstract
Humanoid robots must master numerous tasks with sparse rewards, which poses a challenge for reinforcement learning (RL). We propose a method that combines RL with automated planning to address this. Our approach uses short goal-conditioned policies (GCPs) organized hierarchically, with Monte Carlo Tree Search (MCTS) planning over high-level actions (HLAs) rather than primitive actions. A single plan-tree, maintained over the agent’s lifetime, holds knowledge about how goals are achieved. This hierarchy improves sample efficiency and speeds up reasoning by reusing HLAs and anticipating future actions. Our Hierarchical Goal-Conditioned Policy Planning (HGCPP) framework uniquely integrates GCPs, MCTS, and hierarchical RL, potentially improving exploration and planning in complex tasks.
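The abstract's core idea, MCTS search over a persistent plan-tree whose edges are HLAs (goal-conditioned policies) rather than primitive actions, can be illustrated with a generic UCT-style sketch. This is not code from the paper: the goal names, the reward function, and the flat subgoal set are illustrative assumptions, and executing a GCP is stood in for by a simple reward callback.

```python
import math
import random

class PlanNode:
    """Node in the persistent plan-tree: a goal, with HLA edges to subgoal nodes."""
    def __init__(self, goal):
        self.goal = goal
        self.children = {}   # subgoal -> PlanNode; the edge represents the HLA reaching it
        self.visits = 0
        self.value = 0.0

def uct_select(node, c=1.4):
    """Pick the child HLA maximizing the standard UCT score."""
    def score(child):
        if child.visits == 0:
            return float("inf")          # try untried HLAs first
        exploit = child.value / child.visits
        explore = c * math.sqrt(math.log(node.visits) / child.visits)
        return exploit + explore
    return max(node.children.values(), key=score)

def simulate(root, candidate_subgoals, rollout_reward, depth=3):
    """One MCTS iteration over HLAs: select, expand one node, evaluate, back up."""
    path, node = [root], root
    for _ in range(depth):
        # expansion: attach an untried subgoal as a new HLA edge, if any remain
        untried = [g for g in candidate_subgoals if g not in node.children]
        if untried:
            child = PlanNode(random.choice(untried))
            node.children[child.goal] = child
            path.append(child)
            break
        node = uct_select(node)
        path.append(node)
    reward = rollout_reward(path[-1].goal)  # stand-in for executing the GCP for this goal
    for n in path:                          # backpropagate along the selected path
        n.visits += 1
        n.value += reward
```

Because the tree persists across calls, statistics accumulated in earlier searches are reused in later ones, which is the sketch's analogue of maintaining a single plan-tree during the agent's lifetime.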
Paper Citation
in Harvard Style
Rens G. (2025). Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 507-514. DOI: 10.5220/0013238900003890
in Bibtex Style
@conference{icaart25,
author={Gavin Rens},
title={Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2025},
pages={507--514},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013238900003890},
isbn={978-989-758-737-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning
SN - 978-989-758-737-5
AU - Rens G.
PY - 2025
SP - 507
EP - 514
DO - 10.5220/0013238900003890
PB - SciTePress