Double Q-Learning for a Simple Parking Problem: Propositions of Reward Functions and State Representations

Przemysław Klȩsk

2025

Abstract

We consider a simple parking problem in which the learning agent must park a car, starting from a range of random initial positions, in a target place whose front and back end-points are distinguished. The scene contains no obstacles, but a time limit is imposed, e.g. 25 s. This is a sequential decision problem with a continuous state space and a high frequency of decisions. We employ double Q-learning with bang–bang control and neural approximations of the Q functions. Our main focus is on the design of rewards and state representations for this problem. We propose a family of parameterized reward functions that include, in particular, a penalty for the so-called “gutter distance”. We also study several variants of vector state representations that (apart from observing velocity and direction) relate some key points on the car to key points in the parking place. We show that a suitable combination of state representation and rewards can effectively guide the agent towards better trajectories. As a result, learning can be carried out within a reasonably small number of episodes and yields a high success rate at the testing stage.
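
For readers unfamiliar with double Q-learning, the following is a minimal sketch of the standard tabular update, not the neural-approximation variant used in the paper. The function name `double_q_update`, the dictionary-based tables, and the default values of `alpha` and `gamma` are assumptions made for illustration only. In a bang–bang setting, `actions` would be a small discrete set of extreme control values.

```python
import random


def double_q_update(q_a, q_b, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.99, done=False, rng=random):
    """One tabular double Q-learning step (illustrative sketch).

    q_a and q_b are dicts mapping (state, action) to a value estimate.
    Returns True if q_a was the table updated, False if q_b was.
    """
    # Randomly choose which table to update; the other one evaluates the target.
    if rng.random() < 0.5:
        q_upd, q_eval = q_a, q_b
    else:
        q_upd, q_eval = q_b, q_a

    if done:
        # No bootstrapping from a terminal state.
        target = reward
    else:
        # Select the greedy action with the table being updated,
        # but evaluate that action with the other table.
        best = max(actions, key=lambda a: q_upd.get((next_state, a), 0.0))
        target = reward + gamma * q_eval.get((next_state, best), 0.0)

    old = q_upd.get((state, action), 0.0)
    q_upd[(state, action)] = old + alpha * (target - old)
    return q_upd is q_a
```

Decoupling action selection from action evaluation in this way is what distinguishes double Q-learning from plain Q-learning, where a single table does both and tends to overestimate values.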


Paper Citation


in Harvard Style

Klȩsk P. (2025). Double Q-Learning for a Simple Parking Problem: Propositions of Reward Functions and State Representations. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 156-171. DOI: 10.5220/0013302100003890


in Bibtex Style

@conference{icaart25,
author={Przemysław Klȩsk},
title={Double Q-Learning for a Simple Parking Problem: Propositions of Reward Functions and State Representations},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2025},
pages={156-171},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013302100003890},
isbn={978-989-758-737-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - Double Q-Learning for a Simple Parking Problem: Propositions of Reward Functions and State Representations
SN - 978-989-758-737-5
AU - Klȩsk P.
PY - 2025
SP - 156
EP - 171
DO - 10.5220/0013302100003890
PB - SciTePress