Authors:
Aurélien Mombelli, Alain Quilliot and Mourad Baiou
Affiliation:
LIMOS, UCA, 1 Rue de la Chebarde, 63170 Aubière, France
Keyword(s):
Shortest Path, Risk Aware, Time-dependent, A*, Reinforcement Learning.
Abstract:
In this paper, we deal with a fleet of autonomous vehicles that is required to perform internal logistics tasks inside protected areas. The fleet is assumed to be governed by a hierarchical supervision architecture which, at the top level, distributes and schedules Pickup and Delivery tasks and, at the lowest level, ensures safety at crossroads and controls trajectories. We focus here on the top level and address the problem of inserting an additional vehicle into the current fleet and routing it, using a time-dependent estimate of the risk induced by traversing any arc at a given time. We propose a model and design a bi-level heuristic and an A*-like heuristic, both of which rely on a reinforcement learning scheme, in order to route and schedule this vehicle according to a well-fitted compromise between speed and risk.
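
To make the speed/risk compromise concrete, the following is a minimal sketch of a time-dependent, risk-aware shortest path search. It is not the bi-level or A*-like heuristic of the paper and involves no reinforcement learning: it is plain Dijkstra on (node, time) states, under the assumptions that time is discrete, that a vehicle may wait at a node, and that the cost of a route is its duration plus a weighted sum of arc risks. The names `risk_aware_route`, `arcs`, `risk`, `weight` and `horizon` are hypothetical and chosen for illustration only.

```python
import heapq


def risk_aware_route(arcs, risk, source, target, start_time, weight, horizon):
    """Dijkstra on (node, time) states with cost = duration + weight * risk.

    arcs:    dict mapping a node to a list of (next_node, travel_time),
             with integer travel_time >= 1 (assumption: discrete time).
    risk:    function (u, v, t) -> non-negative risk of entering arc (u, v)
             at time t (assumption: supplied by some external estimator).
    Returns (cost, [(node, time), ...]) or None if target is unreachable
    before the horizon.
    """
    start = (source, start_time)
    best = {start: 0.0}
    parent = {start: None}
    heap = [(0.0, start_time, source)]
    while heap:
        cost, t, u = heapq.heappop(heap)
        if cost > best.get((u, t), float("inf")):
            continue  # stale heap entry
        if u == target:
            # Rebuild the schedule by following parent pointers.
            path, state = [], (u, t)
            while state is not None:
                path.append(state)
                state = parent[state]
            return cost, path[::-1]
        # Waiting one time unit at u is modeled as a risk-free move.
        moves = [(u, 1, 0.0)]
        moves += [(v, d, risk(u, v, t)) for v, d in arcs.get(u, [])]
        for v, d, r in moves:
            t2 = t + d
            if t2 > horizon:
                continue
            c2 = cost + d + weight * r
            if c2 < best.get((v, t2), float("inf")):
                best[(v, t2)] = c2
                parent[(v, t2)] = (u, t)
                heapq.heappush(heap, (c2, t2, v))
    return None


# Toy instance: arc A->B is risky only at time 0.
arcs = {"A": [("B", 1), ("C", 2)], "B": [("D", 1)], "C": [("D", 2)]}


def risk(u, v, t):
    return 10.0 if (u, v, t) == ("A", "B", 0) else 0.0


fast = risk_aware_route(arcs, risk, "A", "D", 0, weight=0.0, horizon=10)
safe = risk_aware_route(arcs, risk, "A", "D", 0, weight=1.0, horizon=10)
```

With `weight=0.0` the search ignores risk and takes A, B, D immediately; with `weight=1.0` it prefers to wait one time unit at A so that the arc A->B is entered once its risk has dropped. The `weight` parameter is the knob that trades speed against risk in this sketch.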