Assured Reinforcement Learning with Formally Verified Abstract Policies
George Mason, Radu Calinescu, Daniel Kudenko, Alec Banks
We present a new reinforcement learning (RL) approach that enables an autonomous agent to solve decision-making problems under constraints. Our assured reinforcement learning approach models the uncertain environment as a high-level, abstract Markov decision process (AMDP), and uses probabilistic model checking to establish AMDP policies that satisfy a set of constraints defined in probabilistic temporal logic. These formally verified abstract policies are then used to restrict the RL agent's exploration of the solution space so as to avoid constraint violations. We validate our RL approach by using it to develop autonomous agents for a flag-collection navigation task and an assisted-living planning problem.
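
As a rough illustration of the last step, the sketch below shows one way a verified abstract policy could restrict the actions an RL agent is allowed to explore. It is a minimal sketch under stated assumptions, not the authors' implementation: the grid cells, the `abstract_state` mapping, the `VERIFIED_POLICY` table, and the `ACTION_ABSTRACTION` mapping are all hypothetical, and the model-checking step that would produce the verified policy is assumed to have been run offline.

```python
import random

# Hypothetical abstraction: low-level states are grid cells (x, y), and each
# cell belongs to one named region, which plays the role of an AMDP state.
def abstract_state(cell):
    x, _ = cell
    return "left" if x < 2 else "right"

# Assumed result of offline verification: for each abstract state, the set of
# abstract actions permitted by a policy that satisfied the constraints.
VERIFIED_POLICY = {
    "left": {"go_right"},
    "right": {"stay", "go_left"},
}

# Hypothetical mapping from each low-level action to the abstract action it
# realises. A real task would likely make this depend on the current state.
ACTION_ABSTRACTION = {
    "east": "go_right",
    "west": "go_left",
    "north": "stay",
    "south": "stay",
}

def allowed_actions(cell):
    """Return the low-level actions consistent with the verified abstract policy."""
    permitted = VERIFIED_POLICY[abstract_state(cell)]
    return [a for a, abs_a in ACTION_ABSTRACTION.items() if abs_a in permitted]

def choose_action(q, cell, epsilon, rng):
    """Epsilon-greedy selection restricted to the allowed actions.

    q maps (cell, action) pairs to estimated values; missing entries count as 0.
    """
    actions = allowed_actions(cell)
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore, but only among permitted actions
    return max(actions, key=lambda a: q.get((cell, a), 0.0))  # exploit
```

In this sketch, both exploration and exploitation draw only from `allowed_actions`, so an action excluded by the abstract policy is never selected, even when its estimated value is the highest.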
