Paper

Authors: Ilio Boone and Gavin Rens

Affiliation: DTAI group, KU Leuven, Belgium

Keyword(s): Markov Decision Process, Non-Markovian Reward Models, Mealy Reward Model (MRM), Learning MRMs, Non-stationary.

Abstract: In sequential decision-theoretic systems, the dynamics might be Markovian (behavior in the next step is independent of the past, given the present), or non-Markovian (behavior in the next step depends on the past). One approach to represent non-Markovian behavior has been to employ deterministic finite automata (DFA) with inputs and outputs (e.g. Mealy machines). Moreover, some researchers have proposed frameworks for learning DFA-based models. There are at least two reasons for a system to be non-Markovian: (i) rewards are gained from temporally-dependent tasks, (ii) observations are non-stationary. Rens et al. (2021) tackle learning the applicable DFA for the first case with their ARM algorithm. ARM cannot deal with the second case. Toro Icarte et al. (2019) tackle the problem for the second case with their LRM algorithm. In this paper, we extend ARM to deal with the second case too. The advantage of ARM for learning and acting in non-Markovian systems is that it is based on well-understood formal methods with many available tools.
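As a rough illustration of the Mealy-machine idea mentioned in the abstract, the sketch below shows a reward that depends on the history of observations rather than on the current observation alone. It is not the ARM or LRM algorithm, and it does not learn anything; the class name `MealyRewardMachine`, the node labels, the observation labels, and the reward values are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MealyRewardMachine:
    """Hypothetical Mealy-style reward model (illustrative, not from the paper)."""

    initial: str
    # Maps (current node, observation) -> (next node, reward).
    transitions: dict
    node: str = field(init=False)

    def __post_init__(self):
        self.node = self.initial

    def reset(self):
        """Return the machine to its initial node."""
        self.node = self.initial

    def step(self, observation):
        """Consume one observation, move to the next node, and emit a reward."""
        # Assumption: undefined (node, observation) pairs stay put with reward 0.
        next_node, reward = self.transitions.get(
            (self.node, observation), (self.node, 0.0)
        )
        self.node = next_node
        return reward


def total_reward(machine, observations):
    """Sum the rewards emitted along a sequence of observations."""
    machine.reset()
    return sum(machine.step(obs) for obs in observations)


# Toy task: "door" pays off only if "key" was observed earlier.
mrm = MealyRewardMachine(
    initial="no_key",
    transitions={
        ("no_key", "key"): ("has_key", 0.0),
        ("has_key", "door"): ("no_key", 1.0),
    },
)
```

In this toy machine the same observation, `"door"`, yields different rewards depending on whether `"key"` was seen before it, which is the sense in which the reward is non-Markovian with respect to observations while remaining Markovian with respect to the pair (observation, machine node).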

License: CC BY-NC-ND 4.0


Paper citation in several formats:
Boone, I. and Rens, G. (2022). Learning Optimal Behavior in Environments with Non-stationary Observations. In Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-547-0; ISSN 2184-433X, SciTePress, pages 729-736. DOI: 10.5220/0010898200003116

@conference{icaart22,
author={Ilio Boone and Gavin Rens},
title={Learning Optimal Behavior in Environments with Non-stationary Observations},
booktitle={Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2022},
pages={729-736},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010898200003116},
isbn={978-989-758-547-0},
issn={2184-433X},
}

TY - CONF

JO - Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Learning Optimal Behavior in Environments with Non-stationary Observations
SN - 978-989-758-547-0
IS - 2184-433X
AU - Boone, I.
AU - Rens, G.
PY - 2022
SP - 729
EP - 736
DO - 10.5220/0010898200003116
PB - SciTePress