Authors:
Fernando Fradique Duarte¹; Nuno Lau²; Artur Pereira² and Luís Reis³
Affiliations:
¹ Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal
² Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
³ Faculty of Engineering, Department of Informatics Engineering, University of Porto, Porto, Portugal
Keyword(s):
Convolutional Long Short-Term Memory, Grid Long Short-Term Memory, Long Short-Term Memory, Mixture Density Network, Reinforcement Learning.
Abstract:
Memory-based Deep Reinforcement Learning has been shown to be a viable solution for successfully learning control policies directly from high-dimensional sensory data in complex vision-based control tasks. At the core of this success lies the Long Short-Term Memory, or LSTM, a well-known type of Recurrent Neural Network. More recent developments have introduced the ConvLSTM, a convolutional variant of the LSTM, and the MDN-RNN, a Mixture Density Network combined with an LSTM, as memory modules in the context of Deep Reinforcement Learning. The defining characteristic of the ConvLSTM is its ability to preserve spatial information, which may prove to be a crucial factor when dealing with vision-based control tasks, while the MDN-RNN can act as a predictive memory, eschewing the need to explicitly plan ahead. Also of interest to this work is the GridLSTM, a network of LSTM cells arranged in a multidimensional grid. The objective of this paper is therefore to perform a comparative study of several memory modules, based on the LSTM, ConvLSTM, MDN-RNN and GridLSTM, in the scope of Deep Reinforcement Learning, and more specifically as the memory modules of the agent. All experiments were validated using the Atari 2600 videogame benchmark.
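All of the memory modules compared here build on the standard LSTM cell equations. As an illustrative sketch only (the paper's actual architectures, weights and hyperparameters are not specified here), a single LSTM step can be written in plain numpy; the dimensions, initialization and gate ordering below are arbitrary choices for demonstration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.

    x: input vector (d,); h_prev, c_prev: previous hidden/cell states (n,).
    W: (4n, d) input weights; U: (4n, n) recurrent weights; b: (4n,) bias.
    Gate order assumed here: input, forget, candidate, output.
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:n])        # input gate
    f = sigmoid(z[n:2*n])      # forget gate
    g = np.tanh(z[2*n:3*n])    # candidate cell state
    o = sigmoid(z[3*n:4*n])    # output gate
    c = f * c_prev + i * g     # new cell state
    h = o * np.tanh(c)         # new hidden state
    return h, c

# Example: roll the cell over a short sequence of random "observations".
rng = np.random.default_rng(0)
d, n, T = 8, 16, 5
W = rng.standard_normal((4 * n, d)) * 0.1
U = rng.standard_normal((4 * n, n)) * 0.1
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for t in range(T):
    h, c = lstm_cell(rng.standard_normal(d), h, c, W, U, b)
print(h.shape)  # (16,)
```

The ConvLSTM replaces the matrix products above with convolutions so the state keeps its spatial layout, and the MDN-RNN feeds the hidden state into a Mixture Density Network head to predict a distribution over the next observation.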