Protocols and Open Problems for General Agents. J.
Artif. Intell. Res., 61, 523–562
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T.
P., Harley, T., Silver, D., & Kavukcuoglu, K. (2016).
Asynchronous Methods for Deep Reinforcement
Learning. In Proceedings of the 33rd International
Conference on Machine Learning, ICML 2016, (Vol.
48, pp. 1928–1937)
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A.,
Veness, J., Bellemare, M. G., Graves, A., Riedmiller,
M. A., Fidjeland, A., Ostrovski, G., Petersen, S.,
Beattie, C., Sadik, A., Antonoglou, I., King, H.,
Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D.
(2015). Human-level control through deep
reinforcement learning. Nature, 518(7540), 529–533
Mott, A., Zoran, D., Chrzanowski, M., Wierstra, D., &
Rezende, D. J. (2019). Towards Interpretable
Reinforcement Learning Using Attention Augmented
Agents. In Advances in Neural Information Processing
Systems 32: Annual Conference on Neural Information
Processing Systems 2019, NeurIPS 2019, (pp. 12329–
12338)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J.,
Chanan, G., Killeen, T., Lin, Z., Gimelshein, N.,
Antiga, L., Desmaison, A., Köpf, A., Yang, E. Z.,
DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S.,
Steiner, B., Fang, L., … Chintala, S. (2019). PyTorch:
An Imperative Style, High-Performance Deep Learning
Library. In Advances in Neural Information Processing
Systems 32: Annual Conference on Neural Information
Processing Systems 2019, NeurIPS 2019, (pp. 8024–
8035)
Schulman, J., Moritz, P., Levine, S., Jordan, M. I., &
Abbeel, P. (2016). High-Dimensional Continuous
Control Using Generalized Advantage Estimation. In
4th International Conference on Learning
Representations, ICLR 2016
Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W.-K.,
& Woo, W. (2015). Convolutional LSTM Network: A
Machine Learning Approach for Precipitation
Nowcasting. In Advances in Neural Information
Processing Systems 28: Annual Conference on Neural
Information Processing Systems 2015, (pp. 802–810)
Sorokin, I., Seleznev, A., Pavlov, M., Fedorov, A., &
Ignateva, A. (2015). Deep Attention Recurrent Q-Network.
CoRR, abs/1512.01693.
http://arxiv.org/abs/1512.01693
Srivastava, N., Mansimov, E., & Salakhutdinov, R. (2015).
Unsupervised Learning of Video Representations using
LSTMs. In Proceedings of the 32nd International
Conference on Machine Learning, ICML 2015, (Vol.
37, pp. 843–852)
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to
Sequence Learning with Neural Networks. In
Advances in Neural Information Processing Systems
27: Annual Conference on Neural Information
Processing Systems 2014, (pp. 3104–3112)
Tang, Y., Nguyen, D., & Ha, D. (2020). Neuroevolution of
self-interpretable agents. In GECCO ’20: Genetic and
Evolutionary Computation Conference, 2020 (pp. 414–
424). ACM.
Wayne, G., Hung, C.-C., Amos, D., Mirza, M., Ahuja, A.,
Grabska-Barwinska, A., Rae, J. W., Mirowski, P.,
Leibo, J. Z., Santoro, A., Gemici, M., Reynolds, M.,
Harley, T., Abramson, J., Mohamed, S., Rezende, D. J.,
Saxton, D., Cain, A., Hillier, C., … Lillicrap, T. P.
(2018). Unsupervised Predictive Memory in a Goal-
Directed Agent. CoRR, abs/1803.10760.
http://arxiv.org/abs/1803.10760