
Akiba, T., Sano, S., Yanase, T., Ohta, T., and Koyama, M. (2019). Optuna: A Next-Generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Andrychowicz, M., Raichuk, A., Stańczyk, P., Orsini, M., Girgin, S., Marinier, R., Hussenot, L., Geist, M., Pietquin, O., Michalski, M., Gelly, S., and Bachem, O. (2020). What Matters in On-Policy Reinforcement Learning? A Large-Scale Empirical Study. arXiv:2006.05990.
Balandat, M., Karrer, B., Jiang, D. R., Daulton, S., Letham, B., Wilson, A. G., and Bakshy, E. (2020). BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization. arXiv:1910.06403.
Bergstra, J., Bardenet, R., Bengio, Y., and Kégl, B. (2011). Algorithms for Hyper-parameter Optimization. In Proceedings of the 24th International Conference on Neural Information Processing Systems, NIPS ’11, pages 2546–2554, Red Hook, NY, USA. Curran Associates.
Cardenoso Fernandez, F. and Caarls, W. (2018). Parameters [sic] Tuning and Optimization for Reinforcement Learning Algorithms Using Evolutionary Computing. In 2018 International Conference on Information Systems and Computer Science (INCISCOS), pages 301–305, Quito, Ecuador. IEEE.
Chandramouli, S., Shi, D., Putkonen, A., De Peuter, S., Zhang, S., Jokinen, J., Howes, A., and Oulasvirta, A. (2024). A Workflow for Building Computationally Rational Models of Human Behavior. Computational Brain & Behavior, 7:399–419.
Engstrom, L., Ilyas, A., Santurkar, S., Tsipras, D., Janoos, F., Rudolph, L., and Madry, A. (2020). Implementation Matters in Deep RL: A Case Study on PPO and TRPO. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net.
Eppe, M., Gumbsch, C., Kerzel, M., Nguyen, P. D. H., Butz, M. V., and Wermter, S. (2022). Intelligent Problem-Solving As Integrated Hierarchical Reinforcement Learning. Nature Machine Intelligence, 4(1):11–20.
Fetterman, A. J., Kitanidis, E., Albrecht, J., Polizzi, Z., Fogelman, B., Knutins, M., Wróblewski, B., Simon, J. B., and Qiu, K. (2023). Tune As You Scale: Hyperparameter Optimization For Compute Efficient Training. arXiv:2306.08055.
Huang, S., Dossa, R. F. J., Ye, C., Braga, J., Chakraborty, D., Mehta, K., and Araújo, J. G. M. (2022). CleanRL: High-Quality Single-File Implementations of Deep Reinforcement Learning Algorithms. The Journal of Machine Learning Research, 23(1):12585–12602.
Jiang, Y., Guo, Z., Tavakoli, H. R., Leiva, L. A., and Oulasvirta, A. (2024). EyeFormer: Predicting Personalized Scanpaths with Transformer-Guided Reinforcement Learning. arXiv:2404.10163.
Ladosz, P., Mammadov, M., Shin, H., Shin, W., and Oh, H. (2024). Autonomous Landing on a Moving Platform Using Vision-Based Deep Reinforcement Learning. IEEE Robotics and Automation Letters, 9(5):4575–4582.
Nobandegani, A. S., Shultz, T. R., and Rish, I. (2022). Cognitive Models As Simulators: The Case of Moral Decision-Making. In Proceedings of the 44th Annual Conference of the Cognitive Science Society.
Oldewage, E. T., Engelbrecht, A. P., and Cleghorn, C. W. (2020). Movement Patterns of a Particle Swarm in High Dimensional Spaces. Information Sciences, 512:1043–1062.
Parker-Holder, J., Rajan, R., Song, X., Biedenkapp, A., Miao, Y., Eimer, T., Zhang, B., Nguyen, V., Calandra, R., Faust, A., Hutter, F., and Lindauer, M. (2022). Automated Reinforcement Learning (AutoRL): A Survey and Open Problems. Journal of Artificial Intelligence Research, 74:517–568.
Raffin, A., Hill, A., Gleave, A., Kanervisto, A., Ernestus, M., and Dormann, N. (2021). Stable-Baselines3: Reliable Reinforcement Learning Implementations. The Journal of Machine Learning Research, 22(1):12348–12355.
Shahria, M. T., Sunny, M. S. H., Zarif, M. I. I., Ghommam, J., Ahamed, S. I., and Rahman, M. H. (2022). A Comprehensive Review of Vision-Based Robotic Applications: Current State, Components, Approaches, Barriers, and Potential Solutions. Robotics, 11(6).
Shang, J. and Ryoo, M. S. (2023). Active Vision Reinforcement Learning under Limited Visual Observability. In Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., and Levine, S., editors, Advances in Neural Information Processing Systems, volume 36, pages 10316–10338. Curran Associates.
Shi, D., Zhu, Y., Jokinen, J. P., Acharya, A., Putkonen, A., Zhai, S., and Oulasvirta, A. (2024). CRTypist: Simulating Touchscreen Typing Behavior via Computational Rationality. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, CHI ’24. ACM.
Song, X., Perel, S., Lee, C., Kochanski, G., and Golovin, D. (2023). Open Source Vizier: Distributed Infrastructure and API for Reliable and Flexible Blackbox Optimization. arXiv:2207.13676.
Souza, G. K. B. and Ottoni, A. L. C. (2024). AutoRL-Sim: Automated Reinforcement Learning Simulator for Combinatorial Optimization Problems. Modelling, 5(3):1056–1083.
Towers, M., Kwiatkowski, A., Terry, J., Balis, J. U., De Cola, G., Deleu, T., Goulão, M., Kallinteris, A., Krimmel, M., KG, A., Perez-Vicente, R., Pierré, A., Schulhoff, S., Tai, J. J., Tan, H., and Younis, O. G. (2024). Gymnasium: A Standard Interface for Reinforcement Learning Environments. arXiv:2407.17032.