
Advances in Neural Information Processing Systems,
36.
Liu, M., Zhu, M., and Zhang, W. (2022). Goal-
conditioned reinforcement learning: Problems and so-
lutions. arXiv preprint arXiv:2201.08299.
Louppe, G., Wehenkel, L., Sutera, A., and Geurts, P. (2013).
Understanding variable importances in forests of ran-
domized trees. Advances in Neural Information Processing Systems, 26.
Marée, R., Geurts, P., Piater, J., and Wehenkel, L. (2005). Random subwindows for robust image classification. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), volume 1, pages 34–40. IEEE.
Martín H., J. A., de Lope, J., and Maravall, D. (2009). The kNN-TD reinforcement learning algorithm. In Methods and Models in Artificial and Natural Computation. A Homage to Professor Mira's Scientific Legacy: Third International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2009, Santiago de Compostela, Spain, June 22–26, 2009, Proceedings, Part I 3, pages 305–314. Springer.
Min, J. and Elliott, L. T. (2022). Q-learning with online
random forests. arXiv preprint arXiv:2204.03771.
Mirchevska, B., Blum, M., Louis, L., Boedecker, J., and
Werling, M. (2017). Reinforcement learning for au-
tonomous maneuvering in highway scenarios. In
Workshop for Driving Assistance Systems and Au-
tonomous Driving, pages 32–41.
Miró-Nicolau, M., i Capó, A. J., and Moyà-Alcover, G. (2025). A comprehensive study on fidelity metrics for XAI. Information Processing & Management, 62(1):103900.
Muschalik, M., Baniecki, H., Fumagalli, F., Kolpaczki, P., Hammer, B., and Hüllermeier, E. (2024). shapiq: Shapley interactions for machine learning. In The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track.
Nauta, M., Trienes, J., Pathak, S., Nguyen, E., Peters, M., Schmitt, Y., Schlötterer, J., van Keulen, M., and Seifert, C. (2023). From anecdotal evidence to quantitative evaluation methods: A systematic review on evaluating explainable AI. ACM Computing Surveys, 55(13s):1–42.
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E.,
DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and
Lerer, A. (2017). Automatic differentiation in PyTorch.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P.,
Weiss, R., Dubourg, V., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.
Prasad, N., Cheng, L.-F., Chivers, C., Draugelis, M., and
Engelhardt, B. E. (2017). A reinforcement learning
approach to weaning of mechanical ventilation in in-
tensive care units. arXiv preprint arXiv:1704.06300.
Puterman, M. L. (2014). Markov decision processes: dis-
crete stochastic dynamic programming. John Wiley &
Sons.
Rudin, C. (2019). Stop explaining black box machine learn-
ing models for high stakes decisions and use inter-
pretable models instead. Nature Machine Intelligence, 1(5):206–215.
Sallab, A. E., Abdou, M., Perot, E., and Yogamani,
S. (2017). Deep reinforcement learning frame-
work for autonomous driving. arXiv preprint
arXiv:1704.02532.
Schmidhuber, J. (2019). Reinforcement learning upside down: Don't predict rewards – just map them to actions. arXiv preprint arXiv:1912.02875.
Shah, D. and Xie, Q. (2018). Q-learning with nearest neigh-
bors. Advances in Neural Information Processing Sys-
tems, 31.
Shwartz-Ziv, R. and Armon, A. (2022). Tabular data: Deep
learning is not all you need. Information Fusion,
81:84–90.
Song, Y. and Wang, L. (2024). Multiobjective tree-based re-
inforcement learning for estimating tolerant dynamic
treatment regimes. Biometrics, 80(1):ujad017.
Srivastava, R. K., Shyam, P., Mutz, F., Jaśkowski, W., and Schmidhuber, J. (2019). Training agents using upside-down reinforcement learning. arXiv preprint arXiv:1912.02877.
Sutton, R. S. (1995). Generalization in reinforcement learn-
ing: Successful examples using sparse coarse coding.
Advances in Neural Information Processing Systems, 8.
Wehenkel, L., Ernst, D., and Geurts, P. (2006). Ensembles
of extremely randomized trees and some generic ap-
plications. In Robust Methods for Power System State Estimation and Load Forecasting.
Winter, E. (2002). The Shapley value. Handbook of Game Theory with Economic Applications, 3:2025–2054.
Yu, C., Liu, J., Nemati, S., and Yin, G. (2021). Reinforce-
ment learning in healthcare: A survey. ACM Comput-
ing Surveys (CSUR), 55(1):1–36.
Zhao, Y., Kosorok, M. R., and Zeng, D. (2009). Reinforce-
ment learning design for cancer clinical trials. Statistics in Medicine, 28(26):3294–3315.