
Borji, A. (2023). A categorical archive of ChatGPT failures.
arXiv preprint arXiv:2302.03494.
Chen, C., Li, O., Tao, D., Barnett, A., Rudin, C., and Su,
J. K. (2019). This looks like that: deep learning for
interpretable image recognition. Advances in Neural
Information Processing Systems, 32.
Choi, J. H., Hickman, K. E., Monahan, A., and Schwarcz,
D. (2023). ChatGPT goes to law school. Available at
SSRN.
Daws, R. (2020). Medical chatbot using OpenAI’s GPT-3
told a fake patient to kill themselves. AI News,
https://artificialintelligence-news.com/2020/10/28/
medical-chatbot-openai-gpt3-patient-kill-themselves/.
Accessed May 2021.
Eykholt, K., Evtimov, I., Fernandes, E., Li, B., Rahmati,
A., Xiao, C., Prakash, A., Kohno, T., and Song, D.
(2018). Robust physical-world attacks on deep learn-
ing visual classification. In Proceedings of the IEEE
conference on computer vision and pattern recogni-
tion, pages 1625–1634.
Gupta, A., Anpalagan, A., Guan, L., and Khwaja, A. S.
(2021). Deep learning for object detection and scene
perception in self-driving cars: Survey, challenges,
and open issues. Array, 10:100057.
John-Mathews, J.-M. (2022). Some critical and ethical
perspectives on the empirical turn of AI interpretabil-
ity. Technological Forecasting and Social Change,
174:121209.
Krishna, S., Han, T., Gu, A., Pombra, J., Jabbari, S., Wu,
S., and Lakkaraju, H. (2022). The disagreement prob-
lem in explainable machine learning: A practitioner’s
perspective. arXiv preprint arXiv:2202.01602.
Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W.-t., Rocktäschel, T., et al. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems, 33:9459–9474.
Li, J. et al. (2022). Recent advances in end-to-end automatic
speech recognition. APSIPA Transactions on Signal
and Information Processing, 11(1).
Lundberg, S. M. and Lee, S.-I. (2017). A unified approach
to interpreting model predictions. Advances in Neural
Information Processing Systems, 30.
MacDonald, S., Steven, K., and Trzaskowski, M. (2022).
Interpretable AI in healthcare: Enhancing fairness,
safety, and trust. In Artificial Intelligence in
Medicine: Applications, Limitations and Future Di-
rections, pages 241–258. Springer.
OpenAI (2023). GPT-4 technical report. arXiv preprint
arXiv:2303.08774.
Parekh, D., Poddar, N., Rajpurkar, A., Chahal, M., Ku-
mar, N., Joshi, G. P., and Cho, W. (2022). A review
on autonomous vehicles: Progress, methods and chal-
lenges. Electronics, 11(14):2162.
Ribeiro, M. T., Singh, S., and Guestrin, C. (2016). ”Why
should I trust you?”: Explaining the predictions of any
classifier. In Proceedings of the 22nd ACM SIGKDD
international conference on knowledge discovery and
data mining, pages 1135–1144.
Rudin, C. (2019). Stop explaining black box machine learn-
ing models for high stakes decisions and use inter-
pretable models instead. Nature Machine Intelligence,
1(5):206–215.
Rudin, C. and Radin, J. (2019). Why are we using black box
models in AI when we don’t need to? A lesson from
an explainable AI competition. Harvard Data Science
Review, 1(2):1–9.
Slack, D., Hilgard, S., Jia, E., Singh, S., and Lakkaraju, H.
(2020). Fooling LIME and SHAP: Adversarial attacks
on post hoc explanation methods. In Proceedings of
the AAAI/ACM Conference on AI, Ethics, and Society,
pages 180–186.
Torabi, S. and Wahde, M. (2017). Fuel consumption op-
timization of heavy-duty vehicles using genetic algo-
rithms. In 2017 IEEE Congress on Evolutionary Com-
putation (CEC), pages 29–36. IEEE.
Tyson, J. (2023). Shortcomings of ChatGPT. Journal of
Chemical Education, 100(8):3098–3101.
Vaishya, R., Misra, A., and Vaish, A. (2023). ChatGPT: Is
this version good for healthcare and research? Di-
abetes & Metabolic Syndrome: Clinical Research &
Reviews, 17(4):102744.
Veitch, E. and Alsos, O. A. (2022). A systematic review
of human-AI interaction in autonomous ship systems.
Safety Science, 152:105778.
Virgolin, M., De Lorenzo, A., Randone, F., Medvet, E., and
Wahde, M. (2021). Model learning with personalized
interpretability estimation (ML-PIE). In Proceedings
of the Genetic and Evolutionary Computation Confer-
ence Companion, pages 1355–1364.
Wahde, M., Della Vedova, M. L., Virgolin, M., and Su-
vanto, M. (2023). An interpretable method for auto-
mated classification of spoken transcripts and written
text. Evolutionary Intelligence, pages 1–13.
Wahde, M. and Virgolin, M. (2023). DAISY: An implemen-
tation of five core principles for transparent and ac-
countable conversational AI. International Journal of
Human–Computer Interaction, 39(9):1856–1873.
Zhang, Y., Li, Y., Cui, L., Cai, D., Liu, L., Fu, T., Huang,
X., Zhao, E., Zhang, Y., Chen, Y., et al. (2023).
Siren’s song in the AI ocean: A survey on hallu-
cination in large language models. arXiv preprint
arXiv:2309.01219.
Zhou, C., Li, Q., Li, C., Yu, J., Liu, Y., Wang, G.,
Zhang, K., Ji, C., Yan, Q., He, L., et al. (2023). A
comprehensive survey on pretrained foundation mod-
els: A history from BERT to ChatGPT. arXiv preprint
arXiv:2302.09419.