Doğan, R. I., Leaman, R., and Lu, Z. (2014). NCBI disease corpus: a resource for disease name recognition and concept normalization. Journal of Biomedical Informatics, 47:1–10.
Dragulinescu, S. (2016). Inference to the best explanation and mechanisms in medicine. Theoretical Medicine and Bioethics, 37:211–232.
Eyre, H., Chapman, A. B., Peterson, K. S., Shi, J., Alba, P. R., Jones, M. M., Box, T. L., DuVall, S. L., and Patterson, O. V. (2021). Launching into clinical space with medspaCy: a new clinical text processing toolkit in Python. In AMIA Annual Symposium Proceedings, volume 2021, page 438. American Medical Informatics Association.
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5):378.
Gu, Y., Tinn, R., Cheng, H., Lucas, M., Usuyama,
N., Liu, X., Naumann, T., Gao, J., and Poon, H.
(2020). Domain-specific language model pretraining
for biomedical natural language processing.
Gururangan, S., Marasović, A., Swayamdipta, S., Lo, K., Beltagy, I., Downey, D., and Smith, N. A. (2020). Don't stop pretraining: Adapt language models to domains and tasks. In Proceedings of ACL.
Honnibal, M. and Montani, I. (2017). spaCy 2: Natural lan-
guage understanding with Bloom embeddings, convo-
lutional neural networks and incremental parsing. To
appear.
Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., and Szolovits, P. (2021). What disease does this patient have? A large-scale open domain question answering dataset from medical exams. Applied Sciences, 11(14):6421.
Johnson, R. H. (2000). Manifest Rationality: A Pragmatic Theory of Argument. Lawrence Erlbaum Associates.
Josephson, J. R. and Josephson, S. G. (1994). Abductive Inference: Computation, Philosophy, Technology. Cambridge University Press.
Kim, J.-D., Ohta, T., Tsuruoka, Y., Tateisi, Y., and Collier, N. (2004). Introduction to the bio-entity recognition task at JNLPBA. In Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications, pages 70–75. Citeseer.
Köhler, S., Gargano, M., Matentzoglu, N., Carmody, L. C., Lewis-Smith, D., Vasilevsky, N. A., Danis, D., Balagura, G., Baynam, G., Brower, A. M., et al. (2021). The Human Phenotype Ontology in 2021. Nucleic Acids Research, 49(D1):D1207–D1217.
Krallinger, M., Rabal, O., Leitner, F., Vazquez, M., Salgado, D., Lu, Z., Leaman, R., Lu, Y., Ji, D., Lowe, D. M., et al. (2015). The CHEMDNER corpus of chemicals and drugs and its annotation principles. Journal of Cheminformatics, 7(1):1–17.
Kumar, S. and Talukdar, P. (2020). NILE: Natural language inference with faithful natural language explanations. arXiv preprint arXiv:2005.12116.
Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., and Soricut, R. (2019). ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942.
Lee, J., Yoon, W., Kim, S., Kim, D., Kim, S., So, C. H., and Kang, J. (2020). BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240.
Li, J., Sun, Y., Johnson, R. J., Sciaky, D., Wei, C.-H., Leaman, R., Davis, A. P., Mattingly, C. J., Wiegers, T. C., and Lu, Z. (2016). BioCreative V CDR task corpus: a resource for chemical disease relation extraction. Database, 2016.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
Manzini, E., Garrido-Aguirre, J., Fonollosa, J., and Perera-Lluna, A. (2022). Mapping layperson medical terminology into the Human Phenotype Ontology using neural machine translation models. Expert Systems with Applications, 204:117446.
Michalopoulos, G., Wang, Y., Kaka, H., Chen, H., and Wong, A. (2020). UmlsBERT: Clinical domain knowledge augmentation of contextual embeddings using the Unified Medical Language System Metathesaurus. arXiv preprint arXiv:2010.10391.
Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267:1–38.
Mohan, S. and Li, D. (2019). MedMentions: A large biomedical corpus annotated with UMLS concepts. arXiv preprint arXiv:1902.09476.
Narang, S., Raffel, C., Lee, K., Roberts, A., Fiedel, N., and Malkan, K. (2020). WT5?! Training text-to-text models to explain their predictions. arXiv preprint arXiv:2004.14546.
Naseem, U., Khushi, M., Reddy, V. B., Rajendran, S., Razzak, I., and Kim, J. (2021). BioALBERT: A simple and effective pre-trained language model for biomedical named entity recognition. In 2021 International Joint Conference on Neural Networks (IJCNN), pages 1–7.
Ngai, H. and Rudzicz, F. (2022). Doctor XAvIer: Explain-
able diagnosis on physician-patient dialogues and
XAI evaluation. In Proceedings of the 21st Workshop
on Biomedical Language Processing, pages 337–344,
Dublin, Ireland. Association for Computational Lin-
guistics.
Pennington, J., Socher, R., and Manning, C. D. (2014). GloVe: Global vectors for word representation. In Proceedings of EMNLP 2014, pages 1532–1543.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D.,
Sutskever, I., et al. (2019). Language models are un-
supervised multitask learners. OpenAI blog, 1(8):9.
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang,
S., Matena, M., Zhou, Y., Li, W., and Liu, P. J.
(2019). Exploring the limits of transfer learning
with a unified text-to-text transformer. arXiv preprint
arXiv:1910.10683.
raj Kanakarajan, K., Kundumani, B., and Sankarasubbu, M. (2021). BioELECTRA: Pretrained biomedical text encoder using discriminators. In Proceedings of the 20th Workshop on Biomedical Language Processing.