
94/Medical-Corpus-Semantic-Similarity-Evaluation.git.
All the computations presented in this paper were
performed using the Gricad infrastructure
(https://gricad.univ-grenoble-alpes.fr; Gricad, n.d.),
which is supported by Grenoble research communities.
REFERENCES
Alam, F., Afzal, M., and Malik, K. M. (2020). Comparative
analysis of semantic similarity techniques for medical
text. In 2020 International Conference on Information
Networking (ICOIN), pages 106–109.
Banerjee, S. and Lavie, A. (2005). METEOR: An automatic
metric for MT evaluation with improved correlation
with human judgments. In Proceedings of the ACL
Workshop on Intrinsic and Extrinsic Evaluation
Measures for Machine Translation and/or Summarization,
pages 65–72, Ann Arbor, Michigan. Association for
Computational Linguistics.
Beltagy, I., Lo, K., and Cohan, A. (2019). SciBERT: A
pretrained language model for scientific text. In Inui,
K., Jiang, J., Ng, V., and Wan, X., editors,
Proceedings of the 2019 Conference on Empirical Methods
in Natural Language Processing and the 9th
International Joint Conference on Natural Language
Processing (EMNLP-IJCNLP), pages 3615–3620, Hong
Kong, China. Association for Computational
Linguistics.
Chen, Z., Song, Y., Chang, T.-H., and Wan, X. (2020).
Generating radiology reports via memory-driven
transformer. In Proceedings of the 2020 Conference on
Empirical Methods in Natural Language Processing
(EMNLP), pages 1439–1449, Online. Association for
Computational Linguistics.
Endo, M., Krishnan, R., Krishna, V., Ng, A. Y., and
Rajpurkar, P. (2021). Retrieval-based chest X-ray report
generation using a pre-trained contrastive
language-image model. In Proceedings of Machine Learning
for Health, volume 158 of Proceedings of Machine
Learning Research, pages 209–219.
Gricad (n.d.). Infrastructure supported by Grenoble
research communities.
Honnibal, M., Montani, I., Van Landeghem, S., and Boyd,
A. (2020). spaCy: Industrial-strength Natural
Language Processing in Python.
Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S.,
Chute, C., Marklund, H., Haghgoo, B., Ball, R.,
Shpanskaya, K., Seekins, J., Mong, D. A., Halabi, S. S.,
Sandberg, J. K., Jones, R., Larson, D. B., Langlotz,
C. P., Patel, B. N., Lungren, M. P., and Ng, A. Y.
(2019). CheXpert: A large chest radiograph dataset
with uncertainty labels and expert comparison.
Proceedings of the AAAI Conference on Artificial
Intelligence, 33(01):590–597.
Jain, S., Agrawal, A., Saporta, A., Truong, S., Duong,
D. N., Bui, T., Chambon, P., Zhang, Y., Lungren,
M. P., Ng, A. Y., Langlotz, C., and Rajpurkar, P.
(2021). RadGraph: Extracting clinical entities and
relations from radiology reports. In Thirty-fifth
Conference on Neural Information Processing Systems
Datasets and Benchmarks Track (Round 1).
Johnson, A. E. W., Pollard, T. J., Berkowitz, S. J.,
Greenbaum, N. R., Lungren, M. P., Deng, C.-y., Mark, R. G.,
and Horng, S. (2019). MIMIC-CXR, a de-identified
publicly available database of chest radiographs with
free-text reports. Scientific Data, 6(1):317.
Li, J., Sun, Y., Johnson, R. J., Sciaky, D., Wei, C.-H.,
Leaman, R., Davis, A. P., Mattingly, C. J., Wiegers, T. C.,
and Lu, Z. (2016). BioCreative V CDR task corpus:
a resource for chemical disease relation extraction.
Database, 2016:baw068.
Lin, C.-Y. (2004). ROUGE: A package for automatic
evaluation of summaries. In Text Summarization Branches
Out, pages 74–81, Barcelona, Spain. Association for
Computational Linguistics.
Miura, Y., Zhang, Y., Tsai, E., Langlotz, C., and Jurafsky,
D. (2021). Improving factual completeness and
consistency of image-to-text radiology report generation.
In Toutanova, K., Rumshisky, A., Zettlemoyer, L.,
Hakkani-Tur, D., Beltagy, I., Bethard, S., Cotterell, R.,
Chakraborty, T., and Zhou, Y., editors, Proceedings of
the 2021 Conference of the North American Chapter
of the Association for Computational Linguistics:
Human Language Technologies, pages 5288–5304,
Online. Association for Computational Linguistics.
Neumann, M., King, D., Beltagy, I., and Ammar, W. (2019).
ScispaCy: Fast and Robust Models for Biomedical
Natural Language Processing. In Proceedings of the
18th BioNLP Workshop and Shared Task, pages 319–327,
Florence, Italy. Association for Computational
Linguistics.
Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. (2002).
BLEU: a method for automatic evaluation of machine
translation. In Proceedings of the 40th Annual
Meeting of the Association for Computational Linguistics,
pages 311–318, Philadelphia, Pennsylvania, USA.
Association for Computational Linguistics.
Patricoski, J., Kreimeyer, K., Balan, A., Hardart, K., Tao,
J., Anagnostou, V., Botsis, T., Investigators, J. H. M.
T. B., et al. (2022). An evaluation of pretrained BERT
models for comparing semantic similarity across
unstructured clinical trial texts. Stud Health Technol
Inform, 289:18–21.
Smit, A., Jain, S., Rajpurkar, P., Pareek, A., Ng, A., and
Lungren, M. P. (2020). CheXbert: Combining automatic
labelers and expert annotations for accurate radiology
report labeling using BERT. In Conference on
Empirical Methods in Natural Language Processing.
Yu, F., Endo, M., Krishnan, R., Pan, I., Tsai, A., Reis, E. P.,
Fonseca, E. K. U. N., Ho Lee, H. M., Abad, Z. S. H.,
Ng, A. Y., Langlotz, C. P., Venugopal, V. K., and
Rajpurkar, P. (2022). Evaluating Progress in Automatic
Chest X-Ray Radiology Report Generation. preprint,
Radiology and Imaging.
BIOINFORMATICS 2024 - 15th International Conference on Bioinformatics Models, Methods and Algorithms