Barcelona, Spain. Association for Computational Linguistics.
Cardon, R., Grabar, N., Grouin, C., and Hamon, T. (2020).
Presentation of the DEFT 2020 Challenge: open domain
textual similarity and precise information extraction
from clinical cases. In Actes de la 6e conférence
conjointe Journées d’Études sur la Parole (JEP, 33e
édition), Traitement Automatique des Langues Naturelles
(TALN, 27e édition), Rencontre des Étudiants Chercheurs
en Informatique pour le Traitement Automatique des
Langues (RÉCITAL, 22e édition). Atelier DÉfi Fouille de
Textes, pages 1–13, Nancy, France. ATALA et AFCP.
Cer, D., Diab, M., Agirre, E., Lopez-Gazpio, I., and
Specia, L. (2017). SemEval-2017 Task 1: Semantic
Textual Similarity Multilingual and Crosslingual
Focused Evaluation. In Proceedings of the 11th
International Workshop on Semantic Evaluation
(SemEval-2017), pages 1–14, Vancouver, Canada.
Association for Computational Linguistics.
Chandrasekaran, D. and Mago, V. (2021). Evolution of
Semantic Similarity—A Survey. ACM Comput. Surv.,
54(2). New York, NY, USA: Association for Computing
Machinery.
Chen, Q., Du, J., Kim, S., Wilbur, W. J., and Lu, Z.
(2020). Deep learning with sentence embeddings
pre-trained on biomedical corpora improves the
performance of finding similar sentences in electronic
medical records. BMC Medical Informatics and Decision
Making, 20(1):73.
Dice, L. R. (1945). Measures of the Amount of Ecologic
Association Between Species. Ecology, 26(3):297–302.
Grabar, N. and Cardon, R. (2018). CLEAR – Simple
Corpus for Medical French. In Proceedings of the 1st
Workshop on Automatic Text Adaptation (ATA), pages
3–9, Tilburg, the Netherlands. Association for
Computational Linguistics.
Grabar, N., Claveau, V., and Dalloux, C. (2018). CAS:
French Corpus with Clinical Cases. In Lavelli, A.,
Minard, A.-L., and Rinaldi, F., editors, Proceedings of
the Ninth International Workshop on Health Text
Mining and Information Analysis, Louhi@EMNLP 2018,
Brussels, Belgium, October 31, 2018, pages 122–128.
Association for Computational Linguistics.
Jaccard, P. (1912). The Distribution of the Flora in the
Alpine Zone. New Phytologist, 11(2):37–50.
https://nph.onlinelibrary.wiley.com/doi/pdf/10.1111/j.1469-8137.1912.tb05611.x.
Jones, K. S. (2004). A statistical interpretation of term
specificity and its application in retrieval. Journal of
Documentation. Emerald Group Publishing Limited.
Le, Q. V. and Mikolov, T. (2014). Distributed
Representations of Sentences and Documents.
arXiv:1405.4053 [cs].
Levenshtein, V. I. (1965). Binary codes capable of
correcting deletions, insertions, and reversals. Soviet
physics. Doklady, 10:707–710.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean,
J. (2013). Distributed Representations of Words and
Phrases and their Compositionality. arXiv:1310.4546
[cs, stat].
Ochiai, A. (1957). Zoogeographical studies on the soleoid
fishes found in Japan and its neighbouring regions-II.
Bull. Jpn. Soc. scient. Fish., 22:526–530.
P, S. and Shaji, A. P. (2019). A Survey on Semantic
Similarity. In 2019 International Conference on Advances
in Computing, Communication and Control (ICAC3),
pages 1–8.
Rastegar-Mojarad, M., Liu, S., Wang, Y., Afzal, N., Wang,
L., Shen, F., Fu, S., and Liu, H. (2018).
BioCreative/OHNLP Challenge 2018. In Proceedings of the
2018 ACM International Conference on Bioinformatics,
Computational Biology, and Health Informatics,
BCB ’18, page 575, New York, NY, USA. Association
for Computing Machinery.
Soğancıoğlu, G., Öztürk, H., and Özgür, A. (2017).
BIOSSES: a semantic sentence similarity estimation
system for the biomedical domain. Bioinformatics,
33(14):i49–i58.
Ukkonen, E. (1992). Approximate string-matching with
q-grams and maximal matches. Theoretical Computer
Science, 92(1):191–211.
CONCORDIA: COmputing semaNtic sentenCes for fRench Clinical Documents sImilArity