on developing more interpretable systems or generic 
prediction explanation methods that mitigate this
problem. Moreover, such systems could be very
powerful when combined with a human-in-the-loop 
approach, by allowing the human to learn how text 
can be written to teach the correct code to the system. 