Grouin, C., Griffon, N., and Névéol, A. (2015). Is it possible
to recover personal health information from an
automatically de-identified corpus of French EHRs? In
Proceedings of the Sixth International Workshop on
Health Text Mining and Information Analysis, pages
31–39.
HIPAA (1996). Guidance regarding methods for
de-identification of protected health information
in accordance with the Health Insurance
Portability and Accountability Act (HIPAA)
Privacy Rule, https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html.
Accessed: 2020-01-17.
Kushida, C. A., Nichols, D. A., Jadrnicek, R., Miller, R.,
Walsh, J. K., and Griffin, K. (2012). Strategies for de-
identification and anonymization of electronic health
record data for use in multicenter research studies.
Medical care, 50(Suppl):S82.
Lange, L., Adel, H., and Strötgen, J. (2019). NLNDE:
The Neither-Language-Nor-Domain-Experts’ Way of
Spanish Medical Document De-Identification. arXiv
preprint arXiv:2007.01030.
Li, X.-B. and Qin, J. (2017). Anonymizing and sharing
medical text records. Information Systems Research,
28(2):332–352.
Loukides, G., Denny, J. C., and Malin, B. (2010). The dis-
closure of diagnosis codes can breach research par-
ticipants’ privacy. Journal of the American Medical
Informatics Association, 17(3):322–327.
Machanavajjhala, A., Kifer, D., Gehrke, J., and Venkita-
subramaniam, M. (2007). l-diversity: Privacy beyond
k-anonymity. ACM Transactions on Knowledge Dis-
covery from Data (TKDD), 1(1):3–es.
Marimon, M., Gonzalez-Agirre, A., Intxaurrondo, A., Rodríguez,
H., Lopez Martin, J., Villegas, M., and
Krallinger, M. (2019). Automatic De-Identification of
Medical Texts in Spanish: the MEDDOCAN Track,
Corpus, Guidelines, Methods and Evaluation of Re-
sults. In Proceedings of the Iberian Languages Eval-
uation Forum (IberLEF 2019). vol. TBA, p. TBA.
CEUR Workshop Proceedings (CEUR-WS.org), Bilbao,
Spain (Sep 2019), TBA.
Meystre, S. M., Ferrández, Ó., Friedlin, F. J., South, B. R.,
Shen, S., and Samore, M. H. (2014a). Text de-
identification for privacy protection: A study of its
impact on clinical text information content. Journal
of biomedical informatics, 50:142–150.
Meystre, S. M., Friedlin, F. J., South, B. R., Shen, S., and
Samore, M. H. (2010). Automatic de-identification
of textual documents in the electronic health record:
a review of recent research. BMC medical research
methodology, 10(1):70.
Meystre, S. M., Shen, S., Hofmann, D., and Gundlapalli,
A. V. (2014b). Can physicians recognize their own
patients in de-identified notes? In MIE, pages 778–
782.
Nadeau, D. and Sekine, S. (2007). A survey of named entity
recognition and classification. Lingvisticae Investiga-
tiones, 30(1):3–26.
Nowok, B., Raab, G. M., and Dibben, C. (2016). synthpop:
Bespoke Creation of Synthetic Data in R. Journal of
Statistical Software, 74(11):1–26.
Obeid, J. S., Heider, P. M., Weeda, E. R., Matuskowitz,
A. J., Carr, C. M., Gagnon, K., Crawford, T., and
Meystre, S. M. (2019). Impact of de-identification on
clinical text classification using traditional and deep
learning classifiers. Studies in health technology and
informatics, 264:283.
Papernot, N., Abadi, M., Erlingsson, U., Goodfellow, I., and
Talwar, K. (2017). Semi-supervised knowledge transfer
for deep learning from private training data. Proceedings
of the 5th International Conference on Learning
Representations, ICLR 2017, Toulon, France, April
24-26, 2017.
Rama, T., Brekke, P., Nytrø, Ø., and Øvrelid, L. (2018).
Iterative development of family history annotation
guidelines using a synthetic corpus of clinical text. In
Proceedings of the Ninth International Workshop on
Health Text Mining and Information Analysis, pages
111–121.
Stubbs, A., Filannino, M., and Uzuner, Ö. (2017). De-identification
of psychiatric intake records: Overview
of 2016 CEGS N-GRID shared tasks Track 1. Journal of
biomedical informatics, 75:S4–S18.
Stubbs, A., Kotfila, C., and Uzuner, Ö. (2015). Automated
systems for the de-identification of longitudinal
clinical narratives: Overview of 2014 i2b2/UTHealth
shared task track 1. Journal of biomedical informatics,
58:S11–S19.
Sweeney, L. (1996). Replacing personally-identifying information
in medical records, the Scrub system. In
Proceedings of the AMIA annual fall symposium, page
333. American Medical Informatics Association.
Uzuner, Ö., Luo, Y., and Szolovits, P. (2007). Evaluating the
state-of-the-art in automatic de-identification. Jour-
nal of the American Medical Informatics Association,
14(5):550–563.
Wagner, I. and Eckhoff, D. (2018). Technical privacy met-
rics: a systematic survey. ACM Computing Surveys
(CSUR), 51(3):1–38.
Yeniterzi, R., Aberdeen, J., Bayer, S., Wellner, B.,
Hirschman, L., and Malin, B. (2010). Effects of
personal identifier resynthesis on clinical text de-
identification. Journal of the American Medical In-
formatics Association, 17(2):159–168.
Yoo, J. S., Thaler, A., Sweeney, L., and Zang, J. (2018).
Risks to patient privacy: A re-identification of patients
in Maine and Vermont statewide hospital data. Technology
Science. Oct.
De-identification of Clinical Text for Secondary Use: Research Issues