The Comprehension of Medical Words - Cross-lingual Experiments in French and Xhosa
Natalia Grabar, Izak van Zyl, Retha de la Harpe, Thierry Hamon
2014
Abstract
This paper presents cross-lingual experiments on the automatic detection of medical words that may be difficult for patients to understand. The study relies on Natural Language Processing (NLP) methods and proceeds in three steps across two languages, French and Xhosa: (1) the French data are processed with NLP methods and tools to reproduce the manual categorization of words as understandable or not; (2) the Xhosa data are clustered with an unsupervised algorithm; (3) the Xhosa results are analyzed and compared with the results obtained on the French data. Some similarities between the two languages are observed.
Paper Citation
in Harvard Style
Grabar N., van Zyl I., de la Harpe R. and Hamon T. (2014). The Comprehension of Medical Words - Cross-lingual Experiments in French and Xhosa. In Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2014) ISBN 978-989-758-010-9, pages 334-342. DOI: 10.5220/0004803803340342
in Bibtex Style
@conference{healthinf14,
author={Natalia Grabar and Izak van Zyl and Retha de la Harpe and Thierry Hamon},
title={The Comprehension of Medical Words - Cross-lingual Experiments in French and Xhosa},
booktitle={Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2014)},
year={2014},
pages={334-342},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004803803340342},
isbn={978-989-758-010-9},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2014)
TI - The Comprehension of Medical Words - Cross-lingual Experiments in French and Xhosa
SN - 978-989-758-010-9
AU - Grabar N.
AU - van Zyl I.
AU - de la Harpe R.
AU - Hamon T.
PY - 2014
SP - 334
EP - 342
DO - 10.5220/0004803803340342