Authors:
Wilson Alzate Calderón 1; Alexandra Pomares Quimbaya 1; Rafael A. Gonzalez 1 and Oscar Mauricio Muñoz 2
Affiliations:
1 Pontificia Universidad Javeriana, Colombia; 2 Pontificia Universidad Javeriana and Hospital Universitario San Ignacio, Colombia
Keyword(s):
Electronic Medical Record, Big Data, Natural Language Processing, Text Mining, Framework.
Abstract:
In the healthcare domain, the analysis of Electronic Medical Records (EMR) may be classified as a Big Data
problem, since it exhibits the three fundamental characteristics: Volume, Variety and Velocity. A major drawback is
that most of the information contained in medical records is narrative text, where natural language processing
and text mining are key technologies to enhance the utility of medical records for research, analysis and
decision support. Among the tasks performed for natural language processing, the most critical, in terms of
time consumption, are the pre-processing tasks that give some structure to the original non-structured text.
Studying existing research on the use of Big Data techniques in the healthcare domain reveals few practical
contributions, especially for EMR analysis. To fill this gap, this paper presents BigTexts, a framework that
provides pre-built functionalities for the execution of pre-processing tasks over narrative texts contained in
EMR using Big Data techniques. BigTexts enables faster results on EMR narrative text analysis, improving
decision making in healthcare.