Text Mining of Medical Documents in Spanish: Semantic Annotation and Detection of Recommendations

Carlos Tellería, Sergio Ilarri and Carlos Sánchez

Instituto Aragonés de Ciencias de la Salud, Zaragoza, Spain

I3A, University of Zaragoza, Zaragoza, Spain

University of Zaragoza, Zaragoza, Spain

Keywords:

Medical Documents, Information Extraction, Text Mining, Classiﬁcation, Semantic Annotation, Detection of

Recommendations in Texts, Spanish Texts.

Abstract:

In medical practice, identifying relevant facts and therapeutic recommendations from health-related documents

is a key issue to ensure an efﬁcient and effective service to patients. However, the automatic analysis of text

documents to extract relevant data is a challenging task. This is the case particularly when we deal with

documents written in languages other than English, for which the availability of lexical resources and tools is

much more limited and fewer experiences have been reported. In this paper, we present our experience dealing

with texts written in Spanish in a medical context. By applying text mining techniques and exploiting semantic

resources, we present an approach to automatically label documents using appropriate medical terms. Besides,

we also describe a technique that attempts to detect practice recommendations for doctors automatically in

clinical guides. An experimental evaluation shows the beneﬁts of applying text mining techniques as a support

system for doctors as well as its feasibility. The scarcity of experimental evaluations with medical documents

in Spanish motivated our work.

1 INTRODUCTION

The number of text documents containing relevant medical information is continuously growing.

Whereas this is a positive trend that proves a signif-

icant dissemination of research results in the health

area, it is also very challenging for doctors to identify

the most relevant data and keep up with the latest re-

search and medical recommendations. Even within a

single document dealing with a speciﬁc health topic, it

can be difﬁcult to quickly ﬁnd the key points and dis-

tinguish recent research results from well-established

practice recommendations and guidelines, especially

if this has to be done in a short time while examining

a patient during a consultation. In this context, the

development of software support tools that can help

health professionals to filter and identify relevant information quickly would be very valuable. Thus, a

tool assisting in the task of identifying relevant facts

and therapeutic recommendations from health-related

documents could improve the efficiency and effectiveness of health providers.

https://orcid.org/0000-0002-6394-3212

https://orcid.org/0000-0002-7073-219X

For this purpose, the application of text mining

techniques could be very helpful. However, analyz-

ing text written in natural language is challenging.

Moreover, the difﬁculties increase signiﬁcantly when

we have to manage documents written in non-English

languages, as appropriate lexical resources and tools

are scarce in that case and the number of experiences reported is significantly smaller.

As the beneﬁts for both citizens and health pro-

fessionals could be huge and the amount of research

performed in this context is still quite limited, we

are researching techniques to deal with unstructured

medical documents written in Spanish. More speciﬁ-

cally, in this paper, we present a practical experience

developed to tackle two problems: the automatic la-

belling of medical documents using suitable medical

concepts and the identiﬁcation of recommendations

and guidelines (practice recommendations for doc-

tors) in health-related texts. Furthermore, an experi-

mental evaluation using anonymized clinical histories

(for the labelling task) and clinical guides (for the de-

tection of recommendations) shows the beneﬁts and

the feasibility of applying text mining techniques as a support system for doctors. The structure of the rest of this paper is as follows. In Section 2, we describe the state of the art. In Section 3, we present our approach for the automatic labelling of medical documents. In Section 4, we describe the technique used to detect practice recommendations in texts. Finally, in Section 5, we show our conclusions and outline some prospective lines of future work.

Tellería, C., Ilarri, S. and Sánchez, C.
Text Mining of Medical Documents in Spanish: Semantic Annotation and Detection of Recommendations.
DOI: 10.5220/0010059101970208
In Proceedings of the 16th International Conference on Web Information Systems and Technologies (WEBIST 2020), pages 197-208
ISBN: 978-989-758-478-7

2 RELATED WORK

Exploiting data available in a non-structured format,

such as text documents, is a difﬁcult task for which

a myriad of text mining techniques have been developed (Aggarwal and Zhai, 2012). Typical operations that can be performed with texts include: information retrieval (Manning et al., 2008) (retrieval of relevant documents satisfying a given query, usually a keyword-based query), text classification (Sebastiani, 2002) (automatic assignment of documents to an appropriate category from a set of predefined classes), information extraction (Jiang, 2012) (extraction of specific data from the text), textual annotation (Liao and Zhao, 2019) (automatic assignment of suitable labels to texts), named entity recognition (Marrero et al., 2013) (detection of named entities such as references to people, company names, geographic places, etc.), and document summarization (Gholamrezazadeh et al., 2009).

Concerning speciﬁcally health documents, there is

also a growing interest in applying text mining tech-

niques to automatically process text data in order to

maximize the probability of ﬁnding the relevant data

and minimize the cost, which would lead to an over-

all improvement of health services. A typical exam-

ple is the use of text mining in biomedicine (Simp-

son and Demner-Fushman, 2012; Spasic et al., 2005).

Most works that apply text mining on medical docu-

ments focus on a speciﬁc area, such as oncology (Yim

et al., 2016), radiology (Pons et al., 2016), geriatrics (Chen et al., 2019), or suicide prevention (Coppersmith et al., 2018). According to (Marrero et al., 2010), two peculiarities of the biomedical domain imply additional difficulties: the difficulty of reaching terminological consensus and the lack of terminological patterns in practice.

Most existing text mining approaches over med-

ical documents focus on texts written in English,

where a number of tools and linguistic resources are

available. According to (Névéol et al., 2018), where the challenges and opportunities of clinical natural language processing in languages other than English are studied, “Chinese and Spanish have recently attracted sustained efforts”, but studies for Spanish still lag behind those for other non-English languages such as French, German and even Chinese. As examples of efforts performed for the Spanish language, we can cite (Castaño et al., 2016; Costumero et al., 2014; Marimon et al., 2019). Thus, in (Castaño et al., 2016) an unsupervised machine learning approach to discover the equivalence between terms (considering synonyms, abbreviations, acronyms, and frequent typos) is presented; although

this work focuses on documents in Spanish, no spe-

ciﬁc resource for Spanish was used. The work pre-

sented in (Costumero et al., 2014) tackles the problem

of detecting negation regarding clinical conditions in

Spanish medical documents. Techniques to process

Spanish medical texts to remove sensitive patient in-

formation have also been proposed (Marimon et al.,

2019). Finally, it is also interesting to mention the

possibility of applying automatic machine translation

to medical documents, which would potentially en-

able the application of resources and tools available

for the chosen target language (Wu et al., 2011). As

opposed to these works, in this paper we present our

experience concerning the use of text mining for the

semantic annotation of medical documents and for the

detection of recommendations in clinical guides. This

contributes to the state of the art by reporting how dif-

ferent techniques can be exploited to provide suitable

results, thus increasing the scarce amount of experi-

ences with medical texts in Spanish.

3 AUTOMATIC LABELLING OF HEALTH DOCUMENTS IN SPANISH

We have decided to use two lexical resources as a ba-

sis for automatic labelling of medical documents in

Spanish: SNOMED CT (Spanish edition) and DeCS.

3.1 SNOMED CT

SNOMED CT (Systematized Nomenclature of Medi-

cine – Clinical Terms) (Cornet and de Keizer, 2008;

SNOMED International, 2020) is a clinical terminol-

ogy that has been translated to several languages, in-

cluding Spanish. It is delivered through two CSV

ﬁles, one containing the terms (more than 1 million

terms) and their classes, and another one with the re-

lations (more than 5 million relations) between the

terms (e.g., subclass relationships and synonymy).

Based on the Spanish edition of SNOMED CT, we

have built a dictionary of terms containing 951213

terms (including synonyms) divided into 20 classes.

Although the initial number of classes in SNOMED

CT was 98, we have combined some classes into

a single one when, based on the available information, they were considered too similar (e.g., “medicamento clínico”, which is “clinical drug” in English, and “fármaco de uso clínico”, which is “drug of clinical use” in English). Table 1 shows some examples

of terms correctly detected in clinical histories of our

dataset thanks to the use of SNOMED CT.

Table 1: SNOMED CT: examples of terms detected in our dataset of clinical histories.

Term                   Associated class (or associated classes)
cambio degenerativo    anomalía morfológica
canal                  estructura corporal
cuerpo vertebral       estructura corporal
deshidratación         trastorno
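The construction of the term dictionary described in this section, including the merging of classes considered too similar, could look roughly as follows. This is only a minimal sketch: the two-column layout (`term`, `class`), the sample rows, and the names `SAMPLE_CSV`, `MERGED_CLASSES` and `build_dictionary` are assumptions made for illustration and do not reflect the actual layout of the SNOMED CT files.

```python
import csv
import io
from typing import Dict

# Hypothetical two-column layout (term, class) with made-up rows;
# the real SNOMED CT release files are laid out differently.
SAMPLE_CSV = """term,class
Deshidratación,trastorno
Cuerpo vertebral,estructura corporal
Paracetamol 500 mg,medicamento clínico
Ibuprofeno 400 mg,fármaco de uso clínico
"""

# Classes judged too similar are mapped to a single merged class.
MERGED_CLASSES = {"fármaco de uso clínico": "medicamento clínico"}


def build_dictionary(csv_text: str) -> Dict[str, str]:
    """Map each lowercased term to its (possibly merged) class."""
    dictionary: Dict[str, str] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = MERGED_CLASSES.get(row["class"], row["class"])
        dictionary[row["term"].lower()] = label
    return dictionary


dictionary = build_dictionary(SAMPLE_CSV)
```

After merging, the two drug entries share one class, so the four sample terms fall into three classes.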

3.2 DeCS

DeCS (Descriptores en Ciencias de la Salud / Health

Sciences Descriptors) (BIREME, 2020a) is a multilingual dictionary aimed at facilitating the indexing

and retrieval of scientiﬁc medical documents (stored

in specialized repositories such as LILACS and

MEDLINE). It was developed based on MeSH (Med-

ical Subject Headings) (Trieschnigg et al., 2009),

provided by the U.S. National Library of Medicine.

Thanks to a hierarchy relating the different terms, it

is possible to make searches more speciﬁc or more

general, by moving down or up through the hierarchy,

respectively. It contains 33966 descriptors and qualifiers: 29431 of them come from MeSH and 4535 are exclusive to DeCS. DeCS is delivered through several files. The one we have used is an XML file containing all the terms in Spanish; through the DeCS Web Services (BIREME, 2020b), we have retrieved information related to the field treeId, such as the upper class of the hierarchy for a given term, by using a URL with the structure http://decs.bvsalud.org/cgi-bin/mx/cgi=@vmx/decs/?tree_id=<id>. Overall, we have obtained 91823 medical terms (less than 10% of the number of terms obtained with SNOMED CT). Table 2 shows

some examples of terms correctly detected in clinical

histories of our dataset thanks to the use of DeCS.

Table 2: DeCS: examples of terms detected in our dataset of clinical histories.

Term         Associated class (or associated classes)
alergia      ENFERMEDADES - SALUD PÚBLICA
anamnesis    TÉCNICAS Y EQUIPOS ANALÍTICOS, DIAGNÓSTICOS Y TERAPÉUTICOS
bocio        ENFERMEDADES - SALUD PÚBLICA
CEC          TÉCNICAS Y EQUIPOS ANALÍTICOS, DIAGNÓSTICOS Y TERAPÉUTICOS

3.3 Annotation Methods Considered

Our main goal is to evaluate whether using the given semantic resources (SNOMED CT and DeCS) can help to achieve satisfactory annotations. Therefore, with the two resources described above, we have compared several annotation methods. Given the difficulty of finding annotated medical datasets in Spanish, it is challenging to have a large gold-standard corpus available for machine learning modeling and evaluation. Building such a corpus through manual annotation would be time consuming and would require the participation of health care professionals. Therefore, rather than applying machine learning techniques to try to learn appropriate annotation models, we rely on other types of methods (not based on machine learning) and evaluate their performance on a set of manually annotated documents in order to assess how well the automatic methods behave.

1) String Matching. We have first considered a simple model that tries to find an exact match between the words in the text and the terms in the corresponding semantic resource (SNOMED CT or DeCS). A preprocessing stage first removes invalid characters that may be present in the text documents and/or terms, and everything is transformed to lowercase for the purpose of comparison. Then, each term in the resource is compared with each n-gram in the text, to try to find suitable matches.

We tested the re Python library (Python Software Foundation, 2020), but in our experiments the execution times were high (between 38.96 and 206.53 seconds, on an HP Pavilion with Intel Core i7-8700 and 16 GB RAM, depending on the size of the document and whether SNOMED CT or DeCS was used). Finally, we used FlashText (Singh, 2017a; Singh, 2017b), which performed the task much more efficiently (in about 3.29% of the time, on average); a KeywordProcessor object is built, containing all the dictionary entries, to enable a quick detection of text matches through the use of a trie data structure (Fredkin, 1960; Sahni and Mehta, 2018).
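To illustrate why a trie speeds up dictionary matching, the following minimal sketch implements a word-level trie in plain Python. It is not FlashText's implementation or API; the names `build_trie` and `find_terms` and the toy dictionary (taken from the terms of Table 1) are assumptions made for illustration, and punctuation handling is omitted.

```python
from typing import Dict, List, Tuple

END = "_end_"  # marker key storing the class of a complete term


def build_trie(terms: Dict[str, str]) -> dict:
    """Build a word-level trie from a {term: class} dictionary."""
    root: dict = {}
    for term, label in terms.items():
        node = root
        for word in term.lower().split():
            node = node.setdefault(word, {})
        node[END] = label
    return root


def find_terms(text: str, trie: dict) -> List[Tuple[str, str]]:
    """Return (term, class) pairs, keeping the longest match at each position."""
    words = text.lower().split()
    found = []
    i = 0
    while i < len(words):
        node, j, best = trie, i, None
        # Walk down the trie as long as the next word continues some term.
        while j < len(words) and words[j] in node:
            node = node[words[j]]
            j += 1
            if END in node:
                best = (" ".join(words[i:j]), node[END], j)
        if best:
            found.append((best[0], best[1]))
            i = best[2]  # resume after the matched term
        else:
            i += 1
    return found


# Toy dictionary using terms from Table 1.
terms = {
    "cambio degenerativo": "anomalía morfológica",
    "canal": "estructura corporal",
    "cuerpo vertebral": "estructura corporal",
}
trie = build_trie(terms)
matches = find_terms("Cambio degenerativo en cuerpo vertebral y canal", trie)
```

Each position in the text is checked with a single walk down the trie, so the cost depends on the length of the text rather than on the number of dictionary entries, which is the property that makes this kind of structure attractive for dictionaries with hundreds of thousands of terms.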

2) Approximate Detector with a Spanish Dictio-

nary (String Matching with Lemmatization and

Spell Correction using a Spanish Dictionary). An

obvious shortcoming of the string matching approach

is that even a slight change in the form of a word or

group of words will lead to a mismatch. To alleviate this problem, this method first performs more preprocessing steps. More specifically, the preprocessing steps are the following: 1) transformation of all the text to lowercase (except in the case of words with all the letters in uppercase, which are assumed to be acronyms), 2) removal of stopwords (prepositions, determiners, etc.), 3) lemmatization of the words in the text and the terms (i.e., obtaining the lemma of each word; e.g., the lemma of “rojo”, “roja”, “rojos” and “rojas” is “rojo”, which represents the red color in Spanish). FreeLing (Padró, 2008; Padró and Stanilovsky, 2012), which offers an API that can be called from Python, has been used to perform these tasks. With this tool, a tokenization of the input is applied, stopwords are removed, and the lemmas of the remaining words are obtained.
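The three preprocessing steps can be sketched as follows. This is an illustrative approximation only: the stopword set and the lemma table are tiny hypothetical stand-ins for what a real lemmatizer such as FreeLing provides, the function name `preprocess` is an assumption, and the example sentence is made up.

```python
from typing import Dict, List

# Hypothetical toy resources; a real system would rely on a full lemmatizer.
STOPWORDS = {"el", "la", "los", "las", "de", "en", "y", "con"}
LEMMAS: Dict[str, str] = {
    "roja": "rojo",
    "rojos": "rojo",
    "rojas": "rojo",
    "manchas": "mancha",
}


def preprocess(text: str) -> List[str]:
    """Lowercase (except acronyms), drop stopwords, and lemmatize."""
    tokens = []
    for token in text.split():
        token = token.strip(".,;:()")
        if not token:
            continue
        # Words written fully in uppercase are assumed to be acronyms and kept as-is.
        if not (token.isupper() and len(token) > 1):
            token = token.lower()
        if token in STOPWORDS:
            continue
        tokens.append(LEMMAS.get(token, token))
    return tokens


tokens = preprocess("Las manchas rojas en la piel, con HTP.")
```

In this example the acronym "HTP" survives unchanged, while the remaining words are lowercased, filtered and reduced to their lemmas before matching.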

Besides, as some words in the input text included typos (this is to be expected in documents written by doctors, as they usually have to write a considerable amount of text in a short time), we also applied a spell checker to correct potential typos before trying to find a suitable match. The spell checker is based on the Levenshtein distance and tries to obtain the most similar correct word (e.g., “hormigueo”, which is “tingling” in English, instead of the misspelled word “hormiguoe”); for this purpose, we used symspellpy (mammothb, 2019), a port of SymSpell (Garbe, 2019) for Python, along with a Spanish dictionary, obtained from (Dave, 2019), containing 1211000 entries. The terms detected after applying a spell checker are subject to some uncertainty, as the automatic application of a spell checker could actually lead to a term different from the one intended in the original text; for example, the word “reinitis” may appear in a text instead of “retinitis” (in English, also “retinitis”) but it could be corrected as “rinitis” (in English, “rhinitis”), which is a different disease.
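The uncertainty introduced by automatic spell correction can be made concrete with a small sketch based on the plain Levenshtein distance. This does not reproduce how symspellpy works internally; the functions `levenshtein` and `closest_words` and the three-word dictionary are assumptions made for illustration.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_words(word: str, dictionary: List[str]) -> List[str]:
    """Return every dictionary word at minimal distance from `word`."""
    distances = {entry: levenshtein(word, entry) for entry in dictionary}
    best = min(distances.values())
    return sorted(entry for entry, d in distances.items() if d == best)


dictionary = ["hormigueo", "retinitis", "rinitis"]
unambiguous = closest_words("hormiguoe", dictionary)  # a single candidate
ambiguous = closest_words("reinitis", dictionary)     # two candidates tie
```

With this toy dictionary, "hormiguoe" has one nearest word, whereas "reinitis" is exactly one edit away from both "retinitis" and "rinitis", so the distance alone cannot decide which disease was intended.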

We also considered other tools, such as the NLTK (Natural Language Toolkit) library (Loper and Bird, 2002; NLTK Project, 2020a) with its package SnowballStemmer (NLTK Project, 2020b), but it does not offer a lemmatization functionality; instead, it only retrieves the stem of words, which is not as appropriate (e.g., the stem of both “hombre”/“man” and “hombro”/“shoulder” is “hombr”, even though “hombre” and “hombro” are two Spanish words with very different meanings). We also performed some tests with spaCy (Explosion AI, 2020); although this tool incorporates a lemmatizer, we have noticed that some nouns are lemmatized to an infinitive verb form, even though the morphological analysis correctly identifies the original word as a noun.

3) Approximate Detector using Lexical Medical Resources (String Matching with Lemmatization and Spell Correction using SNOMED CT or DeCS). It is equivalent to the previous method but uses either SNOMED CT or DeCS as the dictionary for spell correction. The python-Levenshtein library (Haapala, 2019) has been used to compute the Levenshtein distance; a minimum threshold of 0.9 is applied to consider two words equivalent. The main disadvantage of this approach is the execution time needed (200-500 seconds, with the aforementioned HP Pavilion, depending on the length of the text): the complexity is O(n*m), where n is the number of terms in the dictionary and m is the number of words in the text.
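The thresholded comparison and its O(n*m) cost can be sketched as below. The normalization used here (one minus the edit distance divided by the length of the longer word) is an assumption chosen for illustration; the text does not state which similarity formula was applied, and python-Levenshtein's own ratio is computed differently. The names `similarity` and `approximate_matches` are likewise hypothetical.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        new_row = [i]
        for j, cb in enumerate(b, start=1):
            new_row.append(min(row[j] + 1, new_row[j - 1] + 1,
                               row[j - 1] + (ca != cb)))
        row = new_row
    return row[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer word."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def approximate_matches(words: List[str], terms: List[str],
                        threshold: float = 0.9) -> List[Tuple[str, str]]:
    """Compare every word with every term: n * m similarity computations."""
    matches = []
    for word in words:
        for term in terms:
            if similarity(word, term) >= threshold:
                matches.append((word, term))
    return matches


terms = ["tromboembolismo", "retinitis", "rinitis"]
words = ["tromboembolsmo", "reinitis"]
matches = approximate_matches(words, terms)
```

Under this assumed normalization, a single missing letter in a 15-letter word stays above 0.9, while a one-letter error in a short word such as "reinitis" falls below the threshold, so a high threshold mostly accepts corrections of long terms.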

3.4 Experimental Comparison of the Annotation Methods

To compare the performance of the methods, we have

performed tests with 30 real text documents randomly

extracted from the input dataset, corresponding to

anonymized clinical histories provided by the Instituto Aragonés de Ciencias de la Salud (IACS), which is the entity that promotes knowledge in Biomedicine and Health Sciences in the region of Aragón (Spain).

In total, there are 1212946 documents, from which

we ﬁnally considered a subset with the 1859 docu-

ments that contained more than 1500 characters. Even

though the size of the dataset is not very large, we

have to take into account that the text documents

are rich in medical terms (the average number of

medical terms is 36.7, with a standard deviation of

16.15). Moreover, we did not observe signiﬁcant dif-

ferences between the performance observed for indi-

vidual documents (e.g., the average standard devia-

tion of the precision and recall for individual docu-

ments is around 0.1). So, the results can be consid-

ered representative for this experimental evaluation.

Enlarging the dataset is possible, but time consum-

ing and subject to two limitations: the clinical his-

tories need to be carefully anonymized, to guarantee

the privacy of the patients, and the documents used

for testing have to be manually annotated.

The results obtained are shown in Table 3. The

best of the three approaches, in terms of F-measure,

is the second method. Besides, we can see how the

use of SNOMED CT can lead to better results. It

should be noted that the use of the approximate de-

tector using lexical medical resources leads to higher

recall values but also to a decreased precision, partic-

ularly when using SNOMED CT (that has a higher

number of terms). This is because the probability of incorrectly detecting a similar word increases when approximate matching is used; this is also characteristic of the traditional tradeoff between precision and recall (Buckland and Gey, 1994). Given the results obtained, the second method could be considered (and extended, if needed) as a basis to develop a system that can facilitate the task of annotation of texts.

Table 3: Automatic annotation of clinical histories: experimental results.

            String matching      Approximate detector          Approximate detector
                                 using a Spanish dictionary    using lexical medical resources
            SNOMED CT   DeCS     SNOMED CT   DeCS              SNOMED CT   DeCS
Precision   0.82        0.72     0.77        0.65              0.56        0.62
Recall      0.63        0.53     0.74        0.66              0.81        0.69
F-measure   0.71        0.61     0.75        0.66              0.66        0.65
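As a quick check of how the F-measure in Table 3 relates to precision and recall, the sketch below recomputes it for the SNOMED CT columns, assuming that the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall). The function and variable names are chosen for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs for SNOMED CT, taken from Table 3.
snomed = {
    "string matching": (0.82, 0.63),
    "approximate, Spanish dictionary": (0.77, 0.74),
    "approximate, medical resources": (0.56, 0.81),
}
scores = {name: round(f_measure(p, r), 2) for name, (p, r) in snomed.items()}
best = max(scores, key=scores.get)
```

The recomputed values agree with the table, and the second method comes out best because its precision and recall are the most balanced, which the harmonic mean rewards.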

Indeed, most undetected terms are due to the acronyms and abbreviations used by doctors to refer to diagnostics, therapies, body structures, and diseases (e.g., HVI, VD, HTP, etc.). These acronyms appear frequently in the clinical histories but rarely in the lexical medical resources, which leads to a decreased recall (i.e., false negatives). Some acronyms are not standardized and they may even have different meanings (e.g., “TEP” can mean “Tromboembolismo Pulmonar” / “Pulmonary Embolism” or “Triángulo de Evaluación Pediátrica” / “Triangle of Pediatric Evaluation”). Therefore, the recall can be improved by defining a suitable dictionary

of acronyms and incorporating context-dependent de-

tection methods to disambiguate the correct meaning

of certain acronyms. Another important source of

false negatives is the presence of commercial names

of drugs, which appear in the clinical histories but

not in the lexical resources used (SNOMED CT and

DeCS), where the active pharmaceutical ingredients

may instead be present; the complementary use of

data sources like DrugBank (https://www.drugbank.

ca/) or DrugCentral (http://drugcentral.org/) could be

considered to tackle this problem.

Concerning the precision, some words contained in SNOMED CT and DeCS were detected but had not been manually annotated as medical terms in the clinical histories used for experimentation. Several false positives correspond to words used in the text with a meaning different from that of the matched term (e.g., “pico” in a text with the meaning of “peak value” rather than a part of anatomy, which is “beak” in English; “Urgencias” representing the area of a hospital that receives patients that may have an emergency issue, which in English is “Emergency Department”, rather than an “urgency” in the medical sense; “base” representing the lower part of something, which is “basis” in English, rather than a chemical substance, which is “base” in English; etc.). In some cases, false positives arise because the terms were not considered representative enough from a medical point of view during manual annotation, but without implying a significant mistake.

4 AUTOMATIC DETECTION OF RECOMMENDATIONS FROM CLINICAL GUIDES IN SPANISH

We have also developed a classifier whose goal is to detect, given a medical text, whether a certain text fragment provides a recommendation (a suggestion based on medical evidence) or just other information. For this purpose, we focus on Spanish clinical guides (“Guías de Práctica Clínica del Sistema Nacional de Salud”/“Clinical Practice Guidelines of the National Health System”), which collect recommendations and scientific evidence for clinical treatments in different circumstances, assessing the risks and benefits of the different approaches. These guides are periodically updated to reflect new knowledge on the topic covered and they are available in PDF format from a public website (IACS, 2018). Specifically, we have considered 65 clinical guides for our experiments. First, we developed a tool that transforms the PDF files of the clinical guides into text files and formats them appropriately, taking the structure of the clinical guides into account. With this tool, we obtained 58864 sentences from the clinical guides.

4.1 Methods Considered for the Detection of Recommendations

Proposing new classiﬁcation methods is not our goal

at this point. Rather, we would like to assess the

feasibility of applying known techniques for recom-

mendation classiﬁcation in this context. Therefore,

for experimental evaluation, a baseline and four su-

pervised machine learning classiﬁers (Caruana and

Niculescu-Mizil, 2006) frequently used in the context

of natural language processing (NLP), applied over

a vector representation of the texts using the metric

Term Frequency – Inverse Document Frequency (TF-

IDF) (Lan et al., 2007), have been implemented and

evaluated:

1) Verb Categorization (used as a baseline), which

is based on a compiled list of verbs that are fre-

quently used in medical recommendations in natu-

ral language. We have selected 45 different verbs

(such as “justiﬁcar” / “justify”, “mejorar” / “im-

prove”, “solucionar” / “solve”, “aliviar” / “alleviate”,

“curar” / “heal”, “ayudar” / “help”, etc.). A text is

estimated to be a recommendation if it contains one

of the verbs in the list. A lemmatization process is

applied to avoid mismatches due to the presence of

verbs in conjugated forms.

2) A Support Vector Machine (SVM) (Hearst et al.,

1998), which tries to determine the best hyperplane

that separates the training data in the target classes.

We performed tests with different values for the soft

margin parameter C (0.1, 0.5, and 1.0), as well as tests

with both a linear kernel and a polynomial kernel.

3) Multinomial Naive Bayes (Kibriya et al., 2004),

which applies the Bayes theorem to estimate the class

of a document based on the words it contains and

it is based on the assumption that all the predictors

(words) are independent.

4) Random Forest (Breiman, 2001), where several

decision trees are built (based on different training

sets) and their predictions are combined. The goal is

to reduce the variability of the model and increase its

precision, at the expense of higher latency and mem-

ory consumption as well as a decreased interpretabil-

ity (compared with single decision trees). We have

performed tests with both 1000 and 2000 estimators

(decision trees).

5) K-Nearest Neighbors (Chakrabarti et al., 2008),

where the class of an instance is estimated based on

the predictions of the k nearest neighbors, weighting

the predictions depending on the distance to the given

instance. We have performed tests with two different

distance metrics (the Euclidean distance and the Man-

hattan distance) and different values of the number of

neighbors k (k = 3 and k = 5).

The verb categorization approach has been imple-

mented as a Python script, based on the use of a dic-

tionary of recommendation verbs. The other meth-

ods are implemented using the Python library scikit-

learn (Pedregosa et al., 2011).
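As a rough illustration of the verb categorization baseline described in item 1) above, the following sketch flags a sentence as a recommendation when one of its lemmatized tokens belongs to the verb list. Only the six verbs quoted in the text are included, the lemma table is a hypothetical stand-in for a real lemmatizer, and the two example sentences are made up; this is not the script used in the experiments.

```python
from typing import Dict, Set

# Six of the 45 verbs, as quoted in the text; the full list is not reproduced here.
RECOMMENDATION_VERBS: Set[str] = {
    "justificar", "mejorar", "solucionar", "aliviar", "curar", "ayudar",
}

# Hypothetical toy lemma table standing in for a real lemmatizer.
LEMMAS: Dict[str, str] = {
    "mejora": "mejorar",
    "alivia": "aliviar",
    "ayudan": "ayudar",
}


def is_recommendation(sentence: str) -> bool:
    """Return True if any lemmatized token is a recommendation verb."""
    for token in sentence.lower().split():
        token = token.strip(".,;:()")
        if LEMMAS.get(token, token) in RECOMMENDATION_VERBS:
            return True
    return False


# Made-up example sentences.
positive = is_recommendation("El ejercicio moderado mejora el sueño.")
negative = is_recommendation("El estudio incluyó 120 pacientes.")
```

Because a single verb occurrence is enough to trigger a positive label, a rule of this kind is cheap to run but insensitive to context, which is consistent with its role as a baseline.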

4.2 Experimental Comparison of the Recommendation Detection Methods

To evaluate a classification approach, we need a labelled data set, so a process of manual detection of recommendations by humans was followed: five people (two family doctors and three computer scientists) each analyzed a subset of the documents to find recommendations. It is important to stress that the identification of a sentence as a recommendation or not may depend on the subjectivity of the person; for example, the sentence “La presentación de TAG más complejos y graves en el inicio, el fracaso en completar el tratamiento y la cantidad de tratamientos intermedios durante el período de seguimiento se asocian con peores resultados de la TCC a largo plazo” (“The presentation of more complex and serious TAG at the beginning, failure to complete the treatment and the number of intermediate treatments during the follow-up period are associated with worse results of the TCC in the long term”), present in one of the clinical guides, was classified as a recommendation by some people while others considered that exposing these results did not implicitly convey any recommendation. As a similar example, we also observed disagreement in the interpretation of the text “Un modelo integrado en el que los médicos de familia son apoyados por especialistas, que durante 8 semanas (4-8 sesiones) ayudan a los pacientes a desarrollar habilidades cognitivo-conductuales a través de relajación, reconocimiento de pensamientos ansiogénicos y de falta de autoconfianza, búsqueda de alternativas útiles y entrenamiento en acciones para resolución de problemas, técnicas para mejorar el sueño y trabajo en casa” (“An integrated model where family doctors are supported by specialists, who during 8 weeks (4-8 sessions) help the patients to develop cognitive and behavioral abilities through relaxation, acknowledgement of anxiogenic and lack of self-confidence thoughts, search for useful alternatives and training in actions for problem solving, techniques to improve sleep and work at home”).

In order to assess an agreement score for the human annotators, 100 texts were randomly selected and classified by the 5 people mentioned above (i.e., they identified recommendations and non-recommendations), calculating the score as the percentage of texts on which all the annotators agreed over the total number of texts. In this way, we obtained an agreement score of 63%, which may not seem very high, but this is due to the fact that, as explained in the previous paragraph, the interpretation of a sentence as a recommendation or not may depend on the subjectivity of the person reading the sentence. The agreement score between the two doctors is 79% and the agreement score considering only the annotations of the three computer scientists is 70%. Considering the annotations of the two doctors, Cohen's kappa is 0.581, which indicates a moderate agreement.
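The two agreement measures mentioned above can be computed as in the following sketch. The labels are made up for illustration (1 for a recommendation, 0 otherwise) and the function names are assumptions; the sketch only shows the standard definitions, not the authors' scripts.

```python
from collections import Counter
from typing import Sequence


def full_agreement(annotations: Sequence[Sequence[int]]) -> float:
    """Fraction of items on which all annotators assigned the same label.

    annotations[i][j] is the label given by annotator i to item j.
    """
    items = list(zip(*annotations))
    return sum(len(set(labels)) == 1 for labels in items) / len(items)


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two annotators: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Made-up labels for ten sentences (1 = recommendation, 0 = other).
doctor_1 = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
doctor_2 = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
agreement = full_agreement([doctor_1, doctor_2])
kappa = cohen_kappa(doctor_1, doctor_2)
```

In this toy case the annotators agree on 8 of 10 sentences and the chance agreement is 0.5, giving a kappa of about 0.6. By the same formula, the reported values for the two doctors (79% agreement and a kappa of 0.581) correspond to a chance agreement of roughly 0.5.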

To compare the different techniques, each person

labelled a subset of the documents used for testing

and we applied a k-fold cross validation with k = 5.

Table 4 summarizes the experimental results in terms

of precision, recall and F-measure. The method that

provides the best results is the one using a random

forest, with no clear impact when passing from 1000

to 2000 trees, so the one with 1000 trees is consid-

ered the best approach, among the ones compared,

due to its higher simplicity. The second best method

is the one using SVM with a linear kernel and C = 1.

Next in the rank is the kNN approach with k = 3 or

k = 5 using the Euclidean distance as the distance

metric. Multinomial Naive Bayes achieves an inter-

mediate performance, worse than the kNN approach

with the Euclidean distance but better than some vari-

ants of SVM (linear SVM with C = 0.1 and the poly-

nomial SVM approach tested) and the kNN approach

with the Manhattan distance. The verb categorization approach (used as a baseline) obtains the worst results, along with the polynomial SVM approach and the kNN approach with k = 3 and the Manhattan distance.
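The k-fold cross validation with k = 5 used to compare the techniques can be sketched as follows. This minimal version only generates index splits, without shuffling or stratification, and the function name `k_fold_indices` is an assumption; it is meant to show the principle rather than the exact procedure followed in the experiments.

```python
from typing import List, Tuple


def k_fold_indices(n_items: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_items) into k (train, test) pairs with disjoint test folds."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train, test))
    return splits


splits = k_fold_indices(10, k=5)
```

Each item is used for testing exactly once and for training in the other four rounds; the precision, recall and F-measure of a classifier are then averaged over the five rounds.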

The overall performance of most methods is quite acceptable, especially if we take into account that the agreement score between human annotators is 63%. Due to this subjectivity, directly comparing the classification performed by the system with the one proposed by a user is not completely fair. Indeed, the percentage of failures (in terms of false negatives and false positives) of the methods usually falls below the disagreement score among humans (37%), and therefore these failures could be explained by the subjectivity involved in interpreting sentences as recommendations. Besides, an F-measure of 0.82, achieved by the random forest methods, is quite a good result for practical applications, as it suggests that doctors can generally rely on this method as a support tool to find recommendations quickly.
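As a quick consistency check, the F-measure values can be recomputed from the precision and recall columns of Table 4. The sketch below assumes that the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the text does not state explicitly; the dictionary name `TABLE_4` is ours.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure), copied from Table 4.
TABLE_4 = {
    "Verb categorization (baseline)": (0.64, 0.49, 0.56),
    "SVM linear, C=1": (0.84, 0.79, 0.81),
    "SVM linear, C=0.5": (0.85, 0.75, 0.80),
    "SVM linear, C=0.1": (0.75, 0.46, 0.57),
    "SVM polynomial (degree=2), C=1": (0.46, 0.65, 0.54),
    "Multinomial Naive Bayes": (0.81, 0.73, 0.77),
    "Random Forest (1000 estimators)": (0.86, 0.78, 0.82),
    "Random Forest (2000 estimators)": (0.86, 0.78, 0.82),
    "kNN (k=3, Euclidean distance)": (0.77, 0.84, 0.80),
    "kNN (k=5, Euclidean distance)": (0.78, 0.83, 0.80),
    "kNN (k=3, Manhattan distance)": (0.94, 0.39, 0.55),
    "kNN (k=5, Manhattan distance)": (0.91, 0.43, 0.58),
}

for name, (p, r, reported) in TABLE_4.items():
    print(f"{name:34s} computed={f_measure(p, r):.2f} reported={reported:.2f}")
```

Under this assumption, every recomputed value rounds to the figure reported in the table.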

5 CONCLUSIONS

The automatic processing of health documents can bring significant benefits to existing health systems, for example by helping doctors to find relevant practice recommendations or key terms. In this paper, we have tackled the problem of applying text mining to health-related documents written in Spanish, which is a major challenge, as most resources, tools, and experiences have been developed for English documents. Specifically, we have addressed the automatic annotation of clinical histories with medical terms, as well as the detection of recommendations in clinical guidelines. Based on the experimental evaluation performed, the methods evaluated can be used as a basis for further research, as we could expect further improvements by refining the techniques applied or by extending and fine-tuning them for the specific use cases considered. Our work, based on a real-world case study, contributes to the scarce literature providing experimental evaluations with medical documents in Spanish.

Snapshots of a preliminary prototype of a decision support system that we are developing can be seen in Figures 1 and 2. The text to be analyzed can be entered directly by the user or obtained with a tool we implemented that extracts the text from PDF files. The top part of Figure 1 shows the original text, with the detected terms shown between “ ” and in bold, each followed by an annotation in brackets that indicates the class associated with the term. For example, “antihistamínico” (“antihistamine” in English) has been detected as a term belonging to class “sustancia” (“substance”), and “urticaria” has been detected as a term belonging to classes “trastorno” (that could be translated as “outbreak” in this context) and “anomalía morfológica” (“morphological anomaly”). The middle part of Figure 1 indicates how the text shown is categorized by the classifier (in this case, as a recommendation, suggestion or evidence). Finally, the bottom part of Figure 1 summarizes the terms detected and their classes, to provide a quick overview. Figure 2 shows another example of output for a different input text that has not been detected as a recommendation, which is correct. For clarity and demonstration purposes, we show here a short piece of text, but we have tested the annotator with larger texts corresponding to clinical histories of patients (e.g., see the Appendix).
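The annotated output described for the prototype (terms in quotes followed by their classes in brackets, plus a summary of detected terms) can be approximated with a small dictionary lookup. This is only a sketch under stated assumptions: the two-entry dictionary `TERM_CLASSES`, the function `annotate`, the sample sentence and the exact output layout are illustrative and are not taken from the prototype, and a regular expression stands in for whatever matching strategy the tool actually uses.

```python
import re

# Hypothetical mini-dictionary (lowercase term -> classes), based on the example in the text.
TERM_CLASSES = {
    "antihistamínico": ["sustancia"],
    "urticaria": ["trastorno", "anomalía morfológica"],
}


def annotate(text, term_classes):
    """Quote each known term, append its classes in brackets, and collect a summary."""
    # Longest terms first, so that multi-word terms win over their sub-terms.
    terms = sorted(term_classes, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE)
    found = {}

    def mark(match):
        classes = term_classes[match.group(1).lower()]
        found[match.group(1).lower()] = classes
        return f'"{match.group(1)}" [{", ".join(classes)}]'

    return pattern.sub(mark, text), found


annotated, summary = annotate("Se recomienda un antihistamínico para la urticaria.", TERM_CLASSES)
print(annotated)
print(summary)
```

The `summary` dictionary plays the role of the overview shown in the bottom part of Figure 1.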

As future work, we plan to consider a number of improvements to the methods proposed in this paper and to extend our current experimental evaluation, which has already shown promising results that support the feasibility and interest of the proposals presented and has contributed to the scarce number of experiences reported with medical texts in Spanish. One of the directions we want to pursue is to analyze ways to perform a context-dependent analysis of the text (e.g., by identifying general topics at a paragraph or section level, which can later be used to evaluate the probability that a given word refers to a certain medical term, especially in the case of misspelled words). We would also like to analyze how additional strategies to deal with acronyms can improve the results. In the case studies presented in this paper, we have focused on clinical histories and medical guides, but the evaluated methods may behave differently (and require significant adaptations) when applied to health-related

texts with different structure and typology (like scientific articles), where the application of text mining could provide benefits to other types of end users (like researchers). Finally, performing a large-scale experimental evaluation with these and other proposed methods (e.g., using deep learning, if a large set of data could be compiled) would help to better validate the significance of the results and refine the proposed techniques.

Table 4: Detection of recommendations in clinical guides: experimental results.

Method                            Precision  Recall  F-measure
Verb categorization (baseline)    0.64       0.49    0.56
SVM linear, C=1                   0.84       0.79    0.81
SVM linear, C=0.5                 0.85       0.75    0.80
SVM linear, C=0.1                 0.75       0.46    0.57
SVM polynomial (degree=2), C=1    0.46       0.65    0.54
Multinomial Naive Bayes           0.81       0.73    0.77
Random Forest (1000 estimators)   0.86       0.78    0.82
Random Forest (2000 estimators)   0.86       0.78    0.82
kNN (k=3, Euclidean distance)     0.77       0.84    0.80
kNN (k=5, Euclidean distance)     0.78       0.83    0.80
kNN (k=3, Manhattan distance)     0.94       0.39    0.55
kNN (k=5, Manhattan distance)     0.91       0.43    0.58

Figure 1: Prototype of a decision support system: output sample (recommendation).

Figure 2: Prototype of a decision support system: output sample (no recommendation).

ACKNOWLEDGEMENTS

This work has been supported by the project TIN2016-78011-C4-3-R (AEI/FEDER, UE) and the Government of Aragon (COSMOS group, reference T64 20R). We thank the Health Sciences Institute in Aragón for providing us with real anonymized datasets.


APPENDIX: EXAMPLES OF TEXTS ANNOTATED USING SNOMED-CT AND DeCS

In this appendix, we show two examples of texts annotated by using the annotation tool developed in this work. The first text has been annotated with the dictionary created with data from SNOMED-CT and the second text with the dictionary of terms of DeCS. It should be noted that some texts contain typos, as they are real texts written by doctors during their daily practice (no proofreading has been applied to correct potential mistakes; only some sensitive data, such as the age of a patient, have been removed from the original texts). In the examples, we use different styles to represent the terms that are correctly annotated (shown in light green), terms that are detected but are not really relevant (shown as strikethrough text), and terms that were not detected but should have been detected as relevant (shown in bold light red).

Example 1: Annotation Using SNOMED-CT

Input Text: LUMBOCIATICADescripción de la(s) exploración(es): – EXPLORACIÓN: RM de columna lumbosacra, secuencias en ponderación T1 sagital, secuencia DIXON sagital y T1 y T2 plano axial. Hallazgos: Pérdida de la lordosis lumbar con rectificación. Abombamientos de platillos generalizados, pero con correcta altura de cuerpos vertebrales. Alineación anteroposterior conservada. Moderados signos espondilósicos con incipiente osteofitosis de predominio anterior. Salidas difusas circunferenciales discales. Disminución generalizada de intensidad de señal a nivel discal en T2 indicativo de deshidratación, mucho más evidente en los últimos niveles lumbares. Esclerosis interapofisaria asociada. – NIVEL L2-L3: bandas parcheadas de hiperseñal en T1 y T2 en platillos indicativos de cambios degenerativos tipo II. Salida difusa circunferencial discal. Leve deshidratación discal. Ligera esclerosis interapofisaria. – NIVEL L3-L4: salida difusa circunferencial discal. Leve deshidratación discal. Ligera esclerosis interapofisaria. – NIVEL L4-L5: salida difusa circunferencial discal. Pequeña hernia posteromedial del núcleo pulposo. Marcada deshidratación discal. Disminución del espacio intersomático. Marcada esclerosis interapofisaria, con hipertrofia. Se asocia con hipertrofia ligamentaria que disminuyen el calibre transverso del canal. En su conjunto se reconoce ligero compromiso de recesos laterales y de ambos forámenes secundario. – NIVEL L5-S1: Grandes bandas de hiperseñal en T2 y T2 en platillos, fundamentalmente en el inferior de L5, que indican cambios degenerativos tipo II. Importante disminución del espacio intersomático. Hiperintensidad de señal discal en T1 y T2, que se suprime con la secuencia saturación grasa, que indica recambio degenerativo graso discal. Marcada hipertrofia interapofisaria. Osteofitosis y protrusión disco osteofitaria posteromedial. Ligera disminución del calibre del canal. Diagnóstico: Nombre Responsable 1: [1] – Fecha de Firma: ZARAGOZA, N.Colegiado: Categoría Profesional 1: – Informe de Resultados de Pruebas de Imagen. Servicio de Radiodiagnóstico. Fecha de Impresión: – Pérdida de la lordosis lumbar con rectificación. – Signos espondilósicos con salidas difusas circunferenciales discales. Deshidratación es discales asociadas y esclerosis interapofisaria. – L2-L3: cambios degenerativos y II en platillos. – L4-5: disminución del espacio intersomático, esclerosis e hipertrofia interapofisaria ligamentaria con disminución de calibre transverso del canal. Compromiso de recesos laterales. Pequeña hernia posteromedial y de núcleo pulposo. – L5-S1: cambios degenerativos tipo II en platillos. Recambio graso discal. Marcada hipertrofia interapofisaria.

Results and Comments: 1) only the term “disco” (“disc”) is a false positive in this text; although it could be considered a relevant medical term, in the text it is not used with the meaning attributed by the class associated with the term in SNOMED-CT (“drug, medicine”) but rather denotes a part of the human spine; 2) in this case, the term “espondilosis” is detected only after applying a spell checker to the word “espondilósicos” (otherwise, it would not have been detected). The terms detected after applying a spell checker are subject to some uncertainty (as the automatic application of a spell checker could lead to a term different from the one intended in the original text), although in this case the detection is correct.
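The spell-checking fallback discussed in point 2 can be illustrated with a plain edit-distance lookup. The sketch below is an assumption-laden stand-in, not the spell checker used by the tool: the helper names, the accent-stripping step, the threshold of two edits and the five-term dictionary are all ours.

```python
import unicodedata

# Hypothetical dictionary, with terms borrowed from the example above.
TERMS = ["espondilosis", "esclerosis", "lordosis", "hernia", "deshidratación"]


def strip_accents(word):
    """Lowercase a word and remove diacritics (e.g., 'ó' becomes 'o')."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free when equal)
            ))
        previous = current
    return previous[-1]


def closest_term(word, dictionary, max_distance=2):
    """Return (term, distance) for the nearest dictionary term, or None if too far."""
    key = strip_accents(word)
    best = min(dictionary, key=lambda term: edit_distance(key, strip_accents(term)))
    distance = edit_distance(key, strip_accents(best))
    return (best, distance) if distance <= max_distance else None


print(closest_term("espondilósicos", TERMS))
print(closest_term("platillos", TERMS))
```

Returning the distance together with the term lets a caller mark corrected matches as less certain than exact ones, in line with the caveat above.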

Table 5: Example of annotation with SNOMED-CT.

Detected term              Associated class (or associated classes)
cambio degenerativo        anomalía morfológica
canal                      estructura corporal
cuerpo vertebral           estructura corporal
deshidratación             trastorno
disco                      fármaco de uso clínico
disminución                anomalía morfológica
esclerosis                 anomalía morfológica
exploración                procedimiento
grasa                      sustancia, estructura corporal
hallazgo                   hallazgo
hernia                     anomalía morfológica
hipertrofia                anomalía morfológica
lordosis                   trastorno
núcleo pulposo, L5-        estructura corporal
protrusión                 anomalía morfológica
se reconoce                hallazgo
signo                      hallazgo
espondilosis (trastorno)   trastorno

Example 2: Annotation Using DeCS

Input Text: SINDROME CORONARIO AGUDO Paciente intervenido triple Bypass coronario AMI a DA y vena safena a Dx yDp .– se copiainforme, (se encuentra en OMI) le indican que han solicitado consulta en Cardiologia, pero no consta en su historico. Motivo del Alta: Curación o mejoría. Motivo inmediato del ingreso: Paciente de años de edad que ingresa procedente de Hospital Miguel Servet para cirugía coronaria urgente. Anamnesis: Antecedentes personales: Dudosa alergia a Amoxcilina-Clavulanico . Exfumador. No HTA . DM tipo 2 (ADO). Dislipemia. Poliquistosis renal y ectasia pielocalicial derecha. Esteatosis hepática. Bocio. Diagnosticado de SAOS . Reflujo Gastroesofágico. Hipoacusia. Intervenido de septoplastia. Colelitiasis. Colecistectomía. Historia Cardiológica: Estudiado por dolor torácico atípico en Medicina Interna, y Cardiología, con ergometría no sugerente de isquemia con 10 METS de carga en . El acude a Urgencias por clínica de ángor de reposo de algunas horas de duración, sin componente postural, y desencadenados por esfuerzo hace unas semanas. A su llegada a Urgencias, nuevo dolor, realizando ECG que evidencia pseudopositivización de onda T, que desaparece tras comenzar pc de SLN + mínima elevación de TnUS (troponina pico 180), decidiendo ingreso en UCI. El se realiza coronariografía que evidencia enfermedad multivaso, es presentado en sesión médico-quirúrgica decidiéndose cirugía en el ingreso. Exploraciones Complementarias: Ecocardiograma : Cavidades cardiacas y Aorta ascendente de dimensiones normales. HVI ligera. Contractilidad global conservada, sin apreciar alteraciones segmentarias. Patrón relajación disminuida, sin elevación de las PTDVI . Válvulas estructural y funcionalmente normales (VAo trivalva) Contractilidad normal del VD . Cava y suprahepáticas no dilatadas, sin inversión de flujos y normocolapso inspiratorio. No signos indirectos de HTP . No afectación pericárdica Cateterismo: Tronco: Sin lesiones. DA en segmento proximal presenta estenosis crítica y luego estenosis significativa respectivamente. 1ra diagonal: 1 mm con lesión significativa ostial. 2da diagonal. lesión significativa ostial. Lesión significativa en tercio distal de CX que involucra ostium de rama marginal. Arteria Intermedia: Estenosis en límite de la significancia en segmento proximal. CD: Estenosis ligera tercio proximal. DP con estenosis ostial en límite de la significancia y lesión en tercio medio significativa. Procedimientos Terapéuticos: Fecha de la intervención: . Cirujano. Se realiza bajo CEC triple bypass coronario: AMI a DA y vena safena a Dx y DP. En quirófano inestabilidad hemodinámica con bradicardia extrema que precisa de entrada urgente en C.

Results and Comments: 1) most undetected terms are acronyms and abbreviations used by doctors (e.g., HTA, HVI, PTDVI, VD, HTP, etc.); 2) incorrectly detected terms correspond to terms that are not relevant in the medical context of the text or whose identification would change the real meaning of the word (e.g., “relajación” in the text has a different meaning than a social phenomenon, “pico” in the text has the meaning of “peak value” rather than a part of anatomy, “Urgencias” in the text represents the area of a hospital that receives patients who may have an emergency rather than an “urgency” in the medical sense, etc.).
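One possible way to reduce the misses listed in point 1 is to expand known acronyms before annotation. The sketch below is purely illustrative and is not part of the evaluated tool: the expansion table `ACRONYMS` reflects common Spanish clinical usage rather than anything stated in the paper, and the function `expand_acronyms` is a hypothetical helper.

```python
import re

# Hypothetical expansion table (common Spanish clinical usage, assumed here).
ACRONYMS = {
    "HTA": "hipertensión arterial",
    "HVI": "hipertrofia ventricular izquierda",
    "VD": "ventrículo derecho",
    "HTP": "hipertensión pulmonar",
}


def expand_acronyms(text, table):
    """Append the expansion after each known acronym (case-sensitive, whole words only)."""
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keys)) + r")\b")
    return pattern.sub(lambda m: f"{m.group(1)} ({table[m.group(1)]})", text)


print(expand_acronyms("No HTA . HVI ligera. No signos indirectos de HTP .", ACRONYMS))
```

Keeping the original acronym next to its expansion preserves the source text while giving a dictionary-based annotator a full term to match; ambiguous acronyms would still need context to be resolved.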

Table 6: Example of annotation with DeCS (1/2).

Detected term   Associated class (or associated classes)
alergia         ENFERMEDADES - SALUD PÚBLICA
anamnesis       TÉCNICAS Y EQUIPOS ANALÍTICOS, DIAGNÓSTICOS Y TERAPÉUTICOS
bocio           ENFERMEDADES - SALUD PÚBLICA
bradicardia     ENFERMEDADES
CEC             TÉCNICAS Y EQUIPOS ANALÍTICOS, DIAGNÓSTICOS Y TERAPÉUTICOS
cardiología     DISCIPLINAS Y OCUPACIONES
cateterismo     TÉCNICAS Y EQUIPOS ANALÍTICOS, DIAGNÓSTICOS Y TERAPÉUTICOS
cirugía         DISCIPLINAS Y OCUPACIONES
cirujano        DENOMINACIONES DE GRUPOS - ATENCIÓN DE SALUD

Table 7: Example of annotation with DeCS (2/2).

Detected term               Associated class (or associated classes)
colecistectomía             TÉCNICAS Y EQUIPOS ANALÍTICOS, DIAGNÓSTICOS Y TERAPÉUTICOS
colelitiasis                ENFERMEDADES
consulta                    ATENCIÓN DE SALUD
dislipemia                  ENFERMEDADES
dolor                       ENFERMEDADES - PSIQUIATRÍA Y PSICOLOGÍA - FENÓMENOS Y PROCESOS
dolor torácico              ENFERMEDADES
ECG                         TÉCNICAS Y EQUIPOS ANALÍTICOS, DIAGNÓSTICOS Y TERAPÉUTICOS
ectasia                     ENFERMEDADES
elevación                   FENÓMENOS Y PROCESOS
enfermedad                  ENFERMEDADES - SALUD PUBLICA
ergometría                  TÉCNICAS Y EQUIPOS ANALÍTICOS, DIAGNÓSTICOS Y TERAPÉUTICOS
estenosis                   ENFERMEDADES
hemodinámica                FENÓMENOS Y PROCESOS
hepático                    ORGANISMOS
hipoacusia                  ENFERMEDADES
historia                    HUMANIDADES
hospital                    ATENCIÓN DE SALUD - VIGILANCIA SANITARIA - SALUD PÚBLICA
ingreso                     ATENCIÓN DE SALUD - SALUD PÚBLICA
inversión                   ATENCIÓN DE SALUD - SALUD PÚBLICA
isquemia                    ENFERMEDADES
lesión                      ENFERMEDADES - SALUD PÚBLICA
medicina interna            DISCIPLINAS Y OCUPACIONES
médico                      DENOMINACIONES DE GRUPOS - SALUD PÚBLICA - ATENCIÓN DE SALUD
paciente                    DENOMINACIONES DE GRUPOS
pico                        ANATOMÍA
procedimiento terapéutico   TÉCNICAS Y EQUIPOS ANALÍTICOS, DIAGNÓSTICOS Y TERAPÉUTICOS - VIGILANCIA SANITARIA
quirófano                   ATENCIÓN DE SALUD - VIGILANCIA SANITARIA
reflujo gastroesofágico     ENFERMEDADES
relajación                  ANTROPOLOGÍA, EDUCACIÓN, SOCIOLOGÍA Y FENÓMENOS SOCIALES
reposo                      ANTROPOLOGÍA, EDUCACIÓN, SOCIOLOGÍA Y FENÓMENOS SOCIALES
signo                       ENFERMEDADES
síndrome coronario agudo    ENFERMEDADES
troponina                   COMPUESTOS QUÍMICOS Y DROGAS
UCI                         ATENCIÓN DE SALUD - VIGILANCIA SANITARIA
urgencia                    ENFERMEDADES - ATENCIÓN DE SALUD
