Authors:
Ioana Barbantan; Camelia Lemnaru and Rodica Potolea
Affiliation:
Technical University of Cluj-Napoca, Romania
Keyword(s):
Electronic Health Records, Concept Annotation, Ontology, Negation, Prefixes, Structure.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; BioInformatics & Pattern Discovery; Computational Intelligence; Context Discovery; Evolutionary Computing; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Soft Computing; Symbolic Systems
Abstract:
Efficiently exploiting medical data from Electronic Health Records (EHRs) is a current joint research focus of the knowledge extraction and medical communities. Structuring EHRs is essential for the efficient exploitation of the information they capture. To that end, concept identification and categorization are key tasks. This paper presents a disease identification approach that applies several NLP document pre-processing steps, queries the SNOMED-CT ontology, and then applies a filtering rule to the retrieved information. The hierarchical approach filters concepts more effectively, reducing the number of falsely identified disease concepts. We performed a series of evaluations on the Medline abstracts dataset. The results obtained so far are promising: our method achieves a precision of 87.79% and a recall of 87.12%, better than the results obtained by Apache’s cTAKES system on the same task and dataset.
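
The pipeline outlined in the abstract (pre-process, look up concepts, filter by hierarchy, then score) can be illustrated with a minimal sketch. This is not the authors' implementation: the in-memory dictionaries TOY_TERMS and TOY_PARENTS are hypothetical stand-ins for SNOMED-CT, the concept identifiers are invented, and the rule "keep only concepts that descend from a disease root" is an assumed reading of the filtering step, whose details the abstract does not give. Negation and prefix handling, mentioned in the keywords, are left out.

```python
import re

# Hypothetical stand-in for SNOMED-CT: surface term -> concept id.
TOY_TERMS = {
    "diabetes mellitus": "C_DIABETES",
    "pneumonia": "C_PNEUMONIA",
    "insulin": "C_INSULIN",
}

# Hypothetical is-a hierarchy: concept id -> parent id (None marks a root).
TOY_PARENTS = {
    "C_DIABETES": "C_DISEASE",
    "C_PNEUMONIA": "C_DISEASE",
    "C_INSULIN": "C_SUBSTANCE",
    "C_DISEASE": None,
    "C_SUBSTANCE": None,
}


def preprocess(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_terms(tokens, max_len=3):
    """Yield every n-gram of up to max_len tokens, longest first."""
    for n in range(max_len, 0, -1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def is_descendant(concept, root):
    """Walk up the toy hierarchy and report whether root is reached."""
    while concept is not None:
        if concept == root:
            return True
        concept = TOY_PARENTS.get(concept)
    return False


def find_diseases(text, root="C_DISEASE"):
    """Return matched terms whose concept descends from the disease root."""
    found = []
    for term in candidate_terms(preprocess(text)):
        concept = TOY_TERMS.get(term)
        if concept and is_descendant(concept, root) and term not in found:
            found.append(term)
    return found


def precision_recall(predicted, gold):
    """Compute set-based precision and recall of predicted against gold."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


diseases = find_diseases(
    "Patient with diabetes mellitus and pneumonia was given insulin."
)
```

In this toy run, "insulin" is matched in the dictionary but discarded by the hierarchy check because it descends from the substance root, which is the kind of false positive the filtering step is meant to remove.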