Authors:
Rashmi Burse; Michela Bertolotto and Gavin Mcardle
Affiliation:
University College Dublin, Belfield, Dublin 4, Ireland
Keyword(s):
Biomedical Named Entity Recognition, Lexical Auditing, Semantic Analysis, Quality Assurance, Biomedical Ontologies, SNOMED.
Abstract:
Existing lexical auditing techniques for Quality Assurance (QA) of biomedical ontologies consider only the lexical patterns of concept names and ignore the semantic domains associated with the tokens that make up those patterns. Lexically similar patterns do not necessarily share similar semantic domains, so ignoring the semantic aspect can lead to poor QA of biomedical ontologies. Semantic domain association can be accomplished with a Biomedical Named Entity Recognition (Bio-NER) system. However, existing Bio-NER systems are designed to extract information from natural language text, such as discharge summaries, and as a result do not annotate the individual tokens of a clinical concept. Annotating individual tokens of a clinical concept with their semantic domains is important from a QA perspective, since these annotations can be leveraged to gain insight into the type of attributes that should be associated with the concept. In this paper we present an annotator that atomically annotates the tokens of a clinical concept using atomic dictionaries crafted from the sub-hierarchies of the Systematized Nomenclature of Medicine (SNOMED). Semantic analysis of lexically similar concepts, based on annotating their tokens with semantic domains, supports improved QA of biomedical ontologies.
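The idea of token-level annotation can be illustrated with a minimal sketch. It is not the authors' annotator: the dictionary contents, the domain labels, and the names `ATOMIC_DICTIONARIES`, `annotate_concept` and `domain_pattern` are hypothetical placeholders, and a real system would derive its dictionaries from SNOMED sub-hierarchies rather than hard-code them.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical atomic dictionaries (semantic domain -> single-token terms).
# The entries are illustrative only and are NOT extracted from SNOMED.
ATOMIC_DICTIONARIES: Dict[str, Set[str]] = {
    "body_structure": {"lung", "kidney", "femur"},
    "morphologic_abnormality": {"abscess", "cyst", "fracture"},
    "procedure": {"biopsy", "excision"},
}


def annotate_concept(name: str) -> List[Tuple[str, str]]:
    """Label each whitespace-separated token with the first matching domain."""
    annotations = []
    for token in name.lower().split():
        domain = next(
            (d for d, terms in ATOMIC_DICTIONARIES.items() if token in terms),
            "unknown",  # tokens found in no dictionary, e.g. "of"
        )
        annotations.append((token, domain))
    return annotations


def domain_pattern(name: str) -> Tuple[str, ...]:
    """Return only the sequence of semantic domains for a concept name."""
    return tuple(domain for _, domain in annotate_concept(name))


# Two concept names sharing the lexical pattern "<X> of lung".
print(annotate_concept("Abscess of lung"))
print(annotate_concept("Biopsy of lung"))
# Their domain patterns differ in the first token.
print(domain_pattern("Abscess of lung") == domain_pattern("Biopsy of lung"))
```

Under these assumed dictionaries the two names share a lexical pattern but receive different domain sequences, which is the kind of distinction a purely lexical audit cannot make.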