Authors:
Juliano Gaspar; Emanuel Catumbela; Bernardo Marques and Alberto Freitas
Affiliation:
University of Porto, Portugal
Keyword(s):
Outliers detection, Data mining, Medical data.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Biomedical Engineering; Business Analytics; Data Engineering; Data Mining; Databases and Information Systems Integration; Datamining; Enterprise Information Systems; Health Information Systems; Information Systems Analysis and Specification; Knowledge Management; Ontologies and the Semantic Web; Pattern Recognition and Machine Learning; Sensor Networks; Signal Processing; Society, e-Business and e-Government; Soft Computing; Web Information Systems and Technologies
Abstract:
Background: Patient medical records contain many entries relating to patient conditions, treatments and lab results. These records generally involve multiple types of data and produce a large amount of information. Such databases can provide important information for clinical decision making and for supporting hospital management. Medical databases have some specificities not often found in other, non-medical databases. In this context, outlier detection techniques can be used to detect abnormal patterns in health records (for instance, problems in data quality), thereby contributing to better data and better knowledge in the decision-making process.
Aim: This systematic review aims to provide a better understanding of the techniques used to detect outliers in healthcare data, so that those methods can be automated in order to facilitate access to quality information in healthcare.
Methods: The literature was systematically reviewed to identify articles mentioning outlier or anomaly detection techniques in medical data. Four distinct bibliographic databases were searched: Medline, ISI, IEEE and EBSCO.
Results: Of the 4071 distinct papers selected, 80 were included after applying the inclusion and exclusion criteria. By medical specialty, 32% of the techniques were intended for oncology, and 37% of them used patient data. Considering only articles that used administrative medical data, 59% of the techniques were statistics-based.
Conclusion: Statistical techniques are the most widely used outlier detection techniques for administrative medical data, compared with data mining techniques such as clustering and nearest neighbor.