Authors:
Pedro Matias (1, 2); Duarte Folgado (1); Hugo Gamboa (1, 3) and André V. Carreiro (1)
Affiliations:
(1) Associação Fraunhofer Portugal Research, Rua Alfredo Allen 455/461, Porto, Portugal
(2) Faculdade de Ciências e Tecnologia, FCT, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
(3) Laboratório de Instrumentação, Engenharia Biomédica e Física da Radiação (LIBPhys-UNL), Departamento de Física, Faculdade de Ciências e Tecnologia, FCT, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
Keyword(s):
Anomaly Detection, Time Series, Variational AutoEncoders, Unsupervised Learning, ECG.
Abstract:
The growing availability of time series data has created demand for new techniques for its automated analysis across several tasks, including anomaly detection. However, even though the volume of time series data is rapidly increasing, labeled abnormal samples remain scarce, which hinders the performance of most supervised anomaly detection models. In this paper, we present an unsupervised framework comprising a Variational Autoencoder coupled with a local similarity score, which learns solely from available normal data to detect abnormalities in new data. In addition, we propose two techniques to improve the results when at least some abnormal samples are available: a training set cleaning method that removes the influence of corrupted data on detection performance, and the optimization of the detection threshold. Tests were performed on two datasets: ECG5000 and MIT-BIH Arrhythmia. On the ECG5000 dataset, our framework outperformed some supervised and unsupervised approaches found in the literature, achieving an AUC score of 98.79%. On the MIT-BIH dataset, the training set cleaning step removed 60% of the original training samples and improved the anomaly detection AUC score from 91.70% to 93.30%.
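The abstract mentions optimizing the detection threshold when a few labeled abnormal samples are available, but it does not state which criterion is optimized. As a minimal sketch only, the example below assumes the criterion is the F1 score and that each sample already has a scalar anomaly score (higher meaning more abnormal). The function name `best_f1_threshold` and the toy scores and labels are hypothetical and are not taken from the paper.

```python
def best_f1_threshold(scores, labels):
    """Pick the threshold that maximizes F1 on a small labeled set.

    Assumptions (not from the paper): a sample is flagged as abnormal when
    its score is >= threshold, labels use 1 for abnormal and 0 for normal,
    and F1 is the criterion being optimized.
    """
    best_t, best_f1 = None, -1.0
    # Only the observed scores need to be tried as candidate thresholds.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Strict comparison keeps the lowest threshold among ties.
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical anomaly scores and labels, for illustration only.
scores = [0.10, 0.20, 0.15, 0.80, 0.90, 0.30]
labels = [0, 0, 0, 1, 1, 0]
threshold, f1 = best_f1_threshold(scores, labels)
```

With well-separated scores such as these, the selected threshold falls on the lowest-scoring abnormal sample. In practice the threshold would be chosen on a held-out validation split so that the reported test performance is not inflated.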