Authors:
Baya Lydia BOUDJELOUD and François POULET
Affiliation:
ESIEA Recherche, France
Keyword(s):
Data Mining, outlier detection, data visualisation, genetic algorithm, high dimensional data.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Artificial Intelligence and Decision Support Systems; Biomedical Engineering; Business Analytics; Data Engineering; Data Mining; Databases and Information Systems Integration; Datamining; Enterprise Information Systems; Evolutionary Programming; Health Information Systems; Sensor Networks; Signal Processing; Soft Computing
Abstract:
The outlier detection problem has important applications in fraud detection, network robustness analysis, and intrusion detection. Such applications must handle high dimensional data sets with hundreds of dimensions. In high dimensional space, however, the data are sparse and the notion of proximity loses its meaning. Many recent algorithms use heuristics such as genetic algorithms or tabu search to mitigate these difficulties. In this paper we present a new hybrid algorithm for outlier detection in high dimensional data. We evaluate the performance of the new algorithm on several high dimensional data sets and visualise its results for some of them.