Authors:
Clara Pizzuti and Giandomenico Spezzano
Affiliation:
National Research Council (CNR), Italy
Keyword(s):
Genetic programming, Data mining, Classification, Ensemble classifiers, Streaming data, Fractal dimension.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Co-Evolution and Collective Behavior; Computational Intelligence; Evolutionary Computing; Genetic Algorithms; Informatics in Control, Automation and Robotics; Intelligent Control Systems and Optimization; Soft Computing
Abstract:
Distributed stream-based classification methods have many important applications, such as sensor data analysis, network security, and business intelligence. An important challenge is handling concept drift in the data stream environment, which traditional learning techniques do not handle easily. This paper presents a Genetic Programming (GP) based boosting ensemble method for the classification of distributed streaming data that is able to adapt in the presence of concept drift. The approach handles flows of data coming from multiple locations by building a global model obtained by aggregating the local models coming from each node. The algorithm uses a fractal dimension-based change detection strategy, based on the self-similarity of the ensemble behavior, that captures time-evolving trends and patterns in the stream and reveals changes in evolving data streams. Experimental results on a real-life data set show the validity of the approach in maintaining an accurate and up-to-date GP ensemble.
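The abstract does not say how the fractal dimension is estimated or how a change is declared, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a box-counting estimate computed over a window of ensemble quality scores in [0, 1] (for example, per-block accuracy), and it flags a change when the estimates for two windows differ by more than a threshold. The function names, the grid resolutions, and the threshold value are all hypothetical choices made for illustration.

```python
import math
import random


def box_counting_dimension(values, scales=(2, 4, 8, 16)):
    """Estimate the box-counting dimension of a series of values in [0, 1].

    The series is treated as points (t, y) in the unit square. For each
    grid resolution k (a k x k grid), the occupied boxes N(k) are counted,
    and the dimension is the least-squares slope of log N(k) against log k.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    # Normalise time to [0, 1] and clamp scores to [0, 1].
    points = [(i / (n - 1), min(max(v, 0.0), 1.0)) for i, v in enumerate(values)]
    xs, ys = [], []
    for k in scales:
        occupied = {
            (min(int(t * k), k - 1), min(int(y * k), k - 1)) for t, y in points
        }
        xs.append(math.log(k))
        ys.append(math.log(len(occupied)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def drift_detected(reference, recent, threshold=0.1):
    """Hypothetical rule: flag a change when the two windows' estimated
    dimensions differ by more than `threshold` (an arbitrary value)."""
    gap = abs(box_counting_dimension(reference) - box_counting_dimension(recent))
    return gap > threshold


# Illustrative use with synthetic scores (not data from the paper).
rng = random.Random(0)
stable_scores = [0.5] * 256                          # flat ensemble behaviour
erratic_scores = [rng.random() for _ in range(256)]  # highly irregular behaviour
changed = drift_detected(stable_scores, erratic_scores)
```

In this sketch a flat score series yields an estimate close to 1, while an irregular one fills more of the unit square and yields a larger estimate, so a jump between windows can serve as a change signal. A real deployment would need to choose the window length, grid resolutions, and threshold empirically.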