Authors:
Victor Eruhimov (1); Vladimir Martyanov (1); Eugene Tuv (1) and George C. Runger (2)
Affiliations:
(1) Intel, Analysis and Control Technology, United States; (2) Industrial Engineering, Arizona State University, United States
Keyword(s):
Data streams, ensembles, variable importance, multivariate control.
Related Ontology Subjects/Areas/Topics:
Business Analytics; Change Detection; Computer Vision, Visualization and Computer Graphics; Data Engineering; Feature Extraction; Features Extraction; Image and Video Analysis; Informatics in Control, Automation and Robotics; Intelligent Control Systems and Optimization; Intelligent Fault Detection and Identification; Machine Learning in Control Applications; Signal Processing, Sensors, Systems Modeling and Control
Abstract:
High-dimensional data streams are increasingly common as data sets become wider. Time segments of stable system performance are often interrupted by change events. The change-point problem is to detect such changes and to identify the attributes that contribute to them. Existing methods focus on detecting a single change-point (or a few) in a univariate (or low-dimensional) process. We consider the important high-dimensional multivariate case with multiple change-points and without an assumed distribution. The problem is transformed into a supervised learning problem with time as the output response and the process variables as inputs, which opens it to a wide range of supervised learning tools. Feature selection methods are used to identify the subset of variables that change. An example illustrates the method in an important type of application.
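
The reformulation described in the abstract (time as the response, process variables as inputs, variable importance to flag what changed) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: a single-split regression stump per variable stands in for whatever learner the paper uses, and the synthetic stream, the shift size, the change time, the 0.3 threshold, and the name `stump_importance` are all hypothetical choices made for this example.

```python
import random


def stump_importance(x, t):
    """Fraction of the variance of t explained by the best single split on x.

    Assumption: a one-split stump is a stand-in for a real learner.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ts = [t[i] for i in order]
    total = sum(ts)
    sst = sum((v - total / n) ** 2 for v in ts)  # total sum of squares of time
    if sst == 0:
        return 0.0
    best, left_sum = 0.0, 0.0
    for k in range(1, n):
        left_sum += ts[k - 1]
        if xs[k] == xs[k - 1]:
            continue  # no threshold separates equal values
        right_sum = total - left_sum
        # Between-group sum of squares for a split after the k-th sorted value.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
        best = max(best, gain)
    return best / sst


# Hypothetical stream: 5 variables, 200 time points, unit-variance noise.
# Variables 1 and 3 shift their mean by 4 at time 100 (illustrative values).
random.seed(0)
n_obs, n_vars, change_at, shift = 200, 5, 100, 4.0
shifted = {1, 3}
stream = [
    [random.gauss(0.0, 1.0) + (shift if j in shifted and i >= change_at else 0.0)
     for j in range(n_vars)]
    for i in range(n_obs)
]

# Supervised reformulation: the response is the time index itself.
time_index = list(range(n_obs))
scores = [
    stump_importance([row[j] for row in stream], time_index)
    for j in range(n_vars)
]

# Variables that predict time well are the ones whose distribution changed.
threshold = 0.3  # illustrative cut-off, not taken from the paper
changed = [j for j, s in enumerate(scores) if s > threshold]
print(changed)
```

A stationary variable carries almost no information about when an observation was taken, so its score stays near zero, while a variable whose mean shifts separates early from late observations and scores high. A single split can only capture one change per variable; handling multiple change-points, as the abstract targets, would call for a more flexible learner such as a tree ensemble.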