CHANGE-POINT DETECTION WITH SUPERVISED LEARNING AND FEATURE SELECTION

Victor Eruhimov, Vladimir Martyanov, Eugene Tuv, George C. Runger

Abstract

High-dimensional data streams are increasingly common as data sets become wider. Time segments of stable system performance are often interrupted by change events. The change-point problem is to detect such changes and identify the attributes that contribute to them. Existing methods focus on detecting a single change-point (or a few) in a univariate (or low-dimensional) process. We consider the important high-dimensional multivariate case with multiple change-points and without an assumed distribution. The problem is transformed into a supervised learning problem with time as the output response and the process variables as inputs. This opens the problem to a wide set of supervised learning tools. Feature selection methods are used to identify the subset of variables that change. An example illustrates the method in an important type of application.
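
The transformation described in the abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' implementation: the simulated data, the helper `stump_score`, and the selection threshold of 0.3 are all hypothetical, and a single-split regression stump stands in for whatever supervised learner is actually used. Each process variable is scored by how well one split on it predicts the time index; a variable whose mean shifts at some point separates early from late observations and therefore scores high.

```python
import random

def stump_score(x, t):
    """Fraction of the variance in time t explained by the best single split on x."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ts = [t[i] for i in order]
    total = sum(ts)
    mean = total / n
    sst = sum((v - mean) ** 2 for v in ts)
    best = 0.0
    left_sum = 0.0
    for k in range(1, n):
        left_sum += ts[k - 1]
        if xs[k] == xs[k - 1]:
            continue  # no valid threshold between equal values
        right_sum = total - left_sum
        # Between-group sum of squares of t for the split after position k.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
        best = max(best, gain)
    return best / sst if sst > 0 else 0.0

# Simulated stream (assumption): 200 time steps, 5 variables,
# variable 2 shifts its mean by 4 standard deviations at t = 120.
rng = random.Random(0)
n, p, change_at, shifted = 200, 5, 120, 2
columns = [
    [rng.gauss(0.0, 1.0) + (4.0 if (j == shifted and i >= change_at) else 0.0)
     for i in range(n)]
    for j in range(p)
]
time_index = list(range(n))  # time is the response, the variables are inputs

scores = [stump_score(col, time_index) for col in columns]
# Feature selection: keep variables whose score exceeds a hypothetical threshold.
selected = [j for j, s in enumerate(scores) if s > 0.3]
print(scores, selected)
```

If no variable can predict time better than chance, the sketch treats the stream as stable; variables that do predict time are reported as contributors to the change. A stump is used here only to keep the example dependency-free, and any supervised learner with a variable-importance measure could take its place.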



Paper Citation


in Harvard Style

Eruhimov V., Martyanov V., Tuv E. and Runger G. C. (2007). CHANGE-POINT DETECTION WITH SUPERVISED LEARNING AND FEATURE SELECTION. In Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-972-8865-82-5, pages 359-363. DOI: 10.5220/0001631303590363


in Bibtex Style

@conference{icinco07,
author={Victor Eruhimov and Vladimir Martyanov and Eugene Tuv and George C. Runger},
title={CHANGE-POINT DETECTION WITH SUPERVISED LEARNING AND FEATURE SELECTION},
booktitle={Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2007},
pages={359-363},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001631303590363},
isbn={978-972-8865-82-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - CHANGE-POINT DETECTION WITH SUPERVISED LEARNING AND FEATURE SELECTION
SN - 978-972-8865-82-5
AU - Eruhimov V.
AU - Martyanov V.
AU - Tuv E.
AU - Runger G. C.
PY - 2007
SP - 359
EP - 363
DO - 10.5220/0001631303590363