2-CLASS EIGEN TRANSFORMATION CLASSIFICATION TREES

Frank Plastria, Steven De Bruyne

2009

Abstract

We propose a classification algorithm that extends linear classifiers for binary classification problems by searching for additional, later splits that deal with remote clusters. These additional splits are searched for in directions given by several eigen transformations. The resulting structure is a tree with unique properties that allow the classifier to be constructed using criteria more directly related to classification power than those used by traditional classification trees. We show that the algorithm produces classifiers that are equivalent to linear classifiers where the latter are optimal, and that otherwise offer higher flexibility while being more robust than traditional classification trees. We also show on a real-life example how the algorithm can outperform traditional classification algorithms. The new classifiers retain the interpretability of linear classifiers and traditional classification trees, which is unavailable with more complex classifiers. Additionally, they make it easy to identify not only the main properties of the separate classes but also the properties of potential subclasses.
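
To make the idea of "a split searched for along an eigen direction" concrete, here is a minimal sketch of a single such split. It is not the authors' algorithm: the abstract does not say which eigen transformations are used or how the tree is grown, so this sketch assumes the leading principal component as the direction and the training misclassification count as the split criterion. All function names are hypothetical.

```python
def covariance(points):
    """Sample covariance matrix of a list of equal-length vectors."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / (n - 1)
             for j in range(d)] for i in range(d)]

def leading_eigenvector(matrix, iterations=200):
    """Power iteration: approximates the eigenvector of the largest eigenvalue.
    Assumes a nonzero matrix and a start vector not orthogonal to that eigenvector."""
    d = len(matrix)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def best_threshold(points, labels, direction):
    """Threshold along `direction` with the fewest training misclassifications.
    Returns (errors, threshold, label assigned to the side above the threshold)."""
    proj = sorted((sum(a * b for a, b in zip(p, direction)), y)
                  for p, y in zip(points, labels))
    best = (len(points) + 1, None, None)
    for k in range(len(proj) - 1):
        if proj[k][0] == proj[k + 1][0]:
            continue  # no threshold fits between identical projections
        t = (proj[k][0] + proj[k + 1][0]) / 2
        for high_label in (0, 1):
            # A point is misclassified when "has the high label" and
            # "lies above the threshold" disagree.
            errors = sum((y == high_label) != (s > t) for s, y in proj)
            if errors < best[0]:
                best = (errors, t, high_label)
    return best

# Toy data (assumed for illustration): two classes separated along the first axis.
points = [(0.0, 0.1), (0.5, -0.1), (1.0, 0.2), (4.0, 0.0), (4.5, 0.1), (5.0, -0.2)]
labels = [0, 0, 0, 1, 1, 1]

direction = leading_eigenvector(covariance(points))
errors, threshold, high_label = best_threshold(points, labels, direction)
```

Counting errors directly, rather than using an impurity measure such as entropy, is one way to read the abstract's phrase "criteria that are more directly related to classification power". A full tree would apply further splits of this kind to the resulting subsets.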



Paper Citation


in Harvard Style

Plastria F. and De Bruyne S. (2009). 2-CLASS EIGEN TRANSFORMATION CLASSIFICATION TREES. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009) ISBN 978-989-674-011-5, pages 251-258. DOI: 10.5220/0002266202510258


in Bibtex Style

@conference{kdir09,
author={Frank Plastria and Steven De Bruyne},
title={2-CLASS EIGEN TRANSFORMATION CLASSIFICATION TREES},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)},
year={2009},
pages={251-258},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002266202510258},
isbn={978-989-674-011-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)
TI - 2-CLASS EIGEN TRANSFORMATION CLASSIFICATION TREES
SN - 978-989-674-011-5
AU - Plastria F.
AU - De Bruyne S.
PY - 2009
SP - 251
EP - 258
DO - 10.5220/0002266202510258