CONSOLIDATED TREE CONSTRUCTION ALGORITHM: STRUCTURALLY STEADY TREES

J. M. Pérez, J. Muguerza, O. Arbelaitz, I. Gurrutxaga

Abstract

This paper presents a new methodology for building decision trees, or classification trees: the Consolidated Trees Construction algorithm. It addresses the structural instability that this paradigm shows when small variations occur in the training set. As a consequence, the classification remains understandable, which distinguishes this technique from methods such as bagging and boosting, where the explanatory power of the classification disappears. The methodology consists of a new meta-algorithm for building structurally steadier and less complex trees (consolidated trees), so that they keep their explanatory capacity and are faster, without losing discriminating capacity. The meta-algorithm uses C4.5 as its base classifier. In addition to the meta-algorithm, we propose a measure of structural diversity, used to analyse the stability of the structural component. This measure estimates the heterogeneity of a set of trees from the structural point of view. The results have been compared with those obtained with C4.5 on several UCI Repository databases and on a real customer loyalty (fidelisation) application from an electrical appliance company.
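The abstract does not define the structural diversity measure, so the following is only a hypothetical illustration of what "heterogeneity in a set of trees from the structural point of view" could mean, not the measure proposed in the paper. It assumes a toy tree encoding (a leaf is a class label string, an internal node is a tuple of split attribute and branches) and uses the mean pairwise Jaccard distance between the trees' split positions. All function names are invented for this sketch.

```python
from itertools import combinations

# Assumed toy encoding (not from the paper): a leaf is a class label (str);
# an internal node is a tuple (split_attribute, {branch_value: subtree}).


def split_signature(tree, path=()):
    """Collect (path, attribute) pairs for every internal node of a tree."""
    if not isinstance(tree, tuple):
        return set()  # leaves contribute nothing to the structure
    attribute, branches = tree
    signature = {(path, attribute)}
    for value, subtree in branches.items():
        signature |= split_signature(subtree, path + ((attribute, value),))
    return signature


def structural_distance(tree_a, tree_b):
    """Jaccard distance between split signatures (0 = same splits, 1 = none shared)."""
    sig_a, sig_b = split_signature(tree_a), split_signature(tree_b)
    union = sig_a | sig_b
    if not union:
        return 0.0  # two single-leaf trees are structurally identical
    return 1.0 - len(sig_a & sig_b) / len(union)


def structural_diversity(trees):
    """Mean pairwise structural distance over a set of trees."""
    pairs = list(combinations(trees, 2))
    if not pairs:
        return 0.0
    return sum(structural_distance(a, b) for a, b in pairs) / len(pairs)


# Three hypothetical trees, as might result from slightly different samples.
t1 = ("outlook", {"sunny": ("humidity", {"high": "no", "normal": "yes"}), "rain": "yes"})
t2 = ("outlook", {"sunny": ("wind", {"strong": "no", "weak": "yes"}), "rain": "yes"})
t3 = ("humidity", {"high": "no", "normal": "yes"})

print(structural_diversity([t1, t2, t3]))
```

Under this toy definition, a set of identical trees scores 0, and the score grows as the trees disagree on which attribute is tested at which position. A steadier construction method would be expected to yield lower values across resampled training sets.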


Paper Citation


in Harvard Style

Pérez J. M., Muguerza J., Arbelaitz O. and Gurrutxaga I. (2004). CONSOLIDATED TREE CONSTRUCTION ALGORITHM: STRUCTURALLY STEADY TREES. In Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 972-8865-00-7, pages 14-21. DOI: 10.5220/0002602200140021


in Bibtex Style

@conference{iceis04,
author={J. M. Pérez and J. Muguerza and O. Arbelaitz and I. Gurrutxaga},
title={CONSOLIDATED TREE CONSTRUCTION ALGORITHM: STRUCTURALLY STEADY TREES},
booktitle={Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2004},
pages={14-21},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002602200140021},
isbn={972-8865-00-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - CONSOLIDATED TREE CONSTRUCTION ALGORITHM: STRUCTURALLY STEADY TREES
SN - 972-8865-00-7
AU - Pérez J. M.
AU - Muguerza J.
AU - Arbelaitz O.
AU - Gurrutxaga I.
PY - 2004
SP - 14
EP - 21
DO - 10.5220/0002602200140021