Authors:
J. M. Pérez; J. Muguerza; O. Arbelaitz and I. Gurrutxaga
Affiliation:
Informatika Fakultatea, Spain
Keyword(s):
decision trees, steadiness, explaining capacity, structural diversity measure
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Artificial Intelligence and Decision Support Systems; Biomedical Engineering; Business Analytics; Data Engineering; Data Mining; Databases and Information Systems Integration; Datamining; Enterprise Information Systems; Health Information Systems; Industrial Applications of Artificial Intelligence; Sensor Networks; Signal Processing; Soft Computing
Abstract:
This paper presents a new methodology for building decision trees or classification trees (the Consolidated Trees Construction algorithm) that addresses the unsteadiness the paradigm shows when the training set undergoes small variations. As a consequence, the explanation of the resulting classification is not lost, which distinguishes this technique from bagging and boosting, where the explanatory feature of the classification disappears. The methodology consists of a new meta-algorithm for building structurally steadier and less complex trees (consolidated trees), so that they retain their explaining capacity and are faster, without losing discriminating capacity. The meta-algorithm uses C4.5 as its base classifier. Besides the meta-algorithm, we propose a measure of structural diversity, used to analyse the stability of the structural component. This measure estimates the heterogeneity of a set of trees from the structural point of view. The results have been compared with those obtained with C4.5 on several UCI Repository databases and on a real customer fidelisation application from an electrical appliance company.