5 Conclusions
The two-level (or meta-level) approach to machine learning has been studied for
many years. This paper continues our larger project, whose purpose is to
design, implement, and empirically compare the meta-learner Meta-CN4 with other
algorithms for processing missing attribute values. Specifically, it presents the
part of that project that examines the ‘foldness’ S of such a meta-combiner as its
crucial parameter. The sole criterion in our experiments, albeit a widely used one,
was the classification accuracy measured on the testing sets.
Analyzing the results of our experiments led us to the following conclusions.
Although the experiments were carried out for only a few values of the
parameter S, we can observe that there is an ‘optimal’ value of S that maximizes the
classification accuracy. This is most evident in the series for S=32,
which always performs worse than the series for the other values.
More precisely, the t-test (at the significance level 0.05) shows that the
performance of the meta-combiner for S=4, S=8, and S=16 is statistically
equivalent, and that it is significantly better than the performance for S=2 and S=32.
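The kind of pairwise comparison described above can be sketched as follows. This is only a minimal illustration, not the procedure actually used in our experiments: it assumes a paired t-test over 10 runs (the text does not state which variant of the t-test was applied or how many runs were made), the function names are hypothetical, and the accuracy values are made-up placeholders rather than measured results.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided critical value of Student's t for significance level 0.05 and
# 9 degrees of freedom (10 paired runs), taken from a standard t-table.
T_CRIT_DF9 = 2.262


def paired_t_statistic(acc_a, acc_b):
    """Return the paired t statistic for two equally long accuracy lists."""
    if len(acc_a) != len(acc_b):
        raise ValueError("both lists must contain the same number of runs")
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    # Mean difference divided by its standard error.
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def significantly_different(acc_a, acc_b, t_crit=T_CRIT_DF9):
    """True if the two series differ at the level implied by t_crit."""
    return abs(paired_t_statistic(acc_a, acc_b)) > t_crit


# ASSUMPTION: placeholder accuracies (in %) for 10 runs per value of S.
# These numbers are invented for illustration and are NOT experimental data.
acc_s8 = [82.1, 80.4, 83.0, 81.2, 79.8, 82.5, 81.9, 80.7, 82.8, 81.5]
acc_s16 = [81.8, 80.9, 82.6, 81.5, 79.5, 82.9, 81.4, 81.0, 82.3, 81.9]
acc_s32 = [78.9, 77.5, 80.1, 78.0, 77.2, 79.4, 78.8, 77.9, 79.6, 78.3]

print("S=8 vs S=16 differ:", significantly_different(acc_s8, acc_s16))
print("S=8 vs S=32 differ:", significantly_different(acc_s8, acc_s32))
```

With a different number of runs, the critical value would have to be replaced by the one for the corresponding degrees of freedom.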
Because of time limitations, we did not perform more experiments. We did not use
the stack generalizer (foldness S=K) because it is much more time-consuming; paper
[9] indicates that the time cost of the stack generalizer is much larger than
that of the meta-combiner for relatively small values of the parameter S.
In future research, we plan to perform more experiments and to study how the
optimal value of the parameter S depends on the database being processed. Our
impression is that even this task (finding an optimal value of S) would require
introducing another ‘meta-level’.
References
1. Batista, G., Monard, M.C.: An analysis of four missing data treatment methods for
supervised learning. Applied Artificial Intelligence, 17 (2003), 519-533
2. Berka, P. and Bruha, I.: Various discretizing procedures of numerical attributes: Machine
Learning, and Knowledge Discovery in Databases, Heraklion, Crete (1995), 136-141
3. Boswell, R.: Manual for CN2, version 4.1. Turing Institute, Techn. Rept. P-2145/Rab/4/1.3
(1990)
4. Bruha, I.: Unknown attribute values processing utilizing expert knowledge on attribute
hierarchy. 8th European Conference on Machine Learning, Workshop Statistics, Machine
Learning, and Knowledge Discovery in Databases, Heraklion, Crete (1995), 130-135
5. Bruha, I.: Unknown attribute values processing by meta-learner. International Symposium
on Methodologies for Intelligent Systems (ISMIS-2002), Lyon, France (2002)
6. Bruha, I. and Franek, F.: Comparison of various routines for unknown attribute value
processing: Covering paradigm. International Journal of Pattern Recognition and Artificial
Intelligence, 10, 8 (1996), 939-955
7. Clark, P. and Boswell, R.: Rule induction with CN2: Some recent improvements.
EWSL'91, Porto (1991), 151-163
8. Clark, P. and Niblett, T.: The CN2 induction algorithm. Machine Learning, 3 (1989), 261-283
9. Fan, D.W., Chan, P.K., Stolfo, S.J.: A comparative evaluation of combiner and