
study, we present a stacking variant methodology that uses three different learning algorithms, Naive Bayes, C4.5 and BP, as base classifiers and M5 as the meta-level classifier. A number of comparisons with other ensembles that use C4.5, NB or BP as base classifiers showed that the proposed method gives better accuracy in many cases.
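As an illustration of this architecture, the following is a minimal sketch in Python with scikit-learn. It is not the implementation used in this study: scikit-learn provides neither C4.5, BP nor M5, so GaussianNB, DecisionTreeClassifier, MLPClassifier and DecisionTreeRegressor are used here as stand-ins, the iris data and all parameter values are arbitrary, and the multi-response formulation of the meta-level learner (one regression model per class) is an assumption about how M5 is applied.

# Sketch of stacking with NB, a decision tree and a neural network as base
# classifiers and a regression tree as a multi-response meta-level learner.
# Stand-ins: DecisionTreeClassifier for C4.5, MLPClassifier for BP,
# DecisionTreeRegressor for M5; the iris data set is used only as an example.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

base = [GaussianNB(),
        DecisionTreeClassifier(random_state=1),
        MLPClassifier(max_iter=2000, random_state=1)]

# Meta-level attributes: cross-validated class probabilities of each base
# classifier, so the meta-learner is not trained on resubstitution estimates.
meta_tr = np.hstack([cross_val_predict(c, X_tr, y_tr, cv=5,
                                        method="predict_proba")
                     for c in base])

# Re-fit every base classifier on the whole training set for deployment.
for c in base:
    c.fit(X_tr, y_tr)
meta_te = np.hstack([c.predict_proba(X_te) for c in base])

# Multi-response meta-level regression: one regression tree per class,
# trained to predict the 0/1 class-membership indicator.
classes = np.unique(y_tr)
meta_models = []
for k in classes:
    r = DecisionTreeRegressor(random_state=1)
    r.fit(meta_tr, (y_tr == k).astype(float))
    meta_models.append(r)

# Final prediction: the class whose regressor gives the largest output.
scores = np.column_stack([r.predict(meta_te) for r in meta_models])
y_pred = classes[np.argmax(scores, axis=1)]
print("stacking accuracy:", np.mean(y_pred == y_te))

The design point mirrored in the sketch is that the meta-level learner is trained on class probabilities produced by cross-validation over the training set, not on the base classifiers' predictions for data they were fitted on.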
In spite of these results, no single method works best on every problem. Therefore, we can only state that a particular method for creating an ensemble can be better than the best single model, and we will continue working to identify the generation and combination methods that best solve different classification problems.
The stacked generalization architecture for classifier combination still has many open questions. For example, there are currently no strict rules on which base classifiers should be used or which features of the training set should be used to train the combining classifier.
In future work, we will apply a feature selection pre-processing step before stacking. Feature subset selection is the process of identifying and removing as many irrelevant and redundant features as possible. This reduces the dimensionality of the data, enabling the proposed ensemble to operate faster and, possibly, more effectively.
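The fragment below sketches what such a pre-processing step could look like, again in Python with scikit-learn. SelectKBest with a mutual-information filter is only one of many possible selection methods, and the synthetic data set and the number of retained features are arbitrary assumptions made for illustration.

# Example: filter-based feature subset selection as a pre-processing step.
# SelectKBest with mutual information is one possible filter; the data set
# and the value of k are arbitrary choices made only for illustration.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                           n_redundant=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

selector = SelectKBest(score_func=mutual_info_classif, k=8)
X_tr_sel = selector.fit_transform(X_tr, y_tr)   # select features on training data only
X_te_sel = selector.transform(X_te)             # apply the same subset to the test data
# X_tr_sel / X_te_sel would then be fed to the stacking procedure sketched earlier.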