overall system combines these separate learners into an integrated model using majority voting or weighted majority voting. These approaches are
mainly inspired by Boosting (Freund and Schapire,
1997) and Bagging (Breiman, 1996).
In this paper we apply a meta-learning approach, rather than a combined-dataset or ensemble technique, to deal with concept drift. Our research focuses on Supervised Learning (SL), where we implement Multi-Steps Learning (MSL) to enhance SL performance on data with concept drift. Nearly all of the methods dealing with concept drift can be seen as attempts to improve the performance of a Naïve Bayesian learning algorithm. These algorithms are a good baseline for the experimental analysis of a new method, as their accuracies are easy to adjust and therefore leave room for improvement. In our experiments, we
compare MSL to a Bayesian combined dataset method
and two Bayesian ensemble techniques.
The remainder of this paper is organized as fol-
lows. In Section 2, we discuss the properties of en-
semble techniques, then briefly review some existing
methods proposed to cope with concept drift. Sec-
tion 3 presents our proposed method, MSL, for en-
hancing supervised learning on evolving data. Sec-
tion 4 describes the experimental setup and presents
the performance evaluation. Finally, Section 5 summarizes the key findings and outlines directions for future work.
2 RELATED WORK
Boosting and Bagging are the two most famous en-
semble methods in ML and DM; they inspire so-
lutions to many learning problems that require en-
semble models, such as on-line learning, distributed
learning, and incremental learning. They mainly provide two components: a mechanism for utilising the instances in the available training sets, and a mechanism for combining the base learners. Many combining rules exist in the field (Kittler et al., 1998), of which majority voting (used by Bagging) and weighted majority voting (used by Boosting) are the two most widely used.
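To make the two combining rules concrete, the following minimal Python sketch contrasts them (the function names are ours, for illustration only):

from collections import Counter

def majority_vote(predictions):
    # Bagging-style combination: every learner's vote counts equally.
    return Counter(predictions).most_common(1)[0][0]

def weighted_majority_vote(predictions, weights):
    # Boosting-style combination: each vote counts with its learner's weight.
    scores = {}
    for label, w in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)

# Three learners predict "a", "b", "a":
majority_vote(["a", "b", "a"])                            # -> "a"
weighted_majority_vote(["a", "b", "a"], [0.2, 0.9, 0.3])  # -> "b"

Note how the same votes can yield different decisions once the learners' reliabilities differ, which is the motivation for weighting.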
The Streaming Ensemble Algorithm (SEA) (Street and Kim, 2001) is a pioneering method for dealing with concept drift in streaming data. It maintains a constant number of classifiers in its ensemble pool and, when a new dataset becomes available, performs majority-vote classification on the new instances. It then re-evaluates the component classifiers according to their classification accuracies and replaces those evaluated as underperforming with new classifiers. The overall accuracy is improved by using the updated pool.
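The pool-maintenance step can be sketched as follows. This is a simplification of Street and Kim's method (their quality measure is more refined than raw held-out accuracy), and build and accuracy are assumed helper functions, not names from their paper:

def update_pool(pool, train_chunk, eval_chunk, max_size, build, accuracy):
    # Train a candidate classifier on the newest chunk of data.
    candidate = build(train_chunk)
    pool = pool + [candidate]
    if len(pool) > max_size:
        # Keep only the max_size members that perform best on recent data,
        # so underperforming classifiers are pushed out of the pool.
        pool.sort(key=lambda clf: accuracy(clf, eval_chunk), reverse=True)
        pool = pool[:max_size]
    return pool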
Beyond SEA, Bifet designed a new streaming ensemble method (Bifet et al., 2009), which enhances supervised learning by using two adapted versions of Bagging built on adaptive windowing (ADWIN) and the Adaptive-Size Hoeffding Tree (ASHT): ADWIN is a change detector, and ASHT is an incremental anytime decision-tree induction algorithm.
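As an illustration of the idea behind ADWIN, the following toy detector keeps a window of recent values (e.g., 0/1 prediction errors) and drops the older part whenever two sub-windows disagree beyond a Hoeffding-style threshold. The real ADWIN uses exponential bucketing for efficiency, so this naive version is only a sketch of the cut rule:

import math

class SimpleAdwin:
    def __init__(self, delta=0.002):
        self.delta = delta
        self.window = []

    def add(self, x):
        # x is a recent observation, e.g. 1 for a misclassification, 0 otherwise.
        self.window.append(x)
        return self._detect_and_shrink()

    def _detect_and_shrink(self):
        n = len(self.window)
        for split in range(1, n):
            left, right = self.window[:split], self.window[split:]
            m0 = sum(left) / len(left)
            m1 = sum(right) / len(right)
            # Harmonic mean of the sub-window sizes, as in the ADWIN cut rule.
            m = 1.0 / (1.0 / len(left) + 1.0 / len(right))
            eps = math.sqrt(math.log(4.0 / self.delta) / (2.0 * m))
            if abs(m0 - m1) > eps:
                self.window = right   # distribution changed: forget stale data
                return True
        return False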
Dynamic Weighted Majority (DWM) (Kolter and Maloof, 2007) is a representative method using weighted majority voting. It maintains a set of learners trained on different datasets from different time periods, where each learner carries a weight specifying how reliable it is. The weights are updated over time according to performance on the new datasets, and learners with low weights are removed or replaced with new learners. The overall system makes predictions using weighted majority voting among the base learners. One advantage of DWM is that the number of base learners need not be fixed in advance: the set of base learners is continuously updated by the training process.
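A single DWM-style update can be sketched as below. This simplification omits the weight normalization and the update period p of Kolter and Maloof's algorithm, and the expert interface (predict, learn) and the make_expert factory are assumed names:

def dwm_step(experts, weights, x, y, beta=0.5, threshold=0.01, make_expert=None):
    # Collect weighted votes and discount the experts that err on (x, y).
    scores = {}
    for i, expert in enumerate(experts):
        pred = expert.predict(x)
        scores[pred] = scores.get(pred, 0.0) + weights[i]
        if pred != y:
            weights[i] *= beta
    global_pred = max(scores, key=scores.get) if scores else None
    # Remove experts whose weight has decayed below the threshold.
    kept = [(e, w) for e, w in zip(experts, weights) if w >= threshold]
    experts[:] = [e for e, _ in kept]
    weights[:] = [w for _, w in kept]
    # On a global mistake, add a fresh expert with full weight.
    if global_pred != y and make_expert is not None:
        experts.append(make_expert())
        weights.append(1.0)
    # Every expert trains incrementally on the new instance.
    for expert in experts:
        expert.learn(x, y)
    return global_pred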
There are also other methods for coping with concept drift that do not use ensemble techniques. Widmer (Widmer, 1997) was the first to propose tracking context changes through meta-learning. Bach (Bach and Maloof, 2008) proposed using only two learners (paired learners) to cope with all of the concepts present in the datasets.
3 MSL LEARNER
The implementation of our learning system, Multi-Steps Learning (MSL), is presented in Algorithm 1. It consists of three types of learners: "old learners" that hold previously learned knowledge, "new learners" that learn the current knowledge, and "meta learners" that learn how to select between them. Generally, if concept drift occurs, an MSL learner is composed of three learners: a new learner, a meta learner, and an old learner. An important point to note is that the old learner may itself be an MSL learner. In other words, although the new learner and the meta learner must be single learners, the old learner can be a composite learner. Figure 2 presents the structure of an MSL learner that includes another MSL learner as its component (the old learner). Through this structure, MSL can encode all of the learned concepts in a traceable hierarchy, as we will see in the following section.
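As we read this description, the recursive composition could be sketched as follows; the method names are illustrative assumptions on our part, not the interface of Algorithm 1:

class MSLLearner:
    # `old` may itself be an MSLLearner, so learned concepts form a
    # traceable hierarchy; `new` and `meta` are always single learners.
    def __init__(self, new_learner, meta_learner, old_learner):
        self.new = new_learner    # learns the current concept
        self.meta = meta_learner  # learns which component to trust
        self.old = old_learner    # previous knowledge, possibly composite

    def predict(self, x):
        # The meta learner routes each instance to one component learner.
        chosen = self.new if self.meta.predict(x) == "new" else self.old
        return chosen.predict(x)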
Zero or Singular Concept Drift in datasets. Initially, the MSL system builds a learner on the first dataset. If