learning can damage the classification when the initial modeling assumptions are incorrect, in particular when the classifier is inadequate for the task or when the data distributions of the labeled and unlabeled data are biased differently. To tackle this problem, in this paper we propose an ensemble of classifiers that shows robust performance across domains and weights the unlabeled instances according to the probability of their predicted labels. WSA was experimentally evaluated and compared against other classifiers on several datasets, with very promising results.
The rest of this paper is organized as follows: Section 2 describes the related work and the AdaBoost algorithm. Section 3 discusses the proposed WSA algorithm. Section 4 presents the experimental results of WSA on different datasets and, finally, Section 5 concludes this work and gives directions for future work.
2 RELATED WORK
There are several works in the literature based on boosting techniques within a semi-supervised learning framework (Bennett et al., 2002; Buc et al., 2002). Boosting is a popular learning method that provides a framework for improving the performance of any given learner by building an ensemble of classifiers. In (Buc et al., 2002), the authors extended MarginBoost into a semi-supervised framework, in an algorithm called SSMBoost for binary classification problems. They developed a margin definition for unlabeled data and a gradient descent algorithm that corresponds to the resulting margin cost function, and they used a mixture model trained with the Expectation Maximization algorithm as the base classifier. In contrast, our work uses the probabilities of the labels predicted by the current classifier to weight the unlabeled data, which can be labeled with multiple classes.
Another approach is presented in (Bennett et al., 2002). The authors proposed a new algorithm called ASSEMBLE, which assigns pseudo-classes and small weights to all unlabeled examples and weights the labeled examples according to a starting classifier. From then on, the unlabeled data are classified with the current classifier and the weights are assigned to instances as in AdaBoost (Freund and Schapire, 1996). In (Chen and Wang, 2008), the authors propose a local smoothness regularizer for semi-supervised boosting algorithms based on the universal optimization framework of margin cost functionals.
The new semi-supervised ensemble of classifiers
proposed in this work, called WSA, differs from AS-
SEMBLE and SSMBoost in how labeled and unla-
beled instances are weighted. Unlabeled instances are
weighted according to a confidence measure based on
the probability of the predicted label, while the la-
beled instances are weighted according to the clas-
sifier error as in AdaBoost. The use of weights in
the learning process reduces the initial bias induced
by the first classifier on the unlabeled data. This bias
could reduce the performance of the ensemble, as occurs in many semi-supervised algorithms.
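To illustrate this weighting scheme, the following Python sketch shows one plausible form of the update step, under our own assumptions (a scikit-learn-style probabilistic base classifier with predict_proba; the function and variable names are ours, and the exact WSA update is the one given in Section 3):

import numpy as np

def wsa_weighting_step(clf, X_lab, y_lab, w_lab, X_unl):
    # Labeled instances: AdaBoost-style update based on the weighted error of clf.
    wrong = clf.predict(X_lab) != y_lab
    e = w_lab[wrong].sum() / w_lab.sum()          # weighted error on the labeled set
    beta = e / (1.0 - e)                          # beta = e / (1 - e), as in AdaBoost
    w_lab = np.where(wrong, w_lab, w_lab * beta)  # shrink weights of correct instances

    # Unlabeled instances: pseudo-label with clf and weight each instance by the
    # probability of its predicted label, so uncertain predictions count less.
    proba = clf.predict_proba(X_unl)
    pseudo_y = clf.classes_[proba.argmax(axis=1)]
    w_unl = proba.max(axis=1)
    return w_lab, pseudo_y, w_unl
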
Our new semi-supervised ensemble WSA is based
on the supervised multi-class AdaBoost ensemble,
which is described in the next section.
2.1 AdaBoost
The main idea of AdaBoost is to combine a series
of base classifiers using a weighted linear combina-
tion. Each time a new classifier is generated, it tries
to minimize the expected error by assigning a higher
weight to the samples that were wrongly classified in
the previous stages. Formally, AdaBoost starts from
a set L of labeled instances, where each instance, x_i, is assigned a weight, W(x_i). It considers N classes, where the known class of instance x_i is y_i. The base classifier is h, and h_t is one of the T classifiers in the ensemble. AdaBoost produces a linear combination of the T base classifiers, F(x) = ∑_t α_t h_t, where α_t is the weight of each classifier. This weight is proportional to the error of each classifier on the training data. Initially the weights are equal for all the instances, and these are used to generate the first base classifier, h_1 (using the training algorithm for the base classifier, which should consider the weight of each instance). Then the error e_1 of h_1 is obtained by adding the weights of the incorrectly classified instances. The weight of each correctly classified instance is decreased by the factor β_t = e_t/(1 − e_t), and these weights are used to train the next base classifier. The cycle is repeated until e_t ≥ 0.5 or until a predefined maximum number of iterations is reached. AdaBoost's final classifier is a linear combination of the T classifiers, whose weights are proportional to β_t (see Algorithm 1).
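Algorithm 1 is not reproduced here; the following Python sketch shows the boosting loop just described, under our own assumptions (a decision-stump base learner from scikit-learn, a renormalization step, and the common log(1/β_t) choice for the classifier weight; the function names are ours):

import numpy as np
from sklearn.tree import DecisionTreeClassifier  # any base learner that accepts sample weights

def adaboost(X, y, T=50):
    n = len(y)
    w = np.full(n, 1.0 / n)               # equal initial weights
    ensemble, alphas = [], []
    for _ in range(T):
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        wrong = h.predict(X) != y
        e = w[wrong].sum() / w.sum()      # error e_t: sum of weights of misclassified instances
        if e >= 0.5 or e == 0.0:          # stop when the base classifier is too weak (or perfect)
            break
        beta = e / (1.0 - e)              # beta_t = e_t / (1 - e_t)
        w[~wrong] *= beta                 # decrease weights of correctly classified instances
        w /= w.sum()                      # renormalize so the weights remain a distribution
        ensemble.append(h)
        alphas.append(np.log(1.0 / beta)) # classifier weight derived from beta_t (a common choice)
    return ensemble, alphas

def predict(ensemble, alphas, X, classes):
    # Weighted vote over the base classifiers: F(x) = sum_t alpha_t h_t(x)
    classes = np.asarray(classes)
    votes = np.zeros((len(X), len(classes)))
    for h, a in zip(ensemble, alphas):
        pred = h.predict(X)
        for k, c in enumerate(classes):
            votes[pred == c, k] += a
    return classes[votes.argmax(axis=1)]
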
3 WSA (WEIGHTED SEMI-SUPERVISED ADABOOST)
WSA receives a set of labeled data (L) and a set of unlabeled data (U). An initial weight of 1/|L| is assigned to all examples in L. The first classifier h_1 is built using