Bagging KNN Classifiers using Different Expert Fusion
Strategies
Amer. J. AlBaghdadi and Fuad M. Alkoot
Telecommunication and Navigation Institute
P.O. Box 6866, Hawally 32043, Kuwait
Abstract. An experimental evaluation of Bagging K-nearest neighbor classifiers
(KNN) is performed. The goal is to investigate whether soft methods of aggregation
other than Sum and Vote can yield better results. We evaluate the performance of
Sum, Product, MProduct, Minimum, Maximum, Median and Vote under varying
parameters. The results over different training set sizes show a minor improvement
when combining using Sum and MProduct. At very small sample sizes no improvement
is achieved by bagging KNN classifiers. While Minimum and Maximum yield no
improvement at almost any training set size, Vote and Median show an improvement
at larger training set sizes. Reducing the number of features at large training set
sizes improves the performance of the leading fusion strategies.
1 Introduction
Bagging Predictors [4], proposed by Breiman, is a method of generating multiple ver-
sions of a predictor or classifier via bootstrapping, and then using these to obtain an
aggregated classifier. The combining methods suggested by Breiman are Voting, when
the classifier outputs are labels, and Averaging, when the classifier outputs are numer-
ical measurements. The multiple versions of the classifier are formed by making boot-
strap [8] replicas of the training set, which are then used to train additional experts. He
postulates that the necessary condition for bagging to improve accuracy is that a per-
turbation of the learning set causes significant changes in the resulting classifier; in
other words, the classifier must be unstable. Bagging has been successfully applied in
practical cases to improve the performance of unstable classifiers; a sample of such
papers includes [6]. Many have investigated its performance and compared it to boosting
or other methods [9, 12, 7, 2, 13, 5].
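To make the procedure concrete, the following is a minimal sketch of bagging a KNN classifier, written in Python around scikit-learn's KNeighborsClassifier. The function name, the default of 25 replicas and the two fusion rules shown (Sum over soft outputs and majority Vote) are illustrative assumptions and do not reproduce the exact experimental configuration used in this paper.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def bag_knn(X_train, y_train, X_test, n_replicas=25, k=1, rule="sum", seed=None):
    """Train KNN experts on bootstrap replicas and fuse their outputs.

    rule="sum"  -- average the experts' soft class-membership estimates.
    rule="vote" -- majority vote over the experts' crisp decisions.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(y_train)
    n = len(X_train)
    support = np.zeros((len(X_test), len(classes)))  # accumulated soft outputs
    votes = np.zeros_like(support)                   # accumulated hard votes
    for _ in range(n_replicas):
        # Bootstrap replica: draw n training points with replacement.
        idx = rng.integers(0, n, size=n)
        expert = KNeighborsClassifier(n_neighbors=k).fit(X_train[idx], y_train[idx])
        # Align the expert's posterior estimates with the global class list,
        # since a given replica may not contain every class.
        proba = expert.predict_proba(X_test)
        cols = np.searchsorted(classes, expert.classes_)
        support[:, cols] += proba
        votes[np.arange(len(X_test)), cols[np.argmax(proba, axis=1)]] += 1
    fused = support if rule == "sum" else votes
    return classes[np.argmax(fused, axis=1)]

The Product, MProduct, Minimum, Maximum and Median rules studied in this paper fuse the same per-expert estimates, with the mean replaced by the corresponding operator.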
Breiman's results [4] show that bagging more than 25 replicas does not further im-
prove the performance. He also notes that fewer replicas are required when the classi-
fier outputs are numerical measurements rather than labels, but that more are required
as the number of classes increases. Regarding the bootstrap training set size, he used a
size equal to the cardinality of the original training set, and his tests showed no improve-
ment when the bootstrap training set was double the size of the original training set.
In this paper we bag K-NN classifiers in order to find whether it is possible to
achieve an improvement under varying parameters. We focus on the small training set
case, in which the KNN classifier can be expected to be unstable. We aggregate the
outputs of the experts trained on the generated bootstrap sets using six soft methods,
namely Sum, Product, MProduct, Minimum, Maximum and Median, as well as Vote.
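For reference, these fixed fusion rules can be written in the standard form used in the classifier combination literature. The notation below is ours, assuming each of the $N$ bootstrap experts outputs an estimate $P_i(\omega_j \mid x)$ of the posterior probability of class $\omega_j$ for a test sample $x$, and the combined classifier assigns $x$ to the class with the largest fused support:

\begin{align*}
\mu_j^{\mathrm{Sum}}(x)  &= \frac{1}{N}\sum_{i=1}^{N} P_i(\omega_j \mid x), &
\mu_j^{\mathrm{Prod}}(x) &= \prod_{i=1}^{N} P_i(\omega_j \mid x),\\
\mu_j^{\mathrm{Min}}(x)  &= \min_{i} P_i(\omega_j \mid x), &
\mu_j^{\mathrm{Max}}(x)  &= \max_{i} P_i(\omega_j \mid x),\\
\mu_j^{\mathrm{Med}}(x)  &= \operatorname*{med}_{i} P_i(\omega_j \mid x), &
\mu_j^{\mathrm{Vote}}(x) &= \sum_{i=1}^{N} \mathbb{1}\!\left[\, j = \arg\max_{k} P_i(\omega_k \mid x) \right].
\end{align*}

MProduct, a modified version of the Product rule intended to moderate the effect of very small posterior estimates, is not reproduced here.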