ON IMPROVING SEMI-SUPERVISED MARGINBOOST INCREMENTALLY USING STRONG UNLABELED DATA

Thanh-Binh Le, Sang-Woon Kim

Abstract

The aim of this paper is to present an incremental learning strategy by which the classification accuracy of the semi-supervised MarginBoost (SSMB) algorithm (d'Alché-Buc et al., 2002) can be improved. In SSMB, a limited number of labeled data and a multitude of unlabeled data are jointly utilized to learn a classification model. However, it is well known that using unlabeled data is not always helpful for semi-supervised learning algorithms. To address this concern for SSMB, in this paper we study a means of selecting only a small, helpful portion of samples from the additional available data. More specifically, SSMB is performed after incrementally reinforcing the given labeled training data with a subset of strong unlabeled data: the classification model is trained in an incremental fashion by employing a small number of "strong" samples selected from the unlabeled data at each iteration. The proposed scheme is evaluated on well-known benchmark databases, including several UCI data sets, under two approaches: dissimilarity-based classification (DBC) (Pekalska and Duin, 2005) as well as conventional feature-based classification. Our experimental results demonstrate that, compared to previous approaches, the scheme achieves better classification accuracy.
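The incremental reinforcement loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: a toy nearest-centroid learner stands in for the boosted SSMB ensemble, and "strong" samples are taken to be the unlabeled points with the largest classification margin, with all function names and the margin-based confidence measure being assumptions for illustration.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    # Toy base learner standing in for the boosted ensemble used in SSMB.
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_with_confidence(model, X):
    classes, centroids = model
    # Distance from each sample to each class centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    idx = d.argmin(axis=1)
    # Confidence proxy: margin between the two nearest centroids.
    sorted_d = np.sort(d, axis=1)
    margin = sorted_d[:, 1] - sorted_d[:, 0]
    return classes[idx], margin

def incremental_reinforce(X_l, y_l, X_u, k=2, iterations=3):
    # Per iteration: train on the labeled pool, pick the k most
    # confidently classified ("strong") unlabeled samples, and move
    # them into the labeled pool with their predicted labels.
    X_l, y_l, X_u = X_l.copy(), y_l.copy(), X_u.copy()
    for _ in range(iterations):
        if len(X_u) == 0:
            break
        model = nearest_centroid_fit(X_l, y_l)
        labels, conf = predict_with_confidence(model, X_u)
        strong = np.argsort(conf)[::-1][:k]
        X_l = np.vstack([X_l, X_u[strong]])
        y_l = np.concatenate([y_l, labels[strong]])
        X_u = np.delete(X_u, strong, axis=0)
    return nearest_centroid_fit(X_l, y_l)
```

Selecting only a few high-margin samples per iteration (rather than all unlabeled data at once) is the point of the scheme: low-confidence samples, whose pseudo-labels are likely wrong, never enter the training pool.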

References

  1. Cesa-Bianchi, N., Gentile, C., and Zaniboni, L. (2006). Incremental algorithms for hierarchical classification. Journal of Machine Learning Research, 7:31-54.
  2. d'Alché-Buc, F., Grandvalet, Y., and Ambroise, C. (2002). Semi-supervised MarginBoost. In Advances in Neural Information Processing Systems, volume 14, pages 553-560. The MIT Press.
  3. Duin, R. P. W., Juszczak, P., de Ridder, D., Paclik, P., Pekalska, E., and Tax, D. M. J. (2004). PRTools 4: A Matlab Toolbox for Pattern Recognition. Delft University of Technology, The Netherlands.
  4. Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition, 2nd edition. Academic Press, San Diego, CA.
  5. Mallapragada, P. K., Jin, R., Jain, A. K., and Liu, Y. (2009). SemiBoost: Boosting for semi-supervised learning. IEEE Trans. Pattern Anal. and Machine Intell., 31(11):2000-2014.
  6. Mason, L., Baxter, J., Bartlett, P. L., and Frean, M. (2000). Functional gradient techniques for combining hypotheses. In Advances in Large Margin Classifiers. The MIT Press.
  7. Pekalska, E. and Duin, R. P. W. (2005). The Dissimilarity Representation for Pattern Recognition: Foundations and Applications. World Scientific, Singapore.


Paper Citation


in Harvard Style

Le T. and Kim S. (2012). ON IMPROVING SEMI-SUPERVISED MARGINBOOST INCREMENTALLY USING STRONG UNLABELED DATA. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-8425-98-0, pages 265-268. DOI: 10.5220/0003721202650268


in Bibtex Style

@conference{icpram12,
author={Thanh-Binh Le and Sang-Woon Kim},
title={ON IMPROVING SEMI-SUPERVISED MARGINBOOST INCREMENTALLY USING STRONG UNLABELED DATA},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2012},
pages={265-268},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003721202650268},
isbn={978-989-8425-98-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - ON IMPROVING SEMI-SUPERVISED MARGINBOOST INCREMENTALLY USING STRONG UNLABELED DATA
SN - 978-989-8425-98-0
AU - Le T.
AU - Kim S.
PY - 2012
SP - 265
EP - 268
DO - 10.5220/0003721202650268