Improved Boosting Performance by Exclusion of Ambiguous Positive Examples

Miroslav Kobetski, Josephine Sullivan


In visual object class recognition it is difficult to densely sample the set of positive examples. Consequently, there will often be sparsely populated areas of the feature space in which uncommon examples are hard to disambiguate from surrounding negatives without overfitting. Boosting in particular struggles to learn optimal decision boundaries in the presence of such hard and ambiguous examples. We propose a two-pass dataset pruning method that identifies ambiguous examples and subjects them to an exclusion function, yielding better decision boundaries for existing boosting algorithms. We also provide an experimental comparison of different boosting algorithms on the VOC2007 dataset, training them with and without our proposed extension. Our exclusion extension improves the performance of all the tested boosting algorithms except TangentBoost, without adding any test-time cost. In our experiments LogitBoost performs best overall and is also significantly improved by our extension. Our results also suggest that outlier exclusion is complementary to positive jittering and hard negative mining.
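The two-pass pruning idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: scikit-learn's `GradientBoostingClassifier` stands in for the boosting variants compared in the paper, and a hard margin threshold stands in for the paper's exclusion function (the names `two_pass_exclusion` and `margin_threshold` are ours).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def two_pass_exclusion(X, y, margin_threshold=0.0, n_estimators=50):
    """Sketch of two-pass pruning: fit, flag low-margin positives, refit.

    X : (n_samples, n_features) array, y : (n_samples,) array with labels in {0, 1}.
    Returns the second-pass classifier and a boolean mask of retained examples.
    """
    # Pass 1: fit a boosting classifier on the full training set.
    clf = GradientBoostingClassifier(n_estimators=n_estimators, random_state=0)
    clf.fit(X, y)

    # Signed margin: positive when the example is on the correct side
    # of the boundary (y is mapped from {0, 1} to {-1, +1}).
    margins = clf.decision_function(X) * (2 * y - 1)

    # Treat low-margin positives as ambiguous and exclude them.
    # (The paper's exclusion function may instead soften their influence.)
    ambiguous = (y == 1) & (margins < margin_threshold)
    keep = ~ambiguous

    # Pass 2: retrain on the pruned training set.
    clf2 = GradientBoostingClassifier(n_estimators=n_estimators, random_state=0)
    clf2.fit(X[keep], y[keep])
    return clf2, keep
```

Because pruning happens entirely at training time, the second-pass classifier has the same structure and evaluation cost as the first, which matches the abstract's claim of no added test-time cost.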


Paper Citation

in Harvard Style

Kobetski M. and Sullivan J. (2013). Improved Boosting Performance by Exclusion of Ambiguous Positive Examples. In Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-8565-41-9, pages 11-21. DOI: 10.5220/0004204400110021

in Bibtex Style

@conference{kobetski2013improved,
author={Miroslav Kobetski and Josephine Sullivan},
title={Improved Boosting Performance by Exclusion of Ambiguous Positive Examples},
booktitle={Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2013},
pages={11-21},
doi={10.5220/0004204400110021},
isbn={978-989-8565-41-9},
}

in EndNote Style

JO - Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Improved Boosting Performance by Exclusion of Ambiguous Positive Examples
SN - 978-989-8565-41-9
AU - Kobetski M.
AU - Sullivan J.
PY - 2013
SP - 11
EP - 21
DO - 10.5220/0004204400110021