AN ADAPTIVE INTERFACE FOR ACTIVE LOCALIZATION

Kenji Okuma, Eric Brochu, David G. Lowe, James J. Little

2011

Abstract

Thanks to large-scale image repositories, vast amounts of data for object recognition are now easily available. However, acquiring training labels for arbitrary objects still requires tedious and expensive human effort. This is particularly true for localization, where humans must not only provide labels, but also training windows in an image. We present an approach for reducing the number of labelled training instances required to train an object classifier and for assisting the user in specifying optimal object location windows. As part of this process, the algorithm performs localization to find bounding windows for training examples that are best aligned with the current classification function, which optimizes learning and reduces human effort. To test this approach, we introduce an active learning extension to a latent SVM learning algorithm. Our user interface for training object detectors employs real-time interaction with a human user. Our active learning system provides a mean performance improvement of 4.5% in average precision over a state-of-the-art detector on the PASCAL Visual Object Classes Challenge 2007 with an average of just 40 minutes of human labelling effort per class.
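
The abstract describes an iterative loop in which the current detector proposes bounding windows on unlabelled images, a human corrects the most informative proposals, and the detector is retrained. The Python sketch below illustrates one way such an active localization loop could be structured; it is not the authors' implementation. The Detector class, its propose_window and fit methods, the ask_user callback, and the uncertainty-sampling query criterion are all illustrative assumptions.

    import random

    class Detector:
        """Placeholder for a window-based object detector (e.g. a latent SVM)."""
        def fit(self, labelled):
            # Retrain on (image, window, label) triples; omitted in this sketch.
            pass
        def propose_window(self, image):
            # Return the highest-scoring window and its classifier score.
            # A real detector would scan the image; here we return dummy values.
            return (0, 0, 100, 100), random.uniform(-1.0, 1.0)

    def active_localization(unlabelled, ask_user, rounds=10, queries_per_round=5):
        """Hypothetical active localization loop.

        unlabelled: list of images without annotations.
        ask_user: callback that shows an image with a proposed window and
                  returns (corrected_window, label) from the human.
        """
        detector = Detector()
        labelled = []
        for _ in range(rounds):
            # Score every unlabelled image with the current detector.
            scored = [(img,) + detector.propose_window(img) for img in unlabelled]
            # Query the images whose scores lie closest to the decision
            # boundary (an uncertainty-sampling heuristic, one of several
            # possible active-learning criteria).
            scored.sort(key=lambda t: abs(t[2]))
            for img, window, _ in scored[:queries_per_round]:
                # The user accepts or adjusts the proposed window and labels it,
                # so the correction effort is reduced to verification.
                corrected_window, label = ask_user(img, window)
                labelled.append((img, corrected_window, label))
                unlabelled.remove(img)
            detector.fit(labelled)
        return detector

Under these assumptions, proposing a window that is already well aligned with the current classification function means the human often only needs to accept or nudge the box rather than draw it from scratch, which is where the reduction in labelling effort comes from.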

References

  1. Abramson, Y. and Freund, Y. (2005). Semi-automatic visual learning (Seville): A tutorial on active learning for visual object recognition. In CVPR.
  2. Collins, B., Deng, J., Li, K., and Fei-Fei, L. (2008). Towards scalable dataset construction: An active learning approach. In ECCV.
  3. Dalal, N. and Triggs, B. (2005). Histograms of oriented gradients for human detection. In CVPR.
  4. Everingham, M., van Gool, L., Williams, C., Winn, J., and Zisserman, A. (2007). The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results. http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html.
  5. Felzenszwalb, P. F., Girshick, R. B., McAllester, D., and Ramanan, D. (2009). Object detection with discriminatively trained part based models. PAMI.
  6. Kapoor, A., Grauman, K., Urtasun, R., and Darrell, T. (2007). Active learning with Gaussian processes for object categorization. In ICCV.
  7. Lazebnik, S., Schmid, C., and Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR.
  8. Lewis, D. D. and Gale, W. A. (1994). A sequential algorithm for training text classifiers. In Proc. of the 17th annual international ACM SIGIR conference on Research and development in information retrieval.
  9. Moosmann, F., Larlus, D., and Jurie, F. (2006). Learning saliency maps for object categorization. In ECCV.
  10. Mutch, J. and Lowe, D. G. (2006). Multiclass object recognition with sparse, localized features. In CVPR.
  11. Qi, G.-J., Hua, X.-S., Rui, Y., Tang, J., and Zhang, H.-J. (2008). Two-dimensional active learning for image classification. In CVPR.
  12. Schölkopf, B. and Smola, A. (2002). Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, Cambridge, MA, USA.
  13. Settles, B. (2010). Active learning literature survey. Technical Report 1648, University of Wisconsin-Madison.
  14. Siddiquie, B. and Gupta, A. (2010). Beyond active noun tagging: Modeling contextual interactions for multiclass active learning. In CVPR.
  15. Tong, S. and Koller, D. (2001). Support vector machine active learning with applications to text classification. JMLR, 2:45-66.
  16. Vijayanarasimhan, S., Jain, P., and Grauman, K. (2010). Far-sighted active learning on a budget for image and video recognition. In CVPR.
  17. Vijayanarasimhan, S. and Kapoor, A. (2010). Visual recognition and detection under bounded computational resources. In CVPR.
  18. Viola, P. and Jones, M. J. (2004). Robust real-time face detection. IJCV, 57(2):137-154.
  19. Zhang, H., Berg, A. C., Maire, M., and Malik, J. (2006). SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. In CVPR.
  20. Zhang, L., Tong, Y., and Ji, Q. (2008). Active image labeling and its application to facial action labeling. In ECCV.


Paper Citation


in Harvard Style

Okuma K., Brochu E., Lowe D. G. and Little J. J. (2011). AN ADAPTIVE INTERFACE FOR ACTIVE LOCALIZATION. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2011), ISBN 978-989-8425-47-8, pages 248-258. DOI: 10.5220/0003317302480258


in Bibtex Style

@conference{visapp11,
author={Kenji Okuma and Eric Brochu and David G. Lowe and James J. Little},
title={AN ADAPTIVE INTERFACE FOR ACTIVE LOCALIZATION},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2011)},
year={2011},
pages={248-258},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003317302480258},
isbn={978-989-8425-47-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2011)
TI - AN ADAPTIVE INTERFACE FOR ACTIVE LOCALIZATION
SN - 978-989-8425-47-8
AU - Okuma K.
AU - Brochu E.
AU - Lowe D. G.
AU - Little J. J.
PY - 2011
SP - 248
EP - 258
DO - 10.5220/0003317302480258