THE COMBINATION OF HMAX AND HOGS IN AN ATTENTION GUIDED FRAMEWORK FOR OBJECT LOCALIZATION

Tobias Brosch; Heiko Neumann

doi:10.5220/0003708702810288

2012

Abstract

Object detection and localization is a challenging task. Among several approaches, hierarchical methods of feature-based object recognition have recently been developed and have demonstrated high performance. Inspired by knowledge about the architecture and function of the primate visual system, the computational HMAX model has been proposed. At the same time, robust visual object recognition has been proposed using feature distributions, e.g. histograms of oriented gradients (HOGs). Since both models build upon an edge representation of the input image, the question arises whether one kind of approach might be superior to the other. Introducing a new biologically inspired, attention-steered processing framework, we demonstrate that the combination of both approaches yields the best results.

Paper Citation

in Harvard Style

Brosch T. and Neumann H. (2012). THE COMBINATION OF HMAX AND HOGS IN AN ATTENTION GUIDED FRAMEWORK FOR OBJECT LOCALIZATION. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM, ISBN 978-989-8425-99-7, pages 281-288. DOI: 10.5220/0003708702810288

in Bibtex Style

@conference{icpram12,
author={Tobias Brosch and Heiko Neumann},
title={THE COMBINATION OF HMAX AND HOGS IN AN ATTENTION GUIDED FRAMEWORK FOR OBJECT LOCALIZATION},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM},
year={2012},
pages={281-288},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003708702810288},
isbn={978-989-8425-99-7},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM
TI - THE COMBINATION OF HMAX AND HOGS IN AN ATTENTION GUIDED FRAMEWORK FOR OBJECT LOCALIZATION
SN - 978-989-8425-99-7
AU - Brosch T.
AU - Neumann H.
PY - 2012
SP - 281
EP - 288
DO - 10.5220/0003708702810288