Efficient Implementation of a Recognition System using the Cortex Ventral Stream Model

Ahmad Bitar, Mohammad M. Mansour, Ali Chehab

Abstract

This paper proposes an efficient implementation of a recognition system based on the original HMAX model of the visual cortex. Several optimizations aimed at increasing accuracy are introduced at the S1, C1, and S2 layers of the model. At layer S1, unimportant information such as illumination and expression variations is eliminated from the images, and each image is then convolved with 64 separable Gabor filters in the spatial domain. At layer C1, the minimum scale values are embedded into the maximum ones using the additive embedding space. At layer S2, the prototypes are generated more efficiently using the Partitioning Around Medoids (PAM) clustering algorithm. The impact of these optimizations on accuracy and computational complexity is evaluated on the Caltech101 database and compared with the baseline performance using support vector machine (SVM) and nearest neighbor (NN) classifiers. The results show that our model improves accuracy at the S1 layer by more than 10% while also reducing computational complexity. The approximations at the C1 and S2 layers each yield a slight increase in accuracy.
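
To make the S2 step more concrete, the following is a minimal sketch of PAM clustering in plain Python, using a greedy BUILD phase followed by best-improvement SWAP. It is an illustration only, not the authors' implementation: the names `pam` and `total_cost` are hypothetical, Euclidean distance is an assumption, and the toy 2-D points merely stand in for the patches from which prototypes would be selected.

```python
import math


def total_cost(points, medoids, dist):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, dist=math.dist):
    """Partitioning Around Medoids: greedy BUILD, then best-improvement SWAP.

    Returns the sorted indices of the k medoids in `points`.
    Euclidean distance (math.dist) is an assumed default.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(points)")

    # BUILD: add, one at a time, the point that lowers the total cost most.
    medoids = []
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        best = min(candidates,
                   key=lambda i: total_cost(points, medoids + [i], dist))
        medoids.append(best)

    # SWAP: apply the single best medoid/non-medoid exchange until none helps.
    cost = total_cost(points, medoids, dist)
    while True:
        best_cost, best_swap = cost, None
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                trial_cost = total_cost(points, trial, dist)
                if trial_cost < best_cost:
                    best_cost, best_swap = trial_cost, trial
        if best_swap is None:
            return sorted(medoids)
        medoids, cost = best_swap, best_cost


# Toy data (hypothetical): two well-separated groups of 2-D "patches".
patches = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
medoid_ids = pam(patches, k=2)
prototypes = [patches[i] for i in medoid_ids]
```

Unlike k-means centroids, the medoids returned here are always actual members of the input set, so each prototype corresponds to a real sample. This naive version recomputes the full cost for every trial swap and is only practical for small inputs.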


Paper Citation


in Harvard Style

Bitar A., Mansour M. and Chehab A. (2015). Efficient Implementation of a Recognition System using the Cortex Ventral Stream Model. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015) ISBN 978-989-758-090-1, pages 138-147. DOI: 10.5220/0005308901380147


in Bibtex Style

@conference{visapp15,
author={Ahmad Bitar and Mohammad M. Mansour and Ali Chehab},
title={Efficient Implementation of a Recognition System using the Cortex Ventral Stream Model},
booktitle={Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)},
year={2015},
pages={138-147},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005308901380147},
isbn={978-989-758-090-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)
TI - Efficient Implementation of a Recognition System using the Cortex Ventral Stream Model
SN - 978-989-758-090-1
AU - Bitar A.
AU - Mansour M.
AU - Chehab A.
PY - 2015
SP - 138
EP - 147
DO - 10.5220/0005308901380147