Semantic Place Recognition based on Deep Belief Networks and Tiny Images

Ahmad Hasasneh, Emmanuelle Frenoux, Philippe Tarroux

2012

Abstract

This paper presents a novel approach to robot semantic place recognition (SPR) based on Restricted Boltzmann Machines (RBMs) and the direct use of tiny images. RBMs can code an image as a superposition of a small number of features drawn from a larger alphabet. Repeating this process in a deep architecture yields an efficient sparse representation of the initial data in the feature space. A complex classification problem in the input space is thus transformed into an easier one in the feature space. In this article, we show that SPR can therefore be achieved using tiny images instead of conventional Bag-of-Words (BoW) methods. After appropriate coding, a softmax regression in the feature space suffices to compute the probability of being in a given place given the input image.
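
The pipeline described in the abstract (stacked RBM layers coding a tiny image, followed by softmax regression over places) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. The 32x24 image size, the layer widths, the number of places, and the random untrained weights are all assumptions made for illustration, and the function names are hypothetical. Training (for example, by contrastive divergence) is omitted.

```python
import math
import random


def sigmoid(x):
    """Logistic function used for RBM hidden-unit activation probabilities."""
    return 1.0 / (1.0 + math.exp(-x))


def rbm_hidden_probs(v, W, c):
    """Return P(h_j = 1 | v) = sigmoid(c_j + sum_i W[j][i] * v[i]) for one RBM layer."""
    return [sigmoid(c_j + sum(w * x for w, x in zip(row, v)))
            for row, c_j in zip(W, c)]


def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]


def place_probabilities(image, layers, U, d):
    """Code the image through the stacked layers, then apply softmax regression."""
    h = image
    for W, c in layers:
        h = rbm_hidden_probs(h, W, c)
    logits = [d_k + sum(u * x for u, x in zip(row, h)) for row, d_k in zip(U, d)]
    return softmax(logits)


rng = random.Random(0)


def rand_matrix(rows, cols):
    """Small random weights; a stand-in for parameters that would be learned."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


# Assumed sizes: a 32x24 grayscale tiny image, two hidden layers, five places.
n_pixels, n_h1, n_h2, n_places = 32 * 24, 64, 32, 5
layers = [
    (rand_matrix(n_h1, n_pixels), [0.0] * n_h1),
    (rand_matrix(n_h2, n_h1), [0.0] * n_h2),
]
U = rand_matrix(n_places, n_h2)   # softmax regression weights
d = [0.0] * n_places              # softmax regression biases

image = [rng.random() for _ in range(n_pixels)]  # synthetic pixel intensities in [0, 1)
probs = place_probabilities(image, layers, U, d)
print(probs)
```

With untrained weights the resulting distribution is close to uniform. The point of the sketch is only the data flow: pixels, then successive feature layers, then one probability per place.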



Paper Citation


in Harvard Style

Hasasneh A., Frenoux E. and Tarroux P. (2012). Semantic Place Recognition based on Deep Belief Networks and Tiny Images. In Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO, ISBN 978-989-8565-22-8, pages 236-241. DOI: 10.5220/0004029902360241


in Bibtex Style

@conference{icinco12,
author={Ahmad Hasasneh and Emmanuelle Frenoux and Philippe Tarroux},
title={Semantic Place Recognition based on Deep Belief Networks and Tiny Images},
booktitle={Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2012},
pages={236-241},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004029902360241},
isbn={978-989-8565-22-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - Semantic Place Recognition based on Deep Belief Networks and Tiny Images
SN - 978-989-8565-22-8
AU - Hasasneh A.
AU - Frenoux E.
AU - Tarroux P.
PY - 2012
SP - 236
EP - 241
DO - 10.5220/0004029902360241