3D Object Categorization and Recognition based on Deep Belief Networks and Point Clouds

Fatima Zahra Ouadiay, Nabila Zrira, El Houssine Bouyakhf, M. Majid Himmi

2016

Abstract

3D object recognition and categorization are important problems in computer vision, with applications in areas such as robotics, aerospace, the automotive industry, and the food industry. Our contribution focuses on recognizing and categorizing real 3D objects using Deep Belief Networks (DBNs). We extract descriptors at keypoints of the point cloud, then train a DBN on the resulting descriptor vectors. We evaluate the approach on two datasets: the Washington RGB-D Object Dataset and our own dataset of real 3D objects, which was acquired under the same conditions as the Washington dataset. With this approach, a DBN can be designed to process high-level features for real 3D object recognition and categorization. Experimental results on the standard dataset show that our method outperforms state-of-the-art methods for 3D object recognition and categorization.
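
As a rough illustration of the DBN training step mentioned in the abstract, the sketch below shows greedy layer-wise pretraining of stacked restricted Boltzmann machines with one-step contrastive divergence. It is not the authors' implementation: the names `RBM`, `pretrain_dbn`, and `descriptors`, the layer sizes, the learning rate, and the random input vectors are all assumptions made for illustration, and the supervised fine-tuning and classification stages are omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights, zero biases (a common default, assumed here).
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(b + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j, b in enumerate(self.b_h)]

    def visible_probs(self, h):
        return [sigmoid(b + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i, b in enumerate(self.b_v)]

    def cd1_update(self, v0, lr):
        # Positive phase: hidden activations driven by the data vector.
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_v[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_h[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, hidden_sizes, epochs=5, lr=0.05, seed=0):
    """Train one RBM per layer; each layer's hidden probabilities feed the next."""
    rng = random.Random(seed)
    layers, inputs = [], data
    for n_hidden in hidden_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v, lr)
        layers.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return layers, inputs


# Synthetic stand-in for descriptor vectors scaled to [0, 1]; not real data.
data_rng = random.Random(42)
descriptors = [[data_rng.random() for _ in range(12)] for _ in range(20)]
dbn, features = pretrain_dbn(descriptors, hidden_sizes=[8, 4])
print(len(dbn), len(features), len(features[0]))
```

In a full pipeline, the top-layer `features` would typically be passed to a supervised classifier, and the whole stack fine-tuned with labels; both steps are left out here to keep the sketch short.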



Paper Citation


in Harvard Style

Ouadiay F., Zrira N., Bouyakhf E. and Himmi M. (2016). 3D Object Categorization and Recognition based on Deep Belief Networks and Point Clouds. In Proceedings of the 13th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO, ISBN 978-989-758-198-4, pages 311-318. DOI: 10.5220/0005979503110318


in Bibtex Style

@conference{icinco16,
author={Fatima Zahra Ouadiay and Nabila Zrira and El Houssine Bouyakhf and M. Majid Himmi},
title={3D Object Categorization and Recognition based on Deep Belief Networks and Point Clouds},
booktitle={Proceedings of the 13th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2016},
pages={311-318},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005979503110318},
isbn={978-989-758-198-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - 3D Object Categorization and Recognition based on Deep Belief Networks and Point Clouds
SN - 978-989-758-198-4
AU - Ouadiay F.
AU - Zrira N.
AU - Bouyakhf E.
AU - Himmi M.
PY - 2016
SP - 311
EP - 318
DO - 10.5220/0005979503110318