In general, PCD training performs better than CD
training, and FEPCD outperforms PCD training. This
result is expected, since FEPCD uses free energy as
a criterion for the goodness of a chain in order to
obtain elite samples from the generative model, and
these samples allow the gradient of the log
probability of the training data to be computed more
accurately. This gain in accuracy comes at a cost:
the computational complexity of FEPCD is higher, and
it takes relatively longer to train than the other
two methods. Our next goal is to optimize FEPCD in
order to reduce its computational complexity.
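To make the role of free energy concrete, the following minimal sketch ranks a set of persistent chains by the free energy of a binary RBM, F(v) = -sum_i b_i v_i - sum_j log(1 + exp(c_j + sum_i W_ij v_i)), and keeps the lowest-energy ones as elite samples. It is an illustration under stated assumptions, not the exact FEPCD procedure: the helper `select_elite_chains`, the `elite_fraction` parameter, and the toy weights are hypothetical, and the way elite samples enter the gradient estimate is not shown.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def free_energy(v, W, b, c):
    """Free energy of a binary RBM for one visible vector v.

    W[i][j] couples visible unit i to hidden unit j,
    b holds the visible biases and c the hidden biases.
    """
    visible_term = sum(bi * vi for bi, vi in zip(b, v))
    hidden_term = 0.0
    for j, cj in enumerate(c):
        activation = cj + sum(W[i][j] * v[i] for i in range(len(v)))
        hidden_term += softplus(activation)
    return -visible_term - hidden_term


def select_elite_chains(chains, W, b, c, elite_fraction=0.5):
    """Hypothetical helper: keep the chains with the lowest free energy."""
    n_keep = max(1, int(len(chains) * elite_fraction))
    ranked = sorted(chains, key=lambda v: free_energy(v, W, b, c))
    return ranked[:n_keep]


# Toy setup (assumed values): 3 visible units, 2 hidden units, 4 chains.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
b = [0.0, 0.1, -0.1]
c = [0.2, -0.3]
chains = [[1, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 0]]

elite = select_elite_chains(chains, W, b, c, elite_fraction=0.5)
```

Evaluating and sorting the free energy of every chain at each update is extra work on top of plain PCD, which is one source of the additional training cost noted above.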
7 CONCLUSIONS
In this work, we focused on 3D object categorization
using geometric features extracted from the Viewpoint
Feature Histogram (VFH) descriptor and learned with
both Generative and Discriminative Deep Belief
Network (GDBN/DDBN) architectures. A GDBN is a
probabilistic model composed of several Restricted
Boltzmann Machines (RBMs) that are trained
sequentially. A DDBN, on the other hand, is built
from the Discriminative Restricted Boltzmann Machine
(DRBM), which is based on the RBM and the joint
distribution model. The experimental results obtained
with the DDBN are encouraging, especially since our
approach is able to recognize 3D objects under
different views. In future work, we will attempt to
embed our algorithm in our TurtleBot2 robot in order
to grasp real-world objects. Moreover, we will use a
hybrid deep architecture that combines the advantages
of generative and discriminative models.
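As an illustration of how a DRBM turns the joint model into a classifier, the sketch below computes the class posterior p(y | x) using the standard closed form for binary hidden units, in which the score of class y is d_y + sum_j softplus(c_j + U_jy + sum_i W_ji x_i) and the scores are normalized with a softmax. The parameter names `W`, `U`, `c`, `d` and the function `drbm_class_posterior` are assumptions made for this example and are not taken from our implementation.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def drbm_class_posterior(x, W, U, c, d):
    """Class posterior p(y | x) of a DRBM with binary hidden units.

    W[j][i] couples hidden unit j to input i,
    U[j][y] couples hidden unit j to class y,
    c holds the hidden biases and d the class biases.
    """
    scores = []
    for y in range(len(d)):
        score = d[y]
        for j in range(len(c)):
            pre_activation = c[j] + U[j][y] + sum(
                W[j][i] * x[i] for i in range(len(x))
            )
            score += softplus(pre_activation)
        scores.append(score)
    # Softmax over class scores, shifted by the maximum for stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy setup (assumed values): 3 inputs, 2 hidden units, 2 classes.
W = [[0.4, -0.1, 0.2], [-0.3, 0.5, 0.1]]
U = [[1.0, -1.0], [-0.5, 0.5]]
c = [0.0, 0.1]
d = [0.0, 0.0]
posterior = drbm_class_posterior([1, 0, 1], W, U, c, d)
```

Because the hidden units can be summed out analytically, the posterior is exact and needs no sampling at prediction time.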
VISAPP 2017 - International Conference on Computer Vision Theory and Applications