original approach in (Wohlkinger et al., 2012), which
is, to the best of our knowledge, the best result on this
dataset. Neural network-based approaches, like the
one we used here, are, however, much faster. Also,
the PointNet architecture has mostly been used to
process full 3D data, whereas here we apply it to
partial scans. On the Cat200 subset of the 3DNet
dataset, which is significantly more computationally
demanding, we could not reach the accuracy reported in
(Wohlkinger et al., 2012), arguably due to GPU memory
limitations. More generally, it would be interesting to
see results analogous to those obtained on the ImageNet
classification challenge for a similar dataset of 3D
objects or scans from depth sensors. We are aware of
several related challenges on the ShapeNet collection
(Chang et al., 2015), but none for classification.
There are, of course, similar approaches to object
recognition and detection in scenes that are also based
on generated views of 3D object models. However, they
mostly rely on volumetric or multi-view representations,
which can arguably be less practical in certain
applications. Also, here we were interested in
evaluating performance on the specialized point cloud
classification problem: if a classification method
performs well, especially one based on neural networks,
it can serve as a basis for the more interesting
problem of object detection (recognition) in more
complex scenes. For example, such approaches have been
used for object detection in both RGB and RGB-D images;
however, that is still a very active area of research.
ACKNOWLEDGEMENTS
This work has been fully supported by the Croatian
Science Foundation under the project number
IP-2014-09-3155.
REFERENCES
Bronstein, M. M., Bruna, J., LeCun, Y., Szlam, A., and
Vandergheynst, P. (2017). Geometric deep learning: going
beyond euclidean data. IEEE Signal Processing Magazine,
34(4):18–42.
Cai, Z. (2017). Feature Learning for RGB-D Data. PhD
thesis, University of Sheffield.
Carvalho, L. E. and Wangenheim, A. (2017). Literature
review for 3d object classification/recognition.
Chang, A. X., Funkhouser, T., Guibas, L., Hanrahan, P.,
Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su,
H., et al. (2015). Shapenet: An information-rich 3d
model repository. arXiv preprint arXiv:1512.03012.
Defferrard, M., Bresson, X., and Vandergheynst, P. (2016).
Convolutional neural networks on graphs with fast
localized spectral filtering. In Advances in Neural
Information Processing Systems, pages 3844–3852.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei,
L. (2009). Imagenet: A large-scale hierarchical image
database. In Computer Vision and Pattern Recognition,
2009. CVPR 2009. IEEE Conference on, pages
248–255. IEEE.
Engelcke, M., Rao, D., Zeng Wang, D., Hay Tong, C.,
and Posner, I. (2017). Vote3Deep: Fast Object
Detection in 3D Point Clouds Using Efficient
Convolutional Neural Networks. In Proceedings of the IEEE
International Conference on Robotics and Automation
(ICRA).
Firman, M. (2016). Rgbd datasets: Past, present and future.
In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition Workshops, pages
19–31.
Hua, B.-S., Tran, M.-K., and Yeung, S.-K. (2017).
Pointwise convolutional neural network. arXiv preprint
arXiv:1712.05245.
Kingma, D. P. and Ba, J. (2014). Adam: A method for
stochastic optimization. arXiv preprint arXiv:1412.6980.
Kipf, T. N. and Welling, M. (2016). Semi-supervised
classification with graph convolutional networks. arXiv
preprint arXiv:1609.02907.
Klokov, R. and Lempitsky, V. S. (2017). Escape from
cells: Deep kd-networks for the recognition of 3d
point cloud models. CoRR, abs/1704.01222.
Ku, J., Harakeh, A., and Waslander, S. L. (2018). In defense
of classical image processing: Fast depth completion
on the cpu. arXiv preprint arXiv:1802.00036.
Lai, K., Bo, L., Ren, X., and Fox, D. (2011). A large-scale
hierarchical multi-view rgb-d object dataset. In
Robotics and Automation (ICRA), 2011 IEEE International
Conference on, pages 1817–1824. IEEE.
Li, Y., Pirk, S., Su, H., Qi, C. R., and Guibas, L. J. (2016).
Fpnn: Field probing neural networks for 3d data. In
Advances in Neural Information Processing Systems,
pages 307–315.
Maaten, L. v. d. and Hinton, G. (2008). Visualizing data
using t-sne. Journal of machine learning research,
9(Nov):2579–2605.
Maturana, D. and Scherer, S. (2015). Voxnet: A 3d
convolutional neural network for real-time object
recognition. In Intelligent Robots and Systems (IROS), 2015
IEEE/RSJ International Conference on, pages
922–928. IEEE.
Monti, F., Boscaini, D., Masci, J., Rodola, E., Svoboda, J.,
and Bronstein, M. M. (2017). Geometric deep
learning on graphs and manifolds using mixture model
cnns. In Proc. CVPR, volume 1, page 3.
Qi, C. R., Su, H., Mo, K., and Guibas, L. J. (2017a).
Pointnet: Deep learning on point sets for 3d classification
and segmentation. Proc. Computer Vision and Pattern
Recognition (CVPR), IEEE.
Qi, C. R., Su, H., Nießner, M., Dai, A., Yan, M., and
Guibas, L. J. (2016). Volumetric and multi-view cnns for
object classification on 3d data. In Proceedings of the
Experimental Evaluation of Point Cloud Classification using the PointNet Neural Network