can be a useful addition to enhance current state-of-
the-art 3D point cloud object detection methods, as
shown with the improvements made on the state-of-
the-art mean average precision values @ 0.25 and
@ 0.5 for some classes. However, the experimental
results also showed that the iterative sampling method
produced inconsistent performance across the classes.
This indicates the limitations of applying our
unsupervised iterative sampling and clustering framework
to a dataset with varied classes and object shapes and
sizes, and suggests that the framework may be best suited
to applications with primitive shapes or similar point
cloud scenes. In future work, we plan to extend and
fine-tune the framework to achieve better results on
other common benchmark datasets.
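As an illustration of what the @ 0.25 and @ 0.5 thresholds mean, the sketch below computes the 3D intersection over union (IoU) between a predicted and a ground-truth box and checks it against each threshold. It is a minimal sketch, not the evaluation code used in this work: it assumes axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax), whereas benchmark evaluation typically uses oriented boxes, and the names `iou_3d` and `is_true_positive` are hypothetical.

```python
from typing import Sequence

# Assumed box layout (hypothetical): (xmin, ymin, zmin, xmax, ymax, zmax)
Box = Sequence[float]


def box_volume(box: Box) -> float:
    """Volume of an axis-aligned box; degenerate boxes have volume 0."""
    volume = 1.0
    for axis in range(3):
        volume *= max(0.0, box[axis + 3] - box[axis])
    return volume


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    intersection = 1.0
    for axis in range(3):
        low = max(a[axis], b[axis])
        high = min(a[axis + 3], b[axis + 3])
        if high <= low:
            return 0.0  # no overlap along this axis
        intersection *= high - low
    union = box_volume(a) + box_volume(b) - intersection
    return intersection / union if union > 0.0 else 0.0


def is_true_positive(pred: Box, gt: Box, threshold: float) -> bool:
    """A prediction matches a ground-truth box if IoU >= threshold."""
    return iou_3d(pred, gt) >= threshold


if __name__ == "__main__":
    pred = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
    gt = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)
    print(is_true_positive(pred, gt, 0.25))  # match at the looser threshold
    print(is_true_positive(pred, gt, 0.5))   # no match at the stricter one
```

In this example the two boxes overlap in a volume of 4 out of a union of 12, so the IoU is 1/3: the prediction counts as a detection at the 0.25 threshold but not at 0.5, which is why results at the two thresholds can differ noticeably for the same class.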
The results show that Enhanced VoteNet and Enhanced
MLCVNet achieved high object accuracy for both training
and testing on the benchmark SUN RGB-D dataset, with all
runs yielding object accuracy greater than 99.1%, which
is promising.
The objective of this work is to evaluate the above
dataset and methods against key considerations for
industrial applications, which has not previously been
done for raw point cloud object detection methods.
VoteNet and MLCVNet, which were implemented on
the 3D point cloud dataset, show promising results in
terms of accuracy, computation, and real-time capability
for industrial applications. However, one additional
consideration that needs further evaluation is the
process of updating the models for new object classes or
for changes in ambient conditions or infrastructure in an
industrial setting, as this is important in modern
real-world applications.
REFERENCES
Saifullahi Aminu Bello, Shangshu Yu, and Cheng
Wang. Review: Deep learning on 3d point clouds. 2020.
Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh
Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu
Pan, Giancarlo Baldan, and Oscar Beijbom. nuscenes: A
multimodal dataset for autonomous driving. 2020.
B. Calli, A. Singh, A. Walsman, S. Srinivasa, P. Abbeel,
and A. M. Dollar. The ycb object and model set: Towards
common benchmarks for manipulation research. pages
510–517, 2015.
A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan,
Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su, J.
Xiao, L. Yi, and F. Yu. Shapenet: An information-rich 3d
model repository. 2015.
Angela Dai, Angel X. Chang, Manolis Savva, Maciej
Halber, Thomas Funkhouser, and Matthias Nießner. Scan-
net: Richly-annotated 3d reconstructions of indoor scenes.
2017.
Bertram Drost, Markus Ulrich, Paul Bergmann, Philipp
Hartinger, and Carsten Steger. Introducing mvtec itodd - a
dataset for 3d object recognition in industry. Oct 2017.
Andreas Geiger, Philip Lenz, Christoph Stiller, and
Raquel Urtasun. Vision meets robotics: The kitti dataset.
2013.
Timo Hackel, N. Savinov, L. Ladicky, Jan D. Wegner,
K. Schindler, and M. Pollefeys. Semantic3d.net: A new
large-scale point cloud classification benchmark. volume
IV-1-W1, pages 91–98, 2017.
S. Hinterstoisser, V. Lepetit, S. Ilic, S. Holzer, G. Bradski,
K. Konolige, and N. Navab. Model based training, detection
and pose estimation of texture-less 3d objects in
heavily cluttered scenes. 2012.
T. Hodaň, P. Haluza, Š. Obdržálek, J. Matas, M.
Lourakis, and X. Zabulis. T-less: An rgb-d dataset for
6d pose estimation of texture-less objects. 2017.
R. Larsen, H. Aanaes, and S. Gudmundsson. Fusion
of stereo vision and time-of-flight imaging for improved 3d
estimation. volume 1, pages 1–9, 2019.
Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Xin-
han Di, and Baoquan Chen. Pointcnn: Convolution on x-
transformed points. 2018.
Yongcheng Liu, Bin Fan, Shiming Xiang, and Chun-
hong Pan. Relation-shape convolutional neural network for
point cloud analysis. 2019.
D. Maturana and S. Scherer. Voxnet: A 3d convo-
lutional neural network for real-time object recognition.
pages 922–928, 2015.
Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J.
Guibas. Pointnet: Deep learning on point sets for 3d classi-
fication and segmentation. Apr 2017.
Charles R. Qi, Li Yi, Hao Su, and Leonidas J. Guibas.
Pointnet++: Deep hierarchical feature learning on point sets
in metric space. 2017.
Charles R. Qi, Or Litany, Kaiming He, and Leonidas J.
Guibas. Deep hough voting for 3d object detection in point
clouds. 2019.
Shuran Song, Samuel P. Lichtenberg, and Jianxiong
Xiao. Sun rgb-d: A rgb-d scene understanding benchmark
suite. pages 567–576, 2015.
Gusi Te, Wei Hu, Zongming Guo, and Amin Zheng.
Rgcnn: Regularized graph cnn for point cloud segmenta-
tion. 2018.
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma,
Michael M. Bronstein, and Justin M. Solomon. Dynamic
graph cnn for learning on point clouds. Jun 2019.
Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang,
and J. Xiao. 3d shapenets: A deep representation for volu-
metric shapes. 2015.
Qian Xie, Yu-Kun Lai, Jing Wu, Zhoutao Wang, Yiming
Zhang, Kai Xu, and Jun Wang. MLCVNet: Multi-level
context votenet for 3d object detection. 2020.
Ze Liu, Zheng Zhang, Yue Cao, Han Hu, and Xin Tong.
Group-free 3d object detection via transformers. 2021.
Enhanced 3D Point Cloud Object Detection with Iterative Sampling and Clustering Algorithms