is different from approaches that use the raw point clouds directly as input and may explain why they perform better overall on the KITTI benchmark. Nevertheless, relying on those raw values may become a problem when dealing with sensor noise or differences between sensor models. With the rise of new datasets, it will hopefully become clear how robust these methods are compared to BEV-based approaches.
In addition, this work visualizes how the distance to the sensor influences an object's representation in the point cloud. Most convolutional neural networks rely on the assumption that features are consistent over the full range of the image, which allows a single filter to extract features from the entire feature map. For LiDAR data this is not the case, as shown by analyzing point clouds and objects at various distances. This observation is used to change the detection pipeline so that one detector handles objects in the 0-35 meter range and a separate detector handles objects in the 35-70 meter range. These changes lead to improvements, most notably 2.7% AP on the 0-35 meter range for the easy category and 5.0% AP on the 35-70 meter range for the hard category, at a 70% IoU threshold.
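To illustrate how such a range-dependent pipeline can be organized, the sketch below splits an incoming point cloud by horizontal distance to the sensor and dispatches each part to its own detector. This is a minimal Python sketch, not the implementation used in this work: the detector objects, their detect() interface, and the helper names are assumptions; only the 0-35 m and 35-70 m split follows the ranges reported above.

```python
import numpy as np

# Range boundaries following the split used in this work (meters from the sensor).
NEAR_RANGE = (0.0, 35.0)
FAR_RANGE = (35.0, 70.0)

def split_by_range(points, near=NEAR_RANGE, far=FAR_RANGE):
    """Split an (N, 4) LiDAR point cloud (x, y, z, intensity) by
    horizontal distance to the sensor."""
    dist = np.linalg.norm(points[:, :2], axis=1)
    near_mask = (dist >= near[0]) & (dist < near[1])
    far_mask = (dist >= far[0]) & (dist < far[1])
    return points[near_mask], points[far_mask]

def detect_objects(points, near_detector, far_detector):
    """Run a separate detector on each distance range and merge the results.

    `near_detector` and `far_detector` are hypothetical objects assumed to
    expose a detect(points) method returning a list of 3D boxes; their
    internals (e.g. range-specific feature extractors) are not shown here."""
    near_points, far_points = split_by_range(points)
    boxes = []
    boxes.extend(near_detector.detect(near_points))
    boxes.extend(far_detector.detect(far_points))
    return boxes
```

Keeping the two detectors separate lets each learn filters suited to the point density typical of its range, which is the motivation behind the split.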