7 CONCLUSION AND FUTURE WORK
The main objective of this study was to implement a method
for object trajectory estimation using the YOLOv3 object
detector and a 2D laser rangefinder (LRF). We analysed the
YOLOv3 and YOLOv3-tiny deep learning models on various
classes from datasets including Pascal VOC and Google Open
Images. We used additional pre-trained convolutional weights
to improve the models' ability to detect objects. Combining
object detection with the 2D LIDAR supports the trajectory
estimation of an object. In addition, we tried to plot the
trajectory using the distances from the depth camera and the
CMOS camera together with the LRF angles. We also estimated
the trajectory of an object using a mobile robot in a
controlled fashion. Furthermore, we used polynomial
regression to smooth the trajectory path, but only in
suitable cases. The experiments show that our approach is
feasible and robust for obtaining the object location and
then drawing the trajectory.
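To make the trajectory plotting and smoothing steps concrete, the following is a minimal sketch and not the implementation used in the experiments. It assumes each detection yields a distance in metres and an LRF bearing in degrees, converts these to planar points, and fits a low-degree polynomial y = f(x) by least squares. The function names (polar_to_cartesian, polyfit, smooth_trajectory), the units and the default degree are illustrative assumptions.

```python
import math


def polar_to_cartesian(ranges_m, angles_deg):
    """Convert (distance, bearing) pairs to x/y points in the sensor frame."""
    points = []
    for r, a in zip(ranges_m, angles_deg):
        theta = math.radians(a)
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients c[0..degree] for y = sum(c[k] * x**k).
    """
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] of the normal equations.
    m = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("not enough distinct x values for this degree")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            f = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= f * m[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = s / m[i][i]
    return coeffs


def smooth_trajectory(points, degree=2):
    """Replace each y by the fitted polynomial value at the same x."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    c = polyfit(xs, ys, degree)
    return [(x, sum(ck * x ** k for k, ck in enumerate(c))) for x in xs]


if __name__ == "__main__":
    # Hypothetical measurements: distances (m) and LRF bearings (deg).
    raw = polar_to_cartesian([2.0, 2.1, 2.3, 2.6, 3.0],
                             [80.0, 70.0, 60.0, 50.0, 40.0])
    for x, y in smooth_trajectory(raw, degree=2):
        print(f"{x:.3f} {y:.3f}")
```

Fitting y as a function of x only makes sense when the path does not double back along x, which is one reading of why smoothing applies only in suitable cases; a looping path would need a different parameterisation, for example fitting x and y separately against time.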
As possible future work, we plan to investigate different
algorithms for object trajectory prediction, as well as
methods for object tracking using mobile robot control.
ROBOVIS 2020 - International Conference on Robotics, Computer Vision and Intelligent Systems