Cai, Z. and Vasconcelos, N. (2019). Cascade r-cnn: High
quality object detection and instance segmentation.
arXiv preprint arXiv:1906.09756.
Edward, J., Wannasuphoprasit, W., and Peshkin, M. (1999).
Cobots: Robots for collaboration with human operators.
Esteban, C. H. and Schmitt, F. (2004). Silhouette and stereo
fusion for 3d object modeling. Computer Vision and
Image Understanding.
Furukawa, Y. and Ponce, J. (2006). Carved visual hulls for
image-based modeling. In European Conference on
Computer Vision. Springer.
Geiger, A., Lenz, P., and Urtasun, R. (2012). Are we ready
for autonomous driving? the kitti vision benchmark
suite. In CVPR. IEEE.
He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017).
Mask r-cnn. In Proceedings of the IEEE international
conference on computer vision.
Howard, A., Sandler, M., Chu, G., Chen, L.-C., Chen, B.,
Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V.,
et al. (2019). Searching for mobilenetv3. In Proceedings
of the IEEE International Conference on Computer
Vision.
Joo, H., Liu, H., Tan, L., Gui, L., Nabbe, B., Matthews,
I., Kanade, T., Nobuhara, S., and Sheikh, Y. (2015).
Panoptic studio: A massively multiview system for social
motion capture. In The IEEE International Conference
on Computer Vision (ICCV).
Joo, H., Simon, T., Li, X., Liu, H., Tan, L., Gui, L.,
Banerjee, S., Godisart, T. S., Nabbe, B., Matthews,
I., Kanade, T., Nobuhara, S., and Sheikh, Y. (2017).
Panoptic studio: A massively multiview system for social
interaction capture. IEEE Transactions on Pattern
Analysis and Machine Intelligence.
Laurentini, A. (1994). The visual hull concept for
silhouette-based image understanding. IEEE Transactions
on pattern analysis and machine intelligence.
Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Dollár, P.
(2017). Focal loss for dense object detection. In
Proceedings of the IEEE international conference on
computer vision.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S.,
Fu, C.-Y., and Berg, A. C. (2016). Ssd: Single shot
multibox detector. In ECCV. Springer.
Matusik, W., Buehler, C., Raskar, R., Gortler, S. J., and
McMillan, L. (2000). Image-based visual hulls. In
Proceedings of the 27th annual conference on Computer
graphics and interactive techniques.
Mohammed, A., Schmidt, B., and Wang, L. (2017). Active
collision avoidance for human–robot collaboration
driven by vision sensors. International Journal
of Computer Integrated Manufacturing.
Navarro, S. E., Marufo, M., Ding, Y., Puls, S., Göger, D.,
Hein, B., and Wörn, H. (2013). Methods for safe
human-robot-interaction using capacitive tactile proximity
sensors. In IEEE/RSJ International Conference
on Intelligent Robots and Systems. IEEE.
Nie, B. X., Wei, P., and Zhu, S.-C. (2017). Monocular 3d
human pose estimation by predicting depth on joints.
In 2017 IEEE International Conference on Computer
Vision (ICCV). IEEE.
Peshkin, M. and Colgate, J. E. (1999). Cobots. Industrial
Robot: An International Journal.
Phan, T.-P., Chao, P. C.-P., Cai, J.-J., Wang, Y.-J., Wang, S.-C.,
and Wong, K. (2018). A novel 6-dof force/torque
sensor for cobots and its calibration method. In IEEE
International Conference on Applied System Invention
(ICASI). IEEE.
Redmon, J. and Farhadi, A. (2018). Yolov3: An incremental
improvement. arXiv preprint arXiv:1804.02767.
Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster
r-cnn: Towards real-time object detection with region
proposal networks. In Advances in neural information
processing systems.
Safeea, M. and Neto, P. (2019). Minimum distance calculation
using laser scanner and imus for safe human-robot
interaction. Robotics and Computer-Integrated
Manufacturing.
Sarafianos, N., Boteanu, B., Ionescu, B., and Kakadiaris,
I. A. (2016). 3d human pose estimation: A review
of the literature and analysis of covariates. Computer
Vision and Image Understanding, 152.
Shi, S., Guo, C., Jiang, L., Wang, Z., Shi, J., Wang, X.,
and Li, H. (2020). Pv-rcnn: Point-voxel feature set
abstraction for 3d object detection. In CVPR.
Slembrouck, M., Luong, H., Gerlo, J., Schütte, K.,
Van Cauwelaert, D., De Clercq, D., Vanwanseele,
B., Veelaert, P., and Philips, W. (2020). Multiview
3d markerless human pose estimation from openpose
skeletons. In International Conference on Advanced
Concepts for Intelligent Vision Systems. Springer.
Srivastav, V., Issenhuth, T., Kadkhodamohammadi, A.,
de Mathelin, M., Gangi, A., and Padoy, N. (2018).
Mvor: A multi-view rgb-d operating room dataset for
2d and 3d human pose estimation. arXiv preprint
arXiv:1808.08180.
Vicentini, F. (2020). Collaborative robotics: a survey. Journal
of Mechanical Design.
Villani, V., Pini, F., Leali, F., and Secchi, C. (2018). Survey
on human–robot collaboration in industrial settings:
Safety, intuitive interfaces and applications. Mechatronics.
Vlasic, D., Baran, I., Matusik, W., and Popović, J. (2008).
Articulated mesh animation from multi-view silhouettes.
In ACM SIGGRAPH 2008 papers.
Yoo, J. H., Kim, Y., Kim, J. S., and Choi, J. W. (2020). 3d-cvf:
Generating joint camera and lidar features using
cross-view spatial feature fusion for 3d object detection.
arXiv preprint arXiv:2004.12636.
Zivkovic, Z. (2004). Improved adaptive gaussian mixture
model for background subtraction. In Proceedings of
the 17th International Conference on Pattern Recognition,
2004. ICPR 2004. IEEE.
Zivkovic, Z. and Van Der Heijden, F. (2006). Efficient adaptive
density estimation per image pixel for the task of
background subtraction. Pattern recognition letters.
VISAPP 2021 - 16th International Conference on Computer Vision Theory and Applications