Authors:
Chloe Eunhyang Kim (1); Mahdi Maktab Dar Oghaz (2); Jiri Fajtl (2); Vasileios Argyriou (2) and Paolo Remagnino (2)
Affiliations:
(1) VCA Technology Ltd, Surrey, U.K.; (2) Kingston University, London, U.K.
Keyword(s):
Embedded Systems, Deep Learning, Object Detection, Convolutional Neural Network, Person Detection, YOLO, SSD, RCNN, R-FCN.
Related Ontology Subjects/Areas/Topics:
Computer Vision, Visualization and Computer Graphics; Image and Video Analysis; Segmentation and Grouping
Abstract:
Recent advances in parallel computing, GPU technology and deep learning provide a new platform on which complex image processing tasks such as person detection can flourish. Person detection is a fundamental preliminary operation for several high-level computer vision tasks. One industry that can benefit significantly from person detection is retail. In recent years, various studies have attempted to find an optimal solution for person detection using neural networks and deep learning. This study compares state-of-the-art deep learning-based object detectors, with a focus on person detection performance in indoor environments. The performance of various implementations of YOLO, SSD, RCNN, R-FCN and SqueezeDet has been assessed using our in-house proprietary dataset, which consists of over 10 thousand indoor images captured from shopping malls, retail outlets and stores. Experimental results indicate that Tiny YOLO-416 and SSD (VGG-300) are the fastest, and Faster-RCNN (Inception ResNet-v2) and R-FCN (ResNet-101) the most accurate, of the detectors investigated in this study. Further analysis shows that YOLO v3-416 delivers relatively accurate results in a reasonable amount of time, which makes it an ideal model for person detection on embedded platforms.