Authors:
Sanjukta Ghosh (1); Peter Amon (2); Andreas Hutter (2) and André Kaup (3)
Affiliations:
(1) Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Siemens Corporate Technology, Germany
(2) Siemens Corporate Technology, Germany
(3) Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Germany
Keyword(s):
Pedestrian Counting, Deep Learning, Convolutional Neural Networks, Synthetic Images, Transfer Learning, Cross Entropy Cost Function, Squared Error Cost Function.
Related Ontology Subjects/Areas/Topics:
Computer Vision, Visualization and Computer Graphics; Motion, Tracking and Stereo Vision; Video Surveillance and Event Detection
Abstract:
Counting pedestrians is a common task in surveillance applications. However, it is often challenging to obtain sufficient annotated training data, especially for deep learning models, which require a large amount of it. To address this problem, this paper explores the possibility of training a deep convolutional neural network (CNN) entirely on synthetically generated images for the purpose of counting pedestrians. Transfer learning is exploited to train the models starting from a base model trained for image classification. A direct approach and a hierarchical approach are used during training to extend the model's capability to count higher numbers of pedestrians. The trained models are then tested on natural images of completely different scenes, captured by acquisition systems that the model did not encounter during training. Furthermore, the effectiveness of the cross entropy cost function and the squared error cost function is evaluated and analyzed for the scenario where a model is trained entirely on synthetic images. The performance of the trained model on test images from the target site can be improved by fine-tuning with an image of the background of the target site.
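As a rough illustration of the two cost functions named in the abstract, the sketch below computes both for a single prediction. It assumes that counting is posed as classification over count classes with a softmax output and a one-hot target; the abstract does not confirm this formulation, and the logits, class count, and function names are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into a probability distribution over count classes."""
    shifted = logits - np.max(logits)  # shift for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()


def one_hot(count, num_classes):
    """Encode the true pedestrian count as a one-hot target vector."""
    target = np.zeros(num_classes)
    target[count] = 1.0
    return target


def cross_entropy_cost(probs, target, eps=1e-12):
    """Cross entropy between the predicted distribution and the target."""
    return float(-np.sum(target * np.log(probs + eps)))


def squared_error_cost(probs, target):
    """Half the sum of squared differences between prediction and target."""
    return float(0.5 * np.sum((probs - target) ** 2))


# Hypothetical network output for an image with one pedestrian,
# with count classes 0, 1, 2 and 3 (assumed, not taken from the paper).
logits = np.array([0.1, 2.0, 0.3, -1.0])
probs = softmax(logits)
target = one_hot(1, num_classes=4)

ce = cross_entropy_cost(probs, target)
se = squared_error_cost(probs, target)
print(f"cross entropy: {ce:.4f}, squared error: {se:.4f}")
```

With a one-hot target, the cross entropy depends only on the probability assigned to the true count, whereas the squared error also penalizes how the remaining probability mass is spread over the other classes.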