Authors:
Sankha S. Mukherjee; Rolf H. Baxter and Neil M. Robertson
Affiliation:
Heriot-Watt University, United Kingdom
Keyword(s):
Deep Learning, Intentional Tracker.
Related Ontology Subjects/Areas/Topics:
Computer Vision, Visualization and Computer Graphics; Motion, Tracking and Stereo Vision; Tracking and Visual Navigation; Video Surveillance and Event Detection
Abstract:
In this paper, we improve pedestrian tracking using robust, real-time human head pose estimation in
low-resolution RGB data without any smoothing motion priors such as direction of motion. This paper presents
four principal novelties. First, we train a deep convolutional neural network (CNN) for head pose classification
with data from various sources ranging from high to low resolution. Second, this classification network is then
fine-tuned on the continuous head pose manifold for regression based on a subset of the data. Third, we
attain state-of-the-art performance on public low-resolution surveillance datasets. Finally, we present improved
tracking results using a Kalman-filter-based intentional tracker. The tracker fuses the instantaneous head pose
information in the motion model to improve tracking based on predicted future location. Our implementation
computes head pose for a head image in 1.2 milliseconds on commercial hardware, making it real-time and
highly scalable.
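
The abstract does not give the tracker's equations, so the following is only a minimal sketch of one plausible way to fuse head pose into a Kalman filter motion model. It assumes a constant-velocity model in which the head yaw angle acts as a control input that pulls the velocity estimate toward the direction the pedestrian is looking. The class names `AxisFilter` and `HeadPoseTracker`, the blending weight `alpha`, the nominal `walking_speed`, and the noise values `q` and `r` are all hypothetical choices for illustration, not values from the paper.

```python
import math


class AxisFilter:
    """1D Kalman filter over [position, velocity] with a velocity control input."""

    def __init__(self, pos, alpha=0.3, q=0.01, r=0.25):
        self.p, self.v = pos, 0.0          # state estimate
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.alpha = alpha                 # assumed weight of the head-pose term
        self.q, self.r = q, r              # assumed process / measurement noise

    def predict(self, dt, u):
        # Assumed motion model: v' = (1 - alpha) * v + alpha * u, p' = p + dt * v'
        f01, f11 = (1.0 - self.alpha) * dt, 1.0 - self.alpha
        self.v = f11 * self.v + self.alpha * u
        self.p += dt * self.v
        # Covariance propagation P' = F P F^T + Q with F = [[1, f01], [0, f11]]
        P = self.P
        fp00 = P[0][0] + f01 * P[1][0]
        fp01 = P[0][1] + f01 * P[1][1]
        fp10 = f11 * P[1][0]
        fp11 = f11 * P[1][1]
        self.P = [[fp00 + fp01 * f01 + self.q, fp01 * f11],
                  [fp10 + fp11 * f01, fp11 * f11 + self.q]]

    def update(self, z):
        # Position-only measurement, H = [1, 0]
        P = self.P
        s = P[0][0] + self.r                  # innovation covariance
        k0, k1 = P[0][0] / s, P[1][0] / s     # Kalman gain
        y = z - self.p                        # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1.0 - k0) * P[0][0], (1.0 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]


class HeadPoseTracker:
    """2D tracker built from two independent axis filters (illustrative only)."""

    def __init__(self, x, y, walking_speed=1.4, **kwargs):
        self.fx = AxisFilter(x, **kwargs)
        self.fy = AxisFilter(y, **kwargs)
        self.walking_speed = walking_speed  # assumed nominal speed, units per second

    def predict(self, dt, head_yaw):
        # Head yaw (radians) gives the intended direction of travel.
        self.fx.predict(dt, self.walking_speed * math.cos(head_yaw))
        self.fy.predict(dt, self.walking_speed * math.sin(head_yaw))
        return self.fx.p, self.fy.p

    def update(self, zx, zy):
        self.fx.update(zx)
        self.fy.update(zy)
        return self.fx.p, self.fy.p
```

In this sketch, calling `predict(dt, head_yaw)` before each detection shifts the predicted location toward where the person is looking, and `update(zx, zy)` corrects it with the observed position. Setting `alpha` to zero recovers an ordinary constant-velocity filter, which makes the head-pose contribution easy to isolate.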