MEAN SHIFT OBJECT TRACKING USING A 4D KERNEL AND LINEAR PREDICTION
Katharina Quast, Christof Kobylko and André Kaup
Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg
Cauerstr. 7, 91058 Erlangen, Germany
Keywords: Object tracking, Mean shift tracking.

Abstract:
A new mean shift tracker which tracks not only the position but also the size and orientation of an object is
presented. By using a four-dimensional kernel, the mean shift iterations are performed in a four-dimensional
search space consisting of the image coordinates, a scale and an orientation dimension. Thus, the enhanced
mean shift tracker follows the position, size and orientation of an object simultaneously. To further increase
the tracking performance, a linear prediction that exploits the position, size and orientation of the object in the
previous frames is also integrated into the 4D kernel tracker. The tracking performance is further improved by
considering the gradient norm as an additional object feature.
1 INTRODUCTION
Object tracking is still an important and challenging
task in computer vision. Among the many different
methods developed for object tracking, the mean shift
algorithm (Comaniciu and Meer, 2002) is one of the
most famous tracking techniques, because of its ease
of implementation, computational speed, and robust
tracking performance. Moreover, mean shift tracking
does not require any training data, unlike learning-based
trackers such as (Kalal et al., 2010). In spite of its
advantages, traditional mean shift suffers from the
limitation of using a kernel with a fixed bandwidth.
Since the scale and the orientation of an object
change over time, the bandwidth and the orientation
of the kernel profile should be adapted accordingly.
An intuitive approach for adapting the kernel
scale is to run the algorithm with three different kernel
bandwidths, the former bandwidth and the former
bandwidth ± 10%, and to choose the kernel bandwidth
which maximizes the appearance similarity (±10%
method) (Comaniciu et al., 2003). A more sophisticated
method using a difference of Gaussian mean shift
kernel in scale space has been proposed in (Collins,
2003). The method provides good tracking results,
but is computationally very expensive.
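The ±10% heuristic described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `run_mean_shift` stands in for a full mean shift pass that converges at some bandwidth and returns the candidate's color histogram, and the Bhattacharyya coefficient serves as the appearance similarity, as is common in mean shift tracking.

```python
import numpy as np

def bhattacharyya(p, q):
    """Similarity between two normalized histograms (1.0 = identical)."""
    return float(np.sum(np.sqrt(p * q)))

def adapt_bandwidth(run_mean_shift, target_hist, h_prev):
    """±10% method: run mean shift with three candidate bandwidths
    (previous bandwidth and previous bandwidth ± 10%) and keep the
    one whose converged candidate histogram best matches the target.
    `run_mean_shift(h)` is an assumed callback returning
    (position, candidate_histogram) for bandwidth h."""
    best_h, best_sim = h_prev, -1.0
    for h in (0.9 * h_prev, h_prev, 1.1 * h_prev):
        _pos, cand_hist = run_mean_shift(h)
        sim = bhattacharyya(target_hist, cand_hist)
        if sim > best_sim:
            best_h, best_sim = h, sim
    return best_h
```

Since every frame triggers three full mean shift runs, the heuristic roughly triples the cost per frame, which is one motivation for the single-search 4D approach proposed in this paper.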
Mean shift based methods which adapt the
scale and the orientation of the kernel are presented
in (Bradski, 1998; Qifeng et al., 2007). In (Bradski,
1998) the scale and orientation of the kernel are obtained
by estimating the second order moments of the object
silhouette, but this incurs high computational cost. In
(Qifeng et al., 2007) the adaptation of the kernel scale and
orientation is achieved by combining the mean shift
method with adaptive filtering based on the
recursive least squares algorithm.
In this paper we propose a scale and orientation
adaptive mean shift tracker which neither requires any
additional iterative or recursive method nor destroys the
real-time capability of the tracking process. This is
achieved by tracking the target in a virtual 4D search
space consisting of the position coordinates as well as
the target scale and rotation angle as additional
dimensions. The tracking method is further enhanced
by a linear prediction of the object scene parameters
(position, scale and orientation) and by using the
image gradient norm as an additional object feature.
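The linear prediction of the object scene parameters can be sketched as a constant-velocity extrapolation from the two most recent frames. The state layout and function name below are illustrative assumptions, not taken from the paper's implementation:

```python
import numpy as np

def predict_state(state_prev, state_curr):
    """Linear (constant-velocity) prediction of the object scene
    parameters for the next frame. Each state is an assumed 4D
    vector (x, y, scale, angle) matching the dimensions of the
    4D search space; the prediction initializes the mean shift
    iterations closer to the object's next location."""
    state_prev = np.asarray(state_prev, dtype=float)
    state_curr = np.asarray(state_curr, dtype=float)
    # extrapolate: current state plus the last inter-frame change
    return 2.0 * state_curr - state_prev
```

Starting the 4D mean shift search at the predicted state rather than at the previous state reduces the number of iterations needed when the object moves, scales, or rotates steadily.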
The rest of the paper is organized as follows. Section
2 gives an overview of standard mean shift tracking.
Mean shift tracking in the 4D search space is
explained in Section 3. The linear prediction is
described in Section 4, and the image gradient norm
is introduced in Section 5. Experimental results are
shown in Section 6. Section 7 concludes the paper.
2 MEAN SHIFT OVERVIEW
Mean shift tracking discriminates between a target
model in frame n and a candidate model in frame
n + 1. The target model is estimated from the dis-
Quast, K., Kobylko, C. and Kaup, A.
MEAN SHIFT OBJECT TRACKING USING A 4D KERNEL AND LINEAR PREDICTION.
DOI: 10.5220/0003327305880593
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP 2011), pages 588-593
ISBN: 978-989-8425-47-8
Copyright © 2011 SCITEPRESS (Science and Technology Publications, Lda.)