where X and Y denote the dimension of the image.
The moments $\mu'_{20} = \mu_{20}/\mu_{00}$ and $\mu'_{02} = \mu_{02}/\mu_{00}$ are
constructed from central moments
$$\mu_{pq} = \sum_{x}\sum_{y} (x - \bar{x})^{p} (y - \bar{y})^{q} \, I_c(x,y), \qquad (15)$$
where $(\bar{x}, \bar{y})$ is the center of gravity of the image.
Since the size of the hand should not change too much
between consecutive frames, a sharp increase in the
scatter of the color likelihood image is considered to
be caused by a false estimate. Thus, if $S_t - S_{t-1} > \tau_s$, the previously added histogram $H_{t-1}(r,g)$ is discarded from the buffer. This situation occurs, for example, when the model falsely adapts to a background object that meets the condition in the first step.
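As a concrete sketch, the normalized central moments of eq. (15) can be computed with NumPy as below, together with the scatter-jump test. The exact definition of the scatter $S_t$ (taken here as something derived from $\mu'_{20}$ and $\mu'_{02}$) and all names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def normalized_central_moments(I):
    """Compute mu'_20 and mu'_02 of a likelihood image I per eq. (15).

    I is a 2-D array of per-pixel color likelihoods; (x_bar, y_bar)
    is its center of gravity.
    """
    ys, xs = np.mgrid[0:I.shape[0], 0:I.shape[1]]
    m00 = I.sum()
    x_bar = (xs * I).sum() / m00
    y_bar = (ys * I).sum() / m00
    mu20 = ((xs - x_bar) ** 2 * I).sum()  # central moment, p=2, q=0
    mu02 = ((ys - y_bar) ** 2 * I).sum()  # central moment, p=0, q=2
    return mu20 / m00, mu02 / m00

def scatter_jump(S_t, S_prev, tau_s):
    """True on a sharp scatter increase, i.e. S_t - S_{t-1} > tau_s."""
    return S_t - S_prev > tau_s
```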
3.5 The Algorithm
According to the importance sampling theory, the par-
ticle weights must be augmented with a correction
factor when sampling from the proposal distribution.
As derived in (Arulampalam et al., 2002), the particle
weight update equation then becomes
$$w^{i}_{t} \propto w^{i}_{t-1} \, \frac{p(x^{i}_{t} \mid x^{i}_{t-1}, x^{i}_{t-2})}{q(x^{i}_{t} \mid x^{i}_{t-1}, x^{i}_{t-2}, z_{t})} \, p(z_{t} \mid x^{i}_{t}), \qquad (16)$$
where the factor $p(x^{i}_{t} \mid x^{i}_{t-1}, x^{i}_{t-2}) / q(x^{i}_{t} \mid x^{i}_{t-1}, x^{i}_{t-2}, z_{t})$ weights the particles toward the dynamic model. This factor is omitted here, since the proposal distribution is considered a better representation of the true state than the generic dynamic model; in that case the factor would degrade the estimate calculated from the particles. This has been shown in our experiments.
The hand tracking algorithm used in the experi-
ments is built on the Sampling Importance Resam-
pling (SIR) algorithm and is presented in Table 1.
Table 1: Feature Guided Particle Filter (FGPF) for hand
tracking.
To create the particle set $\{x^{i}_{t}, w^{i}_{t}\}_{i=1}^{N_p}$, process each particle in the previous set $\{x^{i}_{t-1}, w^{i}_{t-1}\}_{i=1}^{N_p}$ as follows:
1. Extract the feature set $B = \{c^{i}_{b}, \xi^{i}\}_{i=1}^{N_b}$.
2. Draw $x^{i}_{t} \sim q(x_{t} \mid x^{i}_{t-1}, x^{i}_{t-2}, B)$.
3. Update the weight: $w^{i}_{t} \propto w^{i}_{t-1} \, p(z_{t} \mid x^{i}_{t})$.
4. Estimate the effective sample size $\hat{N}_{\mathrm{eff}} = \left( \sum_{i=1}^{N_p} (w^{i}_{t})^{2} \right)^{-1}$.
5. If $\hat{N}_{\mathrm{eff}} < T_{\mathrm{eff}}$, draw $x^{*i}_{t} \sim \{x^{i}_{t}, w^{i}_{t}\}_{i=1}^{N_p}$ so that $p(x^{*i}_{t} = x^{j}_{t}) \propto w^{j}_{t}$, and replace $\{x^{i}_{t}, w^{i}_{t}\} \leftarrow \{x^{*i}_{t}, N_{p}^{-1}\}$.
6. Estimate the location of the object as the mean of the subset
defined in (12).
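One step of the loop in Table 1 can be sketched in Python as follows. The feature-guided proposal $q$ and the color likelihood $p(z_t \mid x_t)$ are replaced by toy Gaussian stand-ins (a constant-velocity proposal and an isotropic likelihood), and step 6 uses the weighted mean of all particles rather than the subset of eq. (12); all names and constants are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
Np, T_eff = 100, 50  # particle count and resampling threshold (illustrative)

def propose(x_prev, x_prev2):
    # Toy stand-in for q(x_t | x_{t-1}, x_{t-2}, B): constant velocity + noise.
    return x_prev + (x_prev - x_prev2) + rng.normal(0.0, 1.0, x_prev.shape)

def likelihood(z, x):
    # Toy stand-in for the color likelihood p(z_t | x_t).
    return np.exp(-0.5 * np.sum((z - x) ** 2, axis=-1))

def fgpf_step(particles, particles_prev, weights, z):
    # Step 2: draw from the proposal distribution.
    new = propose(particles, particles_prev)
    # Step 3: w_t ∝ w_{t-1} p(z_t | x_t); the correction factor is omitted.
    w = weights * likelihood(z, new)
    w /= w.sum()
    # Step 4: effective sample size.
    n_eff = 1.0 / np.sum(w ** 2)
    # Step 5: resample with probability proportional to the weights.
    if n_eff < T_eff:
        idx = rng.choice(Np, size=Np, p=w)
        new = new[idx]
        w = np.full(Np, 1.0 / Np)
    # Step 6: state estimate (here: weighted mean of all particles).
    estimate = (w[:, None] * new).sum(axis=0)
    return new, w, estimate
```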
4 EXPERIMENTS
The experiments were made with a set of twelve test
sequences, where a user moved his hand randomly.
The sequences were captured at 15 fps, with a resolution of 320×240 and a mean length of 240 frames. Eight sequences
were made by varying three parameters: background
clutter (simple/cluttered), hand speed (slow/fast) and
the presence of the user (only the hand and arm in
the frames/user’s upper torso in the background). An-
other four sequences were made varying the first two
parameters, with the user and an additional person
moving in the background. Furthermore, to simulate
real-life incidents, the hand of the user occasionally
moved outside of the camera’s field of view for a few
frames. The center of the palm was manually labeled
as the reference point for each frame of the sequences.
For comparison, the tests were also performed with the Mean Shift Embedded Particle Filter (MSEPF), implemented as presented in (Shan et al., 2004). The parameters for the tests were found manually for both methods, since global optimization over the test sequences would have been infeasible. Each parameter was tested with a few values, and the one performing best over all sequences was chosen. The selected
parameters are presented in Table 2. For the evalua-
tion, the average tracking rate was computed for both
methods, which is defined as the proportion of frames
where the estimate is within 20 pixels of the reference
point. Since both methods are stochastic in nature, the
tests were repeated 10 times for each sequence. Ta-
ble 3 shows the results for each sequence parameter
value, averaged over the others.
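The tracking-rate metric, the fraction of frames in which the estimate lies within 20 pixels of the labeled reference point, can be computed as in this short sketch; the function name is hypothetical:

```python
import numpy as np

def tracking_rate(estimates, references, threshold=20.0):
    """Fraction of frames where the estimated position is within
    `threshold` pixels of the manually labeled reference point."""
    d = np.linalg.norm(np.asarray(estimates) - np.asarray(references), axis=1)
    return float(np.mean(d <= threshold))
```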
To verify the advantage of the presented proposal
distribution, the experiments were also performed by
replacing the proposal distribution with the dynamic
model. This produced an overall tracking rate of 0.65,
which is considerably lower than using the proposal
distribution. In addition, tests were also carried out using the weighting equation (16), which yielded an overall tracking rate of 0.90.
As the results show, the presented method outper-
forms the MSEPF with the given parameters. The
tests showed that the Feature Guided Particle Filter
was able to deal with complicating factors, such as background movement and clutter with fast and diverse motion. Moreover, the proposed method was
able to recover when the object was momentarily out
of sight. These factors also produced the major differ-
ences between the two methods, whereas the tracking
rates were relatively high for both methods with the
less complicated sequences. The main shortcoming
of MSEPF was its inability to recover when tracking
was lost, for example when image features were mo-
VISAPP 2008 - International Conference on Computer Vision Theory and Applications