2.2 Object Tracking
Once moving regions are detected, the next step is to track these regions from one frame to another. Tracking can be based on regions, contours, features, or a model (Yilmaz et al., 2006).
Region-based tracking identifies the connected regions corresponding to each object in the scene, then describes and matches them. Active contour tracking uses the shape of the detected regions to match them from one frame to the next. Feature-based tracking does not aim at tracking an object as one entity; instead, it looks for distinctive features, which can be local or global. Model-based tracking can be done in various ways: articulated skeletons, 2-D contours, or 3-D volumes. For each new image, detected regions are compared to previously built models.
In this paper, we propose an object tracking method based on regions and region features. More recent tracking systems are presented in (Zhang et al., 2012), (Yang and Nevatia, 2012), and (Pinho and Tavares, 2009).
3 MOTION DETECTION
Motion detection uses a pixel-based model of the background. We use a method based on the Gaussian mixture model (GMM) first introduced in (Stauffer and Grimson, 2002). The GMM is composed of a mixture of weighted Gaussian densities, which allows the color distribution of a given pixel to be multimodal. Such a model is robust against illumination changes.
Weight ω, mean µ, and covariance Σ are the parameters of the GMM that are updated dynamically over time. The following equation defines the probability density function P of the occurrence of a color u at pixel coordinate s and time t in the image sequence I:
P(I(s,t) = u) = ∑_{i=1}^{k} ω_{i,s,t} N(I(s,t), µ_{i,s,t}, Σ_{i,s,t})    (1)
where N(I(s,t), µ_{i,s,t}, Σ_{i,s,t}) is the i-th Gaussian density and ω_{i,s,t} its weight. The covariance matrix Σ_{i,s,t} is assumed to be diagonal, with σ²_{i,s,t} as its diagonal elements, and k is the number of Gaussian distributions.
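As an illustration, the mixture density of Equation (1) can be evaluated per pixel. The sketch below (in Python with NumPy, not part of the original method) uses the diagonal-covariance assumption Σ = σ²I, so the 3-D normal density factorises over the RGB channels:

```python
import numpy as np

def mixture_density(u, weights, means, variances):
    """Evaluate the GMM density P(I(s,t) = u) of Eq. (1) for one pixel.

    With the diagonal covariance sigma^2 * I, each component is
    (2 pi sigma^2)^(-3/2) * exp(-||u - mu||^2 / (2 sigma^2)).
    """
    u = np.asarray(u, dtype=float)
    p = 0.0
    for w, mu, var in zip(weights, means, variances):
        d = u - np.asarray(mu, dtype=float)
        norm = (2.0 * np.pi * var) ** (-len(u) / 2.0)
        p += w * norm * np.exp(-0.5 * np.dot(d, d) / var)
    return p
```

The weights are assumed to be normalized, so the returned value is a proper probability density over colors.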
For each pixel value I(s,t), the first step is to find the closest Gaussian. If the pixel value lies within T_σ standard deviations of that Gaussian's mean, the parameters of the matched distribution are updated. Otherwise, a new Gaussian with mean I(s,t), a large initial variance, and a small initial weight is created to replace the existing Gaussian with the lowest weight. Once the Gaussians are updated, the weights are normalized and the distributions are ordered by the value ω_{i,s,t}/σ_{i,s,t}.
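The per-pixel match/replace/re-order step can be sketched as follows; the values of T_σ, the learning rate, and the initial variance and weight are illustrative assumptions, not taken from the paper:

```python
import numpy as np

T_SIGMA = 2.5   # matching threshold in standard deviations (assumed)
ALPHA = 0.01    # learning rate (assumed)

def update_pixel_model(u, weights, means, variances):
    """One GMM maintenance step for a single pixel: match, update or
    replace, normalize, and re-order by omega/sigma. A hypothetical
    sketch, not the authors' implementation."""
    u = np.asarray(u, dtype=float)
    # 1. find the closest Gaussian, in units of its standard deviation
    dists = [np.linalg.norm(u - np.asarray(m)) / np.sqrt(v)
             for m, v in zip(means, variances)]
    i = int(np.argmin(dists))
    matched = dists[i] < T_SIGMA
    # 2. weight update: omega <- (1 - alpha) * omega + alpha * M
    for n in range(len(weights)):
        weights[n] *= (1.0 - ALPHA)
    if matched:
        weights[i] += ALPHA
        rho = ALPHA
        means[i] = list((1 - rho) * np.asarray(means[i]) + rho * u)
        d = u - np.asarray(means[i])
        variances[i] = (1 - rho) * variances[i] + rho * float(np.dot(d, d))
    else:
        # replace the lowest-weight Gaussian with a new one
        j = int(np.argmin(weights))
        means[j] = list(u)
        variances[j] = 900.0   # large initial variance (assumed)
        weights[j] = 0.05      # small initial weight (assumed)
    # 3. normalize weights and order by omega / sigma
    total = sum(weights)
    weights[:] = [w / total for w in weights]
    order = sorted(range(len(weights)),
                   key=lambda n: weights[n] / np.sqrt(variances[n]),
                   reverse=True)
    return ([weights[n] for n in order],
            [means[n] for n in order],
            [variances[n] for n in order])
```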
As proposed in (Zivkovic and van der Heijden, 2006), we improve the GMM by adapting the number of selected Gaussian densities. To select the most reliable densities, we modify the calculation of their weights: the weight is decreased when a density is not observed for a certain amount of time.
ω_{i,t} = ω_{i,t−1} + α (M_{i,t} − ω_{i,t−1}) − α c_T    (2)
where α is the learning rate and M_{i,t} is equal to 1 for the matched distribution and 0 for the others. c_T is a scalar representing the prior evidence.
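Iterating Equation (2) shows how the −α c_T term makes the weight of an unobserved density decay (eventually below zero, at which point the density can be discarded, adapting the number of Gaussians). The values of α and c_T below are illustrative assumptions:

```python
ALPHA = 0.01   # learning rate (assumed value)
C_T = 0.05     # prior-evidence term c_T (assumed value)

def update_weight(w_prev, matched):
    """Weight update of Eq. (2): w = w + alpha*(M - w) - alpha*c_T,
    with M = 1 for the matched distribution and 0 otherwise."""
    m = 1.0 if matched else 0.0
    return w_prev + ALPHA * (m - w_prev) - ALPHA * C_T

# A density never matched for 50 frames steadily loses weight:
w = 0.1
for _ in range(50):
    w = update_weight(w, matched=False)
# w has decayed below its initial value; once negative, the density is dropped
```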
Pixels that are not matched with any of the selected distributions are labeled as foreground; otherwise, pixels belong to the background. We note that the model is updated at every frame.
This method remains sensitive to shadows, so we use a shadow detection algorithm. Shadow detection requires a model that can separate chromatic and brightness components. We use a model that is compatible with the mixture model (KaewTraKulPong and Bowden, 2001). We compare foreground pixels against the current background model: if the differences in chromaticity and brightness are within given thresholds, pixels are considered shadows. We calculate the brightness distortion a and the color distortion c as follows:
a = argmin_z (I(s,t) − zE)²    and    c = ||I(s,t) − aE||    (3)
where E is a position vector at the RGB mean of the pixel background and I(s,t) is the pixel value at position s and time t. A foreground pixel is considered a shadow if the color distortion c is within T_σ standard deviations and τ < a < 1, where τ is the brightness threshold.
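Since the minimisation in Equation (3) is a least-squares fit of a scaled background color to the pixel, a has the closed form (I·E)/(E·E). A small sketch (an assumed helper, not the authors' code) illustrates the two distortions:

```python
import numpy as np

def brightness_color_distortion(pixel, bg_mean):
    """Brightness distortion a and color distortion c of Eq. (3).

    a = (I . E) / (E . E) is the least-squares scale z minimising
    (I - zE)^2; c is the residual ||I - aE||.
    """
    i = np.asarray(pixel, dtype=float)
    e = np.asarray(bg_mean, dtype=float)
    a = float(np.dot(i, e) / np.dot(e, e))
    c = float(np.linalg.norm(i - a * e))
    return a, c

# A pixel that is a darker copy of the background color:
# a < 1 (less bright) and c ~ 0 (same chromaticity) -> shadow candidate.
a, c = brightness_color_distortion([60, 45, 30], [120, 90, 60])
# a == 0.5, c == 0.0
```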
Finally, we modify the updating process to better handle objects that stop in the scene. With the current model, stopped people start disappearing because they become part of the background. We therefore modify the updating process for the distribution parameters: we do not update the model on areas that are considered as belonging to a tracked object. Tracked objects are defined in the next section.
We introduce F_{s,t}, a binary image representing these tracked objects. F_{s,t} is a filtered foreground image displaying the regions that have been tracked for several frames, i.e., objects. Pixels covered by an object have value 1, while the others have value 0.
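A possible sketch of building F_{s,t} and gating the update follows; representing tracked objects by bounding boxes is an assumption made for illustration:

```python
import numpy as np

def selective_update_mask(shape, tracked_boxes):
    """Build the binary image F_{s,t}: pixels covered by a tracked
    object get value 1, all others 0. Boxes are (x, y, w, h) tuples,
    an assumed representation of the tracked regions."""
    f = np.zeros(shape, dtype=np.uint8)
    for x, y, w, h in tracked_boxes:
        f[y:y + h, x:x + w] = 1
    return f

# GMM parameters are then only updated where F_{s,t} is 0:
f = selective_update_mask((4, 6), [(1, 1, 3, 2)])
update_pixels = (f == 0)
```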
We modify the distribution parameter updating equations:
VISAPP 2013 - International Conference on Computer Vision Theory and Applications