quire very large memory. For example, the robust
pixel-based background modeling scheme proposed
by C. Stauffer and W. Grimson (Stauffer and Grim-
son, 1999) uses a mixture of weighted normal distri-
butions at each pixel. Consequently, for a 3-channel
video a model with three mixture components at ev-
ery pixel requires 21 floating point numbers per pixel,
or a storage of over 130 GBytes per frame.
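The count of 21 floating point numbers follows from simple arithmetic if each mixture component is assumed to store one mean and one variance per channel plus a single weight. This diagonal-covariance layout is an assumption here, not something stated above. The sketch below reproduces that count; the function names and the 1920 x 1080 frame size are hypothetical and chosen only for illustration.

```python
def gmm_floats_per_pixel(num_components: int, num_channels: int) -> int:
    """Parameters stored per pixel for a GMM with diagonal covariance (assumed layout)."""
    # Per component: one mean per channel, one variance per channel, one weight.
    per_component = num_channels + num_channels + 1
    return num_components * per_component


def gmm_model_bytes(width: int, height: int, num_components: int,
                    num_channels: int, bytes_per_float: int = 4) -> int:
    """Total model size in bytes for a frame, assuming 4-byte floats by default."""
    floats = gmm_floats_per_pixel(num_components, num_channels)
    return width * height * floats * bytes_per_float


# Three components on 3-channel video, as in the text.
print(gmm_floats_per_pixel(3, 3))  # 21

# Hypothetical frame size, for illustration only.
print(gmm_model_bytes(1920, 1080, 3, 3))
```

Because the model size grows linearly with the pixel count, the same per-pixel cost becomes far larger for very large frames.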
W.R. Schwartz and H. Pedrini (Schwartz et al., 2009)
extend the motion estimation approach of Babu on
foreground objects by projecting intra-frame blocks on
an eigenspace computed using PCA over a set of
consecutive frames, thus exploiting the spatial
redundancy of adjacent blocks. The cost of estimating
the PCA basis, as well as the requirement of observing
foreground-free frames during the estimation process,
renders this approach unsuitable.
2 SURVEILLANCE VIDEO COMPRESSION
In the approach to be described, foreground pixels
are detected using a Gaussian mixture model (GMM),
which provides rapid adaptation to changing imaging
conditions as well as a probabilistic framework. Since
a GMM is stored at each pixel, the storage require-
ment would be prohibitive without some strategy for
model compression. In the following, a technique for
significant model data reduction without loss in de-
tection accuracy is described. The description starts
with a review of the GMM background model.
2.1 Background Modeling
The extensive literature on background modeling
methods can be assigned to two major categories.
The first one exploits temporal redundancy between
frames by applying a statistical model on each pixel.
Model parameters are estimated either on-line
recursively or off-line using maximum likelihood.
Although a single normal distribution seems sound and
inexpensive at first, it cannot cope with wide
variations of intensity values such as those caused by
reflective surfaces, leaf motion, weather conditions or
outdoor illumination changes. A natural improvement is
to use a mixture of weighted normal distributions
(GMMs), a widely used appearance model for background
and foreground modeling. However, the amount of
storage required to maintain a GMM at each pixel is
impractically large for the WAVC application. In order
for the GMM representation to be effective, the storage
requirement must be reduced by at least an order of
magnitude. This paper presents an innovative approach
to the compression of such models in order to detect
moving objects in very large video frames. Before
presenting the new compression method, a survey of the
GMM background modeling approach is provided.
Friedman and Russell successfully implemented
a GMM background model over a traffic video se-
quence, each parameter being estimated using the
general Expectation-Maximization algorithm (Fried-
man and Russell, 1997). However, the most popu-
lar pixel-based modeling scheme is that implemented
by Stauffer and Grimson (Stauffer and Grimson,
1999), which uses a fast on-line K-means approximation
of the mixture parameters. Several variations of this
method were developed, improving parameter convergence
rate and overall robustness (Lee, 2005) (Zivkovic, 2004).
The second category of background models analyzes
features from neighboring blocks, thus exploiting
spatial redundancy within frames. Although Heikkilä
and Pietikäinen (Heikkilä and Pietikäinen, 2006)
implemented an operator that successfully depicts
background statistics through a binary pattern, the
relatively high computational cost prevents its use in
this application. W.R. Schwartz and H. Pedrini
(Schwartz et al., 2009) propose a method in which
intra-frame blocks are projected on an eigenspace
computed using PCA over a set of consecutive frames,
thus exploiting the spatial redundancy of adjacent
blocks. The cost of estimating the PCA basis, as well
as the requirement of observing foreground-free frames
during the estimation process, renders this approach
unsuitable. The same reasons make other block-based
methods that capture histogram, edge, intensity
(Jabri et al., 2000) (Javed et al., 2002) and other
feature information unsuitable for high resolution
surveillance video.
In the proposed approach, the background model is
based on a fast-converging extension of the Stauffer
and Grimson approximation presented by Dar-Shyang Lee
(Lee, 2005). The extension of Lee is explained by
starting with a summary of the basic Stauffer and
Grimson algorithm.
The value of each pixel is described by a mixture of
normal distributions. Thus, the probability of observ-
ing a particular color tuple X at time t is given by
\[
\Pr(X_t) = \sum_{i=0}^{K-1} \omega_{i,t} \cdot \mathcal{N}\left(X_t, \boldsymbol{\mu}_{i,t}, \Sigma_{i,t}\right) \qquad (1)
\]
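As a rough illustration of Eq. (1), the sketch below evaluates the mixture probability of one observed color tuple at a single pixel. It assumes a diagonal covariance for each component, which is a simplification of the general covariance in Eq. (1), and the function names and sample values are hypothetical.

```python
import math


def gaussian_pdf_diag(x, mean, var):
    """Normal density with diagonal covariance (an assumption, see lead-in)."""
    log_p = 0.0
    for xc, mc, vc in zip(x, mean, var):
        # Sum the per-channel log densities; channels are independent here.
        log_p += -0.5 * math.log(2.0 * math.pi * vc) - 0.5 * (xc - mc) ** 2 / vc
    return math.exp(log_p)


def mixture_probability(x, weights, means, variances):
    """Weighted sum over the K components, following the form of Eq. (1)."""
    return sum(
        w * gaussian_pdf_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    )


# Hypothetical K = 3 mixture for one RGB pixel; the weights sum to 1.
weights = [0.6, 0.3, 0.1]
means = [(120.0, 118.0, 115.0), (60.0, 62.0, 58.0), (200.0, 200.0, 200.0)]
variances = [(25.0, 25.0, 25.0), (36.0, 36.0, 36.0), (100.0, 100.0, 100.0)]

p = mixture_probability((121.0, 117.0, 116.0), weights, means, variances)
print(p)
```

An observation near the mean of a heavily weighted component yields a high probability, while a color far from every component yields a value close to zero.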
K is the number of distributions in the mixture
(typically 3 to 5) and $\omega_{i,t}$ represents the weight of distribu-
HIGH RESOLUTION SURVEILLANCE VIDEO COMPRESSION - Using JPEG2000 Compression of Random Variables