Helmut Grabner, Christian Leistner and Horst Bischof
Institute for Computer Graphics and Vision, Graz University of Technology, Austria
Keywords: On-line learning, boosting, background modeling.
In modern video surveillance systems, change and outlier detection is of the highest interest. Most of
these systems are based on standard pixel-by-pixel background modeling approaches. In this paper, we
propose a novel, robust block-based background model that is suitable for outlier detection, using an
extension of on-line boosting for feature selection. To be robust, our system incorporates several
novelties over previously proposed on-line boosting algorithms and classifier-based background modeling
systems. We introduce time-dependency and control for on-line boosting. Our system automatically adjusts
its temporal behavior to the underlying scene by using a control system that regulates the model
parameters. The benefits of our approach are illustrated in several experiments on challenging standard
datasets.
For most video surveillance systems, the detection of moving or intruding objects is of crucial
importance. Because they are easy to implement and fast to compute, simple background subtraction
methods are frequently applied, where objects are detected as blobs of pixels that do not correspond
to the background.
In its simplest form, the background model (BGM) is a single image, called the background image.
Given this image, pixels are marked as foreground if they do not fit it, i.e., if they are more than
a certain threshold above or below the corresponding background pixel value. For realistic
applications, and in order to handle different environmental conditions (e.g., changing lighting
conditions or foreground objects moving to the background and vice versa), more sophisticated,
multi-modal statistical models such as Mixture of Gaussians (GMM) (Stauffer and Grimson, 1999) or
Eigenbackgrounds (Oliver et al., 2000) are often used. More efficient systems analyzing foreground
models (Tian et al., 2005) have also been proposed. To further improve robustness, some recent
approaches exploit the spatial correlation between pixels arranged in blocks (e.g., Russell and Gong,
2005). For example, by describing the statistics within one block using features (Heikkilä and
Pietikäinen, 2006), the entire block is classified as either background or foreground, which
significantly improves the detection reliability.
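The simplest form of background subtraction described above can be sketched in a few lines. This is only an illustrative sketch, not the method proposed in this paper: the function name `foreground_mask`, the toy pixel values, and the threshold of 25 are assumptions chosen for the example, and images are represented as plain nested lists of gray values.

```python
def foreground_mask(frame, background, threshold):
    """Return a boolean mask that is True where a pixel deviates from the
    background image by more than `threshold`, above or below."""
    return [
        [abs(pixel - bg_pixel) > threshold
         for pixel, bg_pixel in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Hypothetical 2x3 gray-value images (values are made up for illustration).
background = [[10, 10, 10],
              [10, 10, 10]]
frame = [[12, 10, 90],
         [10, 60, 9]]

mask = foreground_mask(frame, background, threshold=25)
```

Only the two pixels that differ strongly from the background image (90 and 60) are marked as foreground; small deviations such as 12 or 9 stay below the threshold.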
Furthermore, several adaptive methods for estimating a pixel-based BGM have been proposed, which
update the existing image with respect to the current input frame (e.g., running average (Koller et
al., 1994), temporal median filter (Lo and Velastin, 2001), approximate median filter (McFarlane and
Schofield, 1995)). For block-based BGMs, adaptivity can be achieved by describing each block with an
on-line classifier (Grabner et al., 2006). This additionally allows the model to adapt to continuous
changes (e.g., illumination changes in outdoor scenes) while observing the scene.
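The three pixel-based update rules named above can be sketched for a single pixel as follows. This is a minimal sketch in their commonly stated textbook form, not code from the cited papers: the function names, the learning rate `alpha`, the step size `step`, and the observation sequence are assumptions made for illustration.

```python
from statistics import median


def running_average(background, pixel, alpha=0.05):
    """Blend the current pixel into the background with learning rate alpha."""
    return (1.0 - alpha) * background + alpha * pixel


def approximate_median(background, pixel, step=1):
    """Move the background estimate one step towards the current pixel."""
    if pixel > background:
        return background + step
    if pixel < background:
        return background - step
    return background


def temporal_median(recent_pixels):
    """Background estimate as the median of a buffer of recent pixel values."""
    return median(recent_pixels)


# One pixel observed over five frames: stable at 100 with a brief outlier.
observations = [100, 100, 250, 100, 100]

avg = 100.0
med = 100
for value in observations:
    avg = running_average(avg, value, alpha=0.1)
    med = approximate_median(med, value)

buffered = temporal_median(observations)
```

In this toy sequence the single outlier pulls the running average away from 100, whereas the approximate median moves by at most one step per frame and the buffered median ignores the outlier entirely.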
Although it is effective for describing highly dynamic scenes, the main drawback (see also Section 2)
of this on-line learner-based BGM is that its update strategy resembles a self-learning method: the
model updates are based directly on its own classifier predictions, and the model can therefore end
up in a catastrophic state (i.e., the model drifts away). Due to this dependency on its own
predictions, the model performs quite well for a relatively short period of time but eventually tends
to learn foreground objects very quickly, without offering any control over its temporal behavior
(see also Figure 1). Furthermore, the cumbersome update strategy is highly scene dependent and
therefore has to be hand-tuned from time to time.
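The effect of an uncontrolled update on temporal behavior can be illustrated with a toy example. Note that this sketch deliberately swaps in a blindly updated running average for a single pixel as a stand-in; it is not the on-line classifier-based BGM discussed above, and the function name `frames_until_absorbed` and all numeric values are assumptions made for illustration.

```python
def frames_until_absorbed(background, object_value, threshold, alpha):
    """Count the frames in which a static object is still flagged as
    foreground while the background is blindly updated towards it.

    alpha must be greater than 0, otherwise the loop never terminates.
    """
    frames = 0
    while abs(object_value - background) > threshold:
        frames += 1
        # Blind update: the object itself is learned into the background.
        background = (1.0 - alpha) * background + alpha * object_value
    return frames


# A static object of gray value 200 appears on a background of 100.
slow = frames_until_absorbed(100.0, 200.0, threshold=25.0, alpha=0.1)
fast = frames_until_absorbed(100.0, 200.0, threshold=25.0, alpha=0.5)
```

With these toy values, the time after which the object disappears into the background is fixed entirely by the hand-chosen learning rate, which is why such a parameter has to be retuned for each scene.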
To sum up, previously proposed background mod-
Grabner H., Leistner C. and Bischof H. (2008).
In Proceedings of the Third International Conference on Computer Vision Theory and Applications, pages 612-618