textures, the approaches using additional features
or formulations were proposed. Mittal and Paragios (2004) introduced
optical flow as a supplemental feature for the background model. Using
the spatial image gradient and temporal derivatives, their approach
suppressed false detections in video frames of an ocean and of a park
with trees. Sheikh and Shah (2005) proposed models of both the
background and the foreground. They collected foreground features from
the image immediately preceding the current new image and used those
features to compute the probability modelling the foreground. Since
both models, by Mittal and Paragios (2004) and by Sheikh and Shah
(2005), use features from the image immediately before the current new
image, they work well on problems such as video, where the continuity
of the image sequence is preserved. However, their performance degrades
as the interval between two images in the sequence becomes larger.
In this paper, to overcome the false detection problems caused by
dynamic textures and noise, a new approach that directly models the
foreground is proposed. The proposed model is a probability
distribution over pixel positions, defined in terms of the sum of
weighted intensity differences between the previous images and the new
image. Based on this foreground position model, each pixel position in
the new image whose probability value is greater than a threshold is
classified as part of the foreground.
2 MODELLING A FOREGROUND
2.1 Foreground Position Distribution
A foreground is defined to be a set of objects with motion. If an
object has motion, its position changes from image to image. This means
that the object's motion causes intensity differences at the associated
positions. Thus, if the intensity difference at a given position
between two compared images is relatively large, that position is more
likely to belong to the foreground. Based on this consideration, the
probability that each position in the new image belongs to the
foreground can be defined in terms of the sum of its weighted intensity
differences between the new image and all the previous images.
More formally, let the number of previous images in a sequence be $N$
and let $S$ be the set of pixel positions on an image,
$S = \{(x, y) \mid 1 \le x \le W,\ 1 \le y \le H\}$, where $W$ is the
width and $H$ is the height of an image. Let $I_{j,k}$, for $j \in S$
and $k = 1, \ldots, N, N+1$, be the intensity value of the position $j$
on the $k$th image, where the $(N+1)$th image is the new image. If $L$
is a random variable over all positions of pixels on the image, the
probability $P_f(L = l)$ that a pixel at the position $l \in S$ on the
new image becomes a foreground is given by

$$P_f(L = l) = \frac{1}{Z} \sum_{t=1}^{N} \sum_{m \in S} w\left(\left|I_{l,N+1} - I_{m,t}\right|\right) \cdot \delta(l, m) \tag{1}$$
where $w : \mathbb{R} \to \mathbb{R}$ is a weight function with
$w(x) = \log(1 + x)$ (applied here only to non-negative absolute
differences). The weight function keeps an intensity difference from
growing too large once it is already large enough to identify a
foreground position. It preserves the property of the sum of the
original differences in identifying each position, and it also gives a
reasonable range of probability values for determining the threshold
that distinguishes the foreground from the dynamic textures and the
background, as explained in Section 3.
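To make the compressing effect of the weight function concrete, the following is a minimal sketch in Python with NumPy. The sample differences are hypothetical values on an 8-bit intensity scale, not data from the paper, and the function name `weight` is an illustrative choice.

```python
import numpy as np

def weight(x):
    """w(x) = log(1 + x), applied to non-negative intensity differences."""
    return np.log1p(x)

# Hypothetical absolute intensity differences on an 8-bit scale.
diffs = np.array([0.0, 1.0, 10.0, 100.0, 255.0])
weights = weight(diffs)

# Large differences are compressed, but their ordering is preserved.
for d, w in zip(diffs, weights):
    print(f"difference {d:6.1f} -> weight {w:.3f}")
```

The weights grow monotonically with the difference, so a larger difference still counts for more, while the largest possible 8-bit difference contributes a weight below 6.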
$Z$ is a normalizing constant given by

$$Z = \sum_{t=1}^{N} \sum_{l \in S} \sum_{m \in S} w\left(\left|I_{l,N+1} - I_{m,t}\right|\right) \cdot \delta(l, m) \tag{2}$$
where the position matching function $\delta$, which returns 1 only
when the two positions are identical, is such that

$$\delta(l, m) = \begin{cases} 1, & \text{if } l = m \\ 0, & \text{otherwise.} \end{cases} \tag{3}$$
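Because the matching function keeps only the terms with m = l, the inner sum of equation (1) collapses to same-position differences. A minimal sketch of equations (1) to (3) under that observation is given below, in Python with NumPy. The function name, the toy images, the uniform fallback for an all-zero score map, and the threshold value are all assumptions made for illustration; the paper determines its threshold separately in Section 3.

```python
import numpy as np

def foreground_distribution(previous, new):
    """Sketch of equation (1) with the exact matching function delta.

    previous: array of shape (N, height, width), the N previous images.
    new:      array of shape (height, width), the new image.
    Returns a (height, width) array of probabilities that sums to 1.
    """
    previous = np.asarray(previous, dtype=float)
    new = np.asarray(new, dtype=float)
    # delta(l, m) is 1 only for m = l, so only same-position terms survive.
    scores = np.log1p(np.abs(new[None, :, :] - previous)).sum(axis=0)
    z = scores.sum()  # normalizing constant Z of equation (2)
    if z == 0:
        # Assumption: with no intensity change anywhere, return a uniform map.
        return np.full(new.shape, 1.0 / new.size)
    return scores / z

# Hypothetical toy data: a static 4x4 scene and one pixel that changes.
previous = np.full((3, 4, 4), 50.0)
new = np.full((4, 4), 50.0)
new[1, 2] = 200.0

p = foreground_distribution(previous, new)

# Hypothetical threshold (the uniform probability), used only for this demo.
threshold = 1.0 / new.size
mask = p > threshold
print(mask.astype(int))
```

In this toy case only the changed pixel has a non-zero score, so it is the only position marked as foreground.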
2.2 Approximating Foreground Position Distribution
In computing the probability of equation (1), it is not easy to align
two compared images exactly enough to compute the position matching
function $\delta(l, m)$. To allow for slight misalignment, it is
approximated with the Gaussian kernel function $K(l - m)$ (Bishop,
2006). The approximated probability $\hat{P}_f(L = l)$ is then

$$\hat{P}_f(L = l) = \frac{1}{\hat{Z}} \sum_{t=1}^{N} \sum_{m \in S} w\left(\left|I_{l,N+1} - I_{m,t}\right|\right) \cdot K(l - m) \tag{4}$$
where $\hat{Z}$ is a normalizing constant and $K$ is a Gaussian kernel
function. The normalizing constant $\hat{Z}$ is given by

$$\hat{Z} = \sum_{t=1}^{N} \sum_{l \in S} \sum_{m \in S} w\left(\left|I_{l,N+1} - I_{m,t}\right|\right) \cdot K(l - m) \tag{5}$$
and the Gaussian kernel function $K : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is
such that

$$K(l - m) = \frac{1}{2\pi} \frac{1}{|H|^{1/2}} \exp\left(-\frac{1}{2} (l - m)^{T} H^{-1} (l - m)\right) \tag{6}$$
where l − m is a 2-dimensional vector and H is a 2×2
bandwidth matrix of the kernel.
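
Equations (4) to (6) can be sketched as follows, in Python with NumPy. This is an illustration under stated assumptions, not the authors' implementation: the sum over m is truncated to a square window of `radius` pixels around l (the equations sum over all of S, which the sketch matches only when the window covers the whole image), the function names and toy data are hypothetical, and an all-zero score map falls back to a uniform distribution. In the code, `H` denotes the bandwidth matrix, while the image size is named `height` and `width`.

```python
import numpy as np

def gaussian_kernel(d, H):
    """Equation (6): 2-D Gaussian kernel at offset d with bandwidth matrix H."""
    d = np.asarray(d, dtype=float)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(H)))
    return norm * np.exp(-0.5 * d @ np.linalg.inv(H) @ d)

def approx_foreground_distribution(previous, new, H, radius=2):
    """Sketch of equation (4); `radius` truncates the sum over m (assumption)."""
    previous = np.asarray(previous, dtype=float)
    new = np.asarray(new, dtype=float)
    height, width = new.shape
    scores = np.zeros((height, width))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # The kernel is symmetric, so K(l - m) = K(m - l) = K((dx, dy)).
            k = gaussian_kernel((dx, dy), H)
            # Positions l for which m = l + (dx, dy) lies inside the image.
            y0, y1 = max(0, -dy), min(height, height - dy)
            x0, x1 = max(0, -dx), min(width, width - dx)
            if y0 >= y1 or x0 >= x1:
                continue
            new_part = new[y0:y1, x0:x1]
            prev_part = previous[:, y0 + dy:y1 + dy, x0 + dx:x1 + dx]
            diff = np.abs(new_part[None, :, :] - prev_part)
            scores[y0:y1, x0:x1] += k * np.log1p(diff).sum(axis=0)
    z = scores.sum()  # normalizing constant of equation (5)
    if z == 0:
        # Assumption: with no intensity change anywhere, return a uniform map.
        return np.full(new.shape, 1.0 / new.size)
    return scores / z

# Hypothetical toy data: a noisy static 5x5 scene and one pixel that changes.
rng = np.random.default_rng(0)
previous = 50.0 + rng.normal(0.0, 1.0, size=(3, 5, 5))
new = 50.0 + rng.normal(0.0, 1.0, size=(5, 5))
new[2, 2] = 200.0

H = np.eye(2)  # hypothetical bandwidth matrix
p_hat = approx_foreground_distribution(previous, new, H, radius=2)
print(np.unravel_index(p_hat.argmax(), p_hat.shape))
```

Unlike the exact matching function, the kernel lets nearby positions in the previous images contribute to the score at l, with a contribution that decays with the distance between l and m.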
MODELLING A FOREGROUND FOR BACKGROUND SUBTRACTION FROM IMAGES - Probability Distribution of
Pixel Positions based on Weighted Intensity Differences