2 BASIC NOTES
In (Prati et al., 2003), the authors distinguished de-
terministic methods (e.g. (Cucchiara et al., 2001)),
which use on/off decision processes at each pixel, and
statistical approaches (see (Mikic et al., 2000)) which
use probability density functions to describe the
shadow membership of a given image point. Whether a
method counts as deterministic or statistical often
depends only on interpretation, since deterministic
decisions can also be expressed through probabilistic
functions. However, statistical methods
have become widespread recently, since they can
be used together with Markov Random Fields (MRF)
to enhance the quality of the segmentation signifi-
cantly (Wang et al., 2006).
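The remark that a deterministic decision can also be phrased with probabilistic functions can be illustrated with a minimal sketch. It assumes a single scalar feature x and a Gaussian shadow likelihood; the function names, the parameters mu and sigma, and the factor k are hypothetical and are not taken from the cited methods.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def is_shadow_deterministic(x, mu, sigma, k=2.0):
    """On/off rule: shadow iff x lies within k standard deviations of mu."""
    return abs(x - mu) <= k * sigma

def is_shadow_probabilistic(x, mu, sigma, k=2.0):
    """Same decision, written as a threshold on the (assumed) shadow likelihood."""
    threshold = gaussian_pdf(mu + k * sigma, mu, sigma)
    return gaussian_pdf(x, mu, sigma) >= threshold
```

Away from the interval boundary, where floating-point rounding could matter, both rules return the same label, which is why the deterministic/statistical distinction is partly a matter of interpretation.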
First, we developed a deterministic method which
classifies the pixels independently, since that way, we
could perform a relevant quantitative comparison of
the different color spaces. After that, we gave a
probabilistic interpretation to this model and inserted
it into an MRF framework which we developed earlier
(Benedek and Szirányi, 2006). We compared the
different results after MRF optimization qualitatively
and observed similar relative performance of the color
spaces to the deterministic model.
Another important aspect of the categorization of
the algorithms in (Prati et al., 2003) is
the distinction between the non-parametric and
parametric cases. Non-parametric, or 'shadow invariant',
methods convert the video images into an illuminant
invariant feature space: they remove shadows instead
of detecting them. This task is often performed by a
color space transformation; widely used
illumination-invariant color spaces include the normalized rgb
(Cavallaro et al., 2004), (Paragios and Ramesh, 2001)
and C1C2C3 spaces (Salvador et al., 2004). (We refer
later to the normalized rgb as rg space, since
the third color component is determined by the first
and second.) In (Salvador et al., 2004) we find an
overview of these approaches, indicating that several
assumptions are needed regarding the reflecting
surfaces and the lighting. We have found in our
experiments that these assumptions are usually not fulfilled
in an outdoor environment, where these methods often
fail. Moreover, we show later that the rg
and C1C2C3 spaces are also less effective in the
parametric case.
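As an illustration of the two color spaces named above, the following minimal sketch converts a single RGB triple. The formulas are the ones commonly given in the literature for normalized rgb and C1C2C3; they are not spelled out in this section, so they should be read as an assumption here. The function names and the handling of a black pixel are our own choices.

```python
import math

def rgb_to_rg(r, g, b):
    """Normalized rgb chromaticity; only (r, g) is kept because the
    three normalized components sum to 1."""
    total = r + g + b
    if total == 0:
        # Convention for a black pixel (an assumption of this sketch).
        return (1.0 / 3.0, 1.0 / 3.0)
    return (r / total, g / total)

def rgb_to_c1c2c3(r, g, b):
    """C1C2C3: arctangent of each channel over the maximum of the other two.
    atan2 is used so that a zero denominator does not raise an error."""
    c1 = math.atan2(r, max(g, b))
    c2 = math.atan2(g, max(r, b))
    c3 = math.atan2(b, max(r, g))
    return (c1, c2, c3)
```

Both mappings are unchanged when all three channels are scaled by the same factor, which is the idealized shadow behavior that these spaces rely on; a pixel darkened from (200, 100, 50) to (100, 50, 25) maps to the same coordinates.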
For the above reasons, we developed a parametric
model: we extracted feature vectors from the actual
and mean background values of the pixels and
formulated shadow detection as a classification problem
in that feature space. This approach is widespread in
the literature, and the key points are the way of feature
extraction, the color space selection and the shadow-
domain description in the feature space. In Section
3, we introduce the feature vector which character-
izes the shadowed pixels effectively. In Section 4,
we describe the chosen shadow domain in the feature
space, and define the deterministic pixel classification
method. We show the quantitative classification re-
sults with the deterministic model regarding five real-
world video sequences in Section 5. Finally, we in-
troduce the MRF framework and analyse the segmen-
tation results in Section 6.
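The general parametric scheme outlined above (feature extraction from current and mean background values, followed by classification in the feature space) can be sketched as follows. This is only an illustration under stated assumptions: the ratio feature, the box-shaped shadow domain and the thresholds low, high and bg_tol are hypothetical placeholders, not the feature vector or shadow domain defined in Sections 3 and 4.

```python
def ratio_feature(pixel, background_mean, eps=1e-6):
    """Hypothetical feature: per-channel ratio of the current value
    to the mean background value."""
    return tuple(p / max(m, eps) for p, m in zip(pixel, background_mean))

def classify_pixel(pixel, background_mean, low=0.4, high=0.9, bg_tol=0.1):
    """Label one pixel independently of its neighbors.

    background: every ratio is within bg_tol of 1
    shadow:     every ratio lies in the (assumed) box [low, high]
    foreground: anything else
    """
    feature = ratio_feature(pixel, background_mean)
    if all(abs(f - 1.0) <= bg_tol for f in feature):
        return "background"
    if all(low <= f <= high for f in feature):
        return "shadow"
    return "foreground"
```

For example, with a background mean of (200, 180, 170), the pixel (120, 110, 100) is uniformly darkened and falls into the assumed shadow box, while (30, 200, 40) changes the channels inconsistently and is labeled foreground.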
We use three assumptions in the paper: (1) The cam-
era stands in place and has no significant ego-motion.
(2) The background objects are static (e.g. there is no
waving river in the background), and the currently valid
'background image' is available at each moment (e.g.
by the method of (Stauffer and Grimson, 2000)). (3)
There is one emissive light source in the scene (the
sun or an artificial source), but we consider the pres-
ence of additional effects (e.g. reflection), which may
change the spectrum of illumination locally.
3 FEATURE VECTOR
Here, we define features for a parametric case where
a shadow model can be constructed including some
challenging environmental conditions. First, we
introduce a well-known physical approach to shadow
detection, noting that its model assumptions
may not be fulfilled in real-world video scenes.
Instead of constructing a more complex illumination
model, we handle the resulting artifacts with a
statistical description. Finally, the efficiency of the
proposed model is validated by experiments.
3.1 Physical Approach on Shadow
Detection
According to the illumination model (Forsyth, 1990),
the response g(s) of a given image sensor placed at
pixel s can be written as
g(s) = ∫ e(λ, s) ρ(λ, s) ν(λ) dλ, (1)
where e(λ, s) is the illumination function, ρ(λ, s)
depends on the surface albedo and geometry, and ν(λ) is
the sensor sensitivity. Accordingly, the difference in
the shadowed and illuminated background values of
a given surface point is caused by the different local
value of e(λ, s) only. For example, outdoors, the il-
lumination function is the composition of the direct
(sun), diffused (sky) and reflected (from other non-
emissive objects) light components in the illuminated