contains some noisy pixels and holes, which are
handled by a post-processing procedure. To remove
them, morphological erosion with a 3×3 structuring
element is first applied to $MO_t$; the connected
component algorithm (Haralick and Shapiro, 1992)
is then used to remove small connected regions,
using the threshold $T_{sre}$; and finally,
morphological dilation with a 3×3 structuring
element is applied to obtain the final
foreground/background segmentation results (an
example is shown in Figure 4).
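The three post-processing steps can be illustrated with a minimal, dependency-free Python sketch. It is not the implementation used in this study. The function names, the use of 8-connectivity, the treatment of pixels outside the image as background, and the rule that regions with fewer than `t_sre` pixels are removed are all assumptions made for illustration.

```python
from collections import deque

NEIGHBOURHOOD = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _window(mask, y, x):
    """Yield the 3x3 window around (y, x); out-of-image pixels count as 0."""
    h, w = len(mask), len(mask[0])
    for dy, dx in NEIGHBOURHOOD:
        ny, nx = y + dy, x + dx
        yield mask[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0


def erode3x3(mask):
    """Binary erosion: a pixel survives only if its whole 3x3 window is set."""
    return [[int(all(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate3x3(mask):
    """Binary dilation: a pixel is set if any pixel in its 3x3 window is set."""
    return [[int(any(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def remove_small_regions(mask, t_sre):
    """Keep only 8-connected regions with at least t_sre pixels (assumed rule)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected region.
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for dy, dx in NEIGHBOURHOOD:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= t_sre:
                for ry, rx in region:
                    out[ry][rx] = 1
    return out


def postprocess(mask, t_sre):
    """Erosion, small-region removal, then dilation, in the order described."""
    return dilate3x3(remove_small_regions(erode3x3(mask), t_sre))
```

Here `mask` is a binary image stored as a list of rows of 0/1 values. Calling `postprocess(mask, t_sre)` returns a new mask of the same size and leaves the input unchanged.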
3 EXPERIMENTAL RESULTS
In this study, 12 test video sequences and their
corresponding hand-segmented ground truths are
employed: “Office,” “Outdoor,” “Browse1,”
“LightSwitch,” “NightCar,” “IntelligentRoom,”
“ParkingLot,” “OneLeaveShopReenter,”
“WavingTree1,” “WavingTree2,” “Raining,” and
“Boat.” Sequences 1-3 contain gradual
illumination variations, whereas sequence 4 contains
large illumination variations. Sequences 5-8 contain
both gradual illumination variations and shadow
effects. Finally, sequences 9-12 contain
dynamic scenes, such as waving trees, rain, and
moving water. To evaluate the effectiveness of the
proposed approach, two comparison methods,
namely, self-organizing background subtraction
(SOBS) (Maddalena and Petrosino, 2008) and the
spatially distributed model (SDM) (Dickinson et al.,
2009), are implemented in this study. The parameter
values and thresholds used in the proposed approach,
which are determined empirically in this study, are
listed in Table 1.
To evaluate the performance of the three
approaches (the proposed approach and the two
comparison methods), the Jaccard coefficient $J_c$
of Rosin and Ioannidis (2003) and the total error
$E_{tot}$ of Toyama et al. (1999) are employed. A pixel
classified as “foreground” by both the
approach and the ground truth is denoted as a “true
positive” (TP). If it is classified as “foreground” by
only the approach, it is a “false positive”
(FP). If it is classified as “foreground” by only the
ground truth, it is a “false negative” (FN).
If TP, FP, and FN denote the numbers of “true
positive,” “false positive,” and “false negative”
pixels in a video sequence, respectively, then
$$J_c = \frac{TP}{TP + FP + FN}, \tag{14}$$

$$E_{tot} = FP + FN. \tag{15}$$
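As an illustration of Equations (14) and (15), the following minimal Python sketch accumulates TP, FP, and FN over all frames of a sequence and then evaluates both measures. The function names and the mask representation (each frame as a list of rows of 0/1 values) are hypothetical, and returning 1.0 when TP + FP + FN = 0 is an assumed convention that is not stated in this study.

```python
def count_errors(pred_frames, truth_frames):
    """Count TP, FP, and FN pixels over all frames of one sequence."""
    tp = fp = fn = 0
    for pred, truth in zip(pred_frames, truth_frames):
        for pred_row, truth_row in zip(pred, truth):
            for p, t in zip(pred_row, truth_row):
                if p and t:
                    tp += 1      # foreground in both approach and ground truth
                elif p:
                    fp += 1      # foreground only in the approach
                elif t:
                    fn += 1      # foreground only in the ground truth
    return tp, fp, fn


def jaccard_coefficient(tp, fp, fn):
    """Equation (14): J_c = TP / (TP + FP + FN)."""
    denom = tp + fp + fn
    # Assumed convention: two empty masks are treated as a perfect match.
    return tp / denom if denom else 1.0


def total_error(fp, fn):
    """Equation (15): E_tot = FP + FN."""
    return fp + fn


pred = [[[1, 1, 0],
         [0, 1, 0]]]
truth = [[[1, 0, 0],
          [0, 1, 1]]]
tp, fp, fn = count_errors(pred, truth)
print(jaccard_coefficient(tp, fp, fn), total_error(fp, fn))  # prints: 0.5 2
```

A higher $J_c$ and a lower $E_{tot}$ both indicate closer agreement with the ground truth.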
Figure 5 shows that, compared with the two
comparison approaches, the proposed approach can
handle video sequences containing shadow effects and
gradual illumination variations. Figure 6 shows
that it can also handle video sequences
containing dynamic scenes, such as waving trees,
rain, and moving water.
Additionally, in terms of the two performance
indexes listed in Table 2, namely, the Jaccard
coefficient and the total error, the proposed
approach performs better than the two comparison
approaches.
4 CONCLUDING REMARKS
In this study, a video foreground/background
segmentation approach using a spatially distributed
model and edge-based shadow cancellation is
proposed to deal with video sequences containing
illumination variations, shadow effects, and dynamic
scenes. Based on the experimental results obtained
in this study, the proposed approach provides better
video segmentation results than the two comparison
methods.
REFERENCES
Adams, R. and Bischof, L., 1994. Seeded region growing.
IEEE Trans. on Pattern Analysis and Machine
Intelligence, 16(6), 641-647.
Boley, D., 1998. Principal direction divisive partitioning.
Data Mining and Knowledge Discovery, 2(4), 325-344.
Canny, J. F., 1986. A computational approach to edge
detection. IEEE Trans. on Pattern Analysis and
Machine Intelligence, 8(11), 679-698.
Dickinson, P., Hunter, A., and Appiah, K., 2009. A
spatially distributed model for foreground
segmentation. Image and Vision Computing, 27(9),
1326-1335.
Haralick, R. M. and Shapiro, L. G., 1992. Computer and
Robot Vision, 28-48. Reading, MA: Addison-Wesley.
Heikkila, M. and Pietikainen, M., 2006. A
texture-based method for modeling the background
and detecting moving objects. IEEE Trans. on Pattern
Analysis and Machine Intelligence, 28(4), 657-662.
Kim, C. and Hwang, J. N., 2002. Fast and automatic video
object segmentation and tracking for content-based
applications. IEEE Trans. on Circuits and Systems for
Video Technol., 12(2), 122-129.
Li, L. et al., 2004. Statistical modeling of complex
backgrounds for foreground object detection. IEEE
Trans. on Image Process., 13(11), 1459-1472.
Maddalena, L. and Petrosino, A., 2008. A self-organizing
approach to background subtraction for visual
SIGMAP 2012 - International Conference on Signal Processing and Multimedia Applications