5 CONCLUSION
This paper presented a new disparity contour grouping method that
isolates and distinguishes multiple foreground objects in scenes
with rapid illumination and texture changes. Without requiring full
stereo reconstruction or tedious empirical parameter tuning, the
method achieves near-real-time performance in software. It generates
not only the 2D image locations of objects but also their boundaries
and disparity information, providing a natural extension to 3D
processing. Because no assumptions are made about the shapes or
textures of the objects or the environment, the proposed approach
suits generic object segmentation tasks.
ACKNOWLEDGEMENTS
The authors thank Jianfeng Yin for his help with room geometry
measurement and video acquisition, and Jeremy R. Cooperstock for
providing essential research facilities.