Figure 3: Results for occluding salient objects in the foreground and distracting salient objects in the background: (a) original left stereo image; (b) ROI highlighted using the saliency-based mask M; (c) pseudo-colored nonlinearly quantized depth image (D_Q) highlighted using the binary mask M, with planes from the closest to the deepest shown in red, green, and blue; and (d) final ROIs highlighted in the actual image.
4 EXPERIMENTAL RESULTS
We illustrate the performance of our algorithm on a setting (Figure 3) consisting of stereo images captured in a laboratory environment with two aligned, identical cameras fixed on a professional stereo stand. This and other stereo image pairs, along with the experimental results, are currently posted at http://mlab.fau.edu/stereo/roi3d.zip.
In our experiments, a 3-level (L_1, L_2, L_3) quantization was used, according to Equation 1, with threshold values obtained empirically: T_1 = 11 and T_2 = 23.
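For concreteness, the following Python sketch shows one plausible reading of this quantization step. Since Equation 1 is not reproduced in this section, we assume it assigns each pixel of the disparity map to one of the three planes by comparing its disparity against T_1 and T_2, with larger disparities mapped to closer planes; the function name and plane encoding are illustrative.

```python
import numpy as np

def quantize_disparity(disparity, t1=11, t2=23):
    """Map a disparity image to three depth planes L1 (closest),
    L2 (middle), and L3 (deepest).

    Hypothetical reading of Equation 1: each pixel is assigned to a
    plane by comparing its disparity against the empirical thresholds
    T1 and T2 (larger disparity = closer to the cameras).
    """
    dq = np.full(disparity.shape, 3, dtype=np.uint8)  # L3: deepest plane (blue)
    dq[disparity > t1] = 2                            # L2: middle plane (green)
    dq[disparity > t2] = 1                            # L1: closest plane (red)
    return dq
```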
Figure 3 shows a case in which occluding salient objects in the foreground and distracting objects in the background are segmented. Note that a bright yellow distracter in the foreground is not perceived as such by the algorithm, resulting in a false negative. While the 2D ROI extraction fails to discriminate between the two foreground objects and fails to identify the background objects as such, our proposed algorithm successfully discriminates between the two foreground ROIs and identifies all background ROIs.
5 CONCLUSIONS
Object and region segmentation from 2D data is not always a straightforward task. In particular, it can be impossible to segment occluded objects because depth information has been lost. In this work we extended a previously proposed method for 2D region-of-interest extraction with depth information. A disparity map was generated from two views using the method of Birchfield and Tomasi (Birchfield and Tomasi, 1999). Using this depth information, we were able to differentiate occluding regions of interest. Our experiments demonstrate the promise of this approach, but stress the need for well-chosen nonlinear quantization thresholds for the disparity map. We are continuing work on this approach by developing a method for automatically determining these quantization thresholds and by extending the approach to a variety of applications. We are currently obtaining quantitative results to further validate our method.
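To make the overall pipeline concrete, the sketch below outlines the steps in Python with OpenCV. It is an illustrative approximation rather than the implementation described above: OpenCV's StereoSGBM matcher stands in for the Birchfield-Tomasi algorithm (its matching cost is based on the Birchfield-Tomasi sub-pixel metric, though the overall algorithm differs), and the binary saliency mask M is assumed to come from an external attention model such as that of Itti et al. (1998).

```python
import cv2
import numpy as np

def extract_3d_rois(left_img, right_img, saliency_mask, t1=11, t2=23):
    """Illustrative pipeline: disparity map -> 3-level quantization ->
    depth-aware ROIs. Hypothetical function; StereoSGBM stands in for
    the Birchfield-Tomasi matcher, and `saliency_mask` is the binary
    mask M produced by an external saliency model."""
    # 1. Estimate a disparity map from the rectified stereo pair.
    gray_l = cv2.cvtColor(left_img, cv2.COLOR_BGR2GRAY)
    gray_r = cv2.cvtColor(right_img, cv2.COLOR_BGR2GRAY)
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64,
                                    blockSize=7)
    disparity = matcher.compute(gray_l, gray_r).astype(np.float32) / 16.0

    # 2. Nonlinearly quantize into three depth planes (cf. Section 4).
    dq = np.full(disparity.shape, 3, dtype=np.uint8)  # L3: deepest
    dq[disparity > t1] = 2                            # L2: middle
    dq[disparity > t2] = 1                            # L1: closest

    # 3. Intersect the salient region with each plane, separating
    #    occluding foreground ROIs from background distracters.
    return {plane: (saliency_mask > 0) & (dq == plane) for plane in (1, 2, 3)}
```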
ACKNOWLEDGEMENTS
This research was partially sponsored by UOL
(www.uol.com.br), through its UOL Bolsa Pesquisa
program, process number 200503312101a.
REFERENCES
Birchfield, S. and Tomasi, C. (1999). Depth discontinuities by pixel-to-pixel stereo. International Journal of Computer Vision, 35(3):269–293.
Cox, I., Hingorani, S., Rao, S., and Maggs, B. (1996). A maximum likelihood stereo algorithm. Computer Vision and Image Understanding, 63(3):542–567.
Itti, L., Koch, C., and Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. on PAMI, 20(11):1254–1259.
Marques, O., Mayron, L. M., Borba, G. B., and Gamba, H. R. (2007). An attention-driven model for grouping similar images with image retrieval applications. EURASIP Journal on Applied Signal Processing.
Stentiford, F. (2003). An attention based similarity measure with application to content-based information retrieval. In Proceedings of the Storage and Retrieval for Media Databases Conference, SPIE Electronic Imaging, Santa Clara, CA.
Styles, E. A. (2005). Attention, Perception, and Memory: An Integrated Introduction. Taylor & Francis Routledge, New York, NY.