for p, we impose a penalty on d. The amount of this
penalty is proportional to the computed difference. If
the difference is large, it is likely that d is the wrong
disparity for pixel p. This information is passed to the
subsequent calculation by imposing a large penalty.
However, if there are disparities whose costs are not much
higher than that of the optimal disparity, or if the optimal
disparity is not unique, the estimate of p’s disparity
remains ambiguous. Such disparities receive only a
small penalty, and it is left to the subsequent computation
of Horizontal Trees to resolve this ambiguity.
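The penalty step can be sketched for a single pixel as follows. This is a minimal illustration, not the authors' implementation: it assumes the "computed difference" is the gap between the aggregated cost of d and the lowest aggregated cost at p, and the names `tree_cost` and `lam` (the proportionality factor) are hypothetical.

```python
def penalized_costs(m, tree_cost, lam):
    """Return modified data costs for one pixel (illustrative sketch).

    m         -- original matching costs, one entry per disparity
    tree_cost -- aggregated costs from the preceding pass (assumed input)
    lam       -- hypothetical proportionality factor of the penalty
    """
    best = min(tree_cost)  # cost of the optimal disparity at this pixel
    # Penalty grows linearly with the distance to the optimal cost, so
    # near-optimal disparities are barely penalized and stay ambiguous.
    return [m_d + lam * (c_d - best) for m_d, c_d in zip(m, tree_cost)]
```

With this form, a disparity that ties with the optimum receives no penalty at all, which matches the remark that non-unique optima are left for the next pass to resolve.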
The updated matching scores then represent the
input to the calculation of Horizontal Trees. We determine
the values of the array H using the modified
data costs m′(). The final disparity for each pixel p is
then selected by $d_p = \arg\min_{d \in D} H[p, d]$.
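The final selection is a per-pixel minimum search over the cost array. A small sketch, assuming H is stored as a list of per-pixel cost lists indexed by disparity (this layout and the function name are illustrative choices, not taken from the paper):

```python
def select_disparities(H):
    """Return, for each pixel, the disparity index with the lowest cost.

    H -- list of per-pixel cost lists; H[p][d] is the cost of disparity d
         at pixel p (assumed layout).
    """
    # min() returns the first minimum, so ties go to the smaller disparity.
    return [min(range(len(costs)), key=costs.__getitem__) for costs in H]
```

How ties are broken is not specified in the text; picking the first minimum is simply what this sketch does.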
2.6 Occlusion Handling
An inherent problem in stereo matching is that of oc-
cluded pixels, i.e. pixels visible in one input image,
but not in the other one. We cannot expect that our
algorithm generates correct depth information in the
absence of a matching point. Even worse, the smoothness
term of our energy function corrupts disparity estimates
for non-occluded pixels by propagating wrong
disparity information gathered in occluded regions.
To handle occlusions, we compute two disparity
maps. The first disparity map $D_R$ is calculated with
the right frame being the reference image. $D_R$ serves
solely to identify the occluded pixels of the left image.
We use $D_R$ to warp the right image into the geometry
of the left view. Pixels of the warped image that do
not receive contribution from at least one pixel of the
right image are marked as being occluded (Bleyer and
Gelautz, 2005). These pixels are recorded in an occlusion
map for the left image denoted by $O_L$. We post-process
$O_L$ by deleting occluded pixels whose left and
right spatial neighbours are marked as non-occluded.
Such pixels typically occur for slanted surfaces that
are oversampled in the left image (Ogale and Aloimonos,
2004). These pixels are not occluded, but violate
the uniqueness constraint.
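The occlusion detection and its post-processing can be sketched for one scanline. This is an illustration under stated assumptions: disparities are integers, a right-image pixel at column x with disparity d maps to column x + d in the left view, and the function names are hypothetical.

```python
def occlusion_map_row(disp_right, width):
    """Mark left-view pixels of one scanline that no right pixel maps to."""
    hit = [False] * width
    for x_r, d in enumerate(disp_right):
        x_l = x_r + d  # assumed convention: left column = right column + d
        if 0 <= x_l < width:
            hit[x_l] = True
    # A pixel without any contribution from the right image is occluded.
    return [not h for h in hit]


def remove_isolated(occ):
    """Clear occluded flags whose left and right neighbours are both visible."""
    out = list(occ)
    for x in range(1, len(occ) - 1):
        if occ[x] and not occ[x - 1] and not occ[x + 1]:
            out[x] = False
    return out
```

Border pixels have only one horizontal neighbour, so this sketch leaves their flags unchanged; the text does not say how they are treated.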
The second disparity map $D_L$ is computed with the
left image being the reference frame. At this point, we
do not attempt to assign “correct” disparities to the
occluded pixels of $O_L$. We do, however, try to prevent
these occluded pixels from propagating wrong disparities.
We therefore extend the smoothness term of equation
(3) by an additional constraint: if at least one of the
two neighbouring pixels is marked as occluded in $O_L$,
the smoothness penalty is set to zero. As a consequence,
an occluded pixel does not influence the disparity
assignments of its neighbouring pixels through the
smoothness penalty.
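The additional constraint amounts to a guard around the existing smoothness term. In the sketch below, `base` is a hypothetical stand-in for the smoothness function of equation (3), which is passed in rather than reproduced here.

```python
def occlusion_aware_smoothness(d_p, d_q, occ_p, occ_q, base):
    """Smoothness penalty between neighbouring pixels p and q.

    d_p, d_q     -- disparities assigned to p and q
    occ_p, occ_q -- occlusion flags of p and q taken from the occlusion map
    base         -- callable standing in for the original smoothness term
    """
    if occ_p or occ_q:
        # An occluded endpoint switches the penalty off, so the pixel
        # cannot pull its neighbour towards a wrong disparity.
        return 0.0
    return base(d_p, d_q)
```

Passing the base term as a callable keeps the guard independent of how the original penalty is defined.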
The final disparity map of our algorithm is derived
from $D_L$ by overwriting the estimated disparities
for occluded pixels with more “meaningful” disparity
values. For each occluded pixel p in the occlusion
map $O_L$, we search p’s closest non-occluded pixels
on the same horizontal scanline in left and right
directions. We determine the minimum of both pixels’
disparities and assign this disparity to p.
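The filling step can be sketched for one scanline as follows. The function name is hypothetical, and the handling of pixels that have a non-occluded neighbour on only one side (use that side) or on neither side (keep the original value) is an assumption, since the text does not cover these cases.

```python
def fill_occlusions_row(disp, occ):
    """Overwrite occluded disparities of one scanline.

    disp -- disparities of the scanline
    occ  -- occlusion flags of the same scanline
    """
    n = len(disp)
    out = list(disp)
    for x in range(n):
        if not occ[x]:
            continue
        # Closest non-occluded disparity to the left and to the right.
        left = next((disp[i] for i in range(x - 1, -1, -1) if not occ[i]), None)
        right = next((disp[i] for i in range(x + 1, n) if not occ[i]), None)
        candidates = [d for d in (left, right) if d is not None]
        if candidates:  # assumption: keep the old value if the row is fully occluded
            out[x] = min(candidates)
    return out
```

The search always reads from the original disparities, so a filled value never serves as the source for another occluded pixel.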
3 EXPERIMENTAL RESULTS
We use the Middlebury data set (Scharstein and
Szeliski, 2002) to evaluate the results of our algo-
rithm. The test set consists of four stereo pairs for
which ground truth data is provided. Results of our
algorithm on these stereo images are shown in Fig-
ure 5. All disparity maps have been generated using
constant parameter settings. (The parameters are set
to $P_1 = 20$, $P'_2 = 30$, $P_3 = 4$, $T = 30$ and $\lambda = 0.025$.)
Obviously, we could improve the results by tuning the
parameters for each image pair separately.
The disparity maps in Figure 5 show that our al-
gorithm produces smooth disparity results. Due to
the occlusion handling procedure and the algorithm’s
pixel-based nature, disparity discontinuities appear to
be correctly captured. The algorithm also preserves
details in the disparity map (e.g., sticks in the cup
of the Cones data set). As opposed to other DP ap-
proaches, our disparity maps seem to be almost free of
streaks. This can be attributed to the structure of our
trees that captures horizontal and vertical smoothness
edges. Our disparity maps also hardly contain iso-
lated pixels. These are typical artefacts produced by
the Semi-Global Approach of Hirschmüller (2005)
when the DP paths do not capture enough texture.
Our approach avoids this problem by applying trees
that span all pixels of the reference image. A relatively
large error is, however, found to the right of the
pink teddy in the Teddy data set. This region is virtually
free of texture, and image noise biases the results
towards the wrong disparity.
We use the Middlebury stereo website (Scharstein
and Szeliski, 2002) for a quantitative comparison
against competing approaches. Currently, our algorithm
ranks eighth among approximately
30 algorithms in the Middlebury online table. Most
methods that achieve a better ranking build on graph-
cuts or belief propagation. They are far slower than
the proposed algorithm, which makes a compari-
son partially unfair. Moreover, all of the better-
performing techniques apply colour segmentation and
most of them use the segmentation information as a
VISAPP 2008 - International Conference on Computer Vision Theory and Applications