when N(p) = 0 (Eq. 23). This occurs mostly near the
borders of the disparity maps, but can also manifest
itself anywhere in the image where the occlusions are
large enough.
We will therefore estimate a value for the remain-
ing invalid disparities, and store them in the corrected
disparity map D_I(p). For each pixel with an invalid
disparity value, we search to the left and to the right
on its scanline for the closest valid disparity value.
The disparity map is not updated iteratively, so that
only the information of D_B(p) is used for each pixel.
The result is shown in Fig. 4(i).
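The search can be implemented with one thread per pixel. The following is a minimal CUDA sketch of this hole-filling step; the kernel name, the INVALID marker and the flat buffer layout are illustrative assumptions rather than the exact implementation.

// Fills invalid disparities in D_B with the closest valid value on the same
// scanline, writing the result to D_I. Only D_B is read, so the output does
// not depend on the order in which pixels are processed.
__global__ void fillInvalidDisparities(const float* d_B, float* d_I,
                                       int width, int height, float INVALID)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int idx = y * width + x;
    float d = d_B[idx];
    if (d != INVALID) { d_I[idx] = d; return; }

    // Search left and right for the nearest valid disparity on this scanline.
    float left = INVALID, right = INVALID;
    int distL = 0, distR = 0;
    for (int xl = x - 1; xl >= 0; --xl)
        if (d_B[y * width + xl] != INVALID) { left = d_B[y * width + xl]; distL = x - xl; break; }
    for (int xr = x + 1; xr < width; ++xr)
        if (d_B[y * width + xr] != INVALID) { right = d_B[y * width + xr]; distR = xr - x; break; }

    if (left == INVALID)       d_I[idx] = right;                           // nothing valid to the left
    else if (right == INVALID) d_I[idx] = left;                            // nothing valid to the right
    else                       d_I[idx] = (distL <= distR) ? left : right; // keep the closest value
}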
6.3 Median Filter
In the last refinement step, small disparity outliers are
filtered using a median filter, resulting in the final
disparity map D_M(p) shown in Fig. 4(j). A me-
dian filter has the property of removing speckle noise,
in this case caused by disparity mismatches, while re-
turning a sharp signal (unlike an averaging filter). In
our method, we calculate the median for each pixel
over a 3 × 3 window using a fast bubble sort (Astra-
chan, 2003) implementation in CUDA.
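A minimal CUDA sketch of this step is given below; the kernel name and the border handling (clamping) are illustrative assumptions, while the fixed-size bubble sort follows the description above.

// 3x3 median filter on the disparity map D_I, producing D_M. The nine
// neighbourhood values are sorted in registers with a bubble sort, which is
// cheap for such a small, fixed window size.
__global__ void median3x3(const float* d_I, float* d_M, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Gather the 3x3 neighbourhood, clamping coordinates at the image borders.
    float v[9];
    int n = 0;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            int xx = min(max(x + dx, 0), width - 1);
            int yy = min(max(y + dy, 0), height - 1);
            v[n++] = d_I[yy * width + xx];
        }

    // Bubble sort of the nine values.
    for (int i = 0; i < 8; ++i)
        for (int j = 0; j < 8 - i; ++j)
            if (v[j] > v[j + 1]) { float t = v[j]; v[j] = v[j + 1]; v[j + 1] = t; }

    d_M[y * width + x] = v[4];  // middle element of the sorted window is the median
}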
7 RESULTS
We demonstrate the effectiveness of our method using
the standard Middlebury dataset Teddy (Scharstein
and Szeliski, 2003). We compare our method with
the method of Zhang et al. (2009a), which only uses
a horizontal primary axis; to provide a fair compari-
son, we use our own implementation of that method. All comparisons
with ground truth data use the PSNR metric, where
higher is better.
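For reference, PSNR is computed from the mean squared error between the estimated and the ground-truth disparity map as PSNR = 10 log10(peak^2 / MSE). The host-side sketch below assumes 8-bit disparity images (peak = 255) and ignores any masking of unknown ground-truth pixels, details the paper does not specify.

// Computes the PSNR (in dB) between a disparity map and the ground truth.
// Assumes the two maps differ somewhere (MSE > 0).
#include <cmath>
#include <cstddef>

double psnr(const unsigned char* disparity, const unsigned char* groundTruth,
            std::size_t numPixels, double peak = 255.0)
{
    double mse = 0.0;
    for (std::size_t i = 0; i < numPixels; ++i) {
        double diff = double(disparity[i]) - double(groundTruth[i]);
        mse += diff * diff;
    }
    mse /= double(numPixels);
    return 10.0 * std::log10(peak * peak / mse);  // higher means closer to the ground truth
}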
Our method improves the PSNR by 0.53 dB,
from 18.65 dB to 19.18 dB. Furthermore, the re-
sults are visually better, as observed in Fig. 5. The
figure clearly shows that the edges in the image con-
tain fewer artifacts, especially around horizontal edges.
As we will discuss next, the refinement steps of
Section 6 contribute significantly to the final quality
of the disparity maps, which improve both visually
and quantitatively, as measured by the PSNR metric.
Fig. 6 shows the result when the bitwise fast vot-
ing is disabled. Here, all invalid disparity values are
handled by using the closest value on the pixel’s scan-
line. We make two observations. First, the borders of
the disparity maps show a clear decrease in quality.
This is caused by the fact that many invalid disparity
values can be found here due to the missing informa-
tion in one of the images. Because using the closest
valid disparity value does not take the color values
into account, artifacts are created. Second, edge fat-
tening, meaning that the disparity values leak over the
edges, can be seen everywhere in the image. Again,
this is because no color information is used to esti-
mate the invalid disparity values. The use of bitwise
voting gives an increase of 0.75 dB, from 18.43 dB to
19.18 dB.
Finally, Fig. 7 shows the result when no refine-
ment is applied at all. The refinement yields many
visible improvements, including the elimination of
speckle noise and of errors at the borders of the disparity map.
Compared to the ground truth, we demonstrate an im-
provement of 3.52 dB, from 15.66 dB to 19.18 dB.
Bitwise voting can, in fact, be applied
to any local stereo algorithm. To demonstrate this,
we applied bitwise voting to a disparity map that was
computed using fixed square aggregation windows,
as shown in Fig. 8. The results improve, but not all
artifacts of such a naive stereo matching algorithm
can be eliminated.
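To make explicit what such a baseline looks like, the sketch below shows winner-takes-all matching with a fixed square aggregation window and SAD costs; the function name, the cost measure and the grayscale input are assumptions used only for illustration, not the exact baseline of Fig. 8.

// Naive local stereo matching: for every pixel, sum absolute differences over
// a fixed square window for each candidate disparity and keep the best one.
__global__ void fixedWindowMatch(const unsigned char* left, const unsigned char* right,
                                 float* disparity, int width, int height,
                                 int maxDisp, int radius)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    float bestCost = 1e30f;
    int bestD = 0;
    for (int d = 0; d <= maxDisp; ++d) {
        float cost = 0.0f;
        for (int dy = -radius; dy <= radius; ++dy)
            for (int dx = -radius; dx <= radius; ++dx) {
                int xx = min(max(x + dx, 0), width - 1);
                int yy = min(max(y + dy, 0), height - 1);
                int xr = min(max(xx - d, 0), width - 1);  // corresponding pixel in the right image
                cost += fabsf(float(left[yy * width + xx]) - float(right[yy * width + xr]));
            }
        if (cost < bestCost) { bestCost = cost; bestD = d; }
    }
    disparity[y * width + x] = float(bestD);
}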
Our method runs at 13 FPS for 450 × 375 reso-
lution images on an NVIDIA GTX TITAN, therefore
providing a real-time solution.
8 CONCLUSION
We have shown that combining horizontal and ver-
tical edge aggregation windows in stereo matching
yields high-quality disparity map estimates. We
achieve a 0.5 dB gain over state-of-the-art methods
and produce smooth disparity maps with sharp edge
preservation around objects. Nonetheless, the com-
plexity of the final solution remains comparable to
that of existing methods, allowing an efficient GPU
implementation. Furthermore, we demonstrate that
the disparity refinement has a large effect on the final
quality.
REFERENCES
Astrachan, O. (2003). Bubble sort: An archaeological algorithmic analysis. ACM SIGCSE Bulletin, 35(1):1-5.

Davis, J., Ramamoorthi, R., and Rusinkiewicz, S. (2003). Spacetime stereo: A unifying framework for depth from triangulation. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), volume 2, pages II-359. IEEE.

Lu, J., Rogmans, S., Lafruit, G., and Catthoor, F. (2007a). High-speed dense stereo via directional center-biased support windows on programmable graphics hardware. In Proceedings of 3DTV-CON: The True Vision - Capture, Transmission and Display of 3D Video, Kos, Greece.