decomposition.
The structural part $I_S(x)$ of the grey value image $I(x)$ is obtained by solving
\[
\min_{I_S} \int_{\Omega} \left\{ |\nabla I_S(x)| + \frac{1}{2\theta} \bigl( I_S(x) - I(x) \bigr)^2 \right\} dx . \tag{9}
\]
The solution to this problem can be found in (Wedel
and Cremers, 2011, p. 27f).
The textural part $I_T(x)$ can then be calculated by
\[
I_T(x) = I(x) - \alpha I_S(x). \tag{10}
\]
Parameters are set according to (Wedel and Cremers,
2011, p. 36f) to $\alpha = 0.95$ and $\theta = 0.125$, and 100
iterations are used to calculate $I_S$.
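As an illustration of equations (9) and (10), the following is a minimal NumPy sketch of the decomposition. It uses Chambolle's dual projection iteration, which is one common way to minimise a functional of this form; whether it matches the scheme described in (Wedel and Cremers, 2011) step for step is not confirmed here. The function names, the step size `tau = 0.125` and the finite-difference boundary handling are assumptions made for this sketch, while α, θ and the iteration count are the values quoted in the text.

```python
import numpy as np


def gradient(u):
    """Forward differences with zero gradient at the far image border."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy


def divergence(px, py):
    """Backward differences, chosen as the negative adjoint of gradient()."""
    dx = np.zeros_like(px)
    dy = np.zeros_like(py)
    dx[:, 0] = px[:, 0]
    dx[:, 1:-1] = px[:, 1:-1] - px[:, :-2]
    dx[:, -1] = -px[:, -2]
    dy[0, :] = py[0, :]
    dy[1:-1, :] = py[1:-1, :] - py[:-2, :]
    dy[-1, :] = -py[-2, :]
    return dx + dy


def structure_texture(image, alpha=0.95, theta=0.125, iterations=100, tau=0.125):
    """Return (I_S, I_T) for a 2-D grey value image.

    I_S approximately minimises equation (9); I_T follows equation (10).
    tau is an assumed step size (tau <= 1/8 is the classical safe choice).
    """
    f = np.asarray(image, dtype=np.float64)
    px = np.zeros_like(f)  # dual variable, x component
    py = np.zeros_like(f)  # dual variable, y component
    step = tau / theta
    for _ in range(iterations):
        u = f + theta * divergence(px, py)  # current primal estimate
        gx, gy = gradient(u)
        norm = np.sqrt(gx ** 2 + gy ** 2)
        denom = 1.0 + step * norm  # keeps |p| <= 1 at every pixel
        px = (px + step * gx) / denom
        py = (py + step * gy) / denom
    structure = f + theta * divergence(px, py)
    texture = f - alpha * structure
    return structure, texture
```

The grey values are assumed to be scaled to a fixed range (for example [0, 1]) before the call, since the effect of θ depends on that scaling.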
5 EVALUATION
To evaluate the performance of the proposed confidence
measures, seven datasets of the Middlebury benchmark
(Baker et al., 2011) for which ground truth is available
are used.
To measure the accuracy of an estimated optical
flow field, the end point error (Otte and Nagel, 1994)
is used. It is defined as
\[
\varepsilon_{\Delta} = \| u - u' \|, \tag{11}
\]
where $u$ is the correct displacement vector and $u'$ the
calculated displacement vector.
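The end point error of equation (11) can be computed per pixel as sketched below. The array layout (height × width × 2, with the two flow components in the last axis) and the function name are assumptions made for illustration.

```python
import numpy as np


def end_point_error(flow_true, flow_est):
    """Per-pixel Euclidean distance between true and estimated flow.

    Both inputs are assumed to have shape (H, W, 2).
    """
    diff = np.asarray(flow_true, dtype=np.float64) - np.asarray(flow_est, dtype=np.float64)
    return np.linalg.norm(diff, axis=-1)
```

Averaging the returned array over all pixels gives the average end point error of a flow field.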
For evaluation, the flow field of each dataset is
gradually filtered according to the confidence measure,
and the average end point error of the remaining
flow vectors is calculated in every step. For the
calculation of the optical flow field, the texture channel
is used in the data term unless stated otherwise. For
the energy functional of the confidence measure, a variety
of data terms are considered. The weighting between
regularization term and data term for the confidence
measure is set to $\lambda = 0.5$.
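The filtering procedure described above can be sketched as follows. The sketch assumes that a larger confidence value marks a more reliable flow vector (for an energy-based measure where low energy means high reliability, the negated energy can be passed instead), and that the number of retained vectors is the selection rate times the number of pixels, rounded and at least one. Both conventions, and the function name, are assumptions rather than details taken from the paper.

```python
import numpy as np


def average_error_at_rates(epe, confidence, rates):
    """Average end point error of the most confident flow vectors.

    epe        : per-pixel end point errors, any shape
    confidence : same shape, larger means more reliable (assumption)
    rates      : iterable of selection rates in (0, 1]
    """
    epe = np.asarray(epe, dtype=np.float64).ravel()
    confidence = np.asarray(confidence, dtype=np.float64).ravel()
    n = epe.size
    # Sort errors from most to least confident flow vector.
    order = np.argsort(-confidence, kind="stable")
    sorted_epe = epe[order]
    # running_mean[k - 1] is the mean error of the k most confident vectors.
    running_mean = np.cumsum(sorted_epe) / np.arange(1, n + 1)
    result = []
    for rate in rates:
        k = min(n, max(1, int(round(rate * n))))
        result.append(running_mean[k - 1])
    return np.array(result)
```

Evaluating this for a decreasing sequence of rates, for example from 1.0 down to 0.01, yields an error-versus-selection-rate curve of the kind discussed in the remainder of this section.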
A comparison of the effectiveness of the confidence
measures with a linear and a nonlinear data term for
two datasets of the Middlebury benchmark is shown in
Figure 1. For both the flow calculation and the
confidence measure, the texture channel is used in
the data term. Both confidence measures are suitable
for selecting points with low end point error. For the
dataset 'Grove2', the linear and nonlinear confidence
measures give similar results for selection rates of
75 % and more, although $e_{\mathrm{lin}}$ reduces the error
slightly more. For selection rates below 75 %,
$e_{\mathrm{nonlin}}$ is significantly more effective: it
decreases the error monotonically, while for $e_{\mathrm{lin}}$
the error rises again for selection rates below 18 %. For
the other dataset, both confidence measures perform
similarly and both decrease the error monotonically.
For selection rates below 65 %, $e_{\mathrm{nonlin}}$ gives
better results; for higher selection rates, $e_{\mathrm{lin}}$
does. Overall, $e_{\mathrm{nonlin}}$ is more effective,
although $e_{\mathrm{lin}}$ seems to be more suitable for high
selection rates. Taking all selection rates of both
datasets into account, using the nonlinear data term
instead of the linear one improves the results on
average by 2.2 %. For a selection rate of 1 %, the
average improvement is 7.6 %. Therefore, $e_{\mathrm{nonlin}}$
is used in the following evaluations.
Figure 2 displays the results of the confidence
measure $e_{\mathrm{nonlin}}$ on the same datasets for three
different data terms in the confidence measure. The
data terms use a constancy assumption on the grey value
channel, on the texture channel, and on both the
structure and texture channels. The best results are
obtained when the texture and structure channels are
used together rather than the texture channel alone.
The structure channel, which contains most of the
shadow and shading, may not be suitable for finding
pixel matches itself. Nevertheless, it gives valuable
information on how well the pixels are matched, because
in areas with significant illumination changes the
optical flow is expected to be less reliable. The
structure-texture data term also outperforms the grey
value data term. Apparently, decomposing the image and
comparing the channels separately leads to a significant
improvement, especially for low selection rates.
For the dataset 'Grove2', the end point error can be
reduced monotonically from 0.89 px to 0.51 px when
using only 1 % of the correspondences. This is an
improvement of 43 %. For the dataset 'Hydrangea',
the error can even be reduced by 83 %.
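For reference, the 43 % figure for 'Grove2' follows directly from the two end point errors quoted above, taking the improvement as the relative reduction of the error:

```latex
\[
\frac{0.89\,\mathrm{px} - 0.51\,\mathrm{px}}{0.89\,\mathrm{px}}
  = \frac{0.38}{0.89} \approx 0.427 \approx 43\,\% .
\]
```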
The results for five other datasets of the Middlebury
benchmark can be seen in Figure 4. The
structure-texture data term gives the best results
for all datasets, except for 'RubberWhale', where
using only the texture channel is most effective for
selection rates below 30 %. For higher selection rates,
the structure-texture data term again gives the best
results. Averaged over all evaluated datasets, the end
point error is reduced by 53.3 % if 1 % of the
flow field is selected. The average improvement over
all tested datasets for the structure-texture data term
compared to the grey value data term is 11.5 % for
a selection rate of 1 % and 3.8 % over all selection
rates.
Overall, using the nonlinear structure-texture data
term improves the performance of the confidence
measure by 35.2 % for a selection rate of 1 %
compared to using exactly the same data term as for the
flow calculation.