Improved Conﬁdence Measures for Variational Optical Flow

Maren Brumm, Jan Marek Marcinczak and Rolf-Rainer Grigat

TU Hamburg-Harburg, Harburger Schlossstraße 20, 21079 Hamburg, Germany

Keywords:

Variational Optical Flow, Conﬁdence Measure, Performance Evaluation, Structure-Texture Decomposition.

Abstract:

In the last decades variational optical ﬂow algorithms have been intensively studied by the computer vision

community. However, relatively little effort has been made to obtain robust confidence measures for the estimated flow field. As many applications do not require the whole flow field, it would be helpful to identify
the parts of the field where the flow is most accurate. We propose a confidence measure based on the energy

functional that is minimized during the optical ﬂow calculation and analyze the performance of different data

terms. For evaluation, 7 datasets of the Middlebury benchmark are used. The results show that the accuracy of

the ﬂow ﬁeld can be improved by 53.3 % if points are selected according to the proposed conﬁdence measure.

The suggested method leads to an improvement of 35.2 % compared to classical conﬁdence measures.

1 INTRODUCTION

Since Horn and Schunck (Horn and Schunck, 1981), variational optical flow has been an active field of research. An optical flow field gives for every pixel in the first input image a motion vector that estimates the movement to its new position in the second image. There are many different applications for which the optical flow field can be used, such as motion segmentation or 3D reconstruction. However, optical flow estimation is difficult in areas with changing illumination or complex movements. Commonly, there are areas where the flow field is less accurate than in others. Many applications require the flow field only for a subset of pixels. Therefore, a confidence measure is needed to identify the parts of the field where the flow is most reliable. Another issue is continuous tracking, where points are tracked over several frames. Here, a confidence measure can be used to detect vanishing points or poorly estimated points that would lead to large errors if they were tracked further.

Barron et al. (Barron et al., 1994) proposed using the magnitude of the image gradient as a confidence measure. The intention is to reward highly structured image parts, as the data term carries very little information in image parts with low structure. However, they demonstrated that this approach is not reliable for variational methods. Bruhn and Weickert argue in (Bruhn and Weickert, 2006) that large gradients commonly result from noise and from pixels that are occluded in the second frame, while areas with small gradients might be filled in accurately by regularization.

The approach described in (Bruhn and Weickert,

2006; Bruhn et al., 2005) uses the energy functional

that is minimized during the ﬂow calculation as con-

ﬁdence measure. The smaller the contribution of a

pixel to the ﬁnal energy the more reliable it is rated.

In this paper we use a similar approach, but do not limit ourselves to using the same data term for the confidence measure as for the optical flow estimation. Instead, we analyze the performance of linear and nonlinear data terms as well as different constancy assumptions based on a structure-texture decomposition. We especially address the problem of changing illumination by adding a data term that penalizes areas with high illumination changes.

For evaluation seven datasets with ground truth of

the Middlebury benchmark (Baker et al., 2011) are

used.

Future work will show whether additional im-

provements can be made by optimizing the weight-

ing between data term and regularization term and by

considering different norms on both terms.

2 OPTICAL FLOW ESTIMATION

Let $I : \Omega \to \mathbb{R}$ with $\Omega \subset \mathbb{R}^2$ be a greyscale image and let $(x, y)$ denote the image coordinates. For an image pair $I_0(x, y)$ and $I_1(x, y)$ the optical flow $\mathbf{u}(x, y) = (u, v)^\top$ describes how the points move between $I_0(x, y)$ and $I_1(x, y)$.


Brumm M., Marcinczak J. and Grigat R..

Improved Conﬁdence Measures for Variational Optical Flow.

DOI: 10.5220/0005167203890394

In Proceedings of the 10th International Conference on Computer Vision Theory and Applications (VISAPP-2015), pages 389-394

ISBN: 978-989-758-091-8

 2015 SCITEPRESS (Science and Technology Publications, Lda.)

The key assumption for the optical ﬂow estimation is

that the grey value of a point in space is constant in

all frames. Classically, this constancy assumption is

given by (Horn and Schunck, 1981):

$$\frac{\partial I(x, y, t)}{\partial t} = 0. \tag{1}$$

Following the description in (Wedel and Cremers,

2011, p. 9) it can be expressed as follows:

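$$I_0(x, y) = I_1(x + u, y + v). \tag{2}$$

Using the first-order Taylor approximation of $I_1(x + u, y + v)$ leads to

$$I_0(x, y) = I_1(x, y) + \nabla I_1(x, y)^\top \begin{pmatrix} u \\ v \end{pmatrix}, \tag{3}$$

$$0 = \underbrace{I_1(x, y) - I_0(x, y)}_{I_t} + \nabla I_1(x, y)^\top \begin{pmatrix} u \\ v \end{pmatrix}. \tag{4}$$

Denoting the partial derivatives with $I_t$, $I_x$ and $I_y$ the linearized optical flow constraint reads

$$0 = I_t + I_x u + I_y v. \tag{5}$$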
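The linearized constraint (5) can be checked numerically on a toy image pair. The following is a minimal sketch, not the authors' implementation: the function name `linearized_residual`, the use of `np.gradient` for the spatial derivatives, and the synthetic ramp images are all assumptions made for illustration.

```python
import numpy as np

def linearized_residual(I0, I1, u, v):
    """Per-pixel residual I_t + I_x*u + I_y*v of the linearized constraint."""
    I0 = np.asarray(I0, dtype=float)
    I1 = np.asarray(I1, dtype=float)
    # np.gradient returns the derivative along axis 0 (rows, y) first,
    # then along axis 1 (columns, x).
    I_y, I_x = np.gradient(I1)
    I_t = I1 - I0  # temporal difference, as in Eq. (4)
    return I_t + I_x * u + I_y * v

# Toy example: a horizontal intensity ramp that moves one pixel to the right.
x = np.arange(8, dtype=float)
I0 = np.tile(x, (6, 1))   # I0(x, y) = x
I1 = I0 - 1.0             # I1(x, y) = I0(x - 1, y)
u = np.ones_like(I0)      # true flow: one pixel in x direction
v = np.zeros_like(I0)

residual_true = linearized_residual(I0, I1, u, v)
residual_zero = linearized_residual(I0, I1, np.zeros_like(I0), v)
```

For the true flow the residual vanishes everywhere, while a zero flow leaves a residual equal to the temporal difference. On real images the residual is only approximately zero because the Taylor expansion is truncated.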

The optical ﬂow constraint usually refers to the

constancy of the grey values in the input frames.

However, grey values are not illumination invariant.

Hence, we consider the use of other pixel properties,

see Section 4.

The given constraint leads to an under-determined

equation system and cannot be used solely to esti-

mate the optical ﬂow ﬁeld. Horn and Schunck (Horn

and Schunck, 1981) proposed to use the additional

assumption that the resulting ﬂow ﬁeld should be

smooth. This leads to the following minimization

problem:

$$\min_{u, v} \left\{ \int_\Omega |\nabla u| + |\nabla v| + \lambda \, |I_t + I_x u + I_y v| \, \mathrm{d}\Omega \right\}. \tag{6}$$

The data term enforces the optical flow constraint, while the regularization term favours smooth flow fields by penalizing variations in the flow field. The parameter $\lambda$ weights between both terms. Commonly, it is advisable to use the $L^1$ norm instead of the $L^2$ norm because it is discontinuity preserving and robust to outliers (Zach et al., 2007).

3 CONFIDENCE MEASURE

The proposed conﬁdence measure is based on the en-

ergy functional (6) that is minimized to obtain the op-

tical ﬂow ﬁeld. As described in (Bruhn and Weick-

ert, 2006) the underlying idea is that the ﬂow ﬁeld

is most accurate in those areas where the functional

has the lowest cost. However, there are different re-

quirements for the calculation of a ﬂow ﬁeld and the

rating of its accuracy. Therefore, we analyze different

energy formulations to include additional information

which supports classiﬁcation.

Two different strategies are tested. The ﬁrst is to

use exactly the same energy functional as conﬁdence

measure that is used to estimate the optical ﬂow:

$$e_{\mathrm{lin}} = |\nabla u| + |\nabla v| + \lambda \sum_{i=1}^{N} |I^i_t + I^i_x u + I^i_y v|, \tag{7}$$

where $N$ denotes the number of constancy assumptions used in the data term and $I^i(x, y)$ denotes the pixel property that is assumed to be constant. This implies that the data term is linearized. As the estimated flow is available, there is no need to linearize the data term and the nonlinear data term $|I^i_0(x, y) - I^i_1(x + u, y + v)|$ can be calculated directly. Therefore, the following confidence measure provides a higher accuracy as it uses the optical flow constraint directly instead of approximating it:

$$e_{\mathrm{nonlin}} = |\nabla u| + |\nabla v| + \lambda \sum_{i=1}^{N} |I^i_0(x, y) - I^i_1(x + u, y + v)|. \tag{8}$$

Instead of limiting ourselves to using the same constancy assumption in the optical flow constraint for the confidence measure as for the flow calculation, we also consider other constancy assumptions for the confidence measure, as described in the next section.
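To make the two measures (7) and (8) concrete, the following sketch evaluates both per pixel for a given flow field. It is an illustration under stated assumptions rather than the authors' code: the function names, the bilinear warping with border clamping, the Euclidean gradient magnitude for $|\nabla u|$ and $|\nabla v|$, and the toy images are all hypothetical choices. The default weight of 0.5 is the value reported in the evaluation section.

```python
import numpy as np

def bilinear_sample(img, xs, ys):
    """Sample img at real-valued positions (xs, ys), clamped to the border."""
    h, w = img.shape
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    ax = xs - x0
    ay = ys - y0
    top = (1 - ax) * img[y0, x0] + ax * img[y0, x1]
    bottom = (1 - ax) * img[y1, x0] + ax * img[y1, x1]
    return (1 - ay) * top + ay * bottom

def gradient_magnitude(f):
    """Euclidean norm of the spatial gradient (assumed reading of |grad f|)."""
    gy, gx = np.gradient(f)
    return np.sqrt(gx ** 2 + gy ** 2)

def e_lin(channels0, channels1, u, v, lam=0.5):
    """Per-pixel energy with the linearized data term, Eq. (7)."""
    data = np.zeros(u.shape, dtype=float)
    for c0, c1 in zip(channels0, channels1):
        c0 = np.asarray(c0, dtype=float)
        c1 = np.asarray(c1, dtype=float)
        c_y, c_x = np.gradient(c1)
        data += np.abs((c1 - c0) + c_x * u + c_y * v)
    return gradient_magnitude(u) + gradient_magnitude(v) + lam * data

def e_nonlin(channels0, channels1, u, v, lam=0.5):
    """Per-pixel energy with the nonlinear data term, Eq. (8)."""
    h, w = u.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    data = np.zeros(u.shape, dtype=float)
    for c0, c1 in zip(channels0, channels1):
        warped = bilinear_sample(np.asarray(c1, dtype=float), xs + u, ys + v)
        data += np.abs(np.asarray(c0, dtype=float) - warped)
    return gradient_magnitude(u) + gradient_magnitude(v) + lam * data

# Toy example: horizontal ramp shifted by one pixel, single channel (N = 1).
x = np.arange(8, dtype=float)
I0 = np.tile(x, (6, 1))
I1 = I0 - 1.0
u = np.ones_like(I0)
v = np.zeros_like(I0)
conf_good = e_nonlin([I0], [I1], u, v)                # correct flow
conf_bad = e_nonlin([I0], [I1], np.zeros_like(I0), v)  # wrong (zero) flow
```

Low energy means high confidence. Passing several channels in the lists corresponds to $N > 1$ constancy assumptions. In the toy example the last column is affected by border clamping, which is a property of this sketch and not of the measure itself.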

4 STRUCTURE-TEXTURE DECOMPOSITION

Wedel and Cremers propose in (Wedel and Cremers, 2011, p. 36f) to perform a structure-texture decomposition (Meyer, 2001) on the input images for the calculation of the optical flow field. As shadows show up mainly in the structural part, assuming constancy of the textural part only increases the robustness to illumination changes.

However, for the confidence measure both channels carry valuable information. The texture channel provides information on how well a pixel value matches, while the structure channel can be used to filter out areas with significant illumination changes where the optical flow is expected to be less reliable. Hence we consider using both the texture and the structure channel for the confidence measure. By decomposing the image and comparing both channels separately, additional information can be gained compared to simply comparing the grey values without decomposition.

The structural part $I_S(x)$ of the grey value image $I(x)$ is obtained by solving

$$\min_{I_S} \int_\Omega \left\{ |\nabla I_S(x)| + \frac{1}{2\theta} \left( I_S(x) - I(x) \right)^2 \right\} \mathrm{d}x. \tag{9}$$

The solution to this problem can be found in (Wedel and Cremers, 2011, p. 27f).

The textural part $I_T(x)$ can then be calculated by

$$I_T(x) = I(x) - \alpha I_S(x). \tag{10}$$

Parameters are set according to (Wedel and Cremers, 2011, p. 36f) to $\alpha = 0.95$ and $\theta = 0.125$, and 100 iterations are used to calculate $I_S$.
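One common way to solve a total variation problem of the form (9) is a dual projection iteration in the style of Chambolle. The sketch below uses that technique as an assumption; the paper only refers to (Wedel and Cremers, 2011) for the solver. The names `I_s` and `I_t` for the structure and texture channels, the helper functions, the step size `tau = 0.125`, and the assumed intensity range of roughly [0, 1] are all illustrative. The values of alpha, theta and the iteration count are those quoted above.

```python
import numpy as np

def forward_gradient(f):
    """Forward differences with zero gradient at the last column/row."""
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gx[:, :-1] = f[:, 1:] - f[:, :-1]
    gy[:-1, :] = f[1:, :] - f[:-1, :]
    return gx, gy

def divergence(px, py):
    """Discrete divergence, the negative adjoint of forward_gradient."""
    d = np.zeros_like(px)
    d[:, 0] += px[:, 0]
    d[:, 1:-1] += px[:, 1:-1] - px[:, :-2]
    d[:, -1] -= px[:, -2]
    d[0, :] += py[0, :]
    d[1:-1, :] += py[1:-1, :] - py[:-2, :]
    d[-1, :] -= py[-2, :]
    return d

def structure_texture(I, theta=0.125, alpha=0.95, iterations=100, tau=0.125):
    """Return (I_s, I_t): TV-smoothed structure part and texture part."""
    I = np.asarray(I, dtype=float)
    px = np.zeros_like(I)
    py = np.zeros_like(I)
    step = tau / theta
    for _ in range(iterations):
        # Current primal estimate from the dual variable p.
        I_s = I + theta * divergence(px, py)
        gx, gy = forward_gradient(I_s)
        norm = np.sqrt(gx ** 2 + gy ** 2)
        # Semi-implicit dual update that keeps |p| <= 1.
        px = (px + step * gx) / (1.0 + step * norm)
        py = (py + step * gy) / (1.0 + step * norm)
    I_s = I + theta * divergence(px, py)
    I_t = I - alpha * I_s  # Eq. (10)
    return I_s, I_t

def total_variation(f):
    gx, gy = forward_gradient(f)
    return np.sqrt(gx ** 2 + gy ** 2).sum()

# Toy example on a small random image with intensities in [0, 1].
rng = np.random.default_rng(0)
noisy = rng.random((16, 16))
I_s, I_t = structure_texture(noisy)
```

The structure channel has a lower total variation than the input, so fine detail ends up in the texture channel. Both channels can then be passed as separate constancy assumptions to the confidence measure.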

5 EVALUATION

To evaluate the performance of the proposed con-

ﬁdence measures, seven datasets of the Middlebury

benchmark (Baker et al., 2011) for which a ground

truth is available are used.

To measure the accuracy of an estimated optical

ﬂow ﬁeld, the end point error (Otte and Nagel, 1994)

is used. It is deﬁned as

$$\Delta = \| \mathbf{u} - \hat{\mathbf{u}} \|, \tag{11}$$

where $\mathbf{u}$ is the correct displacement vector and $\hat{\mathbf{u}}$ the calculated displacement vector.

For evaluation the ﬂow ﬁeld of each dataset is

gradually ﬁltered according to the conﬁdence mea-

sure. The average end point error for the remaining

ﬂow vectors is calculated in every step. For the cal-

culation of the optical ﬂow ﬁeld, the texture channel

is used in the data term unless stated otherwise. For

the energy functional a variety of different data terms

is considered. The weighting between regularization

term and data term for the conﬁdence measure is set

to λ = 0.5 .
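The evaluation protocol described above (rank pixels by confidence, keep a given fraction, average the end point error of the kept pixels) can be sketched as follows. The function names, the stable sort for breaking ties, and the rounding of the number of kept pixels are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def end_point_error(u_true, v_true, u_est, v_est):
    """Per-pixel end point error, Eq. (11)."""
    return np.sqrt((u_true - u_est) ** 2 + (v_true - v_est) ** 2)

def selection_curve(epe, energy, rates):
    """Mean end point error of the pixels kept at each selection rate.

    epe    : per-pixel end point error
    energy : per-pixel confidence energy (low energy = high confidence)
    rates  : fractions in (0, 1] of pixels to keep
    """
    order = np.argsort(np.ravel(energy), kind="stable")
    sorted_epe = np.ravel(epe)[order]
    n = sorted_epe.size
    means = []
    for r in rates:
        k = max(1, int(round(r * n)))  # keep at least one pixel
        means.append(sorted_epe[:k].mean())
    return np.array(means)

# Toy example: four pixels whose energy ranks them by their true error.
epe = np.array([[0.1, 0.2], [0.3, 0.4]])
energy = np.array([[1.0, 2.0], [3.0, 4.0]])
curve = selection_curve(epe, energy, rates=[0.25, 0.5, 1.0])

# Relative improvement at the lowest rate compared to using all pixels.
improvement = 1.0 - curve[0] / curve[-1]
```

A confidence measure is useful if this curve decreases as the selection rate shrinks. The relative improvement at a given rate is the quantity reported as a percentage in the results.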

A comparison of the effectiveness of the confidence measures with a linear and a nonlinear data term for two datasets of the Middlebury benchmark can be seen in Figure 1. For both the flow calculation and the confidence measure the texture channel is used in the data term. It can be seen that both confidence measures are suitable to select points with low end point error. For the dataset 'Grove2' the linear and nonlinear confidence measures give similar results for selection rates of 75 % and more, although $e_{\mathrm{lin}}$ reduces the error slightly more. For selection rates below 75 % $e_{\mathrm{nonlin}}$ is significantly more effective. It decreases the error monotonically, while for $e_{\mathrm{lin}}$ the error rises again for selection rates below 18 %. For the other dataset both confidence measures show a similar performance. Both monotonically decrease the error. For selection rates below 65 % $e_{\mathrm{nonlin}}$ gives better results, while for higher selection rates $e_{\mathrm{lin}}$ gives better results. Overall $e_{\mathrm{nonlin}}$ is more effective, although $e_{\mathrm{lin}}$ seems to be more suitable for high selection rates. Taking all selection rates of both datasets into account, using the nonlinear data term instead of the linear one improves the results on average by 2.2 %. For a selection rate of 1 % the average improvement is 7.6 %. Therefore, $e_{\mathrm{nonlin}}$ is used in the following evaluations.

Figure 2 displays the results of the confidence measure $e_{\mathrm{nonlin}}$ on the same datasets for three different data terms in the confidence measure. The different data terms use a constancy assumption on the grey value channel, on the texture channel, and on both the structure and texture channel. It can be seen that the best results are obtained when not only the texture channel of the images is used but the texture and structure channel are used together. The structure channel, which contains most of the shadow and shading, may not be suitable for finding pixel matches on its own. Nevertheless, it gives valuable information on how well the pixels are matched because in areas with significant illumination changes the optical flow is expected to be less reliable. The structure-texture data term also outperforms the grey value data term. Apparently, decomposing the image and comparing the channels separately leads to a significant improvement, especially for low selection rates. For the dataset 'Grove2' the end point error can be reduced monotonically from 0.89 px to 0.51 px when using only 1 % of the correspondences. This is an improvement of 43 %. For the dataset 'Hydrangea' the error can even be reduced by 83 %.
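As a quick check of the percentage quoted for 'Grove2', the relative reduction follows directly from the two end point errors given above:

```latex
\[
  \frac{0.89 - 0.51}{0.89} = \frac{0.38}{0.89} \approx 0.427 \approx 43\,\%
\]
```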

The results for ﬁve other datasets of the Middle-

bury benchmark can be seen in Figure 4. The

structure-texture data term gives the best results

for all datasets, except for ’RubberWhale’ where

using the texture channel only is most effective for

selection rates below 30 %. For higher selection rates

the structure-texture data term again gives the best

results. The end point error is reduced on average

over all datasets evaluated by 53.3 % if 1 % of the

ﬂow ﬁeld is selected. The average improvement over

all datasets tested for the structure-texture data term

compared to the grey value data term is 11.5 % for

a selection rate of 1 % and 3.8 % over all selection

rates.

Overall using the nonlinear structure-texture data

term improves the performance of the conﬁdence

measure by 35.2 % for a selection rate of 1 % com-

pared to using exactly the same data term as for the

ﬂow calculation.


[Figure 1 plots: end point error [px] over points taken [%] for $e_{\mathrm{lin}}$ and $e_{\mathrm{nonlin}}$; (a) dataset 'Grove2', (b) dataset 'Hydrangea'.]

Figure 1: Comparison of the confidence measures with a linear and nonlinear data term for two sequences of the Middlebury benchmark. For selection rates of 65 % and lower the nonlinear data term gives better results.

[Figure 2 plots: end point error [px] over points taken [%] for $e_{\mathrm{nonlin}}$ with the texture, structure-texture and grey value data terms; (a) dataset 'Grove2', (b) dataset 'Hydrangea'.]

Figure 2: Comparison of different constancy assumptions in the data term. $e(I_S, I_T)$ shows the best results.

[Figure 3 plots: end point error [px] over points taken [%] for $e_{\mathrm{nonlin}}$ with the texture, structure-texture, grey value and gradient data terms; (a) flow calculated with $I_{\mathrm{grey}}$, (b) flow calculated with $I_{\mathrm{grad}}$.]

Figure 3: Results for flow fields calculated with different data terms. Again $e(I_S, I_T)$ shows the best results.

To see whether the choice of data term for the optical flow calculation has an influence, the same evaluation is made for the optical flow field calculated with a constancy assumption on the grey values or on the gradient of the image. Here also the gradient of the images is tested as data term in the confidence measure. The results can be seen in Figure 3. They show that the gradient data term as confidence measure gives worse results than the other data terms. The ranking of the other confidence measures is the same as in the previous experiment and the progression of the error is also similar to the progression in the previous experiment for all three data terms. Therefore, the confidence measure seems to be unaffected by the choice of the data term for optical flow computation.

[Figure 4 plots: end point error [px] over points taken [%] for $e_{\mathrm{nonlin}}$ with the texture, structure-texture and grey value data terms; (a) dataset 'Grove3', (b) dataset 'RubberWhale', (c) panel title not recoverable, (d) dataset 'Urban2', (e) dataset 'Urban3'.]

Figure 4: Comparison of different constancy assumptions in the data term for five additional datasets of the Middlebury benchmark.

6 CONCLUSION AND OUTLOOK

To detect areas where the estimated optical ﬂow is

most reliable, a conﬁdence measure based on the en-

ergy functional that is minimized during the optical

ﬂow calculation was evaluated. Both the use of the

linear and the nonlinear version of the data term for

the conﬁdence measure were considered, as well as


different data terms based on a structure-texture de-

composition.

Using the nonlinear data term gives better results

for selection rates below 65 %. For higher selection

rates the linear data term is more effective, although

the difference is not signiﬁcant.

Decomposing the image into the structure and texture channel of the structure-texture decomposition and using both channels as separate data terms gives the best results for all selection rates on six out of the seven datasets tested. This result seems to be independent of the data term used for flow calculation. The end point error can be reduced by 53.3 % when using only 1 % of the correspondences.

Using the nonlinear structure-texture data term im-

proves the performance of the conﬁdence measure by

35.2 % for a selection rate of 1 % compared to using

exactly the same data term as for the ﬂow calculation.

Further improvements of the proposed confidence measure may be obtained by optimizing the weighting between the data term and the regularization term. Unlike the energy functional for the flow calculation, the confidence measure does not have to be discontinuity preserving or robust to outliers. On the contrary, it may be desirable to penalize outliers strongly. Hence, another possibility would be to use the $L^2$ norm instead of the $L^1$ norm on the data term or the regularization term or both.

REFERENCES

Baker, S., Scharstein, D., Lewis, J. P., Roth, S., Black, M. J.,

and Szeliski, R. (2011). A database and evaluation

methodology for optical ﬂow. International Journal

of Computer Vision, 92(1):1–31.

Barron, J. L., Fleet, D. J., and Beauchemin, S. S. (1994).

Performance of optical ﬂow techniques. International

Journal of Computer Vision, 12(1):43–77.

Bruhn, A. and Weickert, J. (2006). A conﬁdence mea-

sure for variational optic ﬂow methods. In Geomet-

ric Properties for Incomplete Data, pages 283–298.

Springer.

Bruhn, A., Weickert, J., and Schnörr, C. (2005). Lucas/Kanade meets Horn/Schunck: Combining local and global optic flow methods. International Journal of Computer Vision, 61(3):211–231.

Horn, B. K. P. and Schunck, B. G. (1981). Determining

optical ﬂow. Artiﬁcial Intelligence, 17:185–203.

Meyer, Y. (2001). Oscillating patterns in image processing

and nonlinear evolution equations: the ﬁfteenth Dean

Jacqueline B. Lewis memorial lectures, volume 22.

American Mathematical Soc.

Otte, M. and Nagel, H.-H. (1994). Optical flow estimation: advances and comparisons. In Computer Vision – ECCV'94, pages 49–60. Springer.

Wedel, A. and Cremers, D. (2011). Stereo Scene Flow for

3D Motion Analysis. Springer.

Zach, C., Pock, T., and Bischof, H. (2007). A duality based approach for realtime TV-$L^1$ optical flow. In Pattern Recognition, pages 214–223. Springer.

VISAPP2015-InternationalConferenceonComputerVisionTheoryandApplications

394