OPTIMAL CONTROL THEORY FOR MULTI-RESOLUTION PROBLEMS IN COMPUTER VISION

Application to Optical-flow Estimation

Pascal Zille 1,2 and Thomas Corpetti

UMR CNRS LMFA, Ecole Centrale Lyon, Lyon, France

CNRS LIAMA TIPE, Beijing, China

Keywords:

Multi-resolution, Optical-flow, Optimal Control, Data Assimilation.

Abstract:

This paper is concerned with the multi-resolution strategy used in many computer vision applications. Such approaches are very popular for optimizing a cost function that, in most situations, has been linearized for mathematical convenience. In general, a multi-resolution setup consists in redefining the problem at a different resolution level where the mathematical assumptions (usually linearity) hold. Following a coarse-to-fine strategy, the usual process consists in 1) optimizing at the large scales and 2) using this result as an initial condition for the estimation at finer scales. This process is repeated down to the full image resolution. One of the main drawbacks of such a downscaling approach is its inability to correct errors made at larger scales: these are propagated along the scales and disturb the final result. In this paper, we suggest a new formulation of the multi-resolution setup that exploits smoothing techniques from optimal control theory, and in particular variational data assimilation. Time is here artificial and is related to the various scales we are dealing with. Within a consistent mathematical framework, we define an original downscaling/upscaling technique to perform the multi-resolution. We validate this approach by defining a simple optical flow estimation technique based on Lucas-Kanade. Experimental results on synthetic data demonstrate the efficiency of this new methodology.

1 INTRODUCTION

A number of common computer vision techniques (related to motion estimation, segmentation, characterization, ...) are defined as finding a variable $X$ by solving an equation $\mathcal{H}(X, I, \mathbf{X})$ that depends on the image luminance $I$ and the pixel grid $\mathbf{X}$ (also noted $\mathbf{x}$ in this paper). Because most of the usual assumptions hold only in the linear case, it is common to embed the resolution of $\mathcal{H}(X, I, \mathbf{X})$ in a so-called “multi-resolution” scheme (Burt, 1988; Mallat, 1989). The main principle consists in redefining the images on smaller grids that correspond to the initial grid divided by a factor $N$. On such “coarse” images, we assume that the resolution of $\mathcal{H}$ can be done under linear constraints, and this provides a coarse approximation of the final solution $X$. In a further step, this coarse approximation is used as an initial condition for a new problem whose goal is to extract a refinement $dX$ (with $X$ equal to the coarse approximation plus $dX$) that can be estimated from the coarse approximation using linear constraints. Such a process is commonly repeated for several levels of resolution, yielding a succession of linear problems.

This is for instance a usual way to deal with optical-flow estimation, where the common brightness consistency assumption (that links the spatial and temporal gradients of the luminance with the spatial velocity $\mathbf{v}(\mathbf{x})$ to estimate), defined as:

$$\frac{\partial I}{\partial t} + \mathbf{v}(\mathbf{x}) \cdot \nabla I \approx 0 \qquad (1)$$

holds only for small displacements. When embedded in a multi-resolution scheme, this relation holds for the images at the coarser resolutions and the associated estimations are used as initial solutions for the finer resolutions.
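As an illustration of why relation (1) holds only for small displacements, the following sketch estimates a known 1-D shift by least squares on the linearized constraint. It is a toy example and not part of the original experiments: the sinusoidal signal, its frequency, the two shifts and the helper names `estimate_shift` and `make_pair` are all assumptions chosen for the demonstration.

```python
import math

def estimate_shift(first, second):
    """Least-squares solution of I_t + v * I_x = 0 over the interior samples."""
    num = 0.0
    den = 0.0
    for i in range(1, len(first) - 1):
        i_x = (first[i + 1] - first[i - 1]) / 2.0  # central spatial difference
        i_t = second[i] - first[i]                 # temporal difference, dt = 1
        num += i_x * i_t
        den += i_x * i_x
    return -num / den

def make_pair(shift, n=200, omega=0.3):
    """Hypothetical test pair: a sinusoid and its copy translated by `shift`."""
    first = [math.sin(omega * x) for x in range(n)]
    second = [math.sin(omega * (x - shift)) for x in range(n)]
    return first, second

small = estimate_shift(*make_pair(0.2))   # sub-pixel displacement
large = estimate_shift(*make_pair(8.0))   # displacement of several pixels
print(small, large)
```

With the sub-pixel shift the linearized estimate stays close to the true value, whereas with the 8-pixel shift it is far off, which is the situation the coarse resolutions are meant to avoid.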

To get such “coarse” data on which the problem is solved, a usual strategy consists in using a pyramidal decomposition of the images, for instance with wavelet decompositions or Gaussian filtering followed by decimations, as illustrated in figure 1 (left) for a factor $N = 2$. The estimation is then performed first from the smallest images (under linear constraints) and the solution is reprojected to the following level.

[Figure 1 panels. Solution 1: filtering + decimation. Solution 2: Gaussian smoothings, grid $\mathbf{X}^\ell$.]

Figure 1: Multiresolution strategies. Top left: classical pyramidal decomposition; top right: successive Gaussian convolutions. Bottom: illustration of the latter technique. Green circles represent the original pixel grid whereas the red ones represent the grid at a given resolution $\ell$.

Zille P. and Corpetti T.
OPTIMAL CONTROL THEORY FOR MULTI-RESOLUTION PROBLEMS IN COMPUTER VISION - Application to Optical-flow Estimation.
DOI: 10.5220/0003841401340143
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP-2012), pages 134-143
ISBN: 978-989-8565-04-4
© 2012 SCITEPRESS (Science and Technology Publications, Lda.)

Another possibility, proposed in (Corpetti and Mémin, 2012), consists in obtaining $X^\ell$ (related to the solution of a problem $\mathcal{H}(X, I, \mathbf{X}^\ell)$ at a resolution $\ell$) by performing a convolution of $\mathcal{H}(X, I, \mathbf{X})$ with a Gaussian kernel of variance $\ell$. Indeed, a multi-resolution scheme consists in redefining the problem on a grid $\mathbf{X}^\ell$ which can be viewed as a coarse representation of the initial grid $\mathbf{X}^0 = \mathbf{X}$ with a Brownian isotropic uncertainty of constant variance $\ell$:

$$\mathbf{X}^\ell = \mathbf{X} + \ell\, \mathbb{I}\, \mathbf{B}, \qquad (2)$$

where $\mathbf{B}$ is a standard Brownian motion and $\mathbb{I}$ the 2D identity matrix (see the illustration in figure 1).

Under this scheme, any solution $X^\ell$ of a problem $\mathcal{H}(X, I, \mathbf{X}^\ell)$ defined on a grid $\mathbf{X}^\ell$ should satisfy the expectation $\mathbb{E}(\mathcal{H}(X, I, \mathbf{X}^\ell) \,|\, \mathbf{X})$, which is equivalent (see (Corpetti and Mémin, 2012) for the demonstration) to a convolution of $\mathcal{H}(X, I, \mathbf{X})$ with the isotropic centered Gaussian $\mathcal{N}(0, \ell)$ of variance $\ell$. Therefore the multi-resolution can be performed by solving a family of problems $g_\ell * \mathcal{H}(X, I, \mathbf{X})$ at the various resolutions $\ell$. A main advantage of such a formulation of the multi-resolution setup is that it naturally gets rid of the intrinsic problems related to pyramidal image decompositions (decimations and interpolations in particular). In addition, instead of relying on successive decimations of the initial image by a factor 2 to fix the different multi-resolution levels, the choice of the levels $\ell$ is much more flexible here.
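The difference between the two strategies can be sketched on a 1-D signal. This is only an illustration under stated assumptions: the kernel truncation at three standard deviations, the replicated borders, the box-shaped test signal and the function names are choices made here, not details taken from the paper.

```python
import math

def gaussian_kernel(variance):
    """Normalized, truncated Gaussian weights of the given variance."""
    radius = max(1, int(3.0 * math.sqrt(variance)) + 1)
    weights = [math.exp(-(k * k) / (2.0 * variance))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, variance):
    """Convolution with a Gaussian; the grid size is left unchanged."""
    kernel = gaussian_kernel(variance)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # replicate border samples
            acc += w * signal[j]
        out.append(acc)
    return out

def pyramid_level(signal):
    """Classical alternative: filtering followed by decimation by 2."""
    return smooth(signal, 1.0)[::2]

signal = [1.0 if 20 <= i < 40 else 0.0 for i in range(64)]
scales = [0.5, 1.5, 4.0]                      # arbitrary levels, not powers of two
stack = [smooth(signal, ell) for ell in scales]
coarse = pyramid_level(signal)
print([len(level) for level in stack], len(coarse))
```

The convolution stack keeps every level on the original grid and accepts any set of scales, while each pyramid level halves the grid and later requires an interpolation step.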

General difficulties of multi-resolution systems: even if several specific techniques have been proposed for some applications (see for instance (Alvarez et al., 2000; Baatz and Schaape, 2000; Bajcsy and Kovacic, 1989; Brox et al., 2004; Eck et al., 1995; Ojala et al., 2002)), in general, whatever the multi-resolution setup chosen, one of the main difficulties lies in the succession of independent problems: at a given resolution $\ell_n$, the problem consists in finding $dX^{\ell_n}$ using $X^{\ell_{n+1}}$ as a coarse approximation: $X^{\ell_n} = X^{\ell_{n+1}} + dX^{\ell_n}$. Once $X^{\ell_n}$ is estimated, its value is kept during the rest of the process. This is somewhat detrimental since it is now recognized that small scales (related to finer resolutions) interact with larger scales. With such schemes, at a given resolution $\ell_n$, the information related to smaller scales $\ell < \ell_n$ cannot be taken into account. In addition, any error in the estimation of $X^{\ell_n}$ will also be kept and propagated across the resolutions without any possibility of correction.

To deal with these difficulties, we propose in this paper an original solution based on optimal control theory and in particular on data assimilation. Such schemes consist in estimating a sequence of unknowns $X(t)$ driven by a more or less exact dynamical model by performing a series of forward/backward integrations. The key idea of this study is to exploit such a framework where the time is in fact related to the various scales. The forward/backward integrations then lead to a set of upscaling/downscaling steps that are well adapted to our problem. Before entering into the core of the technique in section 3, we first briefly introduce the data assimilation framework.

2 DATA ASSIMILATION

In this section we describe the general principles of the assimilation scheme we devised in this study. Variational data assimilation is a technique derived from optimal control theory (Lions, 1971) to recover a state variable's trajectory from a sequence of measurements. As opposed to sequential Bayesian filters, which share the same aim, this framework can handle high dimensional systems (and is thus intensively used, for instance, in environmental sciences (Bennett, 1992; Le Dimet and Talagrand, 1986; Talagrand and Courtier, 1987)). We refer the reader to (Bennett, 1992; Le Dimet and Talagrand, 1986; Lions, 1971; Talagrand and Courtier, 1987; Talagrand, 1997; Vidard et al., 2000) for complete methodological aspects of data assimilation and applications concerning geophysical flows.

The problem consists in recovering, from an initial condition $X_0$, a system's state $X$ partially observed and driven by approximately known dynamics. This can be formalized as finding $X(\mathbf{x}, t)$, for any location $\mathbf{x} \in \Omega$ at time $t \in [t_0, t_f]$, that satisfies the system:

$$\frac{\partial X}{\partial t}(\mathbf{x}, t) + \mathbb{M}(X(\mathbf{x}, t)) = \nu_m(\mathbf{x}, t), \qquad (3)$$

$$X(\mathbf{x}, t_0) = X_0(\mathbf{x}) + \nu_n(\mathbf{x}), \qquad (4)$$

$$\mathbf{Y}(\mathbf{x}, t) = \mathbb{H}(X(\mathbf{x}, t)) + \nu_o(\mathbf{x}, t), \qquad (5)$$

where $\mathbb{M}$ is the non-linear operator relative to the dynamics, $X_0$ is the initial vector at time $t_0$ and $(\nu_m, \nu_n)$ are (unknown) additive control variables relative to noise on the dynamics and the initial condition respectively. In addition, noisy measurements $\mathbf{Y}$ of the unknown state are available through the non-linear operator $\mathbb{H}$, up to $\nu_o$. To estimate the system's state, a common methodology relies on the minimization of the cost function $J$:

$$J(X) = \frac{1}{2}\int_{t_0}^{t_f} \left\| \mathbf{Y} - \mathbb{H}(X(\nu_m, \nu_n)) \right\|^2_{R^{-1}} dt + \frac{1}{2} \left\| X(\mathbf{x}, t_0) - X_0(\mathbf{x}) \right\|^2_{B^{-1}} + \frac{1}{2}\int_{t_0}^{t_f} \left\| \frac{\partial X}{\partial t}(\mathbf{x}, t) + \mathbb{M}(X(\mathbf{x}, t)) \right\|^2_{Q^{-1}} dt, \qquad (6)$$

where we have introduced the information matrices $R$, $B$, $Q$ relative to the covariance of the errors on the observations, the initial condition and the dynamics, respectively. The Mahalanobis distance that has been used reads, for an information matrix $A$: $\|X\|^2_{A^{-1}} = X^T A^{-1} X$. The evaluation of $X$ can be done by canceling the gradient $\delta J_X(\theta) = \lim_{\beta \to 0} \frac{J(X + \beta\theta) - J(X)}{\beta}$ of the cost function (6). Unfortunately, the estimation of such a gradient is in practice unfeasible for a large system state, since it would be necessary to integrate the dynamical model along all possible perturbations of the components of $X$. This is computationally out of reach with current hardware when one deals with a complete sequence of images and complex dynamical models. One way to cope with this difficulty is to write an adjoint formulation of the problem. To that end, the adjoint variables $\boldsymbol{\lambda}$ that express the errors of the dynamic model are introduced as:

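$$\boldsymbol{\lambda}(\mathbf{x}, t) = \int_{\Omega, t} Q^{-1}(\mathbf{x}, t, \mathbf{x}', t') \left( \frac{\partial X}{\partial t} + \mathbb{M}(X) \right)(\mathbf{x}', t')\, d\mathbf{x}'\, dt'. \qquad (7)$$

Denoting:

• $\left(\frac{\partial \mathbb{M}}{\partial X}\right)$ and $\left(\frac{\partial \mathbb{H}}{\partial X}\right)$ the linear tangent operators of $\mathbb{M}$ and $\mathbb{H}$ respectively. The linear tangent of an operator $\mathbb{A}$ is the directional derivative of the operator (the Gâteaux derivative):

$$\left(\frac{\partial \mathbb{A}}{\partial X}\right)(dX) = \lim_{\beta \to 0} \frac{\mathbb{A}(X + \beta\, dX) - \mathbb{A}(X)}{\beta}, \qquad (8)$$

• $(\partial_X \mathbb{M})^*$ and $(\partial_X \mathbb{H})^*$ their adjoint operators. The adjoint $\mathbb{A}^*$ of a linear operator $\mathbb{A}$ on a space $\mathcal{D}$ is such that:

$$\forall x_1, x_2 \in \mathcal{D}, \quad \langle \mathbb{A} x_1, x_2 \rangle = \langle x_1, \mathbb{A}^* x_2 \rangle, \qquad (9)$$

it can be shown that canceling the gradient $\delta J_X(\theta)$ with respect to the adjoint variables $\boldsymbol{\lambda}$ leads to a retrograde integration of an adjoint evolution model that takes into account the observations. Once the adjoint variables $\boldsymbol{\lambda}$ are estimated, one can recover the system state $X$ using relation (7). Finally, when dealing with non-linear models, recovering $X$ leads to the following incremental algorithm (Bennett, 1992):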

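1. Starting from $\tilde{X}(\mathbf{x}, t_0) = X_0(\mathbf{x})$, perform a forward integration: $\frac{\partial \tilde{X}}{\partial t} + \mathbb{M}(\tilde{X}) = 0$

2. $\tilde{X}(\mathbf{x}, t)$ being available, compute the adjoint variables $\boldsymbol{\lambda}(\mathbf{x}, t)$ with the backward equation:

$$\boldsymbol{\lambda}(t_f) = 0; \quad -\frac{\partial \boldsymbol{\lambda}}{\partial t}(t) + (\partial_X \mathbb{M})^* \boldsymbol{\lambda}(t) = (\partial_X \mathbb{H})^* R^{-1} (\mathbf{Y} - \mathbb{H}(\tilde{X}))(t) \qquad (10)$$

3. Update the initial condition: $dX(t_0) = B \boldsymbol{\lambda}(t_0)$;

4. $\boldsymbol{\lambda}$ being available, compute the state space $dX(t)$ from $dX(t_0)$ with the forward integration

$$\frac{\partial dX}{\partial t}(t) + \left(\frac{\partial \mathbb{M}}{\partial X}\right) dX(t) = Q \boldsymbol{\lambda}(t) \qquad (11)$$

5. Update: $\tilde{X} = \tilde{X} + dX$

6. Loop to step (2) until convergence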

Intuitively, the adjoint variables $\boldsymbol{\lambda}$ contain information about the discrepancy between the observations and the dynamic model. They are computed from a current solution $\tilde{X}$ with the backward integration (10), which encompasses both the observations and the dynamic operators. This deviation indicator between the observations and the model is then used to refine the initial condition (step (3)) and to recover the system state through an imperfect dynamic model where errors are $Q\boldsymbol{\lambda}$ (step (4)). It should be noted that if the dynamics are perfect, the associated error covariance $Q$ is zero and the algorithm only refines the initial condition.
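The six steps can be sketched on a scalar toy problem with a perfect model ($Q = 0$). Everything specific here is an assumption made for illustration: the linear dynamics $\mathbb{M}(X) = aX$, the identity observation operator, the explicit Euler discretization, noise-free observations, and scalar $B$ and $R^{-1}$, where $B$ effectively acts as a gain on the initial-condition update.

```python
def forward_model(x0, a, dt, steps):
    """Step (1): integrate dX/dt + a*X = 0 with explicit Euler."""
    traj = [x0]
    for _ in range(steps):
        traj.append(traj[-1] - dt * a * traj[-1])
    return traj

def backward_adjoint(traj, obs, a, dt, r_inv):
    """Step (2): integrate -dlam/dt + a*lam = R^{-1}(Y - X) from lam(t_f) = 0."""
    steps = len(traj) - 1
    lam = [0.0] * (steps + 1)
    for k in range(steps, 0, -1):
        forcing = r_inv * (obs[k] - traj[k])
        lam[k - 1] = lam[k] - dt * (a * lam[k] - forcing)
    return lam

def assimilate(obs, x_init, a, dt, b, r_inv, iterations):
    steps = len(obs) - 1
    traj = forward_model(x_init, a, dt, steps)             # step (1)
    for _ in range(iterations):                            # step (6): loop
        lam = backward_adjoint(traj, obs, a, dt, r_inv)    # step (2)
        increment = [b * lam[0]]                           # step (3)
        for _ in range(steps):                             # step (4), Q = 0
            increment.append(increment[-1] - dt * a * increment[-1])
        traj = [x + dx for x, dx in zip(traj, increment)]  # step (5)
    return traj

# Hypothetical experiment: observations generated from a true initial state 2.0.
truth = forward_model(2.0, a=0.5, dt=0.05, steps=40)
estimate = assimilate(truth, x_init=0.0, a=0.5, dt=0.05,
                      b=1.0, r_inv=1.0, iterations=20)
print(estimate[0])
```

In this setting the loop only refines the initial condition, as noted above for a perfect model, and the whole trajectory follows from it through the tangent integration.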

Graphic Illustration of the Incremental Algorithm. Let us illustrate in a simple way what the algorithm is doing. For the sake of clarity, we consider here the case of a perfect dynamic model where only the initial condition is refined.

- Let us take a closer look at step (1) of the above-mentioned algorithm: it consists in determining a first coarse “trajectory” of $X$. As shown in figure 2(a), we start from an initial condition $X(t_0)$ and compute the state variable values $\forall t \in [t_0, t_f]$ using a forward integration of the dynamical model $\mathbb{M}$. The red circles symbolically represent the available measurements $\mathbf{Y}$.

- Having a first estimation of $X$, we may now move on to step (2), i.e. solving the backward integration that encompasses both the observations and the dynamic operators. As represented in figure 2(b), this step consists in integrating the adjoint dynamic model (10) from $t_f$ to $t_0$ and may be understood as a backpropagation of the discrepancy between our estimated trajectory and the observations $\mathbf{Y}$.

- The purpose of computing the adjoint variable is to use its initial value $\boldsymbol{\lambda}(t_0)$ and the relation defined in step (3), thus obtaining an initial increment $dX(t_0)$. We then aim at “propagating” the latter along the time axis by integrating forward the tangent model (11).

- Finally, updating the previous estimation with this increment should yield a better compromise between the dynamical model and the observations (figure 2(c)).

Remarks on Variational Assimilation. An interesting property of variational assimilation is the mutual interaction of all estimations. That is, for each pair $(t_1, t_2) \in [t_0, t_f]^2$, the estimated value of $X(t_1)$ at a given iteration can influence the value of $X(t_2)$ and vice versa. This yields a more coherent estimation than sequential assimilation, where values of $X$ for a given time $t^*$ may only interact with values for $t > t^*$. This framework is then an appealing solution for large system states and acts as a smoother. To design an assimilation process, we need to define:

1. The system state;

2. The dynamical model (and its adjoint);

3. The observation operator (and its adjoint);

4. The error covariance matrices.

In this paper, we suggest exploiting this framework to define an original multi-resolution system. The temporal variable will be related to the spatial scales, as presented in the next section.

Figure 2: Symbolic illustration of one iteration of the incremental algorithm. (a) Forward integration starting from the initial condition (step (1)); (b) backward adjoint integration and initial correction (steps (2)-(3)).

3 VARIATIONAL ASSIMILATION FOR MULTI-RESOLUTION

Unlike most existing work on variational assimilation, in this article the usual temporal variable $t$ is connected to the different resolutions. To avoid ambiguities, we prefer to represent its value with the scale parameter $\ell$. A value $\ell = 0$ corresponds to the “plain” resolution at the image grid ($\ell = 0 \Rightarrow$ grid $\mathbf{X}^0 = \mathbf{x}$) and we assume that the system state $X$ evolves across the resolutions following the scale space relation:

$$\frac{\partial X}{\partial \ell} = \frac{1}{2} \Delta X. \qquad (12)$$

Indeed, it can easily be demonstrated that the solution of the previous relation is

$$X^\ell = X(\ell) = g_{\sqrt{\ell}} * X(0), \qquad (13)$$

where $g_{\sqrt{\ell}}$ is a Gaussian kernel of standard deviation $\sqrt{\ell}$ (variance $\ell$). As shown in section 1, this relation gives access to the various scales of $X$. The linear model $\mathbb{M}(X) = -\frac{1}{2}\Delta X$ associated to relation (3) and corresponding to the scale space evolution (12) is then a perfect model for an exploration of $X$ at different resolution levels. We then suggest relying on this relation, using the assimilation framework of the previous section, to define a multi-resolution system. Following the definition of adjoint operators, we get $(\partial_X \mathbb{M})^*(X) = \mathbb{M}(X) = -\frac{1}{2}\Delta X$.
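The self-adjointness of this model can be checked numerically with the inner-product identity (9). The sketch below is a 1-D illustration under assumptions that are not in the paper: a unit grid spacing, a three-point discrete Laplacian and zero values outside the grid. Other boundary treatments may not give an exactly symmetric discrete operator.

```python
import random

def laplacian(x):
    """Discrete 1-D Laplacian, unit spacing, zero values outside the grid."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append(left - 2.0 * x[i] + right)
    return out

def model(x):
    """M(X) = -(1/2) * Laplacian(X), the scale-space model."""
    return [-0.5 * v for v in laplacian(x)]

def inner(x, y):
    """Euclidean inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))

rng = random.Random(0)
x1 = [rng.uniform(-1.0, 1.0) for _ in range(32)]
x2 = [rng.uniform(-1.0, 1.0) for _ in range(32)]

# Relation (9) with A* = A: <M x1, x2> should equal <x1, M x2>.
gap = abs(inner(model(x1), x2) - inner(x1, model(x2)))
print(gap)
```

The gap is at the level of floating-point round-off, so the same routine can serve for both the forward and the adjoint integration in this discretization.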

The new multi-resolution procedure now consists in estimating $X(0)$ at the initial artificial time $\ell = 0$ (which corresponds to the image grid) under the perfect model of relation (12). The initial condition $X_0$ is set to zero and the observation system $\mathbf{Y}(X^0) = \mathbb{H}(X^0)$ defined at the image grid $\mathbf{X}^0$ reads for any resolution $\ell > 0$:

$$\begin{cases} \mathbf{Y}(X^\ell) = g_{\sqrt{\ell}} * \mathbf{Y}(X^0) \\ \mathbb{H}(X^\ell) = g_{\sqrt{\ell}} * \mathbb{H}(X^0) \end{cases} \qquad (14)$$

Following the algorithm described in section 2, and since the first integration of the dynamic model (12) with a null initial condition gives zeros for all $\ell$, the process is expressed as (with $B$ and $R$ the error covariance matrices related to the uncertainty on the initial condition and the observations):

• Initialization: $\tilde{X}(\ell) = 0$ for all scales $\ell \in [0, \ell_{\max}]$

1. Compute the adjoint variables $\boldsymbol{\lambda}(\ell)$ with the downscaling equation:

$$\boldsymbol{\lambda}(\ell_{\max}) = 0; \quad -\frac{\partial \boldsymbol{\lambda}}{\partial \ell}(\ell) - \frac{1}{2}\Delta \boldsymbol{\lambda}(\ell) = (\partial_{X^\ell} \mathbb{H})^* R^{-1} (\mathbf{Y}(X^\ell) - \mathbb{H}(X^\ell)) \qquad (15)$$

2. Adjoint variables $\boldsymbol{\lambda}$ being available, update the correction at the image grid: $dX(0) = B \boldsymbol{\lambda}(0)$;

3. Access the correction at all the resolution levels $dX(\ell)$ from $dX(0)$ with the upscaling equation:

$$\frac{\partial dX}{\partial \ell} = \frac{1}{2} \Delta dX \qquad (16)$$

4. Update: $\tilde{X} = \tilde{X} + dX$, $\forall \ell \in [0, \ell_{\max}]$

5. Loop to step (1) until convergence
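The downscaling/upscaling loop above can be sketched on a 1-D grid. This is a toy illustration, not the implementation used in the experiments. It assumes an identity observation operator, observations generated by diffusing a known target, explicit Euler steps in $\ell$, zero values outside the grid, and scalar $B$ and $R^{-1}$; the names `diffuse`, `upscale` and `downscale_adjoint` are hypothetical.

```python
def laplacian(x):
    """Discrete 1-D Laplacian, unit spacing, zero values outside the grid."""
    n = len(x)
    return [(x[i - 1] if i > 0 else 0.0) - 2.0 * x[i]
            + (x[i + 1] if i < n - 1 else 0.0) for i in range(n)]

def diffuse(x, d_ell):
    """One explicit Euler step of dX/dl = (1/2) * Laplacian(X)."""
    return [v + 0.5 * d_ell * lap for v, lap in zip(x, laplacian(x))]

def upscale(x0, d_ell, levels):
    """Propagate a field from l = 0 to coarser scales (relations (12), (16))."""
    stack = [x0]
    for _ in range(levels):
        stack.append(diffuse(stack[-1], d_ell))
    return stack

def downscale_adjoint(state, obs, d_ell, r_inv):
    """Integrate (15) from l_max down to 0, assuming H is the identity."""
    lam = [0.0] * len(state[0])                       # lambda(l_max) = 0
    for k in range(len(state) - 1, 0, -1):
        forcing = [r_inv * (y - x) for y, x in zip(obs[k], state[k])]
        lam = [s + d_ell * f for s, f in zip(diffuse(lam, d_ell), forcing)]
    return lam

def multires_assimilation(obs, d_ell, b, r_inv, iterations):
    levels = len(obs) - 1
    state = [[0.0] * len(obs[0]) for _ in obs]        # initialization
    for _ in range(iterations):                       # step 5: loop
        lam0 = downscale_adjoint(state, obs, d_ell, r_inv)   # step 1
        dx0 = [b * v for v in lam0]                          # step 2
        dx = upscale(dx0, d_ell, levels)                     # step 3
        state = [[x + d for x, d in zip(xs, ds)]             # step 4
                 for xs, ds in zip(state, dx)]
    return state

target = [1.0 if 5 <= i < 11 else 0.0 for i in range(16)]
obs = upscale(target, d_ell=0.25, levels=8)
state = multires_assimilation(obs, d_ell=0.25, b=0.5, r_inv=1.0, iterations=300)
error = max(abs(x - t) for x, t in zip(state[0], target))
print(error)
```

Each pass gathers the discrepancies of all scales into a single correction at $\ell = 0$ and then redistributes that correction to every level, so no level is frozen once estimated. The step size `d_ell` and the gain `b` were chosen small enough for this explicit scheme to remain stable.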

Comments on the new multi-resolution scheme: the first step of the previous algorithm corresponds to the usual downscaling approach: from coarse to fine resolutions, we compute and propagate the errors with equation (15). This makes it possible to refine the solution at the image grid (step 2). Unlike the classic multi-resolution approach, which ends here, we then re-propagate this correction to the various resolutions (step 3). The process is then repeated until convergence. This framework makes it possible to modify former solutions at a given resolution $\ell = L$ by taking into account their influence on smaller scales $\ell < L$. This answers the difficulty mentioned in section 1 and allows a correction of the estimations at all the resolution levels. As the dynamical model is perfect, for each application we need to define an observation system $\mathbf{Y}(X^\ell) = \mathbb{H}(X^\ell)$, its associated error covariance matrix $R$ and the one related to the null initial condition, $B$. In the next section, we apply this framework to optical-flow estimation.

4 APPLICATION TO OPTICAL-FLOW

The goal of this section is to demonstrate the efficiency of the introduced multi-resolution procedure. To that end, we use the context of optical flow estimation and we compare the results obtained with two classic multi-resolution approaches and with our new one.

The optical-flow problem consists in estimating a velocity field $X(\mathbf{x}) = \mathbf{v}(\mathbf{x}) = [u(\mathbf{x}), v(\mathbf{x})]^T$ at the image grid $\mathbf{x}$ between two images $I_1(\mathbf{x})$ and $I_2(\mathbf{x})$.

As mentioned above, the framework of section 3 is valid for any kind of application, depending on the observation operator $\mathbb{H}$. In this example related to motion estimation, we use a simple Lucas-Kanade observation model, presented in the next paragraph. It should be pointed out that this observation model is implemented for testing purposes; one of the main advantages of this framework is that more advanced techniques can naturally be embedded in the observation term, such as discontinuity-preserving ones or physical models accurately defined at a given resolution $\ell$ (sub-grid models in particular).

4.1 Observation Operator based on Lucas-Kanade

The simplest and most widely used observation model for optical-flow estimation is the brightness consistency assumption:

$$\frac{\partial I(\mathbf{x}, t)}{\partial t} + \mathbf{v}(\mathbf{x}, t) \cdot \nabla I(\mathbf{x}, t) \approx 0 \qquad (17)$$

which assumes that the points $\mathbf{x}$ keep their intensity along their displacements, the luminance $I$ being viewed as a continuous function and $\nabla = (\partial/\partial x, \partial/\partial y)^T$ being the gradient operator. Applied to a pair of images this relation reads:

$$I_2(\mathbf{x} + \Delta t\, \mathbf{v}(\mathbf{x})) - I_1(\mathbf{x}) = 0 \;\Rightarrow\; \frac{I_2(\mathbf{x}) - I_1(\mathbf{x})}{\Delta t} + \mathbf{v}(\mathbf{x}) \cdot \nabla I_2(\mathbf{x}) = 0 \qquad (18)$$

where we have used a first order Taylor expansion of the conservation constraint $I_2(\mathbf{x} + \Delta t\, \mathbf{v}(\mathbf{x})) - I_1(\mathbf{x}) = 0$ with respect to the displacement $\Delta t\, \mathbf{v}(\mathbf{x})$, and $\Delta t$ is the time between two images (by convention we assume $\Delta t = 1$). This creates a link between the displaced frame difference $I_2(\mathbf{x}) - I_1(\mathbf{x})$, the spatial gradients of the second image $\nabla I_2(\mathbf{x})$ and the velocity $\mathbf{v}$ to estimate. The equations in (17-18) are commonly named the optical-flow constraint equations (OFCE) and are the basis of a huge number of studies. The reader can refer to (Baker and Matthews, 2004; Baker et al., 2007; Baker et al., 2010; Barron et al., 1994; Galvin et al., 1998; Mitiche and Bouthemy, 1996) for presentations and overviews of optical flow techniques. At this step, it is easy to observe that:

1. in homogeneous areas, all terms vanish and there

is an inﬁnity of solutions;

2. because of the projection $\mathbf{v} \cdot \nabla I$, only the component of the velocity along the photometric gradient (the normal flow) can be extracted with such a formulation. This problem is known as the “aperture problem”.

Therefore, the relations in (17-18) are in themselves

not sufﬁcient to extract the velocity ﬁeld. We need to

add some constraints on the velocity to estimate.
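Both degeneracies can be seen numerically by stacking the constraint over a few pixels that share one velocity and looking at the resulting 2×2 normal matrix. The sketch below is illustrative only: the gradient samples are made up, the weights are uniform, and the names `structure_tensor` and `determinant` are hypothetical.

```python
def structure_tensor(gradients):
    """Sum of outer products of (I_x, I_y) over a window, uniform weights.

    Returns the three distinct entries (a11, a12, a22) of the symmetric matrix.
    """
    a11 = sum(gx * gx for gx, gy in gradients)
    a12 = sum(gx * gy for gx, gy in gradients)
    a22 = sum(gy * gy for gx, gy in gradients)
    return a11, a12, a22

def determinant(tensor):
    a11, a12, a22 = tensor
    return a11 * a22 - a12 * a12

flat = [(0.0, 0.0)] * 9                         # homogeneous area: no gradient
edge = [(1.0, 0.0)] * 9                         # straight edge: parallel gradients
corner = [(1.0, 0.0)] * 5 + [(0.0, 1.0)] * 4    # two gradient directions

for name, window in (("flat", flat), ("edge", edge), ("corner", corner)):
    print(name, determinant(structure_tensor(window)))
```

The determinant vanishes for the flat window (case 1) and for the straight edge (case 2, the aperture problem), so the velocity cannot be recovered there; only the window with two gradient directions gives an invertible system.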

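In the work of Lucas & Kanade (Lucas and Kanade, 1981), the authors assume that, for each location $\mathbf{x}$, the velocity is locally constant. It is then estimated as:

$$\mathbf{v} = \arg\min_{\mathbf{v} = (u, v)} \int_\Omega g_\sigma * \left( \frac{\partial I(\mathbf{x}, t)}{\partial t} + \mathbf{v}(\mathbf{x}, t) \cdot \nabla I(\mathbf{x}, t) \right)^2 \qquad (19)$$

where $g_\sigma$ is a Gaussian window of standard deviation $\sigma$ in which the velocity $\mathbf{v}$ is assumed to be homogeneous. Canceling the derivative of the previous relation with respect to $\mathbf{v}$ leads to:

$$\mathbf{v} = - \left( g_\sigma * \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} \right)^{-1} \left( g_\sigma * \begin{bmatrix} I_x I_t \\ I_y I_t \end{bmatrix} \right), \qquad (20)$$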

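where $I_\bullet = \partial I / \partial \bullet$. To guarantee a good conditioning of the matrix to invert, the spatial gradients must not vanish. The Gaussian smoothing in fact aims at alleviating the problem of homogeneous areas by capturing the spatial information at a scale related to $\sigma$. On the basis of relation (20), we then define our observation system, where the unknown $X^\ell$ can be set at each resolution $\mathbf{X}^\ell$ with the linear Lucas-Kanade based observation system $\mathbf{Y}(X^\ell) = \mathbb{H}(X^\ell)$ as:

$$\begin{cases} \mathbf{Y}(X^\ell) = g_{\sqrt{\ell}} * g_\sigma * \begin{bmatrix} I_x I_t \\ I_y I_t \end{bmatrix} \\[4pt] \mathbb{H}(X^\ell) = - g_{\sqrt{\ell}} * \left[ \left( g_\sigma * \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} \right) X^0 \right] \end{cases} \qquad (21)$$

The adjoint operator of $\mathbb{H}$ is itself and $g_\sigma$ is a convolution with a Gaussian of standard deviation $\sigma$ related to the Lucas & Kanade strategy.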
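For a single window, relation (20) reduces to a weighted 2×2 linear solve, which can be sketched as follows. The sample gradients, the window weights, the conditioning threshold and the function name `lucas_kanade_velocity` are assumptions made for the illustration; in practice the derivatives would be computed from the image pair.

```python
def lucas_kanade_velocity(samples, weights):
    """Solve relation (20) for one window.

    samples: list of (I_x, I_y, I_t) triples taken inside the window.
    weights: values of the window g_sigma at the same locations.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (ix, iy, it), w in zip(samples, weights):
        a11 += w * ix * ix
        a12 += w * ix * iy
        a22 += w * iy * iy
        b1 += w * ix * it
        b2 += w * iy * it
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # Homogeneous area or aperture problem: the matrix cannot be inverted.
        raise ValueError("ill-conditioned window")
    # v = -A^{-1} b, with the closed-form inverse of a symmetric 2x2 matrix.
    u = -(a22 * b1 - a12 * b2) / det
    v = -(a11 * b2 - a12 * b1) / det
    return u, v

# Hypothetical window: temporal derivatives consistent with a known velocity.
true_u, true_v = 0.5, -0.25
gradients = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]
samples = [(gx, gy, -(true_u * gx + true_v * gy)) for gx, gy in gradients]
weights = [1.0, 0.6, 0.6, 0.4]
u, v = lucas_kanade_velocity(samples, weights)
print(u, v)
```

Because the temporal derivatives here satisfy the constraint exactly, the solve returns the velocity used to build them. The explicit determinant check mirrors the conditioning requirement stated above.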

4.2 Error Covariance Matrix and Initialization Issues

Error covariance matrix related to the observation: we have defined, at each point of the grids $\mathbf{X}^\ell$, the matrix $R^{-1}$ used in (10) as a diagonal one such that:

$$R^{-1}(X^\ell) = R_{\max} \exp\left( - \frac{[\mathbf{Y}(X^\ell) - \mathbb{H}(X^\ell)]^2}{\sigma_{obs}^2} \right) \qquad (22)$$

As shown in (Corpetti et al., 2009), this penalization amounts to considering a robust norm in relation (19). Such a robust function allows the discarding of points having large “residual” values of the observation error $[\mathbf{Y}(X^\ell) - \mathbb{H}(X^\ell)]$ (called outliers in the robust statistics literature (Huber, 1981; Geman and Reynolds, 1992; Delanay and Bresler, 1998)). In our application, it makes it possible to properly deal with corrupted areas that do not fit our data model exactly.

Initial conditions: as mentioned in section 3, we have no prior knowledge $X_0$. We have therefore set $X_0 = 0$. The associated covariance matrix $B$ is then defined as

$$B(X_0) = \mathrm{Id} - \exp\left( - |I_2(\mathbf{x}) - I_1(\mathbf{x})| / \sigma_B \right) \qquad (23)$$

where $\sigma_B$ is a parameter to fix. It is related to the adequacy of the null solution with respect to the OFCE.

Complete scheme: the multi-resolution strategy of section 3, with the observation operator defined in (21) and the associated error covariance matrices $R$ and $B$ (defined in (22) and (23) respectively), constitutes the complete framework for optical-flow estimation using an original downscaling/upscaling multi-resolution process. Let us now turn to some experimental results.


Table 1: Quantitative comparisons on the DNS sequence with a pyramidal Lucas-Kanade (PYR, (Lucas and Kanade, 1981)), a Lucas-Kanade term embedded in a multi-resolution scheme with successive convolutions without any assimilation (CONV) and the proposed multi-resolution using variational assimilation (AMR). In addition, we report numerical values from a commercial technique based on correlation (COM, LA VISION SYSTEM), Horn & Schunck (HS, (Horn and Schunck, 1981)), two fluid-dedicated motion estimators with div-curl smoothing terms (DC1: (Corpetti et al., 2002); DC2: (Yuan et al., 2007)) and a stochastic observation term (STO, (Corpetti and Mémin, 2012)).

        PYR     CONV    AMR     COM     HS      DC1     DC2      STO
AAE     6.07    4.53    3.74    4.58    4.27    4.35    3.04     3.12
RMSE    0.1699  0.1243  0.1057  0.1520  0.1385  0.1340  0.09602  0.0961

5 EXPERIMENTAL RESULTS

We have tested the technique presented in this paper on three kinds of synthetic images:

1. Synthetic fluid particles: a synthetic pair of images from a Direct Numerical Simulation (DNS) of the Navier-Stokes equations. More precisely, the sequence simulates the evolution of particles subjected to a 2D turbulent flow. We have used such data since turbulent flows are known to exhibit many interactions between the different scales. It is therefore expected that the benefit of the approach will be highlighted.

2. Synthetic natural image: a synthetic pair of images generated by hand with simple and discontinuous motions. On these images, some areas are subject to the aperture problem and we will show that the introduced multi-resolution is able to improve the results.

3. Synthetic image from the Middlebury database: the Middlebury database was proposed in (Baker et al., 2007) to compare recent and state-of-the-art optical-flow methods. It contains several sequences with various challenging situations such as hidden textures, complex scenes, non-rigid motion, strong motion discontinuities, ... We have tested the technique on one pair from this dataset.

In the experiments, the results obtained by our technique are noted AMR for “Assimilation Multi-Resolution”. We recall here that the objective of this section is to validate the multi-resolution strategy rather than to propose an efficient optical-flow estimation technique. Therefore we have compared our AMR technique with two other Lucas-Kanade estimators that use different multi-resolution approaches: PYR for the usual pyramidal decomposition and CONV for a downscaling approach based on successive convolutions (see the introduction).

5.1 Synthetic Fluid Particles

At the top of figure 3, we present: one image of the sequence (fig. 3(a)), the ground truth (fig. 3(b)), an estimated velocity field with the CONV technique (fig. 3(c)) and with the AMR one (fig. 3(d)). As one can observe, the velocity fields are all similar and close to the ground truth. However, the quantitative values of AAE (Average Angular Error) and RMSE (Root Mean Square Error) in table 1 show that among the three techniques PYR, CONV and AMR, which correspond to the same estimator with different multi-resolution strategies, the proposed one performs best (AAE of 3.74 against 4.53 for CONV and 6.07 for PYR). It is also interesting to note that the pyramidal technique usually exploited to obtain the various scales performs worse than a series of convolutions. In addition, we have reported in this table some published results, on this pair of images, of other techniques especially devoted to such fluid and particle flows. These values show that, compared to more advanced approaches (related to optical-flow or devoted to fluid images), the improvement obtained with this framework makes this technique very competitive, despite a very simple observation term. We can then conclude from this experiment that most of the common multi-resolution techniques are likely to introduce some errors that can partially be removed with a more advanced strategy, such as the one presented in this paper.
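The two error measures are not spelled out in the text. The sketch below assumes the usual conventions: the angular error of Barron et al. (1994), computed between the vectors $(u, v, 1)$, and a per-pixel Euclidean RMSE on the velocity. The exact definitions behind the reported values may differ, and the function names are hypothetical.

```python
import math

def average_angular_error(estimated, truth):
    """Mean angle (degrees) between (u, v, 1) vectors of two flow fields."""
    total = 0.0
    for (ue, ve), (ut, vt) in zip(estimated, truth):
        dot = ue * ut + ve * vt + 1.0
        norm = (math.sqrt(ue * ue + ve * ve + 1.0)
                * math.sqrt(ut * ut + vt * vt + 1.0))
        cosine = max(-1.0, min(1.0, dot / norm))   # guard against round-off
        total += math.degrees(math.acos(cosine))
    return total / len(truth)

def rmse(estimated, truth):
    """Root mean square of the per-pixel Euclidean velocity error."""
    total = 0.0
    for (ue, ve), (ut, vt) in zip(estimated, truth):
        total += (ue - ut) ** 2 + (ve - vt) ** 2
    return math.sqrt(total / len(truth))

# Flow fields are given as flat lists of (u, v) pairs, one per pixel.
estimated = [(1.0, 0.0), (0.3, -0.2)]
truth = [(0.0, 0.0), (0.3, -0.2)]
print(average_angular_error(estimated, truth), rmse(estimated, truth))
```

Adding the unit third component keeps the angular error defined where the true velocity is zero, which is why it is commonly preferred to a plain 2-D angle.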

5.2 Synthetic Natural Image

In the middle of figure 3, we present: one image of the sequence (fig. 3(e)), the ground truth (fig. 3(f)), an estimated velocity field with the CONV technique (fig. 3(g)) and with the AMR one (fig. 3(h)). As shown in fig. 3(f), the synthetic velocity field is composed of two distinct homogeneous motions (applied to the areas in dotted red lines in fig. 3(e)). One can observe that in the right-hand part of the image, a constant region is subject to the aperture problem, which makes the estimation very delicate. This difficulty clearly affects the quality of the motion fields of figures 3(g–h). However, from these two images, it clearly appears that the one obtained with the AMR approach is better, as confirmed by the quantitative values of table 2. In this example, the first estimation at the coarser level contained some errors. These are not corrected with a classical downscaling approach, whereas they are corrected using our technique.

Figure 3: Experiments on synthetic data. Top: synthetic DNS; middle: synthetic general image made by hand; bottom: Dimetrodon from the Middlebury database. For each pair, we respectively show one image of the sequence (a-e-i), the synthetic velocity field (b-f-j), one estimation using the CONV technique (c-g-k) and one estimation using the proposed AMR multi-resolution (d-h-l).

Table 2: Quantitative comparisons on the synthetic sequence with a Lucas-Kanade term embedded in a multi-resolution scheme with successive convolutions without any assimilation (CONV) and the proposed multi-resolution using variational assimilation (AMR).

        CONV    AMR
AAE     9.02    5.62
RMSE    1.28    0.79

5.3 Synthetic Images of the Middlebury Database

We have tested our approach on the “Dimetrodon” pair of images from the Middlebury database. Figure 3 shows one image of the sequence (fig. 3(i)), the ground truth (fig. 3(j)), an estimated velocity field with the CONV technique (fig. 3(k)) and with the AMR one (fig. 3(l)). Associated numerical values are given in table 3. Here again, on this complex motion with many discontinuities, the original multi-resolution introduced in this paper outperforms a usual one using similar observation terms. This is a promising behavior.

Table 3: Quantitative comparisons on the Dimetrodon sequence with a Lucas-Kanade term embedded in a multi-resolution scheme with successive convolutions without any assimilation (CONV) and the proposed multi-resolution using variational assimilation (AMR).

        CONV    AMR
AAE     7.95    6.50
RMSE    0.98    0.62

5.4 Key Message

Three different situations have been tested in these experiments: the first one exhibits a flow with many interactions between scales, the second one is subject to the aperture problem, and the third one is composed of a more complex velocity field. The associated qualitative and quantitative results show that, for the same estimation technique (i.e. the Lucas-Kanade estimation), the new multi-resolution approach performs better than the conventional one in all three cases. This is the key message of these experiments.
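To put the comparison in relative terms, the short sketch below computes by how much the AMR errors are lower than the CONV errors. It uses only the values reported in tables 2 and 3; the helper name `relative_reduction` is an illustrative choice.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# (CONV, AMR) values copied from tables 2 and 3.
results = {
    "table 2, AAE": (9.02, 5.62),
    "table 2, RMSE": (1.28, 0.79),
    "table 3, AAE": (7.95, 6.50),
    "table 3, RMSE": (0.98, 0.62),
}

for name, (conv, amr) in results.items():
    print(f"{name}: {relative_reduction(conv, amr):.1f}% lower with AMR")
```

On these two sequences the reduction is roughly 37 to 38% for three of the four measures, and about 18% for the AAE on the Dimetrodon pair.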

6 CONCLUSIONS

In this paper, we have introduced an original means of performing the multi-resolution strategies commonly used in computer vision. These techniques, used to handle efficiently some simplifications (such as linearization), generally suffer from one drawback: their inability to correct errors made at coarser resolutions. Such errors are indeed most of the time propagated along the scales. In this study, we have exploited a framework issued from optimal control theory, and in particular variational data assimilation, to solve this issue. The general idea of variational data assimilation techniques consists in performing a set of forward/backward integrations of a dynamical system to estimate a system state. Applied to scale-space equations, this idea has allowed us to derive a consistent mathematical framework that performs any multi-resolution scheme as a set of forward/backward integrations, which in practice correspond to a set of downscaling/upscaling estimations.
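The forward/backward principle can be illustrated on a toy problem. The sketch below is not the scheme of this paper: it assumes a 1D periodic signal, a linear smoothing step standing in for the scale-space dynamics, one observation per scale, and a quadratic cost with a background term. All names (`smooth`, `forward`, `cost`, `gradient`, `assimilate`) and parameter values are hypothetical. The forward pass integrates the state through the scales; the backward pass integrates the adjoint variable and yields the gradient of the cost with respect to the initial state, which is then used in a plain gradient descent.

```python
def smooth(x, a=0.2):
    """One explicit diffusion step with periodic boundaries.

    The operator is symmetric, so it is also its own adjoint.
    """
    n = len(x)
    return [x[i] + a * (x[i - 1] - 2.0 * x[i] + x[(i + 1) % n]) for i in range(n)]


def forward(x0, steps):
    """Forward integration: the state at every scale, starting from x0."""
    traj = [list(x0)]
    for _ in range(steps):
        traj.append(smooth(traj[-1]))
    return traj


def cost(x0, xb, obs):
    """Background term plus squared misfit to the observation at each scale."""
    traj = forward(x0, len(obs) - 1)
    j = 0.5 * sum((p - q) ** 2 for p, q in zip(x0, xb))
    for xk, yk in zip(traj, obs):
        j += 0.5 * sum((p - q) ** 2 for p, q in zip(xk, yk))
    return j


def gradient(x0, xb, obs):
    """Gradient of the cost with respect to x0 via one backward (adjoint) pass."""
    traj = forward(x0, len(obs) - 1)
    lam = [0.0] * len(x0)
    # Backward integration: lam_k = M^T lam_{k+1} + (x_k - y_k).
    for xk, yk in zip(reversed(traj), reversed(obs)):
        lam = [l + (p - q) for l, p, q in zip(smooth(lam), xk, yk)]
    return [(p - q) + l for p, q, l in zip(x0, xb, lam)]


def assimilate(xb, obs, step=0.1, iters=200):
    """Gradient descent on the initial state, starting from the background xb."""
    x0 = list(xb)
    for _ in range(iters):
        g = gradient(x0, xb, obs)
        x0 = [p - step * q for p, q in zip(x0, g)]
    return x0


if __name__ == "__main__":
    truth = [0.0, 1.0, 0.0, -1.0, 0.5, 0.0]
    obs = forward(truth, 3)          # noise-free observations at four scales
    xb = [0.0] * len(truth)          # uninformative background
    x_hat = assimilate(xb, obs)
    print(cost(xb, xb, obs), cost(x_hat, xb, obs))
```

Each descent iteration costs one forward and one backward integration, whatever the dimension of the state, which is the main practical appeal of the adjoint formulation.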

We have validated the idea on a simple Lucas-Kanade motion estimation technique for three synthetic pairs of images corresponding to various situations. The experimental results reveal that, for all tested images, our multi-resolution approach outperforms the classical one, which is a promising conclusion. In future work, we will use more advanced observation terms associated with non-linear scale-space dynamics able to preserve discontinuities.

VISAPP 2012 - International Conference on Computer Vision Theory and Applications