Continuous Tracking of Structures from an Image Sequence

Yann Lepoittevin (1,2), Dominique Béréziat (3), Isabelle Herlin (1,2) and Nicolas Mercier (1,2)

(1) Inria, B.P. 105, 78153 Le Chesnay, France

(2) CEREA, Joint Laboratory ENPC - EDF R&D, Université Paris-Est, Cité Descartes Champs-sur-Marne, 77455 Marne la Vallée Cedex 2, France

(3) Université Pierre et Marie Curie, 4 place Jussieu, Paris 75005, France

Keywords: Tracking, Motion, Data Assimilation, Satellite Image, Meteorology.

Abstract: The paper describes an innovative approach that estimates velocity on an image sequence while simultaneously segmenting and tracking a given structure. It relies on the dynamics equations of the underlying physical system. A data assimilation method is applied to solve the evolution equations of image brightness, of motion dynamics, and of the distance map modelling the tracked structure. Results are first quantified on synthetic data by comparison with ground truth. The method is then applied to meteorological satellite acquisitions of a tropical cloud, in order to track this structure along the sequence. The outputs of the approach are continuous estimates of both the motion and the structure's boundary. The main advantage is that the method relies only on image data and on a rough segmentation of the structure at the initial date.

1 INTRODUCTION

The issue of detecting and tracking a structure covers a broad range of major computer vision problems. Readers can refer to (Yilmaz et al., 2006), for instance, for an extensive description of this issue. However, images may be noisy, as is the case for satellite acquisitions, and assumptions on the dynamics should then be involved. To our knowledge, no published method simultaneously estimates motion and segments/tracks a structure from only image data and a rough segmentation of the structure. However, methods exist that segment and track a structure, given the motion field and an initial segmentation (Peterfreund, 1999; Rathi et al., 2007; Avenel et al., 2009), or that track a structure and estimate its motion if this structure has been accurately segmented on the first image (Bertalmío et al., 2000).

The use of data assimilation recently emerged in the image processing community. In (Béréziat and Herlin, 2011), motion estimation is discussed, and solutions are described for processing noisy images. In (Papadakis and Mémin, 2008), an incremental 4D-Var is used that also computes the motion field and tracks a structure, but relies, as inputs, on both image data and an accurate segmentation of the structure on the whole sequence. Our approach has the advantage of simultaneously solving the issues of motion estimation, detection, segmentation and tracking of the structure, with image data and their gradient values as only inputs.

Section 2 describes the main mathematical components of the approach. Sections 3 and 4 discuss results obtained on synthetic data and meteorological satellite acquisitions. Section 5 concludes with some remarks and perspectives on the research work.

2 MATHEMATICAL SETTING

Our approach is based on a 4D-Var data assimilation algorithm, used to estimate motion on the sequence and track a structure.

Ω denotes the bounded image domain, on which pixels $\mathbf{x} = (x \;\; y)^T$ are considered, $[0, T]$ the studied temporal interval, and $A = \Omega \times [0, T]$.

2.1 Model of Structure and Input Data

Let us define the structure tracked along the image sequence by an implicit function φ (see Figure 1): each pixel $\mathbf{x}$ at date $t$ takes as value its signed distance to the current position of the structure boundary.

Observations used during the assimilation are the images themselves and their contour points, obtained by thresholding the maxima of the gradient norm.
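As an illustration of these two inputs, the following minimal sketch builds a signed distance map from a rough binary mask and a distance map to contour points. It is not the authors' implementation: the function names are hypothetical, the sign convention (positive inside the structure) follows the twin experiment of Section 3, and contour detection is simplified to a plain threshold on the gradient norm rather than a threshold on its local maxima.

```python
import numpy as np
from scipy import ndimage


def signed_distance(mask):
    """Signed distance to the boundary of a boolean mask (positive inside)."""
    mask = np.asarray(mask, dtype=bool)
    inside = ndimage.distance_transform_edt(mask)    # distance to nearest outside pixel
    outside = ndimage.distance_transform_edt(~mask)  # distance to nearest inside pixel
    return inside - outside


def contour_distance_map(image, threshold):
    """Contour points (gradient norm above threshold) and distance to them."""
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    contours = np.hypot(gx, gy) > threshold
    # distance_transform_edt measures the distance to the nearest zero,
    # so the contour mask is inverted before the call.
    return contours, ndimage.distance_transform_edt(~contours)


# Toy data: a bright square on a dark background.
image = np.zeros((32, 32))
image[10:20, 12:22] = 1.0

phi0 = signed_distance(image > 0.5)                           # rough initial structure
contours, d_c = contour_distance_map(image, threshold=0.25)   # observation side
```

Near the boundary of the square, `np.abs(phi0)` and `d_c` are close to each other, which is the property the observation equation on φ exploits later on.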


Lepoittevin Y., Béréziat D., Herlin I. and Mercier N.

Continuous Tracking of Structures from an Image Sequence.

DOI: 10.5220/0004278503860389

In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP-2013), pages 386-389

ISBN: 978-989-8565-48-8

Copyright © 2013 SCITEPRESS (Science and Technology Publications, Lda.)

Figure 1: Implicit representation of the structure's boundary: the function φ(x, y) and its zero level set φ(x, y) = 0.

2.2 Evolution Model

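The assumption on dynamics is the Lagrangian constancy of velocity $\mathbf{w} = (u \;\; v)^T$, rewritten as:

\[ \frac{du}{dt} = 0 \;\Leftrightarrow\; \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y} = 0 \tag{1} \]

\[ \frac{dv}{dt} = 0 \;\Leftrightarrow\; \frac{\partial v}{\partial t} + u \frac{\partial v}{\partial x} + v \frac{\partial v}{\partial y} = 0 \tag{2} \]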

A pseudo-image $I_s$ is defined, that satisfies the optical flow constraint:

\[ \frac{\partial I_s}{\partial t} + \nabla I_s \cdot \mathbf{w} = 0 \tag{3} \]

The pseudo-image is compared to satellite data during the optimization process: they have to be almost identical at acquisition dates. The implicit function φ is assumed to satisfy the same heuristics, as the structure moves according to the image evolution:

\[ \frac{\partial \phi}{\partial t} + \nabla \phi \cdot \mathbf{w} = 0 \tag{4} \]
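Equations (1) to (4) all have the same transport form: a quantity is carried along by the velocity field. The sketch below shows one possible discretization, a first-order explicit upwind step, applied to φ. The scheme, the periodic boundaries, the grid spacing and the function name are assumptions made for illustration; the paper does not state which numerical scheme is used.

```python
import numpy as np


def advect_upwind(f, u, v, dt, dx=1.0):
    """One explicit upwind step of df/dt + u df/dx + v df/dy = 0.

    Arrays are indexed [y, x]; boundaries are periodic (sketch assumption).
    """
    # One-sided finite differences in each direction.
    fx_back = (f - np.roll(f, 1, axis=1)) / dx
    fx_fwd = (np.roll(f, -1, axis=1) - f) / dx
    fy_back = (f - np.roll(f, 1, axis=0)) / dx
    fy_fwd = (np.roll(f, -1, axis=0) - f) / dx
    # Use the difference taken on the side the flow comes from.
    fx = np.where(u > 0, fx_back, fx_fwd)
    fy = np.where(v > 0, fy_back, fy_fwd)
    return f - dt * (u * fx + v * fy)


# Toy structure: phi is positive inside a disc of radius 6.
yy, xx = np.mgrid[-16:16, -16:16]
phi = 6.0 - np.hypot(xx, yy)

# Uniform velocity of one pixel per time step along x.
u = np.ones_like(phi)
v = np.zeros_like(phi)

phi_t = phi.copy()
for _ in range(5):
    phi_t = advect_upwind(phi_t, u, v, dt=1.0)
```

With this uniform unit velocity and `dt = dx`, each step shifts φ by exactly one pixel, so `phi_t` equals `np.roll(phi, 5, axis=1)`. The same function can advect the pseudo-image, or `u` and `v` themselves as in Equations (1) and (2); in that case the time step has to be small enough for the scheme to remain stable.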

The state vector, defined as $X = (u \;\; v \;\; I_s \;\; \phi)^T$, satisfies the evolution system (1, 2, 3, 4), summarized by:

\[ \frac{\partial X}{\partial t} + \mathbb{M}(X(t)) = 0 \tag{5} \]

2.3 4D-Var Data Assimilation

In order to estimate $X$, and obtain motion estimation and tracking of the structure, the 4D-Var algorithm considers the following three equations:

\[ \frac{\partial X}{\partial t}(\mathbf{x}, t) + \mathbb{M}(X)(\mathbf{x}, t) = 0 \tag{6} \]

\[ X(\mathbf{x}, 0) = X_b(\mathbf{x}) + \varepsilon_B(\mathbf{x}) \tag{7} \]

\[ \mathbb{H}(X, Y)(\mathbf{x}, t) = \varepsilon_R(\mathbf{x}, t) \tag{8} \]

The first one is the evolution equation. One can notice that $X(\mathbf{x}, t)$, for any $t$, is determined from $X(\mathbf{x}, 0)$ and the integration of Equation (6).

Equation (7) corresponds to the knowledge that is available on the state vector at initial date 0, expressed as the background value $X_b(\mathbf{x})$. The solution $X(\mathbf{x}, 0)$, estimated by 4D-Var, should stay close to this background value. However, as the background is uncertain, an error term, $\varepsilon_B(\mathbf{x})$, is considered. No knowledge is available on the initial velocity field and its background value is null; the background on the pseudo-image $I_s$ is the first image of the sequence; and the background of φ, denoted $\phi_b$, roughly defines the structure to be tracked. Let $\mathbb{P}$ be the projection of the state vector on components $I_s$ and φ. Equation (7) is then rewritten as:

\[ \mathbb{P}(X(0)) = \mathbb{P}(X_b) + \varepsilon_B \tag{9} \]

Equation (8), named observation equation, links the observations to the state vector $X$. The observation vector $Y$ includes the image acquisitions and a distance map to the contour points that have been computed on these acquisitions. This distance map is denoted by $D_c(\mathbf{x}, t)$. $\mathbb{H}$ denotes the observation operator, split in two parts: $\mathbb{H} = (\mathbb{H}_I \;\; \mathbb{H}_\phi)^T$. $\mathbb{H}_I$ compares pseudo-images $I_s$ to image observations $I$:

\[ \mathbb{H}_I(X, Y) = I_s - I = \varepsilon_{R_I} \tag{10} \]

Their discrepancy is described by the error $\varepsilon_{R_I}$. $\mathbb{H}_\phi$ compares φ to the distance map $D_c(\mathbf{x}, t)$. The absolute value of φ should be almost equal to $D_c$:

\[ \mathbb{H}_\phi(X, Y) = \frac{e^{-a\phi}}{(1 + e^{-a\phi})^2} \left( |\phi| - D_c \right) = \varepsilon_{R_\phi} \tag{11} \]

The function $\frac{e^{-a\phi}}{(1 + e^{-a\phi})^2}$ is introduced to decrease the impact of contours that do not belong to the boundary of the tracked structure. Parameter $a$ controls the slope of the function. The more $a$ increases, the more the function looks like an indicator function: only pixels in a small neighborhood of the structure's boundary are considered by $\mathbb{H}_\phi$ during the optimization process.
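The effect of parameter $a$ can be checked numerically. The sketch below assumes that the weight of Equation (11) is the bell-shaped function $e^{-a\phi}/(1+e^{-a\phi})^2$, which peaks at φ = 0; this form is an assumption of the sketch, as are the function names. The weight is evaluated through the logistic function to avoid overflow for large $|a\phi|$.

```python
import numpy as np
from scipy.special import expit  # logistic function 1 / (1 + exp(-x))


def boundary_weight(phi, a):
    """Assumed weight exp(-a*phi) / (1 + exp(-a*phi))**2, equal to s * (1 - s)."""
    s = expit(a * np.asarray(phi, dtype=float))
    return s * (1.0 - s)


def h_phi(phi, d_c, a):
    """Weighted discrepancy between |phi| and the contour distance map."""
    return boundary_weight(phi, a) * (np.abs(phi) - d_c)


phi = np.linspace(-10.0, 10.0, 5)     # signed distances: -10, -5, 0, 5, 10
w_soft = boundary_weight(phi, a=0.5)  # wide band around the boundary
w_sharp = boundary_weight(phi, a=5.0)  # narrow band around the boundary
```

Both weights reach their maximum, 0.25, at φ = 0. With `a = 5.0` the weight is negligible five pixels away from the boundary, whereas with `a = 0.5` contours at that distance still contribute to the cost.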

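Errors $\varepsilon_B$ and $\varepsilon_R$ are supposed Gaussian, zero-mean, and uncorrelated, with respective variances $B$ and $R$. Solving System (6, 9, 10, 11) is then written as the minimization of the following cost function:

\[ J(X(0)) = \frac{1}{2} \int_A \mathbb{H}(X, Y)^T R^{-1}\, \mathbb{H}(X, Y)(\mathbf{x}, t)\, d\mathbf{x}\, dt + \frac{1}{2} \int_\Omega \left[ \mathbb{P}(X(0)) - \mathbb{P}(X_b) \right]^T B^{-1} \left[ \mathbb{P}(X(0)) - \mathbb{P}(X_b) \right](\mathbf{x})\, d\mathbf{x} \tag{12} \]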

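Let λ denote the adjoint variable, verifying:

\[ \lambda(T) = 0 \tag{13} \]

\[ -\frac{\partial \lambda(t)}{\partial t} + \left( \frac{\partial \mathbb{M}}{\partial X} \right)^{*} \lambda(t) = \mathbb{H}^{*} R^{-1}\, \mathbb{H}(X, Y)(t) \tag{14} \]

with the adjoint operator $\left( \frac{\partial \mathbb{M}}{\partial X} \right)^{*}$, which is defined by $\langle Z\eta, \lambda \rangle = \langle \eta, Z^{*}\lambda \rangle$. The adjoint operator $\left( \frac{\partial \mathbb{M}}{\partial X} \right)^{*}$ is automatically generated from the discrete operator $\mathbb{M}$ by an efficient automatic differentiation software (Hascoët and Pascual, 2004). Then, the gradient of $J$ is:

\[ \frac{\partial J}{\partial X(0)} = \mathbb{P}^T B^{-1} \left[ \mathbb{P}(X(0)) - \mathbb{P}(X_b) \right] + \lambda(0) \tag{15} \]

Minimization is achieved by a steepest descent method and the L-BFGS algorithm (Zhu et al., 1997).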
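To make the roles of the cost function, the adjoint variable and the gradient more concrete, here is a minimal sketch of a discrete 4D-Var on a toy problem. It is not the authors' implementation. It assumes a hypothetical linear model `M` acting on a 4-component state, noise-free observations of the full state at every date, scalar variances `b_var` and `r_var`, and SciPy's L-BFGS-B routine as the minimizer; the adjoint is written by hand instead of being generated by automatic differentiation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, n_steps = 4, 5

# Hypothetical linear evolution model: X_{k+1} = M X_k.
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))
x_true = rng.standard_normal(n)


def integrate(x0):
    """Forward integration: returns the trajectory X_0, ..., X_N."""
    traj = [x0]
    for _ in range(n_steps):
        traj.append(M @ traj[-1])
    return np.array(traj)


obs = integrate(x_true)   # observations simulated from the true initial state
x_b = np.zeros(n)         # null background, as for the velocity in the paper
b_var, r_var = 1e4, 1.0   # large b_var: the background is only weakly trusted


def cost_and_grad(x0):
    """Discrete cost function and its gradient, obtained with the adjoint."""
    resid = integrate(x0) - obs
    cost = 0.5 * np.sum(resid**2) / r_var + 0.5 * np.sum((x0 - x_b)**2) / b_var
    lam = np.zeros(n)                       # adjoint variable, null after the last date
    for k in range(n_steps, -1, -1):        # backward integration
        lam = M.T @ lam + resid[k] / r_var  # adjoint model plus observation forcing
    grad = (x0 - x_b) / b_var + lam         # background term + lambda(0)
    return cost, grad


result = minimize(cost_and_grad, x_b, jac=True, method="L-BFGS-B")
x_est = result.x
```

Each evaluation of the cost costs one forward integration of the model and one backward integration of its adjoint, whatever the size of the state, which is what makes the approach tractable on images. Here `x_est` comes back close to `x_true`, up to the small bias introduced by the background term.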

3 TWIN EXPERIMENT

A sequence of five Image Observations (see Figure 3), $I_i = I(t_i)$ for $i = 1$ to 5, is generated by integrating model $\mathbb{M}$ from the initial conditions displayed in Figure 2. Contours are first computed on images $I_i$. Then Distance Map Observations $D_c(\mathbf{x}, t_i)$ are derived. The aim is to estimate motion on the whole image sequence and track the brightest square.

Figure 2: Left: initial image. Right: initial motion field.

Figure 3: Image Observations at two acquisition dates.

After assimilation, pseudo-images are compared to Image Observations. They are almost identical, with a correlation measure over 0.999. At the acquisition dates, the region of positive values of φ, corresponding to the inside of the tracked structure, is compared to the contour points, see Figure 4.
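The paper does not specify which correlation measure is used. As an illustration, the sketch below assumes the Pearson correlation coefficient computed over all pixels, and uses random toy data in place of the actual pseudo-images and observations.

```python
import numpy as np


def correlation(a, b):
    """Pearson correlation coefficient between two images of the same shape."""
    return np.corrcoef(np.ravel(a), np.ravel(b))[0, 1]


# Toy data: an "observation" and a slightly perturbed "pseudo-image".
rng = np.random.default_rng(0)
observation = rng.random((64, 64))
pseudo_image = observation + 0.005 * rng.standard_normal(observation.shape)

score = correlation(pseudo_image, observation)
```

A score close to 1 only states that the two images vary together; it does not by itself guarantee that the estimated motion is correct, which is why the velocity field is also compared with the ground truth.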

Figure 4: Comparison of φ and contour points on Image Observations at two acquisition dates (left and right).

Figure 5: Left: Ground-truth. Right: Assimilation result.

The simulation that provides the Image Observations also provides the ground truth of the velocity field. This makes it possible to compute statistics on the discrepancy between estimated motion and ground truth: the average error is around 1% in norm and less than one degree in orientation. Motion estimated on the whole image is displayed in Figure 5 with the coloured representation tool of the Middlebury database: there is no visible difference between estimation and ground truth.
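Error statistics of this kind can be computed as in the sketch below. The exact definitions used in the paper are not given, so the two measures here are assumptions: the mean relative difference of the vector norms and the mean angle between estimated and reference vectors. The fields are toy data.

```python
import numpy as np


def motion_errors(u_est, v_est, u_ref, v_ref, eps=1e-12):
    """Mean relative norm error and mean angular error (in degrees)."""
    norm_est = np.hypot(u_est, v_est)
    norm_ref = np.hypot(u_ref, v_ref)
    rel_norm = np.abs(norm_est - norm_ref) / (norm_ref + eps)
    cos = (u_est * u_ref + v_est * v_ref) / (norm_est * norm_ref + eps)
    angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return rel_norm.mean(), angle.mean()


# Toy reference field: unit velocity along x everywhere.
u_ref = np.ones((16, 16))
v_ref = np.zeros((16, 16))

# Toy estimate: 1% too long and rotated by half a degree.
theta = np.radians(0.5)
u_est = 1.01 * np.cos(theta) * np.ones_like(u_ref)
v_est = 1.01 * np.sin(theta) * np.ones_like(u_ref)

norm_err, angle_err = motion_errors(u_est, v_est, u_ref, v_ref)
```

For this toy estimate, `norm_err` is about 0.01 (1% in norm) and `angle_err` is about 0.5 degree. The small `eps` only guards against division by zero where the reference velocity vanishes.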

4 METEOSAT IMAGES

The assimilation method is applied to a Meteosat sequence, displayed in Figure 6.

Figure 6: Images of a tropical cloud in the infrared domain. From left to right, top to bottom.

Figure 7, first column, displays the contour points used to compute the Distance Map Observations $D_c(\mathbf{x}, t_i)$. Pseudo-images, obtained as result of data assimilation, are displayed in the second column. In the third column, the blue curve corresponds to the result of assimilation without the term $\mathbb{H}_\phi$ of Equation (11), while the red one is obtained with $\mathbb{H}_\phi$. As can be seen, including constraints on φ improves the accuracy of the segmentation. The motion field is estimated on the whole image, but Figure 8 focuses on the boundary of the structure: the displacement estimated there is superposed on the satellite images. It shows that the resulting velocity vectors correctly assess the displacement of the structure along the sequence.

Figure 7: Left: Contours on observations. Middle: pseudo-images. Right: red is the result with $\mathbb{H}_\phi$, blue is without.

Middlebury optical flow database: http://vision.middlebury.edu/flow/

Figure 8: Motion result superposed on images along the boundary of the tracked structure. From left to right, top to bottom.

5 CONCLUSIONS

The paper describes an innovative approach that estimates motion and segments and tracks a structure on images, such as, for instance, a cloud on a satellite sequence. The approach is based on 4D-Var data assimilation, and the state vector includes an implicit function φ modelling the boundary of the tracked structure. Results are given on synthetic and meteorological data. Additional experiments, not described in the paper, have been conducted and confirm the robustness of the approach.

The main perspective of this research is to extend the method to the tracking of multiple structures and to a space-time segmentation process. An additional interesting perspective is to allow uncertainty on the dynamic equations and take into account a model error in Equation (6).

REFERENCES

Avenel, C., Mémin, E., and Pérez, P. (2009). Tracking closed curves with non-linear stochastic filters. In Conference on Space-Scale and Variational Methods.

Béréziat, D. and Herlin, I. (2011). Solving ill-posed image processing problems using data assimilation. Numerical Algorithms, 56(2):219–252.

Bertalmío, M., Sapiro, G., and Randall, G. (2000). Morphing active contours. Pat. Anal. and Mach. Int., 22(7):733–737.

Hascoët, L. and Pascual, V. (2004). Tapenade 2.1 user's guide. Technical Report 0300, INRIA.

Papadakis, N. and Mémin, E. (2008). Variational assimilation of fluid motion from image sequence. SIAM Journal on Imaging Sciences, 1(4):343–363.

Peterfreund, N. (1999). Robust tracking of position and velocity with Kalman snakes. Pat. Anal. and Mach. Int., 21(6):564–569.

Rathi, Y., Vaswani, N., Tannenbaum, A., and Yezzi, A. (2007). Tracking deforming objects using particle filtering for geometric active contours. Pat. Anal. and Mach. Int., 29(8):1470–1475.

Yilmaz, A., Javed, O., and Shah, M. (2006). Object tracking: A survey. ACM Computing Surveys, 38(4):13.

Zhu, C., Byrd, R., Lu, P., and Nocedal, J. (1997). L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software, 23(4):550–560.

ContinuousTrackingofStructuresfromanImageSequence

389