RECOGNITION OF DYNAMIC VIDEO CONTENTS BASED ON
MOTION TEXTURE STATISTICAL MODELS
Tomas Crivelli, Bruno Cernuschi-Frias
Faculty of Engineering, University of Buenos Aires, Buenos Aires, Argentina. CONICET, Argentina
Patrick Bouthemy
IRISA/INRIA, Campus de Beaulieu, 35042 Rennes Cedex, France
Jian-Feng Yao
IRMAR/Univ. of Rennes 1, Campus de Beaulieu, 35042 Rennes Cedex, France
Keywords:
Motion analysis, Markov random fields, image content classification, dynamic textures.
Abstract:
The aim of this work is to model, learn and recognize dynamic content in video sequences, displayed mostly
by natural scene elements such as rivers, smoke, moving foliage, fire, etc. We adopt the mixed-state Markov
random field modeling recently introduced to represent the so-called motion textures. The approach consists
in describing the spatial distribution of motion measurements which exhibit values of two types: a
discrete component related to the absence of motion and a continuous part for non-zero
measurements. Based on this, we present a method for the recognition and classification of real motion textures using the
generative statistical models that can be learned for each motion texture class. Experiments on sequences from
the DynTex dynamic texture database demonstrate the performance of this novel approach.
1 INTRODUCTION
In the context of visual motion analysis, motion textures refer to dynamic video contents displayed mostly by natural scene elements. They are closely related to temporal or dynamic textures (Doretto et al., 2003). Unlike activities (walking, climbing, playing) and events (opening a door, answering the phone), temporal textures show some type of stationarity and homogeneity, both in space and time. Typical examples can be found in nature scenes: rivers, smoke, rain, moving foliage, etc.
When analyzing a complex scene, the three types of dynamic visual information (activities, events and temporal textures) may all be present. However, their dissimilar nature leads to considering substantially different approaches for each one in tasks such as detection, segmentation, and recognition.
The aim of this work is to model the apparent motion contained in dynamic textures, with special interest in dynamic content recognition. Generally speaking, model-based approaches (Doretto et al., 2003; Saisan et al., 2001; Yuan et al., 2004) have been mainly dedicated to describing the evolution of intensity over time, while motion-based methods (Fazekas and Chetverikov, 2005; Lu et al., 2005; Peteri and Chetverikov, 2005; Vidal and Ravichandran, 2005) propose the use of motion measurements (mainly based on optical and normal flow) as input features for a classification step.
We adopt the mixed-state Markov random field (MS-MRF) model introduced in (Bouthemy et al., 2006) to represent the so-called motion textures. The approach consists in describing the spatial distribution of motion measurements which exhibit values of two types: a discrete component related to the absence of motion and a continuous part for the remaining real-valued measurements.
Based on this, we present a method for the recognition and classification of real motion textures using the MS-MRF generative statistical models that can be learned for each motion texture class. Experiments on sequences from the DynTex dynamic texture database (Peteri et al., 2005) demonstrate the performance of this novel approach.
2 LOCAL MOTION MEASUREMENTS
In (Fazekas and Chetverikov, 2005), the effectiveness of normal flow versus complete optical flow measurements is analyzed in the context of dynamic texture recognition. The authors conclude that for small data sets, normal flow is an adequate description of temporal texture dynamics, while complete flow performs better when the number of classes grows. However, they use motion descriptors directly as input features for the classification step; no underlying statistical model is proposed.
Our approach consists in defining a statistical model for motion measurements, which allows us to rely on the representativeness of the model more than on the accuracy of the motion measurements. At the same time, normal flow can be computed directly and locally, avoiding the computational burden associated with dense flow estimation. Finally, our method is based on modeling the instantaneous motion maps as spatial random fields, where the amount of spatial statistical interaction between motion variables will intervene in the motion texture recognition.
We consider the normal flow as the local motion measurement. However, in contrast to (Fablet and Bouthemy, 2003), we do not consider only its magnitude, but its vectorial expression, defined by

$$V_n(p) = \frac{-I_t(p)}{\|\nabla I(p)\|} \cdot \frac{\nabla I(p)}{\|\nabla I(p)\|},$$

where $p$ is a location in the image $I$, $I_t$ denotes the temporal derivative of the intensity and $\nabla I$ its spatial gradient. Then, we introduce the following weighted averaging of the normal flow vectors to smooth out noisy measurements and enforce reliability:

$$\tilde{V}_n(p) = \frac{\sum_{q \in W} V_n(q)\, \|\nabla I(q)\|^2}{\max\left( \sum_{q \in W} \|\nabla I(q)\|^2,\ \eta^2 \right)}, \qquad (1)$$

where $\eta^2$ is a constant, as in (Fablet and Bouthemy, 2003), and $W$ is a small window centered at $p$. Finally, we consider the following scalar expression:

$$v_{obs}(p) = \tilde{V}_n(p) \cdot \frac{\nabla I(p)}{\|\nabla I(p)\|}, \qquad (2)$$

which projects the smoothed normal flow onto the direction of the spatial intensity gradient, resulting in $v_{obs} \in (-\infty, +\infty)$.
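As a concrete illustration of Eqs. (1) and (2), here is a minimal Python sketch of the measurement step; the function name, the window size and the value of $\eta^2$ are our own assumptions, not choices specified in the paper:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def motion_texture_map(I0, I1, eta2=1.0, win=3):
    """Scalar mixed-state motion measurements v_obs (Eqs. (1)-(2)).

    I0, I1: consecutive grayscale frames as float arrays.
    eta2:   reliability constant eta^2 (assumed value).
    win:    side length of the averaging window W (assumed value).
    """
    It = I1 - I0                       # temporal intensity derivative I_t
    gy, gx = np.gradient(I0)           # spatial gradient of I
    g2 = gx**2 + gy**2                 # squared gradient norm
    eps = 1e-8                         # guards against division by zero
    # Normal flow V_n = -I_t * grad(I) / ||grad(I)||^2
    Vnx = -It * gx / (g2 + eps)
    Vny = -It * gy / (g2 + eps)
    # Gradient-weighted average over W (Eq. 1); uniform_filter computes
    # local window means, so the 1/|W| factor cancels in the ratio and
    # eta2 effectively thresholds the mean squared gradient over W.
    num_x = uniform_filter(Vnx * g2, size=win)
    num_y = uniform_filter(Vny * g2, size=win)
    den = np.maximum(uniform_filter(g2, size=win), eta2)
    Vx, Vy = num_x / den, num_y / den
    # Projection onto the gradient direction (Eq. 2)
    return (Vx * gx + Vy * gy) / np.sqrt(g2 + eps)
```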
In Fig. 1 we observe the result of applying the proposed motion measurements to two pairs of consecutive images from two different sequences. The motion histograms show that the statistical distribution of the motion measurements has two elements: a discrete component at the null value $v_{obs} = 0$, and a continuous distribution for the rest of the motion values.
Figure 1: (a) Images from original sequences (left: straw; right: water-rocks) obtained from the DynTex texture database and their corresponding (b) motion textures and (c) motion histograms.
The underlying discrete property of no motion for a point in the image is represented as a null observation, and acts as a symbolic component in the model. It is not the value by itself that matters, but the binary property of what is called mobility: the absence or presence of motion. Thus, the null motion value has a peculiar place in the sample space and, consequently, has to be modeled accordingly.
3 MIXED-STATE MRF MODEL
The key observation made in the previous section about the statistical properties of motion measurements establishes the need for an adequate representation of the associated random variables. In a first approach, as in (Crivelli et al., 2006), observing the histograms, the problem can be formulated by defining a probability density for the motion values that is composed of two terms, i.e.
$$p(x) = \rho\,\delta_0(x) + (1-\rho)\, f(x), \qquad (3)$$

where $\delta_0(x)$ is the Dirac impulse function centered at zero. This density is well defined and corresponds to a random variable that has a discrete value with probability mass concentrated at zero. In fact, it is easy to observe that $P(x = 0) = \rho$.
3.1 Measure Theoretic Approach
From a probability theory point of view, it is more natural to redefine the probability space w.r.t. a new probability measure. This avoids dealing with a functional or distribution such as the Dirac delta, which may
complicate the strict definition of the corresponding density, and also allows generalizing the case of a discrete real value (e.g., $x = 0$) to a generic symbolic value or abstract label, which may lie in an arbitrary label set.
In this section, we outline the theoretical framework attached to mixed-state random variables. Let us define $M = \{r\} \cup R^*$ where $R^* = \mathbb{R} \setminus \{r\}$, with $r$ a possible "discrete" value, sometimes called the ground value. A random variable $X$ defined on this space, called a mixed-state variable, is constructed as follows: with probability $\rho \in (0,1)$, set $X = r$, and with probability $1 - \rho$, $X$ is continuously distributed in $R^*$. Hereafter, we will assume that $r = 0$ without loss of generality; all the results are immediately extended to abstract symbolic values for $r$. This is the main advantage of the measure theoretic approach.
Consequently, the distribution function of $X$ can be expressed as a monotone increasing function with a "step jump" at $X = 0$. In order to compute the probability density function of the mixed-state variable $X$, $M$ is equipped with a "mixed" reference measure:

$$m(dx) = \nu_0(dx) + \lambda(dx), \qquad (4)$$

where $\nu_0$ is the discrete measure for the value $0$ and $\lambda$ the Lebesgue measure on $R^*$. Such a measure has already been used in (Salzenstein and Pieczynski, 1997) for simultaneous fuzzy-hard image segmentation. Let us define the indicator function of the discrete value $0$ as $\mathbf{1}_0(x)$ and its complementary function $\mathbf{1}^*_0(x) = \mathbf{1}_{\{0\}^c}(x) = 1 - \mathbf{1}_0(x)$. Then, the above random variable $X$ has the following density function w.r.t. $m(dx)$:

$$p(x) = \rho\,\mathbf{1}_0(x) + (1-\rho)\,\mathbf{1}^*_0(x)\, f(x), \qquad (5)$$

where $f(x)$ is a continuous pdf from an absolutely continuous distribution w.r.t. $\lambda$, defined on $\mathbb{R}$. Equation (5) corresponds to a mixed-state probability density.
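As a minimal illustration of definition (5), the following Python snippet (our own sketch, choosing a Gaussian for $f$ as in the motion model of the next section) draws samples from a mixed-state variable and reproduces the spike-plus-continuous shape of the histograms in Fig. 1:

```python
import numpy as np

def sample_mixed_state(n, rho=0.4, mean=0.5, std=1.0, rng=None):
    """Draw n samples with density p(x) = rho*1_0(x) + (1-rho)*1*_0(x)*f(x),
    where f is a Gaussian pdf (an assumed choice for illustration)."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.normal(mean, std, size=n)   # continuous part, distributed as f
    discrete = rng.random(n) < rho      # with probability rho, ground value
    x[discrete] = 0.0                   # the discrete value r = 0
    return x

samples = sample_mixed_state(10_000)
print("empirical P(x = 0):", np.mean(samples == 0.0))  # approximately rho
```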
4 MIXED-STATE SPATIAL MARKOV MODELS FOR MOTION TEXTURES
Let $X = \{x_i\}_{i=1\ldots N}$ be a motion field, or motion texture, obtained as in Section 2. We define the neighborhood $\mathcal{N}_i$ of any image point $i$ as the set of its 8 nearest neighbors, and $X_{\mathcal{N}_i}$ as the subset of random variables restricted to $\mathcal{N}_i$. Then,

$$\mathcal{N}_i = \{i_E,\, i_W,\, i_N,\, i_S,\, i_{NW},\, i_{SE},\, i_{NE},\, i_{SW}\}, \qquad (6)$$

where, for example, $i_E$ is the east neighbor of $i$ in the image grid, $i_{NW}$ the north-west neighbor, etc.
The following mixed-state conditional model is considered:

$$p(x_i \mid X_{\mathcal{N}_i}) = \rho_i\,\mathbf{1}_0(x_i) + (1-\rho_i)\,\mathbf{1}^*_0(x_i)\, f(x_i \mid X_{\mathcal{N}_i}, x_i \neq 0), \qquad (7)$$

where $x_i \in X$ is the motion measurement at point $i \in S = \{1,\ldots,N\}$ of the image grid and $\rho_i = P(x_i = 0 \mid X_{\mathcal{N}_i})$. Consequently, we have a distribution consisting of two parts: a discrete component for $x_i = 0$ and a continuous one for $x_i \neq 0$. This model gives specific attention to the null value of the random variable $x_i$, which corresponds to the property of no motion at a point. For the continuous part, $f$, we assume a Gaussian density with variance $\sigma_i^2$ and mean $m_i$ depending on $X_{\mathcal{N}_i}$. The motion histograms suggest the use of this distribution.
In contrast to (Bouthemy et al., 2006), where a truncated zero-mean Gaussian distribution is assumed, the proposed extension takes into account a stronger correlation between continuous motion values, as we will see shortly. This is crucial in our formulation as it allows describing more realistic motion textures. Then, after some rearrangements, we may write:
$$p(x_i \mid X_{\mathcal{N}_i}) = \exp\left[ \boldsymbol{\Theta}_i^T(X_{\mathcal{N}_i}) \cdot S(x_i) + \log\rho_i \right], \qquad (8)$$

where $\boldsymbol{\Theta}_i^T(X_{\mathcal{N}_i}) \in \mathbb{R}^d$ and $S(x_i) \in \mathbb{R}^d$, with $d = 3$ in our case, and

$$\boldsymbol{\Theta}_i(X_{\mathcal{N}_i}) = \left[ \log\frac{1-\rho_i}{\sigma_i\sqrt{2\pi}\,\rho_i} - \frac{m_i^2}{2\sigma_i^2},\ -\frac{1}{2\sigma_i^2},\ \frac{m_i}{\sigma_i^2} \right]^T, \qquad (9)$$

$$S(x_i) = \left[ \mathbf{1}^*_0(x_i),\ x_i^2,\ x_i \right]^T. \qquad (10)$$
As the seminal Hammersley-Clifford theorem states, Markov random fields with an everywhere positive density function are equivalent to nearest neighbor Gibbs distributions. The joint p.d.f. of the random variables that compose the field has the form $p(X) = \exp[Q(X)]/Z$, where $Q(X)$ is an energy function and $Z$ is called the partition function or normalizing factor of the distribution. It is a well-known result in Markov random field theory that the energy $Q(X)$ can be expressed as a sum of potential functions, $Q(X) = \sum_{C \subseteq S} V_C(X_C)$, defined over all subsets $C$ of the lattice space $S$ (Besag, 1974).

For a one-parameter conditional model ($d = 1$) satisfying the assumption that it belongs to the exponential family and that it depends only on cliques that contain no more than two sites, i.e. auto-models, the expression for the parameter is known w.r.t. the sufficient statistics $S(\cdot)$ of the neighbors (Besag, 1974). In the case of multi-parameter auto-models ($d > 1$), the result of (Besag, 1974) was extended in (Bouthemy et al., 2006) and (Cernuschi-Frias, 2007), showing that:
$$\boldsymbol{\Theta}_i(X_{\mathcal{N}_i}) = \boldsymbol{\alpha}_i + \sum_{j \in \mathcal{N}_i} \boldsymbol{\beta}_{ij}\, S(x_j). \qquad (11)$$
Moreover, they give the expression of the first and second order potentials of the expansion of the energy function $Q(X)$:

$$V_i(x_i) = \boldsymbol{\alpha}_i^T \cdot S(x_i), \qquad (12)$$

$$V_{ij}(x_i, x_j) = S^T(x_i)\, \boldsymbol{\beta}_{ij}\, S(x_j). \qquad (13)$$
We thus define a mixed-state Markov Random
Field (MS-MRF) auto-model for the motion texture,
where the local conditional probability densities are
mixed-state densities.
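To make the auto-model concrete, the following Python sketch (our own illustration; the parameter values would come from the estimation step of Section 4.2, and the helper names are ours) evaluates the conditional mixed-state density of Eqs. (7)-(11) at one site given its neighborhood:

```python
import numpy as np

def sufficient_stats(x):
    """S(x) = (1*_0(x), x^2, x), Eq. (10)."""
    return np.array([float(x != 0.0), x**2, x])

def conditional_density(x_i, neighbors, alpha, beta):
    """p(x_i | X_Ni) for the MS-MRF auto-model, Eqs. (7)-(11).

    alpha: vector alpha_i in R^3.
    beta:  sequence of 3x3 matrices beta_ij, one per neighbor.
    """
    # Theta_i = alpha_i + sum_j beta_ij S(x_j)   (Eq. 11)
    theta = alpha + sum(beta[j] @ sufficient_stats(x_j)
                        for j, x_j in enumerate(neighbors))
    # Invert Eq. (9) to recover sigma_i^2, m_i and rho_i
    sigma2 = -1.0 / (2.0 * theta[1])          # requires theta[1] < 0
    m = theta[2] * sigma2
    t = np.exp(theta[0] + m**2 / (2 * sigma2)) * np.sqrt(2 * np.pi * sigma2)
    rho = 1.0 / (1.0 + t)                     # since t = (1 - rho) / rho
    if x_i == 0.0:
        return rho                            # discrete mass at zero
    gauss = np.exp(-(x_i - m)**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
    return (1.0 - rho) * gauss                # continuous Gaussian part
```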
4.1 Motion Texture Model Parameters
From equation (11) it is clear that $\boldsymbol{\beta}_{ij} \in \mathbb{R}^{3\times 3}$ and $\boldsymbol{\alpha}_i \in \mathbb{R}^3$. First, we propose that, for the Gaussian continuous density $f$, the conditional mean is given by

$$m_i(X_{\mathcal{N}_i}) = -\frac{c_i}{2b_i} - \sum_{j \in \mathcal{N}_i} \frac{h_{ij}}{2b_i}\, x_j. \qquad (14)$$
That is, the mean motion value of the continuous part at a point is a sort of weighted average of the values of its neighbors. Second, we assume that the variance is constant for every point, i.e. $\sigma_i^2(X_{\mathcal{N}_i}) = \sigma_i^2$. From these assumptions, several coefficients of the matrix $\boldsymbol{\beta}_{ij}$ are null. Additionally, note that the second order potentials are defined over non-ordered pairs of points; thus, a symmetry condition arises since $V_{ij} = V_{ji}$. Consequently, from equation (13), $\boldsymbol{\beta}_{ij} = \boldsymbol{\beta}_{ji}^T$. Finally, we have
$$\boldsymbol{\beta}_{ij} = \begin{pmatrix} d_{ij} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & h_{ij} \end{pmatrix}, \qquad \boldsymbol{\alpha}_i = [\, a_i\ \ b_i\ \ c_i \,]^T. \qquad (15)$$
In general terms, the proposed conditional models could be defined by a different set of parameters for each location of the image. This would give rise to a motion texture model with a number of parameters proportional to the image size. Unfortunately, such a high-dimensional representation of the associated dynamic information is unfeasible in practice and does not constitute a compact description of motion textures. Moreover, an increasing number of frames would be necessary for the estimation process. This works against a formulation oriented to efficient content recognition and retrieval. However, if needed, our framework could deal with spatially non-stationary motion textures. We thus propose to use a homogeneous model where $\boldsymbol{\alpha}_i = \boldsymbol{\alpha}$ and $\boldsymbol{\beta}_{ij} = \boldsymbol{\beta}_k$, where $k$ is an index that indicates the position of the neighbor w.r.t. the point and is independent of the location $i$. Moreover, a necessary condition for defining a stationary spatial process is that the parameters related to symmetric neighbors (E-W, N-S, NW-SE, NE-SW) must be the same. This is a consequence of the symmetry of the potentials. Then, $k \in \{H, V, D, AD\}$, for the horizontal, vertical, diagonal, and anti-diagonal interacting directions. Finally, we have the set of 11 parameters which characterizes the motion texture model: $\phi = \{a, b, c, d_H, h_H, d_V, h_V, d_D, h_D, d_{AD}, h_{AD}\}$.
We now write the expression of the resulting Gibbs energy for a Gaussian mixed-state Markov random field:

$$Q(X) = \sum_i \left[ a\,\mathbf{1}^*_0(x_i) + b\,x_i^2 + c\,x_i \right] + \sum_{<i,j>} \left[ d_{ij}\,\mathbf{1}^*_0(x_i)\,\mathbf{1}^*_0(x_j) + h_{ij}\,x_i x_j \right]. \qquad (16)$$
Here, we distinguish two main parts: a set of terms related to the interaction between the discrete components of the neighbors, and terms related to a continuous Gaussian Markov random field. Although the energy is functionally decomposed into two parts, this does not mean that the two types of values (discrete and continuous) are independent in the Gibbs field; indeed, the estimation of the parameters has to be done jointly.
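In code, the energy of Eq. (16) for the homogeneous 11-parameter model could be evaluated as in the following sketch (our own illustration; $\phi$ is passed as a dictionary and a free boundary is assumed):

```python
import numpy as np

def gibbs_energy(X, phi):
    """Gibbs energy Q(X) of Eq. (16) for a homogeneous Gaussian MS-MRF.

    X:   2D array holding the motion texture.
    phi: dict with keys 'a', 'b', 'c' and 'd_k', 'h_k' for k in H, V, D, AD.
    """
    nz = (X != 0.0).astype(float)        # 1*_0(x): presence of motion
    # First order potentials: a*1*_0(x) + b*x^2 + c*x
    Q = np.sum(phi['a'] * nz + phi['b'] * X**2 + phi['c'] * X)
    # One pixel shift per interacting direction generates the <i,j> pairs
    shifts = {'H': (0, 1), 'V': (1, 0), 'D': (1, 1), 'AD': (1, -1)}
    for k, (dy, dx) in shifts.items():
        Xs = np.roll(np.roll(X, -dy, axis=0), -dx, axis=1)
        nzs = np.roll(np.roll(nz, -dy, axis=0), -dx, axis=1)
        mask = np.ones_like(X, dtype=bool)   # discard wrapped-around pairs
        if dy > 0: mask[-dy:, :] = False
        if dx > 0: mask[:, -dx:] = False
        if dx < 0: mask[:, :-dx] = False
        Q += np.sum(mask * (phi['d_' + k] * nz * nzs
                            + phi['h_' + k] * X * Xs))
    return Q
```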
4.2 Model-parameter Estimation
In order to estimate the eleven parameters of the motion texture model from motion measurements, we adopt the pseudo-likelihood maximization criterion (Besag, 1974), since the partition function for the exact maximum likelihood formulation is in general intractable. Therefore, we search for the set of parameters $\hat{\phi}$ that maximizes the function $L(\phi) = \prod_{i \in S} p(x_i \mid X_{\mathcal{N}_i}, \phi)$. We use a gradient descent technique for the optimization, as the derivatives of $L$ w.r.t. $\phi$ are known in closed form.
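A rough sketch of this estimation step follows; it reuses the hypothetical conditional_density helper from above and a generic numerical optimizer in place of the closed-form gradients, so it illustrates the criterion rather than the authors' exact implementation:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_pseudo_likelihood(phi_vec, X, build_params, density):
    """-log L(phi) = -sum_i log p(x_i | X_Ni, phi) (Besag, 1974).

    phi_vec:      the 11 parameters as a flat vector.
    build_params: maps phi_vec to (alpha, [beta_j per neighbor]), Eq. (15).
    density:      conditional density, e.g. conditional_density above.
    """
    alpha, beta = build_params(phi_vec)
    nll = 0.0
    H, W = X.shape
    for i in range(1, H - 1):            # interior sites only
        for j in range(1, W - 1):
            nbrs = [X[i, j+1], X[i, j-1], X[i-1, j], X[i+1, j],
                    X[i-1, j-1], X[i+1, j+1], X[i-1, j+1], X[i+1, j-1]]
            nll -= np.log(density(X[i, j], nbrs, alpha, beta) + 1e-300)
    return nll

# phi_hat = minimize(neg_log_pseudo_likelihood, phi0,
#                    args=(X, build_params, conditional_density)).x
```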
5 MOTION TEXTURE RECOGNITION
One of the key aspects of a model oriented to dynamic content recognition, such as the one proposed here, is the ability to compute some similarity measure between models, in order to embed it in a decision-theory-based application. In this context, the Kullback-Leibler (KL) divergence is a well-known distance (more precisely, a pseudo-distance)
between statistical models. Here we present a result for computing this quantity between general Gibbs distributions, and obtain an expression for the case of mixed-state models that will allow us to classify a set of real motion textures.
5.1 A Similarity Measure Between Mixed-state Models
The KL divergence from a density $p_1(X)$ to $p_2(X)$ is defined as

$$KL(p_1(X) \mid p_2(X)) = \int \log\frac{p_1(X)}{p_2(X)}\, p_1(X)\, m(dX), \qquad (17)$$

which is independent of the measure $m$. This is not strictly a distance, as it is not symmetric; we thus define the symmetrized KL divergence as $d_{KL}(p_1(X), p_2(X)) = \frac{1}{2}\left[ KL(p_1(X) \mid p_2(X)) + KL(p_2(X) \mid p_1(X)) \right]$.
Now, assume that $p_1(X)$ and $p_2(X)$ are Markov random fields, i.e. $p_1(X) = Z_1^{-1}\exp Q_1(X)$ and $p_2(X) = Z_2^{-1}\exp Q_2(X)$. Define $\Delta Q(X) = Q_2(X) - Q_1(X)$. Then $\log\frac{p_1(X)}{p_2(X)} = -\Delta Q(X) + \log\frac{Z_2}{Z_1}$, and

$$d_{KL}(p_1(X), p_2(X)) = \frac{1}{2}\left( E_{p_2}[\Delta Q(X)] - E_{p_1}[\Delta Q(X)] \right). \qquad (18)$$
We observe from this general equation that we do not need knowledge of the partition functions of the Gibbs distributions, which simplifies enormously the handling of this expression. Now, let $p_1(X)$ and $p_2(X)$ be two Gaussian MS-MRFs. Then,
$$\Delta Q(X) = \sum_i \Delta\alpha^T S(x_i) + \sum_{<i,j>} S^T(x_i)\, \Delta\beta_{ij}\, S(x_j), \qquad (19)$$

where $\Delta\alpha = \alpha^{(2)} - \alpha^{(1)}$ and $\Delta\beta_{ij} = \beta^{(2)}_{ij} - \beta^{(1)}_{ij}$. From this definition:
$$E_{p_1}[\Delta Q(X)] = \sum_i \Delta\alpha^T E_{p_1}[S(x_i)] + \sum_{<i,j>} E_{p_1}\left[ S^T(x_i)\, \Delta\beta_{ij}\, S(x_j) \right]. \qquad (20)$$
As we have a homogeneous model, the expectations in the last equation are equal for each site of the motion field. The same result applies to $E_{p_2}[\Delta Q(X)]$, and then it is straightforward to compute equation (18). In a practical application, the idea is to use the parameters of the two models that we want to compare to generate synthetic fields with a Gibbs sampler (Geman and Geman, 1984), from which we can estimate the involved expectations and finally calculate the divergence.
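The following sketch outlines this Monte Carlo procedure (our own illustration: gibbs_sample stands in for an MS-MRF Gibbs sampler, which the paper does not detail, and gibbs_energy is the hypothetical helper shown in Section 4):

```python
import numpy as np

def symmetrized_kl(phi1, phi2, gibbs_sample, n_samples=50, shape=(64, 64)):
    """Monte Carlo estimate of d_KL, Eq. (18), between two MS-MRF models.

    gibbs_sample: function (phi, shape) -> synthetic motion field drawn
                  with a Gibbs sampler (Geman and Geman, 1984).
    """
    def delta_q(X):
        # Delta Q(X) = Q_2(X) - Q_1(X), Eq. (19), via the energy of Eq. (16)
        return gibbs_energy(X, phi2) - gibbs_energy(X, phi1)
    # Sample-average estimates of E_p1[Delta Q] and E_p2[Delta Q]
    e1 = np.mean([delta_q(gibbs_sample(phi1, shape)) for _ in range(n_samples)])
    e2 = np.mean([delta_q(gibbs_sample(phi2, shape)) for _ in range(n_samples)])
    return 0.5 * (e2 - e1)               # Eq. (18)
```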
5.2 Experiments
The recognition performance of mixed-state motion texture models was tested on real sequences extracted from the DynTex dynamic texture database (Peteri et al., 2005). We first took motion textures where the homogeneity assumption was mostly valid and divided them into 10 different classes (Figure 2): steam, straw, traffic, water, candles, shower, flags, water-rocks, waves, fountain. A total of 30 different sequences were considered, and for each one, 5 pairs of consecutive images were selected at frames 1, 20, 40, 60, 80, for a total of 150 samples. Each motion texture class parameter set was learned from a single pair of images picked from only one of the sequences belonging to each type of motion texture. All sequences were composed of gray scale images with a resolution of 720x576 pixels, given at a rate of 25 frames per second. In order to reduce computation time, the original images were filtered and subsampled to a resolution of 180x144 pixels.

Having estimated the reference model parameters for each of the 10 training samples, we then estimated $\phi$ for each test sample and computed the distance to each learned parameter vector, as explained in the previous section. The recognition was based on assigning the class of the motion texture model closest to the test sample.
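In code, the decision rule reduces to a nearest-model assignment (a sketch built on the hypothetical helpers of the previous sections):

```python
def classify(X_test, reference_phis, estimate_phi, distance):
    """Assign the class whose learned model is closest in d_KL.

    reference_phis: dict mapping class name -> learned parameter set phi.
    estimate_phi:   fits phi to a motion texture (Section 4.2).
    distance:       symmetrized KL between two models (Section 5.1).
    """
    phi_test = estimate_phi(X_test)
    return min(reference_phis,
               key=lambda c: distance(phi_test, reference_phis[c]))
```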
In Table 1 we show the confusion matrix for the 10 motion texture classes. A recognition is considered correct when the test sample and the closest reference parameter vector belong to the same class. A promising overall classification rate of 90.7% was achieved. As for the confusion matrix, let us note that waves are likely to be classified as water or water-rocks, as they correspond to similar dynamic contents; straw is confused with shower, as they have a similar vertical orientation; and candles can be classified as traffic, as both classes show a motion pattern consisting of isolated blobs. The non-symmetry of the confusion matrix is associated with the nature of the tested data set: for some classes, the tested sequences closely resemble the training sample, while for others there are notable variations that may lead to misclassification.
Reported experiments for dynamic texture recognition using a model-based approach such as the one presented in (Saisan et al., 2001) have shown a classification rate of 89.5% on similar data sets. Their method is based on computing a subspace distance between the linear models learned for each class, which describe the evolution of image intensity over time. The effectiveness of that approach is similar to ours.
Figure 2: Sample images from the 10 motion texture classes used for the recognition experiments.
Table 1: Motion texture class confusion matrix. Each row indicates how the samples of a class were classified.

             steam   straw   traffic  water   candles  shower  flags   water-rocks  waves   fountain
steam        100%    -       -        -       -        -       -       -            -       -
straw        -       93.3%   -        -       -        6.7%    -       -            -       -
traffic      -       -       86.7%    -       -        -       -       13.3%        -       -
water        -       -       -        100%    -        -       -       -            -       -
candles      -       -       13.3%    -       73.4%    -       -       13.3%        -       -
shower       20%     -       -        -       -        80%     -       -            -       -
flags        -       -       -        -       -        -       100%    -            -       -
water-rocks  -       -       -        -       -        -       -       93.3%        -       6.7%
waves        -       -       -        6.7%    -        -       -       13.3%        80%     -
fountain     -       -       -        -       -        -       -       -            -       100%
The method proposed here, however, has a major advantage: we only need two consecutive frames to estimate and recognize the mixed-state models, while in (Saisan et al., 2001) subsequences of 75 frames are used. This is a consequence of modeling the spatial structure of motion rather than the temporal evolution of the image.
Motion-based methods for dynamic texture classification (Fazekas and Chetverikov, 2005; Peteri and Chetverikov, 2005) have shown an improved performance, with recognition rates of over 95% using invariant flow statistics. Although these are the most accurate results reported for this problem, they do not provide a general characterization of dynamic textures, as model-based approaches do. Consequently, more complex scenarios with combined problems, such as simultaneous detection, segmentation and recognition of these types of sequences, are not directly addressable. Our framework provides a unified statistical representation that is applicable to other complex problems as well (Crivelli et al., 2006; Bouthemy et al., 2006).
6 CONCLUSIONS
We have presented a novel comprehensive motion modeling framework for motion texture recognition. Our approach appropriately handles the mixed-state nature of motion measurements, and the parametric representation of the statistical models showed quite satisfactory results on motion texture recognition. In our method, we do not need to process the whole sequence to obtain a reliable estimate of the model in order to achieve an accurate classification rate. The approach is entirely valid, with minor modifications,
for motion texture detection as well. Moreover, the reference classes can be learned from a single instantaneous motion map, which could eventually allow an adaptive scheme for recognition and classification.
Future prospects include considering other dissimilarity measures between statistical models, combining the classification and detection approach with existing motion or dynamic texture segmentation methods, and introducing contextual information through discriminative models, possibly in the form of conditional Markov random fields (CMRF).
REFERENCES
Besag, J. (1974). Spatial interaction and the statistical anal-
ysis of lattice systems. Journal of the Royal Statistical
Society. Series B, 36:192–236.
Bouthemy, P., Hardouin, C., Piriou, G., and Yao, J. (2006).
Mixed-state auto-models and motion texture model-
ing. Journal of Mathematical Imaging and Vision,
25:387–402.
Cernuschi-Frias, B. (2007). Mixed-state Markov random fields with symbolic labels and multidimensional real values. Rapport de Recherche, INRIA.
Crivelli, T., Cernuschi, B., Bouthemy, P., and Yao, J. (2006). Segmentation of motion textures using mixed-state Markov random fields. In Proceedings of SPIE, volume 6315, 63150J.
Doretto, G., Chiuso, A., Wu, Y., and Soatto, S. (2003).
Dynamic textures. Intl. Journal of Comp. Vision,
51(2):91–109.
Fablet, R. and Bouthemy, P. (2003). Motion recognition using non-parametric image motion models estimated from temporal and multiscale co-occurrence statistics. IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(12):1619–1624.
Fazekas, S. and Chetverikov, D. (2005). Normal versus
complete flow in dynamic texture recognition: a com-
parative study. In Texture 2005: 4th international
workshop on texture analysis and synthesis. ICCV’05,
Beijing, pages 37–42.
Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721–741.
Lu, Z., Xie, W., Pei, J., and Huang, J. (2005). Dynamic
texture recognition by spatio-temporal multiresolution
histograms. In IEEE Workshop on Motion and Video
Computing (WACV/MOTION), pages 241–246.
Peteri, R. and Chetverikov, D. (2005). Dynamic texture recognition using normal flow and texture regularity. In Proc. of IbPRIA, Estoril, pages 223–230.
Peteri, R., Huiskes, M., and Fazekas, S. (2005). Dyn-
tex: A comprehensive database of dynamic textures.
http://www.cwi.nl/projects/dyntex/index.html.
Saisan, P., Doretto, G., Wu, Y., and Soatto, S. (2001).
Dynamic texture recognition. In Proc. of the IEEE
Conf. on Computer Vision and Pattern Recognition.
CVPR’01, Hawaii, pages 58–63.
Salzenstein, F. and Pieczynski, W. (1997). Parameter es-
timation in hidden fuzzy markov random fields and
image segmentation. Graph. Models Image Process.,
59(4):205–220.
Vidal, R. and Ravichandran, A. (2005). Optical flow estimation and segmentation of multiple moving dynamic textures. In Proc. of CVPR'05, San Diego, volume 2, pages 516–521.
Yuan, L., Wen, F., Liu, C., and Shum, H. (2004). Synthesiz-
ing dynamic textures with closed-loop linear dynamic
systems. In Proc. of the 8th European Conf. on Com-
puter Vision, ECCV’04, Prague, volume LNCS 3022,
pages 603–616.