A GENERALIZATION OF NEGATIVE NORM MODELS IN THE DISCRETE SETTING

Application to Stripe Denoising

Jérôme Fehrenbach, Pierre Weiss and Corinne Lorenzo

Institut de Mathématiques de Toulouse, Toulouse University, Toulouse, France

ITAV, Toulouse canceropole, Toulouse, France

Keywords:

Denoising, Cartoon+texture decomposition, Primal-dual algorithm, Stationary noise, Fluorescence microscopy.

Abstract:

Starting with a book by Y. Meyer in 2001, negative norm models have attracted the attention of the imaging community over the last decade. Despite numerous works, these norms seem to have provided only lukewarm results in practical applications. In this work, we propose a framework and an algorithm to remove stationary noise from images. This algorithm has numerous practical applications, and we demonstrate it on 3D data from a recently developed microscope called SPIM. We also show that this model generalizes Meyer's model and its successors in the discrete setting and allows them to be interpreted in a Bayesian framework. It sheds new light on these models and allows one to choose among them according to a priori knowledge of the texture statistics. Further results are available on our webpage at http://www.math.univ-toulouse.fr/~weiss/PagePublications.html.

1 INTRODUCTION

The purpose of this article is to provide variational models and algorithms to remove stationary noise from images. The models proposed here turn out to generalize the discretized negative norm models, which allows them to be analysed in a Bayesian framework. By stationary noise, we mean that the noise is generated by convolving white noise with a given kernel. The noise thus appears "structured" in the sense that some pattern might be visible, see Figure 3(b),(c),(d).

This work was primarily motivated by the recent development of a microscope called the Selective Plane Illumination Microscope (SPIM). The SPIM is a fluorescence microscope that performs optical sectioning of a specimen, see (Huisken et al., 2004). One difference with conventional microscopy is that the fluorescence light is detected at an angle of 90 degrees with the illumination axis. This procedure tends to degrade the images with stripes aligned with the illumination axis, see Figure 5(a). This kind of noise is well described by a stationary process. The first contribution of this paper is to provide effective denoising algorithms dedicated to this imaging modality.

It appears that our models generalize the negative norm models proposed by Y. Meyer (Meyer, 2001). Meyer's work initiated numerous studies on texture+cartoon decomposition methods (Vese and Osher, 2003; Osher et al., 2003; Aujol et al., 2006; Garnett et al., 2007). Meyer's idea is to decompose an image into a piecewise smooth component and an oscillatory component. The use of a negative norm $\|\cdot\|$ to capture oscillating patterns is motivated by the fact that if $(v_n)$ converges weakly to 0 then $\|v_n\| \to 0$. This interpretation is however not very informative about what kind of textures are well captured by negative norms. The second contribution of this paper is to propose a Bayesian interpretation of these models in the discrete setting. This allows a better understanding of the decomposition models:

• We can associate a probability density function (p.d.f.) to the negative norms. This allows choosing a model depending on some a priori knowledge of the texture.

• We can synthesize textures that are adapted to these negative norms.

• The Bayesian interpretation suggests a new, broader and more versatile class of translation-invariant models, used e.g. for SPIM imaging.

Fehrenbach J., Weiss P. and Lorenzo C. (2012). A GENERALIZATION OF NEGATIVE NORM MODELS IN THE DISCRETE SETTING - Application to Stripe Denoising. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods, pages 337-342. DOI: 10.5220/0003742603370342. SciTePress.

Connection to Previous Works. This work shares some features with previous works. In (Aujol et al., 2006) the authors present algorithms and results using similar approaches. However, they do not propose a Bayesian interpretation and consider a narrower class of models. An alternative way of decomposing images was proposed in (Starck et al., 2005). The idea is to seek components that are sparse in given dictionaries. Different choices for the elementary atoms composing the dictionary allow recovering different kinds of textures. See (Fadili et al., 2010) for a review of these methods and a generalization to the decomposition into an arbitrary number of components.

The main novelties of the present work are:

1. We do not restrict ourselves to sparse components, but allow for a more general class of random processes.

2. Similarly to (Fadili et al., 2010), the texture is described through a dictionary. In the present work each dictionary is composed of a single pattern shifted in space, ensuring translation invariance.

3. A Bayesian approach takes into account the statistical nature of textures more precisely.

4. The decomposition problem is recast as a convex optimization problem that is solved with a recent algorithm (Chambolle and Pock, 2011), which yields results in interactive time.

5. Codes are provided on our webpage http://www.math.univ-toulouse.fr/~weiss/PageCodes.html.

Notation: Let $u$ be a gray-scale image. It is composed of $n = n_x \times n_y$ pixels, and $u(x)$ denotes the intensity at pixel $x$. The convolution product between $u$ and $v$ is $u * v$. The discrete gradient operator is denoted $\nabla$. Let $\varphi : \mathbb{R}^n \to \mathbb{R}$ be a convex closed function (see (Rockafellar, 1970)). $\partial\varphi$ denotes its sub-differential. The Fenchel conjugate of $\varphi$ is denoted $\varphi^{*}$, and its resolvent is defined by:

\[
(\mathrm{Id} + \partial\varphi)^{-1}(u) = \operatorname*{argmin}_{v \in \mathbb{R}^n} \; \varphi(v) + \frac{1}{2}\|v - u\|_2^2 .
\]
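As an illustration of the resolvent (our own example, not taken from the paper), consider the choice $\varphi = \kappa\|\cdot\|_1$, for which the resolvent is the well-known component-wise soft-thresholding. The sketch below assumes NumPy; the names `resolvent_l1`, `objective` and the weight `kappa` are hypothetical.

```python
import numpy as np

def resolvent_l1(u, kappa):
    """Resolvent of phi = kappa * ||.||_1: component-wise soft-thresholding."""
    return np.sign(u) * np.maximum(np.abs(u) - kappa, 0.0)

def objective(v, u, kappa):
    """phi(v) + 0.5 * ||v - u||_2^2, the function minimized by the resolvent."""
    return kappa * np.sum(np.abs(v)) + 0.5 * np.sum((v - u) ** 2)

u = np.array([-2.0, -0.3, 0.0, 0.5, 3.0])
kappa = 1.0
v_star = resolvent_l1(u, kappa)   # entries smaller than kappa in magnitude are set to zero
```

Since the minimized function is strongly convex, `v_star` is its unique minimizer, which can be checked by comparing `objective(v_star, u, kappa)` with the value at perturbed points.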

2 NOISE MODEL

Our objective can be formulated as follows: we want to recover an original image $u$, given an observed image $u_0 = u + b$, where $b$ is a sample of some random process.

The most standard denoising techniques explicitly or implicitly assume that the noise is the realization of a random process that is pixelwise independent and identically distributed (i.e. a white noise). Under this assumption, the maximum a posteriori (MAP) approach leads to optimization problems of the kind:

\[
\text{Find } u \in \operatorname*{argmin}_{u \in \mathbb{R}^n} \; J(u) + \sum_{x} \phi\big(u(x) - u_0(x)\big),
\]

where

1. exp(−φ) is proportional to the p.d.f. of the noise

at each pixel,

2. J(u) is an image prior.
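As a standard illustration (our addition, with $s$ a hypothetical noise level): if the noise is Gaussian and white with standard deviation $s$, its p.d.f. at each pixel is proportional to $\exp(-t^2/(2s^2))$, so that

\[
\phi(t) = \frac{t^2}{2s^2}, \qquad \sum_{x} \phi\big(u(x) - u_0(x)\big) = \frac{1}{2s^2}\,\|u - u_0\|_2^2 .
\]

With a total variation prior $J$, the MAP problem then reduces to the classical TV-$L^2$ denoising model. A Laplace noise, for which $\phi(t) = |t|/s$, would instead lead to an $\ell^1$ data term.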

The assumption that the noise is i.i.d. appears too

restrictive in some situations, and is not adapted to

structured noise (see Figures 2 and 3).

The general model of noise considered in this work is the following:

\[
b = \sum_{i=1}^{m} \lambda_i * \psi_i, \tag{1}
\]

where $\{\psi_i\}_{i=1}^{m}$ are filters that describe patterns of noise, and $\{\lambda_i\}_{i=1}^{m}$ are samples of white noise processes $\{\Lambda_i\}_{i=1}^{m}$. Each process $\Lambda_i$ is a set of $n$ i.i.d. random variables with a p.d.f. proportional to $\exp(-\phi_i)$.

In short, the convolution that appears in the right-hand side of (1) states that the noise $b$ is composed of a certain number of patterns $\psi_1, \ldots, \psi_m$ that are replicated in space. The noise $b$ in (1) is a wide-sense stationary noise (Shiryaev, 1996). Examples of noises that can be generated using this model are shown in Figure 3.

1. example (b) is a Gaussian white noise. It is

the convolution of a Gaussian white noise with a

Dirac delta function.

2. example (c) is a sine function in the x direction.

It is a sample of a uniform white noise in [−1,1]

convolved with the ﬁlter that is constant equal to

1/n

in the ﬁrst column and zero otherwise.

3. example (d) is composed of a single pattern that is

located at random places. It is the convolution of a

sample of a Bernoulli process with the elementary

pattern.
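The three examples above can be mimicked with a few lines of NumPy. This is only a sketch under our own assumptions: periodic boundary conditions (so that convolutions can be computed with the FFT), a 64 x 64 image, a column filter normalized by the number of rows, and a hypothetical 5 x 5 square as elementary pattern. The standard deviation 0.2 and the Bernoulli probability 5e-4 are the values reported for the synthetic experiment. The function names are ours.

```python
import numpy as np

def circular_convolve(a, b):
    """Periodic 2D convolution of two arrays of equal shape, computed via the FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b)))

def stationary_noise(white_samples, filters):
    """Evaluate b = sum_i lambda_i * psi_i as in Equation (1)."""
    return sum(circular_convolve(lam, psi) for lam, psi in zip(white_samples, filters))

rng = np.random.default_rng(0)
shape = (64, 64)

# (b) Gaussian white noise: the filter is a Dirac delta located at the origin.
dirac = np.zeros(shape)
dirac[0, 0] = 1.0
lam_gauss = 0.2 * rng.standard_normal(shape)

# (c) Stripes: uniform white noise in [-1, 1], filter constant on the first column.
column = np.zeros(shape)
column[:, 0] = 1.0 / shape[0]
lam_unif = rng.uniform(-1.0, 1.0, shape)

# (d) One pattern at random places: Bernoulli samples and a hypothetical 5x5 square.
pattern = np.zeros(shape)
pattern[:5, :5] = 1.0
lam_bern = (rng.random(shape) < 5e-4).astype(float)

b = stationary_noise([lam_gauss, lam_unif, lam_bern], [dirac, column, pattern])
```

Convolving with the Dirac filter returns the white noise itself, while the column filter averages the white noise along each column, which yields a component that is constant in the vertical direction.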

3 RESTORATION ALGORITHM

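The Bayesian approach requires a p.d.f. on the space of images. We assume that the probability of an image $u$ reads $p(u) \propto \exp(-J(u))$. In this work we will consider priors of the form:

\[
J(u) = \alpha \|\nabla u\|_{1,\varepsilon},
\]

where $\alpha > 0$ is a fixed parameter and, for $q = (q_1, q_2) \in \mathbb{R}^{n \times 2}$,

\[
\|q\|_{1,\varepsilon} = \sum_{x} \phi_\varepsilon\!\left(\sqrt{q_1(x)^2 + q_2(x)^2}\right),
\quad \text{with} \quad
\phi_\varepsilon(t) =
\begin{cases}
|t| & \text{if } |t| \ge \varepsilon, \\
|t|^2/(2\varepsilon) + \varepsilon/2 & \text{otherwise.}
\end{cases}
\]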

ICPRAM 2012 - International Conference on Pattern Recognition Applications and Methods


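Note that $\lim_{\varepsilon \to 0} \|\nabla u\|_{1,\varepsilon} = TV(u)$ is the discrete total variation of $u$, and that $\lim_{\varepsilon \to +\infty} \varepsilon \|\nabla u\|_{1,\varepsilon} = \frac{1}{2}\|\nabla u\|_2^2$ up to an additive constant. This model thus includes TV and $H^1$ regularizations as limit cases.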
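The two limit cases can be checked numerically. The sketch below is ours: it assumes periodic forward differences for the discrete gradient (the text does not specify the discretization), and it makes the additive constant explicit, namely $n\varepsilon^2/2$ as soon as $\varepsilon$ exceeds every gradient magnitude.

```python
import numpy as np

def phi_eps(t, eps):
    """phi_eps(t) = |t| if |t| >= eps, and t^2 / (2 eps) + eps / 2 otherwise."""
    t = np.abs(t)
    return np.where(t >= eps, t, t ** 2 / (2.0 * eps) + eps / 2.0)

def norm_1_eps(q, eps):
    """||q||_{1,eps} for q of shape (2, H, W): sum over pixels of phi_eps(|q(x)|)."""
    return np.sum(phi_eps(np.sqrt(q[0] ** 2 + q[1] ** 2), eps))

def gradient(u):
    """Periodic forward differences (an assumed discretization of the gradient)."""
    return np.stack([np.roll(u, -1, axis=0) - u, np.roll(u, -1, axis=1) - u])

rng = np.random.default_rng(0)
u = rng.standard_normal((32, 32))
q = gradient(u)

tv = np.sum(np.sqrt(q[0] ** 2 + q[1] ** 2))   # isotropic discrete total variation
h1 = 0.5 * np.sum(q ** 2)                     # (1/2) ||grad u||_2^2

small, large = 1e-8, 1e3
tv_gap = abs(norm_1_eps(q, small) - tv)
# For eps larger than all gradient magnitudes the additive constant is n * eps^2 / 2.
h1_gap = abs(large * norm_1_eps(q, large) - large ** 2 * u.size / 2.0 - h1)
```

Both `tv_gap` and `h1_gap` are expected to be negligible compared with `tv` and `h1`.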

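The maximum a posteriori approach in a Bayesian framework consists in retrieving the image $u$ and the weights $\{\lambda_i\}_{i=1}^{m}$ that maximize the conditional probability

\[
p(u, \lambda_1, \ldots, \lambda_m \mid u_0) = \frac{p(u_0 \mid u, \lambda_1, \ldots, \lambda_m)\, p(u, \lambda_1, \ldots, \lambda_m)}{p(u_0)}.
\]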

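By assuming that the image $u$ and the noise components $\lambda_i$ are samples of independent processes, standard arguments show that maximizing $p(u, \lambda_1, \ldots, \lambda_m \mid u_0)$ amounts to solving the following minimization problem:

\[
\text{Find } \{\lambda_i^\star\}_{i=1}^{m} \in \operatorname*{argmin}_{\{\lambda_i\}_{i=1}^{m}} \; \sum_{i=1}^{m} \phi_i(\lambda_i) + F\Big(\nabla\Big(\sum_{i=1}^{m} \lambda_i * \psi_i\Big)\Big), \tag{2}
\]

where

\[
F(q) = \alpha \|\nabla u_0 - q\|_{1,\varepsilon}.
\]

The denoised image $u^\star$ is then $u^\star = u_0 - \sum_{i=1}^{m} \lambda_i^\star * \psi_i$.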

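We propose in this work to solve problem (2) with a primal-dual algorithm developed in (Chambolle and Pock, 2011). Let $A$ be the following linear operator:

\[
A : \mathbb{R}^{n \times m} \to \mathbb{R}^{n \times 2}, \qquad \lambda \mapsto \nabla\Big(\sum_{i=1}^{m} \lambda_i * \psi_i\Big).
\]

By denoting

\[
G(\lambda) = \sum_{i=1}^{m} \phi_i(\lambda_i), \tag{3}
\]

problem (2) can be recast as the following convex-concave saddle-point problem:

\[
\min_{\lambda \in \mathbb{R}^{n \times m}} \; \max_{\|q\|_\infty \le 1} \; \langle A\lambda, q\rangle - F^{*}(q) + G(\lambda). \tag{4}
\]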

We denote by $\Delta(\lambda, q)$ the duality gap of this problem (Rockafellar, 1970). The problem is solved using Algorithm 1 (Chambolle and Pock, 2011). In practice, for a correct choice of inner products and of the parameters $\sigma$ and $\tau$, this algorithm requires around 50 low-cost iterations for $\varepsilon = 10^{-3}$. More details will be provided in a forthcoming research report.
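To make the iterations of Algorithm 1 concrete, here is a minimal NumPy sketch. It is not the authors' implementation and relies on several assumptions of ours: each $\phi_i$ is quadratic, $\phi_i(t) = \beta_i t^2/2$ (Gaussian white noise), so that the resolvent of $G$ is a simple rescaling; boundary conditions are periodic; the dual variable is constrained pointwise by $|q(x)| \le \alpha$, as given by the conjugate of $\alpha\|\cdot\|_{1,\varepsilon}$; the steps are $\sigma = \tau = 0.99/L$ with $L$ an upper bound on $\|A\|$, and $\theta = 1$; a fixed number of iterations replaces the duality-gap stopping test. All function and variable names are hypothetical.

```python
import numpy as np

def grad(u):
    """Periodic forward differences; returns an array of shape (2, H, W)."""
    return np.stack([np.roll(u, -1, axis=0) - u, np.roll(u, -1, axis=1) - u])

def grad_adjoint(q):
    """Adjoint of `grad` for the Euclidean inner product."""
    return (np.roll(q[0], 1, axis=0) - q[0]) + (np.roll(q[1], 1, axis=1) - q[1])

def smoothed_l1(q, eps):
    """||q||_{1,eps}: sum over pixels of phi_eps applied to the magnitude of q."""
    t = np.sqrt(q[0] ** 2 + q[1] ** 2)
    return np.sum(np.where(t >= eps, t, t ** 2 / (2.0 * eps) + eps / 2.0))

def apply_A(lam, psi_hat):
    """A lam = grad(sum_i lam_i * psi_i), with periodic convolutions done by FFT."""
    return grad(np.real(np.fft.ifft2(np.sum(psi_hat * np.fft.fft2(lam), axis=0))))

def apply_At(q, psi_hat):
    """Adjoint of A: correlation of grad^T q with each filter psi_i."""
    return np.real(np.fft.ifft2(np.conj(psi_hat) * np.fft.fft2(grad_adjoint(q))))

def energy(lam, u0, psi_hat, betas, alpha, eps):
    """Objective of problem (2) under the assumption phi_i(t) = beta_i * t^2 / 2."""
    residual = grad(u0) - apply_A(lam, psi_hat)
    return 0.5 * np.sum(betas * lam ** 2) + alpha * smoothed_l1(residual, eps)

def primal_dual(u0, psi_hat, betas, alpha=1.0, eps=1e-3, n_iter=500):
    # ||grad||^2 <= 8 and the convolution part of A is diagonal in the Fourier domain.
    L = np.sqrt(8.0 * np.max(np.sum(np.abs(psi_hat) ** 2, axis=0)))
    sigma = tau = 0.99 / L
    c = grad(u0)
    lam = np.zeros((psi_hat.shape[0],) + u0.shape)
    lam_bar, q = lam.copy(), np.zeros_like(c)
    for _ in range(n_iter):
        # Dual step: resolvent of sigma * dF^*, a rescaling followed by a projection.
        p = (q + sigma * (apply_A(lam_bar, psi_hat) - c)) / (1.0 + sigma * eps / alpha)
        q = p / np.maximum(1.0, np.sqrt(p[0] ** 2 + p[1] ** 2) / alpha)
        # Primal step: resolvent of tau * dG, explicit because G is quadratic here.
        lam_new = (lam - tau * apply_At(q, psi_hat)) / (1.0 + tau * betas)
        lam_bar = 2.0 * lam_new - lam          # over-relaxation with theta = 1
        lam = lam_new
    return lam

# Toy usage: a disk corrupted by vertical stripes, with a single column-shaped filter.
rng = np.random.default_rng(0)
H = W = 32
yy, xx = np.mgrid[:H, :W]
disk = ((yy - 16) ** 2 + (xx - 16) ** 2 < 64).astype(float)
u0 = disk + np.tile(0.3 * rng.standard_normal(W), (H, 1))
column = np.zeros((H, W))
column[:, 0] = 1.0
psi_hat = np.stack([np.fft.fft2(column)])
betas = np.array([1.0]).reshape(-1, 1, 1)

lam = primal_dual(u0, psi_hat, betas)
noise = np.real(np.fft.ifft2(np.sum(psi_hat * np.fft.fft2(lam), axis=0)))
u_star = u0 - noise                             # denoised image u0 - sum_i lam_i * psi_i
```

The quadratic choice for $\phi_i$ is made only because it gives a closed-form resolvent; other convex $\phi_i$ would change the primal step only. Comparing `energy(lam, ...)` with the energy at `lam = 0` gives a quick sanity check of the iterations.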

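Algorithm 1: Primal-Dual algorithm.

Input:
    $\varepsilon$: the desired precision;
    $(\lambda^0, q^0)$: a starting point;
Output:
    $\lambda_\varepsilon$: an approximate solution to problem (4).
begin
    $n = 0$; $\bar\lambda^0 = \lambda^0$;
    while $\Delta(\lambda^n, q^n) > \varepsilon \Delta(\lambda^0, q^0)$ do
        $q^{n+1} = (\mathrm{Id} + \sigma \partial F^{*})^{-1}(q^n + \sigma A \bar\lambda^n)$;
        $\lambda^{n+1} = (\mathrm{Id} + \tau \partial G)^{-1}(\lambda^n - \tau A^{*} q^{n+1})$;
        $\bar\lambda^{n+1} = \lambda^{n+1} + \theta(\lambda^{n+1} - \lambda^n)$;
        $n = n + 1$;
    end
end

4 BAYESIAN INTERPRETATION OF THE DISCRETIZED NEGATIVE NORM MODELS

In the last decade, the texture+cartoon decomposition models based on negative norms attracted the attention of the scientific community. These models often take the following form:

\[
\inf_{u \in BV(\Omega),\ v \in V,\ u + v = u_0} \; TV(u) + \|v\|_V \tag{5}
\]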

where:

• $u_0$ is an image to decompose as the sum of a texture $v$ in $V$ and a structure $u$ in the space of bounded variation functions $BV(\Omega)$,

• $V$ is a Sobolev space of negative index,

• $\|\cdot\|_V$ is an associated semi-norm.

Y. Meyer's seminal model consists in taking $V = W^{-1,\infty}$ and the following norm:

\[
\|v\|_V = \|v\|_{-1,\infty} = \inf_{g \in L^\infty(\Omega)^2,\ \operatorname{div}(g) = v} \|g\|_\infty .
\]

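In the discrete setting the negative norm models read:

\[
\text{Find } (u, g) \in \operatorname*{argmin} \; \|g\|_p + \alpha \|\nabla u\|_1
\quad \text{subject to} \quad u_0 = u + v, \quad v = \nabla^T g, \tag{6}
\]

where $u_0$, $u$ and $v$ are in $\mathbb{R}^n$, $g = \begin{pmatrix} g_1 \\ g_2 \end{pmatrix} \in \mathbb{R}^{n \times 2}$ and $\nabla^T g = \partial_1^T g_1 + \partial_2^T g_2$. The operators $\partial_1$ and $\partial_2$ denote the discrete derivatives with respect to the two space directions. If $p = \infty$, we get the discrete Meyer model. From an experimental point of view, the choices $p = 2$ and $p = 1$ seem to provide better practical results (Vese and Osher, 2003).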

In order to show the equivalence of these models with the ones proposed in Equation (2), we express the differential operators as convolution products. As the discrete derivative operators are usually translation invariant, this reads:

\[
\nabla^T g = h_1 * g_1 + h_2 * g_2,
\]

where $*$ denotes the convolution product and $h_1$ and $h_2$ are derivative filters (typically $h_1 = (1, -1)$ and $h_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$).

This simple remark leads to an interesting interpretation of $g$: it represents the coefficients of an image $v$ in a dictionary composed of the vectors $h_1$ and $h_2$ translated in space.
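The identity between finite differences and convolutions with $h_1$ and $h_2$ can be verified numerically. The sketch below is our own and assumes periodic boundary conditions, with the filters zero-padded to the image size so that the convolution can be computed with the FFT; `conv2_periodic` is a hypothetical helper.

```python
import numpy as np

def conv2_periodic(h, g):
    """Periodic convolution of a small filter h (zero-padded) with an image g."""
    h_pad = np.zeros_like(g)
    h_pad[:h.shape[0], :h.shape[1]] = h
    return np.real(np.fft.ifft2(np.fft.fft2(h_pad) * np.fft.fft2(g)))

h1 = np.array([[1.0, -1.0]])      # row filter (1, -1)
h2 = np.array([[1.0], [-1.0]])    # column filter (1, -1)^T

rng = np.random.default_rng(0)
g1, g2 = rng.standard_normal((2, 16, 16))

# Texture written as a combination of shifted copies of h1 and h2 ...
v_conv = conv2_periodic(h1, g1) + conv2_periodic(h2, g2)
# ... and the same texture written with periodic backward differences.
v_diff = (g1 - np.roll(g1, 1, axis=1)) + (g2 - np.roll(g2, 1, axis=0))
```

The arrays `v_conv` and `v_diff` agree up to rounding errors, which is the dictionary interpretation in a nutshell: `g1` and `g2` are the coefficients of the shifted atoms `h1` and `h2`.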

The negative norm models can thus be interpreted as decomposition models in a very simple texture dictionary. Next, let us show that problem (5) can be interpreted in a MAP formalism.

Let us define a probability density function:

Definition 1 (Negative Norm p.d.f.). Let $\Gamma$ be a random vector in $\mathbb{R}^n$ and $\Theta$ be a random vector in $[0, 2\pi]^n$. Let us assume that $p(\Gamma) \propto \exp(-\|\Gamma\|_p)$ and that $\Theta$ has a uniform distribution. These two random vectors allow us to define a third one:

\[
G = \begin{pmatrix} \Gamma \cos(\Theta) \\ \Gamma \sin(\Theta) \end{pmatrix}.
\]

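Now let us show that problem (6) actually corresponds to a MAP decomposition. Let us assume that

\[
u_0 = u + v
\]

with $u$ and $v$ realizations of independent random vectors such that $p(u) \propto \exp(-\alpha\|\nabla u\|_1)$ and $v = \nabla^T g$ with $g$ a realization of $G$. Then the classical Bayes reasoning leads to the following equations:

\[
\begin{aligned}
\operatorname*{argmax}_{u \in \mathbb{R}^n,\ v \in \mathbb{R}^n} p(u, v \mid u_0)
&= \operatorname*{argmax}_{u \in \mathbb{R}^n,\ v \in \mathbb{R}^n} \frac{p(u_0 \mid u, v)\, p(u, v)}{p(u_0)} \\
&= \operatorname*{argmax}_{u + v = u_0,\ u \in \mathbb{R}^n,\ v \in \mathbb{R}^n} \frac{p(u, v)}{p(u_0)} \\
&= \operatorname*{argmin}_{u + v = u_0,\ u \in \mathbb{R}^n,\ v \in \mathbb{R}^n} -\log(p(v)) - \log(p(u)) \\
&= \operatorname*{argmin}_{u + v = u_0,\ u \in \mathbb{R}^n,\ v \in \mathbb{R}^n} -\log(p(v)) + \alpha\|\nabla u\|_1 \\
&= \operatorname*{argmin}_{u + v = u_0,\ u \in \mathbb{R}^n,\ v = \nabla^T g} \|g\|_p + \alpha\|\nabla u\|_1,
\end{aligned}
\]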

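which is exactly problem (6). Also note that the model above is equivalent to a slight variant of the model defined in Equation (2) in the case $m = 2$:

\[
\begin{aligned}
& \operatorname*{argmin}_{u + v = u_0,\ u \in \mathbb{R}^n,\ v = \nabla^T g} \|g\|_p + \alpha\|\nabla u\|_1 \\
&= \operatorname*{argmin}_{g = (g_1, g_2) \in \mathbb{R}^{n \times 2}} \|g\|_p + \alpha\|\nabla(u_0 - \nabla^T g)\|_1 \\
&= \operatorname*{argmin}_{(g_1, g_2) \in \mathbb{R}^{n \times 2}} G(g_1, g_2) + F\big(\nabla(u_0 - h_1 * g_1 - h_2 * g_2)\big),
\end{aligned}
\]

where

• $G(g_1, g_2) = \Big(\sum_{x} \big(g_1(x)^2 + g_2(x)^2\big)^{p/2}\Big)^{1/p}$ is a mixed-norm variant of the function $G$ defined in Equation (3) (Kowalski, 2009),

• $F(q) = \alpha\|q\|_1$,

• the filters $h_1$ and $h_2$ are the discrete derivative filters defined above.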

The same reasoning holds for most negative norm models proposed lately (Meyer, 2001; Aujol et al., 2006; Vese and Osher, 2003; Osher et al., 2003; Garnett et al., 2007), and problem (2) actually generalizes all these models. To our knowledge, the Chambolle-Pock implementation (Chambolle and Pock, 2011) proposed here and the ADMM method (Ng et al., 2010) (for strongly monotone problems) are the most efficient numerical approaches.

5 NEGATIVE NORM TEXTURE SYNTHESIS

The MAP approach to negative norm models described above also sheds new light on the kind of texture favored by the negative norms. In order to synthesize a texture following the p.d.f. of Definition 1, it suffices to run the following algorithm:

1. Generate a sample of a uniform random vector $\theta \in [0, 2\pi]^n$.

2. Generate a sample of a random vector $\gamma$ with p.d.f. proportional to $\exp(-\|\gamma\|_p)$.

3. Generate two vectors $g_1 = \gamma\cos(\theta)$ and $g_2 = \gamma\sin(\theta)$.

4. Generate the texture $v = \nabla^T \begin{pmatrix} g_1 \\ g_2 \end{pmatrix}$.

The results of this simple algorithm are presented

in Figure 1.
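The four steps of this synthesis algorithm can be sketched as follows. The sampling strategy is our own assumption and differs from the approximation mentioned in the caption of Figure 1: a density proportional to $\exp(-\|\gamma\|_p)$ factorizes into i.i.d. Laplace variables for $p = 1$, while for $p = 2$ and $p = \infty$ we draw the norm $\|\gamma\|_p$ from a Gamma law with shape $n$ and the direction uniformly on the corresponding unit sphere. Periodic backward differences stand in for $\nabla^T$, and all names are hypothetical.

```python
import numpy as np

def sample_gamma(n, p, rng):
    """Draw gamma in R^n with density proportional to exp(-||gamma||_p)."""
    if p == 1:
        return rng.laplace(size=n)            # the density factorizes over coordinates
    radius = rng.gamma(shape=n)               # ||gamma||_p follows a Gamma(n, 1) law
    if p == 2:
        direction = rng.standard_normal(n)
        return radius * direction / np.linalg.norm(direction)
    if p == np.inf:
        direction = rng.uniform(-1.0, 1.0, n)
        return radius * direction / np.max(np.abs(direction))
    raise ValueError("p must be 1, 2 or np.inf")

def synthesize_texture(shape, p, rng):
    n = shape[0] * shape[1]
    theta = rng.uniform(0.0, 2.0 * np.pi, n)            # step 1
    gamma = sample_gamma(n, p, rng)                     # step 2
    g1 = (gamma * np.cos(theta)).reshape(shape)         # step 3
    g2 = (gamma * np.sin(theta)).reshape(shape)
    # Step 4: v = h1 * g1 + h2 * g2 with periodic backward differences.
    return (g1 - np.roll(g1, 1, axis=1)) + (g2 - np.roll(g2, 1, axis=0))

rng = np.random.default_rng(0)
textures = {p: synthesize_texture((64, 64), p, rng) for p in (1, 2, np.inf)}
```

Each entry of `textures` can then be displayed as an image to compare the three choices of $p$.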

6 RESULTS OF THE DENOISING ALGORITHM

6.1 Synthetic Image

The method was validated on a synthetic example, where a ground truth is available. A synthetic image was created by adding to a cartoon image (a disk) the sum of 3 different stationary noises. The resulting synthetic image is shown in Figure 2. The cartoon image and the 3 noise components are presented in Figure 3(a,b,c,d). The first noise component is a sample of a Gaussian white noise. The second component is a sine function in the horizontal direction. The third component is a sum of elementary patterns: it is a sample of a Bernoulli process with probability $5 \cdot 10^{-4}$ convolved with an elementary filter.


Figure 1: Left: standard noises. Right: different textures synthesized with the negative norm p.d.f. (panels labeled p = 2, p = 1 and p = ∞). Note: we synthesize the "Laplace" noise by approximating it with a Bernoulli process.

Figure 2: Synthetic image used for the toy example.

The results of Algorithm 1 are presented in Figure

3(e,f,g,h). The decomposition is almost perfect. This

example is a good proof of concept.

6.2 Real SPIM Image

Algorithm 1 was applied to a zebrafish embryo image obtained using the SPIM microscope. Two filters $\psi_1$ and $\psi_2$ were used to denoise this image. The first filter is a Dirac (which allows the recovery of Gaussian white noise), and the second filter $\psi_2$ is an anisotropic Gabor filter with principal axis directed along the stripes (this orientation was provided by the user). The filter $\psi_2$ is shown in Figure 4.

Figure 3: Toy example. Left column: real components; right column: estimated components using our algorithm. (a,e): cartoon component - (b,f): Gaussian noise, std 0.2 - (c,g): stripes component (sine) - (d,h): 'Poisson' noise component (poisson means fish in French).

Figure 4: A detailed view of filter $\psi_2$.


Figure 5: Top-Left: original image, zebrafish embryo Tg.SMYH1:GFP Slow myosin Chain I specific fibers - Top-Right: TV-L2 denoising - Mid-Left: $H^1$-Gabor restoration - Mid-Right: TV-Gabor restoration - Bottom-Left: stripes identified by our algorithm - Bottom-Right: white noise.

The original image is presented in Figure 5(a), and the result of Algorithm 1 is presented in Figure 5(e). We also present a comparison with two other algorithms in Figures 5(d,b):

• a standard TV-$L^2$ denoising algorithm. The algorithm is unable to remove the stripes as the prior is not adapted to the noise.

• an "$H^1$-Gabor" algorithm, which consists in setting $F(\cdot) \propto \|\cdot\|_2^2$ in equation (2). The image prior thus promotes smooth solutions and provides blurry results.

ACKNOWLEDGEMENTS

The authors wish to thank Julie Batut for providing the images of the zebrafish. They also thank Valérie Lobjois, Bernard Ducommun, Raphael Jorand and François De Vieilleville for their support during this work.

REFERENCES

Aujol, J.-F., Gilboa, G., Chan, T., and Osher, S. (2006). Structure-texture image decomposition - modeling, algorithms, and parameter selection. Int. J. Comput. Vision, vol. 67(1), pp. 111-136.

Chambolle, A. and Pock, T. (2011). A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis., vol. 40(1), pp. 120-145.

Fadili, M., Starck, J.-L., Bobin, J., and Moudden, Y. (2010). Image decomposition and separation using sparse representations: an overview. Proc. of the IEEE, Special Issue: Applications of Sparse Representation, vol. 98(6), pp. 983-994.

Garnett, J., Le, T., Meyer, Y., and Vese, L. (2007). Image decompositions using bounded variation and generalized homogeneous Besov spaces. Appl. Comput. Harmon. Anal., vol. 23, pp. 25-56.

Huisken, J., Swoger, J., Bene, F. D., Wittbrodt, J., and Stelzer, E. (2004). Optical sectioning deep inside live embryos by selective plane illumination microscopy. Science, vol. 305(5686), p. 1007.

Kowalski, M. (2009). Sparse regression using mixed norms. Appl. Comput. Harmon. Anal., vol. 27(3), pp. 303-324.

Meyer, Y. (2001). Oscillating patterns in image processing and in some nonlinear evolution equations, in 15th Dean Jacqueline B. Lewis Memorial Lectures. AMS.

Ng, M., Weiss, P., and Yuan, X.-M. (2010). Solving constrained total-variation image restoration and reconstruction problems via alternating direction methods. SIAM Journal on Scientific Computing, vol. 32.

Osher, S., Sole, A., and Vese, L. (2003). Image decomposition and restoration using total variation minimization and the $H^{-1}$ norm. SIAM Multiscale Model. Sim., vol. 1(3), pp. 339-370.

Rockafellar, T. (1970). Convex Analysis. Princeton University Press.

Shiryaev, A. (1996). Probability, Graduate Texts in Mathematics 95. Springer.

Starck, J., Elad, M., and Donoho, D. (2005). Image decomposition via the combination of sparse representations and a variational approach. IEEE Trans. Im. Proc., vol. 14(10).

Vese, L. and Osher, S. (2003). Modeling textures with total variation minimization and oscillating patterns in image processing. J. Sci. Comput., vol. 19(1-3), pp. 553-572.
