A Novel Real-time Edge-preserving Smoothing Filter

Simon Reich, Alexey Abramov, Jeremie Papon, Florentin Wörgötter and Babette Dellen

Third Institute of Physics - Biophysics, Georg-August-Universität Göttingen,

Friedrich-Hund-Platz 1, 37077 Göttingen, Germany

Institut de Robotica i Informatica Industrial (CSIC-UPC), Llorens i Artigas 4-6, 08028 Barcelona, Spain

Keywords:

Texture Filter, Image Segmentation, GPU, Real-time, Edge-preserving.

Abstract:

The segmentation of textured and noisy areas in images is a very challenging task due to the large variety of objects and materials in natural environments, which cannot be solved by a single similarity measure. In this paper, we address this problem by proposing a novel edge-preserving texture filter, which smudges the color values inside uniformly textured areas, thus making the processed image more workable for color-based image segmentation. Due to the highly parallel structure of the method, the implementation on a GPU runs in real time, allowing us to process standard images within tens of milliseconds. By preprocessing images with this novel filter before applying a recent real-time color-based image segmentation method, we obtain significant improvements in performance for images from the Berkeley dataset, outperforming an alternative version using a standard bilateral filter for preprocessing. We further show that our combined approach leads to better segmentations in terms of a standard performance measure than graph-based and mean-shift segmentation for the Berkeley image dataset.

1 INTRODUCTION

The segmentation of image areas into perceptually

uniform parts continues to be a challenging computer-

vision problem due to the large variety of textures and

materials in our natural environment. Another impor-

tant issue is the performance of the methods in terms

of computation time. Many applications would proﬁt

largely from a real-time segmentation method that is

able to handle a broad spectrum of different images.

When grouping image areas into segments a sim-

ilarity criterion needs to be deﬁned. However, simi-

larities can exist on different scales, i.e., between ad-

jacent pixels, or groups of pixels, as it is the case

for texture. Segmentation algorithms thus need to

take into account similarities occurring at these dif-

ferent scales, which can be rather costly. Graph-

based segmentation algorithms solve this problem

through the deﬁnition of an adaptive similarity mea-

sure, which depends on the average pixel-to-pixel

similarity inside growing regions (Felzenszwalb and

Huttenlocher, 2004). The mean-shift segmentation

algorithm by (Comaniciu and Meer, 2002) performs

a non-parametric analysis in the feature space (Paris

and Durand, 2007) by iteratively computing aver-

age values of pixels inside a Gaussian neighborhood.

Through this process, feature values of pixels are

successively moved towards the mean value of a lo-

cal neighborhood. Upon convergence of the proce-

dure, the feature values are grouped using a cluster-

ing method. While providing quite satisfactory re-

sults, both methods have the disadvantage that they

are based on a sequential process, and thus are not

readily parallelizable, limiting their performance.

Alternatively, the image can be pre-processed using a smoothing filter that homogenizes textured areas, making them workable for standard segmentation and clustering methods. During the past

two decades many types of smoothing ﬁlters have

been proposed. Most ﬁlters are based on two basic

steps: ﬁrst detecting noise and second removing it.

In noise detection, noise and noise-free areas are dis-

tinguished using a threshold. These thresholds can

be either learned using a training set of images, as in

support vector machines (Yang et al., 2010) and neu-

ral networks (Muneyasu et al., 1995), or the threshold

may be computed from the surrounding pixel values,

as in (Du et al., 2011). (Lev et al., 1977) identiﬁed

similar pixels by detecting edges and iteratively re-

placing the intensity of the pixel by the mean of all

pixels in a small environment. Another approach was

proposed by (Tomasi and Manduchi, 1998). The bi-

lateral ﬁlter blurs neighboring pixels depending on

their combined color and spatial distance. Hence, only texture which has a small deviation from the mean can be blurred without affecting boundaries. This leads to a trade-off for highly textured areas: large blurring factors are needed to smooth out texture, with the consequence that edges are no longer preserved.

Reich S., Abramov A., Papon J., Wörgötter F. and Dellen B.
A Novel Real-time Edge-Preserving Smoothing Filter.
DOI: 10.5220/0004214300050014
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP-2013), pages 5-14
ISBN: 978-989-8565-47-1
© 2013 SCITEPRESS (Science and Technology Publications, Lda.)

In the current work, we present a novel smoothing filter which is not limited by these constraints. The basic idea can be described as follows: given a measurement window, the feature values of the pixels inside the window are smoothed with a smoothing factor that depends on the window size. If smoothing decreases the distance from the average value below a certain threshold, all pixels in the window are replaced by their smoothed value and a component that depends on the average value. If smoothing does not decrease the distance from the mean sufficiently, it is assumed that a true boundary is located inside the window, and the original feature values are kept. This procedure is repeated for many different window locations and, optionally, window sizes. Since the procedures for each window are independent of each other, the algorithm can be easily parallelized. We show in this

paper that the proposed ﬁlter leads to improved seg-

mentations in conjunction with a recent real-time seg-

mentation algorithm based on the superparamagnetic

clustering of data (Abramov et al., 2012), and out-

performs the bilateral ﬁlter on the Berkeley database.

Importantly, the ﬁlter runs in real time on a GPU,

providing a powerful add-on to the existing segmen-

tation technique, which can also be applied to video

segmentation. We further compare our results to the

graph-based and the mean-shift segmentation on the

Berkeley segmentation dataset and benchmark (Mar-

tin et al., 2001).

While these segmentation algorithms are color-based, partitioning could also be based on, e.g., an object

classiﬁcation library as in (Farmer and Jain, 2005)

or depth information as in (Cigla and Aydin Alatan,

2008). These algorithms need either a training phase,

a set of ﬁxed parameters or other pre-set information.

Other than segmentation and image smoothing, tex-

ture and noise ﬁlters are found in many other appli-

cations including denoising (Elad, 2002; Jiang et al.,

2003), tone management (Farbman et al., 2008; Du-

rand and Dorsey, 2002), demosaicking (R. and W.,

2003; Farsiu et al., 2006) or optical ﬂow estimation

(Xiao et al., 2006; Sun et al., 2010).

The paper is organized as follows. In section 2

we introduce the proposed ﬁlter and describe the pro-

cessing ﬂow. In section 3 we present obtained seg-

mentation results and evaluate the performance of the

method quantitatively. In section 4 we conclude our

work.

2 APPROACH

2.1 Proposed Filter

A diagram of the proposed filter is given in figure 1. First the image Φ is divided into subwindows Ψ of size $N = k \cdot l$. Each subwindow is shifted by one pixel relative to the last one, such that there are as many subwindows as there are pixels in the image. Then, the pixels inside each subwindow are smoothed. A distance $\delta_{i,j}$ for each pixel inside the subwindow and a mean distance $\bar{\delta}$ are computed in the color domain to obtain a measurement for noise, as described below. A user-selected value τ defines a threshold between noise or texture and a color edge. If noise is detected, a weight $\omega_{i,j}$ is calculated which moves the color values of the respective pixel towards the mean color of the subwindow.

1. Smoothing and Division into Subwindows. The image Φ holds the RGB color vectors $\varphi_{i,j}$ ($\varphi_{i,j} \in \Phi$). Beginning in the upper left corner, the pixels $\varphi_{i,j}$ to $\varphi_{i+k,j+l}$ are copied into a subwindow Ψ of size k × l holding the color vectors $\psi_{r,s}$. Here $(i\;j)$ is in the range of the image size, while $(r\;s)$ is in the range of the subwindow size k × l. Each subwindow is smoothed using a Gaussian filter function to remove outliers which would distort the calculation of the mean as described below.

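2. Computation of the Distance Matrix. The arithmetic mean of Ψ is calculated as

$$\bar{\psi} = \frac{1}{N} \begin{pmatrix} \sum_{r,s} \psi^{R}_{r,s} \\ \sum_{r,s} \psi^{G}_{r,s} \\ \sum_{r,s} \psi^{B}_{r,s} \end{pmatrix}, \tag{1}$$

where $N = k \cdot l$ denotes the size of the subwindow and the superscripts R, G, B denote the color channels. The pixelwise distances

$$\delta_{r,s} = \left| \psi_{r,s} - \bar{\psi} \right|, \tag{2}$$

as well as the resulting mean pixelwise distance for each subwindow Ψ, are computed, the latter as

$$\bar{\delta} = \frac{1}{N} \sum_{r,s} \delta_{r,s}. \tag{3}$$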
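The computation in step 2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `subwindow_stats` is hypothetical, the subwindow is passed as a flat list of RGB tuples, the Gaussian pre-smoothing of step 1 is omitted, and the Euclidean norm is assumed for the color distance.

```python
import math

def subwindow_stats(window):
    """Return (mean color, pixelwise distances, mean distance) for one subwindow.

    `window` is a flat list of (R, G, B) tuples holding the k*l pixels of Psi.
    The name and the data layout are assumptions made for this sketch.
    """
    n = len(window)
    # Per-channel arithmetic mean of the subwindow (equation 1).
    mean = tuple(sum(pixel[c] for pixel in window) / n for c in range(3))
    # Distance of every pixel to the mean color (equation 2); Euclidean norm assumed.
    dists = [
        math.sqrt(sum((pixel[c] - mean[c]) ** 2 for c in range(3)))
        for pixel in window
    ]
    # Mean pixelwise distance of the subwindow (equation 3).
    mean_dist = sum(dists) / n
    return mean, dists, mean_dist

# Two pixels that lie symmetrically around their mean color (3, 4, 0).
mean, dists, mean_dist = subwindow_stats([(0, 0, 0), (6, 8, 0)])
```

In this toy call both pixels are at distance 5 from the mean, so the mean pixelwise distance is 5 as well.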

3. Thresholding. $\delta_{r,s}$ is now used for low-level noise detection. Small-scale color variations result in low $\delta_{r,s}$ values, since all color values are close to the mean color value. A sharp color edge yields a large $\delta_{r,s}$. Therefore, we can use a threshold τ to identify noisy pixels, yielding

$$\chi_{r,s} = \begin{cases} 1 & \delta_{r,s} \le \tau \text{ and } \bar{\delta} \le \tau, \\ 0 & \text{else,} \end{cases} \tag{4}$$

where 1 stands for a noisy or textured pixel.
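The thresholding rule of equation (4) can be illustrated with a short Python sketch. The function name `noise_indicator` is hypothetical; the inputs are the pixelwise distances and the mean pixelwise distance of one subwindow.

```python
def noise_indicator(dists, mean_dist, tau):
    """Binary noise/texture decision per pixel of one subwindow (equation 4).

    A pixel is marked 1 (noise or texture) only if both its own distance and
    the mean distance of the subwindow are at most tau.
    """
    return [1 if (d <= tau and mean_dist <= tau) else 0 for d in dists]

# Low-level noise: every distance is small, so all pixels are marked.
flat = noise_indicator([0.25, 1.75, 2.25, 0.75], 1.25, tau=30)

# Window straddling an edge (values 50, 50, 50, 200, mean 87.5): three pixels
# pass a threshold of 40 on their own, but the mean distance of 56.25 does not.
edge = noise_indicator([37.5, 37.5, 37.5, 112.5], 56.25, tau=40)
```

The second call shows why the mean distance is part of the condition: without it, the three pixels on one side of the edge would be pulled towards a mean that is contaminated by the other side.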

VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications

input image Φ

ﬁlter

segmentation

symbol-like

descriptors

1. smoothing

and dividing into

subwindows Ψ

i, j

2. compute distance

matrix ∆

i, j

and δ

i, j

3. apply threshold τ

4. accept smoothed

values or com-

pute weight ω

i, j

1. smoothing

and dividing into

subwindows Ψ

i, j

2. compute distance

matrix ∆

i, j

and δ

i, j

3. apply threshold τ

4. accept smoothed

values or com-

pute weight ω

i, j

Figure 1: Schematic of the proposed ﬁlter.

4. Computation and Updating of RGB Values. At every iteration a global, image-wide weight $\omega_{i,j}$ is computed for normalization. A subwindow-wide weight, consisting of the squared difference between the user-defined threshold τ and the pixelwise distance $\delta_{r,s}$, is used for updating pixel values within one subwindow. Note that, due to the sliding subwindows, each pixel is updated $N = k \cdot l$ times. The global weight $\omega_{i,j}$ is initialized with zeros and updated according to

$$\omega_{i,j} \leftarrow \omega_{i,j} + \chi_{r,s} \cdot (\tau - \delta_{r,s})^2, \tag{5}$$

where $\chi_{r,s}$ is the noise indicator from equation (4) and the $\omega_{i,j}$ define the matrix Ω. For the updated pixel values a third image frame Θ of the size of the original image is needed. Θ is initialized with zeros and updated according to

$$\theta_{i,j} \leftarrow \theta_{i,j} + \chi_{r,s} \cdot (\tau - \delta_{r,s})^2 \cdot \bar{\psi}. \tag{6}$$

Again, every pixel value is updated N times. When the algorithm has reached the last iteration, all updated pixels Θ are added to the original image Φ and normalized using Ω. Even though it is highly improbable that any entry in Ω equals zero, 1 is added to every entry. As in general $\omega_{i,j} \gg 0$ and specifically $\omega_{i,j} \ge 0$ is true, this does not change the outcome significantly. This way the output is moved to the mean color value $\bar{\psi}$. Two examples for subwindow arrays can be seen in figures 2(a) and 2(b) (grayscale input subwindow on the left side, output subwindow on the right side). Figure 2(d) shows an example of noisy pixels in a subwindow, corresponding to the image in 2(a). The pixelwise distance as well as the mean pixelwise distance are below the threshold τ, and the pixel color values are shifted towards the mean color values. Figure 2(e) shows a noisy step function, which relates to image 2(b). Here τ is smaller than the distances and the pixels are not updated to the mean value. Since we use sliding subwindows, the small-scale noise in pixels 0 to 4 and 5 to 9 will be corrected, but the edge will be preserved.
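The four steps can be put together in a small one-dimensional grayscale sketch, in the spirit of the 10-pixel examples of figure 2. This is one reading of equations (4) to (6), not the authors' implementation: the function name `texture_filter_1d` is hypothetical, the Gaussian pre-smoothing is skipped, only subwindows that lie fully inside the signal are used (no mirrored boundary), and the final normalization is taken to be (Φ + Θ) / (Ω + 1).

```python
def texture_filter_1d(phi, n, tau):
    """Edge-preserving smoothing of a 1D grayscale signal (illustrative sketch).

    phi: list of gray values, n: subwindow length, tau: user threshold.
    """
    size = len(phi)
    omega = [0.0] * size  # accumulated weights, cf. equation (5)
    theta = [0.0] * size  # accumulated weighted subwindow means, cf. equation (6)
    for start in range(size - n + 1):  # one subwindow per start position
        window = phi[start:start + n]
        mean = sum(window) / n
        dists = [abs(v - mean) for v in window]
        mean_dist = sum(dists) / n
        if mean_dist > tau:
            continue  # a boundary is assumed inside this window, keep originals
        for r, d in enumerate(dists):
            if d <= tau:  # pixel classified as noise or texture, cf. equation (4)
                weight = (tau - d) ** 2
                omega[start + r] += weight
                theta[start + r] += weight * mean
    # Add the original image and normalize; 1 is added to every weight.
    return [(p + t) / (o + 1.0) for p, t, o in zip(phi, theta, omega)]

noisy = [100, 102, 98, 101, 99, 100, 102, 98, 101, 99]
step = [50] * 5 + [200] * 5
smoothed = texture_filter_1d(noisy, n=4, tau=30)   # values move towards 100
preserved = texture_filter_1d(step, n=4, tau=30)   # the step stays unchanged
```

With these inputs, every window that straddles the step is rejected because its distances exceed τ, while windows on either side of the step only average values from their own side, so the edge survives and the noise on the flat signal is reduced.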

2.2 Formulation of the Proposed Filter in the Continuous Domain

Let $\mathbf{f}(\mathbf{x})$ define the smoothed input image and $\mathbf{h}(\mathbf{x})$ the output image; $\mathbf{c}(\boldsymbol{\zeta}, \mathbf{x})$ measures the geometric closeness and $\mathbf{s}(\mathbf{f}(\boldsymbol{\zeta}), \mathbf{f}(\mathbf{x}))$ the photometric similarity. As we want to address specifically color images, bold letters refer to RGB vectors. In this section $|\cdot|$ also refers to per-element multiplication instead of vector multiplication. In our approach we first want to detect noise and texture based on a user-defined parameter τ. If noise or texture is detected, we want to remove it, and in case of a color edge, we want to preserve the edge. Therefore, we define a mean value

$$\mathbf{m}(\mathbf{x}) = k^{-1}(\mathbf{x}) \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathbf{f}(\boldsymbol{\zeta}) \cdot \mathbf{c}(\boldsymbol{\zeta}, \mathbf{x})\, d\boldsymbol{\zeta}, \qquad k(\mathbf{x}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathbf{c}(\boldsymbol{\zeta}, \mathbf{x})\, d\boldsymbol{\zeta} \tag{7}$$

and a distance function

$$d(\mathbf{f}(\mathbf{x}), \mathbf{m}(\mathbf{x})) = \left| \mathbf{f}(\mathbf{x}) - \mathbf{m}(\mathbf{x}) \right|, \tag{8}$$

which results in the pixelwise distance. The mean value $\mathbf{m}(\mathbf{x})$ now holds the average color value inside a spatial neighborhood of $\mathbf{x}$, and $d$ holds the color distance from the pixel to the average $\mathbf{m}(\mathbf{x})$. If the spatial neighborhood holds only small-scale noise or texture, we expect a low pixelwise distance $d$, as well as a low average pixelwise distance in the spatial neighborhood $\mathbf{c}$,

$$p(\mathbf{x}) = k^{-1}(\mathbf{x}) \int_{-\infty}^{\infty} d(\mathbf{f}(\boldsymbol{\zeta}), \mathbf{m}(\mathbf{x}))\, \mathbf{c}(\boldsymbol{\zeta}, \mathbf{x})\, d\boldsymbol{\zeta}, \qquad k(\mathbf{x}) = \int_{-\infty}^{\infty} \mathbf{c}(\boldsymbol{\zeta}, \mathbf{x})\, d\boldsymbol{\zeta}. \tag{9}$$

Therefore, we can make a binary decision using a threshold τ as

$$\mathbf{h}(\mathbf{x}) = k^{-1}(\mathbf{x}) \cdot \begin{cases} \int_{-\infty}^{\infty} \mathbf{f}(\boldsymbol{\zeta}) \cdot \mathbf{c}(\boldsymbol{\zeta}, \mathbf{x}) \cdot \mathbf{s}(\boldsymbol{\zeta}, \mathbf{x})\, d\boldsymbol{\zeta} & p, d \le \tau \\ \int_{-\infty}^{\infty} \mathbf{f}(\boldsymbol{\zeta}) \cdot \mathbf{c}(\boldsymbol{\zeta}, \mathbf{x})\, d\boldsymbol{\zeta} & \text{else,} \end{cases} \tag{10}$$

where $k$ is the respective normalization. We used a 2D step function

$$\mathbf{c}(\boldsymbol{\zeta}, \mathbf{x}) = \begin{cases} 1 & \mathbf{x} - \mathbf{a} \le \boldsymbol{\zeta} \le \mathbf{x} + \mathbf{b} \\ 0 & \text{else} \end{cases}, \tag{11}$$

using the conditions $\mathbf{a}, \mathbf{b}, \mathbf{e} \in \mathbb{R}_{\ge 0}$ and $\mathbf{a} + \mathbf{b} = \mathbf{e}$ with a fixed $\mathbf{e}$. This generates a rectangle of the size $\mathbf{e}$ around $\mathbf{x}$. This definition is not feasible in the continuous domain, as it generates an infinite number of subwindows to compute; in the discrete case, however, every pixel is checked and updated according to its neighborhood $\mathbf{e}$. As a measure for similarity we used a squared distance

$$\mathbf{s}(\mathbf{x}) = \left(\tau - d(\mathbf{f}(\mathbf{x}), \mathbf{m}(\mathbf{x}))\right)^2 \cdot \left| \mathbf{m}(\mathbf{x}) \right| \tag{12}$$

and the Euclidean norm. In case of texture detection the output is moved to the mean. The maximum size of the step can be adjusted via the threshold τ.

Figure 2: (a) Low-level white noise (left) in 10 pixels is used as input for filtering. The right bar shows the same 10 pixels after filtering. The computational steps involved in this image can be seen in (d). (b) A noisy step function is used as input. After filtering, input and output are identical, so the edge is preserved. The computational steps are shown in (e). (d) The pixelwise and mean pixelwise distance are both below the threshold τ. The output is filtered according to (4) and moved to the mean color. (e) The pixelwise and mean pixelwise distance are above the threshold and the output is not filtered. The edge is therefore preserved. Panels (d) and (e) plot the grayscale value against the pixel number (0 to 9) and show the threshold τ, the pixel color value $\psi_{r,s}$, the mean pixel color value, the pixelwise color distance $\delta_{r,s}$, the mean pixelwise color distance $\bar{\delta}$, and the output values.
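To make the roles of the closeness and similarity terms concrete, the following Python sketch evaluates a rectangular closeness function and the squared-distance part of the similarity weight for a few values. The names `in_window` and `similarity_weight` are hypothetical, the comparison in the step function is taken componentwise, and only the (τ − d)² factor of the similarity measure is modeled.

```python
def in_window(zeta, x, a, b):
    """2D step function: 1 (True) if x - a <= zeta <= x + b holds per component."""
    return all(xi - ai <= zi <= xi + bi for zi, xi, ai, bi in zip(zeta, x, a, b))

def similarity_weight(d, tau):
    """Squared-distance weight; zero once the color distance d exceeds tau."""
    return (tau - d) ** 2 if d <= tau else 0.0

# Weights for tau = 30: a pixel at the mean counts most, one at the threshold not at all.
weights = [similarity_weight(d, 30) for d in (0, 15, 30, 45)]

# A 4 x 4 rectangle around x = (4, 4), given by a = b = (2, 2).
inside = in_window((5, 5), (4, 4), (2, 2), (2, 2))
outside = in_window((7, 5), (4, 4), (2, 2), (2, 2))
```

Because the weight falls to zero as the distance approaches τ, pixels close to the threshold contribute very little, which avoids an abrupt change in the output between filtered and unfiltered pixels.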

2.3 Real-time Implementation

Real time can only be achieved by running the proposed technique on parallel hardware, because the computation of multiple subwindows is very intensive on traditional CPUs. Once the image Φ is read, values for the subwindows Ψ can be computed independently. For acceleration we use a graphics processing unit (GPU); an example implementation can be seen in figure 3(a). One block on the GPU starts 32 × 16 threads, and each of them loads one pixel into shared memory. Each thread calculates the pixelwise distances for one subwindow Ψ (marked red for two example threads). As shown in white, several threads remain idle after copying. Since we are interested in the average noise, we chose periodic mirrored boundary conditions, as illustrated in figure 3(b).

In our approach the images are filtered twice, using subwindows of size k = 8, l = 4 and k = 4, l = 8. Two runs are used for symmetry purposes and better filter results. As Δ and $\bar{\delta}$ are calculated over each subwindow, the subwindow size also sets the maximum size of texture that is detected. We tested two implementations of the algorithm: the CPU measurement refers to a single-threaded implementation on an AMD Phenom 9550 quad-core processor at 2.2 GHz, using one core and 4 GB RAM. The GPU version is executed on an Nvidia GTX580 graphics card using 512 cores and 1.5 GB device memory.
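The block layout and the boundary handling can be explored with a short Python sketch. Both helpers are hypothetical and rest on assumptions: `active_threads` counts the threads whose k × l subwindow fits entirely into the pixels loaded by one block, and `mirror_index` implements one common form of mirrored boundary (the border pixel is repeated), which may differ in detail from the implementation used in the paper.

```python
def active_threads(block_w, block_h, k, l):
    """Threads of one block whose k x l subwindow lies fully in the loaded tile.

    Returns (columns, rows, total); the remaining threads only copy a pixel.
    """
    cols = block_w - k + 1
    rows = block_h - l + 1
    return cols, rows, cols * rows

def mirror_index(i, n):
    """Map an arbitrary index onto 0..n-1 by periodic mirroring at the borders."""
    period = 2 * n
    i %= period  # Python's modulo is non-negative for a positive period
    return i if i < n else period - 1 - i

# A 32 x 16 block with 8 x 4 subwindows.
cols, rows, total = active_threads(32, 16, 8, 4)
idle = 32 * 16 - total

# Indices just outside a row of 5 pixels are reflected back into the row.
reflected = [mirror_index(i, 5) for i in (-2, -1, 0, 4, 5, 6)]
```

Under these assumptions 25 × 13 threads of a 32 × 16 block compute a subwindow and the rest stay idle after copying, so consecutive blocks have to overlap by the subwindow size minus one pixel in each direction.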

3 EXPERIMENTAL RESULTS

3.1 Filter Results

In figure 5 a visual comparison of different threshold levels is shown. Above a threshold of τ = 35 the output does not change much, which is easily understood when looking at equation 4: above a certain threshold level every pixel is identified as either noise or texture and smoothed out. Therefore, it is not necessary to adapt the user-defined threshold τ to different noise levels. As shown below, there is a best value for τ for non-artificial images.

In this work noise and texture are treated equally. In contrast to a denoising filter, we do not want to restore a noisy image, but to fill large areas with small color variations using the mean color. This means that noise, but also larger structures like texture, are smoothed out. As we are interested in the perception-action loop of robots and in improving color-based segmentation results, the latter is more important to the results.

In figure 4 we compare the proposed filter to a bilateral filter (Tomasi and Manduchi, 1998). As a bilateral filter smooths the image based only on the spatial and color distance, it either smooths the texture but does not preserve the edge, or preserves the edge but does not filter the texture.

Figure 3: (a) Example for parallelization on a GPU. One block loads 32 × 16 px into shared memory, where each subwindow, shown in red, is computed by one thread. The next block would start at pixel 26. For the green pixels the weight ω will be updated. (b) We used periodic mirrored boundary conditions, as they lead to a good representation of the average noise at the border.

3.2 Segmentation Results

We filtered all images from the Berkeley Segmentation Dataset and Benchmark (Martin et al., 2001) using a large variety of different thresholds. Afterwards the filtered images were segmented and compared with the ground truth images from the database. The performance was evaluated using the precision and recall method following (Martin et al., 2004). Given an original image Φ, its machine segmentation, and the corresponding ground truth S, precision is defined as the fraction of boundary pixels in the machine segmentation that also occur in the ground truth S, over the total number of boundary pixels in the machine segmentation. It is therefore sensitive to over-segmentation. Recall measures the fraction of boundary pixels in S which are also found in the machine segmentation. It is sensitive to under-segmentation. The results shown are the arithmetic means of all values computed.
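The two measures can be written down compactly for boundary pixels given as sets of coordinates. The sketch below is a simplification under stated assumptions: the function name `precision_recall` is hypothetical and boundary pixels only count as matched when their coordinates coincide exactly, whereas a benchmark implementation may allow some spatial tolerance when matching boundaries.

```python
def precision_recall(machine, truth):
    """Boundary precision and recall for two sets of (row, col) boundary pixels."""
    machine, truth = set(machine), set(truth)
    matched = len(machine & truth)
    # Precision: matched boundary pixels over all machine boundary pixels.
    precision = matched / len(machine) if machine else 0.0
    # Recall: matched boundary pixels over all ground-truth boundary pixels.
    recall = matched / len(truth) if truth else 0.0
    return precision, recall

# An over-segmented result: all true boundary pixels are found, plus two spurious ones.
truth = {(0, 0), (0, 1)}
machine = {(0, 0), (0, 1), (0, 2), (1, 2)}
precision, recall = precision_recall(machine, truth)
```

In this toy case recall is 1.0 while precision drops to 0.5, which mirrors the statement that precision is the measure that reacts to over-segmentation.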

Figure 6(a) shows the precision and recall values for the Metropolis algorithm using constant segmentation parameters for various threshold values τ. Above a threshold of τ = 35 the values remain steady. The weighted mean of precision and recall indicates that the best results are obtained for a threshold of τ = 30. This behavior was also observed using different segmentation parameters and may also be seen in the filtered images shown in figure 5. Also note that precision is considerably lower than recall. While the ideal value would be 1 for both, the segmented images are extremely rich in texture, resulting in over-segmentation. The threshold is computed using the Berkeley Segmentation Dataset and Benchmark, which offers a wide range of heavily textured natural images. However, our experiments show that τ = 30 may be considered a good value for all scenes offering good color contrast. As the color segmentation will most likely fail in low-contrast scenes, we do not consider them further.

In figure 6(b) we show the segmentations for different values of the parameter α using the Metropolis algorithm. The parameter α is a system parameter used to increase or decrease the coupling strength in the clustering model. Thus, it influences the total number of segments. Other parameters were taken from (Abramov et al., 2012). Best results in the trade-off between over- and under-segmentation were achieved for α = 1.0.

Next, we compare our method with the graph-based segmentation, the mean-shift algorithm and an alternative version of the Metropolis algorithm using the standard bilateral filter for smoothing (see figure 6(c)). While the Metropolis algorithm without filter performs better than the graph-based and mean-shift segmentation, precision is improved when using the proposed filter. Results are shown for different α values for the Metropolis algorithm. The value for α = 1.0 is marked with a circle. For graph-based segmentation we used the combination of parameters recommended by the authors for segmentation of arbitrary images, see (Felzenszwalb and Huttenlocher, 2004). The mean-shift algorithm of (Paris and Durand, 2007) uses three input parameters: the Gaussian parameters σ for the color and the spatial domain, respectively, and the persistence threshold. We determined experimentally that σ = 2 for the color domain, σ = 8 for the spatial domain and a persistence threshold of 1 give the best results. On purpose, we did not combine the mean-shift and the graph-based algorithm with the proposed filter, because both algorithms already perform their own preprocessing.

Figure 4: (a) Artificial input image with three features: a green color edge, texture simulated using red squares 7 px wide, and added white noise. A cross section is shown in white and can be seen in (d). (b) Bilateral filter. For an arbitrarily large kernel of 20 px in the space domain and using σ = 200 for the color domain, all texture and noise is filtered. The color edge now has a width of 18 px. A cross section can be seen in (e). (c) Proposed texture filter. The edge remains sharp and all texture, as well as noise, is smoothed out. A cross section can be seen in (f). (d) Cross section of the original image. A green color step, as well as texture and noise, can be seen. (e) Cross section of the bilateral filter. The color edge is not preserved, texture is removed, but some noise remains. (f) Cross section of the proposed filter. The green color edge is preserved and texture, as well as all noise, is removed. Panels (d) to (f) plot the color value against the pixel number.

Figure 5: Effect of different thresholds. For low thresholds the image is more blurred; after τ = 30 it does not change much. From top left to bottom right: τ = 5, 20, 30, 40, 60, 80.

A visual comparison can be seen in figure 7. The proposed filter greatly reduces noise and texture and achieves a good trade-off between over- and under-segmentation compared to the other methods. A comparison for a video sequence using the Metropolis algorithm can be seen in figure 8. Labels are kept throughout the sequence and are encoded using different colors; in contrast, mean-shift and graph-based segmentation do not use fixed label numbers for objects. The filter reduces the over-segmentation in textured areas, e.g. the suitcases, the pad, or the plant.

For robotic applications it is very important that the labels of image segments do not change throughout a video stream. Currently only very few segmentation algorithms running in real time achieve this (Abramov, 2012), among them the Metropolis algorithm.

VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications

0,2

0,4

0,6

0,8

0 20 40 60 80

precision / recall

Threshold τ

precision

recall

(a)

0,4

0,5

0,6

0,7

0,8

0,9

0,28 0,32 0,36 0,4

recall

precision

= 0.8

= 0.9

= 1.0

= 1.1

= 1.2

= 2.0

(b)

0,5

0,6

0,7

0,8

0,9

0,2 0,25 0,3 0,35 0,4

recall

precision

metropolis without ﬁlter

metropolis with bilateral ﬁlter

metropolis with proposed ﬁlter

graph-based

mean shift

(c)

Figure 6: 6(a) Metropolis algorithm performance with ﬁxed parameters using ﬁltered images for various thresholds. 6(b)

Estimating best segmentation parameters for a ﬁxed threshold τ = 30 using ﬁltered images. 6(c) Comparison between graph-

based, mean shift and Metropolis algorithm. Metropolis segmentation using α

= 1.0 is marked with a circle.

3.3 Time Performance Results

We computed the average frame rates for images of different sizes (table 1). For comparison purposes, images from the Berkeley Segmentation Dataset and Benchmark were used (Martin et al., 2001). As shown in section 2, the complexity is independent of the threshold used. The GPU version is roughly 30 times faster than the CPU version, independent of the image size. For images of size 480 × 320 px, real-time processing of movies is achieved.

Table 1: Time performance for images of different sizes. The test image was taken from the training set of the Berkeley Segmentation Dataset and Benchmark (Martin et al., 2001) and is also used in figure 3(b). 100 measurements were taken and averaged.

Image size [px]    CPU [Hz]    CPU [s]    GPU [Hz]    GPU [ms]
240 × 180          3.03        0.33       80.38       12.4
320 × 240          1.66        0.60       48.00       20.8
480 × 320          0.80        1.25       23.81       42.0
640 × 480          0.40        2.50       12.35       81.0
800 × 600          0.24        4.17       7.65        130.7
1024 × 768         0.15        6.67       4.24        235.9
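The speed-up factor and the per-frame times can be recomputed from the frame rates in table 1. The short Python sketch below only rearranges the published numbers; the names `TABLE_1`, `speedups` and `frame_time_ms` are chosen for this illustration.

```python
# Frame rates from table 1: image size -> (CPU rate in Hz, GPU rate in Hz).
TABLE_1 = {
    (240, 180): (3.03, 80.38),
    (320, 240): (1.66, 48.00),
    (480, 320): (0.80, 23.81),
    (640, 480): (0.40, 12.35),
    (800, 600): (0.24, 7.65),
    (1024, 768): (0.15, 4.24),
}

def speedups(table):
    """GPU-over-CPU speed-up factor for every image size."""
    return {size: gpu / cpu for size, (cpu, gpu) in table.items()}

def frame_time_ms(rate_hz):
    """Processing time per frame in milliseconds for a given frame rate."""
    return 1000.0 / rate_hz

factors = speedups(TABLE_1)
slowest, fastest = min(factors.values()), max(factors.values())
time_480 = frame_time_ms(TABLE_1[(480, 320)][1])
```

The resulting factors all lie between about 26 and 32, which supports the statement that the GPU version is roughly 30 times faster regardless of image size.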

4 CONCLUSIONS

In this paper, we presented a novel real-time edge

preserving smoothing ﬁlter, which replaces noisy and

textured areas by uniformly colored patches. The

performance of a recent image segmentation method could be significantly improved using the filtered images. The time performance makes the filter applicable to video streams, as shown in figure 8, and hence it can be used in the future as a component inside the perception-action loop of robotic applications. The

proposed method improves the precision and recall

trade-off of obtained segmentations. Furthermore, the

detected features could be used for classiﬁcation pur-

poses in other applications.

ACKNOWLEDGEMENTS

The research leading to these results has received

funding from the European Community’s Seventh

Framework Programme FP7/2007-2013 (Speciﬁc

Programme Cooperation, Theme 3, Information and

Communication Technologies) under grant agree-

ment no. 269959, Intellact. B. Dellen acknowledges

support from the Spanish Ministry of Science and In-

novation through a Ramon y Cajal program.

ANovelReal-timeEdge-PreservingSmoothingFilter

(a) (b) (c)

(d)

Figure 7: Visual comparison of several threshold levels. 7(a) Original image. 7(b) Graph based segmentation. 7(c) Mean-shift

segmentation. 7(d) Proposed ﬁlter in connection with metropolis algorithm.

VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications

(a) (b) (c) (d)

Figure 8: Visual comparison of a short video sequence. 8(a) Original frames from the sequence. 8(b) Filtered frames from the sequence using the proposed filter. 8(c) Segmentation using the Metropolis algorithm without filter. 8(d) Segmentation using the Metropolis algorithm and the proposed filter. There is significantly less over-segmentation in textured areas (e.g. suitcase, pad).

ANovelReal-timeEdge-PreservingSmoothingFilter

REFERENCES

Abramov, A. (2012). Compression of the visual data

into symbol-like descriptors in terms of the cognitive

real-time vision system. PhD thesis, Georg-August-

Universität Göttingen.

Abramov, A., Pauwels, K., Papon, J., Wörgötter, F., and

Dellen, B. (2012). Real-time segmentation of stereo

videos on a portable system with a mobile gpu. IEEE

Transactions on Circuits and Systems for Video Tech-

nology.

Cigla, C. and Aydin Alatan, A. (2008). Depth assisted ob-

ject segmentation in multi-view video. In 3DTV Con-

ference: The True Vision - Capture, Transmission and

Display of 3D Video, 2008, pages 185 –188.

Comaniciu, D. and Meer, P. (2002). Mean shift: a robust

approach toward feature space analysis. IEEE Trans-

actions on Pattern Analysis and Machine Intelligence,

24(5):603 –619.

Du, W., Tian, X., and Sun, Y. (2011). A dynamic threshold

edge-preserving smoothing segmentation algorithm

for anterior chamber oct images based on modiﬁed

histogram. In 4th International Congress on Image

and Signal Processing (CISP), volume 2, pages 1123

–1126.

Durand, F. and Dorsey, J. (2002). Fast bilateral ﬁltering

for the display of high-dynamic-range images. ACM

Trans. Graph., 21(3):257–266.

Elad, M. (2002). On the origin of the bilateral ﬁlter and

ways to improve it. IEEE Transactions on Image Pro-

cessing, 11(10):1141 – 1151.

Farbman, Z., Fattal, R., Lischinski, D., and Szeliski, R.

(2008). Edge-preserving decompositions for multi-

scale tone and detail manipulation. ACM Trans.

Graph., 27(3):67:1–67:10.

Farmer, M. and Jain, A. (2005). A wrapper-based approach

to image segmentation and classiﬁcation. IEEE Trans-

actions on Image Processing, 14(12):2060 –2072.

Farsiu, S., Elad, M., and Milanfar, P. (2006). Multiframe de-

mosaicing and super-resolution of color images. IEEE

Transactions on Image Processing, 15(1):141 –159.

Felzenszwalb, P. and Huttenlocher, D. (2004). Efﬁcient

graph-based image segmentation. International Jour-

nal of Computer Vision, 59:167–181.

Jiang, W., Baker, M. L., Wu, Q., Bajaj, C., and Chiu, W.

(2003). Applications of a bilateral denoising ﬁlter in

biological electron microscopy. Journal of Structural

Biology, 144(1–2):114 – 122.

Lev, A., Zucker, S. W., and Rosenfeld, A. (1977). Iterative

enhancement of noisy images. IEEE Transactions on

Systems, Man and Cybernetics, 7(6):435 –442.

Martin, D., Fowlkes, C., and Malik, J. (2004). Learning

to detect natural image boundaries using local bright-

ness, color, and texture cues. IEEE Transactions on

Pattern Analysis and Machine Intelligence, 26(5):530

–549.

Martin, D., Fowlkes, C., Tal, D., and Malik, J. (2001).

A database of human segmented natural images and

its application to evaluating segmentation algorithms

and measuring ecological statistics. In Proc. 8th Int’l

Conf. Computer Vision, volume 2, pages 416–423.

Muneyasu, M., Maeda, T., Yako, T., and Hinamoto, T.

(1995). A realization of edge-preserving smoothing

ﬁlters using layered neural networks. In IEEE Interna-

tional Conference on Neural Networks, Proceedings.,

volume 4, pages 1903 –1906 vol.4.

Paris, S. and Durand, F. (2007). A topological approach to

hierarchical segmentation using mean shift. In IEEE

Conference on Computer Vision and Pattern Recogni-

tion (CVPR), pages 1 –8.

R., R. and W., S. (2003). Adaptive demosaicking. J. Elec-

tron. Imaging, 12(12):633.

Sun, D., Roth, S., and Black, M. (2010). Secrets of optical

ﬂow estimation and their principles. In IEEE Con-

ference on Computer Vision and Pattern Recognition

(CVPR), pages 2432 –2439.

Tomasi, C. and Manduchi, R. (1998). Bilateral ﬁltering for

gray and color images. In Sixth International Confer-

ence on Computer Vision, pages 839 –846.

Xiao, J., Cheng, H., Sawhney, H., Rao, C., and Isnardi,

M. (2006). Bilateral ﬁltering-based optical ﬂow es-

timation with occlusion detection. In Leonardis, A.,

Bischof, H., and Pinz, A., editors, Computer Vision –

ECCV 2006, volume 3951 of Lecture Notes in Com-

puter Science, pages 211–224. Springer Berlin / Hei-

delberg.

Yang, Q., Wang, S., and Ahuja, N. (2010). Svm for edge-

preserving ﬁltering. In IEEE Conference on Computer

Vision and Pattern Recognition (CVPR), pages 1775 –

1782.

VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications