An Entropy-based Model for a Fast Computation of SSIM

Vittoria Bruni (1,2) and Domenico Vitulano

(1) Dept. of SBAI, University of Rome La Sapienza, Via A. Scarpa 16, Rome, Italy

(2) Istituto per le Applicazioni del Calcolo, CNR, Via dei Taurini 19, Rome, Italy

Keywords:

Information Theory, SSIM, Image Quality Assessment, Typical Set.

Abstract:

The paper presents a model for assessing image quality from a subset of pixels. It is based on the fact that

human beings do not explore the whole image information for quantifying its degree of distortion. Hence, the

vision process can be seen in agreement with the Asymptotic Equipartition Property. The latter assures the

existence of a subset of sequences of image blocks able to describe the whole image source with a preﬁxed and

small error. Speciﬁcally, the well known Structural SIMilarity index (SSIM) has been considered. Its entropy

has been used for deﬁning a method for the selection of those image pixels that enable SSIM estimation with

enough precision. Experimental results show that the proposed selection method is able to reduce the number

of operations required by SSIM by a factor of about 200, with an estimation error of less than 8%.

1 INTRODUCTION

A wide literature has deﬁnitely proved that embed-

ding and translating HVS concepts in image process-

ing based applications promote the optimization of

several applications in terms of efﬁciency, precision,

automaticity and, sometimes, computing time (Bruni

et al., 2012; Bruni et al., 2013a; Hontsch and Karam,

2002; Hou and Yau, 2010; Jourlin and Pinoli, 1998;

Lee and Lee, 2006; Panetta et al., 2008; Wang and

Li, 2011). In this context, the definition of measures for image quality assessment that correlate better with the human visual system plays a fundamental role (Bruni et al., 2013b; Ferzli and Karam, 2009; Moorthy and Bovik, 2009; Sheikh et al., 2005; Wang et al., 2004; Wang and Simoncelli, 2005; Wang and Li, 2011). Despite its recognized lack of correlation with human perception, the classical mean squared error (MSE) is still used in many applications, especially in optimization problems, due to its simplicity, low computational effort and nice mathematical properties. The Structural SIMilarity index (SSIM) (Wang et al., 2004) has proved to be a robust competitor of MSE thanks to its fair correlation with HVS, its definition through very simple operations and, as recently proved, its interesting mathematical properties that can promote its use, for example, in regularization methods. Unfortunately, the computational cost required by SSIM is higher than the one required by MSE: SSIM is a pixelwise measure, but it involves block-based operations for each pixel.
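To make the per-pixel cost concrete, the following is a minimal sketch of pointwise SSIM in the standard form of Wang et al. (2004), followed by its mean over all pixels. It is not the authors' implementation: the function names are hypothetical, a uniform (unweighted) window is assumed, and the constants `c1` and `c2` are the customary choices for 8-bit images.

```python
from statistics import fmean

def ssim_block(b, d, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """SSIM between two equally sized blocks given as flat lists of pixel values."""
    mu_b, mu_d = fmean(b), fmean(d)
    var_b = fmean((x - mu_b) ** 2 for x in b)
    var_d = fmean((y - mu_d) ** 2 for y in d)
    cov_bd = fmean((x - mu_b) * (y - mu_d) for x, y in zip(b, d))
    num = (2 * mu_b * mu_d + c1) * (2 * cov_bd + c2)
    den = (mu_b ** 2 + mu_d ** 2 + c1) * (var_b + var_d + c2)
    return num / den

def block_at(img, r, c, w):
    """Flatten the w x w block of img (list of rows) centered at (r, c)."""
    h = w // 2
    return [img[i][j] for i in range(r - h, r + h + 1)
            for j in range(c - h, c + h + 1)]

def mean_ssim(img, dist, w=3):
    """Mean of pointwise SSIM over every pixel whose block fits in the image.

    Each pixel costs one full block computation, which is the expense
    the paper aims to avoid.
    """
    h = w // 2
    vals = [ssim_block(block_at(img, r, c, w), block_at(dist, r, c, w))
            for r in range(h, len(img) - h)
            for c in range(h, len(img[0]) - h)]
    return fmean(vals)
```

Every interior pixel triggers its own block statistics, so the work grows with the product of the image size and the block area.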

The aim of this paper is to speed up the computation of SSIM by computing it on a reduced number of blocks. This strategy mainly relies on the fact that humans are able to assign a score to an image by just looking at a few specific points, known as fixation points (Monte et al., 2005; Frazor and Geisler, 2006). This way of selecting information is closely related to some concepts in Information Theory and, in particular, to the Asymptotic Equipartition Property (Cover and Thomas, 1991). This principle states that, for a given source, there exists a subset of sequences able to represent the whole source, i.e. with entropy close to the source entropy. Accordingly, in the context of vision, there exists more than one sequence of fixation points of a given length that is able to code the whole image information content. Hence, by defining the visual distortion typical set as in (Bruni and Vitulano, 2014), we want to develop a method for extracting at least one sequence belonging to this set, from which the quality of the whole image can be assessed. With respect to the literature concerning the selection of the best pooling weights for an image quality assessment measure (Moorthy and Bovik, 2009; Park et al., 2011; Wang and Li, 2011), the proposed method can also be seen as a binary pooling, preserving some blocks while discarding the others.

For the $i$-th pixel, SSIM is defined as follows:

$$ SSIM(b_i, d_i) = \frac{(2\mu_{b_i}\mu_{d_i} + C_1)(2\sigma_{b_i d_i} + C_2)}{(\mu_{b_i}^2 + \mu_{d_i}^2 + C_1)(\sigma_{b_i}^2 + \sigma_{d_i}^2 + C_2)} $$

where $b_i$ and $d_i$ are blocks centered at $i$ in the original and distorted image respectively, $\mu_*$ and $\sigma_*$ are respectively the mean and the standard deviation of $*$, $\sigma_{b_i d_i}$ is the correlation between $b_i$ and $d_i$, while $C_1$ and $C_2$ are numerical stabilizing constants.

Bruni, V. and Vitulano, D.

An Entropy-based Model for a Fast Computation of SSIM.

DOI: 10.5220/0005730002260233

In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 4: VISAPP, pages 226-233

ISBN: 978-989-758-175-5

The proposed method consists of the following

main steps:

1. Luminance based image segmentation: the image

is split into a ﬁnite number of distinct regions hav-

ing different characteristics;

2. Finite random walk on a connected and weighted

graph whose nodes are the regions given by the

segmentation. This step provides a sequence of

points belonging to the typical set of length K.

K is automatically determined for each image us-

ing the Minimum Description Length principle

(MDL) (Grunwald, 2004).

It will be shown that the mean value of the SSIM evaluated on blocks centered at these points gives a faithful estimation of the SSIM of the whole image with a considerable computational saving. Experimental results on test images from the TID2013 database (Ponomarenko et al., 2015) show that it is possible to reach a speed-up for SSIM, evaluated for different distortion levels, of over 200:1 with a relative estimation error lower than 8%.

The outline of the paper is the following. The next

Section gives some preliminary results on the visual

distortion typical set. Section 3 presents a method

for determining a sequence of points belonging to

this set; details about the algorithm and its computa-

tional cost will also be given. Section 4 presents some

experimental results obtained on TID2013 database

while the last section draws the conclusions.

2 SOME PRELIMINARY RESULTS

In (Bruni and Vitulano, 2014), the visual distortion typical set $A_\varepsilon$ has been defined as a subset of all sequences composed of samples of the original image $I$ (and the corresponding ones in the degraded image $I_d$) such that they give an approximated value $\hat{M}$ of the expected value of the measure $M$ (i.e. $\bar{M}$) within an error $\varepsilon$, i.e. $|\hat{M} - \bar{M}| < \varepsilon$, where $M$ is the reference quality measure. In our case, $M$ is the pointwise SSIM, $\bar{M}$ is the mean of $M$ computed using all image pixels, while $\hat{M}$ is the mean of $M$ computed on a reduced number of image pixels. More formally, $A_\varepsilon$ is the set of sequences of fixed size whose entropy is close to the entropy of the source. The existence of $A_\varepsilon$ is guaranteed by the Asymptotic Equipartition Property (AEP) (Cover and Thomas, 1991), which states that for i.i.d. r.v.s $X_i$ it holds:

$$ -\frac{1}{n} \log p(X_1, \ldots, X_n) \to H(X) \quad \text{as } n \to \infty. $$
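The AEP statement can be illustrated numerically. The sketch below draws i.i.d. Bernoulli samples (a toy source chosen here for illustration, not one used in the paper) and compares the normalized negative log-probability of the observed sequence with the source entropy; the function names are hypothetical.

```python
import math
import random

def empirical_entropy_rate(p_one, n, seed=0):
    """Return -(1/n) * log2 p(X_1, ..., X_n) for n i.i.d. Bernoulli(p_one) draws."""
    rng = random.Random(seed)
    log_p = 0.0
    for _ in range(n):
        x = 1 if rng.random() < p_one else 0
        # Independence makes the joint log-probability a sum of per-sample terms.
        log_p += math.log2(p_one if x == 1 else 1.0 - p_one)
    return -log_p / n

def bernoulli_entropy(p_one):
    """Entropy H(X) in bits of a Bernoulli(p_one) source."""
    return -(p_one * math.log2(p_one) + (1 - p_one) * math.log2(1 - p_one))

if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, empirical_entropy_rate(0.3, n), bernoulli_entropy(0.3))
```

As n grows, the first printed value settles near the second, which is the sense in which a long sequence is typical of its source.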

AEP is the entropic version of the weak law of large numbers. However, the entropy based version is more mathematically tractable, as entropy increases as the number of samples grows (Cover and Thomas, 1991), while this is not so for the mean value. Based on these concepts, in (Bruni and Vitulano, 2014) the authors gave some guidelines for an optimized extraction of the visual distortion typical set from the pair of images $(I, I_d)$. Specifically, it has been formally proved that:

1. Not all information in $I$ and $I_d$ is really important; it is sufficient to select just a part of it for assessing image quality. In addition, an entropy based criterion can be applied for selecting the significant information. Specifically, the following result has been proved:

Proposition 1. Let $X \sim Q$ with a positive and numerical alphabet $\chi$ and $\{X_1\} \sim p_1$, $\{X_1, X_2\} \sim p_2$, ..., $\{X_1, X_2, \ldots, X_n\} \sim p_n$. Let $\mu_n$ be the mean of $p_n$, $\mu$ be the mean of $Q$ and $D$ the Kullback-Leibler divergence. Then

(a) the sequence $\{\mu_n\}$ is not monotonic for increasing $n$;

(b) $|\mu_n - \mu|^2 \le 2M^2 D(p_n \| Q)$ $\forall n$, with $M = \max_{x \in \chi} x$.

This Proposition, along with the known results on the monotonicity of the entropy per element of a stationary stochastic process (Cover and Thomas, 1991), supports the use of the entropy as the fundamental measure in the selection of sequences belonging to the visual distortion typical set.

2. In the construction of the sequence of interest, it is more convenient to select non overlapping local regions (for instance, blocks) as samples of $I$ (and $I_d$) rather than to randomly select isolated pixels.

3. It is more convenient to extract significant information from $M$ rather than from the pair of images $I$ and $I_d$.
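Part (b) of Proposition 1 bounds the gap between two means by a divergence. A minimal numerical check is sketched below, assuming the bound is read as a Pinsker-type inequality, (mean gap)^2 <= 2 M^2 D(p||Q), with D measured in nats and M the largest alphabet symbol; the distributions and function names are illustrative, not taken from the paper.

```python
import math

def mean(p, alphabet):
    """Mean of a distribution p over a numerical alphabet."""
    return sum(x * px for x, px in zip(alphabet, p))

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p||q) in nats (q must be positive where p is)."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

def mean_gap_and_bound(p, q, alphabet):
    """Return the squared mean gap and the assumed bound 2 * M^2 * D(p||q)."""
    gap_sq = (mean(p, alphabet) - mean(q, alphabet)) ** 2
    bound = 2 * max(alphabet) ** 2 * kl_divergence(p, q)
    return gap_sq, bound

if __name__ == "__main__":
    alphabet = [1, 2, 3, 4]
    q = [0.25, 0.25, 0.25, 0.25]   # reference distribution Q
    p = [0.4, 0.3, 0.2, 0.1]       # distribution of a partial sequence
    print(mean_gap_and_bound(p, q, alphabet))
```

The point of the check is that a small divergence from the source forces the partial mean to stay close to the source mean, which is why entropy-like quantities can drive the selection.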

What was missing in (Bruni and Vitulano, 2014) is a constructive method for determining the typical set: only its existence, along with some criteria and guidelines for its best search, has been provided. That is why, in the sequel, we will give an answer to the following question:

4. How to find a sequence belonging to $A_\varepsilon$ using a fast procedure.


It is worth noting that there is a wide literature concerning fixation points (Frazor and Geisler, 2006; Monte et al., 2005; Raj et al., 2005), i.e. those points that allow the observer to synthesize and understand scene information in the preattentive phase. Several approaches for the determination of a subset of scene information mainly rely on the construction of saliency maps (Benabdelkader and Boulemden, 2005; Bruni et al., 2011; Wang et al., 2010). However, to the best of the authors' knowledge, there are no complete theoretical formalisms that lead to a specific subset that can be extracted in a limited time (Raj et al., 2005), as the proposed approach does. In addition, unlike existing methods that provide empirical and computationally demanding strategies that lead to a specific solution (i.e. a specified walk in the scene under exam), the proposed approach proves the existence of more than one walk given $I$, $I_d$, $M$ and $\varepsilon$, in agreement with the concept of typical set in Information Theory (Cover and Thomas, 1991).

3 THE PROPOSED MODEL

Fixation points vary from observer to observer since

they depend on personal cognitive experience, the

scope of the observation and image content. How-

ever, if we restrict to the class of natural images, some

rules of visual system, that guide the saccadic move-

ments in the preattentive phase, can be modelled in

an easier way. The characteristics of natural scenes

guided the adaptation of the visual system over time;

hence, their sources are the ones with which the visual

system is more familiar. In the ﬁrst milliseconds of

scene inspection, ﬁxation points are not conditioned

by the observer, but mainly by image features; that is

why only global distortions (affecting all image pix-

els) will be considered in the remaining part of the

paper. In fact, a local distortion would strongly orient the path of fixations, which cannot be easily predicted without additional information on the distortion kind.

The proposed method consists of the following main steps:

1. Luminance based segmentation of the image $I$. The output is a partition of the image into $2^L$ regions $R_i$, $i = 1, \ldots, 2^L$, having different characteristics. To this aim, the Successive Mean Quantization Transform (SMQT) (Nilsson et al., 2005), applied to the approximation band of the wavelet expansion of the image, has been employed.

2. Finite random walk on a connected and weighted graph whose nodes are the regions $R_i$, $i = 1, \ldots, 2^L$. This step provides a sequence belonging to the typical set of length $K$. $K$ is automatically determined for each image using the Minimum Description Length principle.

Figure 1: From left to right, top to bottom: original image; image affected by additive Gaussian noise; pointwise SSIM map; image affected by Gaussian blur; pointwise SSIM map.

3.1 Luminance based Segmentation

The ﬁrst step aims at discriminating image regions in

agreement with the visibility of distortion. In fact,

global distortions are not perceived in the same way

in the whole image. For example, as shown in Fig. 1

random noise is more visible in ﬂat regions while it is

masked in textured regions. On the contrary, blurring

is more visible in textured regions than in ﬂat regions.

Hence, the proposed method segments the image ac-

cording to this visibility criterion. Speciﬁcally, the

luminance value at a given ﬁxed resolution has been

selected as the visibility criterion. Luminance is one

of the two measures that regulate the adaptation pro-

cess in the preattentive phase. The second one is the

contrast that has not been considered here for sim-

plicity. The resolution aims at simulating early vision

process, that essentially is a low pass ﬁlter whose cut-

off frequency depends on the viewing distance. In

order to speed up the segmentation process, the approximation band (low-pass component) at level $J$ ($A_J$) of the dyadic wavelet expansion of the image $I$ has been computed (Mallat, 1998), since its dimension is a small fraction of the original image size (decreasing exponentially with $J+1$). For segmenting $A_J$, the Successive Mean Quantization Transform (SMQT) has been adopted due to its simplicity and reduced computational effort. SMQT builds a binary tree using the following rule: given a set of data $A_J$ and a parameter $L$ (number of levels), split $A_J$ into two subsets,

$$ A_0 = \{x \in A_J : A_J(x) \le \bar{A}_J\} \quad \text{and} \quad A_1 = \{x \in A_J : A_J(x) > \bar{A}_J\}, $$

where $\bar{A}_J$ is the mean value of $A_J$. $A_0$ and $A_1$ are the first level of the SMQT. The same procedure is recursively applied to $A_0$ and $A_1$ until the $L$-th level, which is composed of $2^L$ subsets (regions) that will be denoted with $R_1, R_2, \ldots, R_{2^L}$.
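The recursive mean split can be sketched as follows. This is an illustrative version operating on a flat list of coefficients (for example, the samples of the approximation band, whose computation is omitted here); the function name and the labelling of regions by integers from 0 to 2^L - 1 are assumptions of this sketch.

```python
def smqt_labels(values, levels):
    """Assign each value a region label in [0, 2**levels) by recursive mean splitting."""
    labels = [0] * len(values)

    def split(indices, level):
        if level == levels or not indices:
            return
        mean = sum(values[i] for i in indices) / len(indices)
        low = [i for i in indices if values[i] <= mean]
        high = [i for i in indices if values[i] > mean]
        # The "above the mean" branch sets the bit associated with this level.
        bit = 1 << (levels - level - 1)
        for i in high:
            labels[i] += bit
        split(low, level + 1)
        split(high, level + 1)

    split(list(range(len(values))), 0)
    return labels

if __name__ == "__main__":
    band = [12, 200, 35, 90, 180, 64, 5, 150]
    print(smqt_labels(band, 3))
```

With three levels the labels take at most eight distinct values, matching the eight regions used in the experiments; samples sharing a label form one region.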

3.2 Random Walk on a Connected Graph

The fixation path is determined by suitably extracting points from these regions. To this aim, the observation process has been modeled as a Markov chain, i.e. a random walk on a connected weighted graph whose nodes are the $2^L$ regions $R_1, R_2, \ldots, R_{2^L}$, with weights $W_{ij} \ge 0$ on the edge joining node $i$ to node $j$. The graph is undirected, i.e. $W_{ij} = W_{ji}$, and $W_{ij} = 0$ if there is no edge joining node $i$ to node $j$.

Hence, given a point randomly extracted from the region $R_i$, the successive point in the walk is a random point in the region $R_j$, chosen among the nodes connected to $R_i$ with a probability

$$ P_{ij} = \frac{W_{ij}}{\sum_{i \sim k} W_{ik}} \qquad (1) $$

that is proportional to the weight $W_{ij}$. By denoting with $n_i$ the number of pixels in the region $R_i$, the weights are defined as follows

$$ W_{ij} = \begin{cases} [\ldots] & i = j \\ [\ldots] & i \neq j \end{cases} \qquad (2) $$

where $Z = n \sum_{i \sim k,\, k \neq i} [\ldots] \sum_{k=1} [\ldots]$ (the two cases of eq. (2) and the subscripts and summands of $Z$ are not recoverable from the extracted text). $W_{ij}$ takes into account the representativeness of the region $R_i$ in the image and also as neighbouring region of $R_j$. Even though a more refined definition of the weights could be used, this choice is simple but significant enough for our preliminary study.

The initial point of the walk is extracted on the basis of the stationary distribution of the process, as described in (Cover and Thomas, 1991). On the contrary, the last point of the walk is determined on the basis of the minimum description length principle (Grunwald, 2004), as shown in the sequel.
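The walk over regions can be sketched as below. The weight matrix `W` used in the example is a made-up symmetric matrix, not the paper's weight definition; the stationary distribution uses the standard result for random walks on undirected weighted graphs (node probability proportional to the total weight of its edges), and drawing a random pixel inside each visited region is left out. All function names are hypothetical.

```python
import random

def transition_matrix(W):
    """P[i][j] = W[i][j] / sum_k W[i][k]: move to a neighbour in proportion to the edge weight."""
    return [[w / sum(row) for w in row] for row in W]

def stationary_distribution(W):
    """For symmetric W: mu_i = sum_j W[i][j] / sum_{i,j} W[i][j]."""
    total = sum(sum(row) for row in W)
    return [sum(row) / total for row in W]

def random_walk(W, length, seed=0):
    """Sequence of region indices: start from the stationary distribution, then follow P."""
    rng = random.Random(seed)
    nodes = list(range(len(W)))
    P = transition_matrix(W)
    path = [rng.choices(nodes, weights=stationary_distribution(W))[0]]
    while len(path) < length:
        path.append(rng.choices(nodes, weights=P[path[-1]])[0])
    return path

if __name__ == "__main__":
    # Hypothetical weights for three regions; a zero entry means "no edge".
    W = [[1.0, 2.0, 0.0],
         [2.0, 1.0, 3.0],
         [0.0, 3.0, 2.0]]
    print(random_walk(W, 10))
```

Starting from the stationary distribution means that every step of the walk has the same marginal distribution over regions, so no position in the path is privileged.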

3.3 MDL for Blocks Number

This principle allows the selection of a good model for approximating the data with the least complexity. It is based on the concept that good compression means good approximation, in agreement with the definition of Kolmogorov complexity. Specifically, the simpler version of MDL, namely crude-MDL, selects a model from a set of candidates $M^{(1)}, M^{(2)}, \ldots$ by minimizing the following cost

$$ L(M^{(k)}) + L(X|M^{(k)}) \qquad (3) $$

where $L(M^{(k)})$ is the cost (in terms of bits) required for coding the model $M^{(k)}$, while $L(X|M^{(k)})$ is the number of bits required for coding the data $X$ given the model. In general, the better the model, the higher its cost but the smaller the approximation error. That is why the selection of the best model is a trade-off between complexity and good approximation. In our case the model $M^{(k)}$ is the fixation path containing the SSIM values of $k$ points whose average gives an approximation of the SSIM of the whole image. The data $X$ are the corresponding blocks in $I$ and $I_d$, centered at the selected pixels, that are involved in SSIM computation.

The cost is measured as entropy per element. More precisely, by indicating with $M_1, \ldots, M_k$ the values of SSIM computed in the first $k$ points selected during the random walk on the graph described above, and with $(b_1, b_2, \cdots, b_k)$ the blocks used for the evaluation of SSIM, the two terms of eq. (3) are obtained from the entropy $H(M_1, \ldots, M_k)$ of the SSIM values and from $H(b_1, \ldots, b_k) + 2\log_2(k) + 1$ (the normalization factors of both terms are not recoverable from the extracted text), where $H$ is the entropy, $w^2$ is the dimension of a block and $2\log_2(k) + 1$ is the cost for coding the integer $k$. By coding the blocks independently, $H(b_1, b_2, \cdots, b_k) = kH(b_i)$, $i = 1, 2, \ldots, k$, and by considering a compression ratio 8:1, eq. (3) can be rewritten as

$$ K = \arg\min_k \left\{ [\ldots]\, H(M_1, \ldots, M_k) \,[\ldots]\, k + 2\log_2(k) + 1 \,[\ldots] \right\} \qquad (4) $$

where $K$ gives the length of the optimal path, i.e. the length of a sequence in the visual distortion typical set.
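The selection of the path length can be sketched as a generic stopping rule: keep adding SSIM values while a description-length cost keeps decreasing, and stop at the first increase. Since the exact cost is not fully legible in this copy, the cost is passed in as a function; the helpers for the empirical entropy of quantized SSIM values (step 0.01, as in the experiments) and for the integer code length assume base-2 logarithms. All names are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(values, step=0.01):
    """Empirical entropy (bits) of the values after quantization with the given step."""
    counts = Counter(round(v / step) for v in values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def integer_code_length(k):
    """Bits assumed for coding the integer k: 2*log2(k) + 1."""
    return 2 * math.log2(k) + 1

def stop_at_first_increase(ssim_values, cost):
    """Return (K, estimate), with K = k - 1 at the first k where cost rises.

    `cost` maps the list of the first k SSIM values to a number; the
    estimate is the mean of the first K values.
    """
    prev = cost(ssim_values[:1])
    for k in range(2, len(ssim_values) + 1):
        cur = cost(ssim_values[:k])
        if cur > prev:
            K = k - 1
            return K, sum(ssim_values[:K]) / K
        prev = cur
    K = len(ssim_values)
    return K, sum(ssim_values) / K
```

Passing the cost as a parameter keeps the stopping logic separate from the particular description-length formula, which can then be swapped without touching the loop.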

3.4 Algorithm

1. Compute the wavelet approximation band $A_J$ at the $J$-th level of the image $I$

2. Apply $L$ levels of the SMQT transform to $A_J$ and extract the regions $R_1, R_2, \ldots, R_{2^L}$

3. Compute the cardinality $n_1, n_2, \ldots, n_{2^L}$ of the segmented regions and evaluate the weights of the graph as in eq. (2)

4. Extract a point from a region $R_i$ according to the stationary distribution of the graph defined by eq. (2)

5. Compute $M_1$, i.e. SSIM on a block of dimension $w \times w$ centered at the selected point, and set $k = 2$

6. Extract a point in the region $R_j$ selected according to the probability $P_{ij}$ defined in eq. (1)

7. Compute $M_k$, i.e. SSIM on a block of dimension $w \times w$ centered at the selected point

8. Evaluate the argument of eq. (4) and assign its value to the variable $L_k$

9. If $L_k > L_{k-1}$, set $K = k - 1$ and $\hat{M} = \frac{1}{K}\sum_{k=1}^{K} M_k$ and stop; otherwise set $k = k + 1$ and go to step 6.

$\hat{M}$ is the approximation for SSIM given by the model, while $K$ is the number of blocks used for getting it.

3.5 Model’s Complexity

By denoting with $C_{\log}$ the cost for the calculation of the logarithm of a number, with $N$ the image size and with $|\chi|$ the cardinality of the alphabet of SSIM, it is possible to prove that the proposed algorithm requires

$$ \big[\ldots (1 - \ldots) + 2L + 1\big] N - 2^{\ldots} - L + 1 + 2^{\ldots} + \big[\ldots + 30 + \ldots_{\log} + \log|\chi|\big] K + (4 + C_{\log}) $$

operations, i.e. multiplications, algebraic sums, divisions and comparisons (the terms replaced by dots are not recoverable from the extracted text), while the computation of SSIM using all image pixels requires

$$ (8w^2 + 18)N $$

operations. Hence, by comparing the number of operations given above, it is possible to determine the maximum value for $K$ which gives a gain in the computation of SSIM. This value depends on the parameters of the proposed method and the image size.
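As a worked example of the full-image count, the sketch below evaluates it for the image size and block size used in the experiments, reading the expression as (8w^2 + 18)N, and derives the operation budget implied by a target speed-up. The cost of the proposed algorithm itself is not reproduced, because its expression is not fully legible here; the function names are hypothetical.

```python
def full_ssim_ops(n_pixels, w):
    """Operations for SSIM on every pixel, assuming the count (8*w^2 + 18) * N."""
    return (8 * w * w + 18) * n_pixels

def ops_budget_for_speedup(n_pixels, w, speedup):
    """Largest total operation count compatible with the requested speed-up."""
    return full_ssim_ops(n_pixels, w) / speedup

if __name__ == "__main__":
    n_pixels = 512 * 384          # Ocean image size used in the experiments
    w = 17                        # block side used in the experiments
    print(full_ssim_ops(n_pixels, w))                 # 458096640
    print(ops_budget_for_speedup(n_pixels, w, 200))   # about 2.29 million
    print((8 * w * w + 18) * 50)                      # 116500: SSIM on 50 blocks only
```

Under this reading, evaluating SSIM on 50 blocks takes only a small share of the budget for a 200:1 speed-up, so most of the remaining cost would come from the segmentation and path-selection steps.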

4 EXPERIMENTAL RESULTS

The proposed method has been tested on several images affected by different distortion kinds and levels. In this paper we give some results obtained from natural images extracted from the TID2013 database (Ponomarenko et al., 2015) affected by global distortions such as additive and multiplicative Gaussian noise, high frequency noise, Gaussian blurring, JPEG and JPEG 2000 compression, mean shift and contrast change. For each distortion, four levels have been considered. In all tests the following parameters have been used. The level $J$ of the wavelet transform has been set equal to 3 and a Daubechies wavelet with 2 vanishing moments has been adopted; the number of levels $L$ of the SMQT has been fixed to 3 in order to have 8 regions; the dimension of the blocks for SSIM computation has been fixed to 17 × 17, since it corresponds to a visual angle equal to 0.56 degrees (Monte et al., 2005), although smaller dimensions provide similar results; the cardinality of the alphabet for SSIM has been set equal to 200, which corresponds to a quantization step equal to 0.01. Table 1 provides the results achieved on the 512 × 384 Ocean image (image I16 in TID2013). It is worth noting that each run of the proposed algorithm provides a different sequence in the visual distortion typical set of the image. That is why the average value of the SSIM estimations obtained over 30 runs of the algorithm is given in Table 1. The same table includes the standard deviation of the estimation as well as the average number of blocks used for computing it and the corresponding standard deviation. As can be observed, the estimation error increases as the distortion level increases but it does not exceed 8% and the standard deviation is quite small. For some distortion kinds, like Gaussian noise and Gaussian blur, this percentage is less than 5%, and for distortions like mean shift and contrast change it does not exceed 1.2%. The average number of blocks is less than 50, which means that the number of operations required for the computation of SSIM of the Ocean image is reduced by a factor of about 200. It is worth noting that similar results have been obtained for the other images in the database; for some of them the average number of blocks is smaller than 50, while the average estimation error is still less than 8%. It is also worth stressing that the proposed procedure does not involve an exhaustive search of points of interest, as required by the contrast-based procedure in (Raj et al., 2005).

Figure 2 shows the segmentation used for the Ocean image. As can be observed, the segmentation is not far from the one given by the SSIM map except for the edges. This is due to the fact that the criterion used for the segmentation is based just on the luminance and that a region based segmentation has been employed. Nonetheless, the optimal point selected by the MDL principle on the entropy curve corresponds to a good value of SSIM, providing acceptable estimation errors. The same figure shows the blocks belonging to the selected fixation path. As can be observed, more blocks are selected in regions where blurring is more visible.

Table 1: Ocean image (I16 in the TID2013 database). SSIM ($\bar{M}$), estimated SSIM ($\hat{M}$) using the proposed method, mean value of the estimation error (%) over 30 runs ($\varepsilon$), standard deviation of the estimation error ($\sigma_\varepsilon$), mean value of the number of blocks used ($\bar{K}$), standard deviation of the number of blocks ($\sigma_K$).

Distortion                      Level   SSIM     Est. SSIM   Error (%)   Error std   Blocks (mean)   Blocks std
High frequency noise            2       0.8661   0.8672      1.1222      0.0113      48.00           2.92
                                3       0.7034   0.7086      2.6788      0.0237      48.87           4.54
                                4       0.4829   0.4842      4.9297      0.0275      48.70           3.51
                                5       0.2722   0.2627      8.9767      0.0284      47.80           4.43
Gaussian noise                  2       0.8673   0.8728      1.4447      0.0132      45.67           3.87
                                3       0.7781   0.7821      2.1462      0.0217      48.50           4.55
                                4       0.6614   0.6607      3.4739      0.0308      48.87           3.50
                                5       0.5276   0.5272      4.3112      0.0260      49.30           3.37
Gaussian blur                   2       0.9513   0.9542      0.6528      0.0069      37.37           6.15
                                3       0.8805   0.8828      1.4811      0.0159      44.47           5.10
                                4       0.7925   0.8045      2.9532      0.0260      46.00           6.59
                                5       0.7012   0.7128      4.1474      0.0346      48.40           4.67
JPEG compression                2       0.9451   0.9448      0.5260      0.0061      41.87           5.17
                                3       0.8891   0.8895      0.8761      0.0097      47.33           3.94
                                4       0.7578   0.7518      2.1654      0.0198      48.33           3.56
                                5       0.6320   0.6257      4.1073      0.0311      49.37           4.10
JPEG2K compression              2       0.8516   0.8553      1.9288      0.0183      46.73           3.59
                                3       0.6942   0.6939      4.0191      0.0326      49.43           4.35
                                4       0.5394   0.5529      6.4984      0.0395      47.73           3.55
                                5       0.4799   0.4827      7.6406      0.0426      49.53           2.83
Mean shift                      2       0.9951   0.9950      0.1259      0.0015      21.00           5.52
                                3       0.9778   0.9779      0.2198      0.0028      34.37           6.13
                                4       0.9620   0.9644      0.5506      0.0059      31.27           7.36
                                5       0.8929   0.8930      1.0167      0.0111      45.53           4.57
Contrast change                 2       0.9829   0.9832      0.3674      0.0042      29.23           5.70
                                3       0.9713   0.9711      0.1596      0.0019      33.53           5.51
                                4       0.9349   0.9392      0.8867      0.0086      40.20           5.70
                                5       0.8726   0.8749      0.7490      0.0081      44.60           3.28
Multiplicative Gaussian noise   2       0.8594   0.8697      2.3503      0.0220      45.80           5.25
                                3       0.7730   0.7851      3.3095      0.0286      47.40           4.33
                                4       0.6615   0.6627      4.0346      0.0327      48.60           3.75
                                5       0.5376   0.5346      5.0814      0.0322      50.77           3.18

5 CONCLUSIONS AND FUTURE RESEARCH

This paper has presented a method for the estimation of the Structural SIMilarity index from a reduced number of suitably selected image pixels. It models the observation process in the preattentive phase as a random walk on a graph whose nodes are image regions having distinct visual characteristics and whose edges are weighted accounting for the representativeness of the region in the whole image and also in the neighborhood of proximal regions. The length of the sequence is automatically determined for each image by means of the minimum description length principle, which selects the number of blocks able to guarantee a good trade-off between small estimation error and reduced computational complexity. The proposed method makes some assumptions on the class of analysed images (natural images) and distortion kind (global distortion); in addition, it uses some simple criteria and fixed parameters in the segmentation step. Nonetheless, even in its simplest form, the results are satisfying and promising. Very few blocks provide SSIM estimation with errors less than 8%; this worst case is reached in very particular cases. Future research will be devoted to the use of more refined criteria in the segmentation process and to making the choice of the parameters involved in the segmentation step (resolution of the wavelet transform, number of regions of the image partition) adaptive and automatic. Furthermore, some dependency on region content will be introduced in the definition of the edge weights of the graph that is used for defining the fixation path. In fact, such an approach may also allow: i) to improve the design of existing QA measures, ii) to design novel and possibly more precise QA measures, iii) to build novel HVS based regularization functions, iv) to add some novel elements to Visual Information Theory with possible effects on the definition of new visual image coding schemes.

Figure 2: First row: original Ocean image (left); blurred image (middle); SSIM value estimated for an increasing number of blocks, with the optimal point marked (right). Second row: SSIM map (left); segmentation provided by the SMQT (middle); entropy per sample used in the MDL based procedure, with the optimal point marked (right). Last row: selected image blocks.

ACKNOWLEDGEMENTS

The Authors would like to thank Simone Guarracino

for the development of part of the Matlab code of the

proposed method.

REFERENCES

Benabdelkader, S. and Boulemden, M. (2005). Recursive

algorithm based on fuzzy 2-partition entropy for 2-

level image thresholding. In Pattern Recognition. El-

sevier.

Bruni, V., Crawford, A., Kokaram, A., and Vitulano, D.

(2013a). Semi-transparent blotches removal from

sepia images exploiting visibility laws. In Signal Im-

age and Video Processing, 7(1), 11-26.

Bruni, V., Rossi, E., and Vitulano, D. (2012). On the equivalence between Jensen-Shannon divergence and Michelson contrast. In IEEE Trans. on Information Theory, Vol. 58, No. 7. IEEE.

Bruni, V., Rossi, E., and Vitulano, D. (2013b). Jensen-

shannon divergence for visual quality assessment. In

Signal Image and Video Processing, Vol. 7, No. 3.

Springer.

Bruni, V. and Vitulano, D. (2014). A fast computation

method for iqa metrics based on their typical set. In

Proc. of ICPRAM 2014.

Bruni, V., Vitulano, D., and Ramponi, G. (2011). Image

quality assessment through a subset of the image data.

In Proc. of ISPA 2011. IEEE.

Cover, T. M. and Thomas, J. A. (1991). Elements of Information Theory. John Wiley & Sons.

Ferzli, R. and Karam, L. J. (2009). A no-reference objective

image sharpness metric based on the notion of just no-

ticeable blur (jnb). In IEEE Trans. Image Processing,

Vol. 18, No. 4. IEEE.

Frazor, R. and Geisler, W. (2006). Local luminance and contrast in natural images. In Vision Research, 46.

Grunwald, P. D. (2004). A tutorial introduction to the min-

imum description length principle. In Advances in

Minimum Description Length: Theory and Applica-

tions. Myung Grunwald, Pitt.

Hontsch, I. and Karam, L. (2002). Adaptive image coding

with perceptual distortion control. In IEEE Trans. on

Image Processing. IEEE.

Hou, Z. and Yau, W. (2010). Visible entropy: A measure

for image visibility. In Proc. of ICPR.

Jourlin, M. and Pinoli, J. C. (1998). A model for logarithmic

image processing. In J. Microsc., Vol. 149.

Lee, H. and Lee, S. (2006). Visual entropy gain for wavelet

image coding. In IEEE Sig. Proc. Letters. IEEE.


Mallat, S. (1998). A wavelet tour of signal processing. Aca-

demic Press.

Monte, V., Frazor, R., Bonin, V., Geisler, W., and Corandin, M. (2005). Independence of luminance and contrast in natural scenes and in the early visual system. In Nature Neuroscience, 8(12).

Moorthy, A. and Bovik, A. (2009). Visual importance pool-

ing for image quality assessment. In IEEE Journal on

Special Topics in Sig. Proc., 3(2).

Nilsson, M., Dahl, M., and Claesson, I. (2005). The suc-

cessive mean quantization transform. In Proc. of

ICASSP05.

Panetta, K. A., Wharton, E. J., and Agaian, S. S. (2008).

Human visual system-based image enhancement and

logarithmic contrast measure. In IEEE Transaction on

Systems, Man, and Cybernetics-Part B, Vol. 38, No. 1.

IEEE.

Park, J., Seshadrinathan, K., Lee, S., and Bovik, A. C. (2011). Spatio-temporal quality pooling accounting for transient severe impairments and egomotion. In Proc. of ICIP 2011. IEEE.

Ponomarenko, N., Jin, L., Ieremeiev, O., Lukin, V., Egiazar-

ian, K., Astola, J., Vozel, B., Chehdi, K., Carli, M.,

Battisti, F., and Kuo, C. J. (2015). Image database

tid2013. In Image Communication, Vol. 30. Elsevier

Science Inc.

Raj, R., Geisler, W., Frazor, R., and Bovik, A. (2005). Con-

trast statistics for foveated visual systems: ﬁxation se-

lection by minimizing contrast entropy. In J Opt Soc

Am A, Vol. 20, No. 10. Opt Image Sci Vis.

Sheikh, H. R., Bovik, A. C., and Veciana, G. D. (2005). An

information ﬁdelity criterion for image quality assess-

ment using natural scene statistics. In IEEE Trans. on

Image Proc., Vol. 14, No. 12. IEEE.

Wang, W., Wang, Y., Huang, Q., and Gao, W. (2010). Measuring visual saliency by site entropy rate. In Proc. of CVPR 2010. IEEE.

Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P.

(2004). Image quality assessment: From error visibil-

ity to structural similarity. In IEEE Trans. on Image

Proc., Vol. 13, No. 4. IEEE.

Wang, Z. and Simoncelli, E. P. (2005). Reduced-reference image quality assessment using a wavelet-domain natural image statistic model. In Proc. of SPIE Human Vision and Electronic Imaging X, vol. 5666. SPIE.

Wang, Z. and Li, Q. (2011). Information content weight-

ing for perceptual image quality assessment. In IEEE

Trans. on Image Proc., Vol. 20, No. 5. IEEE.
