assessing the quality of the whole image. With re-
spect to the literature concerning the selection of the
best pooling weights for an image quality assessment
measure (Moorthy and Bovik, 2009; Park et al., 2011;
Wang and Li, 2011), the proposed method can also
be seen as a binary pooling, preserving some blocks
while discarding the others.
The proposed method consists of the following
main steps:
1. Luminance based image segmentation: the image
is split into a finite number of distinct regions hav-
ing different characteristics;
2. Finite random walk on a connected and weighted
graph whose nodes are the regions given by the
segmentation. This step provides a sequence of
points belonging to the typical set of length K.
K is automatically determined for each image us-
ing the Minimum Description Length principle
(MDL) (Grunwald, 2004).
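As a rough illustration of step 2, the sketch below performs a finite random walk of length K on a small weighted graph. The weight matrix `W`, the function name `random_walk` and the rule that transition probabilities are proportional to edge weights are assumptions made for the example; the actual weights and the MDL-based choice of K are defined later in the paper and are not reproduced here.

```python
import numpy as np

def random_walk(weights, k, start=0, seed=0):
    """Return k node indices visited by a random walk on a weighted graph."""
    w = np.asarray(weights, dtype=float)
    # Assumption: the probability of moving to a neighbour is
    # proportional to the weight of the connecting edge.
    p = w / w.sum(axis=1, keepdims=True)
    rng = np.random.default_rng(seed)
    path = [start]
    for _ in range(k - 1):
        path.append(int(rng.choice(len(p), p=p[path[-1]])))
    return path

# Hypothetical graph with 4 regions: symmetric weights, no self-loops.
W = np.array([[0, 2, 1, 0],
              [2, 0, 1, 1],
              [1, 1, 0, 3],
              [0, 1, 3, 0]])
path = random_walk(W, k=10)
```

Each entry of `path` is the index of a region; in the proposed method one point per visited region would then be used as a block center.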
It will be shown that the mean value of the SSIM evaluated
on blocks centered at these points gives a faithful
estimation of SSIM of the whole image with a
considerable computational saving. Experimental results
on test images from TID2013 database (Ponomarenko
et al., 2015) show that it is possible to reach a speed
up for SSIM, evaluated for different distortion levels,
over 200:1 with a relative estimation error lower than
8%.
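The idea of averaging SSIM over a few blocks can be sketched as follows. This is a minimal example under stated assumptions: the block centers are drawn uniformly at random (a stand-in for the random walk of step 2, so blocks may overlap), SSIM is computed from whole-block statistics without Gaussian windowing, and the images, the block half-size `half` and the value of `K` are synthetic choices made only for illustration.

```python
import numpy as np

def block_ssim(x, y, data_range=255.0):
    """SSIM of two equally sized blocks, using whole-block statistics."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def estimate_ssim(ref, deg, centers, half=4):
    """Mean block SSIM over the given (row, col) block centers."""
    vals = []
    for r, c in centers:
        sl = (slice(r - half, r + half + 1), slice(c - half, c + half + 1))
        vals.append(block_ssim(ref[sl], deg[sl]))
    return float(np.mean(vals))

rng = np.random.default_rng(0)
ref = rng.uniform(0, 255, size=(128, 128))             # synthetic reference
deg = np.clip(ref + rng.normal(0, 20, ref.shape), 0, 255)  # noisy copy
K = 16
# Assumption: uniformly random centers, kept away from the border.
centers = rng.integers(4, 124, size=(K, 2))
est = estimate_ssim(ref, deg, centers)
```

Only K blocks of 9x9 pixels are visited here instead of every pixel of the image, which is where the computational saving comes from.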
The outline of the paper is the following. The next
Section gives some preliminary results on the visual
distortion typical set. Section 3 presents a method
for determining a sequence of points belonging to
this set; details about the algorithm and its
computational cost will also be given. Section 4 presents some
experimental results obtained on the TID2013 database,
while the last section draws the conclusions.
2 SOME PRELIMINARY RESULTS
In (Bruni and Vitulano, 2014), the visual distortion
typical set $A^{\epsilon}_{M}$ has been defined as a subset of all
sequences composed of samples of the original image $I$
(and the corresponding ones in the degraded image $I_d$)
such that they give an approximated value $\hat{M}$ of
the expected value of the measure $M$ (i.e. $\bar{M}$) within
an error $\epsilon$, i.e.: $|\hat{M} - \bar{M}| < \epsilon$, where $M$ is the
reference quality measure. In our case, $M$ is the pointwise
SSIM, $\bar{M}$ is the mean of $M$ computed using all image
pixels, while $\hat{M}$ is the mean of $M$ computed on a
reduced number of image pixels. More formally, $A^{\epsilon}_{M}$
is the set of sequences of fixed size whose entropy
is close to the entropy of the source. The existence
of $A^{\epsilon}_{M}$ is guaranteed by the Asymptotic Equipartition
Property (AEP) (Cover and Thomas, 1991), which states
that for i.i.d. r.v.s $X_i$ it holds:
$$\frac{1}{n} \log \frac{1}{p(X_1, X_2, \ldots, X_n)} \to H(X), \quad n \to \infty.$$
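The AEP can be checked numerically on a simple source. The sketch below assumes an i.i.d. Bernoulli source with parameter `p = 0.3` and base-2 logarithms; both choices, as well as the function name `aep_gap`, are illustrative and do not come from the cited works.

```python
import numpy as np

def aep_gap(p=0.3, n=100_000, seed=0):
    """Compare -(1/n) log2 p(X_1..X_n) with H(X) for i.i.d. Bernoulli(p)."""
    rng = np.random.default_rng(seed)
    x = rng.random(n) < p                 # n i.i.d. Bernoulli(p) samples
    ones = int(x.sum())
    # log2 of the joint probability of the observed sequence
    log_prob = ones * np.log2(p) + (n - ones) * np.log2(1 - p)
    sample_entropy = -log_prob / n
    entropy = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return sample_entropy, entropy

sample_entropy, entropy = aep_gap()
```

For large n the two returned values should be close, which is the statement of the AEP for this source.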
AEP is the entropic version of the weak law of
large numbers. However, the entropy based version
is more mathematically tractable as entropy increases
as the number of samples grows (Cover and Thomas,
1991), while it is not so for the mean value. Based
on these concepts, in (Bruni and Vitulano, 2014) the
authors gave some guidelines for an optimized extraction
of the visual distortion typical set from the pair
of images $(I, I_d)$. Specifically, it has been formally
proved that:
1. Not all information in $I$ and $I_d$ is really important;
it is sufficient to select just a part of it for assessing
image quality. In addition, an entropy based
criterion can be applied for selecting the significant
information. Specifically, the following result
has been proved:
Proposition 1. Let $X \sim Q$ with a positive and
numerical alphabet $\chi$ and $\{X_1\} \sim p_1$, $\{X_1, X_2\} \sim p_2$,
..., $\{X_1, X_2, \ldots, X_n\} \sim p_n$. Let $\mu_n$ be the mean
of $p_n$, $\mu$ be the mean of $Q$ and $D_{KL}$ the
Kullback-Leibler divergence. Then
(a) the sequence $\{\mu_n\}$ is not monotonic for increasing $n$;
(b) $|\mu_n - \mu|^2 \le 2 M_n D_{KL}(p_n \| Q)$ $\forall n$, with $M_n = \max_{x \in \chi} x$.
This Proposition, along with the known results on
the monotonicity of the entropy per element of a
stationary stochastic process (Cover and Thomas,
1991), supports the use of entropy as the fundamental
measure for selecting sequences belonging to the
visual distortion typical set.
2. In the construction of the sequence of interest, it
is more convenient to select non-overlapping local
regions (for instance, blocks) as samples of $I$ (and
$I_d$) rather than to randomly select isolated pixels.
3. It is more convenient to extract significant
information from $M$ rather than from the pair of
images $I$ and $I_d$.
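Point (a) of Proposition 1 can be illustrated with a toy sequence. The sketch below assumes that $p_n$ is read as the empirical distribution of the first n samples, so that $\mu_n$ is the running mean; this reading and the sample values are assumptions made for the example, not statements taken from (Bruni and Vitulano, 2014).

```python
def running_means(samples):
    """Return [mu_1, mu_2, ..., mu_n] for the given samples."""
    means, total = [], 0.0
    for n, x in enumerate(samples, start=1):
        total += x
        means.append(total / n)
    return means

def is_monotonic(seq):
    """True if seq is entirely non-decreasing or entirely non-increasing."""
    pairs = list(zip(seq, seq[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

samples = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical positive-valued samples
mu_n = running_means(samples)
```

The running mean moves up and down as samples are added, so it gives no natural stopping rule; this is the motivation for preferring an entropy based criterion.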
What was missing in (Bruni and Vitulano, 2014) is a
constructive method for determining the typical set:
only its existence, along with some criteria and
guidelines for its best search, was established. That is
why, in the sequel, we will answer the following
question:
4. How to find a sequence belonging to $A^{\epsilon}_{M}$ using a
fast procedure.