SUBJECTIVE VERIFICATION OF PERCEPTUAL METRICS FOR IMAGE WATERMARKING FIDELITY
Franco Del Colle and Juan Carlos Gómez
Laboratory for System Dynamics and Signal Processing, FCEIA, Universidad Nacional de Rosario, Argentina
CIFASIS, CONICET
Keywords:
Image Digital Watermarking, Discrete Wavelet Transform, Perceptual Distortion Metrics.
Abstract:
In this paper, the performance of several state-of-the-art watermark perceptual transparency metrics is evaluated through subjective assessment. Simulation results show that a metric based on S-CIELAB distortion maps is better correlated with the subjective tests than other objective metrics available in the literature. The paper focuses on Image Adaptive Watermarking methods in the Discrete Wavelet Transform domain, since they yield better results regarding robustness and transparency than other watermarking schemes.
1 INTRODUCTION
Digital Watermarking has become the most efficient
and widely used technique addressing the issue of
digital data protection. The idea is to imperceptibly embed information (the watermark) into the original data in such a way that it always remains present and detectable. A set of requirements should be met by any watermarking technique (Barni and Bartolini, 2004). The main requirements are perceptual transparency, payload of the watermark and robustness. Perceptual transparency refers to the property of the watermark of being imperceptible, in the sense that humans cannot distinguish the watermarked images from the original ones by simple inspection. Payload of the
watermark refers to the amount of information stored
in the watermark, which in general depends on the ap-
plication. Finally, robustness refers to the capacity of
the watermark to remain detectable after alterations
due to processing techniques or intentional attacks.
Good overviews on the state of the art of classi-
cal watermarking techniques can be found in the re-
cent textbooks (Barni and Bartolini, 2004) and (Cox
et al., 2002), and in (Langelaar et al., 2000), (Petitco-
las, 2000) and the references therein.
Among the different approaches that have been
proposed in the literature for the watermarking of still
images, the ones in the transform domain which are
adapted to the particular image have proved to deliver
better results regarding transparency and robustness.
In these methods the length, location and amplitude of the watermark are adapted to the image characteristics (Barni et al., 2001; Podilchuk and Zeng, 1998). This paper will focus on Image Adaptive
Discrete Wavelet Transform (IADWT) domain water-
marking techniques.
In this paper, several perceptual metrics for water-
mark image fidelity evaluation are validated through
subjective tests. In particular, the perceptual metrics introduced in (Le Callet and Barba, 2003), in (Del Colle and Gómez, 2008) and in (Wang et al., 2004) are considered. Simulation results show that the perceptual metric in (Del Colle and Gómez, 2008) outperforms the other metrics regarding correlation to the subjective tests.
2 IADWT WATERMARKING
In this paper, the watermark embedding scheme in the
DWT domain in (Podilchuk and Zeng, 1998) is con-
sidered. Here, the watermark is modulated by the Just
Noticeable Differences (JND) thresholds, and the co-
efficients are marked whenever they are greater than
the JND threshold, i.e.
\[
\hat{X}^{w}(u,v) =
\begin{cases}
\hat{X}(u,v) + J(u,v)\,w() & \text{if } \hat{X}(u,v) > J(u,v) \\
\hat{X}(u,v) & \text{otherwise}
\end{cases}
\tag{1}
\]
where $\hat{X}(u,v)$ and $\hat{X}^{w}(u,v)$ are the DWT coefficients of the original image and the watermarked image respectively, and J(u, v) is the JND matrix at the u, v frequency in the DWT domain. In this scheme, the watermark sequence w() is generated from a zero mean, unit variance, normally distributed random sequence. In this way, the watermark sequence weighted by the JND thresholds has lower power than the maximum power that can be inserted without causing noticeable distortions in the image.

Del Colle F. and Carlos Gómez J. (2008). SUBJECTIVE VERIFICATION OF PERCEPTUAL METRICS FOR IMAGE WATERMARKING FIDELITY. In Proceedings of the International Conference on Signal Processing and Multimedia Applications, pages 397-400. DOI: 10.5220/0001940903970400. Copyright © SciTePress.

The JND thresholds are computed based on a per-
ceptual model of the Human Visual System (HVS). A
widely used perceptual model is the one introduced
in (Watson et al., 1997), which takes into account frequency sensitivity, local luminance and contrast masking effects to determine an image-dependent quantization matrix. This matrix provides the maximum quantization error in the DWT coefficients that is not perceptible by the HVS.
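The insertion rule in eq. (1) can be illustrated with a minimal sketch. It assumes that the DWT coefficients and the JND thresholds are already available as arrays of the same shape (computing the JND thresholds from the perceptual model is not shown), that one normally distributed sample is drawn per marked coefficient, and that the comparison is applied to the signed coefficient exactly as the equation is written. The function name `embed_iadwt` and the toy values are hypothetical.

```python
import numpy as np

def embed_iadwt(coeffs, jnd, seed=0):
    """Mark the coefficients that exceed their JND threshold, following eq. (1)."""
    coeffs = np.asarray(coeffs, dtype=float)
    jnd = np.asarray(jnd, dtype=float)
    rng = np.random.default_rng(seed)

    # Only coefficients greater than the JND threshold carry the watermark.
    mask = coeffs > jnd

    # Zero mean, unit variance, normally distributed watermark samples.
    w = rng.standard_normal(int(mask.sum()))

    marked = coeffs.copy()
    marked[mask] += jnd[mask] * w  # watermark modulated by the JND threshold
    return marked, mask

# Toy data (hypothetical values, not taken from the paper).
coeffs = np.array([[20.0, 3.0], [-15.0, 8.0]])
jnd = np.array([[5.0, 4.0], [5.0, 2.0]])
marked, mask = embed_iadwt(coeffs, jnd)
```

Coefficients that do not satisfy the condition are returned unchanged, so the watermark length equals the number of `True` entries in `mask`.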
The following modification to the IADWT inser-
tion scheme in (1) can be introduced
\[
\hat{X}^{w}(u,v) =
\begin{cases}
\hat{X}(u,v) + J(u,v)\,w() & \text{if } \hat{X}(u,v) > J(u,v) > T \\
\hat{X}(u,v) & \text{otherwise}
\end{cases}
\tag{2}
\]
This modified insertion scheme will be hereafter denoted as IADWT_T. The rationale for the constraint J(u, v) > T is that when the JND thresholds are too small, the magnitude of the marking term in (2) becomes negligible. The lower bound T therefore reduces the watermark length and, in this way, improves the fidelity. Through simulation trials, a value of T = 12 proved to be the most suitable for all tested images.
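The effect of the lower bound T on the watermark length can be sketched as follows. This is an illustrative example under the same assumptions as the rule in eq. (2): coefficient and JND arrays are given, and a coefficient is marked only when it exceeds its JND threshold and that threshold exceeds T. The function name `embed_iadwt_t` and the toy values are hypothetical; only the default T = 12 comes from the text.

```python
import numpy as np

def embed_iadwt_t(coeffs, jnd, T=12.0, seed=0):
    """Mark coefficients only where coeff > JND and JND > T, following eq. (2)."""
    coeffs = np.asarray(coeffs, dtype=float)
    jnd = np.asarray(jnd, dtype=float)
    rng = np.random.default_rng(seed)

    # Both conditions of eq. (2) must hold for a coefficient to be marked.
    mask = (coeffs > jnd) & (jnd > T)

    marked = coeffs.copy()
    marked[mask] += jnd[mask] * rng.standard_normal(int(mask.sum()))
    return marked, mask

# Toy data (hypothetical values, not taken from the paper).
coeffs = np.array([30.0, 30.0, 10.0, 50.0])
jnd = np.array([5.0, 15.0, 20.0, 13.0])

# With T = -inf the second condition is always true, which reduces to eq. (1).
_, mask_plain = embed_iadwt_t(coeffs, jnd, T=-np.inf)
_, mask_t = embed_iadwt_t(coeffs, jnd, T=12.0)

print(int(mask_plain.sum()), int(mask_t.sum()))  # prints "3 2"
```

In this toy case the coefficient whose JND threshold is 5 is no longer marked, so the watermark is one sample shorter.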
3 FIDELITY ASSESSMENT
In the evaluation of image watermarking methods it
is of interest to judge the fidelity of the watermarked
image. Basically, the fidelity is a measure of the similarity between the images before and after the watermark insertion. The natural way to assess fidelity is to run a subjective test where observers are asked to rank the distortion of the images on a given scale. This type of evaluation involves a large number of individuals in order for the results to be statistically significant, and demands considerable time.
As an alternative, an objective assessment based on a metric that quantifies the watermarked image fidelity can be performed, since it is less time consuming and does not require the involvement of human beings. However, this objective assessment is usually validated with a subjective test. Several metrics have been proposed in the literature to quantify image quality, see for instance (Winkler, 2005) and the references therein. The most successful ones are those that take into account the perceptual characteristics of the HVS. These techniques could eventually be used to quantify watermark fidelity.
3.1 Subjective Assessment
As pointed out before, the straightforward way to assess the fidelity of watermarked images is to run a subjective test. There are standardized techniques to perform subjective tests for general image quality assessment. For instance, the Recommendation ITU-R BT.500-11 (ITU, 2002) specifies a methodology for the subjective assessment of still image quality. On the other hand, no standards are available for subjective assessment of watermarked image quality. Since watermarked images can be considered as the result of some processing operations (the watermark embedding algorithms) applied to the original image, these general subjective quality assessment techniques could be applied to watermarked images. In this paper, the Double Stimulus Impairment Scale (DSIS) protocol, described in (ITU, 2002), is used. This protocol has also been used by Marini and coauthors in (Marini et al., 2007) in the same context.
The experiment was carried out in a room de-
signed according to the recommendation ITU-R
BT.500-11 (ITU, 2002). Fourteen observers were en-
rolled to do the test and fifteen different natural im-
ages were watermarked using the two IADWT algo-
rithms described in section 2. This resulted in 20 min
sessions where observers were asked to rate 30 im-
ages at an observation distance of six times the dis-
play size of the images. The original and the wa-
termarked images were displayed side by side on the
monitor and the observers were asked to rate the qual-
ity of the marked image compared to that of the orig-
inal on a scale of five categories, namely 5=Imper-
ceptible, 4=Perceptible but not annoying, 3=Slightly
annoying, 2=Annoying, and 1=Very annoying. The
results of these experiments are included in section 4.
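As an illustration of how the ratings of one image can be summarized, the sketch below computes the mean opinion score and a confidence interval around it. The paper does not state how its intervals were computed, so the normal approximation used here (mean plus or minus z times the standard error) is an assumption, and the fourteen ratings are hypothetical. The 97.5 % level is the one used in section 4.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mos_with_ci(scores, confidence=0.975):
    """Return (MOS, lower bound, upper bound) for one image's ratings.

    Assumes a two-sided interval based on the normal approximation.
    """
    n = len(scores)
    mos = mean(scores)
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * stdev(scores) / sqrt(n)
    return mos, mos - half_width, mos + half_width

# Fourteen hypothetical ratings on the five-category impairment scale.
scores = [5, 4, 5, 4, 4, 5, 3, 4, 5, 4, 4, 5, 4, 4]
mos, lower, upper = mos_with_ci(scores)
```

With few observers, a Student t quantile would give a slightly wider interval than the normal approximation.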
3.2 Objective Assessment
To avoid the dependence on human judgement, the
objective assessment of watermarked image fidelity
using a metric that takes into account the character-
istics of the HVS is desirable. Several perceptual
metrics have been proposed to quantify image quality. The S-CIELAB based metric introduced by the present authors in (Del Colle and Gómez, 2008) will be briefly described and compared to the Komparator metric introduced in (Le Callet and Barba, 2003) and the SSIM metric introduced in (Wang et al., 2004). All of them take into account the different sensitivities of the human eye for color discrimination, contrast masking and texture masking.
The S-CIELAB (Zhang, 1996) metric is an extension of the CIELAB metric (CIE, 1971) which incorporates the different spatial sensitivities of the three opponent color channels by adding a spatial pre-processing step before the standard CIELAB $\Delta E$ calculation. As a result, an S-CIELAB $\Delta E_{94}$ distortion map is obtained, indicating where the visible distortions are in the image and how large these distortions are.
Due to the spatial distribution of the S-CIELAB $\Delta E_{94}$ errors in the distortion maps, it is difficult to make a comparison with other metrics. To provide a single parameter quantifying the fidelity, a pooling of the S-CIELAB $\Delta E_{94}$ errors is proposed by defining the following fidelity factor:
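\[
F \triangleq \left( 1 - \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} \left( S\Delta E_{94}(i,j)\, \mathrm{Mask}(i,j) \right)}{\sum_{i=1}^{M} \sum_{j=1}^{N} \sqrt{X_L(i,j)^2 + X_a(i,j)^2 + X_b(i,j)^2}} \right) \times 100
\tag{3}
\]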
where $S\Delta E_{94}$ is a matrix with the values of the S-CIELAB $\Delta E_{94}$ errors for each pixel, i.e. the image distortion map, Mask is a mask with ones in the positions where the S-CIELAB $\Delta E_{94}$ errors are above the threshold and zeros otherwise, and $X_L$, $X_a$ and $X_b$ are the image components in the Lab color space. Values of F close to 100 % indicate that no perceptible distortion is present in the watermarked image.
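A minimal sketch of the pooling step is given below. It assumes that the S-CIELAB distortion map has already been computed by some external tool and is passed in as an array, and it takes the fidelity factor to be F = (1 - sum(SΔE94 · Mask) / sum(sqrt(X_L² + X_a² + X_b²))) × 100. The square root in the denominator, the strict comparison used to build the mask, and the threshold value are assumptions, as are the function name `fidelity_factor` and the toy data.

```python
import numpy as np

def fidelity_factor(delta_e_map, lab_image, threshold):
    """Pool a per-pixel distortion map into a single fidelity value in percent.

    delta_e_map: (M, N) array of S-CIELAB Delta E 94 errors (computed elsewhere).
    lab_image:   (M, N, 3) array with the L, a, b components of the original.
    threshold:   visibility threshold for the errors (assumed parameter).
    """
    delta_e_map = np.asarray(delta_e_map, dtype=float)
    lab_image = np.asarray(lab_image, dtype=float)

    # Ones where the error is above the threshold, zeros otherwise.
    mask = delta_e_map > threshold

    pooled_error = np.sum(delta_e_map * mask)
    # Per-pixel magnitude of the Lab vector (assumed normalization term).
    lab_norm = np.sqrt(np.sum(lab_image ** 2, axis=-1))
    return (1.0 - pooled_error / np.sum(lab_norm)) * 100.0

# Toy 2x2 example with hypothetical values.
lab = np.zeros((2, 2, 3))
lab[..., 0] = 50.0                      # mid-gray image: L = 50, a = b = 0
delta_e = np.array([[0.5, 3.0], [0.0, 1.0]])
F = fidelity_factor(delta_e, lab, threshold=1.0)
```

Only the error of 3.0 exceeds the threshold here, so F = (1 - 3 / 200) × 100 = 98.5.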
The performance of the above described perceptual metrics will be compared in section 4 with a pooling of the standard Root Mean Square (RMS) error, namely, the RMS Fit (RMS_FIT), defined as:
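\[
\mathrm{RMS}_{\mathrm{FIT}} \triangleq \left( 1 - \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} \sqrt{\Delta X_R(i,j)^2 + \Delta X_G(i,j)^2 + \Delta X_B(i,j)^2}}{\sum_{i=1}^{M} \sum_{j=1}^{N} \sqrt{X_R(i,j)^2 + X_G(i,j)^2 + X_B(i,j)^2}} \right) \times 100
\tag{4}
\]
where the subindexes R, G and B denote the corresponding image components in the RGB color space, and $\Delta X$ denotes the difference between the original and the watermarked image components.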
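The non-perceptual fit can be sketched in the same way. This example assumes that the error term is the per-pixel Euclidean distance between the original and watermarked RGB values, normalized by the per-pixel magnitude of the original, i.e. RMS_FIT = (1 - sum(||ΔX||) / sum(||X||)) × 100. That reading of eq. (4), the function name `rms_fit` and the toy pixels are assumptions.

```python
import numpy as np

def rms_fit(original_rgb, marked_rgb):
    """Fit between two RGB images in percent (100 means identical images)."""
    x = np.asarray(original_rgb, dtype=float)
    diff = x - np.asarray(marked_rgb, dtype=float)

    # Per-pixel Euclidean norms over the R, G, B components.
    err = np.sqrt(np.sum(diff ** 2, axis=-1))
    ref = np.sqrt(np.sum(x ** 2, axis=-1))
    return (1.0 - err.sum() / ref.sum()) * 100.0

# Toy 1x2 image with hypothetical values: each pixel has magnitude 5.
original = np.array([[[3.0, 4.0, 0.0], [3.0, 4.0, 0.0]]])
marked = original.copy()
marked[0, 1, 2] += 1.0                  # distort one component of one pixel
fit = rms_fit(original, marked)
```

The single unit error gives a fit of (1 - 1 / 10) × 100 = 90. Unlike the fidelity factor, every pixel difference counts regardless of whether it is visible.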
4 RESULTS
The metrics described in subsection 3.2 are used in
this section to evaluate the fidelity of the IADWT
watermarking described in section 2. A set of fif-
teen (256×256) natural color images was used. The
complete image dataset can be downloaded from:
http://www.fceia.unr.edu.ar/lsd/mrg/watermark/.
Results from two separate tests are presented in this section. The purpose of Test 1 is to compare the four fidelity metrics, namely, the standard RMS_FIT and the perceptual metrics SSIM, Komparator and the one defined in eq. (3). On the other hand, Test 2 is designed to compare the fidelity of the two IADWT insertion schemes described in section 2 using the S-CIELAB based metric, which is the one that best matches the subjective tests.

Figure 1: Comparison of Objective and Subjective Assessment for methods IADWT (top) and IADWT_T (bottom). CI: blue solid line, RMS_FIT: green triangles, SSIM: orange squares, Komparator: brown circles, F: red crosses. Horizontal axis: Images (1 to 15); vertical axis: Assessment (3 to 5).

Test 1 - Fidelity Metrics Comparison. In order to illustrate which metric provides the best objective assessment of image quality for both watermarking methods, the four metrics are computed and compared to the mean opinion score (MOS) for the fifteen images, where the MOS of each image is the average of the scores assigned by the observers. The corresponding 97.5 % confidence intervals (CI) were also calculated to specify intervals of values with the highest likelihood of containing the true value of the general MOS. These intervals, centered on the MOS, are shown as blue solid lines in Fig. 1; the non-perceptual RMS_FIT is denoted with green triangles, the SSIM values with orange squares, the Komparator values with brown circles, and the Fidelity Factor F with red crosses. The values in Fig. 1 are normalized to the range [1, 5].

From Fig. 1 it is clear that the RMS_FIT does not give a correct assessment of fidelity, as its values fail to fall within the confidence intervals for twelve out of thirty watermarked images. The number of points that fall outside the confidence intervals and the average distance (d) of each metric to the MOS were calculated for both watermarking algorithms, and the corresponding values are shown in Table 1. From Fig. 1 and Table 1, it is clear that the metric F is the one that best fits the subjective results, although the Komparator metric also gives acceptable results.

Table 1: Performance of the metrics.
              IADWT                       IADWT_T
Metric        Points outside CI   d       Points outside CI   d
RMS_FIT       10                  0.59    2                   0.18
SSIM          9                   0.29    2                   0.12
Komparator    3                   0.26    3                   0.20
F             1                   0.23    0                   0.08

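The two figures of merit reported in Table 1 can be computed along the following lines. The sketch assumes that the metric values have already been normalized to the [1, 5] scale, that each confidence interval is symmetric around the MOS, and that the average distance d is the mean absolute difference between the metric and the MOS; the paper does not spell out these details. The function name `compare_to_mos` and the sample values are hypothetical.

```python
import numpy as np

def compare_to_mos(metric, mos, ci_half_width):
    """Return (points outside the CI, average distance d to the MOS)."""
    metric = np.asarray(metric, dtype=float)
    mos = np.asarray(mos, dtype=float)
    half_width = np.asarray(ci_half_width, dtype=float)

    # Absolute distance of each metric value to the corresponding MOS.
    distance = np.abs(metric - mos)

    # A point is outside when it lies farther from the MOS than the half width.
    outside = int(np.sum(distance > half_width))
    return outside, float(distance.mean())

# Three hypothetical images.
mos = [4.5, 4.0, 3.5]
half_width = [0.3, 0.3, 0.3]
metric = [4.6, 4.5, 3.5]
outside, d = compare_to_mos(metric, mos, half_width)
```

In this toy case only the second image falls outside its interval, and the average distance is about 0.2.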
Test 2 - Watermarking Schemes Comparison. The fidelity factor F is used to compare the performance of the IADWT and IADWT_T insertion schemes. In Fig. 2, the values of F for the IADWT and IADWT_T insertion schemes are represented by red circles and blue crosses, respectively.
As can be observed, the IADWT_T method outperforms the IADWT one regarding fidelity. This holds even for images with large uniform color regions, where the image adaptive methods are supposed to work poorly (Podilchuk and Zeng, 1998) (results are not shown here due to space limitations).
Figure 2: Objective Assessment based on F for methods IADWT (red circles) and IADWT_T (blue crosses). Horizontal axis: Images (1 to 15); vertical axis: Assessment (3 to 5).
5 CONCLUDING REMARKS
Several image perceptual metrics have been tested in
this paper for the purpose of evaluating the trans-
parency of image watermarking insertion schemes.
In particular IADWT watermark insertion algorithms
were tested. The evaluation has been carried out
by performing subjective tests using the protocol de-
scribed in (ITU, 2002) and comparing the MOS to
the result of each metric. Simulation results show that the image fidelity factor based on the S-CIELAB $\Delta E_{94}$ perceptual distortion maps correlates better with the subjective tests than the other metrics considered, for the purpose of quantifying still image watermarking fidelity. In addition, a comparison of the fidelity of the two IADWT watermarking schemes has been carried out, showing that IADWT_T outperforms the method in (Podilchuk and Zeng, 1998) regarding image fidelity.
REFERENCES
Barni, M. and Bartolini, F. (2004). Watermarking Systems
Engineering - Enabling Digital Assets and Other Ap-
plications. Marcel Dekker, Inc., New York.
Barni, M., Bartolini, F., and Piva, A. (2001). Im-
proved wavelet-based watermarking through pixel-
wise masking. IEEE Transactions on Image Process-
ing, 10(5):783–791.
CIE (1971). Recommendations on uniform color spaces,
color difference equations, psychometrics color terms.
Technical Report CIE 15 (E.-1.3.1), Vienna, Austria.
Cox, I., Miller, M., and Bloom, J. (2002). Digital Watermarking.
Morgan Kaufmann, San Francisco.
Del Colle, F. and Gómez, J. C. (2008). DWT-based digital
watermarking fidelity & robustness evaluation. Journal
of Computer Science & Technology, 8(1):15–20.
ITU (2002). Recommendation ITU-R BT.500-11: Method-
ology for the subjective assessment of the quality of
television pictures. Technical report, International
Telecommunication Union.
Langelaar, G., Setyawan, I., and Lagendijk, R. (2000).
Watermarking digital image and video data. IEEE Signal
Processing Magazine, 17(5):20–46.
Le Callet, P. and Barba, D. (2003). A robust quality metric
for color image quality assessment. In Proceedings of
the IEEE International Conference on Image Process-
ing, volume 1, pages 437–440.
Marini, E., Autrusseau, F., Le Callet, P., and Campisi, P.
(2007). Evaluation of standard watermarking techniques.
In Proc. of SPIE-IS&T Electronic Imaging, volume 6505,
pages 1–10, San Jose, CA, USA.
Petitcolas, F. (2000). Watermarking schemes evaluation.
IEEE Signal Processing Magazine, 17(5):58–64.
Podilchuk, C. and Zeng, W. (1998). Image-adaptive water-
marking using visual models. IEEE Journal on Se-
lected Areas in Communications, 16(4):525–539.
Wang, Z., Bovik, A., Sheikh, H., and Simoncelli, E. (2004).
Image quality assessment: from error visibility to
structural similarity. IEEE Transactions on Image
Processing, 13(4):600–612.
Watson, A., Yang, G., Solomon, J., and Villasenor, J.
(1997). Visibility of wavelet quantization noise. IEEE
Transactions on Image Processing, 6(8):1164–1175.
Winkler, S. (2005). Digital Video Quality Vision Models
and Metrics. John Wiley & Sons Ltd, Chichester, UK.
Zhang, Z. (1996). A spatial extension to CIELAB for digi-
tal color image reproduction. Society for Information
Display Symposium Technical Digest, 27:731–734.
SIGMAP 2008 - International Conference on Signal Processing and Multimedia Applications