cryption. Empirically, it seems that the DC coefficient
is a good candidate for encryption and authentication,
and generally leads to a good compromise between
computation overhead and security. However, to the
best of our knowledge, there are neither theoretical
nor experimental results to quantify how much infor-
mation can be guaranteed or hindered by authenticat-
ing or encrypting the DC coefficient. Without such
results, it is difficult for designers to decide whether
to adopt similar approaches. Motivated by this fact,
we would like to investigate the above problem in a
quantitative way and measure the actual performance
of DC coefficient encryption and authentication.
In this work, the evaluation of encryption and au-
thentication performance is considered as an image
quality evaluation problem. An image quality met-
ric called “structural similarity” (Wang et al., 2004)
is adopted to measure the amount of information hin-
dered by partial encryption or guaranteed by partial
authentication. From various experiments applied to
an image database of about 420 JPEG images, we
show that by authenticating DC coefficients, about
60% of the information can be guaranteed; by en-
crypting DC coefficients, about 80% of the informa-
tion can be hindered. Although these are not absolute
results, they might give insights into the problem.
The rest of the work is organized as follows: Sec-
tion 2 first introduces our approach to evaluate en-
cryption and authentication performance by structural
similarity; Subsection 2.1 gives the background of
partial image authentication and results of various ex-
periments in terms of structural similarity; Subsec-
tion 2.2 applies the same approach to partial image
encryption. Section 3 concludes our work.
2 BENCHMARKING DC COEFFICIENT ENCRYPTION AND AUTHENTICATION
Although encryption and authentication are quite dif-
ferent, a uniform approach can be used to evaluate
their performance for visual data. Because both
consider how much information is preserved after the
operation, their evaluation can be cast as an image
quality problem. Assuming that the original image has full
quality, for partial encryption, we measure the quality
of the encrypted version compared to the original one;
for partial authentication, we measure the quality of
the authenticated part compared to the original one.
Therefore, the problem is converted into the choice of
a proper image quality metric.
The most widely used image quality metrics are
the peak signal-to-noise ratio (PSNR) and the mean
square error (MSE). They are simple and efficient for
general purposes. However, “they are not very well
matched to perceived visual quality” (Wang et al.,
2004), because they only concentrate on the amount
of errors, but not the perceived information. For these
metrics, the measured quality can differ greatly from
the amount of information the image conveys.
For example, Figure 1 shows a gray-scale Lena
image (a) and several distorted versions: (b) subtract-
ing a constant from all pixel values and setting neg-
ative results to zero; (c) applying an averaging filter;
(d) JPEG compression. The PSNRs of the distorted
versions, compared to the original one, are given be-
low the images. Although the distorted versions look
rather similar to the original one, the PSNRs are quite
low. One can also note that the PSNR of (b) is lower
than the one of (c), while (b) actually has better per-
ceptual quality. Therefore, it might not be appropriate
to use simple metrics that measure the error visibility,
such as PSNR and MSE; instead, we need a metric
which indeed measures image similarity.
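For concreteness, PSNR is computed from the MSE as 10 · log10(peak² / MSE); the following minimal numpy sketch (the function name and test image are ours, purely illustrative) shows why it only tracks error magnitude, not perceived content:

```python
import numpy as np

def psnr(x, y, peak=255.0):
    """PSNR in dB between two same-size images (illustrative sketch)."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)

# A uniform shift of 10 gray levels gives MSE = 100 and therefore
# PSNR = 10*log10(255^2 / 100) ~ 28.13 dB, regardless of image content,
# even though such a shift barely changes perceived quality.
x = np.full((64, 64), 100, dtype=np.uint8)
y = x + 10  # stays within [0, 255], no clipping
```

Note that any distortion with the same MSE, however visually disruptive, yields the same score, which is exactly the mismatch discussed above.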
We find that the image quality metric structural
similarity (SSIM) (Wang et al., 2004) fulfills the re-
quirement. This metric compares luminance, con-
trast, and structure information between two gray-
scale images of the same size and returns an average
score between zero and one, with one meaning ex-
actly the same and zero meaning completely different.
It is defined as:
SSIM(x, y) = [l(x, y)]^α · [c(x, y)]^β · [s(x, y)]^γ,   (1)
where x and y represent two test images; functions
l(), c(), and s() correspond to luminance, contrast,
and structure similarity, respectively; α, β, and γ are
weighting factors. In our experiments, we use its sim-
plified form:
SSIM(x, y) = [(2µ_x µ_y + C_1)(2σ_xy + C_2)] / [(µ_x^2 + µ_y^2 + C_1)(σ_x^2 + σ_y^2 + C_2)],   (2)
where µ represents mean; σ represents (co)variance;
C_1 and C_2 are numerical constants for stability (we
use 6.5025 and 58.5225 as suggested). This metric
satisfies the following conditions:
• Symmetry: SSIM(x, y) = SSIM(y, x);
• Boundedness: SSIM(x, y) ≤ 1;
• Unique maximum: SSIM(x, y) = 1 iff x = y.
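The simplified form in Eq. (2) and the properties above can be sketched as follows. This is a global, single-window variant in numpy (the published metric averages the score over local windows, and the names here are ours, not from the original paper):

```python
import numpy as np

C1, C2 = 6.5025, 58.5225  # stability constants from Eq. (2)

def ssim_global(x, y):
    """Single-window form of Eq. (2); the full SSIM metric averages
    this score over local windows (Wang et al., 2004)."""
    x, y = x.astype(np.float64), y.astype(np.float64)
    mx, my = x.mean(), y.mean()            # means mu_x, mu_y
    vx, vy = x.var(), y.var()              # variances sigma_x^2, sigma_y^2
    cov = np.mean((x - mx) * (y - my))     # covariance sigma_xy
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx**2 + my**2 + C1) * (vx + vy + C2))

rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(32, 32))
b = rng.integers(0, 256, size=(32, 32))
```

With this definition, identical inputs score exactly 1, swapping the arguments leaves the score unchanged, and no pair of images can exceed 1, matching the three conditions listed above.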
These properties make it easy to interpret the meaning
of an SSIM score. Due to space constraints, we skip
elaboration on the details of this metric. For more
information, one can refer to the paper (Wang et al.,
2004). Although this metric does not cover all aspects
of image similarity measure, it gives more reasonable
SIGMAP 2007 - International Conference on Signal Processing and Multimedia Applications