QUALITY-BASED IMPROVEMENT OF QUANTIZATION FOR

LIGHT FIELD COMPRESSION

Raphael Lerbour, Bruno Mercier, Daniel Meneveaux and Chaker Larabi

SIC Laboratory, University of Poitiers

Bat. SP2MI, Teleport 2, Bvd Marie et Pierre Curie, BP 30179 86962 Futuroscope Chasseneuil Cedex, France

Keywords: Image-based rendering, light ﬁeld, compression, subjective assessment.

Abstract: In the last decade, many methods have been proposed for rendering image-based objects. However, the number and the size of the images required make these methods highly memory demanding. Based on the light field data structure, we propose an improved compression scheme favoring visual appearance and fast random access. Our method relies on vector quantization to preserve constant-time access. 2D bounding boxes and masks are used to reduce the number of vectors during quantization. Several light field images are used instead of blocks of 4D samples, so that image similarities are exploited as much as possible. Psychophysical experiments performed in a room designed according to ITU recommendations validate the quality metric used by our method.

1 INTRODUCTION

Image-based rendering methods offer an attractive means for realistically rendering and/or relighting real-life objects, with potentially complex shape and reflectance properties. In many cases, modeling objects from the real world is impractical, not only because their shape is difficult to reproduce, but also because their reflectance properties (subsurface scattering, anisotropy and so on) must be modeled and rendered as well.

With image-based rendering methods, object complexity is transferred to image complexity. Furthermore, in some cases, rendering time is constant. This is one important reason why this representation has been a method of choice for several years.

However, these methods suffer from various draw-

backs such as the high number of images required, the

lack of precision when the observer is close to the ob-

ject, or the (blurred) discontinuities appearing on the

rendered images.

This paper addresses the problem of compression for the interactive rendering of light fields (or lumigraphs) (Levoy and Hanrahan, 1996; Gortler et al., 1996). Even though compression is necessary for reducing the size of the data used for generating images, both visual quality and rendering time have to be taken into account (Figure 1).

Several methods have been applied for compressing data related to light fields (Ramanathan et al., 2003; Chang et al., 2003; Li et al., 2001; Magnor et al., 2003; Girod et al., 2003). The original work described in (Levoy and Hanrahan, 1996) presents a compression method based on the 4D structure. In the literature, high compression rates can only be achieved at the expense of image quality or with a loss of random access. In addition, object contours are subject to artifacts (see Figure 1.b).

Figure 1: Images from compressed light fields. (a) reference image; (b) original 4D compression scheme from (Levoy and Hanrahan, 1996); (c) our method. GPU implementation allows the rendering of 20 light fields at between 25 and 65 frames per second.

In this paper, we propose to adapt vector quantization for light field rendering so as to perform well in all of these respects. Our method works in the 2D space of images and greatly benefits from inter-image similarities when viewpoints are close. Our contributions include: (i) the use of object masks for reducing the compression areas and preserving the object contours; (ii) the combination of images for improving vector quantization in terms of visual appearance and compression rates; (iii) the validation of our compression scheme using a PSNR (Peak Signal to Noise Ratio) metric, itself validated by a psychophysical study quantifying its correlation with human judgment.

Lerbour R., Mercier B., Meneveaux D. and Larabi C. (2007). QUALITY-BASED IMPROVEMENT OF QUANTIZATION FOR LIGHT FIELD COMPRESSION. In Proceedings of the Second International Conference on Computer Graphics Theory and Applications - GM/R, pages 235-243. DOI: 10.5220/0002078802350243. SciTePress.

This paper is organized as follows: Section 2

presents the work related to our paper; Section 3

presents the broad lines of our method; Section 4 sum-

marizes the vector quantization method we use; Sec-

tion 5 discusses light ﬁeld compression using quanti-

zation; Section 6 shows how object silhouette can ef-

ﬁciently be used for improving quantization; Section

7 presents our quality assessment system; Section 8

gives implementation details; Section 9 provides re-

sults; Section 10 concludes and proposes future work.

2 RELATED WORK

2.1 Light Field Data Structure

Light fields (or lumigraphs) correspond to a 4D sampling of the plenoptic function defined by Adelson and Bergen (Adelson and Bergen, 1991). They are defined by a set of slabs. Each slab is a pair of parallel planes uv and st, uniformly sampled (Levoy and Hanrahan, 1996; Gortler et al., 1996). Figure 2 illustrates the light field representation.

2.2 Light Field Compression

In the original work proposed in (Levoy and Han-

rahan, 1996), light ﬁeld compression is achieved

through vector quantization. Instead of using 2D vec-

tors on the images, the authors compress 4D vectors

corresponding to 2D samples on the uv plane com-

bined with 2D samples on the st plane. The aim is to

beneﬁt from the similarity existing between two close

viewpoints.

In (Tong and Gray, 2000), prediction is used for

recovering images and achieving high compression

rates. In (Zhang and Li, 2000), compression makes

use of prediction on intermediate images for concen-

tric mosaics. Principal component analysis (Lelescu

and Bossen, 2004), 2D shape encoding (Girod et al.,

2003) or wavelet coders (Wei, 1997; Li et al., 2001)

can also advantageously be exploited for increasing

compression rates.

Figure 2: Light field representation. (a) an image corresponds to one uv sample associated with the whole st plane; (b) conversely, one st sample associated with all the uv directions provides radiance samples passing through the point located at (s, t).

Several authors address perceptual image qual-

ity with image-based rendering compression meth-

ods without quantization (Ramanathan et al., 2001;

Magnor and Girod, 2000; Magnor et al., 2003;

Magnor and Girod, 1999; Zhang and Li, 2000). For

instance in (Magnor and Girod, 2000), the authors

propose two methods dedicated to light ﬁeld com-

pression. The ﬁrst one relies on DCT-based video

compression while the second one relies on disparity-

compensated image prediction. The compression

rates achieved are very high (from 100:1 to 2000:1).

Nevertheless, as pointed out in (Heidrich et al.,

1999; Levoy and Hanrahan, 1996) vector quantiza-

tion is more practical for graphics hardware imple-

mentation since decompression can be achieved by

the GPU.

Geometry has also been used for improving object appearance and reducing the size of the image-based information (Magnor et al., 2003; Chang et al., 2003). However, some reconstruction process is required. This paper instead addresses light field compression without geometry reconstruction.

3 WORK OVERVIEW

Our compression scheme is based on vector quantiza-

tion. Instead of using vectors in the 4D space of light

ﬁelds, we propose to exploit image similarities using

several images with 2D vectors. As shown in the re-

sults, the visual quality is greatly improved, with a

high compression rate.

GRAPP 2007 - International Conference on Computer Graphics Theory and Applications


Most light ﬁelds represent 3D objects in front of a

neutral background (uniformly black by convention).

In this paper, we focus on that type of light ﬁeld and

take advantage of it: pixels corresponding to the back-

ground are ignored using 2D bounding boxes on the

images and a binary mask matching the object silhou-

ette. This is used for: (i) avoiding the compression

of groups of background pixels and (ii) providing a

better representation for PSNR computations.

In order to validate the use of PSNR which is often

considered as uncorrelated to human judgment, we

have designed and conducted a psychophysical exper-

iment. After a correlation study, we have been able to

link the PSNR to the MOS (Mean Opinion Score) so

as to extract a quality threshold.

We have applied our compression method on a set

of virtual and real objects, with various reﬂectances,

textures and sizes. Our method proves fast and efﬁ-

cient for compressing light ﬁeld images and render-

ing them in real-time directly from their compressed

form.

An LZ scheme can further reduce the light field sizes on the disk (for instance when a light field has to be transmitted through a computer network).

4 VECTOR QUANTIZATION

Although it is lossy, vector quantization is a method of choice for reducing image sizes. However, visual artifacts must remain unnoticeable for high-quality light field rendering.

The aim of vector quantization is to replace a

(high) number of vectors by a set of indexes referenc-

ing a reduced set of representatives (the dictionary).

The size required for each index depends on the num-

ber of vectors contained in the dictionary.
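
As an illustration of the principle described above, the following Python sketch encodes a few vectors against a small dictionary and decodes them by simple lookup. The function names (`encode`, `decode`, `index_bits`) and the example values are assumptions introduced here for illustration; they are not taken from the implementation discussed in this paper.

```python
def squared_distance(a, b):
    """Sum of squared component differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(vectors, dictionary):
    """Replace every vector by the index of its nearest dictionary entry."""
    return [min(range(len(dictionary)),
                key=lambda k: squared_distance(v, dictionary[k]))
            for v in vectors]

def decode(indexes, dictionary):
    """Constant-time lookup: each index selects one representative."""
    return [dictionary[i] for i in indexes]

def index_bits(dictionary_size):
    """Bits needed per index for a dictionary of the given size."""
    return max(1, (dictionary_size - 1).bit_length())

# Illustrative values only: a 4-entry dictionary of RGB triples.
dictionary = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
vectors = [(250, 10, 5), (3, 240, 12), (10, 10, 10)]
indexes = encode(vectors, dictionary)
```

With a dictionary of 256 entries, `index_bits` returns 8, so each vector is replaced by a single byte regardless of its original length.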

Several methods have been proposed for the dic-

tionary construction. Most of them provide compa-

rable results in terms of compression rates. We have

chosen the LBG method (Linde et al., 1980) since the

dictionary size is a power of two. This is convenient

for storing data in the memory as explained in Sec-

tion 8. Moreover, the dictionary size can be automat-

ically chosen depending on a measured quality value

(the PSNR in our case). The dictionary reﬁnement is

based on a generalized Lloyd iteration (Lloyd, 1982).
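
The dictionary construction can be sketched as follows. This is a minimal version of LBG with generalized Lloyd refinement, assuming a common splitting rule (each entry is duplicated and perturbed by plus or minus `epsilon`) and a fixed number of Lloyd iterations; neither choice is specified in this paper, and all names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(c) / n for c in zip(*vectors))

def lloyd(vectors, dictionary, iterations=10):
    """Generalized Lloyd refinement: assign to nearest entry, then re-center."""
    for _ in range(iterations):
        cells = [[] for _ in dictionary]
        for v in vectors:
            k = min(range(len(dictionary)),
                    key=lambda j: squared_distance(v, dictionary[j]))
            cells[k].append(v)
        # An empty cell keeps its previous representative.
        dictionary = [centroid(c) if c else d
                      for c, d in zip(cells, dictionary)]
    return dictionary

def lbg(vectors, steps, epsilon=1.0):
    """Grow a dictionary to 2**steps entries by repeated splitting."""
    dictionary = [centroid(vectors)]
    for _ in range(steps):
        split = []
        for d in dictionary:
            split.append(tuple(c + epsilon for c in d))
            split.append(tuple(c - epsilon for c in d))
        dictionary = lloyd(vectors, split)
    return dictionary

# Illustrative training set with two obvious clusters.
training = [(0, 0, 0), (2, 2, 2), (100, 100, 100), (102, 102, 102)]
codebook = lbg(training, steps=1)
```

Because every step doubles the number of entries, the dictionary size is always a power of two, which is the property exploited for index storage.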

We have tried various color spaces (RGB, CIE

Luv, CIE Lab, LCh). In all the tests we made for light

ﬁeld compression, the quadratic RGB distance associ-

ated with a PSNR quality measurement provided the

best results.

5 LIGHT FIELD COMPRESSION

Figure 3 and Table 1 present the light ﬁelds used in

this paper, including real and virtual objects. The Sun-

ﬂower is a real object as well as the Clown. The Quad

is a virtual object, rendered with POV-ray. Buddha

and Dragon are provided by Stanford University.

Figure 3: Images of light ﬁelds used for our tests: (a) Sun-

ﬂower, (b) Clown, (c) Quad, (d) Buddha, (e) Dragon.

Table 1: Light field characteristics. Sl. is the number of slabs; (u, v) and (s, t) represent the number of samples on the uv and st planes; m.size corresponds to the memory required for storing the whole light field without compression (given in MB).

LF          Sl.   (u, v)     (s, t)       m.size (MB)
Sunflower   4     8 × 8      256 × 256    48
Clown       5     8 × 8      256 × 256    60
Quad        6     8 × 8      256 × 256    72
Buddha      1     32 × 32    256 × 256    192
Dragon      1     32 × 32    256 × 256    192
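
The m.size column of Table 1 can be reproduced with a short computation, assuming 3 bytes per sample (8-bit RGB) and 1 MB = 2^20 bytes. Both assumptions are introduced here; they are consistent with every row of the table but are not stated explicitly in the text.

```python
def uncompressed_size_mb(slabs, uv, st, bytes_per_sample=3):
    """Size in MB (2**20 bytes) of a light field stored without compression."""
    samples = slabs * uv[0] * uv[1] * st[0] * st[1]
    return samples * bytes_per_sample / 2 ** 20

sizes = {
    "Sunflower": uncompressed_size_mb(4, (8, 8), (256, 256)),
    "Clown": uncompressed_size_mb(5, (8, 8), (256, 256)),
    "Quad": uncompressed_size_mb(6, (8, 8), (256, 256)),
    "Buddha": uncompressed_size_mb(1, (32, 32), (256, 256)),
    "Dragon": uncompressed_size_mb(1, (32, 32), (256, 256)),
}
```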

5.1 Images (2D Vectors)

On one hand, it is possible to compress every light

ﬁeld image independently. Thus, one dictionary is

necessary for each image. However in this case, com-

pression rates do not beneﬁt from images similarity

when viewpoints are close. Moreover, since dictio-

naries are separated, as many dictionaries as images

are necessary in the memory during rendering.

On the other hand, using only one dictionary for all the images of a light field is no more attractive, since the viewpoints show various portions of the object, with varying lighting conditions and potentially varying reflectance properties. A single dictionary therefore merges many visibly distinct tints, which makes this method inappropriate for coding shading refinements.

5.2 Uvst Blocks (4D Vectors)

Another solution consists in exploiting the whole light

ﬁeld coherence. This is why the authors of (Levoy

and Hanrahan, 1996) propose a compression scheme

based on 4D uvst blocks. With this approach, a 2 ×

2× 2× 2 uvst vector corresponds to 4 blocks of 2× 2

pixels in 2× 2 uv sample images in the same slab.

Increasing the vector size also requires increasing the dictionary size. Furthermore, the vectors in this dictionary are also larger. This is why a single dictionary should be used for the whole light field with 4D vectors.
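
The difference between the two kinds of vectors can be made concrete with a small sketch. It assumes that a slab is accessed through a hypothetical function `sample(u, v, s, t)`; the names `block_2d` and `block_4d` are likewise introduced for illustration only.

```python
def block_2d(sample, u, v, s0, t0, size=2):
    """Vector of size*size st samples taken from a single uv image."""
    return [sample(u, v, s0 + ds, t0 + dt)
            for ds in range(size) for dt in range(size)]

def block_4d(sample, u0, v0, s0, t0, size=2):
    """Vector gathering the same st block in size*size neighbouring uv images."""
    return [x
            for du in range(size) for dv in range(size)
            for x in block_2d(sample, u0 + du, v0 + dv, s0, t0, size)]

# Stand-in slab: each sample simply records its own coordinates.
sample = lambda u, v, s, t: (u, v, s, t)
vector_2d = block_2d(sample, 0, 0, 10, 20)
vector_4d = block_4d(sample, 0, 0, 10, 20)
```

A 2 × 2 block holds 4 samples, whereas a 2 × 2 × 2 × 2 block holds 16 samples spread over 4 images, which is why 4D dictionaries need both more and larger entries.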

Table 2: Comparison of compression methods: 4D quantization vs. 2D quantization. Two light fields (LF) are used: 1- the Sunflower and 2- the Buddha. v.type indicates the type of vectors used for quantization; d.size is the dictionary size; c.time provides the computing time; m.size is the memory size (in KB) with compression. The given PSNR corresponds to a mean over the compressed images.

LF   v.type   d.size   c.time   m.size (KB)   PSNR
1    4D       16384    2h       1173          30.63
1    2D       256      40s      916           30.70
2    4D       16384    7h       2598          33.43
2    2D       256      3m       4104          32.72

The results provided in Table 2 show that larger vectors and a single dictionary for all the slabs imply a higher compression time. As stated in (Levoy and Hanrahan, 1996), the approach using 4D vectors remains interesting only when the uv plane is densely sampled. However, increasing the sampling density also increases the light field size whatever the compression scheme used. Both statements can be verified in this example (the uv plane of Buddha is sampled 16 times more densely than that of Sunflower).

5.3 Slab Images

As shown in Figure 4.a, a block of pixels on the st plane generally does not correspond to the same region of the object for 2 × 2 uv samples. This produces artifacts when using 4D vectors. On the other hand, 2D st vectors can be associated with one region of the object for more distant viewpoints, thus increasing image quality with a smaller dictionary.

Figure 4: (a) A 2 × 2 × 2 × 2 block of uvst does not cover the same region of the object; (b) With 2 × 2 blocks of st samples, it is possible to associate pixels corresponding to the same region of the object for several viewpoints.

This is the reason why we associate one dictionary with each group of several uv images in each slab. This method better benefits from image similarities for both uv and st planes.

Table 3: Comparison of number of dictionaries per slab for the Sunflower light field. d.size corresponds to the dictionary size while m.size indicates the size required in memory (in KB). The given PSNR (in dB) corresponds to a mean on the compressed images (background pixels are not taken into account).

# Dict.   d.size   m.size (KB)   PSNR (dB)
64        128      1144          30.8
16        256      1060          31.8
4         512      1073          32.6
1         1024     1133          33.3

As shown in Table 3, reducing the number of dictionaries for each slab makes it possible to increase the dictionary size (and thus the image quality) while keeping a similar overall memory size. Nevertheless, we have noticed that, with a fixed dictionary size, the loss in visual quality is not worth the gain in memory space when the number of dictionaries is too low. Additionally, we show in Section 9 that the number of dictionaries should remain high enough to ensure a homogeneous quality over the whole light field. In practice, the best compromise has generally been obtained with 4 dictionaries per slab, corresponding to 4 regions of uv images subdividing the slab.
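
One possible way of subdividing a slab into 4 regions is to split the uv grid into quadrants. The text above only states that the 4 regions subdivide the slab, so the quadrant layout and the function name `dictionary_id` below are assumptions made for illustration.

```python
def dictionary_id(u, v, n_u, n_v):
    """Quadrant (0..3) of the n_u x n_v uv grid that contains image (u, v)."""
    return 2 * int(u >= n_u // 2) + int(v >= n_v // 2)

# Count how many images of an 8 x 8 slab share each dictionary.
counts = [0, 0, 0, 0]
for u in range(8):
    for v in range(8):
        counts[dictionary_id(u, v, 8, 8)] += 1
```

With this layout, each dictionary of an 8 × 8 slab is trained on 16 neighbouring uv images, so that close viewpoints share the same representatives.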

6 OBJECT SILHOUETTE

Background pixels in the images do not correspond to any information. Quantizing these pixels increases the computing time and the dictionary size, and contributes to the impairment of visual quality through aliasing on the object contour. This is why we propose to only take into account pixels corresponding to the object in our quantization method.

6.1 Silhouette Bounding Box

In most light fields, the st plane is placed so that the object is located at the center of the images. We first associate a 2D bounding box enclosing the object with each image. Pixels outside the box are not considered during the quantization process; they are not even stored in memory. In practice, depending on the light field, between 40% and 80% of the vectors are ignored during encoding.
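
A bounding box of this kind can be computed as in the sketch below, which assumes that the background is exactly black, as in the convention mentioned in Section 3. Acquired images may require a tolerance instead of an exact comparison; the function name and the tuple layout of the box are hypothetical.

```python
def bounding_box(image, background=(0, 0, 0)):
    """Smallest (s_min, t_min, s_max, t_max) box enclosing the object pixels.

    image is a list of rows of (r, g, b) tuples indexed as image[s][t].
    Returns None when the image contains only background pixels.
    """
    coords = [(s, t)
              for s, row in enumerate(image)
              for t, pixel in enumerate(row)
              if pixel != background]
    if not coords:
        return None
    ss = [s for s, _ in coords]
    ts = [t for _, t in coords]
    return min(ss), min(ts), max(ss), max(ts)

# Illustrative 4 x 4 image: B is background, O belongs to the object.
B, O = (0, 0, 0), (200, 50, 50)
image = [
    [B, B, B, B],
    [B, O, B, B],
    [B, B, O, B],
    [B, B, B, B],
]
box = bounding_box(image)
```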

6.2 Silhouette Mask

Inside each bounding box (for each uv image), a bi-

nary mask indicates whether a pixel corresponds to

the object or to the background. It is RLE-encoded

on the disk, and stored uncompressed in the memory

for rendering performance reasons.
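
Run-length encoding of a binary mask can be sketched as follows. The exact on-disk layout is not described in this paper; the convention used here (a flat list of run lengths, the first run counting background values) is an assumption chosen for simplicity.

```python
def rle_encode(bits):
    """Run lengths of a flat 0/1 sequence; the first run counts zeros."""
    runs, current, length = [], 0, 0
    for b in bits:
        if b == current:
            length += 1
        else:
            runs.append(length)
            current, length = b, 1
    runs.append(length)
    return runs

def rle_decode(runs):
    """Inverse of rle_encode: expand alternating runs of 0s and 1s."""
    bits, value = [], 0
    for length in runs:
        bits.extend([value] * length)
        value = 1 - value
    return bits

# One mask row: 0 is background, 1 is object.
row = [0, 0, 1, 1, 1, 0, 1]
runs = rle_encode(row)
```

Decoding once at load time and keeping the mask uncompressed in memory preserves constant-time access to any mask value during rendering.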

Table 4: Mask size in memory for each light field and benefit compared to compression without bounding boxes or masks. Dictionaries contain 256 vectors.

Light field   Mask size (KB)   Benefit
Sunflower     430              67.1%
Clown         704              57.4%
Quad          1510             24.8%
Buddha        2031             62.6%
Dragon        4779             12.1%

As shown in Table 4, even though bounding boxes

and masks have to be stored in the memory, compres-

sion rates are higher than pure 2D compression with

equal PSNR. Last but not least, object silhouettes are

accurately preserved.

7 QUALITY ASSESSMENT

Quality assessment addresses several types of appli-

cations such as medical imaging, image and video

compression, etc. The assessment can be subjective

involving human judgment, objective implying the

use of mathematical tools, or both (Keelan, 2002).

Formal subjective testing has been used for many

years with a relatively stable set of standard methods

described in the ITU recommendation (ITU-R Rec-

ommendation BT.500-10, 2000).

Objective quality assessment offers several types

of measures or metrics. Simple metrics such as PSNR

are very easy to compute and are appropriate for real-

time assessment but they may not correlate with hu-

man judgment. Other measures are based on the Hu-

man Visual System (HVS) modeling which allows a

good correlation but are often difﬁcult to implement.

Because subjective experiments are complicated

to manage and time consuming, they are difﬁcult to

repeat. To ensure repeatability, the correlation exist-

ing between the opinion score (subjective) and the re-

sults of mathematical metrics (objective) is studied.

In the case of a good correlation (greater than 70%),

it is possible to use the metric and to extrapolate the

results for human judgement.

In existing light field works, the compression bit-rate (i.e. the size of the dictionaries) is chosen only with regard to the memory used, while the PSNR is used for final quality assessment. In our approach, the bit-rate is regulated by a quality criterion (see Section 8). During quantization, the dictionary is iteratively constructed. At every step, the quality associated with the dictionary is measured and compared to a threshold. If it exceeds the threshold, the algorithm stops. Otherwise, a new step starts with an increased dictionary size.
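
The control loop described above can be sketched independently of the quantizer. In the code below, `build` and `quality` are hypothetical callables standing for the dictionary construction and the quality measurement; the toy versions used in the example are stand-ins and do not reflect measured values.

```python
def choose_dictionary(build, quality, threshold, max_steps=12):
    """Double the dictionary size until the quality exceeds the threshold.

    build(size) returns a dictionary with the requested number of entries;
    quality(dictionary) returns its measured quality (for example a PSNR).
    Returns the last dictionary built and its quality.
    """
    dictionary, measured = None, float("-inf")
    for n in range(1, max_steps + 1):
        dictionary = build(2 ** n)
        measured = quality(dictionary)
        if measured > threshold:
            break
    return dictionary, measured

# Toy stand-ins, not measurements: quality grows by 1 dB per doubling.
toy_build = lambda size: list(range(size))
toy_quality = lambda d: 25.0 + (len(d).bit_length() - 1)
dictionary, psnr = choose_dictionary(toy_build, toy_quality, threshold=32.5)
```

The `max_steps` bound is an added safeguard so that the loop terminates even when the threshold cannot be reached.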

7.1 Evaluation Conditions

For the psychophysical assessment, we have verified that each observer has normal visual acuity and no color blindness. Our psychophysical test room conforms to ITU recommendations (ITU-R Recommendation BT.500-10, 2000) (Figure 5.a):

• an adjustable and directional lighting with a color temperature between 5000K and 6500K, delivering 25 lux on the display because of the black background;

• a calibrated display with a resolution of 800 × 600 pixels to display 256 × 256 images;

• a non-reflective wall paint;

• an adapted viewing distance: 75 cm.

7.2 Assessment Protocol

The duration of a subjective experiment is around 15 minutes. It should not exceed 30 minutes because of observer fatigue. The test protocol is composed of 5 different light fields, for each of which only 3 images have been chosen for their specific content. For each view, 4 pairs of successive dictionary sizes are compared, from 128 vs. 256 to 1024 vs. 2048, defining 60 tests (5 × 3 × 4). The original image is displayed at the top in order to provide a quality reference. A snapshot of the protocol is given in Figure 5.b.

Figure 5: Psychophysical experiments: (a) Test room installation; (b) Snapshot of the proposed protocol using "Presentation" software from Neurobehavioral Systems.

Facing the screen shown in Figure 5.b, the observer has to make a choice. If one of the two compressed images looks less impaired than the other, the observer clicks on it. If no difference is perceptible, the observer clicks on the reference image when both images seem similar to it, or on the "Bad" button when both are strongly impaired. Once the answer is validated, an intermediate black screen is displayed for half a second to avoid memorization. The next test is then drawn at random among the sixty.

7.3 Assessment Results

Nineteen observers have participated in the subjective experiment (a minimum of fifteen is recommended for coherent statistics). The average score of each image is computed over all the observers. This score is called the Mean Opinion Score (MOS) and its value is between 0 (no observer chose it) and 1 (all the observers chose it). To reject incoherent answers and observers who did not take the test seriously, a kurtosis test is performed on the whole data set.

The next step is to study the correlation between the PSNR and the MOS (see Figure 6). For this purpose, we use the Pearson correlation coefficient, which measures the linear relationship between two data sets. A high correlation value means that the two measures evolve similarly, so that the behavior of the first one can be extrapolated from the second one. We obtain a value of 83.3% for the Pearson coefficient, which demonstrates that the PSNR and the MOS are strongly correlated in the framework of our application.

Figure 6: Correlation between PSNR and MOS.
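
For reference, the Pearson coefficient and a least-squares regression line can be computed as in the following sketch. The data points are illustrative placeholders chosen to lie exactly on a line; they are not the scores collected during the experiment, and the function names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def fit_line(xs, ys):
    """Least-squares (slope, intercept) of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Placeholder data: MOS values and PSNR values lying on a straight line.
mos = [0.0, 0.5, 1.0]
psnr = [26.0, 30.0, 34.0]
r = pearson(mos, psnr)
slope, intercept = fit_line(mos, psnr)
```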

The quality threshold implemented in the com-

pression stage is based on the deﬁnition of the straight

line drawn in Figure 6 obtained by linear regression.

The equation of this line is:

PSNR = 9.032× MOS + 26.168 (1)

For instance, the threshold value for MOS = 0.7

(agreement of 70% of the observers) is PSNR =

32.5dB. Equation 1 is integrated in the system for

the automatic dictionary size determination: the user

can specify a MOS value as the quality criterion.
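
Equation 1 and its inverse can be written as two small functions. The coefficients are those of Equation 1; the function names are introduced here for illustration.

```python
def psnr_threshold(mos, slope=9.032, intercept=26.168):
    """PSNR (dB) associated with a target MOS in [0, 1] by Equation 1."""
    return slope * mos + intercept

def mos_from_psnr(psnr, slope=9.032, intercept=26.168):
    """MOS predicted for a given PSNR by inverting Equation 1."""
    return (psnr - intercept) / slope

threshold = psnr_threshold(0.7)
```

Here `threshold` evaluates to about 32.49 dB, which rounds to the 32.5 dB quoted above. Conversely, `mos_from_psnr(31.1)` gives about 0.55, in line with the MOS listed in Table 7 for that average PSNR.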

8 IMPLEMENTATION

For constructing a dictionary as representative as pos-

sible, we have chosen to use all the uvst samples as a

learning set. Even though this choice implies longer

computing times, the ﬁnal image quality is better dur-

ing rendering. Moreover, compression time only cor-

responds to a preprocessing step and is much shorter

than with 4D vectors.

Figure 7: Block selection when not taking background (black) pixels into account. The vector distance between (a) and (b) is lower than between (a) and (c) since the black pixel is not taken into account.

The dictionary size is automatically fixed according to the PSNR measured at each step of the LBG algorithm. This method provides a dictionary size equal to $2^n$, $n$ being the number of steps of the algorithm (each step doubles the dictionary size). With such a representation, the size of each index is equal to $n$ bits, which makes it possible to store the index table efficiently and offers random access to any value inside this table.
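
Storing n-bit indexes contiguously while keeping random access can be sketched as follows. The bit layout (big-endian, zero padding at the end) and the names `pack` and `read` are assumptions made for this example; the in-memory layout actually used is not detailed here.

```python
def pack(indexes, n):
    """Concatenate n-bit indexes into one bytes object (big-endian bits)."""
    value = 0
    for i in indexes:
        value = (value << n) | i
    total_bits = n * len(indexes)
    pad = (-total_bits) % 8          # zero bits appended to fill the last byte
    return (value << pad).to_bytes((total_bits + pad) // 8, "big")

def read(packed, n, k):
    """Random access to the k-th n-bit index without unpacking the others."""
    start = k * n                    # position of the first bit of index k
    first, last = start // 8, (start + n - 1) // 8
    chunk = int.from_bytes(packed[first:last + 1], "big")
    shift = (last + 1) * 8 - (start + n)
    return (chunk >> shift) & ((1 << n) - 1)

# Four 9-bit indexes (dictionary of 512 entries) fit in 5 bytes.
indexes = [5, 300, 7, 511]
packed = pack(indexes, 9)
```

Reading one index only touches the few bytes that contain it, so the access cost does not depend on the size of the table.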

The distance $d_v$ used between a pair of vectors $V_1$ and $V_2$ during quantization is computed as follows:

$$d_v(V_1, V_2) = \sum_{i=1}^{N} d_p(p_{1,i}, p_{2,i})$$

where $N$ is the number of pixels in a vector and $p_{k,i}$ corresponds to the $i$-th pixel of the vector $V_k$. $d_p$ is the Euclidean distance between two pixels, using their respective primary components in the RGB color space. This is the distance usually used for vector quantization. As shown in Figure 7, background pixels are advantageously discarded during quantization by always getting a minimal distance with other pixels. These background pixels are recovered during rendering thanks to the binary masks.
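
The effect of discarding background pixels can be sketched as below. Marking a background pixel with `None` is a convention chosen for this example (the method itself relies on the binary masks), and the colour values of the three blocks are illustrative, loosely following Figure 7.

```python
import math

BACKGROUND = None  # assumed marker for pixels outside the object mask

def pixel_distance(p, q):
    """Euclidean RGB distance; zero as soon as one pixel is background."""
    if p is BACKGROUND or q is BACKGROUND:
        return 0.0
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def vector_distance(v1, v2):
    """Sum of per-pixel distances over two blocks of equal size."""
    return sum(pixel_distance(p, q) for p, q in zip(v1, v2))

RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
block_a = [BACKGROUND, RED, GREEN, BLUE]
block_b = [(255, 255, 0), RED, GREEN, BLUE]            # yellow + same colours
block_c = [(64, 64, 64), (255, 128, 128), GREEN, BLUE]  # dark grey + light red
```

Block (a) is at distance zero from block (b) because the only differing position is a background pixel, whereas it is at a strictly positive distance from block (c), whose second pixel differs.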

Algorithm 1 provides the decompression method

for a (slab, u, v, s,t) sample.

Algorithm 1: R, G, B sample from u, v, s, t coordinates and a compressed light field.

Data:
    slab, u, v, s, t: light field direction;
    LF: compressed light field
Result:
    r, g, b: light field sample ("color");
begin
    image = LF.images(slab, u, v);
    codebook = LF.codebooks(slab, u, v);
    if (s, t) ∈ image.bbox then
        if (s, t) ∈ image.mask then
            index = image.indexes(s, t);
            (r, g, b) = codebook(index);
        else
            (r, g, b) = background;
    else
        (r, g, b) = background;
end
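
Algorithm 1 can be transcribed into Python as in the sketch below. The `CompressedImage` container, its field layout and the dictionaries keyed by (slab, u, v) are assumptions made for this example. As in Algorithm 1, each index is taken to select one colour directly.

```python
from dataclasses import dataclass

BACKGROUND = (0, 0, 0)

@dataclass
class CompressedImage:
    bbox: tuple      # (s_min, t_min, s_max, t_max), inclusive
    mask: list       # mask[s - s_min][t - t_min] is True on the object
    indexes: list    # indexes[s - s_min][t - t_min] points into the codebook

def decompress(images, codebooks, slab, u, v, s, t):
    """Return the (r, g, b) sample for the given light field coordinates."""
    image = images[(slab, u, v)]
    codebook = codebooks[(slab, u, v)]   # several keys may share one codebook
    s_min, t_min, s_max, t_max = image.bbox
    if not (s_min <= s <= s_max and t_min <= t <= t_max):
        return BACKGROUND                # outside the bounding box
    ls, lt = s - s_min, t - t_min
    if not image.mask[ls][lt]:
        return BACKGROUND                # inside the box, outside the object
    return codebook[image.indexes[ls][lt]]

# Illustrative data: one image whose object occupies a 2 x 2 box.
images = {(0, 0, 0): CompressedImage(bbox=(1, 1, 2, 2),
                                     mask=[[True, False], [True, True]],
                                     indexes=[[0, 0], [1, 0]])}
codebooks = {(0, 0, 0): [(255, 0, 0), (0, 255, 0)]}
```

Every branch consists of a fixed number of lookups, which is what keeps the access time constant.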

9 RESULTS

The PSNR values provided in this paper do not in-

clude background pixels since they disturb the actual

value. Using these pixels generally provides a much

higher PSNR value which is in practice unreliable be-

cause it depends on the number of such pixels in the

image. Tables 5 and 6 provide the results obtained for

the 5 test light ﬁelds.
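
The influence of background pixels on the PSNR can be checked with a small sketch. It assumes 8-bit components (peak value 255) and a mean squared error computed over all RGB components of the selected pixels; the exact PSNR definition is not spelled out in this paper, and the pixel values are illustrative.

```python
import math

def masked_psnr(original, compressed, mask, peak=255):
    """PSNR in dB over the pixels whose mask value is true."""
    squared_error, count = 0, 0
    for p, q, keep in zip(original, compressed, mask):
        if keep:
            squared_error += sum((a - b) ** 2 for a, b in zip(p, q))
            count += len(p)
    if squared_error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 * count / squared_error)

# Six background pixels (always exact) followed by two object pixels.
original = [(0, 0, 0)] * 6 + [(100, 100, 100), (200, 200, 200)]
compressed = [(0, 0, 0)] * 6 + [(110, 100, 100), (200, 190, 200)]
object_mask = [False] * 6 + [True] * 2

object_only = masked_psnr(original, compressed, object_mask)
all_pixels = masked_psnr(original, compressed, [True] * 8)
```

In this example the value computed over all pixels is about 6 dB higher than the value computed over the object alone, only because three quarters of the pixels are error-free background.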

The size selection is automatic and incremental,

based on the PSNR. A light ﬁeld is compressed using

4 dictionaries per slab, each having its own size, such

that the PSNR is always greater than 32.5dB (MOS of

70%).

Table 5: Compression rates and PSNR for the test light fields. PSNR is given in dB, m.size corresponds to the size required in memory after compression (in MB), c.rate provides the compression rate, c.time indicates the time required for the compression process.

LF          PSNR (dB)   m.size (MB)   c.rate    c.time
Sunflower   33.0        1.52          31.6:1    4m14s
Clown       33.2        2.90          20.7:1    22m12s
Quad        33.4        4.47          16.1:1    5m14s
Buddha      33.1        6.09          31.5:1    9m52s
Dragon      32.9        15.25         12.6:1    36m04s

Table 6: Results obtained in terms of variation coefficient. v.coeff represents the variation coefficient in terms of PSNR computed for all the images of the light field. Images LT provides the percentage of images having a PSNR lower than the given threshold. PSNR Min. provides the minimum value found for the PSNR of one image.

Light field   v.coeff   Images LT   PSNR Min.
Sunflower     0.55      12.1%       31.4
Clown         0.53      5.6%        32.0
Quad          0.92      16.1%       31.0
Buddha        0.96      27.7%       30.4

We have implemented both GPU and CPU light

ﬁeld rendering programs. The tests were run with a

Xeon 2.4 GHz processor with 2GB RAM. For more

information about rendering, please refer to (Levoy

and Hanrahan, 1996).

For CPU rendering, when using compressed instead of uncompressed light fields, performance in terms of frames per second decreases by about 10% with the whole data in memory. The rate is between 30 and 52 frames per second for a single light field. The difference is essentially due to access indirections, even though silhouette bounding boxes and masks avoid searching the dictionary for pixels outside the object.


The GPU used for our tests is an NVIDIA Quadro FX 3450/4000 SDI with 256 MB of memory. Depending on the viewpoint, our GPU program generates between 25 and 65 images per second when rendering 20 light fields together.

Table 7: PSNR comparison without background pixels for the Dragon object between the Light Field Rendering compression scheme and our method. The size provided for our method does not include the binary mask. PSNR vc corresponds to the PSNR variation coefficient. Note that this coefficient is much lower with our method.

Dragon      LF Rendering   Our Method
PSNR vc     1.20 dB        0.25 dB      0.26 dB
PSNR min    30.0 dB        30.4 dB      32.3 dB
PSNR max    36.8 dB        31.6 dB      33.6 dB
PSNR avg    31.1 dB        30.8 dB      32.9 dB
MOS avg     55%            51%          74%
Mem. size   9.5 MB         9.6 MB       10.6 MB

Table 7 shows results obtained by our compression method and the approach proposed in (Levoy and Hanrahan, 1996) with the original images of the Dragon. With equivalent PSNR and without masks, compression rates are equivalent, even though this is the worst case for our compression scheme. However, the variation coefficient is much lower with our method (due to the use of several dictionaries), implying a better visual quality during rendering. Using binary masks further increases the object silhouette quality, as shown in Figure 1. Unfortunately, this parameter is difficult to estimate in terms of PSNR. Another advantage of our method concerns the automatic choice of compression rate that provides an average PSNR greater than 32.5 dB. It generally increases the PSNR by 2 dB at the expense of 10% on the light field size. On average, our method gives a PSNR high enough to ensure that most observers do not notice any loss in quality (MOS > 70%) while the previous method does not.

10 CONCLUSION

This paper presents an improved compression method relying on quantization and dedicated to interactive quality rendering. Compression time and visual quality have been improved with the help of object bounding boxes and silhouette masks for each light field image. The introduction of a PSNR threshold makes it possible to tune the visual quality of the compressed objects directly with regard to human judgment. As shown in the results, our method provides efficient random access to uvst samples during the rendering phase. In future work, we wish to integrate depth into the binary masks so as to reduce aliasing artifacts due to uv sampling, and to validate this extension with visual experiments as well.

ACKNOWLEDGEMENTS

We wish to thank Stanford University for providing the original and compressed Dragon images. We also acknowledge James Cowley for the Quad model.

REFERENCES

Adelson, E. H. and Bergen, J. R. (1991). The Plenop-

tic Function and the Elements of Early Vision, chap-

ter 1. Computational Models of Visual Processing,

MIT Press.

Chang, C., Zhu, X., Ramanathan, P., and Girod, B. (2003).

Shape adaptation for light ﬁeld compression. In IEEE

ICIP.

Girod, B., Chang, C., Ramanathan, P., and Zhu, X. (2003).

Light ﬁeld compression using disparity-compensated

lifting. In IEEE ICASSP.

Gortler, S. J., Grzeszczuk, R., Szeliski, R., and Cohen, M. F.

(1996). The lumigraph. ACM Computer Graphics,

30(Annual Conference Series):43–54.

Heidrich, W., Lensch, H., Cohen, M., and Seidel, H. (1999).

Light ﬁeld techniques for reﬂections and refractions.

In Eurographics Rendering Workshop 1999. Euro-

graphics.

ITU-R Recommendation BT.500-10 (2000). Methodology

for the subjective assessment of the quality of televi-

sion pictures. Technical report, ITU, Geneva.

Keelan, B. W. (2002). Handbook of Image Quality: Char-

acterization and Prediction. Marcel Dekker, New

York, NY.

Lelescu, D. and Bossen, F. (2004). Representation and cod-

ing of light ﬁeld data. Graph. Models, 66(4):203–225.

Levoy, M. and Hanrahan, P. (1996). Lightﬁeld render-

ing. ACM Computer Graphics, 30(Annual Conference

Series):31–42.

Li, J., Shum, H., and Zhang, Y. (2001). On the compression of image based rendering scene: A comparison among block, reference and wavelet coders. Int. Journal of Image and Graphics, 1(1):45–61.

Linde, Y., Buzo, A., and Gray, R. (1980). An algorithm for

vector quantizer design. IEEE Trans. on Communica-

tions, 1:84–95.

Lloyd, S. P. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–136.


Magnor, M. and Girod, B. (1999). Hierarchical coding of

light ﬁelds with disparity maps. In IEEE ICIP, Kobe,

Japan, pages 334–338.

Magnor, M. and Girod, B. (2000). Data compression for

light ﬁeld rendering. IEEE Trans. Circuits and Sys-

tems for Video Technology, 10(3):338–343.

Magnor, M., Ramanathan, P., and Girod, B. (2003). Multi-

view coding for image-based rendering using 3-d

scene geometry. In IEEE Trans. Circuits and Systems

for Video Technology, 13(11):1092–1106.

Ramanathan, P., Flierl, M., and Girod, B. (2001). Multi-

hypothesis prediction for disparity-compensated light

ﬁeld compression. In IEEE ICIP.

Ramanathan, P., Kalman, M., and Girod, B. (2003). Rate-

distortion optimized streaming of compressed light

ﬁelds. In IEEE ICIP, pages 277–280.

Tong, X. and Gray, R. (2000). Coding of multi-view im-

ages for immersive viewing. In IEEE ICASSP, Istan-

bul, Turkey, pp. 1879-1882.

Wei, L.-Y. (1997). Light ﬁeld compression using wavelet

transform and vector quantization. Technical Report

EE372, University of Stanford.

Zhang, C. and Li, J. (2000). Compression of lumigraph with

multiple reference frame (MRF) prediction and just-

in-time rendering. In Data Compression Conference,

pages 253–262.
