Efficient Image Distribution on the Web
Instant Texturing for Collaborative Visualization of Virtual Environments
Michael Englert, Yvonne Jung, Jonas Etzold, Marcel Klomann and Paul Grimm
Fulda University of Applied Sciences, Marquardstr. 35, 36039 Fulda, Germany
Keywords:
HTML5, WebGL, GPU Decoding, Mobile Devices, Progressive Image Transmission, Collaboration.
Abstract:
In this paper, we present a browser-based Web 3D application that allows an instant distribution of image data
even over mobile networks as well as textured rendering of large image collections on mobile devices with
restricted processing power. Applications utilizing a lot of image data require an adaptive technology to build responsive user interfaces. This applies especially to the use in mobile networks. Furthermore, the massive amount of image data should be up- and downloaded in a progressive manner to provide instant feedback. Since people are used to the instant reaction of web applications and do not care about the amount of data that has to be transferred, the instantaneous display of imperfect content that gets continuously refined is state of the art for many application areas on the web. However, standard 2D image transmission technologies are usually inappropriate within a 3D context. In 2D, image size as well as resolution are often set during the authoring phase, whereas in 3D applications the size and displayed resolution of textured 3D objects depend on the virtual camera. Our GPUII approach (GPU-based Image Interlacing) follows a client-server architecture, which allows an instant distribution of new data while also reducing the CPU load and network traffic.
1 INTRODUCTION
Today, the combination and integration of real world
information into virtual environments is a well-known
task in Augmented and Virtual Reality research. The
increasing performance of mobile devices and wear-
ables like Google Glass additionally promote this re-
search area. Applications like Google Streetview
(Anguelov et al., 2010), as well as Microsoft Photo-
synth or Photo Tourism (Snavely et al., 2006) already
use a huge amount of real world image data to recon-
struct 3D geometry directly from digital photos and
show that this type of application is getting more and
more important.
Another important fact is the increasing availabil-
ity of 3D web applications. Web applications can
be a good alternative to native apps because of their
great deployment possibilities even on mobile de-
vices. Standards like HTML5 and WebGL as well
as 3D frameworks like Three.js (Cabello, 2013) or
X3DOM (Behr et al., 2010) facilitate the develop-
ment of such applications and additionally provide
an easy integration of 3D real-time graphics into web
pages without installing further plugins. To cope with the requirements of modern client-side web apps, not only were new web standards developed, but the performance of JavaScript engines has also increased enormously over the last few years. Furthermore, browser vendors and the W3C have added more features through their API specifications, which, e.g., allow direct access to the devices' media hardware like the camera or other sensors.
Building on these APIs, the collaborative con-
struction planning system recently proposed by (Et-
zold et al., 2014b) clearly shows that complex Mixed
Reality applications can be developed using only web
standard technologies. It combines CAD planning
data with real world information by placing real pho-
tos and annotations within the virtual 3D scenes gen-
erated from the CAD blueprints. Therefore, the au-
thors are using the camera access and other web tech-
nologies to visualize the combined data via hardware
supported 3D rendering. Furthermore, a server-based
collaboration component allows all involved people, like workers on the building site as well as supervisors or even investors, to discuss the current state or problems during the construction phase.
Here, not only static scenes are used, but new im-
age material is integrated into the 3D scene at the
original camera pose by using the sensor informa-
tion (such as GPS, gyroscope, etc.) of the mobile de-
vice, whenever new photos are captured (Etzold et al., 2014a).

Figure 1: Unordered photo collection that represents parts of a room or building and its reorganization within a 3D scene.

As a consequence, the updated scene data
including the newly taken images first has to be up-
loaded to the server and then the data has to be dis-
tributed to all connected clients in order to get a dis-
cussion base. In a short time a lot of photos are in-
tegrated into the virtual scene this way (see figure 1).
However, since the authors are using only standard
textures and do not focus on a more advanced stream-
ing approach, connected users have to spend a lot of
time to receive new image data, especially when using
slow mobile networks.
Along with the improving quality of images (e.g.,
captured by digital cameras), their sizes and resolu-
tion also increase, which is why the integration of sev-
eral images into the applications requires a lot of data
transmission that can be very problematic while using
mobile devices and their often strictly limited band-
width on mobile networks. While people are used to the instant reaction of web applications (responsive user interfaces) and do not care about the amount of data that has to be transferred, the transmission of this image data can take a lot of time, which can lead to a frustrating user experience. Users get bored if the system shows no reaction after a few seconds, and in the worst case they even leave the page.
Besides using smaller images, a good and ele-
gant approach to improve the application’s startup and
runtime behavior during synchronization with other
clients is to employ progressive image transmission
(PIT) strategies (Chee, 1999). Instead of being transferred as a single entity, the images are divided into smaller parts, which are transmitted consecutively and can also be used as previews. The first levels are rough
approximations while the quality increases with each
additional detail level. The instantaneous display of
imperfect content that gets continuously refined is
state of the art for many application areas on the web.
While the PNG and JPEG standards have become well-established methods for the progressive delivery of 2D image data, there has been little focus on suitable methods for the progressive transmission of textures in interactive 3D web applications.
Moreover, browsers still support the parallelization of work processes only in a rather limited way, so accessing the raw byte stream, including the decoding process during the loading phase, substantially impacts the interactive frame rate (Herzig et al., 2013). Other techniques have to be developed that better deal with the extremely limited processing power of mobile devices and concurrently transmit the image data in a progressive way to provide instant feedback.
To overcome these problems, our proposed web
application shows how a huge amount of real world
image data can be efficiently distributed to an arbi-
trary number of connected clients using different de-
vices and bandwidths. Our GPUII method for texture streaming and level-of-detail (LOD) rendering features a minimal overall data rate as well as simple and fast encoding and decoding with small first previews. In addition, we derive a control
function to specify the maximum refinement levels of
each texture based on the required rendering quality
and the current camera state. By balancing the load between CPU and GPU during the decoding phase, our approach is optimized for mobile devices with limited processing power and allows instant texturing in 3D web applications using the roughest approximation given by the first refinement level.
2 RELATED WORK
Different progressive 2D image transmission schemes exist, which are briefly discussed in order to select and transfer best practices as far as possible to a new progressive texture transmission approach that allows instant texturing of virtual 3D objects and fits well with the limited processing power of mobile devices.
While all web browsers support image formats like PNG, JPEG, and GIF, the progressive strategies to download them are natively implemented, and intermediate download results cannot be accessed from within JavaScript/WebGL. Though all browsers
present several previews of the image during trans-
mission, their efficiencies vary. In this regard, the
Adam7 interlacing scheme of the PNG file format
(Costello, 2003) allows a very fast first impression
of the image due to the simplicity of the algorithm.
Adam7 is a 2-dimensional interlacing scheme. Its en-
coding strategy consists in transposing the pixel order
from a sequential to a 2-dimensional distribution using an 8 × 8 pixel pattern with exactly seven steps. Within
these seven steps, the resolution increases by a factor
of two with respect to the previous step.
Utilizing the original Adam7 interlacing for tex-
tures in 3D scenes imposes several problems that have
EfficientImageDistributionontheWeb-InstantTexturingforCollaborativeVisualizationofVirtualEnvironments
479
to be resolved, as already discussed in (Herzig et al.,
2013). On the one hand, it is not possible to get
access to the exact preview steps, and on the other,
web browsers do not allow accessing this stream in
an efficient way. The byte stream has to be con-
verted using JavaScript, which could be rather time-
consuming. The Adam7 interlacing scheme strictly uses seven (preview) steps. This can be problematic if very big images have to be transferred, as even the first version could already be too big for slow mobile networks. Although in 3D scenes often not the whole
quality of a texture is required, for instance because
of a larger distance to the virtual camera, in the orig-
inal Adam7 method the stream cannot be paused if
enough data is already visualized. Furthermore, the
Adam7 technique is only implemented for the PNG
image format, and JPEG compression cannot be ex-
ploited efficiently this way.
Chee (Chee, 1999) already classified approaches to transmit images in a progressive way into four different categories: successive approximation, multi-stage residual coding, transmission sequence based coding, and hierarchical coding. Some approaches
only target an efficient transfer, deal with massive
amounts of pixels, and visualize them in a 2D context,
like Deep Zoom Images (DZI) (Kopf et al., 2007),
or they require a sophisticated encoding. Following
the classification scheme of (Chee, 1999), DZI can
be seen as hierarchical coding strategy, because of its
quadtree-based layout.
The pyramid transmission scheme of (Herzig
et al., 2013) uses a quadtree to divide and transfer
images as well as 3D data. The authors show that
such a strategy can be utilized as well in a 3D con-
text, to access 3D terrain data progressively. How-
ever, their method still exhibits some disadvantages,
as esp. the data transfer lacks efficiency, because all pixels of the first preview are transferred a second time when transmitting the next level, and a third time on the next finer level, etc. Other approaches to stream
images progressively try to extract the regions of in-
terest and transmit the preferred image parts as soon
as possible like (Hu et al., 2004), or esp. (Lim et al.,
2010), who already exploits a quadtree-based method
to select and transmit the preferred image parts first to
quickly get better previews.
Furthermore, other authors adapt images on the
server for different devices to better exploit the some-
times rather limited bandwidth (Wilcox, 2014), since
for example on mobile devices with small screens
only low resolution images need to be visualized.
This way, images can automatically be provided for
every required size by a PHP script (instead of the
typically manual process of preparing images in vari-
ous sizes as needed for different media queries/screen sizes in responsive web design), but they cannot be
transmitted progressively.
Especially for texture compression in hardware-
accelerated 3D graphics, the DXT formats (Pat Brown
et al., 2013) were developed. In contrast to image
compression algorithms like JPEG they have a fixed
data compression rate and require only one memory
access per texel for decompression. To overcome
artifacts arising with e.g. normal maps, (Munkberg
et al., 2006) later outlined several techniques to im-
prove quality also for normal map compression (cp.
(van Waveren and Castano, 2008) for a more detailed
discussion). However, all these methods are not yet
available in WebGL and only aim at reducing storage
memory and bandwidth, but do not provide any means
for progressive texture transmission over the web.
Since WebGL (Khronos Group, 2014), with its hardware-accelerated 3D rendering, is broadly supported in all major web browsers, and 3D web frameworks like X3DOM (Behr et al., 2010) or three.js (Cabello, 2013) simplify the development of 3D web apps, adaptive methods not only for images but also for textures in 3D scenes have gained importance. Thus, (Schwartz et al., 2013b) and (Schwartz et al., 2013a) recently presented a shader-based algorithm that supports progressive transmission as well as access to the dataset within a 3D context in order to stream a BTF (bidirectional texture function) for photorealistic rendering using WebGL, though the encoding is still very specialized.
3 TEXTURE REPRESENTATION
Mixed Reality (MR) applications like the web-based
support and collaboration tool presented in (Etzold
et al., 2014b) and (Etzold et al., 2014a) for con-
struction planning and supervising scenarios use a
lot of images to reconstruct virtual worlds combined
with real world data. The tool combines classical
CAD planning data with photo collections represent-
ing temporal snapshots of the associated construction
site, which can be integrated into the already existing
3D scene during runtime. After arranging new pho-
tos within the scene using sensor data and other MR
methods they have to be distributed to all connected
clients. A scene can contain a hundred or even more images (figure 1 should give an idea of this use case). On every single start of the application, all data has to be transferred over the network before it can be used on the client, which can be rather time-consuming.
Most of the time clients are connected by mo-
bile devices combined with their strictly limited bandwidth due to mobile networks.

Figure 2: Sequence diagram for the upload and distribution of a newly captured and positioned photo using GPUII levels.

If someone wants to
share a photo he or she has to upload it. Unfortu-
nately, the upload is usually even much slower than
the download, so one has to wait a long time. After
the upload the image also has to be downloaded by all
other connected (mobile) clients. But there are more
challenges to solve when using mobile devices. They
are not as powerful as a modern desktop PC or note-
book. Thus, encoding and decoding have to be very cheap and efficient.
In this section, we therefore present our approach
for progressive image transmission to allow instant
textured rendering in distributed, web-based 3D ap-
plications using mobile devices, which we call GPUII
(GPU-based Image Interlacing). The goal of our work
is to allow an efficient texture transmission for 3D
web applications. For instant response a first refine-
ment level should be delivered very fast, which results
in a small preview image. The quality of the trans-
mitted textures should be controlled dynamically by
their size and displayed resolution in the final render-
ing. The amount of transmitted data should be mini-
mal, which means, that each refinement level is inte-
grated into the next one, so that no pixels are trans-
mitted twice. We do not focus or optimize for a spe-
cific image format in order to support a broad range
of applications and to allow the usage of all com-
monly supported image formats of a browser exploit-
ing their specific advantages. To allow using massive
amounts of images and to minimize the requirements
for the web browser the proposed approach should
be lightweight regarding CPU and memory resources,
and should also scale well for all server jobs.
3.1 Encoding Scheme
We use a technique that exploits a 2-dimensional
interlacing to split original images, independent of
their original format, into a subset of preview images.
The result is an image set with progressively increas-
ing resolution, where the original image format only
serves as container format. Instead of utilizing a static
count of subimages the number of images is calcu-
lated based on the original image size. We split the
original image until a minimum size (e.g. 128 × 128
pixels) is reached, which can be dynamically defined.
In case of a 512 × 512 pixel image, we produce a subset of only five images. If an image with a resolution of 4096 × 4096 pixels shall be used, we encode the image into a subset of eleven preview steps.
Before creating all GPUII levels, we scale the
original image to the best matching power-of-two
(POT) representation to provide better performance
and MipMap support to prevent flickering during
camera movement. After this, the number of image levels for the resulting GPUII data has to be calculated. This is done by dividing width and height by two in an alternating manner until the defined minimum resolution is reached. After scaling and calculating the number of iterations, the actual encoding can be started. The encoding scheme we exploit in our application is based on the Adam7 interlacing scheme (Costello, 2003). In contrast to the original implementation, there is no restriction to exactly seven steps. Thus, a better adaptation to large images is possible, which results in predictable loading times of the first preview independent of the image size. In addition, refinement control is possible.
To provide interactive framerates during encoding
we use worker threads to produce all image levels.
Through workers, browsers allow parallel processing.
Functions can run as long as needed without affect-
ing the framerate or user interface of the application.
During encoding, a CPU-based approach is no problem, because only one image has to be converted at a time. Decoding is a little trickier in our application, because a lot of images have to be decoded concurrently.
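A minimal sketch of such a worker setup could look as follows; the worker file name, the message shape, and the helper function are assumptions for illustration, not the actual implementation.

```typescript
// Sketch: encode GPUII levels off the main thread so the frame rate and
// user interface stay responsive. "gpuii-encoder.js", the message shape,
// and uploadLevel() are illustrative assumptions.
declare function uploadLevel(level: number, data: Blob): void;

const encoder = new Worker("gpuii-encoder.js");

encoder.onmessage = (e: MessageEvent) => {
  // the worker posts one encoded preview level at a time
  uploadLevel(e.data.level, e.data.blob);
};

function encodeImage(pixels: ImageData): void {
  // the pixel buffer is transferred to the worker, not copied
  encoder.postMessage({ pixels }, [pixels.data.buffer]);
}
```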
3.2 Decoding Scheme
During the encoding step, the original image was split into a set of preview images with progressively increasing resolution. The number of subimages varies, and they are stored with the following naming scheme: [1].[format] … [n].[format]. To decode the image correctly, the previews are loaded in chronological order, starting from preview number 1 up to n.
After loading one or several levels of the image, it has
to be decoded before presentation, where the highly
parallelized computing power of the GPU is used.
Figure 3: Distribution of a) a newly captured and arranged photo with b) upstream of the levels of the GPUII dataset, c) storage of the GPUII dataset on the server and distribution to connected clients d) using progressive download with different bandwidths.

Moreover, every image pixel is transmitted only once during download, which is ensured through our encoding strategy.
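As a sketch, the chronological loading of the levels could be implemented like this; the URL layout and the callback are assumptions for illustration.

```typescript
// Sketch: download the preview levels 1..n in order, following the
// [1].[format] ... [n].[format] naming scheme; URL layout and the
// onLevel callback are illustrative assumptions.
async function loadGpuiiLevels(
  baseUrl: string,
  levelCount: number,
  format: string,
  onLevel: (level: number, image: ImageBitmap) => void
): Promise<void> {
  for (let i = 1; i <= levelCount; i++) {
    const response = await fetch(`${baseUrl}/${i}.${format}`);
    const bitmap = await createImageBitmap(await response.blob());
    onLevel(i, bitmap); // combine into the current texture (see below)
  }
}
```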
The first preview can be displayed directly without
any decoding effort. All consecutively loaded n − 1 image levels are integrated into the already existing image, following the algorithm sketched in (Englert et al., 2014); cp. figure 2 for a visualization.
After the second level of the image has been loaded,
the new data and the currently visualized texture are
both sent to the shader. Additionally, a combination
pattern is required. The shader itself is applied to a
view-aligned quad of the targeted image size during
an offscreen rendering pass using two FBOs (Frame-
buffer Objects). The result of the rendering pass is
written to an FBO and serves in the following frames
as the new and finer representation of the surface tex-
ture. This process is repeated in a ping-pong fashion until the final texture resolution, depending on the factors outlined in section 3.3, is reached.
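A minimal sketch of such a combination shader is given below; the uniform and varying names are our own, as the paper does not list the shader source.

```typescript
// Sketch of the per-pixel combination in the offscreen pass; names are
// illustrative. The 2x2 pattern texture is assumed to be sampled with
// nearest filtering and repeat wrapping so that it tiles correctly.
const combineFragmentShader = `
  precision mediump float;
  uniform sampler2D uCurrent;  // texture decoded so far
  uniform sampler2D uNewLevel; // freshly downloaded preview level
  uniform sampler2D uPattern;  // 2x2 black/white lookup pattern
  uniform vec2 uRepeat;        // repeat vector R from equation (1)
  varying vec2 vTexCoord;
  void main() {
    // the repeated pattern selects per pixel which input texture wins:
    // black (0.0) keeps the current texture, white (1.0) takes the new level
    float selector = texture2D(uPattern, vTexCoord * uRepeat).r;
    gl_FragColor = mix(texture2D(uCurrent, vTexCoord),
                       texture2D(uNewLevel, vTexCoord),
                       selector);
  }`;
```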
For a combination operation we exploit two pat-
terns that are alternately applied. A pattern is a small
texture of 2 × 2 px that is used as lookup texture and
that describes how to combine the pixels of two im-
ages into an intermediate result. Which pattern is used first is computed from the number of previews of the decoding dataset. An even number of texture previews is combined line by line, whereas in the other case the combination is done column by column. The colors black and white are used as indices and define which texture's pixel is placed in the resulting image. Obviously, more complex patterns are possible this way. To implement this approach on
the GPU, two textures that should be combined are re-
quired, including a pattern texture, along with a scaling vector $R_{x,y}$ as a uniform variable for scaling the texture coordinates accordingly:

$$R_{x,y} = \begin{cases} \left( \dfrac{r_x(p_1) \cdot 2^{\lfloor 0.5n \rfloor}}{2},\; \dfrac{r_y(p_1) \cdot 2^{\lfloor 0.5(n-1) \rfloor}}{2} \right), & 2 \nmid m \\[1ex] \left( \dfrac{r_x(p_1) \cdot 2^{\lfloor 0.5(n-1) \rfloor}}{2},\; \dfrac{r_y(p_1) \cdot 2^{\lfloor 0.5n \rfloor}}{2} \right), & 2 \mid m \end{cases} \tag{1}$$

Here, $R_{x,y}$ defines how often the pattern has to be repeated in x and y direction (or s and t, respectively). If, for instance, two textures of resolution 64 × 64 each should be combined and the number of desired previews is odd, the used vector would be $(64, 32)$. The exact scaling factor is obtained via equation (1), where $r_{x,y}(p_1)$ specifies the resolution of the first preview image $p_1$ in x and y direction, $n$ the current preview number, and $m$ the preview count. The name GPU-based Image Interlacing thus denotes the basic idea of our algorithm: the consecutively ordered image levels are downloaded progressively and combined in an interlaced manner on the GPU.
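On the CPU side, equation (1) translates into a small helper that computes the uniform before each combination pass; the following sketch uses hypothetical parameter names.

```typescript
// Sketch: repeat vector R_{x,y} per equation (1). p1Width/p1Height are the
// resolution r(p_1) of the first preview, n the current preview number,
// and m the preview count; parameter names are our own.
function repeatVector(
  p1Width: number,
  p1Height: number,
  n: number,
  m: number
): [number, number] {
  const a = Math.pow(2, Math.floor(0.5 * n)) / 2;
  const b = Math.pow(2, Math.floor(0.5 * (n - 1))) / 2;
  return m % 2 === 1
    ? [p1Width * a, p1Height * b]  // 2 does not divide m
    : [p1Width * b, p1Height * a]; // 2 divides m
}

console.log(repeatVector(64, 64, 2, 3)); // [64, 32], as in the example above
```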
3.3 Data Distribution
To distribute photos, our application consists of two
parts. First, the image has to be uploaded to the server,
followed by the distribution to all other clients. Fig-
ures 2 and 3 visualize and additionally explain the en-
tire streaming process.
Upload of New GPUII Dataset. Once we have cre-
ated a new GPUII dataset on the client, the image data
has to be transferred to the server and distributed to
all connected clients. All levels of the GPUII dataset
are transferred in a chronological manner and stored
by the server using the previously explained naming
scheme. The distribution to all other connected clients
starts immediately when the first level is available.
The server sends a message to all clients to create a
new GPUII dataset with a position and URL. They
directly start the download of the first level and then
follow the rules explained next.
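The server-side notification could be sketched with WebSockets as follows; the transport and the message shape are assumptions, as the paper does not specify them.

```typescript
// Sketch: broadcast a "new GPUII image" message to all connected clients
// as soon as the first level is stored; the 'ws' package and the message
// shape are illustrative assumptions.
import { WebSocketServer } from "ws";

const wss = new WebSocketServer({ port: 8080 });

function onFirstLevelStored(url: string, position: [number, number, number]): void {
  const message = JSON.stringify({ type: "newGpuiiImage", url, position });
  for (const client of wss.clients) {
    client.send(message); // clients then start downloading level 1
  }
}
```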
Download of All Required GPUII Datasets. Tex-
tured objects in a 3D scene often cover only a small
part of the viewport. Hence, a more sophisticated
download method is necessary that also takes proper-
ties of the 3D scene, like camera position, etc., into
account, to decide which preview has to be loaded
next and, if multiple textures are used, how to sort
them for importance. We define importance using the
following criteria with decreasing priority:
GRAPP2015-InternationalConferenceonComputerGraphicsTheoryandApplications
482
Visibility
Necessity of next refinement level
Distance to camera
Currently loaded preview step
All images are sorted along these criteria. In addi-
tion to the visibility of each image, it has to be decided
if further preview steps are needed. This information
is determined by calculating the screen space size of
the image and relating it to its pixel density (similar to
mipmapping). The advantage of this control function is that a better resolution is not required in every case.
All images that are visible and require a finer resolu-
tion level are registered to the system for sorting along
camera distance, size, and priority.
To prevent the same image from always being preferred by the distance criterion, distances are ordered into concentric circles around the virtual camera to group images with similar distance. In a last sorting step, the
images of the nearest circle are ordered by their cur-
rent preview step, where the images with the smallest
preview step are preferred to get a consistent look.
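The prioritization described above can be sketched as a simple filter-and-sort step; the field names are illustrative.

```typescript
// Sketch of the download prioritization; field names are illustrative.
interface GpuiiImageState {
  visible: boolean;         // criterion 1: visibility
  needsRefinement: boolean; // criterion 2: screen-space size vs. pixel density
  distanceRing: number;     // criterion 3: concentric circle around the camera
  currentLevel: number;     // criterion 4: preview step loaded so far
}

function downloadQueue(images: GpuiiImageState[]): GpuiiImageState[] {
  return images
    .filter(img => img.visible && img.needsRefinement)
    .sort((a, b) =>
      a.distanceRing !== b.distanceRing
        ? a.distanceRing - b.distanceRing   // nearer rings first
        : a.currentLevel - b.currentLevel); // coarsest previews first
}
```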
4 RESULTS AND APPLICATIONS
In this section we discuss the main benefits of our ap-
proach and also present some additional use cases.
4.1 Results
Using GPUII instead of standard image textures has several advantages for applications that use huge amounts of image data on low-end hardware like mobile devices together with mobile networks.
Adaption. The advantage of adaptation can be split into three areas. First, the download can be adapted to the bandwidth: when using mobile networks, the download of the GPUII preview parts can be restricted, and using one step less than the entire amount of GPUII preview versions can reduce the download volume by up to 50%. Furthermore, the download can be adapted to the storage size of the presentation device. Mobile devices are not only restricted in processing power, but also in their memory. In our application, a lot of images should be visualized at the same time, which often leads to reaching the memory limit. In case of extreme storage limitations, the download amount for each GPUII image can also be restricted as previously explained. In addition to the restrictions caused by bandwidth and device limitations, our approach adapts to the camera position within the scene and the size of the viewport. This can efficiently reduce the data download amount.

Figure 4: a) Difference image (enhanced contrast for better visibility) of b) original PNG texture (4096 × 4096 px, 8.5 MB) and c) corresponding 4096 × 4096 px GPUII dataset (1 MB) adaptively rendered on a 1024 × 768 px viewport.

Often images in 3D scenes are rendered very small due to their distance to the virtual camera. And when presented in fullscreen, they only require the size of the viewport, which is often less than the image resolution of currently available digital cameras. Utilizing our
proposed GPUII method offers cost-savings of about
80% in situations with big viewport but small image
sizes. A comparison of an original PNG image with
a GPUII visualization is shown in figure 4, where a)
shows the slightly exaggerated difference image.
First Preview. Another important aspect that has to
be taken into account is the transmission efficiency of
the first preview of the image. Additionally, the first
preview should provide enough data to get a good first
impression of what will be shown later. Therefore, a
trade-off between minimal resolution and download
speed has to be considered. Adam7 is limited in its variability and statically uses seven steps to reorder the pixels. This always leads to 1/64th of the number of pixels in the first representation, and both high resolution as well as low resolution images are restructured in the same way. This soon leads to rather large first previews with increasing source image sizes. Our proposed method, in contrast, exploits a dynamic number of preview steps and allows specifying a minimum resolution in x- and y-direction, which prevents a further subdivision of the source image. In case of small source resolutions, the dataset thus consists of a small number of previews, while high resolution source images behave in the opposite manner: the number of previews increases. The size of the first preview is therefore nearly identical for arbitrary source image resolutions. For GPUII, a minimum size of 128 × 128 px seems most reasonable; in our tests we found this to be a good trade-off between data size and preview quality.
Data Distribution. A fast distribution of new pho-
tos is another important aspect. While uploading
standard images, the server has to wait until all data
of the image is available before distributing it to all connected clients.

Figure 5: Using GPUII to transmit terrain data (left) or geometry images as proposed in (Gu et al., 2002) (right).

By calculating GPUII images on the client before uploading, the distribution can be accelerated considerably. Because of the ever increasing image resolutions of digital cameras, the image resolution can be reduced before uploading, adapted to the bandwidth. Furthermore, since the images are used as textures in WebGL, their dimensions should be powers of two.
After transmitting the first preview level of GPUII,
which requires a transfer of only a few KB, the distri-
bution can be started some time earlier. This leads to
instant previews on all connected clients directly after starting an image upload, independent of the supported data rates of the network.
4.2 Additional Use Cases
Progressive Transmission of 3D Vertex Data.
Having regular data eases progressive transmission,
where esp. terrain data is very often arranged in a
regular form. This type of data can be easily encoded
in images and streamed with our approach. Figure
5 exemplarily shows some regular terrain data trans-
mitted and rendered by using our GPUII method. The
rendered terrain is visualized with the first version of both the displacement data and the surface texture. Both textures are transmitted using PNG containers because of the lossless compression, although for the color texture JPEG is alternatively possible as image transport format, since compression artifacts are usually not perceivable here. Using geometry images (Gu
et al., 2002), 3D models can be transformed into reg-
ular meshes. These meshes, or more precisely their
vertices, can be arranged within images as regular
RGB values depending on the topology. Therefore, such meshes can be transferred progressively with GPUII using the same texture, which would not be possible with standard texturing.
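As an illustration, a vertex shader could displace a regular grid by sampling such a progressively streamed texture; the attribute and uniform names are our own, and vertex texture fetch support is assumed.

```typescript
// Sketch: displace a regular grid from a GPUII-streamed height texture
// (requires vertex texture fetch; names are illustrative assumptions).
const terrainVertexShader = `
  attribute vec2 aGridPos;         // regular grid position in [0,1]^2
  uniform sampler2D uDisplacement; // progressively refined height data
  uniform mat4 uMvp;
  uniform float uHeightScale;
  void main() {
    float height = texture2D(uDisplacement, aGridPos).r;
    gl_Position = uMvp * vec4(aGridPos.x, height * uHeightScale, aGridPos.y, 1.0);
  }`;
```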
Geometry Streaming and LOD. Furthermore,
GPUII can be used together with progressively
streamed 3D geometries (e.g., the PopGeometry pre-
sented in (Limper et al., 2013)), where the geometry
data also gets streamed and builds up progressively
in a LOD-like manner. However, for small models
the texture data usually takes more time to transfer
and thus the progressive 3D geometry will not have
a texture until the transfer is completed. To enhance
the quality, GPUII can be combined with the POP ap-
proach so that the visual perception is more consistent
in that the geometry and texture data build up simul-
taneously, see (Englert et al., 2014). Note that smaller
loading times are possible if for a certain camera dis-
tance not the full texture quality is needed. Thus, the
texture refinement can stop at a lower level-of-detail
(LOD), just like the POP buffer geometry does.
2D Image Viewer. GPUII can also be used for
streaming texture data progressively when displaying
the image in a pure 2D context. As a result, a preview
of the corresponding image is shown almost immedi-
ately. The missing data to show the image in its full
quality is transmitted progressively to further adjust
the quality depending on the requirements. Therefore,
images can be adapted to different screen sizes. Depending on the actually visible part of the image, the required data can thus be reduced to a minimum.
5 CONCLUSION AND FUTURE WORK
In this short paper, we have presented an adaptive
bandwidth-optimized approach that allows instant im-
age distribution and web-based textured rendering on
strictly limited mobile devices and the often slow mobile networks they are connected to. Our method uses a simple encoding scheme, based on Adam7 interlacing, and a fast decoding algorithm that benefits from hardware acceleration by the GPU even with WebGL's rather limited instruction set. Various image sizes of the previews are possible, including the possibility to specify the minimal resolution of the first preview, which enables us to generate first previews of nearly identical size in bytes, independent of the source image size. This allows a more flexible handling of source images with arbitrary resolutions.
Moreover, image interlacing schemes like Adam7 do not allow pausing the download and resuming it when more data is required, though this can be very helpful to reduce the amount of data to be transferred. Our
GPUII approach however allows visualizing images
in pure 2D applications in an adaptive manner, which
is esp. useful for responsive web design, where an op-
timal viewing experience for a wide range of displays
GRAPP2015-InternationalConferenceonComputerGraphicsTheoryandApplications
484
has to be provided. Whereas in 2D the image size and
resolution are defined during authoring, in 3D appli-
cations the size and displayed resolution of textured
3D objects depend on their world space positions and
the viewpoint, which is updated every frame. So the full texture quality is not necessary in every case, for instance if an object is far away from the camera.
Therefore, our approach can also be used as a new
level-of-detail method on the texture level, indepen-
dent from the geometric model representation.
To summarize, our proposed technique can be
applied to stream surface textures for progres-
sive meshes for consistent rendering, to load large
amounts of images in a 3D scene, or to transmit reg-
ular geometry information, like e.g. displacement
maps or other vertex information. In addition to the
PNG format, all common image formats that are sup-
ported by browsers can be utilized as data transport
containers for our GPUII textures. To ease usage, we have integrated the proposed technique as a special texture node in X3DOM. Moreover, in sec. 3 we have
also shown an important application scenario, where
our approach allows increasing the number of photos
in the 3D scene by a factor of at least ten to twelve.
For future work, we would like to combine our
method with a hierarchical approach to stream large
regularly organized meshes (e.g., terrain data). Be-
sides this, it would be interesting to natively imple-
ment our polyfill approach in the web browser for
transparent and even more efficient usage.
REFERENCES
Anguelov, D., Dulong, C., Filip, D., Frueh, C., Lafon,
S., Lyon, R., Ogale, A., Vincent, L., and Weaver, J.
(2010). Google street view: Capturing the world at
street level. Computer, 43.
Behr, J., Jung, Y., Keil, J., Drevensek, T., Eschler, P., Zöllner, M., and Fellner, D. W. (2010). A scalable architecture for the HTML5/X3D integration model X3DOM. In Proceedings Web3D '10, pages 185–193, New York, USA. ACM Press.
Cabello, R. (2013). Three.js. http://threejs.org/.
Chee, Y.-K. (1999). Survey of progressive image transmis-
sion methods. International Journal of Imaging Sys-
tems and Technology, 10(1):3–19.
Costello, A. M. (2003). Portable Network Graphics (PNG) specification (second edition): Information technology – computer graphics and image processing.
Englert, M., Jung, Y., Klomann, M., Etzold, J., and
Grimm, P. (2014). Instant texture transmission using
bandwidth-optimized progressive interlacing images.
In Proceedings of 19th Intl. Conf. on 3D Web Tech-
nologies, Web3D ’14, New York, USA. ACM.
Etzold, J., Englert, M., Grimm, P., Jung, Y., and Klomann,
M. (2014a). Mipos: Towards mobile image position-
ing in mixed reality web applications based on mobile
sensors. In Proceedings of 19th Intl. Conf. on 3D Web
Technologies, Web3D ’14, pages 17–25, New York,
USA. ACM.
Etzold, J., Grimm, P., Schweitzer, J., and Dörner, R. (2014b). karbon: a collaborative MR web application for communication support in construction scenarios. In CSCW Companion, pages 9–12. ACM.
Gu, X., Gortler, S. J., and Hoppe, H. (2002). Geometry
images. ACM Trans. Graph., 21(3):355–361.
Herzig, P., Englert, M., Wagner, S., Jung, Y., and Bockholt,
U. (2013). X3D-Earthbrowser: Visualize our earth
in your web browser. In Proceedings Web3D 2013,
pages 139–142, New York, USA. ACM Press.
Hu, Y., Xie, X., Chen, Z., and Ma, W.-Y. (2004). Attention
model based progressive image transmission. In Mul-
timedia and Expo, ICME ’04, volume 2, pages 1079–1082. IEEE.
Khronos Group (2014). WebGL specification. http://www.khronos.org/registry/webgl/specs/latest/1.0/.
Kopf, J., Uyttendaele, M., Deussen, O., and Cohen, M. F.
(2007). Capturing and viewing gigapixel images.
In ACM SIGGRAPH 2007 Papers, SIGGRAPH ’07,
New York, NY, USA. ACM.
Lim, N.-K., Kim, D.-Y., and Lee, H. (2010). Interactive
progressive image transmission for realtime applica-
tions. Consumer Electronics, IEEE Transactions on,
56(4):2438–2444.
Limper, M., Jung, Y., Behr, J., and Alexa, M. (2013). The
pop buffer: Rapid progressive clustering by geometry
quantization. Computer Graphics Forum, 32(7):197–
206.
Munkberg, J., Akenine-Möller, T., and Ström, J. (2006). High quality normal map compression. In Proceedings Graphics Hardware '06, pages 95–102, New York, NY, USA. ACM.
Pat Brown et al. (2013). GL_EXT_texture_compression_s3tc. http://www.opengl.org/registry/specs/EXT/texture_compression_s3tc.txt.
Schwartz, C., Ruiters, R., and Klein, R. (2013a). Level-
of-detail streaming and rendering using bidirectional
sparse virtual texture functions. Comput. Graph. Fo-
rum, pages 345–354.
Schwartz, C., Ruiters, R., Weinmann, M., and Klein, R.
(2013b). Webgl-based streaming and presentation of
objects with bidirectional texture functions. Jour-
nal on Computing and Cultural Heritage (JOCCH),
6(3):11:1–11:21.
Snavely, N., Seitz, S. M., and Szeliski, R. (2006). Photo
tourism: Exploring photo collections in 3d. In
SIGGRAPH Conference Proceedings, pages 835–846,
New York, NY, USA. ACM Press.
van Waveren, J. M. P. and Castano, I. (2008). Real-time
normal map dxt compression. http://www.nvidia.de/
object/real-time-normal-map-dxt-compression.html.
Wilcox, M. (2014). Adaptive images - deliver small images
to small devices. http://adaptive-images.com/.
EfficientImageDistributionontheWeb-InstantTexturingforCollaborativeVisualizationofVirtualEnvironments
485