2 RELATED WORK
A survey of real scene illumination modelling for
Augmented Reality is given in (Jacobs and Loscos,
2004). The survey indicates that no single family of approaches is preferred or dominant; no technology has matured to the point of outperforming the others. Each approach offers a set of possibilities at the price of a set of assumptions or limitations, leaving the application scenario to define which approach to choose.
There are three main categories of approaches: 1) omni-directional environment maps, 2) placing known objects/probes in the scene, and 3) manually or semi-manually modelling the entire scene, including the light sources, and performing inverse rendering.
The most widely used approach is to capture the
scene illumination in a High Dynamic Range (HDR),
(Debevec and Malik, 1997), omni-directional envi-
ronment map, also called a light probe. The technique
was pioneered by Debevec in (Debevec, 1998) and has since been used in various forms in much subsequent research, e.g., (Barsi et al., 2005; Debevec, 2002; Gibson et al., 2003; Madsen and Laursen, 2007). The technique
gives excellent results if the dominant illumination in
the scene can be considered infinitely distant relative
to the size of the augmented objects. The drawback is that it is time-consuming and impractical to re-acquire the environment map whenever something in the scene, for example the illumination, has changed.
Illumination adaptive techniques based on the envi-
ronment map idea have been demonstrated in (Havran
et al., 2005; Kanbara and Yokoya, 2004) but require a
prototype omni-directional HDR camera, or a reflec-
tive sphere placed in the scene, respectively.
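To make the light-probe idea concrete, the following is a minimal sketch of the distant-illumination lookup at the heart of environment-map rendering. The latitude-longitude map layout and the y-up coordinate convention are assumptions made here for illustration; they are not specifics of the cited systems.

```python
import numpy as np

def latlong_uv(direction):
    """Map a 3-D direction vector to (u, v) texture coordinates in a
    latitude-longitude environment map (one common light-probe layout).
    Assumed convention: y is up, u wraps the azimuth, v spans the
    zenith angle."""
    x, y, z = direction / np.linalg.norm(direction)
    u = (np.arctan2(x, -z) / (2.0 * np.pi)) % 1.0   # azimuth in [0, 1)
    v = np.arccos(np.clip(y, -1.0, 1.0)) / np.pi    # zenith angle in [0, 1]
    return u, v

def sample_envmap(envmap, direction):
    """Nearest-neighbour radiance lookup, valid when the illumination
    can be considered infinitely distant relative to the augmented
    objects (the core assumption of light-probe rendering)."""
    h, w, _ = envmap.shape
    u, v = latlong_uv(direction)
    col = min(int(u * w), w - 1)
    row = min(int(v * h), h - 1)
    return envmap[row, col]
```

A renderer would evaluate such lookups over many sampled directions per shading point; the sketch only shows the map parameterisation itself.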
The other popular family of approaches is based
on requiring the presence of a known object in the
scene. Sato et al. (Sato et al., 1999a; Sato et al., 1999b) analyze the shadows cast by a known object onto a homogeneous Lambertian surface, or require images of the scene with and without the shadow-casting probe object. Hara et al. (Hara et al., 2005) analyze the shading of a geometrically known object with homogeneous (uniform albedo) Lambertian reflectance, or require multiple images with different polarizations,
to estimate the illumination direction of a single point
light source. Multiple light sources can be estimated
from the shading of a known object with homoge-
neous Lambertian reflectance using the technique de-
scribed in (Wang and Samaras, 2008).
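The core of such shading-based estimation can be illustrated with a simple sketch: for a Lambertian surface lit by one distant source, Lambert's law I = rho * max(n . l, 0) is linear in the light vector l over the lit pixels, so l (direction scaled by intensity) follows from a least-squares fit. This is only the basic principle; the cited methods add multi-light handling and robustness that are not shown here.

```python
import numpy as np

def estimate_light_direction(normals, intensities, albedo=1.0):
    """Estimate a single distant light vector from the shading of a
    geometrically known object with homogeneous Lambertian reflectance.
    `normals` is an (N, 3) array of unit surface normals at lit pixels,
    `intensities` an (N,) array of observed intensities.  Since all
    pixels are assumed lit, I = albedo * (n . l) is linear in l."""
    A = albedo * np.asarray(normals, dtype=float)
    l, *_ = np.linalg.lstsq(A, np.asarray(intensities, dtype=float),
                            rcond=None)
    return l  # direction * source intensity; normalise for direction only
```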
The last family of approaches does not estimate illumination per se; instead it relies on modelling the entire scene in full detail, including the geometry and the radiances of the light sources. The
modelling process is labor intensive. Given the full description of the scene and images of it (in HDR if needed), inverse rendering can be performed to estimate the parameters of applicable reflectance functions of scene surfaces. Subsequently, virtual objects
can be rendered into the scene with full global illumi-
nation since all required information is known. Exam-
ples include (Boivin and Gagalowicz, 2001; Boivin
and Gagalowicz, 2002; Loscos et al., 2000; Yu et al.,
1999).
A final piece of related work does not fall into the above categories. Using manually identified essential points (the top and bottom points of two vertical structures and their cast shadows in outdoor sunlight scenes), the light source direction (the direction vector to the sun) can be determined (Cao et al., 2005).
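The underlying geometry is simple once the points are in 3-D: a sun ray passes through the top of a vertical structure and the tip of its cast shadow, so the direction to the sun is the vector from shadow tip to top. The sketch below assumes known 3-D coordinates; the cited method works from image coordinates using two structures, which is not reproduced here.

```python
import numpy as np

def sun_direction(top, shadow_tip):
    """Unit direction vector toward the sun, given the 3-D position of
    the top of a vertical structure and the point where that top's
    shadow falls on the ground plane.  The sun is treated as a distant
    directional source, so one ray suffices in 3-D."""
    d = np.asarray(top, dtype=float) - np.asarray(shadow_tip, dtype=float)
    return d / np.linalg.norm(d)
```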
In summary, existing methods either require pre-recorded full HDR environment maps, require homogeneous Lambertian objects to be present in the scene,
require total modelling of the scene including the il-
lumination, or require manual identification of essen-
tial object and shadow points. None of the mentioned
techniques offer a practical solution to automatically
adapt to the drastically changing illumination condi-
tions of outdoor scenes.
The approach proposed in this paper addresses all of these assumptions and/or constraints: it does not require HDR environment maps or HDR image data, it does not require objects with homogeneous reflectance (entire objects with uniform reflectance), it does not require manual modelling of the illumination (in fact the illumination is estimated directly), and there is no manual identification of essential points.
3 ASSUMPTIONS BEHIND
APPROACH
Our approach rests on a few assumptions that are
listed here for easy overview. It is assumed that we
have registered color and depth data on a per pixel
level. High Dynamic Range color imagery is not re-
quired; standard 8 bit per color channel images suffice
if all relevant surfaces in the scene are reasonably exposed. In this paper the image data is acquired using a commercially available stereo camera, namely the Bumblebee XB3 from Point Grey (PointGrey, 2009).
It is also assumed that the response curve of the color
camera is approximately linear. The Bumblebee XB3 camera is by no means a high-quality color imaging camera but has performed well enough. It is also assumed that the scene is dominated by approximately diffuse surfaces, such as asphalt, concrete, or brick,
GRAPP 2011 - International Conference on Computer Graphics Theory and Applications