age plane, with considerably higher ray density in the
foveal region. After a five-minute preprocessing step,
they observed a 4.6x speedup when rendering a
256 x 256 x 109 voxel magnetic resonance scan.
Certain features of a scene, such as edges, abrupt
changes in color, and sudden movement, tend to
attract involuntary user attention. Low-level saliency
models determine which regions of a scene exhibit
these features, and can be used as an alternative to
eye tracking when locating regions of interest. Cater
et al. (Cater et al., 2003) applied a saliency model that
includes knowledge of a viewer’s visual task in or-
der to render a scene with high resolution in regions
of interest and lower resolution elsewhere. Spatial
level of detail variation can also be realized through
adaptive subdivision of three dimensional polygon
meshes. Reddy (Reddy, 2001) developed a system
that renders terrain geometry in high detail at the fixa-
tion point and a simplified mesh outside of the foveal
region. This is accomplished by recursively subdi-
viding the mesh, with regions outside of the fovea
terminating earlier than those within. For a terrain
model with 1.1 million triangles, the perceptual op-
timization achieved a 2.7x improvement in rendering
time. While no eye tracking hardware was used for
this model (fixation was bound to the center of the im-
age), Reddy emphasizes the need for such technology
to produce an accurate perceptually based system. A
more general-purpose method for adaptive subdivi-
sion was proposed and implemented by Murphy and
Duchowski (Murphy and Duchowski, 2001). It con-
verts a full-polygon mesh to a variable level of detail
mesh through spatial degradation according to visual
angle. An eye tracker is used to determine which por-
tion of the mesh to render in full detail while the re-
mainder is rendered using the degraded mesh. For a
268,686 polygon Igea mesh, applying this technique
allowed for near-interactive frame rates (20-30 fps),
while frame rate for the full resolution model was too
low to measure.
While many perceptual optimization techniques
have shown positive results, existing methods are not
well-suited for application in a subtle, perceptually
optimized real-time computer graphics architecture.
Multi-resolution display models tend to produce no-
ticeable image degradation; according to Levoy and
Whitaker (Levoy and Whitaker, 1990), “users are
generally aware of the variable-resolution structure of
the image”. In addition, the nonuniform pixel dis-
tribution produced by the multi-resolution approach
tends to exhibit poor coherency with regard to the
Single Instruction, Multiple Data (SIMD) paradigm
employed by modern GPUs. Considering the current
trend towards massively parallel computing architec-
tures, this is a major drawback. Adaptive subdivision
comes with a similar drawback; transitioning between
the full-detail mesh and the spatially degraded mesh
produces motion that is very perceptible to the user’s
peripheral vision. Task-level saliency models offer
excellent computational speedup and low noticeabil-
ity. However, they are not applicable to the general
case, where the user task may be complex and re-
gions of interest are not guaranteed to be consistent
or easily identifiable. Furthermore, automatic predic-
tion of attention regions has been shown to be unre-
liable (Marmitt, 2002). Our perceptually optimized
rendering framework leverages the difference in acu-
ity between the foveal and peripheral regions of the
field of view to provide computational speedup while
more effectively preserving perceived image quality
compared to spatial degradation techniques.
3 SYSTEM DESIGN
3.1 Ray-Tracing Framework
Ray-tracing is a well-established method for render-
ing three-dimensional scenes (Whitted, 1980). The
algorithm models the approximate path of light in re-
verse, flowing from the camera to objects in the scene.
When a light ray intersects an object, the associated
pixel is filled with the color of the object at that point.
For reflective and refractive objects, additional rays
are spawned recursively at the point of intersection.
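The reverse light-transport loop described above can be sketched in a few lines of code. The following is a minimal illustrative sketch, not the paper's implementation: the sphere tuple format (center, radius, color, reflectivity), the function names, and the fixed recursion depth are all hypothetical choices made for brevity.

```python
import math

# Minimal sketch of reverse light transport: rays flow from the camera
# into the scene; a hit fills the pixel with the object's color, and a
# reflective hit spawns an additional ray recursively.
# Scene format (hypothetical): (center, radius, color, reflectivity).

def intersect_sphere(origin, direction, center, radius):
    """Nearest positive hit distance along a unit-length ray, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * c  # a = 1 since direction is unit length
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0  # only the near root is considered
    return t if t > 1e-6 else None

def trace(origin, direction, spheres, depth=0, max_depth=3):
    """Follow one ray through the scene, recursing at reflective hits."""
    hits = [(t, s) for s in spheres
            if (t := intersect_sphere(origin, direction, s[0], s[1])) is not None]
    if not hits:
        return (0.0, 0.0, 0.0)                    # background color
    t, (center, radius, color, refl) = min(hits, key=lambda h: h[0])
    if refl > 0.0 and depth < max_depth:          # spawn a reflection ray
        point = [o + t * d for o, d in zip(origin, direction)]
        normal = [(p - c) / radius for p, c in zip(point, center)]
        d_n = sum(d * n for d, n in zip(direction, normal))
        refl_dir = [d - 2.0 * d_n * n for d, n in zip(direction, normal)]
        bounce = trace(point, refl_dir, spheres, depth + 1, max_depth)
        return tuple((1.0 - refl) * c + refl * b
                     for c, b in zip(color, bounce))
    return color
```

A full renderer would iterate `trace` over every pixel of the image plane; the per-ray structure above is where the intersection-count and ray-count optimizations discussed next take effect.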
The ray-tracing algorithm is computationally in-
tensive, which has historically prevented it from be-
ing used for real-time applications. Approximately
75% of the time required to render simple scenes is
allocated to computing ray-object intersections, with
this number increasing for scenes with a large num-
ber of objects. A performance speedup can be real-
ized by reducing the number of intersection tests per
ray or the overall number of rays computed. Our sys-
tem is built on a basic ray-tracing framework, and is
designed to reduce the number of rays that need to
be computed by taking advantage of the differences
in visual acuity between the foveal and peripheral vi-
sion. It also includes a number of traditional opti-
mizations that reduce the number of intersections per
ray as well as the time required to compute each in-
tersection.
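One way such an acuity-based reduction in ray count could be organized is to let each pixel's angular distance (eccentricity) from the tracked fixation point determine the fraction of rays traced in its neighbourhood. The sketch below illustrates this general idea only; the falloff function, the foveal radius, and all parameter values are assumptions for illustration, not the system's actual scheme.

```python
import math

# Illustrative sketch: pixels inside an assumed foveal radius always
# receive a ray; peripheral pixels are traced at a density that falls
# off with eccentricity. Skipped pixels would be filled by
# interpolation from traced neighbours. All constants are hypothetical.

def eccentricity_deg(px, py, fx, fy, pixels_per_degree):
    """Angular distance of pixel (px, py) from fixation (fx, fy)."""
    return math.hypot(px - fx, py - fy) / pixels_per_degree

def ray_density(ecc_deg, foveal_radius_deg=5.0, falloff=0.2):
    """Fraction of pixels traced at a given eccentricity (1.0 = all)."""
    if ecc_deg <= foveal_radius_deg:
        return 1.0  # full ray density within the fovea
    # Hyperbolic falloff outside the fovea, clamped to a minimum
    # density so the periphery is never left entirely unsampled.
    return max(0.05, 1.0 / (1.0 + falloff * (ecc_deg - foveal_radius_deg)))
```

With a falloff of this shape, a pixel 10 degrees outside the foveal boundary would receive roughly a third of the foveal ray density, which is the source of the computational savings the paragraph above describes.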
The bounding volume hierarchy (BVH) is one
effective method of organizing scene object data to
reduce ray-object intersection calculations per ray.
Each polygon in a mesh is encapsulated within a
bounding volume; we use a sphere, which has a rel-
atively low intersection cost. This set of volumes is
GRAPP 2014 - International Conference on Computer Graphics Theory and Applications
202