global illumination approach of voxel cone tracing is
well-established (Crassin et al., 2011). The section
starts with a justification for the use of 3D textures in
this context, before detailing the implementation, and
describing some optimisations.
3.1 Applicability of Sparse 3D Textures
A sparse 3D texture was selected for the hierarchical voxelisation of the scene, rather than the previously used octree. Sparse textures reduce the memory footprint of images containing unused regions, and so appear well suited to a hierarchical voxel structure. The contents of a sparse texture are committed in chunks, and only chunks containing non-null data are loaded. When representing a voxelised space, a chunk therefore needs to be uploaded only if it covers a non-uniform region of the scene.
While there is potential for a reduced memory footprint, there is a caveat. Each chunk must be at least 64 KB in size (a constraint that is likely to change as hardware evolves). For an 8-bit RGBA texture, this means a block of at least 26×26×26 voxels is committed each time. Even if a single voxel is set and the rest are null, the entire chunk has to be sent to the GPU. Consequently, the viability of the approach will vary depending on the scene's geometry.
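The worst-case arithmetic above can be reproduced with a quick back-of-envelope calculation (a sketch only; real sparse-page dimensions are reported by the driver, e.g. via ARB_sparse_texture queries, and are typically non-cubic, such as 32×32×16 texels for an RGBA8 3D texture):

```python
import math

PAGE_BYTES = 64 * 1024   # minimum chunk size cited in the text
BYTES_PER_TEXEL = 4      # 8-bit RGBA

# how many voxels one committed chunk must hold, and the side of the
# smallest cube containing at least that many voxels
texels_per_page = PAGE_BYTES // BYTES_PER_TEXEL
side = math.ceil(texels_per_page ** (1.0 / 3.0))

print(texels_per_page, side)  # 16384 26
```

Committing even one non-null voxel therefore uploads the full 64 KB page, which is why the memory saving depends so strongly on how the scene's geometry clusters.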
An additional benefit of using 3D textures is the availability of efficient hardware trilinear interpolation. This technique estimates the value at any point inside a cube from eight control values at its corners: two bilinear interpolations are computed on the top and bottom faces, and a final linear interpolation blends the two results. For a dynamically changing octree laid out linearly in GPU memory this would be hard to achieve, and would require bespoke code rather than hardware support. Furthermore, estimating values between two voxel volumes is achieved via quadrilinear interpolation (a further linear interpolation based on the pair of trilinear values fetched from each volume).
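The interpolation steps just described can be sketched on the CPU as follows (the GPU performs this in fixed-function hardware; function and variable names here are illustrative only):

```python
def lerp(a, b, t):
    return a + (b - a) * t

def trilinear(c, x, y, z):
    """c[i][j][k] is the corner value at (x=i, y=j, z=k), i,j,k in {0,1}."""
    # bilinear interpolation on the bottom (z=0) face...
    bottom = lerp(lerp(c[0][0][0], c[1][0][0], x),
                  lerp(c[0][1][0], c[1][1][0], x), y)
    # ...and on the top (z=1) face
    top = lerp(lerp(c[0][0][1], c[1][0][1], x),
               lerp(c[0][1][1], c[1][1][1], x), y)
    # one final linear interpolation between the two faces
    return lerp(bottom, top, z)

# corners sampled from a linear field f(x, y, z) = x + 2y + 4z
c = [[[i + 2 * j + 4 * k for k in (0, 1)] for j in (0, 1)] for i in (0, 1)]
# trilinear(c, 0.5, 0.5, 0.5) == 3.5 (the field is linear, so exact)
```

The quadrilinear step mentioned above is then simply a fourth lerp between the trilinear results fetched from two adjacent voxel volumes.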
As graphics hardware evolves, support for 3D tex-
tures is likely to increase, providing faster access to
the texture data, more efficient hardware interpola-
tion, scattered texture writing and larger high-speed
memory space. Conversely, the sparse octree ap-
proach is likely to continue to require bespoke code
tailored to the constraints of general GPU computing.
3.2 Voxelisation in Sparse 3D Textures
Scattered texture writing is a relatively recent addition to graphics APIs (e.g. OpenGL 4.3 onwards).
The technique enables a fragment shader to write to
an arbitrary texel in a texture. Ordinarily, the use
of frame-buffers entails binding each fragment to a
target texel, with the correspondence of fragment to
texel pre-determined before the fragment shader is
run. Programmable texel access allows us to write
voxel details dynamically to the 3D texture map.
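As a CPU analogue, a scattered write can be pictured as each fragment computing its own target coordinate in a sparse volume (a sketch assuming a unit-cube scene; on the GPU this role is played by image load/store, e.g. GLSL's imageStore, rather than a framebuffer attachment):

```python
# only non-empty voxels are stored, mimicking the sparse texture
sparse_volume = {}   # (x, y, z) -> RGBA tuple

def scatter_write(frag_world_pos, rgba, grid_size=256):
    """Write a fragment's data to an arbitrary voxel (hypothetical helper)."""
    # quantise the fragment's [0,1) world-space position to a voxel coordinate
    voxel = tuple(min(grid_size - 1, max(0, int(p * grid_size)))
                  for p in frag_world_pos)
    sparse_volume[voxel] = rgba   # any texel, not a pre-bound render target

scatter_write((0.5, 0.25, 0.75), (255, 0, 0, 255))
```

The essential difference from framebuffer output is that the target texel is computed inside the shader rather than fixed by the rasteriser.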
Without scattered texture writes the program
would have to select a single layer of the texture vol-
ume into which a fragment will be rasterized. In our
implementation layers are ordered along the z-axis,
hence if a triangle extends along the z-axis, many du-
plicates may be generated during the geometry shader
stage. A unique layer would then be selected for each
generated triangle. However, there is a limit to how many vertices can be generated from any incoming primitive during the geometry shader stage. In addition, the more information associated with a vertex that has to be carried into the fragment stage, the fewer new vertices can be created.
The approach was worthwhile only for relatively small triangles, which could extend across twenty layers in the worst case (requiring an additional 60 vertices to be generated). As these limits vary between graphics cards, the technique would be heavily hardware dependent. We therefore chose to employ scattered texture writes, which allow writing to an arbitrary number of texels within the texture volume.
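The vertex budget of the rejected geometry-shader approach can be checked with a small calculation (helper name and layer metric are illustrative, not from the paper):

```python
import math

def layer_duplicates(z_min, z_max, layer_depth):
    """Triangles (and vertices) the geometry shader would have to emit
    to cover every z-layer that a triangle spans."""
    first = math.floor(z_min / layer_depth)
    last = math.floor(z_max / layer_depth)
    layers = last - first + 1
    return layers, 3 * layers   # one 3-vertex copy of the triangle per layer

# a triangle spanning 20 layers: the worst case quoted in the text
layers, vertices = layer_duplicates(0.0, 19.5, 1.0)
print(layers, vertices)  # 20 60
```

With typical geometry-shader output caps of a few dozen to a few hundred components, 60 vertices plus per-vertex attributes quickly exhausts the budget, which motivates the scattered-write path instead.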
A triangle should be rasterized to the single plane
where its projection onto that plane results in the
greatest surface area. This requirement is met when
a triangle is projected onto the plane that is most per-
pendicular to the triangle’s normal. In order to avoid
dynamic branching, all three projection matrices were
constructed in advance, and a simple interpolation
technique was used to select the appropriate matrix
during the geometry shader stage.
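The dominant-axis test described above reduces to taking the largest-magnitude component of the face normal; a CPU sketch (names are illustrative; on the GPU the same index would drive the branch-free matrix selection):

```python
def face_normal(a, b, c):
    """Unnormalised face normal: cross product of two triangle edges."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dominant_axis(normal):
    """Axis giving the largest projected area: the component of the
    normal with the greatest magnitude."""
    mags = [abs(c) for c in normal]
    return mags.index(max(mags))

# a triangle lying (almost) in the xy-plane projects best along z
n = face_normal((0, 0, 0), (1, 0, 0.1), (0, 1, 0))
# dominant_axis(n) == 2: choose the z-axis projection matrix
```

Only the ratios of the components matter, so the normal need not be normalised before the comparison.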
It is essential to store fragment data for use in further lighting calculations; however, a 4-channel texture can only store four floating-point numbers per texel. Thus, it was necessary to work with multiple images, which made it important to group information and re-use it where possible. Simply dedicating a new texture to each data type would consume a lot of memory, so, where possible, several numbers were packed into a single float value. Every voxel had to contain a normal, the base material colour (the raw colour of the unlit surface) and
the transparency. During the voxelisation stage, each voxel block had to determine the shading contributed by the light sources in the scene. The block's centre was compared against the stored value in the shadow map to determine whether it was in shadow. This
GRAPP 2020 - 15th International Conference on Computer Graphics Theory and Applications
204
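The channel-packing idea above can be illustrated as follows (a sketch of packing four 8-bit values into the bits of one 32-bit float, in the spirit of GLSL's packUnorm4x8 combined with uintBitsToFloat; the exact channel layout used in the paper is not specified, so this ordering is an assumption):

```python
import struct

def pack_unorm4x8(r, g, b, a):
    """Pack four [0, 1] values into the bits of a single 32-bit float."""
    u = 0
    for i, c in enumerate((r, g, b, a)):
        u |= int(round(max(0.0, min(1.0, c)) * 255.0)) << (8 * i)
    # reinterpret the integer bits as a float; note some bit patterns
    # are NaNs, which is why engines often prefer integer-format textures
    return struct.unpack('<f', struct.pack('<I', u))[0]

def unpack_unorm4x8(f):
    """Recover the four 8-bit channels from the float's bit pattern."""
    u = struct.unpack('<I', struct.pack('<f', f))[0]
    return tuple(((u >> (8 * i)) & 0xFF) / 255.0 for i in range(4))
```

A voxel's normal, base colour and transparency can then share fewer texture channels, at the cost of 8-bit precision per packed component.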