2 BACKGROUND
Ray tracing is well suited to many forms of parallel
computation. Threads can be used, for example, to
trace multiple rays in parallel. In addition, there
are at least two ways in which ray traversal can use
vector instructions. First, multiple rays can be traced
together in parallel, which is called packet tracing.
Second, multiple ray-AABB tests can be calculated in
parallel. To make use of wider vectors, BVHs with a
branching factor greater than two can be used; such a
tree is called a Multi Bounding Volume Hierarchy
(MBVH) (Ernst and Greiner, 2008). MBVHs are usually
labelled by their branching factor: MBVH4 denotes a
tree with a branching factor of 4, MBVH8 a tree with
a branching factor of 8, and so on. The two ways of
using vector instructions can also be combined, in
which case multiple rays are traced in parallel and
each of them is tested against multiple AABBs.
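The second use of vector instructions can be illustrated with a minimal sketch. It assumes a standard slab test, a precomputed reciprocal ray direction with no zero components, and a hypothetical function name `intersect_children`; the loop over the children stands in for what a vector unit would evaluate in one step, one child per lane.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
AABB = Tuple[Vec3, Vec3]  # (lower corner, upper corner)


def intersect_children(origin: Vec3, inv_dir: Vec3, t_max: float,
                       children: Sequence[AABB]) -> List[bool]:
    """Slab test of one ray against every child AABB of one MBVH node.

    Assumption: inv_dir holds 1/direction and every direction
    component is non-zero.
    """
    hits = []
    for lo, hi in children:  # one iteration corresponds to one vector lane
        t_near, t_far = 0.0, t_max
        for axis in range(3):
            t0 = (lo[axis] - origin[axis]) * inv_dir[axis]
            t1 = (hi[axis] - origin[axis]) * inv_dir[axis]
            if t0 > t1:
                t0, t1 = t1, t0
            t_near = max(t_near, t0)
            t_far = min(t_far, t1)
        # The ray hits the box if the parametric intervals overlap.
        hits.append(t_near <= t_far)
    return hits


# Four children, as in an MBVH4 node.
children = [((1, 1, 1), (2, 2, 2)), ((1, -2, 1), (2, -1, 2)),
            ((3, 3, 3), (4, 4, 4)), ((-2, -2, -2), (-1, -1, -1))]
print(intersect_children((0, 0, 0), (1, 1, 1), 10.0, children))
```

With packet tracing, the same test would additionally be repeated for every ray of the packet.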
Many algorithms and data structures have been
proposed to reduce the amount of data needed for the
ray tracing traversal step. These algorithms can be
divided into two categories.
First, some of the child AABB bounds can be inferred
from the parent AABB. Because the AABBs in a BVH
tightly bound their child geometry, every bound of
the parent is shared by at least one of the children.
Examples of this approach are the Bounding
Interval Hierarchy (BIH), which stores only two
coordinate values and one axis in each node (Wächter
and Keller, 2006), and the SPBVH (Ernst and Woop,
2011). However, these ideas are less effective with
MBVHs: with more children per node, fewer values can
be inferred for each child.
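As a rough illustration of the inference idea, the sketch below reconstructs two child boxes from a parent box, one axis and two stored coordinate values, in the style of a BIH node. The function name `bih_child_boxes` and the argument layout are assumptions made for this example and are not taken from the cited papers.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
AABB = Tuple[Vec3, Vec3]  # (lower corner, upper corner)


def bih_child_boxes(parent: AABB, axis: int,
                    left_max: float, right_min: float) -> Tuple[AABB, AABB]:
    """Infer both child boxes from the parent box and two stored values.

    Only `axis`, `left_max` and `right_min` are stored in the node;
    every other bound is inherited from the parent.
    """
    lo, hi = parent
    left_hi = list(hi)
    left_hi[axis] = left_max      # left child is clipped from above
    right_lo = list(lo)
    right_lo[axis] = right_min    # right child is clipped from below
    return (lo, tuple(left_hi)), (tuple(right_lo), hi)


parent = ((0.0, 0.0, 0.0), (4.0, 2.0, 2.0))
left, right = bih_child_boxes(parent, 0, 2.5, 1.5)
print(left)
print(right)
```

In a binary tree, two stored values replace twelve. With a branching factor of four or eight the parent still has only six bounds to share, so a smaller fraction of each child's bounds can be inherited.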
3 RELATED WORK
Another approach is to compress the data into
reduced-precision data types, at the cost of
decompression during traversal and reduced accuracy.
The reduced accuracy causes extra work during
traversal, because the ray-AABB intersection tests
may report false positive hits. Moreover, the
algorithms must be carefully designed to avoid false
negatives in the intersection tests, which would
cause visible gaps in the scene geometry.
One way to compress the data is to use
reduced-precision integer formats (Mahovsky and
Wyvill, 2006). Most commonly used hardware today can
perform calculations at several integer precisions.
With integers, the scene is divided into equally
sized quantization steps. This is beneficial if the
geometric detail is evenly distributed throughout
the scene.
To avoid overly large quantization steps, the data type
can be used hierarchically. In this representation the
value range of the data type is scaled to the parent
node’s AABB bounds, so that the smallest possible
value of the data type corresponds to the parent’s
lower bound and the greatest possible value to the
upper bound. In a deep tree, even with two integer
bits, this kind of hierarchical encoding can achieve
more accuracy per coordinate than the single-precision
floating-point format. This is possible because the
quantization steps get smaller in world coordinates at
every lower level of the tree. However, this extra
precision is not useful since the triangles are usually
stored in single-precision format.
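The hierarchical encoding can be sketched along a single axis as follows. This is a minimal illustration and not the encoding of the cited work: the function names, the conservative rounding (lower bounds rounded down, upper bounds rounded up, so that the decoded interval never shrinks) and the handling of a zero-width parent are assumptions of this example.

```python
import math
from typing import Tuple


def quantize_child(parent_lo: float, parent_hi: float,
                   child_lo: float, child_hi: float,
                   bits: int) -> Tuple[int, int]:
    """Encode a child interval relative to its parent interval.

    Integer 0 maps to the parent's lower bound and 2**bits - 1 to its
    upper bound.
    """
    levels = (1 << bits) - 1
    extent = parent_hi - parent_lo
    if extent == 0.0:
        return 0, levels  # degenerate parent: keep the whole range
    q_lo = math.floor((child_lo - parent_lo) / extent * levels)
    q_hi = math.ceil((child_hi - parent_lo) / extent * levels)
    return max(q_lo, 0), min(q_hi, levels)


def dequantize(parent_lo: float, parent_hi: float, q: int, bits: int) -> float:
    """Decode one quantized bound back to world coordinates."""
    levels = (1 << bits) - 1
    # A production decoder would also have to bound the rounding error
    # of this floating-point expression; that is ignored here.
    return parent_lo + (parent_hi - parent_lo) * q / levels


q_lo, q_hi = quantize_child(0.0, 10.0, 2.3, 7.1, 8)
print(q_lo, q_hi)
print(dequantize(0.0, 10.0, q_lo, 8), dequantize(0.0, 10.0, q_hi, 8))
```

The quantization step in world coordinates is the parent's extent divided by `2**bits - 1`, so it shrinks whenever a child is smaller than its parent. The decoded interval is slightly larger than the original one, which produces false positives but no false negatives.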
Previous work on half-precision floating-point BVHs
(Koskela et al., 2015) considers half-precision
floating-point numbers only as a storage format. This
halves the memory bandwidth and space used by BVH
inner nodes, which corresponds to an average
reduction of 16% in cache misses and 7% in memory
space usage with MBVH4. That work evaluates both the
BVH with
world coordinate half-precision floating-point num-
bers and the hierarchical representation similar to the
work on integers (Mahovsky and Wyvill, 2006). The
world coordinate half-precision traversal performance
is reduced by the extra traversal caused by the false
positives and the overhead of format conversion when
AABBs are loaded from memory. The hierarchical
half-precision BVH avoids most of the false positives,
at the cost of extra computational overhead.
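Using half precision as a storage format can be sketched with Python's `struct` module, whose `'e'` format code packs an IEEE 754 binary16 value. The helper names and the conservative rounding direction (lower bounds down, upper bounds up) are assumptions of this sketch and are not taken from the cited work. Inputs are assumed to be finite and within the half-precision range, since `struct.pack` raises `OverflowError` otherwise.

```python
import struct


def half_bits(x: float) -> int:
    """Bit pattern of x rounded to the nearest half-precision value."""
    return struct.unpack('<H', struct.pack('<e', x))[0]


def from_half_bits(bits: int) -> float:
    """Decode a 16-bit pattern as a half-precision value."""
    return struct.unpack('<e', struct.pack('<H', bits))[0]


def _step(bits: int, toward_positive: bool) -> int:
    """Bit pattern of the neighbouring half value in the given direction."""
    negative = bool(bits & 0x8000)
    if bits & 0x7FFF == 0:
        # Stepping off zero: smallest subnormal with the requested sign.
        return 0x0001 if toward_positive else 0x8001
    # Moving toward zero shrinks the magnitude, moving away grows it.
    return bits - 1 if negative == toward_positive else bits + 1


def half_round_down(x: float) -> float:
    """Largest half value <= x, for storing a lower AABB bound."""
    bits = half_bits(x)
    if from_half_bits(bits) > x:
        bits = _step(bits, toward_positive=False)
    return from_half_bits(bits)


def half_round_up(x: float) -> float:
    """Smallest half value >= x, for storing an upper AABB bound."""
    bits = half_bits(x)
    if from_half_bits(bits) < x:
        bits = _step(bits, toward_positive=True)
    return from_half_bits(bits)


print(half_round_down(0.1), half_round_up(0.1))
print(half_round_down(2049.0), half_round_up(2049.0))
```

The second line of output shows why world coordinate half precision produces false positives: near 2048 the representable values are two units apart, so a stored box can be noticeably larger than the original.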
Instead of converting the hierarchical AABB
bound coordinates into world coordinates, the ray
origin can be converted into hierarchical coordinates
(Keely, 2014). This allows intersection tests to be
computed in a low-precision format, which reduces
the required bit counts significantly. Moreover, this
is beneficial since fewer coordinates need to be
converted to a different coordinate base (there are
six values in the AABB bounds and only three values
in the ray origin). To avoid overflows in the
hierarchical coordinates, the ray origin has to be
moved to the edge of the coordinate range on every
traversal step.
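The change of coordinate base can be sketched as follows. This is only an illustration of transforming the ray instead of the boxes, not the algorithm of Keely (2014): it uses ordinary floating-point arithmetic, omits the step that moves the ray origin to the edge of the coordinate range, and the names `to_local` and `slab_interval` are hypothetical. It assumes a non-degenerate parent box and a ray direction with no zero components.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_local(origin: Vec3, direction: Vec3, parent_lo: Vec3,
             parent_hi: Vec3, bits: int) -> Tuple[Vec3, Vec3]:
    """Express a ray in the quantized grid spanned by the parent AABB."""
    levels = (1 << bits) - 1
    scale = [levels / (hi - lo) for lo, hi in zip(parent_lo, parent_hi)]
    local_origin = tuple((o - lo) * s
                         for o, lo, s in zip(origin, parent_lo, scale))
    local_dir = tuple(d * s for d, s in zip(direction, scale))
    return local_origin, local_dir


def slab_interval(origin: Sequence[float], direction: Sequence[float],
                  lo: Sequence[float], hi: Sequence[float]) -> Tuple[float, float]:
    """Parametric (t_near, t_far) interval of a ray against one box."""
    t_near, t_far = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
    return t_near, t_far


# The ray is converted once (three origin values per node) ...
o, d = to_local((-1.0, -1.0, -1.0), (1.0, 2.0, 4.0),
                (0.0, 0.0, 0.0), (10.0, 20.0, 40.0), 8)
# ... and then tested directly against the stored integer bounds.
print(slab_interval(o, d, (51, 51, 51), (102, 102, 102)))
```

Because origin and direction undergo the same affine map, the ray parameter is unchanged, so the resulting interval equals the one obtained by first decoding the six child bounds to world coordinates.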
4 PROPOSED ALGORITHM
The approach in this paper belongs to the compression
category and uses half-precision floating-point
numbers. The traversal algorithm is based on an idea
similar to that of the algorithm by Keely (2014). The
difference is that instead of adding custom integer
hardware to the GPU, the proposed algorithm uses ex-
GRAPP 2016 - International Conference on Computer Graphics Theory and Applications