3.3 Framebuffer
As explained before, the number of renders of $M_{HR}$
that must be performed is equal to the triangle count
of $M_{LR}$. However, we are only interested in the
pixels that lie inside the area formed by the
triangle T projected with the matrix MVP. Setting up
the stencil buffer to discard all pixels outside the
projection of T is a very simple way to reject
unwanted pixels and to protect those parts which
have already been rendered.
Finally, the pixel shader re-scales every unmasked
normal to the range [0,1], so that the normal
vector can be encoded as a colour in the RGB space.
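The masking and encoding steps above can be illustrated on the CPU with a minimal sketch. It is not the GPU stencil setup itself: the function names are hypothetical, the triangle is assumed to be already projected to 2D pixel coordinates, and the re-scaling is assumed to be the usual affine mapping n * 0.5 + 0.5, which the text does not spell out.

```python
def edge(a, b, p):
    """Signed area term: its sign tells on which side of edge a->b the point p lies."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def inside_triangle(tri, p):
    """True if the 2D point p lies inside (or on the border of) the projected triangle."""
    a, b, c = tri
    w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
    # Accept either winding order: all three terms must share the same sign.
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)


def encode_normal(n):
    """Map a unit normal with components in [-1, 1] to RGB values in [0, 1].
    Assumption: the standard n * 0.5 + 0.5 mapping."""
    return tuple(0.5 * c + 0.5 for c in n)


def shade(tri, p, n):
    """Return the RGB colour for pixel centre p, or None if the pixel is masked out,
    which plays the role of a failed stencil test."""
    if not inside_triangle(tri, p):
        return None
    return encode_normal(n)
```

For example, with the projected triangle ((0, 0), (4, 0), (0, 4)), a pixel centred at (0.5, 0.5) is kept and a pixel centred at (3.5, 3.5) is discarded.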
3.4 The Auto-occlusion Problem
Some parts of a model can cause this method to
fail. This happens when there is another part of the
model between the camera and the real target
surface. This problem is clearly shown in Figure 4,
where a part of the model incorrectly occludes the
desired surface (coloured in red), causing the
normal map to be completely invalid.
Figure 4: Auto-occlusion problem: the ear next to the
camera is occluding the real desired geometry.
To solve this problem we have developed a
technique called vertex mirroring. Basically, we
consider that if a pixel is going to be drawn more
than once (some parts of the model overlap in screen
space), then the valid pixel is the one closest
to T. This is similar to what raytracing-based
normal map generation algorithms do: if several
polygons intersect the ray, take the one closest
to T.
Let Π be the plane containing T. Let N be the
normal of Π, $v_i$ be each one of the vertices of
$M_{HR}$ and $k_i$ be the signed distance between Π and
$v_i$. Then the final position of $v_i$ is recalculated
as follows:

$$v_i = v_i - 2 \cdot \mathrm{clamp}(k_i, k_i, 0) \cdot N \qquad (6)$$
The function clamp(a,b,c) truncates the value a
to the range [b,c]. This ensures that all vertices
of $M_{HR}$ end up in front of the plane Π, because
those vertices that are behind the plane are mirrored
through it. After performing this step, we can use the
standard depth test to ensure that each rendered pixel
is the nearest possible to T.
This technique can be implemented in a vertex
shader for optimal performance, in a clear, elegant
and efficient way.
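As an illustration of the mirroring step in Equation (6), the following CPU-side sketch reflects every vertex lying behind the plane. It makes several assumptions: N is unit length, the distance k is signed and positive on the side N points to, the clamp term is written as min(k, 0), and the function and parameter names are hypothetical. A real implementation would perform the same arithmetic per vertex in the vertex shader.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def mirror_in_front(vertices, plane_point, normal):
    """Mirror through the plane every vertex that lies behind it.

    vertices    -- list of (x, y, z) tuples
    plane_point -- any point on the plane (e.g. a vertex of T)
    normal      -- unit normal N of the plane (assumed normalised)
    """
    mirrored = []
    for v in vertices:
        offset = tuple(vc - pc for vc, pc in zip(v, plane_point))
        k = dot(offset, normal)   # signed distance from the plane to v
        c = min(k, 0.0)           # non-zero only for vertices behind the plane
        mirrored.append(tuple(vc - 2.0 * c * nc for vc, nc in zip(v, normal)))
    return mirrored
```

With the plane z = 0 and N = (0, 0, 1), the vertex (1, 2, -3) is moved to (1, 2, 3), while a vertex already in front of the plane, such as (1, 2, 3), is left unchanged.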
3.5 Normal Map Border Expansion
Once the previous process is over, the normal map is
correctly calculated. However, due to the texture
filtering methods used in real-time hardware,
normals could incorrectly interpolate with their
neighbouring “empty” texels. To solve this problem,
we need to create an artificial region surrounding
every part of the normal map.
To detect those texels that do not contain a valid
normal, an extra pass is performed that renders a
full-screen quad textured with the previously
generated normal map. For each pixel, the pixel shader
of the normal map generator checks whether the texel
belonging to the pixel being processed has a modulus
(length) less than 1, which means that it does not
contain a valid normal (because all normals must be
unit vectors). If that happens, the pixel is filled
with the normalized average of those neighbouring
texels which contain a valid normal.
At the end of the process, a 1-pixel-wide border
has been created around every valid region of the
normal map, filling texels that previously did not
contain a valid normal. This process can be
repeated on the resulting texture to expand the
border to a user-defined size.
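One expansion pass can be sketched on the CPU as follows. This is a minimal illustration, not the shader itself. It assumes the map is stored as rows of decoded (x, y, z) tuples, that empty texels hold the zero vector, that the 8 surrounding texels count as neighbours, and that a small tolerance eps absorbs rounding error in the length test. All names are hypothetical.

```python
import math


def length(n):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(c * c for c in n))


def expand_border(normal_map, eps=1e-3):
    """Return a copy of normal_map in which every invalid texel that touches
    a valid one is filled with the normalised average of its valid neighbours."""
    h, w = len(normal_map), len(normal_map[0])

    def valid(n):
        return length(n) >= 1.0 - eps

    out = [list(row) for row in normal_map]   # read from the input, write to a copy
    for y in range(h):
        for x in range(w):
            if valid(normal_map[y][x]):
                continue
            acc = [0.0, 0.0, 0.0]
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (dy or dx) and 0 <= ny < h and 0 <= nx < w:
                        n = normal_map[ny][nx]
                        if valid(n):
                            acc = [a + c for a, c in zip(acc, n)]
            l = length(acc)
            if l > 0.0:                        # at least one usable neighbour
                out[y][x] = tuple(a / l for a in acc)
    return out
```

Calling expand_border repeatedly on its own output grows the border by one texel per call, which mirrors the repeated passes described above.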
4 RESULTS
All tests were performed on an Athlon64 3500+ /
1GB RAM / GeForce 6800 Ultra and can be divided
into two categories: performance and quality tests.
Table 1 shows a study of total times required to
generate the normal maps for two models with a
different polygonal complexity. For each model
($M_{HR}$, first column) different coarse approximations
are used ($M_{LR}$, second column) to generate MN. The
column on the right shows the time in milliseconds
needed to calculate the normal map for a certain
combination of meshes.
An octree-based acceleration structure is used to
discard as many triangles as possible efficiently,
improving rendering times by up to a factor of 10.