ANIMATING AND RENDERING VIRTUAL HUMANS
Extending X3D for Real Time Rendering and Animation of Virtual Characters
Yvonne Jung
Fraunhofer IGD, Darmstadt, Germany
Keywords:
Rendering, Virtual Characters, Animation, Simulation, Virtual Reality, X3D.
Abstract:
In this paper, we focus on the different aspects of real-time visualization and animation of realistic virtual
characters. The common goal was to come up with solutions based on the concepts of the open ISO standard
X3D and, where necessary, to propose generalized extensions to the standard. First we describe the high-level
control language PML and its implementation, which is also suitable for non-graphics experts. Then we focus
on realistic rendering and on how X3D must be extended to allow special effects and the realization of novel
rendering algorithms, including skin and emotion rendering. Next we explain the challenges of dynamics related
to virtual characters, covering playback and blending of predefined animations, online simulation of locomotion
and, last but not least, hair simulation.
1 INTRODUCTION
Virtual characters are well known from movies, ads
and computer games. While most of these movies
feature non-realistic characters, e.g. Toy Story, there
are already some showing virtual humans with very
high realism, e.g. Final Fantasy. In first person shoot-
ers like Half-Life 2 we often see realistic virtual char-
acters. But current solutions have major drawbacks
and research is still needed. Today, extreme real-
ism can only be achieved in movies, where real-time
is not an issue at all. Virtual characters in movies
and games are the result of many months of work by
artists. In computer games, facial expressions, gestures,
and locomotion are limited and do not allow
expressive, non-repetitive behavior. Controlling virtual
characters also requires considerable computer graphics
knowledge. Our vision is a widespread deployment of
interactive and reactive virtual characters, but today
integrating them into an application takes expertise,
time and effort. As long as this does not change, we
will see such characters only in movies and games.
We are therefore working on solutions to overcome the
major barriers concerning the rendering side. All the work described in this
paper was carried out in the project Virtual Human,
where we were responsible for the graphics output,
meaning the rendering of scenes and realistic virtual
characters as well as playing back animations.
An important topic was to provide an abstract layer
to the graphics environment, because the control of
character behavior and emotional states was done by
researchers in the areas of Artificial Intelligence and
Digital Storytelling. Understandably, they did not
want to bother with low-level graphics and demanded
a high-level control language in which they could issue
commands like "character one waves his hand and
simultaneously blinks his eyes". Therefore we developed
the PML language jointly, and in chapter 2
we will focus on its design from a graphics perspec-
tive. Methods for realistic rendering will be described
in chapter 3. Since realistic characters move around,
gesture and show emotions, dynamics are also essen-
tial, see chapter 4 for further details. Lessons learned,
a summary and outlook for future work conclude this
paper. A common goal was not to develop methods
that are usable only within our own specialized environment.
We also considered how virtual characters
can be realized effectively with a common standard
for interactive 3D such as the open ISO standard
X3D. Where it was necessary to extend the standard,
we also provide appropriate X3D binding specifications.
2 CONTROLLING GRAPHICS
When visualizing virtual characters one also has to
think about interoperability aspects. PML (Player
Markup Language) is an XML-based Markup Lan-
guage which takes up concepts from RRL (Rich Rep-
resentation Language) for the description of agent be-
havior in net environments (Piwek et al., 2002). It
is an abstract specification language which is inde-
pendent of the implementation of the player and the
virtual environment and is used as a descriptive inter-
face markup language between a dialog engine and a
graphics engine. PML is a language for controlling
virtual environments with special regards to character
animation and user interaction, and therefore defines
a format for sending commands to a 3D virtual real-
ity system. Additionally it defines a message format
which can be sent to a player or received from it. At
the beginning of a new scene, all objects and characters
are defined by a "definitions" script. In the course
of the story, all runtime-dependent actions like character
animations are described by so-called "actions"
scripts, whose temporal order is given by a special
schedule block. A basic principle of this architecture
is that the player is assumed to be "ignorant", without
any autonomous behavior. Keeping in mind that all
rendered geometry has to be created in a pre-process,
it is only logical that all objects, including the way
they act, have to be registered before any action can
be triggered. Therefore a PML
script can have the following distinct root elements:
definitions: define all possible/available animations of a virtual character or an object;
actions: contain the schedule of animations, ranging from key-frame to IK, and similar actions like setting visibility or starting the TTS system;
message: is used for bidirectional communication;
query: can be used for retrieving scene information.
The animation tags of a PML actions script refer
to preloaded animations, which are referenced by
their name. Different kinds of animations are distinguished,
like morph targets for facial animation
("singlePose") or key-frame animation for gestures
("multiPoses"), because every animation type
must be handled differently and has a varying set of
attributes. An example of a rather unusual animation
which can be handled quite easily this way is
the change of the face complexion. Usually only the
changes in geometry by means of displacers or morph
targets are addressed in computer graphics. This is a
well-known problem, and the classification is usually
based on Ekman's Facial Action Coding System (Ekman,
1982), which identifies certain Action Units for
morphing. But with the help of modern graphics hardware
the more subtle changes concerning face coloring
can also be covered. For some emotions, a change
of skin color is essential for correct perception (e.g.
embarrassment and shock). It is caused by varying
blood circulation and results in pallor or blushing. As
the color variation can be limited to several face areas,
a 3D texture map that denotes these areas can be
used. Just like a stack of plates, the 2D images, one
for each complexion, are layered, starting with the
palest face image and ending with the reddest one.
This method not only prevents setting invalid emotional
states, but also leads to convincing results by
employing the texture fetching hardware for color interpolation:
each emotional image is assigned a number
between zero and one (the intensity), which is then
used for indexing into the 3D texture image stack
(Jung and Knöpfle, 2006). The following actions
script (which combines typical facial animations like
blinking with blushing and crying) shows what such
an animation can look like.
<actions id="cde002" start="true">
<character refId="Lebacher">
<speak id="tx">
<text>I am so sorry about this.</text>
</speak>
<startIdleList id="idle1" refId="Blink" />
<complexion id="L1" refId="redHead" />
<complexion id="L2" refId="tears" />
</character>
<schedule>
<seq>
<action refId="idle1" begin="0" dur="0"/>
<par>
<action refId="L1" begin="0" dur="4000"/>
<action refId="L2" begin="0" dur="4000"/>
<action refId="tx" begin="1000" dur="0"/>
</par>
</seq>
</schedule>
</actions>
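The complexion animation above ultimately amounts to indexing the layered complexion textures by an intensity value between zero and one. On the GPU this is a single 3D texture fetch; the following minimal C++ sketch only illustrates the interpolation the texture hardware performs (a slice is reduced to a single float value here, and all names are our own illustration):

#include <vector>

// CPU-side sketch of the hardware 3D texture lookup used for the complexion
// animation: the intensity in [0,1] indexes into the stack of complexion
// images, and the two neighboring slices are linearly interpolated.
float complexionSample(const std::vector<float>& stack, float intensity) {
    float pos = intensity * (float)(stack.size() - 1);  // position in the stack
    size_t lo = (size_t)pos;
    size_t hi = (lo + 1 < stack.size()) ? lo + 1 : lo;
    float t = pos - (float)lo;                          // blend factor
    return (1.0f - t) * stack[lo] + t * stack[hi];      // hardware-style lerp
}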
We used the InstantReality framework (Avalon, 2007)
as the basis for our developments. Here the graph-
ical properties and the behavior of the virtual world
are defined by a scene graph following and extend-
ing the concepts of X3D (Web3DConsortium, 2007).
For handling animations we added some additional
nodes; the X3D interfaces are shown below. The
"TimelineComposer" is the PML interface and can
be seen as a bindable node. It handles all commu-
nication with the system and forwards PML com-
mands to the parser. During parsing, the schedul-
ing block is sequenced and single action and defini-
tion chunks are created and routed to the appropriate
disposer components. When receiving a start mes-
sage, the scheduler dispatches the action chunks to
the ”AnimationController” node of the correspond-
ing character or object. The ”TimelineComposer”
triggers and holds references to all ”AnimationCon-
troller” nodes, which in turn point to all ”Animation-
Container” node types; i.e. the ”InstantAnimation-
Container” for referring to transitions which are state
changes like toggling visibility or object reposition-
ing, and the ”TimedAnimationContainer” (including
two specialized subclasses for locomotion generation)
for storing all time based animations like key-frame
animations and inverse kinematics. Because a com-
plex story can lead to an arbitrary number of gestures
and animations, the main job of the "AnimationController"
is to blend and cross-fade all kinds
of animations. This stems from the requirement that,
for correctly blending, cross-fading and generally updating
all animations of an object at a single time
step, the controlling unit needs knowledge of all animations,
a task which cannot be handled with the
simple routing mechanisms of VRML/X3D.
TimedAnimationContainer : AnimationContainer {
SFString [] name ""
MFNode [] targets []
MFString [] fieldnames []
MFNode [] interpolators []
SFFloat [] duration 0
}
AnimationController : AnimationBase {
SFString [] name ""
MFNode [in, out] animationContainer []
}
TimelineComposer : X3DNode {
SFString [in,out] command ""
SFString [out] message
MFNode [in,out] animationController []
}
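To illustrate the role of the "AnimationController", the following minimal C++ sketch shows how the contributions of all currently active animations could be combined for one animated value at a single time step (a strong simplification of the actual Avalon implementation; all types and names are our own):

#include <vector>

// Simplified per-frame blending: each active animation container contributes
// a cross-fade weight and a sampled value; the controller normalizes the
// weighted sum. Rotational values are handled with the hemisphere-corrected
// quaternion blending described in chapter 4.1.
struct ActiveAnimation {
    float weight;   // current blend/cross-fade weight
    float value;    // value sampled from the animation at this time step
};

float blendAnimations(const std::vector<ActiveAnimation>& active, float restValue) {
    float weightSum = 0.0f, result = 0.0f;
    for (const ActiveAnimation& a : active) {
        weightSum += a.weight;
        result += a.weight * a.value;
    }
    if (weightSum <= 0.0f)
        return restValue;          // nothing active: keep the rest pose
    return result / weightSum;     // normalized weighted blend
}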
3 REALISTIC RENDERING
When looking at games or technical demonstrations,
one might think that most problems are solved and
generalized solutions are standardized and readily
available. But currently, for instance, the exchange
format Collada and even X3D do not support advanced
interaction and rendering methods. To over-
come these limitations we present various enhance-
ments to the present X3D standard, comprising nodes
for advanced rendering techniques as well as exten-
sions for animation of characters and implementa-
tion of storylines, allowing application developers to
create and author realistic VR environments easily.
However, rendering virtual characters poses many challenges.
First, the algorithms should be easy to use and
to integrate into different applications. Second, flexible
control of the character requires a flexible animation
system, including body movements (gestures,
walking) and speech (TTS, mimics). Then, visual realism
requires realistic models, natural gestures and a
realistic simulation of materials. Finally, everything
has to be done in real-time, because the interface
must react immediately to user input.
Human skin can be classified according to spatial
scale. The micro scale is defined by cellular elements
whose dominant effects are scattering and absorption.
For approximating subsurface scattering effects, the
distance light travels through skin is measured with
the same technique as is used for shadow mapping
(Green, 2004). Hence we need to introduce multi pass
rendering in the context of X3D, which can be under-
stood in two ways. First, multi pass means the ability
to dynamically render to an off-screen buffer. In the
Xj3D extension (Xj3D, 2004) a simplified possibility
for creating off-screen images was proposed with the
”RenderedTexture” node. Second, multi pass denotes
the ability to render geometry in an ordered sequence,
usually with different drawing operations like blend-
ing enabled, which is currently not possible in X3D.
We are using an extended ”RenderedTexture” to pro-
vide the ability for off-screen rendering including as-
sociated buffers like the depth buffer. It has an addi-
tional field called ”depthMap”, which allows the au-
tomatic generation of depth maps for e.g. additional
user-created shadows, as needed for the light pass of
the skin shader (Fig. 1). Because this is only useful
in combination with appropriate transformation matrices,
the "projection" (model view projection matrix
of camera space) and "viewing" (model matrix of the
parent node) fields are added.
Figure 1: Light pass - Blur pass - Final render pass.
A person usually has more than 100,000 hairs, which
cannot be simulated in real-time. In order to re-
duce geometric complexity and to avoid rendering
problems we model hair wisps as small quad strips.
For creating an impression of thin, semi-transparent
hair, textures with hair like transparent patterns are
mapped onto the hair patches. To provide an impres-
sion of hair volume, alpha blending is used, which re-
quires correct back-to-front sorting of the hair wisps.
Therefore we propose the ”SortedPrimitiveSet” node
(chapter 4.2). However, for simulated quad strips no
unique sorting order can be determined and therefore
severe sorting artifacts may result. As proposed in
(Scheuermann, 2004), most artifacts can be alleviated
by a multi-pass approach. Although this is only suit-
able for presorted hair without animation, in combina-
tion with our method for rendering human hair (Jung
et al., 2005) it leads to good results. After rendering
all back-facing polygons with depth writes disabled
and depth test set to ”less”, the front-facing polygons
are rendered. This is accomplished by means of our
”MultiPassAppearance” node and the additional use
of special render mode nodes like the ”FaceMode”
and the "DepthMode" for fine-grained render state
control. Multi-pass rendering in X3D thus becomes
straightforward with the proposed node extensions.
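Expressed at the OpenGL level, the two passes encapsulated by these nodes roughly correspond to the following sketch (drawHairStrips() stands for submitting the sorted quad strips; the exact depth settings of the second pass are our assumption and not prescribed by the nodes):

#include <GL/gl.h>

// OpenGL-level sketch of the two render passes that the proposed
// "MultiPassAppearance" with "FaceMode" and "DepthMode" nodes express
// declaratively for alpha-blended hair.
void renderHairTwoPass(void (*drawHairStrips)()) {
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    glEnable(GL_CULL_FACE);

    // Pass 1: back-facing polygons only, depth writes disabled, depth test "less".
    glCullFace(GL_FRONT);
    glDepthMask(GL_FALSE);
    glDepthFunc(GL_LESS);
    drawHairStrips();

    // Pass 2: front-facing polygons rendered on top (assumed settings).
    glCullFace(GL_BACK);
    glDepthMask(GL_TRUE);
    glDepthFunc(GL_LEQUAL);
    drawHairStrips();

    glDisable(GL_BLEND);
}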
This shows that for complex rendering tasks con-
trol over the rendering order as well as over low level
rendering modes is needed. Therefore we extended
the X3D Shape component with nodes for setting dif-
ferent render states and therewith the ”Appearance”
node with the appropriate fields. First we introduce
the SFInt32 "sortKey" field for defining the rendering
order, which is essential in combination with e.g.
alpha blending or depth writing. For rendering operations
that belong closely together, as is the case for
the two-pass hair shader, we also introduce the "MultiPassAppearance".
Its "appearance" field simply
contains an ordered sequence of "Appearance" nodes.
Additionally we propose an ”AppearanceGroup” that
extends the ”Group” node with an ”appearance” field.
This is useful if a whole group of nodes, like head
and hair, share the same material properties as is the
case for the light pass, where the fragment’s distance
to the light source is written and fragments with alpha
values smaller than a certain threshold are discarded.
Besides "FaceMode" and "DepthMode", some more
nodes allowing finer control over rendering states
are also introduced: the "BlendMode" allows access
to blending and the alpha test; "StencilMode" and
"ColorMaskMode" should likewise be self-explanatory. If
the corresponding fields in the ”Appearance” node are
not set, standard settings are used.
Figure 2: Left: Weeping woman (see ch. 2). Right: Sweat.
The human face communicates with various types
of signals like muscular activity, but it also exhibits
other signals, which are often caused by strong emo-
tions (Kalra and Magnenat-Thalmann, 1994). Usually
only the changes in geometry are addressed, but realistic
skin rendering also requires displaying changes
of skin color depending on physical condition and
emotional state - people always communicate. Some
emotions are also accompanied by other phenomena:
sadness often comes along with tears, and sweat can
be a sign of fear. Sweat can mostly be regarded as
static, but tears have to run down the face. Here
we use the same animation technique as before, but
now with a stack of normal and gloss maps.
4 DYNAMICS
4.1 Playback of Predefined Animations
Dynamics occur in different ways. The most obvious
are the movements of the character, including loco-
motion and gestures. But human hair, too, is not static
and must be simulated to achieve high visual realism.
Generally speaking, two types of approaches can be
distinguished: the playback of predefined animation data
on the one hand, and the online computation of animation
data on the other hand. The X3D H-ANIM
component provides support for character animation
based on predefined animations (Web3DConsortium,
2006). It is based on a Skins and Bones model and
defines a common skeletal model including locations
and names of specific bones. Different levels of qual-
ity are defined. The animation data itself is stored in
X3D interpolators; one interpolator per joint, and the
data flow is defined via X3D routes. For simple sce-
narios, like a single animation to be played, H-ANIM
works well, but it is hard to use in cases where multi-
ple animation sets are available, which are combined
and concatenated dynamically during run-time. The
overall structure of such an application gets unmanageable
and confusing because of the vast number of
nodes and routes and the missing information about
which nodes belong to which animation. Tracing and debugging is
almost impossible, especially when routes are created
and deleted during run-time to blend animations to-
gether in scripts. As already explained, an X3D com-
patible binding is proposed, which introduces several
nodes that primarily act as data containers.
For efficient combination and concatenation of an-
imations we also need additional information about
the animations, e.g. data look ahead and a list of ac-
tive animations and animations that will be activated
within the next time-frame, which the humanoid an-
imation component does not provide. To solve these
problems we have designed animation storage nodes,
which provide a consistent view on an animation set,
including membership information of nodes to a spe-
cific animation. Furthermore we developed a cen-
tralized control engine for animations, overcoming
the problems mentioned above, which is explained in
chapter 2. To do convincing character animation
in complex and responsive environments, H-ANIM
needs to be extended to incorporate blending of
different animations, like waving and turning around,
at one single time step, which cannot be accomplished
with current X3D concepts. The same goes for cross-
fading different succeeding animations in order to al-
leviate jerky leaps between e.g. an idle motion and
a subsequent gesture. By using our previously ex-
plained animation controlling extension, mixing of
animations can be easily done in X3D. This is concep-
tually the same for facial animation based on morph
targets (Alexa et al., 2000) or the H-ANIM displacer
nodes. Both above-mentioned types of mixing animations
are realized with geometric algebra (Hildenbrand,
2005), which, especially for rotations, is superior
to simple linear interpolation schemes. But there are
still some issues, mainly due to unsuitable animation
data. If the spatial distances between the first and the
last animation frame are too big, this either leads to
jerks or to sliding effects, depending on the blending
parameters, which in the latter case introduce damp-
ing effects if too many time steps are averaged.
Figure 3: Unit hemisphere with reference quaternion q_ref.
Rotations are internally represented as unit quaternions.
Because the same rotation can be described by
the quaternions q and -q, care must be taken when
blending quaternions. In order to get a unique description
of a rotation, we first define a 4D unit hemisphere,
on whose surface S the unit quaternions are located.
Following (Park et al., 2002), the initial choice
of the hemisphere is based on an arbitrarily chosen
reference quaternion (e.g. the first one, depicted as
q_ref in Fig. 3). Our simplified method is based on
the observation that the reference quaternion q_ref, any
other quaternion q and the origin of the 4D unit sphere
are always coplanar. With the additional constraint
that all unit quaternions are located on the same hemisphere
S, the maximum angular separation between q
and q_ref is 90°. The maximum distance directly results
from Pythagoras: √(1² + 1²) = √2. By calculating the
Euclidean distance d between q_ref and any quaternion
q, we can check whether q lies on the reference hemisphere
by comparing d with √2. If d > √2, then q does not
lie on S; by simply negating q, the given rotation is
represented by a quaternion located on S. In
the last step all quaternions are multiplied with their
weights, summed up and finally normalized for
obtaining the interpolation result.
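A minimal C++ sketch of this hemisphere check and weighted blending could look as follows (the Quat type and function names are our own illustration, not part of the proposed X3D binding):

#include <cmath>
#include <vector>

struct Quat { float x, y, z, w; };

// Euclidean distance between two quaternions interpreted as 4D points.
static float dist(const Quat& a, const Quat& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z, dw = a.w - b.w;
    return std::sqrt(dx*dx + dy*dy + dz*dz + dw*dw);
}

// Blend unit quaternions with the given weights; qs[0] serves as the
// reference quaternion q_ref that fixes the hemisphere S.
Quat blend(const std::vector<Quat>& qs, const std::vector<float>& weights) {
    const Quat& qRef = qs[0];
    const float sqrt2 = std::sqrt(2.0f);
    Quat sum = {0, 0, 0, 0};
    for (size_t i = 0; i < qs.size(); ++i) {
        Quat q = qs[i];
        // If q is farther from q_ref than sqrt(2), it lies on the opposite
        // hemisphere; -q represents the same rotation and lies on S.
        if (dist(qRef, q) > sqrt2) { q.x = -q.x; q.y = -q.y; q.z = -q.z; q.w = -q.w; }
        sum.x += weights[i] * q.x; sum.y += weights[i] * q.y;
        sum.z += weights[i] * q.z; sum.w += weights[i] * q.w;
    }
    // Normalize the weighted sum to obtain a unit quaternion again.
    float len = std::sqrt(sum.x*sum.x + sum.y*sum.y + sum.z*sum.z + sum.w*sum.w);
    sum.x /= len; sum.y /= len; sum.z /= len; sum.w /= len;
    return sum;
}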
4.2 Realistic Simulation of Hair
The rendering of long, light-colored hair is much more
complex than that of short, dark hair. Due to
the translucent characteristics of hair fibers, the additional
consideration of transmission, dispersion and
self-shadowing is required. In case of direct lighting,
there are two different specular highlights (Scheuer-
mann, 2004). The first one results from direct reflec-
tion at the surface of a hair fiber. The second highlight
results from internal reflection. The incident light
passes through the interior of the fiber and is reflected
at the opposite side of the cylindrical shape. Because
of refraction the light’s direction changes, so the sec-
ondary peak appears shifted towards the hair tip. The
highlight is weaker and gets colored by the pigments
of the hair. In order to calculate the different peaks
described above, two tangents T' and T'' are needed,
which are shifted in opposite directions. This can be
achieved by adding a scaled normal onto the original
tangent T, given by the hair's direction, which is updated
during the simulation anyway.
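A small C++ sketch of this tangent shift (helper types are our own; the sign convention along the normal is an assumption):

#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 normalize(Vec3 v) {
    float l = std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
    return { v.x / l, v.y / l, v.z / l };
}

// Derive the two shifted tangents used for the primary and secondary hair
// highlights by offsetting the strip tangent T along the normal N.
void shiftedTangents(Vec3 T, Vec3 N, float shift1, float shift2,
                     Vec3& T1, Vec3& T2) {
    // Primary highlight: tangent shifted along +N.
    T1 = normalize({ T.x + shift1 * N.x, T.y + shift1 * N.y, T.z + shift1 * N.z });
    // Secondary highlight: tangent shifted in the opposite direction, so its
    // peak appears displaced towards the hair tip.
    T2 = normalize({ T.x - shift2 * N.x, T.y - shift2 * N.y, T.z - shift2 * N.z });
}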
Figure 4: Left: Man with a beard. Right: Hair simulation.
Deformable materials like hair or cloth are often sim-
ulated using mass-spring systems. These can be
calculated with the help of differential equations by
equating Newton's second law of motion (F = ma = m·s̈)
and Hooke's law (F = k·s), which relates the force
F exerted by a spring with spring constant k and rest
length l to its deflection s = l' − l. But explicit numer-
ical methods for solving these equations do not nec-
essarily converge if forces are too strong and the size
of the time step lies above a certain threshold. Our
hair simulation is derived from the cantilever beam
method (Anjyo et al., 1992), which originally was in-
tended for hair modeling. Compared to mass spring
approaches, it provides a numerically simpler and vi-
sually more convincing way to simulate hair. The
most important difference of kinematic models com-
pared to mass spring systems is that the initial dis-
tance l can be fully conserved. Because neighbor-
ing elements don’t interact by means of spring forces,
oscillations can’t occur. Thus a kinematic simula-
tion system keeps stable even with much bigger time
steps. Our modified cantilever beam algorithm inter-
nally works on a kinematic multi-body chain, as il-
lustrated in Fig. 5, left. The nodes of the multi-body
chain are defined by the vertices of the original geom-
etry. Two types are distinguished: anchors and free-moving
vertices. Anchors are connected to the scalp,
whereas all other vertices in the chain move freely
due to external forces like gravity, subject to the
length conservation constraint.
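The core of such a simulation step can be sketched as follows (a strong simplification of the modified cantilever beam algorithm: bending limits, friction and collision handling are omitted, and all names are our own):

#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
static Vec3 sub(Vec3 a, Vec3 b) { return { a.x-b.x, a.y-b.y, a.z-b.z }; }
static Vec3 add(Vec3 a, Vec3 b) { return { a.x+b.x, a.y+b.y, a.z+b.z }; }
static Vec3 mul(Vec3 a, float s) { return { a.x*s, a.y*s, a.z*s }; }
static float len(Vec3 a) { return std::sqrt(a.x*a.x + a.y*a.y + a.z*a.z); }

// chain[0] is the anchor (fixed to the scalp); restLength[i] is the initial
// distance between vertex i and vertex i+1, which is fully conserved.
void updateChain(std::vector<Vec3>& chain, const std::vector<float>& restLength,
                 Vec3 gravity, float dt) {
    for (size_t i = 1; i < chain.size(); ++i) {
        // Move the free vertex according to external forces (here: gravity).
        chain[i] = add(chain[i], mul(gravity, dt * dt));
        // Re-project it onto a sphere around its predecessor so that the
        // segment keeps its rest length (length conservation constraint).
        Vec3 dir = sub(chain[i], chain[i - 1]);
        float d = len(dir);
        if (d > 0.0f)
            chain[i] = add(chain[i - 1], mul(dir, restLength[i - 1] / d));
    }
}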
Besides a convincing simulation a natural behav-
ior in case of collisions is also required. Collisions
with the body are a hard constraint and must be treated
explicitly. Thus for approximation we use objects like
spheres, ellipsoids and planes, for which intersection
tests can be handled efficiently. Hair-hair collision
can’t be handled easily in real-time. Thus the inter-
penetration of hair wisps is avoided by using a slightly
different bending factor for every chain, based on the
position of its respective anchor, and by arranging hair
strips on top of the scalp in different layers at different
distances to the head. To maintain this
during dynamics, each vertex, depending on its posi-
tion, is assigned a virtual collision sphere with a dif-
ferent radius, in order to parameterize the distance to
the head individually (Fig. 5, right).
Figure 5: Simulation step; T and B resemble a quad strip.
SortedPrimitiveSet : X3DComposedGeometryNode {
SFString [in,out] mode "QuadSet"
SFString [in,out] drawOrder "BackToFront"
SFNode [in,out] refPoint NULL
SFNode [in,out] coord NULL
SFNode [in,out] color NULL
SFNode [in,out] normal NULL
SFNode [in,out] texCoord NULL
SFNode [in,out] tangent NULL
MFInt32 [in,out] index []
SFFloat [in,out] lowerBound 0.25
SFVec4f [in,out] upThreshold 0 1 0 0.85
}
All special simulation and rendering components are
implemented as scene graph nodes in Avalon. The
rendering component consists of the hair appearance
and the ”SortedPrimitiveSet” node. It holds all geo-
metric properties like positions, indices and tangents
and is responsible for the CPU-based part of the sorting
algorithm. The latter can be parameterized by the
"drawOrder" field, the "upThreshold" field for defining
the threshold for a second sorting step, and the
"lowerBound" field for determining the percentage of
quad strips which can be omitted after sorting during
rendering due to occlusion. Because of its generic
design, our proposed "SortedPrimitiveSet" node is
likewise useful for similar use cases like rendering grass.
It is updated via the X3D routing mechanism by the
simulation component. This way the shaders belong
only to the appearance nodes and are therefore
interchangeable and easy to parameterize.
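The CPU-based part of the sorting essentially orders the quad strips by their distance to the viewer, as in the following minimal C++ sketch (one reference point per strip; names are our own simplification):

#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

static float dist2(Vec3 a, Vec3 b) {
    float dx = a.x-b.x, dy = a.y-b.y, dz = a.z-b.z;
    return dx*dx + dy*dy + dz*dz;
}

// Returns the draw order of the strips, farthest first, so that alpha
// blending composites the semi-transparent hair patches correctly.
std::vector<int> sortBackToFront(const std::vector<Vec3>& refPoints, Vec3 eye) {
    std::vector<int> order(refPoints.size());
    for (size_t i = 0; i < order.size(); ++i) order[i] = (int)i;
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return dist2(refPoints[a], eye) > dist2(refPoints[b], eye);
    });
    return order;
}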
Figure 6: Component structure in VR system.
As shown in Fig. 6 we also propose some simula-
tion system nodes. The ”MassSpringSystem” is suit-
able for applications like cloth simulations. It inher-
its from our abstract ”SimulationSystem” base node
and is strictly separated from geometry and appear-
ance components for maximum flexibility. The al-
ready described ”CantileverBeamSystem” for simu-
lating hair also inherits from the "SimulationSystem".
With the ”collisionDomainType” and ”collisionDo-
mainParam” fields collision objects can be specified,
and because full triangle tests would be too expen-
sive, the ”massRadius” field can be used for setting
the distance between connected vertices.
SimulationSystem : X3DNode {
SFBool [in,out] enabled TRUE
SFTime [in,out] time 0
SFInt32 [in,out] minStepsPerFrame -1
SFInt32 [in,out] maxStepsPerFrame -1
SFTime [in,out] maxStepTime -1
SFBool [in,out] localCoordSystem TRUE
MFInt32 [in,out] index []
MFInt32 [in,out] anchorIndex []
MFVec3f [in,out] coord []
MFVec3f [in,out] normal []
MFString [in,out] collisionDomainType []
MFFloat [in,out] collisionDomainParam []
SFVec3f [in,out] gravity 0 -9.81 0
SFVec3f [in,out] externalForce 0 0 0
SFFloat [in,out] massRadius 1.0
SFFloat [in,out] staticFriction 0.5
SFFloat [in,out] slidingFriction 0.5
SFFloat [in,out] airFriction 1.0
}
CantileverBeamSystem : SimulationSystem {
MFVec3f [in,out] tangent []
MFColor [in,out] color []
MFVec2f [in,out] texCoord []
MFVec3f [in,out] refPoint []
SFFloat [in,out] minBend 0.05
SFFloat [in,out] maxBend 0.1
}
4.3 Simulation of Locomotion
Capturing and processing motion data is a tedious and
time-consuming task. Thus a better solution is
to generate it automatically. Furthermore there are
animations whose appearance is not known upfront
because they depend on external parameters. Exam-
ples are pointing gestures, where the direction is cal-
culated during run-time, e.g. pointing towards a mov-
ing object, and character locomotion, where the target
is defined during runtime (”go to the red cube”). In
the following we will focus on locomotion, i.e. walk-
ing. Basically there exist two types of approaches for
automatic generation of walking animations (Multon
et al., 1999). The first one tries to simulate the phys-
iology of the human body using kinematics or dy-
namic constraints. The flexibility is very high, be-
cause theoretically any kind of human motion can be
calculated, but for decent results the complexity of
such simulations is very high, too. The second type
adapts captured animation data according to exter-
nal parameters, e.g. ”interpolation” between walking
and running to attain jogging. Here the complexity is
lower, because the characteristics of human walking are
already contained in the animation sets. The drawback of
these approaches is the need for motion data upfront.
The most promising approach we found was the
one described by (Park et al., 2002; Park et al., 2004),
which synthesizes animation data from previously captured
animation data according to different parameters,
e.g. the mood of the character and the style of walking.
In a first step one has to preprocess the motion data
and create animation sequences. Each sequence comprises
one walking cycle with fixed speed, angle
and mood. To walk along a given path or towards a
specified target, the sequences are automatically concatenated
during runtime. The parameter values themselves are
interpolated according to the input values defined by the
application. We integrated the algorithm into Avalon
and generated the necessary motion data with a spe-
cially written exporter. The calculation of the anima-
tion data (56 joints, 27 example motions, 3 parameter
dimensions) took approx. 30 ms on a Pentium 4 with
2.4 GHz, thus it was real-time capable. The visual
results were very convincing; especially the concate-
nated walking cycles looked very lifelike.
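As a very rough illustration of such parametric blending, the following C++ sketch interpolates the joint rotations of two example cycles by a single normalized speed parameter; the actual approach of Park et al. blends many example motions over several parameter dimensions, and all names here are our own:

#include <cmath>
#include <vector>

struct Quat { float x, y, z, w; };

// Normalized linear interpolation of two unit quaternions; the hemisphere
// check keeps both on the same side before interpolating.
static Quat nlerp(Quat a, Quat b, float t) {
    float dot = a.x*b.x + a.y*b.y + a.z*b.z + a.w*b.w;
    if (dot < 0.0f) { b.x = -b.x; b.y = -b.y; b.z = -b.z; b.w = -b.w; }
    Quat q = { a.x + t*(b.x - a.x), a.y + t*(b.y - a.y),
               a.z + t*(b.z - a.z), a.w + t*(b.w - a.w) };
    float l = std::sqrt(q.x*q.x + q.y*q.y + q.z*q.z + q.w*q.w);
    q.x /= l; q.y /= l; q.z /= l; q.w /= l;
    return q;
}

// One frame of the synthesized cycle: per-joint interpolation between the
// corresponding frames of the walk and run example cycles.
std::vector<Quat> blendCycleFrame(const std::vector<Quat>& walkFrame,
                                  const std::vector<Quat>& runFrame,
                                  float speedParam /* 0 = walk, 1 = run */) {
    std::vector<Quat> out(walkFrame.size());
    for (size_t j = 0; j < walkFrame.size(); ++j)
        out[j] = nlerp(walkFrame[j], runFrame[j], speedParam);
    return out;
}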
Figure 7: Different walking styles.
4.4 Content Creation
Content creation embraces the creation of the character's
geometry, materials and animations, as well as
the scene, the behavior and the overall "story".
For the creation of the digital story several methods
can be used. Advanced ones provide non-linear sto-
rytelling and adaptive dialogues. But there are many
applications where this complexity is not necessary.
For such cases we developed simpler tools allowing
us to put together story-lines in an easy and intuitive
way. As described earlier, a story can be described
with PML. It allows defining when and what a char-
acter or object in the scene is doing and how/when the
user can interact with the virtual environment. To al-
low interactions by the user and to change the flow of
the story, we define short acts and transform them into
PML scripts. A short act could be a dialogue between
two characters on a given topic. Such PML scripts
will be stored in special ”SceneAct” nodes which are
connected by routes. The routes define the flow of the
story. As soon as such a node gets a trigger on its in-
put field, its PML script will be executed. The ”url”
field holds the corresponding PML script, whereas the
”run” and ”finished” SFBool fields are used for the
story graph composition. By adding other nodes in
the route graph, we can easily add some non-linearity
and possibilities for user interaction to the story.
5 RESULTS AND CONCLUSIONS
In this paper we described results we achieved in the
Virtual Human project concerning rendering and an-
imation of virtual characters. First extrinsic factors
like properties of human skin and hair were incor-
porated. For modeling behavioral aspects intrinsic
factors were also considered, including control lan-
guages, emotions, and animations in general. However,
the goal was not only photorealistic rendering of the
scene and the virtual characters; the behavior also
had to be modeled as realistically as possible. During
the project we realized that realistic real-time rendering
techniques well known from modern computer
games, such as image-based rendering, as well as techniques
requiring fine-grained render state access, were
mostly ignored in the field of MR/VR. The same goes
for the real-time simulation of deformable objects like
cloth and hair. Although a lot of work has already been done
towards realistic rendering and simulation, research
is usually conducted in a standalone application with-
out embedding the algorithms into a wider field of
applications as is needed for e.g. X3D, which is not
only an open standard for interactive 3D graphics but
is also easy to learn for non-programmers. Thus we
proposed extensions of the standard, which were eval-
uated in two different scenarios.
One major goal was visual realism, where we face
the same problems as research on humanoid robots:
the ”uncanny valley”, a hypothesis introduced by M.
Mori in 1970. He states that as a robot is made more
human-like, the emotional response from a human be-
ing to the robot will become increasingly positive and
empathic, until a point is reached beyond which it
becomes strongly repulsive. But as appearance and
motion continue to become less distinguishable from
a human being’s, the emotional response becomes
positive again and approaches human-human empa-
thy levels. This also holds for virtual characters, and
for convincing results we have to come very close to
human-like appearance and behavior. We have devel-
oped powerful algorithms to improve rendering and
have even taken care of dynamics. But simply applying them
to a character model leaves us deeply stuck inside the
uncanny valley without attaining convincing results,
because parameter optimization is still tedious and
has to be done by experts. An example is the forth-
coming game from Crytek, which was postponed sev-
eral times although lots of people are working on it.
So, the generation of realistic virtual humans is currently
not possible without human intervention. One solu-
tion could be to set up libraries a user can choose from
and to provide authoring tools which guide the user through the
creation process, starting at a very coarse level, and
refining the choices step by step.
Concerning recorded motion capture data, the
biggest problem was data quality. Without heavy
manual work one will face "floating" characters or
strange artifacts when blending between two very different
poses. Since any kind of blending does "interpolation"
in some way, there will always be cases where
blending fails and delivers unsatisfactory results.
Without model knowledge or very accurate animation
data we will not be able to blend animations convinc-
ingly. So, prerecorded animation must be planned ac-
curately. It should be defined which is the starting
and which is the ending pose as well as which joints
are involved. Blending between very different poses
therefore should be avoided. To increase flexibility
research should focus on automatic real-time capable
methods for the creation of animation data.
ACKNOWLEDGEMENTS
This work was part of the project Virtual Human
funded by the German Federal Ministry of Education and Research.
REFERENCES
Alexa, M., Behr, J., and Müller, W. (2000). The morph
node. Web3D - VRML 2000 Proc., pages 29–34.
Anjyo, K.-I., Usami, Y., and Kurihara, T. (1992). A simple
method for extracting the natural beauty of hair. In
SIGGRAPH ’92, pages 111–120. ACM Press.
Avalon (2007). Avalon. http://www.instantreality.org/.
Ekman, P. (1982). Emotion in the human face. Cambridge
University Press.
Green, S. (2004). Real-Time Approximations to Subsurface
Scattering, pages 263–278. Addison-Wesley.
Hildenbrand, D. (2005). Geometric computing in computer
graphics using conformal geometric algebra. In CG
2005, volume 29, pages 802–810.
Jung, Y. and Knöpfle, C. (2006). Dynamic aspects of real-
time face-rendering. In VRST 2006, pages 193–196,
New York. ACM: VRST Cyprus 2006.
Jung, Y., Rettig, A., Klar, O., and Lehr, T. (2005). Realistic
real-time hair simulation and rendering. In VVG 05,
pages 229–236, Aire-la-Ville. Eurographics Assoc.
Kalra, P. and Magnenat-Thalmann, N. (1994). Modeling
of vascular expressions. In Computer Animation ’94,
pages 50–58, Geneva.
Multon, F., France, L., Cani-Gascuel, M.-P., and Debunne,
G. (1999). Computer animation of human walking:
a survey. The Journal of Visualization and Computer
Animation, 10(1):39–54.
Park, S., Shin, H., Kim, T., and Shin, S. (2002). Online
locomotion generation based on motion blending. In
ACM Symposium on Computer Animation.
Park, S., Shin, H., Kim, T., and Shin, S. (2004). Online
motion blending for real-time locomotion generation.
In Computer Animation and Virtual Worlds. John Wiley and Sons.
Piwek, P., Krenn, B., Schröder, M., Grice, M., Baumann, S.,
and Pirker, H. (2002). RRL: A rich representation language
for the description of agent behaviour in NECA. In Proc. of the
workshop "Embodied conversational agents - let's specify and evaluate them!".
Scheuermann, T. (2004). Practical real-time hair rendering
and shading. SIGGRAPH 2004 Sketches.
Web3DConsortium (2006). H-Anim. http://www.web3d.org/x3d/specifications/ISO-IEC-19774-HumanoidAnimation/.
Web3DConsortium (2007). Extensible 3D (X3D). http://www.web3d.org/x3d/specifications/ISO-IEC-19775-X3DAbstractSpecification_Revision1_to_Part1/.
Xj3D (2004). Xj3D dynamic texture rendering extension. http://www.xj3d.org/extensions/render_texture.html.