animated clips to be the most popular in the Section
focusing on coverage of the Barcelona World Race.
Our paper is organised as follows. Section 2
presents related work in the field of automated
signing avatars. Section 3 details the methodology
used to create our system. Section 4 presents Borja,
the signing avatar designed for the Barcelona World
Race. Finally, Section 5 summarises the paper and
draws conclusions.
2 RELATED WORK
Several studies (Dehn & Van Mulken, 2000; Johnson
& Rickel, 1997; Moundridou & Virvou, 2002) found
that rendering agents with lifelike features, such as
facial expressions, deictic gestures and body
movements, may give rise to the so-called persona effect.
The persona effect is a result of anthropomorphism,
derived from believing that the agent is real and
authentic (Van Mulken et al., 1998; Baylor &
Ebbers, 2003).
During the last decade there has been extensive
research and development carried out with the goal
of automating the animation of sign language for
virtual characters. The ViSiCast and eSIGN
projects (Elliott et al., 2007) focused on the
development of a comprehensive pipeline for text-
to-sign translation. Taking text written in English or
German as input, the system translates it into written
text in sign language (ESL or DGS, respectively).
The translation, though, is very
literal and liable to produce unnatural results.
Written sign language is translated into signs using
the HamNoSys (Hanke, 2004) notation which
describes signs as positions and movements of both
manual (hands) and non-manual (upper-torso, head
and face) parts of the body. This notation enables
signs to be performed by a virtual character
procedurally, using inverse kinematics techniques. In
this sense the animation system is procedural and,
thus, flexible and reusable compared to systems that
use motion-captured data or handcrafted animation.
The HamNoSys notation, however, does not include
any reference to speed and timing, and ignores
prosody. This has been considered one of the
reasons for the low comprehensibility (71%) of signed
sentences generated with HamNoSys (Kennaway et al., 2007).
This is a particularly serious disadvantage, given that
recent studies have demonstrated that prosody is as
important in signed languages as in spoken ones,
activating brain regions in similar ways (Newman et
al., 2010), and that it plays a crucial role in understanding
the syntax of a signed message.
Automatic Program Generation is a field of
natural interest to the commercial domain, but one that
has seen little academic research (Abadia et al.,
2009). The most recent integrated attempt to
disseminate sports news using a virtual signer has
been the SportSign platform (Othman et al., 2010). It
is a partially automatic system that requires an operator
to choose the kind of sport and specify other relevant
data, such as the teams playing the match, the number of
goals or points scored, etc. The system then generates
a written version of the news in sign language
(specifically, American Sign Language, ASL) that a human operator
has to validate. Finally, a server-based service
generates a video with an animated character, which is
then published on a web page. This workflow needs
extensive interaction with a human user to
guide and validate the system's results, meaning
that it would not meet the important goal of reducing
costs enough to make signed sports news feasible.
3 METHODOLOGY
3.1 System Overview
Our animation system is designed to build a
complex signed animation with a virtual character
by concatenating, blending and merging animated
clips previously prepared to digitally mimic the
gestures of expert signers. The system relies on a
series of XML-based templates indicating which
signs should be used to build certain content, and
which type of data is needed for the relevant
information (numbers, text, etc.). Figure 1 shows a
schematic overview of how the system constructs an
animation.
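To make the template mechanism concrete, the sketch below shows what such an XML template might look like and how its data slots could be read out. The element and attribute names (template, sign, slot, etc.) are illustrative assumptions, not the system's actual schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML template for a race-position news item; the tag and
# attribute names are illustrative, not taken from the actual system.
TEMPLATE = """
<template id="race_position">
    <sign phrase="BOAT"/>
    <slot name="boat_name" type="text"/>
    <sign phrase="POSITION"/>
    <slot name="rank" type="number"/>
</template>
"""

def template_slots(xml_text):
    """Return the (name, type) pairs of the data slots a template expects."""
    root = ET.fromstring(xml_text)
    return [(s.get("name"), s.get("type")) for s in root.iter("slot")]

# Each slot tells the parser which type of data (numbers, text, ...)
# is needed to fill in the relevant information.
print(template_slots(TEMPLATE))  # [('boat_name', 'text'), ('rank', 'number')]
```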
The parser receives the data and decides whether
it is a known phrase, a number or a custom word. If it
is a known phrase, the animation is retrieved directly from
a database that stores the animation data for that
entire phrase. If it is not a phrase, the system falls
back and looks for a number or a word. In this case,
the sign elements used to compose that
number or word are taken individually and
merged together to build a single new sign. This fall-
back is very useful when dealing with names of
people or cities, or with numbers of more than
one digit. Finally, each of the clips is merged using
animation blending. How each clip has to be blended
is specified in the metadata description associated
with that clip in the database. Each clip has its own
length, start time, end time and in-out trim points
which specify the boundaries for the blending.
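The fall-back logic and the trim-point boundaries described above can be sketched as follows. The in-memory databases and the clip metadata fields are simplified stand-ins for the system's actual data model:

```python
# Sketch of the parser fall-back: a known phrase comes back as a single
# stored clip; otherwise the word or number is decomposed into individual
# sign elements (letters or digits) that are merged into one new sign.
PHRASE_DB = {"BARCELONA WORLD RACE": ["clip_bwr"]}        # whole phrases
ELEMENT_DB = {c: f"clip_{c}" for c in "0123456789BORJA"}  # per-element signs

def resolve(token):
    if token in PHRASE_DB:              # known phrase: use the stored clip
        return PHRASE_DB[token]
    # fall back: compose the sign element by element (name, city, number)
    return [ELEMENT_DB[ch] for ch in token if ch in ELEMENT_DB]

def blend_boundaries(clip_meta):
    """Each clip's metadata carries its length, start/end times and in-out
    trim points; the trim points bound the cross-fade with adjacent clips."""
    start = clip_meta["start"] + clip_meta["trim_in"]
    end = clip_meta["end"] - clip_meta["trim_out"]
    return start, end

print(resolve("BARCELONA WORLD RACE"))  # ['clip_bwr']
print(resolve("2014"))                  # ['clip_2', 'clip_0', 'clip_1', 'clip_4']
print(blend_boundaries({"start": 0.0, "end": 1.5,
                        "trim_in": 0.25, "trim_out": 0.25}))
```

The merge step itself (the cross-fade between consecutive clips) is omitted here, since it depends on the underlying animation engine.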
GRAPP 2014 - International Conference on Computer Graphics Theory and Applications