Table 5: Average scores (percentage of correct answers) for different gesture styles.

gesture style                                 no gesture   regular gestures   highlighted gestures
average scores for all questions              56.7 %       60.7 %             66.7 %
average scores for questions about figures    57.4 %       60.0 %             72.7 %
Table 6: Average scores (percentage of correct answers) for different expression styles.

expression style                              no expression   moderate expressions   intensive expressions
average scores for all questions              56.3 %          60.0 %                 67.4 %
average scores for questions about figures    55.6 %          66.7 %                 67.4 %
We have investigated the key issues, namely duration and timing, in manually synchronizing gesture with speech, which led us to treat the synchronization problem as a motion synthesis problem. We have proposed a novel two-step solution that applies the motion graph technique within the constraints of gesture structure. Subjective evaluations of two scenarios, talking and news commentary, have demonstrated that our method is more effective than the conventional method.
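To make the two-step idea concrete, the following is a minimal sketch, not the implementation used in this work: it assumes a database of motion clips labeled with gesture phases (preparation, stroke, retraction), restricts the search to structurally valid phase orderings, and then picks the clip sequence whose stroke interval best matches a stroke timing derived from speech. The names Clip and pick_gesture, the phase labels, and the timing values are illustrative assumptions.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class Clip:
    name: str        # clip id in a hypothetical motion database
    phase: str       # gesture phase: "prep", "stroke", or "retract"
    duration: float  # clip length in seconds

def pick_gesture(clips, stroke_start, stroke_end):
    """Step 1 (structure): enumerate only prep -> stroke -> retract
    paths, so the gesture-phase ordering is always respected.
    Step 2 (timing): among those paths, minimize the deviation of the
    resulting stroke interval from the target interval taken from speech.
    """
    by_phase = {p: [c for c in clips if c.phase == p]
                for p in ("prep", "stroke", "retract")}
    best, best_err = None, float("inf")
    for prep, stroke, retract in product(by_phase["prep"],
                                         by_phase["stroke"],
                                         by_phase["retract"]):
        start = prep.duration          # stroke begins when prep ends
        end = start + stroke.duration  # retract follows the stroke
        err = abs(start - stroke_start) + abs(end - stroke_end)
        if err < best_err:
            best, best_err = (prep, stroke, retract), err
    return best

# Example: align the stroke phase with an accented phrase at 0.8-1.3 s.
clips = [Clip("p1", "prep", 0.6), Clip("p2", "prep", 0.9),
         Clip("s1", "stroke", 0.5), Clip("s2", "stroke", 0.7),
         Clip("r1", "retract", 0.8)]
print([c.name for c in pick_gesture(clips, 0.8, 1.3)])  # ['p1', 's2', 'r1']
```

A full motion-graph traversal would additionally score transition smoothness between clips; the sketch keeps only the two constraints that the two-step formulation introduces, structural validity and stroke-timing error.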
In future work, we plan to improve the generation of facial expressions, since realistic facial dynamics can further raise animation quality. We are also extending the target applications to new categories such as remote chat and human-robot interaction.