convolve a temporal neighborhood of 7 emotion samples (M = 7). To account for multi-channel effects, each of the C filters therefore has C · M = 35 taps.
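As an illustration only, the following Python sketch shows how such a multi-channel filter could be estimated and applied to a per-frame probability time series. The class count C = 5, the least-squares training criterion, and all function names are assumptions for the sketch, not the paper's implementation.

```python
import numpy as np

C, M = 5, 7  # number of emotion classes and temporal window length (assumed)

def fit_filter(prob_seqs, label_seqs):
    """Estimate the C x (C*M) filter matrix by least squares against
    one-hot class targets (an assumed training criterion)."""
    X, Y = [], []
    for probs, labels in zip(prob_seqs, label_seqs):
        for t in range(M - 1, probs.shape[0]):
            X.append(probs[t - M + 1:t + 1].reshape(-1))   # C*M input taps
            Y.append(np.eye(C)[labels[t]])                  # one-hot target
    W, *_ = np.linalg.lstsq(np.asarray(X), np.asarray(Y), rcond=None)
    return W.T                                              # shape (C, C*M)

def filter_probabilities(probs, W):
    """Apply the multi-channel filter to a (T, C) probability time series."""
    out = np.zeros_like(probs)
    for t in range(probs.shape[0]):
        window = probs[max(0, t - M + 1):t + 1]
        if window.shape[0] < M:                             # pad at sequence start
            pad = np.repeat(window[:1], M - window.shape[0], axis=0)
            window = np.vstack([pad, window])
        y = np.clip(W @ window.reshape(-1), 0.0, None)      # C*M taps -> C outputs
        out[t] = y / y.sum() if y.sum() > 0 else probs[t]
    return out
```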
Fig. 3 shows a sample sequence. The image sequence (top) starts with the neutral facial expression; the candidate then has to imitate surprise and happiness. The normalized feature data providing the input for the SVM classifier is shown in the middle. Below, the classification probability values estimated by pairwise coupling from the detected features are depicted. It can clearly be seen that the classification gives stable results during the constant parts of the facial expression time line, but at the transition from neutral to surprise false classifications occur (Anger-Disgust-Neutral). When returning from the happy to the neutral expression, surprise and disgust are detected, which was not intended.
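For orientation, probability estimates of this kind can be produced by a multi-class SVM with pairwise coupling (Wu and Lin, 2004). The minimal sketch below uses scikit-learn's LIBSVM-based SVC with placeholder feature data and an assumed five-class emotion set; it is not the paper's actual training setup.

```python
import numpy as np
from sklearn.svm import SVC

EMOTIONS = ["Neutral", "Happy", "Surprise", "Anger", "Disgust"]  # assumed classes

rng = np.random.default_rng(0)
X_train = rng.random((200, 12))                  # normalized feature vectors (dummy)
y_train = rng.integers(0, len(EMOTIONS), 200)    # class labels (dummy)

clf = SVC(kernel="rbf", probability=True)        # probability=True enables pairwise coupling
clf.fit(X_train, y_train)

X_seq = rng.random((50, 12))                     # features of one image sequence
prob_seq = clf.predict_proba(X_seq)              # (frames, classes) probability time series
```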
Fig. 4 (left) shows the application of the associative deconvolution to the classification probability values. The results show again that the SVM clearly defines boundaries between different classes while the facial expression is constant but fails at the transitions. All time series start with the neutral expression. In the upper sequence the expression changes to ’Happy’. The second example includes two transitions: from ’Neutral’ to ’Happy’ and from ’Happy’ to ’Surprise’. The third example shows the transitions ’Neutral’-’Disgust’-’Anger’ and
the fourth ’Neutral’-’Surprise’-’Disgust’. In the right half of Fig. 4, the associated probability values are shown. It is obvious that false classifications at the transitions from one emotion to the other are well suppressed.
5 CONCLUSIONS AND OUTLOOK
We have presented an efficient framework for facial expression recognition in human-computer interaction systems. Our system achieves robust feature detection and expression classification and can also cope with variable head poses, which cause perspective foreshortening and changing face size, as well as with different skin colors.
The presented approach with a linear multi-channel deconvolution demonstrates the principle of incorporating temporal behavior. Additionally including the feature data along with the classification data should further improve the results. Current work also aims at estimating additional dynamic features, which are obtained by methods of motion analysis, e.g. optical flow techniques.
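As a pointer only, dense optical flow can provide such dynamic features. The sketch below uses OpenCV's Farneback implementation; the reduction to two scalar statistics per frame pair is an assumption rather than the paper's feature definition.

```python
import cv2
import numpy as np

def flow_features(prev_gray, curr_gray):
    """Return mean motion magnitude and median motion direction (radians)
    between two consecutive grayscale frames."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    return np.array([mag.mean(), np.median(ang)])
```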
It is expected that other (non-linear) approaches, such as associative memories known from artificial neural networks (e.g. Kohonen, 1995), could also be of interest.
ACKNOWLEDGEMENTS
This work has been supported by DFG-Project TRR
62/1-2009.
REFERENCES
Chang, C.-C. and Lin, C.-J. (2009). LIBSVM: a library for support vector machines.
Cohen, I., Sebe, N., Garg, A., Chen, L., and Huang, T.
(2003). Facial expression recognition from video se-
quences: Temporal and static modeling. Computer
Vision and Image Understanding, 91(1-2):160–187.
Ekman, P. (1994). Strong evidence for universals in facial expressions: a reply to Russell's mistaken critique. Psychological Bulletin, pages 268–287.
Fragopanagos, N. and Taylor, J. G. (2005). Emotion recog-
nition in human-computer interaction. Neural Net-
works, Special Issue, 18:389–405.
Jain, A. (1998). Fundamentals of Digital Image Processing. Prentice Hall.
Kohonen, T. (1995). Self-Organizing Maps. Springer.
Li, S. and Jain, A. (2001). Handbook of Face Recognition.
Springer.
Niese, R., Al-Hamadi, A., Aziz, F., and Michaelis, B.
(2008). Robust facial expression recognition based on
3-d supported feature extraction and svm classifica-
tion. In Proc. of the IEEE International Conference on
Automatic Face and Gesture Recognition (FG2008,
Sept. 17-19).
Niese, R., Al-Hamadi, A., Panning, A., Brammen, D.,
Ebmeyer, U., and Michaelis, B. (2009). Towards
pain recognition in post-operative phases using 3d-
based features from video and support vector ma-
chines. International Journal of Digital Content Tech-
nology and its Applications, (in print).
Viola, P. and Jones, M. (2001). Rapid object detection us-
ing a boosted cascade of simple features. In Proceed-
ings of the IEEE Conf. on Computer Vision and Pat-
tern Recognition.
Wu, T. and Lin, C. (2004). Probability estimates for multi-class classification by pairwise coupling. Journal of Machine Learning Research, 5:975–1005.