respect to facial expressions. Thus, upper face AUs
(1, 2 and 5) and AU combinations (1+2, 1+4, 1+5,
1+6, 1+7, 2+4, 2+5, 4+5) which result in raising of
eyebrows and widening of eyelids had little or no
effect on the eye region localization. The
degradation in the eye region localization rates was
mainly caused by activation of upper face AUs (4, 6,
7, and 43/45) and AU combinations (4+6, 4+7,
4+45, and 6+7), which typically narrow the
space between the eyelids and/or cause the eyebrows
to draw down together. These facial behaviours were
the main causes of eye region localization
errors.
Recent studies on feature-based AU
recognition, whose performance depends on the
features used, reported similar results. In (Lien,
Kanade, Cohn, and Li, 2000), first-order derivative
filters of different orientations (horizontal, vertical,
and diagonal) were utilized to detect transient facial
features (wrinkles and furrows) for the purpose of
AU recognition. They reported AU recognition rates
of 86% for AU 1+2, 80% for AU 1+4, and 96% for
AU 4. In (Tian, Kanade, and Cohn, 2002), the
authors reported a decrease in performance of
feature-based AU recognition for nearly all the same
AUs (AU 4, 5, 6, 7, 41, 43, 45, and 46) that
created difficulties in landmark localization in the
present study. Among all the upper face AUs, they
found AUs 5, 6, 7, 41, and 43 to be the most difficult
to process with the feature-based AU recognition method.
4.2 Effect of Lower Face AUs on Nose
and Mouth Localization Rates
The results demonstrated that nose and mouth
localization was significantly affected by facial
expressions in both the upper and lower face. As
suggested in (Guizatdinova and Surakka, 2005),
AUs 9, 10, 11, and 12 were found to cause poor
localization performance of the method.
Certain changes occur in the face when the
listed AUs are activated. In particular, AU 12
pulls the lips back and obliquely
upwards. Further, the activation of AUs 9 and 10 lifts
the centre of the upper lip upwards, making the shape
of the mouth resemble an upside-down curve. AUs
9, 10, 11, and 12 all deepen the
nasolabial furrow and pull it laterally upwards.
Although there are marked differences in the shape
of the nasolabial deepening and mouth shaping for
these AUs, in summary these AUs
generally make the gap between nose and mouth
smaller. These changes in facial appearance
typically caused nose and mouth localization
errors.
In particular, lower face AU 9 and AU
combinations 4+6, 9+17, 12+20, and 12+16 caused
strong degradation in nose and mouth localization
rates. Similarly, in (Lien, Kanade, Cohn, and Li,
2000), degradation in the feature-based recognition
of the lower face AU combinations 12+25 and 9+17
was observed (84% and 77%, respectively).
However, despite the considerable deterioration of
nose and mouth localization by the listed AUs, the
mouth could be found regardless of whether it
was open or closed and whether the teeth or
tongue were visible (Figure 2).
4.3 General Discussion
So far we have discussed the effect of upper face AUs on
the eye region localization and the effect of lower
face AUs on the nose and mouth localization.
However, the results also revealed that expressions
in the upper face noticeably deteriorated nose and
mouth localization, and some changes in the lower
face affected eye region localization. This is because
AUs, occurring singly or in combination,
may produce strong skin deformations far
from the region in which they act. In the current
database, upper face AUs were usually represented
in conjunction with lower face AUs, and their joint
activation caused changes in both the upper and lower
parts of the face. Because of this, the effect of a single
AU or AU combination was difficult to
isolate. The present study investigated only the
indirect effect of AUs and AU combinations on the
landmark localization.
The overall performance of the method can be
improved in several respects. First, the results
demonstrated that the majority of the errors were
caused by facial behaviours that decreased
the space between neighbouring
landmarks. Thus, localization errors occurred
already at the stage of edge map construction. The
reason was that the distance between edges
extracted from neighbouring landmarks became less
than a fixed threshold, so edges belonging to
different landmarks were erroneously grouped
together. To fix this problem, adaptive thresholds are
needed for edge grouping. To facilitate landmark
localization further, the merged landmarks can be
analyzed according to the edge density inside the
merged regions. The results showed that the regions
of merged landmarks have non-uniform edge
density. Such regions can be processed subsequently
and separated into several regions of strong edge
VISAPP 2008 - International Conference on Computer Vision Theory and Applications