Figure 3: Membership functions f_{H_i} and f_{H_i^c}, associated respectively with classes H_i and H_i^c, applied to the SVM output x to build mass functions.
efficiency of the classifiers (i.e. precision, given by
the confusion matrix) is used to weight their decision.
This is formalized as follows.
Let S_1, ..., S_j, ..., S_m be the m sets of features considered as sources of information. For each S_j, n binary classifiers c_ij are trained to recognize the n classes H_i.
Let us now consider the building of the mass function m_{c_ij} corresponding to the belief mass obtained from source of information S_j using classifier c_ij trained to recognize class H_i. According to the output x_ij of c_ij, and using the membership functions (Fig. 3), the mass is distributed over three subsets of Ω: Ω itself, H_i, and H_i^c, the complement of H_i in Ω, as in Eqs. 3-5.
m_{c_ij}(H_i) = f_{H_i}(x) · p_{c_ij}(H_i)    (3)
m_{c_ij}(H_i^c) = f_{H_i^c}(x) · p_{c_ij}(H_i^c)    (4)
m_{c_ij}(Ω) = 1 − m_{c_ij}(H_i) − m_{c_ij}(H_i^c)    (5)
where p_{c_ij}(H_i) is the precision of c_ij for class H_i and p_{c_ij}(H_i^c) is the precision of c_ij for class H_i^c, both computed from the confusion matrix of c_ij.
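Eqs. 3-5 can be sketched as follows. The exact membership shapes of Fig. 3 are not reproduced in the text, so a simple saturating ramp on the SVM output is assumed here; all function and variable names are illustrative.

```python
def f_Hi(x, t=1.0):
    """Membership of SVM output x in class H_i (assumed piecewise-linear,
    saturating at +t; the precise shape of Fig. 3 is an assumption)."""
    return min(max(x / t, 0.0), 1.0)

def f_Hic(x, t=1.0):
    """Membership in the complement class H_i^c (mirror of f_Hi)."""
    return min(max(-x / t, 0.0), 1.0)

def mass_function(x, prec_Hi, prec_Hic, t=1.0):
    """Build the mass function m_cij over {H_i, H_i^c, Omega} (Eqs. 3-5):
    membership weighted by the classifier's per-class precision, the
    remainder going to Omega (uncertainty)."""
    m_Hi = f_Hi(x, t) * prec_Hi       # Eq. 3
    m_Hic = f_Hic(x, t) * prec_Hic    # Eq. 4
    m_Omega = 1.0 - m_Hi - m_Hic      # Eq. 5
    return {"Hi": m_Hi, "Hic": m_Hic, "Omega": m_Omega}
```

With x = 0 both memberships vanish and all mass goes to Ω; with a saturated positive output, at most prec_Hi of the mass reaches H_i and the rest stays on Ω.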
Thus, if the output x of classifier c_ij is a high positive value, c_ij is confident that the input belongs to class H_i. But, as c_ij may be mistaken, the mass is distributed not only on H_i but also on Ω, which corresponds to uncertainty, according to the ability of c_ij to correctly recognize H_i. Conversely, if x is very negative, c_ij is confident that the input belongs to class H_i^c. However, this decision is also weighted by the ability of c_ij to correctly recognize H_i^c, leading to a distribution of mass between H_i^c and Ω. Finally, if x is around 0, the classifier c_ij is in doubt, so most of the mass is assigned to Ω, which corresponds to uncertainty.
Once the mass functions m_{c_ij} have been computed from all classifiers c_ij, they are combined according to a given combination operator, such as Dempster's (Eq. 2), which corresponds to a fusion of the information given by all sources S_j. Finally, a single mass function is obtained, distributing the belief over subsets of Ω. The final decision can then be taken according to the decision measures presented in Section 3.1.
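For two mass functions restricted to the subsets {H_i}, {H_i^c} and Ω used above, Dempster's rule (Eq. 2) can be sketched as below; this is a minimal illustration for the two-source case, not the paper's full implementation.

```python
def dempster_combine(m1, m2):
    """Dempster's rule for two mass functions over the subsets
    {'Hi', 'Hic', 'Omega'} of the frame: products of masses are assigned
    to set intersections, then renormalised by 1 - K, where K is the
    mass placed on the empty set (conflict)."""
    def intersect(a, b):
        if a == b:
            return a
        if a == "Omega":
            return b
        if b == "Omega":
            return a
        return None  # Hi ∩ Hic is empty: conflicting evidence

    combined = {"Hi": 0.0, "Hic": 0.0, "Omega": 0.0}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = intersect(a, b)
            if inter is None:
                conflict += ma * mb
            else:
                combined[inter] += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}
```

Combining more than two sources amounts to folding this operator over the list of mass functions, since the rule is associative.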
4 EXPERIMENTS
In our experiments, we have made use of the IAPS database (P. J. Lang, 1999), which provides ratings of affect (pleasure or valence, arousal, and control) for 1192 emotionally evocative images. We have considered an emotion model based on the pleasure and arousal dimensions, with four classes corresponding to the four quadrants as shown in Fig. 5. The IAPS corpus is partitioned into a train set (80% of the data, 953 images) and a test set (20% of the data, 239 images), and all the experiments are repeated ten times to obtain the average correct classification rate (CR).
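The quadrant labelling and the repeated 80/20 partition can be sketched as follows; the class names are illustrative, and IAPS ratings are assumed to lie on the usual 1-9 scale with midpoint 5.

```python
import random

def quadrant_class(valence, arousal, midpoint=5.0):
    """Map a (valence, arousal) rating pair to one of the four quadrant
    classes of Fig. 5 (names hypothetical; 1-9 rating scale assumed)."""
    if valence >= midpoint:
        return "positive-high" if arousal >= midpoint else "positive-low"
    return "negative-high" if arousal >= midpoint else "negative-low"

def split(images, train_ratio=0.8, seed=None):
    """Random 80/20 train/test partition, redrawn for each of the ten
    repetitions over which the classification rate (CR) is averaged."""
    rng = random.Random(seed)
    idx = list(range(len(images)))
    rng.shuffle(idx)
    cut = int(train_ratio * len(images))
    return [images[i] for i in idx[:cut]], [images[i] for i in idx[cut:]]
```

Note that an 80% cut of 1192 images yields exactly the 953/239 split quoted above.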
To explore the performance of the different feature sets for visual emotion recognition presented in Section 2, we have built a classification scheme using two support vector machine classifiers to identify each class: the first identifies the arousal dimension, and the second is dedicated to the pleasure dimension. The results obtained are shown in Figure 6.
From these results, it appears that among the different features, the texture features (LBP, Tamura) are the most efficient ones. Moreover, the higher-level features (dynamism and harmony) may at first seem to give lower performance, but since each consists of a single value, their efficiency is in fact remarkable.
To evaluate the efficiency of different types of combination approaches, we have built an emotion classification scheme that combines classifiers based on different features according to the framework illustrated in Fig. 4. In this system, SVM classifiers are employed and each feature set S_j is used to train classifiers c_ij, which produce a measurement vector y_ij corresponding to the probability that the input belongs to the different classes C_i. The vectors y_ij are then used to perform the combination and obtain the classification results, as described in Section 3.2. The following combination methods have been implemented and compared to our approach based on the Theory of Evidence: maximum-score, minimum-score, mean-score, and majority vote. The results obtained are shown in Table 1.
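The four baseline fusion rules compared against the evidential approach can be sketched in a single routine; the function name and the per-class score-vector representation are assumptions, with all vectors sharing one class ordering.

```python
from collections import Counter

def fuse_scores(score_vectors, method="mean"):
    """Score-level fusion baselines of Table 1. Each element of
    score_vectors is one classifier's per-class score vector y_ij;
    the function returns the index of the winning class."""
    n_classes = len(score_vectors[0])
    if method == "majority":
        # Each classifier votes for its top class; most votes wins.
        votes = Counter(max(range(n_classes), key=v.__getitem__)
                        for v in score_vectors)
        return votes.most_common(1)[0][0]
    agg = {"mean": lambda s: sum(s) / len(s),
           "max": max,
           "min": min}[method]
    # Aggregate the scores class by class, then take the argmax.
    fused = [agg([v[c] for v in score_vectors]) for c in range(n_classes)]
    return max(range(n_classes), key=fused.__getitem__)
```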
These results show that fusion with the Theory of Evidence is more efficient, with an average classification rate of 54.7%, than fusion with the mean-score, min-score, max-score, and majority voting rules (Robert Snelick, 2005), which proves the ability of the Theory of Evidence to com-
EVALUATION OF FEATURES AND COMBINATION APPROACHES FOR THE CLASSIFICATION OF EMOTIONAL SEMANTICS IN IMAGES