(Baccianella et al., 2010) is the result of automatic
annotation of all WordNet synsets according to their
degrees of positivity, negativity, and neutrality.
Starting from WordNet Affect, (Valitutti et al.,
2005) proposed a simple word presence method to
detect emotions. (Ma et al., 2005) designed an emo-
tion extractor from chat logs, based on the same sim-
ple word presence. SemEval 2007, task 14 (Strap-
parava and Mihalcea, 2008) presented a corpus and
some methods to evaluate it, some based on Latent
Semantic Analyser (LSA) and presence of an emo-
tional word (e.g. WordNet Affect item).
Methods more related to signal processing were
proposed by (Alm et al., 2005), (Danisman and Alpkocak,
2008), and (D’Mello et al., 2006), which introduce
different solutions for feature extraction and
selection, as well as various classifiers. (Alm et al., 2005)
used a corpus of child stories and a Winnow Linear
method to classify the data into 7 categories. Using
the ISEAR (Wallbott et al., 1988) dataset, a popular
collection of psychological data from around 1990,
(Danisman and Alpkocak, 2008) used different
classifiers, such as the Vector Space Model (VSM), Support
Vector Machines (SVM), and Naive Bayes (NB),
to distinguish between 5 categories of emotions.
2 EMOTION CLASSIFICATION
Emotional Corpus. The chosen corpus for our ex-
periment is from SemEval 2007, task 14 (Strappar-
ava and Mihalcea, 2008), proposed at the conference
with the same name. The data set contains headlines
(newspaper titles) from major news websites, such as the
New York Times, CNN, BBC, and Google News.
The corpus was manually annotated by 6 different
persons. They were instructed to annotate the headlines
with emotions according to the presence of affective
words or groups of words with emotional content.
The annotation scheme used for this corpus is the
basic six-emotion set presented by Ekman: Anger,
Disgust, Fear, Joy (Happiness), Sadness, Surprise. In
situations where the emotion was uncertain, they were
instructed to follow their first feeling. The data is
annotated on a 0 to 100 scale for each emotion.
The authors of the corpus proposed a double evaluation,
on a fine-grained scale and on a coarse-grained
scale. For the fine-grained evaluation, where values
range from 0 to 100, the system scores are compared
to the gold standard using the Pearson correlation
coefficient, the same measure used for the inter-annotator
agreement. The second proposition was a coarse-grained
encoding, where every value in the 0 to 100 interval
is mapped to either 0 or 1 (0 = [0, 50),
1 = [50, 100]). For the coarse-grained evaluation,
a simple overlap measure was used.
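Both evaluation modes can be sketched in a few lines. The following is a minimal illustration for a single emotion; the score values are invented for the example, not taken from the actual corpus:

```python
import numpy as np

# Hypothetical 0-100 scores for one emotion over six headlines:
# gold-standard annotations vs. system predictions (invented values).
gold = np.array([80, 10, 55, 0, 30, 90], dtype=float)
pred = np.array([70, 20, 60, 5, 40, 85], dtype=float)

# Fine-grained evaluation: Pearson correlation between the score lists.
pearson = np.corrcoef(gold, pred)[0, 1]

# Coarse-grained evaluation: map [0, 50) -> 0 and [50, 100] -> 1,
# then measure the simple overlap (fraction of matching binary labels).
gold_bin = (gold >= 50).astype(int)
pred_bin = (pred >= 50).astype(int)
overlap = (gold_bin == pred_bin).mean()

print(f"Pearson: {pearson:.3f}, overlap: {overlap:.2f}")
```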
Classification Model. The classifier we have cho-
sen is a commonly used unsupervised method, the
Self-Organizing Maps (SOM) (Kohonen, 1990). This
method is a particular type of neural network used for
mapping high-dimensional spaces onto low-dimensional
ones. The SOM has been chosen because: 1)
it usually offers good results on fuzzy data, 2) its
training process is simpler than that of other neural
networks, and 3) its classification speed is sufficiently high.
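As a rough sketch of how such a map is trained, the following is a minimal NumPy implementation of the classic Kohonen update rule; the grid size, learning-rate schedule, and neighbourhood radius are assumptions for illustration, not the configuration used in the experiment:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid=(4, 4), epochs=50, lr0=0.5, sigma0=1.5):
    """Minimal SOM: maps `data` (n_samples x n_features) onto a 2-D grid."""
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))
    coords = np.array([[r, c] for r in range(rows) for c in range(cols)],
                      dtype=float).reshape(rows, cols, 2)
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)              # decaying learning rate
        sigma = sigma0 * (1 - epoch / epochs) + 0.1  # shrinking neighbourhood
        for x in data:
            # Best-matching unit: node whose weight vector is closest to x.
            dists = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(dists.argmin(), dists.shape)
            # Pull the BMU and its grid neighbours toward x.
            grid_dist = np.linalg.norm(coords - coords[bmu], axis=2)
            influence = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))
            weights += lr * influence[..., None] * (x - weights)
    return weights

def best_unit(weights, x):
    dists = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(dists.argmin(), dists.shape)

# Example: map 30 random 5-dimensional feature vectors onto a 4x4 grid.
data = rng.random((30, 5))
weights = train_som(data)
```

After training, each headline's feature vector is assigned to its best-matching unit, and nearby units on the grid correspond to similar inputs.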
Preprocessing Step. During the preprocessing step,
we applied a collection of filters to each headline,
in order to remove any useless information, such as
special characters and punctuation, camel-case
separators, and stop words¹.
This method offers a good balance between speed
and accuracy of the results, compared to other meth-
ods like Part of Speech Tagging (POS), which pro-
vides comparable results, but tends to be slower.
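A minimal sketch of such a filtering pipeline might look as follows; the stop-word list here is a small assumed subset of a common-English list, not the full list used in the experiment:

```python
import re

# Illustrative stop-word subset (the experiment uses a larger
# common-English stop-word list).
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "or", "for"}

def preprocess(headline):
    # Split camel-case tokens (e.g. "NewsFlash" -> "News Flash").
    text = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", headline)
    # Remove special characters and punctuation.
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    # Lowercase, tokenise, drop stop words and other very short tokens
    # that carry no semantic value.
    return [t for t in text.lower().split()
            if t not in STOP_WORDS and len(t) > 1]

print(preprocess("Mortgage Crisis Hits the U.S. Economy!"))
```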
Feature Extraction. We have chosen LSA as the feature
extraction method. All occurrences of the key terms
are counted and stored in a matrix (one row per
keyword, one column per headline). The term set
(the keywords) is chosen according to three different
strategies.
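The matrix construction and the LSA reduction can be illustrated with a toy example; the headlines and keyword set below are invented for the illustration, while in the experiment the keywords come from the three strategies described next:

```python
import numpy as np

# Toy tokenised headlines and keyword set (invented for illustration).
headlines = [
    ["earthquake", "kills", "hundreds"],
    ["team", "wins", "championship"],
    ["earthquake", "fear", "spreads"],
]
keywords = sorted({w for h in headlines for w in h})

# Term-document matrix: one row per keyword, one column per headline.
M = np.zeros((len(keywords), len(headlines)))
for j, h in enumerate(headlines):
    for w in h:
        M[keywords.index(w), j] += 1

# LSA: a truncated SVD keeps only the k largest singular values,
# giving a low-rank representation of each headline.
U, S, Vt = np.linalg.svd(M, full_matrices=False)
k = 2
doc_vectors = (np.diag(S[:k]) @ Vt[:k]).T  # one k-dim vector per headline

print(M.shape, doc_vectors.shape)
```

Each headline is thus represented by a compact k-dimensional vector, which serves as input to the classifier.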
The first LSA strategy we implemented applies the
algorithm to the words of the WordNet
Affect database (Strapparava and Valitutti, 2004).
This method is called pseudo-LSA or meta-LSA by
C. Strapparava and R. Mihalcea (Strapparava and Mi-
halcea, 2008). The meta-LSA algorithm differs from
the classic implementation by using clusters of words
instead of single words. This strategy did not provide
the expected results: the recall decreased, since
all of the included words carried an emotional
value and non-emotional words were not represented.
Our version confirms the results obtained by
Mihalcea and Strapparava.
The second strategy uses classic LSA applied
to the words of the training set. While the generality
of this approach is limited by the supporting
word collection, the method offers a good starting
point when the training and testing corpora are similar.
Our third proposition was to use the top 10 000
most frequent English words, extracted from
approximately 1 000 000 documents in the Project
Gutenberg² collection. The features used are the document sim-

¹We considered as stop words all prepositions, articles
and other short words that do not carry any semantic value
(e.g. http://www.textfixer.com/resources/common-english-words.txt).

²Project Gutenberg is a large collection of e-books,
processed and reviewed by the project’s community.
ICAART 2012 - International Conference on Agents and Artificial Intelligence