for determining the polarity of a word. The proposed
method detects a higher number of positive and
negative headlines (better recall), makes fewer
mistakes (better precision) and detects more neutral
headlines (better accuracy).
Finally, we compare our method with the
systems participating in SemEval 2007 Task 14 (see
Table 3). The unsupervised systems CLaC and UPAR7
have very low recall and high precision and,
therefore, a very low F1, indicating that few
headlines (about 35 of 410) are classified as
positive or negative. Most headlines are classified
as neutral; the accuracy of these systems is
therefore artificially high because of the class
imbalance in the data (155 positive, 255 negative and
590 neutral headlines).
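The effect of the class imbalance on accuracy can be illustrated with a small calculation. The sketch below is not one of the evaluated systems: it assumes a hypothetical baseline that labels every headline as neutral, and it uses only the class counts reported above.

```python
# Class counts reported for the test data (positive, negative, neutral).
counts = {"positive": 155, "negative": 255, "neutral": 590}


def all_neutral_baseline(counts):
    """Accuracy and polar recall of a hypothetical classifier that
    labels every headline as neutral (illustration only)."""
    total = sum(counts.values())
    # Every neutral headline is correct; every polar headline is wrong.
    accuracy = counts["neutral"] / total
    polar = counts["positive"] + counts["negative"]
    # No positive or negative headline is ever detected.
    polar_recall = 0 / polar
    return accuracy, polar_recall


accuracy, polar_recall = all_neutral_baseline(counts)
print(f"accuracy = {accuracy:.1%}, recall on polar headlines = {polar_recall:.1%}")
```

Such a baseline reaches 59% accuracy without detecting a single positive or negative headline, which is why accuracy alone is a misleading indicator on this data.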
On the other hand, the supervised systems
(except SWAT, which obtains very poor results)
behave differently from the unsupervised systems:
they have high recall but low precision. These
systems detect a greater number of positive and
negative headlines, but misclassify many neutral
ones. Hence, they achieve low accuracy.
Table 3: Results of the valence annotation.
System          Acc.    Prec.   Rec.    F1
Unsupervised methods
CLaC            55.10   61.42    9.20   16.00
UPAR7           55.00   57.54    8.78   15.24
Our method      44.3    37.66   72.11   49.41
Supervised methods
SWAT            53.20   45.71    3.42    6.36
CLaC-NB         31.20   31.18   66.38   42.43
SICS            29.00   28.41   60.17   38.60
As Table 3 shows, the proposed method obtains
the best F1 score and recall among both supervised
and unsupervised systems, while achieving acceptable
values of precision and accuracy. Therefore, we can
conclude that our method presents a more balanced
behavior, that is, it performs reasonably well in
the three classes: positive, negative and neutral.
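F1 is the harmonic mean of precision and recall, so a system must do reasonably well on both to score highly. As a minimal sketch (the function name `f1_score` is illustrative and not part of any evaluated system), the following recomputes F1 from the precision and recall reported in Table 3 for two of the systems:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values taken from Table 3.
clac_f1 = f1_score(61.42, 9.20)      # CLaC (unsupervised)
clac_nb_f1 = f1_score(31.18, 66.38)  # CLaC-NB (supervised)

print(round(clac_f1, 2))
print(round(clac_nb_f1, 2))
```

Because the harmonic mean is dominated by the smaller of the two values, a system with high precision but very low recall, such as CLaC, ends up with an F1 far below its precision.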
5 CONCLUSIONS
In this paper, a new unsupervised method for opinion
polarity detection has been introduced. Its most
important novelty is the use of word sense
disambiguation together with standard external
resources for determining the polarity of the
opinions. These resources allow the method to be
extended to other languages and be independent of
the knowledge domain.
The experiments carried out on the data of
SemEval 2007 Task 14 validate the usefulness of word
sense disambiguation for determining the polarity of
opinions. We have also shown that the proposed
method outperforms both unsupervised and
supervised systems participating in the competition.
Future work includes testing alternative
resources for polarity detection. We believe that in
many cases our approach fails because of wrong
annotations in SentiWordNet. We also plan to
evaluate the proposed method on other test
collections from different knowledge domains.
REFERENCES
Agirre, E., Soroa, A. (2007). SemEval-2007 Task 02:
Evaluating word sense induction and discrimination
systems. Proceedings of the 4th International
Workshop on Semantic Evaluations (SemEval-2007),
7-12.
Anaya-Sánchez, H., Pons-Porrata, A., Berlanga-Llavori,
R. (2006). Word Sense Disambiguation based on
Word Sense Clustering. In J. Simão; H. Coelho and S.
Oliveira (Eds.), Lecture Notes in Artificial
Intelligence: Vol. 4140. IBERAMIA-SBIA (pp. 472-
481). Ribeirão Preto, Brazil: Springer.
Esuli, A., Sebastiani, F. (2006). SentiWordNet: A Publicly
Available Lexical Resource for Opinion Mining.
Proceedings of the Fifth International Conference on
Language Resources and Evaluation (LREC 2006),
417-422.
Gil-García, R., Badía-Contelles, J. M., Pons-Porrata, A.
(2003). Extended Star Clustering Algorithm. In A.
Sanfeliu and J. Ruiz-Shulcloper (Eds.), Lecture Notes
in Computer Science: Vol. 2905. 8th Iberoamerican
Congress on Pattern Recognition (CIARP) (pp. 480-
487). Berlin, Heidelberg: Springer-Verlag.
Miller, G.A., Beckwith, R., Fellbaum, C., Gross, D.,
Miller, K. (1993). Introduction to WordNet: An On-
line Lexical Database. International Journal of
Lexicography, 3(4), 235-244.
Schmid, H. (1994). Probabilistic Part-of-speech Tagging
Using Decision Trees. Proceedings of the Conference on
New Methods in Language Processing, 44-49.
Stone, P. J., Dunphy, D. C., Smith, M. S., Ogilvie, D. M.
(1966). The General Inquirer: A Computer Approach
to Content Analysis. The American Journal of
Sociology, 73(5), 634-635.
Strapparava, C., Mihalcea, R. (2007). SemEval-2007
Task 14: Affective Text. Proceedings of the 4th
International Workshop on Semantic Evaluations
(SemEval 2007), 70-74.
ICAART 2010 - 2nd International Conference on Agents and Artificial Intelligence