appeared in 2003 in (Nasukawa and Yi, 2003); in the
same year another prominent name for the domain,
opinion mining, was proposed in (Dave et al., 2003).
Both the terms sentiment analysis and opinion mining
are used in the literature, along with many related,
though a bit more specific, terms like review mining,
opinion affect analysis, sentiment mining or emotion
analysis (Liu, 2012).
Sentiment analysis has, from its beginning, focused on
extracting information about users’ emotional
attitudes from large corpora of documents, especially
from social media. In comparison with its ancestor
fields (NLP and affective computing) the problems
here are approached more directly and specifically.
Researchers do not aim at creating methods for
perfect understanding of the texts being analyzed. The
texts are very often treated as bags of words,
exposing features based on the presence or absence
of specific words (or their co-occurrence, expressed as
n-grams: pairs, triples, etc. of words). Likewise, the
emotions expressed in the texts are typically not
identified in much detail. The usual outcome
of the analysis is bipolar: emotions are identified as
positive or negative (sometimes neutral).
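The bag-of-words view described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited works; the function names (tokenize, ngrams, presence_features), the regular-expression tokenizer and the example sentence are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return the n-grams (as tuples) found in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def presence_features(text, max_n=2):
    """Set of all n-grams (n = 1..max_n) present in the text.

    Word counts and word order beyond the n-gram window are discarded,
    which is the bag-of-words simplification.
    """
    tokens = tokenize(text)
    features = set()
    for n in range(1, max_n + 1):
        features.update(ngrams(tokens, n))
    return features


# Hypothetical example sentence, used for illustration only.
features = presence_features("The display is not good")
print(("good",) in features)          # presence of a unigram
print(("not", "good") in features)    # presence of a bigram
print(("excellent",) in features)     # absence of a word
```

Note that the unigram "good" alone would suggest a positive text, while the bigram ("not", "good") captures the negation; this is one reason n-grams are used alongside single words.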
The range of phenomena being analyzed is quite
broad, and includes sentiments, emotions,
evaluations, and attitudes towards products, services,
organizations, persons, events, news etc. However,
there exists no tool suitable for handling all those
phenomena universally, and most of the algorithms
developed in the field focus on a single problem: a
specific type of text, such as a microblog entry, and a
specific object being evaluated, such as a tablet or a
mobile phone.
The strength of methods of sentiment analysis
most frequently stems from their statistical character.
For instance, the presence of the word “excellent” in a
text may be treated as a sign that the text bears a
positive opinion. This rule, while untrue in some (perhaps
many) cases, may turn out to be reliable enough when
applied to a large corpus of texts to contribute
positively to the extracted information about
expressed opinions.
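The statistical character of such a rule can be sketched in a few lines of Python. The mini corpus below is made up for illustration (it is not data from any cited study), and the helper name rule_says_positive is an assumption. The sketch only shows that a rule which fails on individual texts can still be right for most of the texts it flags.

```python
import re

# Made-up mini corpus of (text, true label) pairs; purely illustrative.
corpus = [
    ("An excellent phone, I love it", "positive"),
    ("Excellent battery life and a sharp screen", "positive"),
    ("Excellent value for the money", "positive"),
    ("Not exactly excellent, the screen cracked in a week", "negative"),
    ("Terrible support, never again", "negative"),
]


def rule_says_positive(text):
    """Naive rule: the word 'excellent' signals a positive opinion."""
    return "excellent" in re.findall(r"[a-z]+", text.lower())


# Labels of the texts flagged by the rule.
flagged = [label for text, label in corpus if rule_says_positive(text)]

# Fraction of flagged texts that are truly positive (precision of the rule).
precision = flagged.count("positive") / len(flagged)
print(len(flagged), precision)  # 4 0.75
```

On this toy corpus the rule misclassifies one negated use of the word, yet three of the four flagged texts are indeed positive.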
Interest in sentiment analysis has grown rapidly,
moving from purely scientific
interest towards many applied methods. Currently
most of the companies involved in business
intelligence (like Microsoft or SAS) offer their own
solutions for opinion mining. One of the reasons is
the very broad range of potential applications: sentiment
analysis has been used for assessing sales volume
(Liu et al., 2007), ranking sellers and products
(McGlohon et al., 2010), forecasting movie box
office revenue (Asur and Huberman, 2010), and assessing
the attitudes of stock exchange investors (Bollen et al.,
2011, on the basis of tweets; Bar-Haim et al., 2011,
using posts in expert microblogs). Sentiment analysis
has also found applications in political debate
(Tumasjan et al., 2010; Chen et al., 2010), to predict the
results of the presidential vote in the USA.
2.2 Classifying SA Methods
The field of sentiment analysis is very rich and many
papers in the domain contain proposals of
classification schemes for SA methods. Such
proposals are most frequently presented in survey
papers and books reviewing the field, and can be used
to highlight some of the most important
characteristics of the methods being classified.
One of the most classic decompositions of the
methods in the field was presented in (Feldman,
2013). The methods are classified along two
dimensions. The first dimension is granularity,
i.e. the degree to which a method investigates the
contents of a document. While not precisely
distinguished in (Feldman, 2013), these
degrees can be ordered into the following hierarchy:
Document-level analysis,
Sentence-level analysis,
Aspect-level analysis.
Analysis at a document level is the most
straightforward way of assessing sentiment. Methods
at this level assign sentiment orientation to whole
documents, most frequently in the bipolar form of
a positive/negative score. Sentence-level analysis
consists in assigning orientation to individual
sentences. Working at this level can help in
detecting mixed opinions about the object of
sentiment, and is also useful when some kinds
of sentences should be treated in a special way (for
example, when sarcastic sentences should be filtered
out). Aspect-level analysis allows sentiment to be
assigned not only directly to the object being
assessed but also to its “parts”, known as aspects.
Aspects need not be physical parts of
the object; they may also refer to its features (like
“display quality”). Assessing at the aspect level
assigns a sentiment score to parts and features of
the object and, consequently, allows interesting
information to be extracted from mixed opinions as well. At
the end of such an analysis the user may be presented with
a more detailed report with a score for each of the
aspects.
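The three levels of granularity can be contrasted with a minimal Python sketch. It assumes a toy polarity lexicon, a fixed list of aspect terms and a naive sentence splitter, all of which are hypothetical and chosen for illustration only; it is not the method of (Feldman, 2013) or of any cited work. In particular, attributing a whole sentence's score to every aspect it mentions is a deliberate simplification.

```python
import re

# Hypothetical toy lexicon and aspect list, for illustration only.
LEXICON = {"great": 1, "excellent": 1, "sharp": 1, "poor": -1, "short": -1}
ASPECTS = ["display", "battery"]


def tokens(text):
    """Lowercase word tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())


def score(text):
    """Sum of the lexicon polarities of the words in the text."""
    return sum(LEXICON.get(t, 0) for t in tokens(text))


def split_sentences(document):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]


def document_level(document):
    """One score for the whole document."""
    return score(document)


def sentence_level(document):
    """One score per sentence."""
    return [(s, score(s)) for s in split_sentences(document)]


def aspect_level(document):
    """Attribute each sentence's score to every aspect term it mentions."""
    result = {aspect: 0 for aspect in ASPECTS}
    for sentence in split_sentences(document):
        words = set(tokens(sentence))
        for aspect in ASPECTS:
            if aspect in words:
                result[aspect] += score(sentence)
    return result


# Made-up review with a mixed opinion.
review = "The display is sharp and great. The battery life is poor."
print(document_level(review))   # 1
print(sentence_level(review))
print(aspect_level(review))     # {'display': 2, 'battery': -1}
```

The document-level score of 1 hides the mixed opinion, whereas the sentence-level and aspect-level outputs expose the positive assessment of the display and the negative assessment of the battery.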
The analysis-level dimension is complemented by the
division of methods by the learning technique
applied. Here we distinguish supervised and
unsupervised methods. In supervised learning we