plications, the amount of labeled data for a particular domain can be limited, and it is therefore interesting to consider cross-domain classifiers, that is, classifiers that leverage training data from a source domain to learn a classifier for a target domain with limited labeled data. For example, we can use books as the source domain, while the target domain can be music, DVDs, movies, electronics, clothing, toys, etc.
Generally, a classifier built on one domain (i.e., the source domain) does not perform well when used to classify the sentiment in another domain (i.e., the target domain). One reason for this is that certain words express the overall polarity of a sentence in one domain, while the same words can have a different meaning or polarity in another domain. If we consider kitchen appliances and cameras as our domains, then words such as good and excellent express positive sentiments in both the kitchen appliance domain and the camera domain, and words such as bad and worse express a negative sentiment in both domains; these are known as domain independent words. On the other hand, words such as safe, stainless, sturdy, and efficient express sentiments in the kitchen domain, but may or may not express any sentiment in the camera domain; these are known as domain dependent (or domain specific) words.
In cross-domain classification, the general goal is to use labeled data in the source domain and, possibly, some labeled data in the target domain, together with unlabeled data from the target domain, to learn cross-domain classifiers for predicting the sentiment of future target instances. The cross-domain sentiment classification problem presents additional challenges compared to the corresponding problem in a single domain. Using both source and target data to construct the classifier requires substantial insight and effort, specifically with respect to choosing source features that are predictive for the target domain, and to combining data or classifiers from the source and target domains.
To address the first problem, most recent approaches (Blitzer et al., 2006; Blitzer et al., 2007; Tan et al., 2009) identify domain independent features (a.k.a. generalized or pivot features) to represent the source, and domain specific features to represent the target. Domain independent features serve as a bridge between source and target, thus reducing the gap between them. The performance of the final classifier depends heavily on the domain independent features; therefore, care must be taken when selecting them. In this work, we use NLP syntactic parse trees to generate features. Domain independent features are selected based on the Frequently Co-occurring Entropy (FCE) method proposed by Tan et al. (2009): features with high entropy values are assumed to be domain independent and are used to represent the source domain. Furthermore, to combine source and target data, we use an Expectation Maximization (EM) based Naïve Bayes classifier, also proposed by Tan et al. (2009). Originally, the approach in (Tan et al., 2009) assumes labeled source data and unlabeled target data. In our implementation, we can also incorporate labeled target domain data, if available. As the number of iterations increases, we reduce the weight of the source domain instances while increasing the weight of the target domain instances, so that the resulting classifier can ultimately be used for predicting target domain instances.
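To make these two steps concrete, the sketch below illustrates the general idea in Python: an FCE-style score that favors features occurring frequently, and with similar relative frequency, in both domains, and a weight schedule that gradually shifts emphasis from source to target instances across EM iterations. The scoring formula, the smoothing constant alpha, the linear schedule, and all function and variable names are illustrative assumptions for this sketch; they are not the exact formulation of Tan et al. (2009) or of our implementation.

import math
from collections import Counter

def fce_scores(source_docs, target_docs, alpha=1e-4):
    # Score features with a frequently-co-occurring-entropy-style measure:
    # a feature scores high when it occurs often in BOTH domains with
    # similar relative frequency, and is then treated as domain independent
    # (pivot).  The exact formula is an illustrative assumption.
    src_df, tgt_df = Counter(), Counter()
    for doc in source_docs:
        src_df.update(set(doc))            # document frequency in the source
    for doc in target_docs:
        tgt_df.update(set(doc))            # document frequency in the target
    n_src, n_tgt = len(source_docs), len(target_docs)
    scores = {}
    for w in set(src_df) | set(tgt_df):
        p_src = (src_df[w] + alpha) / (n_src + 2 * alpha)
        p_tgt = (tgt_df[w] + alpha) / (n_tgt + 2 * alpha)
        scores[w] = math.log(p_src * p_tgt / (abs(p_src - p_tgt) + alpha))
    return scores

def select_pivots(source_docs, target_docs, k=500):
    # Keep the top-k features as domain independent (pivot) features.
    scores = fce_scores(source_docs, target_docs)
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def source_target_weights(num_iterations=10):
    # Per-iteration interpolation weights: early iterations rely mostly on
    # the labeled source data, later ones on the (pseudo-labeled) target
    # data.  The linear schedule is an assumed, illustrative choice.
    for t in range(num_iterations + 1):
        lam = t / num_iterations
        yield 1.0 - lam, lam               # (source weight, target weight)

In an EM-style loop, each iteration would use the current Naïve Bayes model to assign probabilistic labels to the unlabeled target documents (E-step) and then re-estimate the model parameters from source and target counts weighted by the pair of values returned above (M-step).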
2 RELATED WORK
Sentiment classification across domains is a very challenging problem. Classifiers trained on one domain cannot always accurately predict the instances from a different domain, because domain-specific features can have different meanings in different domains. The main challenges when performing sentiment classification experiments consist of selecting appropriate features and the right machine learning algorithm for a particular dataset.
Relevant to our work, in the context of single domain sentiment classification, Harb et al. (2008) introduced the AMOD (Automatic Mining of Opinion Dictionaries) approach, consisting of three phases. The first phase, the Corpora Acquisition Learning Phase, addresses the major challenge of obtaining data by automatically extracting it from the web using a predefined set of seed words (positive and negative terms). The second phase, the Adjective Extraction Phase, extracts lists of adjectives carrying positive and negative opinions. The third phase, the Classification Phase, classifies the given documents using the adjectives extracted in the second phase. The authors used unigrams as AMOD features and then used the adjective lists to classify the given documents. Using a movie review dataset and a car dataset, their results show that the AMOD approach was able to classify documents by using a list of adjectives within a single domain.
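The Classification Phase described above amounts to lexicon-based classification. Purely as an illustration of that idea (not Harb et al.'s actual implementation), a document could be labeled by counting how many positive versus negative adjectives from the learned lists it contains; the function name, the tie-breaking rule, and the example lists below are assumptions.

def classify_with_lexicon(tokens, positive_adjs, negative_adjs):
    # Label a tokenized document by counting opinion adjectives: positive if
    # it contains at least as many positive as negative adjectives from the
    # learned lists, negative otherwise (illustrative sketch only).
    pos_hits = sum(1 for t in tokens if t.lower() in positive_adjs)
    neg_hits = sum(1 for t in tokens if t.lower() in negative_adjs)
    return "positive" if pos_hits >= neg_hits else "negative"

# Example with tiny hand-made lists (assumed, for illustration only):
# classify_with_lexicon("a great and reliable camera".split(),
#                       {"great", "reliable"}, {"poor", "noisy"})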
Zhang et al. (2010) proposed to use several types of syntax subtrees as features, where the subtrees are obtained from complete syntax trees by using both adjective and sentiment word pruning strategies. The syntax trees are derived using the Stanford parser. These features were found to be very effective for classification in a single domain scenario.
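As a purely illustrative sketch of what subtree features can look like (not Zhang et al.'s specific pruning strategies), the following code extracts subtrees rooted at adjective (JJ*) nodes from a bracketed parse and serializes them as string features; it uses NLTK's Tree type for convenience, and the input format and function name are assumptions.

from nltk import Tree

def adjective_subtree_features(parse_str):
    # Extract subtrees rooted at adjective (JJ, JJR, JJS) nodes from a
    # bracketed parse and serialize each one as a string feature
    # (illustrative sketch only).
    tree = Tree.fromstring(parse_str)
    feats = []
    for sub in tree.subtrees(lambda t: t.label().startswith("JJ")):
        feats.append(" ".join(str(sub).split()))   # normalize whitespace
    return feats

# Example with a hand-written parse (assumed input format):
# adjective_subtree_features("(NP (JJ great) (NN camera))")  ->  ['(JJ great)']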
Blitzer et al. (2007) introduced a domain adap-