Product Feature Taxonomy Learning based on User Reviews

Nan Tian, Yue Xu, Yuefeng Li, Ahmad Abdel-Hafez and Audun Josang

Faculty of Science and Engineering, Queensland University of Technology, Brisbane, Australia

Department of Informatics, University of Oslo, Oslo, Norway

Keywords:

Feature Extraction, Opinion Mining, Association Rules, Feature Taxonomy, User Reviews.

Abstract:

In recent years, Web 2.0 has made it easy for people to create, share and exchange information and ideas. As a result, user-generated content, such as reviews, has exploded. Such data provide a rich source to exploit in order to identify the information associated with specific reviewed items. Opinion mining has been widely used to identify the significant features of items (e.g., cameras) based upon user reviews. Feature extraction is the most critical step to identify useful information from texts. Most existing approaches only find individual features of a product without revealing the structural relationships that usually exist between the features. In this paper, we propose an approach to extract features and feature relationships, represented as a tree structure called a feature taxonomy, based on frequent patterns and associations between patterns derived from user reviews. The generated feature taxonomy profiles the product at multiple levels and provides more detailed information about the product. Experimental results on several widely used review datasets show that our proposed approach is able to capture the product features and relations effectively.

1 INTRODUCTION

In recent years, user-generated online content has exploded due to the advent of Web 2.0. For instance, online users write reviews to describe how much they enjoy or dislike a product they purchased. These reviews help to identify features or characteristics of the product from the users’ point of view, which is an important addition to the product specification. However, identifying the relevant features from users’ subjective review data is extremely challenging.

Feature-based opinion mining has attracted considerable attention recently. A significant amount of research has been conducted to improve the accuracy of feature generation for products (Hu and Liu, 2004a; Scaffidi et al., 2007; Hu et al., 2010; Zhang and Zhu, 2013; Popescu and Etzioni, 2005; Ding et al., 2008). However, most techniques only extract features; the structural relationships between product features are omitted. For example, “picture resolution” is a common feature of digital cameras, in which “resolution” is a specific feature concept that describes the general feature “picture”. Yet existing approaches treat “resolution” and “picture” as two individual features instead of finding the relationship between them. Thus, the information derived by existing feature extraction approaches is not sufficient for generating a precise product model, since all features are placed at the same level and treated as independent of each other.

Association rule mining is a well-explored method in data mining (Pasquier et al., 1999). Based on association rules generated from a collection of item transactions, we can discover the relations between items. However, the number of generated association rules is usually huge and selecting the most useful rules is challenging (Xu et al., 2011). In our research, we propose to identify a group of frequent patterns as potential features to assist in selecting useful association rules. The selected rules are used to identify relationships between features. Furthermore, to ensure that the most useful rules are selected, we also propose to apply a statistical topic modelling technique (Blei et al., 2003) to the selection of association rules.

Our approach takes advantage of existing feature extraction approaches and makes two contributions. Firstly, we present a method that makes use of association rules to find related features. Secondly, we create a product model called a feature taxonomy, which represents the product more accurately by explicitly representing the concrete relationships between general features and specific features.

Tian N., Xu Y., Li Y., Abdel-Hafez A. and Josang A..

Product Feature Taxonomy Learning based on User Reviews.

DOI: 10.5220/0004850201840192

In Proceedings of the 10th International Conference on Web Information Systems and Technologies (WEBIST-2014), pages 184-192

ISBN: 978-989-758-024-6

 2014 SCITEPRESS (Science and Technology Publications, Lda.)

2 RELATED WORK

Our research aims to extract useful product information from user-generated content in order to create a product model. This work is closely related to feature-based opinion mining, which has drawn many researchers’ attention in recent years. Identifying the features that have been mentioned by users is considered the most significant step in opinion mining (Hai et al., 2013). Hu and Liu (2004) first proposed a feature-based opinion mining method to extract features and sentiments from customer reviews. They use pattern mining to find frequent itemsets (nouns). These itemsets are pruned and considered frequent product features. A list of sentiment words (adjectives) that occur near frequent features in reviews is then extracted and used to identify those product features that cannot be identified by pattern mining. Scaffidi et al. (2007) improved the performance of feature extraction in their proposed system called Red Opal. Specifically, they made use of a language model to find features by comparing the frequency of nouns in the reviews with their frequency in common English usage. Nouns that are frequent both in reviews and in common usage are considered invalid features. Hu et al. (2010) make use of SentiWordNet to identify all sentences that may contain users’ sentiment polarity. Pattern mining is then applied to generate explicit features based on these opinionated sentences. In addition, a mapping database has been constructed to find implicit features represented by sentiment words (e.g., “expensive” indicates price). To enhance the accuracy of finding correct features in free-text reviews, Hai et al. (2013) proposed a method which evaluates the domain relevance of a feature by exploiting the disparities in the feature’s distribution across different corpora (a domain-dependent review corpus such as cellphone reviews and a domain-irrelevant corpus such as a culture article collection). In detail, the intrinsic-domain relevance (IDR) and extrinsic-domain relevance (EDR) have been proposed to benchmark whether an examined feature is related to a certain domain. Candidate features with low IDR and high EDR scores are pruned.

Lau et al. (2009) presented an ontology-based

approach to proﬁle the product. In detail, a number

of ontology levels, such as feature level that contains

identiﬁed features for a certain product and sentiment

level in which sentiment words that describe a certain

feature are stored, have been constructed (Lau et al.,

2009). This method provides a simple product proﬁle

rather than extracting product features only.

Statistical topic modeling has been used in various fields such as text mining (Blei et al., 2003; Hofmann, 2001) in recent years. Latent Semantic Analysis (LSA) was first proposed to capture the most significant features of a document collection based upon the semantic structure of relevant documents (Lewis, 1992). Then, Probabilistic LSA (pLSA) (Hofmann, 2001) and Latent Dirichlet Allocation (LDA) (Blei et al., 2003) were proposed to improve the interpretation of results from LSA. These techniques have proven more effective for document modeling and topic extraction, which are represented by the topic-document and word-topic distributions, respectively. In particular, a multinomial distribution over words, derived from word frequency, can be generated to represent each topic in a given text collection.

None of the aforementioned feature identification approaches is able to identify the relationships between the extracted product features. The structural relationships that exist between features can be used to describe the reviewed product in more depth. However, evaluating and determining the relations between features remains challenging.

The remainder of the paper is organized as fol-

lows. The next section illustrates the construction

process of our proposed feature taxonomy. Then, the

evaluation of our approach is reported afterwards. Fi-

nally, we conclude and describe future direction of

our research work.

3 THE PROPOSED APPROACH

Our proposed approach consists of two main steps:

product taxonomy construction using association

rules and taxonomy expansion based on reference fea-

tures. The input of our system is a collection of user

reviews for a certain product. The output is a product

feature taxonomy which contains not only all gener-

ated features but also the relationships between them.

3.1 Pre-processing and Transaction File Generation

First of all, we construct a single document called an aggregated review document, which combines all the reviews in a collection of reviews, keeping each sentence in the original reviews as one sentence in the aggregated review document. Three steps are undertaken to process the review text in order to extract useful information. Firstly, we generate the part-of-speech (POS) tag for each word in the aggregated review document to indicate whether the word is a noun, adjective, adverb, etc. For instance, after POS tagging, “The flash is very weak.” would be transformed to “The/DT flash/NN is/VBZ very/RB weak/JJ ./.”, where DT, NN, VBZ, RB, and JJ represent Determiner, Noun, Verb, Adverb and Adjective, respectively. Secondly, following the rule of thumb that most product features are nouns or noun phrases (Hu and Liu, 2004b), we process each sentence in the aggregated review document to keep only the words that are nouns. All the remaining nouns are also pre-processed by stemming and spelling correction. After this step, each sentence in the aggregated review document consists of all identified nouns of a sentence in the original reviews. Finally, a transactional dataset is generated from the aggregated review document. Each sentence in the aggregated review document, which consists of a sequence of nouns, is treated as a transaction in the transactional dataset.
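The pre-processing steps can be illustrated with a minimal Python sketch. It assumes the sentences have already been POS-tagged in the "word/TAG" form shown in the example, treats every tag starting with NN as a noun, and omits stemming and spelling correction. The function names and the choice to drop sentences without nouns are assumptions made for illustration, not part of the described system.

```python
def nouns_from_tagged(tagged_sentence):
    """Return the nouns of a 'word/TAG' sentence, lower-cased, in order."""
    nouns = []
    for token in tagged_sentence.split():
        word, _, tag = token.rpartition("/")
        if tag.startswith("NN"):  # NN, NNS, NNP, NNPS
            nouns.append(word.lower())
    return nouns


def build_transactions(tagged_sentences):
    """One transaction (set of nouns) per sentence; noun-free sentences are dropped."""
    transactions = []
    for sentence in tagged_sentences:
        nouns = nouns_from_tagged(sentence)
        if nouns:  # assumption: sentences without nouns carry no feature
            transactions.append(frozenset(nouns))
    return transactions


# Toy input, not taken from the review datasets.
tagged = [
    "The/DT flash/NN is/VBZ very/RB weak/JJ ./.",
    "The/DT picture/NN resolution/NN is/VBZ great/JJ ./.",
    "I/PRP love/VBP it/PRP ./.",
]
transactions = build_transactions(tagged)
print(transactions)
```

Representing each transaction as a set discards word order, which is sufficient for itemset mining but would not be enough for order-sensitive pruning.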

3.2 Potential Features Generation

Our first task is to generate potential product features that are expressed by the identified nouns or noun phrases. According to Hu and Liu (2004a), significant product features are discussed extensively by users in reviews (e.g., “battery” for cameras). For this reason, most existing feature extraction approaches make use of pattern mining techniques to find potential features. Specifically, an itemset is a set of items (i.e., words in review text in this paper) that appear together in one or multiple transactions in a transactional dataset. Given a set of items $I = \{i_1, \ldots, i_n\}$, an itemset is defined as $X \subseteq I$. The support of an itemset $X$, denoted as $Supp(X)$, is the percentage of transactions in the dataset that contain $X$. All frequent itemsets from a set of transactions that satisfy a user-specified minimum support will be extracted as the potential features. However, not all frequent itemsets are genuine features, since some of them may be frequent but meaningless. We use the compactness pruning method proposed by Hu and Liu (2004a) to filter frequent itemsets. After the pruning, we obtain a list of frequent itemsets that are considered potential features, denoted as $FP$.
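As a rough illustration of the support computation and frequent itemset extraction, the following Python sketch enumerates candidate itemsets by brute force. The function names, the `max_size` limit and the toy transactions are assumptions; a real system would use an Apriori-style miner, and compactness pruning is not shown.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force enumeration of itemsets whose support reaches min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result


# Toy transactions (sets of nouns), not taken from the review datasets.
transactions = [
    frozenset({"picture", "resolution"}),
    frozenset({"picture", "color"}),
    frozenset({"picture"}),
    frozenset({"battery"}),
]
fp = frequent_itemsets(transactions, min_support=0.5)
print(fp)
```

With a minimum support of 0.5, only the single-word itemset "picture" survives in this toy data, since it occurs in three of the four transactions.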

3.3 Product Feature Taxonomy Construction

In this step, we propose to utilize association rules

generated from the discovered potential product fea-

tures to identify relations in order to construct a fea-

ture taxonomy.

Association rule mining can be described as follows. Let $I = \{i_1, \ldots, i_n\}$ be a set of items, and let the dataset consist of a set of transactions $D = \{t_1, \ldots, t_m\}$. Each transaction $t$ contains a subset of items from $I$. An association rule $r$ represents an implication relationship between two itemsets, written in the form $X \rightarrow Y$, where $X, Y \subseteq I$ and $X \cap Y = \emptyset$. The itemsets $X$ and $Y$ are called the antecedent and consequent of the rule, respectively. To assist in selecting useful rules, the support $Supp(X \cup Y)$ and the confidence $Conf(X \rightarrow Y)$ of the rule can be used (Xu et al., 2011).
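The two rule measures can be sketched in Python as follows. The sketch uses the standard definition of confidence, $Conf(X \rightarrow Y) = Supp(X \cup Y) / Supp(X)$, which is assumed here because the text does not spell it out; the function names and the toy transactions are illustrative only.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(x, y, transactions):
    """Conf(X -> Y) = Supp(X u Y) / Supp(X); returns 0.0 if X never occurs."""
    supp_x = support(x, transactions)
    if supp_x == 0:
        return 0.0
    return support(frozenset(x) | frozenset(y), transactions) / supp_x


# Toy transactions, not taken from the review datasets.
transactions = [
    frozenset({"picture", "resolution"}),
    frozenset({"picture", "resolution"}),
    frozenset({"picture", "color"}),
    frozenset({"picture"}),
]
conf_forward = confidence({"picture"}, {"resolution"}, transactions)
conf_backward = confidence({"resolution"}, {"picture"}, transactions)
print(conf_forward, conf_backward)
```

The asymmetry between the two directions is what makes the rule orientation informative: "resolution" always co-occurs with "picture" in this toy data, while the converse holds only half of the time.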

To describe our approach, we first define some useful and important concepts:

Definition 1 (Feature Taxonomy): A feature taxonomy consists of a set of features and their relationships, denoted as $FH = \{F, L\}$, where $F = \{f_1, f_2, \ldots, f_n\}$ is a set of features and $L$ is a set of relations. The feature taxonomy has the following constraints:

(1) The relationship between a pair of features is the sub-feature relationship. For $f_i, f_j \in F$, if $f_j$ is a sub feature of $f_i$, then $(f_i, f_j)$ is a link in the taxonomy and $(f_i, f_j) \in L$, which indicates that $f_j$ is more specific than $f_i$. $f_i$ is called the parent feature of $f_j$ and denoted as $P(f_j)$.

(2) Except for the root, each feature has only one parent feature. This means that the taxonomy is structured as a tree.

(3) The root of the taxonomy represents the product itself.

Definition 2 (Feature Existence): For a given feature taxonomy $FH = \{F, L\}$, let $W(g)$ represent the set of words that appear in a potential feature $g$, and let $ES(g) = \{a_i \mid a_i \in 2^{W(g)}, a_i \in F\}$ contain all subsets of $g$ which exist in the feature taxonomy. $ES(g)$ is called the existing subsets of $g$. If $\bigcup_{a_i \in ES(g)} W(a_i) = W(g)$, then $g$ is considered to exist in $FH$, denoted as $exist(g)$; otherwise $\neg exist(g)$.
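A small Python sketch may help to read Definition 2. It assumes that every feature is represented as a frozenset of its words, so that $W(g)$ is the feature itself; the function names and the toy taxonomy are illustrative assumptions.

```python
def existing_subsets(g_words, taxonomy_features):
    """ES(g): taxonomy features whose word sets are non-empty subsets of W(g)."""
    g_words = frozenset(g_words)
    return {f for f in taxonomy_features if f and f <= g_words}


def exists(g_words, taxonomy_features):
    """exist(g): the words of the existing subsets together cover W(g)."""
    covered = frozenset().union(*existing_subsets(g_words, taxonomy_features))
    return covered == frozenset(g_words)


# Toy taxonomy features, each a frozenset of words.
F = {frozenset({"camera"}), frozenset({"picture"}), frozenset({"resolution"})}

covered_by_parts = exists({"picture", "resolution"}, F)
partly_covered = exists({"picture", "quality"}, F)
print(covered_by_parts, partly_covered)
```

Under this reading, "picture resolution" counts as existing even though it is not itself a node, because "picture" and "resolution" are both in the taxonomy, whereas "picture quality" does not exist because "quality" is not covered.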

Opinion mining is also referred to as sentiment analysis (Subrahmanian and Reforgiato, 2008; Abbasi et al., 2008; Wright, 2009). Adjectives or adverbs that appear together with product features are considered the sentiment words in opinion mining. The following definition specifies the sentiment words that are related to a product feature.

Deﬁnition 3 (Related Sentiments): For a feature

f ∈ F , let RS( f ) denote a set of sentiment words

which appear in the same sentences as f in user re-

views, RS( f ) is deﬁned as the related sentiments of

f .

Definition 4 (Sentiment Sharing): For features $f_i, f_j \in F$, the sentiment sharing between $f_i$ and $f_j$ is defined as $SS(f_i, f_j) = |RS(f_i) \cap RS(f_j)|$.
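Related sentiments and sentiment sharing can be sketched in Python as below. The sketch assumes each sentence has already been reduced to a pair of sets (its nouns and its sentiment words); this data layout, the function names and the toy sentences are assumptions for illustration.

```python
def related_sentiments(feature_words, sentences):
    """RS(f): sentiment words of every sentence that mentions all words of f."""
    feature_words = frozenset(feature_words)
    related = set()
    for nouns, sentiments in sentences:
        if feature_words <= nouns:
            related |= sentiments
    return related


def sentiment_sharing(f_i, f_j, sentences):
    """SS(f_i, f_j) = |RS(f_i) n RS(f_j)|."""
    return len(related_sentiments(f_i, sentences) & related_sentiments(f_j, sentences))


# Each sentence: (set of nouns, set of adjectives/adverbs); toy data only.
sentences = [
    ({"picture"}, {"vivid", "sharp"}),
    ({"color"}, {"vivid"}),
    ({"battery"}, {"weak"}),
]
ss_picture_color = sentiment_sharing({"picture"}, {"color"}, sentences)
ss_picture_battery = sentiment_sharing({"picture"}, {"battery"}, sentences)
print(ss_picture_color, ss_picture_battery)
```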

To derive sub features using association rules, we need to select a set of useful rules rather than using all the rules. In the next two subsections, we first propose two methods to select rules: one selects rules based on the sentiment sharing among features, and the other selects rules using the word relatedness derived from the results of the topic modelling technique LDA (Blei et al., 2003). We then introduce some strategies to update the feature taxonomy by adding sub features using the selected rules.

In order to explain the topic modelling based method, we first define some related concepts. Let $RE = \{r_1, \ldots, r_M\}$ be a collection of reviews, where each review consists of nouns only, $W = \{w_1, \ldots, w_n\}$ be the set of words appearing in $RE$, and $Z = \{Z_1, \ldots, Z_v\}$ be a set of pre-specified hidden topics. LDA can be used to generate topic models that represent the collection as a whole and also each review in the collection. At the collection level, the topic model represents the collection $RE$ using a set of topics, each of which is represented by a probability distribution over words (i.e., nouns in the context of this paper). In this paper, we use the collection-level representation to find the relatedness between words.

At the collection level, each topic $Z_j$ is represented by a probability distribution over words, $\phi_j = \{p(w_1 \mid Z_j), p(w_2 \mid Z_j), \ldots, p(w_n \mid Z_j)\}$ with $\sum_{k=1}^{n} \phi_{j,k} = 1$, where $p(w_k \mid Z_j)$ is the probability of word $w_k$ being used to represent the topic $Z_j$. Based on the probability $p(w_k \mid Z_j)$, we can choose the top words to represent the topic $Z_j$.

Definition 5 (Topic Words): Let $\phi_j = \{p(w_1 \mid Z_j), p(w_2 \mid Z_j), \ldots, p(w_n \mid Z_j)\}$ be the topic representation for topic $Z_j$ produced by LDA and $0 \le \delta \le 1$ be a threshold. The set of topic words for $Z_j$, denoted as $TW(Z_j)$, is defined as $TW(Z_j) = \{w \mid w \in W, p(w \mid Z_j) > \delta\}$.

Definition 6 (Word Relatedness): We use word relatedness to indicate how likely it is that two words have been used together to represent a topic. Let $w_i, w_j \in W$ be two words. The word relatedness between the two words with respect to topic $z$ is defined as:

$$WR_z(w_i, w_j) = \begin{cases} 1 - |p(w_i \mid z) - p(w_j \mid z)| & \text{if } w_i \in TW(z) \text{ and } w_j \in TW(z) \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

Definition 7 (Feature Topic Representation): For a feature $f \in F$, let $WD(f)$ be the set of words appearing in $f$ and $TW(z)$ be the topic words of topic $z$. If $WD(f) \subset TW(z)$, the feature topic representation of feature $f$ for topic $z$ is defined as $FTP(f, z) = \{(w, p(w \mid z)) \mid w \in WD(f)\}$.

Definition 8 (Feature Relatedness): For features $f_i, f_j \in F$, if both features appear in a certain topic $z$, then the feature relatedness between $f_i$ and $f_j$ with respect to $z$ is defined as:

$$FR_z(f_i, f_j) = \min_{w_i \in WD(f_i),\, w_j \in WD(f_j)} \{WR_z(w_i, w_j)\} \tag{2}$$

3.3.1 Rule Selection

Let $R = \{r_1, \ldots, r_n\}$ be a set of association rules generated from the frequent itemsets $FP$. Each rule $r$ in $R$ has the form $X_r \rightarrow Y_r$, where $X_r$ and $Y_r$ are the antecedent and consequent of $r$, respectively.

Assuming that $f_e$ is a feature which is already in the current feature taxonomy $FH$, to generate the sub features for $f_e$, we first select a set of candidate rules, denoted as $R^c_{f_e}$, which could be used to generate the sub features:

$$R^c_{f_e} = \{X \rightarrow Y \mid X \rightarrow Y \in R,\ X = f_e,\ Supp(X) > Supp(Y)\} \tag{3}$$

As defined in Equation (3), the rules in $R^c_{f_e}$ should satisfy two constraints. The first constraint, $X = f_e$, specifies that the antecedent of a selected rule must be the same as the feature $f_e$. Sub features represent specific cases of a feature, so they are more specific than the feature itself. The second constraint is based on the assumption that more frequent itemsets usually represent more general concepts, and less frequent itemsets usually represent more specific concepts. For instance, in the reviews for camera 2 in the dataset published by Liu (Ding et al., 2008), we observe that a general feature (e.g., “picture”, with a frequency of 62) appears more frequently than a specific feature (e.g., “resolution”, with a frequency of 9). Therefore, only the rules which can derive more specific features will be selected.
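The two constraints of Equation (3) amount to a simple filter, sketched below in Python. Rules are represented as (antecedent, consequent) pairs of frozensets and supports are looked up in a dictionary; the support values and names are hypothetical and chosen only to mirror the "picture" and "resolution" example.

```python
def candidate_rules(rules, feature, supp):
    """Rules X -> Y whose antecedent equals `feature` and with Supp(X) > Supp(Y)."""
    return [(x, y) for x, y in rules if x == feature and supp[x] > supp[y]]


picture = frozenset({"picture"})
resolution = frozenset({"resolution"})
camera = frozenset({"camera"})

# Hypothetical supports; a more general feature is assumed to be more frequent.
supp = {camera: 0.80, picture: 0.62, resolution: 0.09}
rules = [(picture, resolution), (picture, camera), (camera, picture)]

candidates = candidate_rules(rules, picture, supp)
print(candidates)
```

The rule picture → camera is rejected because its consequent is more frequent than its antecedent, and camera → picture is rejected because its antecedent is not the feature being expanded.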

However, not all selected rules represent a correct sub-feature relationship. For instance, mode → auto is more appropriate for describing a sub-feature relationship than camera → auto. Therefore, the rule camera → auto should not be considered when we generate the sub features for “camera”. For this reason, we aim to prune the unnecessary rules before generating sub features for each taxonomy feature. Firstly, a feature and its sub features should share similar sentiment words since they describe the same aspect of a product at different levels of abstraction (e.g., “vivid” can be used to describe both picture and color). Therefore, we should select rules whose antecedent (representing the feature) and consequent (representing a possible sub feature) share as many sentiment words as possible, because the more sentiment words they share, the more likely they are to concern the same aspect of the product. Secondly, based on topic models generated from LDA, the more a feature and its potential sub feature appear in the same topics, the more likely they are related to each other.

Let $f_i, f_j$ be two features and $Z_{(f_i, f_j)}$ be the set of topics that contain both features. The feature relatedness between $f_i$ and $f_j$ with respect to all topics, denoted as $FR_{avg}(f_i, f_j)$, is defined as the average feature relatedness between the two features over $Z_{(f_i, f_j)}$:

$$FR_{avg}(f_i, f_j) = \frac{\sum_{z \in Z_{(f_i, f_j)}} FR_z(f_i, f_j)}{|Z_{(f_i, f_j)}|} \tag{4}$$

Based on this view, we propose the following equation to calculate a score for each candidate rule $X \rightarrow Y$ in $R^c_{f_e}$:

$$Weigh(X \rightarrow Y) = \alpha \left( Supp(Y) \times Conf(X \rightarrow Y) \right) + \beta \frac{SS(X, Y)}{|RS(X) \cup RS(Y)|} + \gamma FR_{avg}(X, Y) \tag{5}$$

where $0 < \alpha, \beta, \gamma < 1$. The values of $\alpha$, $\beta$, and $\gamma$ are set to 0.8, 0.1, and 0.1, respectively, in our experiment described in Section 4. There are three parts in Equation (5). The first part measures the belief in the consequent $Y$ obtained by using this rule, since $Conf(X \rightarrow Y)$ measures the confidence in the association between $X$ and $Y$ and $Supp(Y)$ measures the popularity of $Y$. The second part is the percentage of the shared sentiment words, given by $SS(X, Y)$, over all the sentiment words used for either $X$ or $Y$. The third part is the average feature relatedness between $X$ and $Y$. Given a threshold $\sigma$, we propose to use the following equation to select rules from the candidate rules in $R^c_{f_e}$. The rules in $R_{f_e}$ will be used to derive sub features for the features in $FP$. $R_{f_e}$ is called the rule set of $f_e$:

$$R_{f_e} = \{X \rightarrow Y \mid X \rightarrow Y \in R^c_{f_e},\ Weigh(X \rightarrow Y) > \sigma\} \tag{6}$$
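The rule score and the threshold-based selection can be sketched in Python as follows. The sketch takes the ingredients of the score (support, confidence, sentiment sharing, the size of the sentiment union and the per-topic relatedness values) as plain numbers; the function names, the input values and the handling of empty denominators are assumptions, while the default weights 0.8, 0.1 and 0.1 are the ones reported for the experiment.

```python
def fr_avg(fr_by_topic):
    """Average feature relatedness over the topics that contain both features."""
    return sum(fr_by_topic) / len(fr_by_topic) if fr_by_topic else 0.0


def weigh(supp_y, conf_xy, ss_xy, rs_union_size, fr_avg_xy,
          alpha=0.8, beta=0.1, gamma=0.1):
    """Score of a candidate rule X -> Y: weighted sum of the three parts."""
    sentiment_part = ss_xy / rs_union_size if rs_union_size else 0.0
    return alpha * (supp_y * conf_xy) + beta * sentiment_part + gamma * fr_avg_xy


def select_rules(scored_rules, sigma):
    """Keep the rules whose score exceeds the threshold sigma."""
    return [rule for rule, score in scored_rules if score > sigma]


# Hypothetical inputs: Supp(Y)=0.1, Conf=0.5, 2 shared sentiment words out of 4.
score = weigh(supp_y=0.1, conf_xy=0.5, ss_xy=2, rs_union_size=4, fr_avg_xy=0.9)
selected = select_rules([("picture -> resolution", score),
                         ("picture -> thing", 0.05)], sigma=0.1)
print(round(score, 2), selected)
```

For these inputs the three parts contribute 0.04, 0.05 and 0.09, giving a score of 0.18, so only the first rule passes a threshold of 0.1.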

3.3.2 Feature Taxonomy Construction

Let FH = {F, L} be a feature taxonomy which could

be an empty tree, FP be a set of frequent itemsets

generated from user reviews which are potential fea-

tures, and R be a set of rules generated from user re-

views. This task is to construct a feature taxonomy

if F is empty or update the feature taxonomy if F is

not empty by using the rules in R. Let UF be a set

of features on the tree which need to be processed in

order to construct or update the tree. If F is empty,

the itemset in FP which has the highest support will

be chosen as the root of FH, it will be the only item

in UF at the beginning. If F is not empty, UF will be

F, i.e., UF = F.

Without loss of generality, assume that $F$ is not empty and contains the features currently on the tree, and that $UF$ is the set of features which need to be processed to update or construct the tree. Let $f_e$ be a feature in $UF$, i.e., $f_e \in UF$, and let $X \rightarrow Y \in R_{f_e}$ be a rule with $X = f_e$. The next step is to decide whether or not $Y$ should be added to the feature taxonomy as a sub feature of $f_e$. There are two possible situations: $Y$ does not exist in the feature taxonomy, i.e., $\neg exist(Y)$, or $Y$ does exist in the taxonomy, i.e., $exist(Y)$. In the first situation, the feature taxonomy will be updated by adding $Y$ as a sub feature of $f_e$, i.e., $F = F \cup \{Y\}$, $L = L \cup \{(f_e, Y)\}$, and $Y$ should be added to $UF$ for further checking.

In the second situation, i.e., $exist(Y)$, there are two cases according to Definition 2: $Y \notin ES(Y)$ (i.e., $Y$ is not in the tree) or $Y \in ES(Y)$ (i.e., $Y$ is in the tree). In the first case, $Y$ is not considered a sub feature of $f_e$ and, consequently, no change to the tree is required. In the second case, $\exists f_p \in F$ such that $f_p$ is the parent feature of $Y$, i.e., $P(Y) = f_p$ and $(f_p, Y) \in L$. We now need to determine whether to keep $f_p$ as the parent feature of $Y$ or to change the parent feature of $Y$ to $f_e$. That is, we need to examine $f_p$ and $f_e$ to see which of them is more suitable to be the parent feature of $Y$. The basic strategy is to compare $f_p$ and $f_e$ to see which of them has more sentiment sharing and feature relatedness with $Y$. Let $f_i, f_j$ be a potential parent feature and sub feature, respectively. We propose a ranking equation to indicate how likely $f_j$ is related to $f_i$: $Q(f_i, f_j) = \frac{SS(f_i, f_j)}{|RS(f_j)|} + FR_{avg}(f_i, f_j)$. Thus, if $Q(f_p, Y) < Q(f_e, Y)$, the link $(f_p, Y)$ will be removed from the taxonomy tree and $(f_e, Y)$ will be added to the tree; otherwise, no change is made to the tree and $f_p$ remains the parent feature of $Y$.
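The re-parenting decision can be sketched in Python as below. Because the normalising term of the ranking score is hard to recover from the text, the sketch takes the size of the related-sentiment set as an argument supplied by the caller; this parameterisation, the function names and all numbers are assumptions for illustration.

```python
def q_score(ss, rs_size, fr_avg):
    """Q = SS / |RS| + FR_avg; the caller decides which |RS| is used (assumption)."""
    return (ss / rs_size if rs_size else 0.0) + fr_avg


def choose_parent(current_parent, candidate_parent, q):
    """Return the parent to keep for Y; `q` maps each parent to Q(parent, Y)."""
    if q[current_parent] < q[candidate_parent]:
        return candidate_parent  # replace the link (current, Y) by (candidate, Y)
    return current_parent        # ties keep the existing parent


# Hypothetical scores for Y = "auto" under two possible parents.
q = {
    "camera": q_score(ss=1, rs_size=10, fr_avg=0.2),
    "mode": q_score(ss=2, rs_size=4, fr_avg=0.6),
}
kept = choose_parent("camera", "mode", q)
print(kept)
```

With these hypothetical values "mode" scores higher than "camera", so "auto" would be re-attached under "mode", matching the mode → auto example given earlier.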

3.3.3 Algorithms

The construction of the feature taxonomy is to gener-

ate a feature tree by ﬁnding all sub features for each

feature. In this section, we will describe the algo-

rithms to construct the feature taxonomy. As men-

tioned above, if the tree is empty, the feature with the

highest support will be chosen as the root. So, at the

very beginning, F and UF contain at least one item

which is the root. Algorithm 1 describes the method

to construct or update a feature taxonomy.
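The construction procedure described in this section might be sketched in Python as follows. This is a simplified reading, not the authors' implementation: features are frozensets of words, the tree is stored as a child-to-parent dictionary, the selected rule sets and the ranking function are supplied by the caller, and features already on the tree are excluded from the remaining set. All names and example values are hypothetical.

```python
def build_taxonomy(fp_support, rule_sets, q, parent=None):
    """Simplified sketch of the taxonomy construction.

    fp_support: {feature: support} for the potential features FP
    rule_sets:  {feature: [Y, ...]} consequents of the selected rules of each feature
    q:          function (parent_feature, child_feature) -> ranking score Q
    parent:     existing taxonomy as {child: parent}; the root maps to None
    Returns (parent, remaining_features).
    """
    parent = dict(parent or {})
    if not parent:  # empty tree: the most frequent itemset becomes the root
        root = max(fp_support, key=fp_support.get)
        parent[root] = None
    remaining = set(fp_support) - set(parent)
    queue = list(parent)  # UF: features still to be processed
    while queue:
        f_e = queue.pop(0)
        for y in rule_sets.get(f_e, []):
            # exist(Y): the taxonomy features that are subsets of Y cover all its words
            covered = frozenset().union(*(f for f in parent if f <= y))
            if covered != y:  # Y does not exist: add it under f_e
                parent[y] = f_e
                queue.append(y)
            elif parent.get(y) is not None and q(parent[y], y) < q(f_e, y):
                parent[y] = f_e  # Y is on the tree and f_e is the better parent
            remaining.discard(y)
    return parent, remaining


camera, picture, mode = frozenset({"camera"}), frozenset({"picture"}), frozenset({"mode"})
auto, resolution, price = frozenset({"auto"}), frozenset({"resolution"}), frozenset({"price"})
fp_support = {camera: 0.8, picture: 0.6, mode: 0.3, auto: 0.1, resolution: 0.1, price: 0.05}
rule_sets = {camera: [picture, mode, auto], mode: [auto], picture: [resolution]}


def q(p, c):
    """Hypothetical ranking: "mode" is a better parent of "auto" than any other feature."""
    return 1.0 if (p, c) == (mode, auto) else 0.5


tree, left_over = build_taxonomy(fp_support, rule_sets, q)
print(tree[auto] == mode, left_over == {price})
```

In this toy run "auto" is first attached to "camera" and later moved under "mode", while "price" is never reached by any rule and is left over for the expansion step.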

Algorithm 1: Feature Taxonomy Construction.

Input: R, FH = {F, L}, FP.
Output: FH, RF  // RF is the set of remaining features which are not added to FH after the construction

if F = ∅ then
    root := argmax_{f ∈ FP} {supp(f)}, F := UF := {root};
else
    UF := F;
endif
for each feature f_e ∈ UF
    if R_{f_e} ≠ ∅  // the rule set of f_e is not empty
        for each rule X → Y ∈ R_{f_e}
            if ¬exist(Y)  // Y does not exist on the tree
                F := F ∪ {Y}, L := L ∪ {(f_e, Y)},
                UF := UF ∪ {Y}, FP := FP − {Y};
            else  // Y exists on the tree
                if Y ∈ ES(Y) and Q(f_p, Y) < Q(f_e, Y)  // f_p is Y's parent feature
                    L := L ∪ {(f_e, Y)}, L := L − {(f_p, Y)};  // add (f_e, Y) and remove (f_p, Y)
                else  // Y ∉ ES(Y), Y is not on the tree
                    FP := FP − {Y};
                endif
            endif
        endfor
    endif
    UF := UF − {f_e};  // remove f_e from UF
endfor
RF := FP

After the taxonomy construction, some potential features may be left over in RF and have not been added to the taxonomy. The main reason is that these itemsets may not frequently occur in the reviews

together with the features that have been added to the taxonomy. In order to prevent valid features from being missed, we check the remaining itemsets in $RF$ by examining the shared sentiment words and feature relatedness between the remaining itemsets and the features in the taxonomy. Let $FH = \{F, L\}$ be the constructed feature taxonomy and $RF$ be the set of remaining potential features. For a potential feature $g$ in $RF$, the basic strategy to determine whether $g$ is a feature or not is to examine the $Q$ ranking between $g$ and the features in the taxonomy. Let $F_g = \{f \mid f \in F, Q(f, g) > 0\}$ be the set of features which are related to $g$. If $F_g \neq \emptyset$, $g$ is considered a feature. The most related feature is defined as $f_m = \arg\max_{f \in F_g} \{Q(f, g)\}$, and $g$ will be added to the taxonomy with $f_m$ as its parent feature. If there are multiple such features $f_m$ which have the highest ranking score with $g$, the one with the highest support will be chosen as the parent feature of $g$. Algorithm 2 formally describes the method mentioned above to expand the taxonomy by adding the remaining features.

After the expansion, the features left over in RF

are not considered as features for this product.

Algorithm 2: Feature Taxonomy Expansion.

Input: FH = {F, L}, RF.
Output: FH

for each feature g ∈ RF
    if (F_g := {f | f ∈ F, Q(f, g) > 0}) ≠ ∅
        M := {a | a ∈ F_g and Q(a, g) = max_{f ∈ F_g} {Q(f, g)}}
        f_m := argmax_{f ∈ M} {supp(f)}
        F := F ∪ {g}, L := L ∪ {(f_m, g)}
        RF := RF − {g}
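The expansion step can be sketched in Python as follows. The sketch stores the tree as a child-to-parent dictionary, uses plain strings as features and receives the ranking function and the supports from the caller; these representation choices, the processing order and the toy values are assumptions made for illustration.

```python
def expand_taxonomy(parent, remaining, q, supp):
    """Attach each remaining feature g to its most related taxonomy feature, if any."""
    parent = dict(parent)
    left_over = set()
    for g in sorted(remaining):  # sorted only to make the example deterministic
        scores = {f: q(f, g) for f in parent}
        related = {f: s for f, s in scores.items() if s > 0}    # F_g
        if not related:
            left_over.add(g)  # g is not considered a feature
            continue
        best = max(related.values())
        ties = [f for f, s in related.items() if s == best]     # M
        parent[g] = max(ties, key=lambda f: supp[f])            # highest support wins
    return parent, left_over


# Toy taxonomy and hypothetical ranking scores.
parent = {"camera": None, "picture": "camera"}
supp = {"camera": 0.8, "picture": 0.6, "zoom": 0.1, "thing": 0.05}
table = {("camera", "zoom"): 0.4, ("picture", "zoom"): 0.4}


def q(f, g):
    """Hypothetical Q ranking looked up from a table; unknown pairs score 0."""
    return table.get((f, g), 0.0)


expanded, discarded = expand_taxonomy(parent, {"zoom", "thing"}, q, supp)
print(expanded["zoom"], discarded)
```

Here "zoom" is equally related to "camera" and "picture", so the tie is broken in favour of "camera", which has the higher support, while "thing" has no related feature and is discarded.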

4 EXPERIMENT AND EVALUATION

We use three datasets in the experiments. Each dataset contains user reviews for a certain type of digital camera. One dataset is used in (Hu and Liu, 2004a), while the other two are used in (Ding et al., 2008). Each review in the datasets has been manually annotated. In detail, a human examiner read each review sentence by sentence. If a sentence is considered to express the user’s opinion, whether positive or negative, all possible features in the sentence that are modified by sentiment words are tagged. We take these annotated features as the correct features to evaluate the performance of our proposed method in feature extraction. The number of reviews and the number of annotated features are 51 and 98 for camera 1, 34 and 75 for camera 2, and 45 and 105 for camera 3.

Our proposed feature taxonomy captures both

product features and relations between features.

Therefore, the evaluations are twofold: feature extrac-

tion evaluation and structural relations evaluation.

4.1 Feature Extraction Evaluation

First of all, we evaluate the performance of our ap-

proach by examining the number of accurate features

in user reviews that have been extracted. We use the

feature extraction method (FBS) proposed in (Hu and

Liu, 2004a) as the baseline for comparison. In ad-

dition, in order to examine the effectiveness of using

the sentiment sharing measure, the feature relatedness

measure, and the combination of the two, we conduct

our experiment in four runs:

(1) Rule: construct the feature taxonomy by only uti-

lizing the information of association rules (i.e.,

support and conﬁdence value only) without using

the sentiment sharing and the feature relatedness

measures;

ProductFeatureTaxonomyLearningbasedonUserReviews

189

(2) SS: construct the feature taxonomy by taking the

information of association rules and the sentiment

sharing measure without using the feature related-

ness measure;

(3) FR: construct the feature taxonomy by taking the

information of association rules and the feature

relatedness measure without using the sentiment

sharing measure;

(4) Hybrid: the sentiment sharing and the feature re-

latedness are combined together with the informa-

tion of association rules to construct the feature

taxonomy.

Table 1: Recall Comparison.

Method   Camera 1   Camera 2   Camera 3   Average
FBS      0.57       0.63       0.57       0.59
Rule     0.38       0.52       0.45       0.45
SS       0.56       0.65       0.58       0.60
FR       0.56       0.67       0.58       0.60
Hybrid   0.56       0.68       0.58       0.61

Table 2: Precision Comparison.

Method   Camera 1   Camera 2   Camera 3   Average
FBS      0.45       0.42       0.51       0.46
Rule     0.55       0.57       0.74       0.62
SS       0.62       0.57       0.63       0.61
FR       0.60       0.56       0.63       0.60
Hybrid   0.62       0.59       0.68       0.63

Table 3: F1 Score Comparison.

Method   Camera 1   Camera 2   Camera 3   Average
FBS      0.50       0.50       0.54       0.51
Rule     0.45       0.54       0.56       0.52
SS       0.59       0.61       0.60       0.60
FR       0.58       0.61       0.60       0.60
Hybrid   0.59       0.63       0.63       0.62

Tables 1, 2 and 3 report the recall, precision, and F1 score of the baseline and the four runs, respectively. From the results, we can see that using both sentiment sharing and feature relatedness generally yields better feature extraction performance than using the association rule information only; the exception is precision on camera 3, where the Rule run scores 0.74 against 0.68 for Hybrid. In particular, the hybrid method, which uses both sentiment sharing and feature relatedness, achieves the best results in most cases. However, the size of the review dataset and the number of annotated features can affect the precision and recall, so the values of precision and recall vary over different ranges for different datasets. For instance, camera 3 has higher precision values than camera 2 due to the larger number of reviews in the camera 3 dataset, but lower recall values than camera 2 due to the larger number of manually annotated features in the camera 3 dataset.
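For reference, the three measures can be computed as in the following Python sketch, which uses the standard set-based definitions of precision, recall and F1; the exact matching procedure between extracted and annotated features is not specified here, so exact string matching and the toy feature sets are assumptions.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(extracted, annotated):
    """Set-based precision, recall and F1 of extracted features against annotations."""
    extracted, annotated = set(extracted), set(annotated)
    correct = len(extracted & annotated)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(annotated) if annotated else 0.0
    return precision, recall, f1_score(precision, recall)


# Toy feature sets, not taken from the datasets.
p, r, f1 = precision_recall_f1(
    extracted={"lens", "battery", "thing", "zoom"},
    annotated={"lens", "battery", "flash"},
)
print(p, round(r, 2), round(f1, 2))

# A precision of 0.63 and a recall of 0.61 give an F1 of about 0.62.
hybrid_f1 = round(f1_score(0.63, 0.61), 2)
print(hybrid_f1)
```

The last line shows that a precision of 0.63 combined with a recall of 0.61 gives an F1 of about 0.62, which agrees with the Hybrid averages reported in the tables.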

4.2 Structural Relation Evaluation

The evaluation of the relations requires a standard taxonomy or knowledge from experts (Tang et al., 2009). Since there is no existing standard taxonomy available for comparison, we manually created a taxonomy for each of the three cameras according to the product technical specifications provided online by the manufacturers (footnotes 1, 2 and 3). According to the product specifications on these websites, each camera has a number of attributes such as lens system and shooting modes. In addition, each attribute may also have several sub attributes. For instance, the shooting modes attribute of a camera contains more specific attributes (e.g., intelligent auto and custom). Based upon such information, we create the product feature taxonomy for the three digital cameras and use it as the testing taxonomy, called the Manual Feature Taxonomy (MFT), to evaluate the relations within our proposed feature taxonomy.

Because technical specifications are written by domain experts whereas reviews are written subjectively by online users, the words used to represent a feature in user reviews often differ from the words used for the same feature in the product specification. For example, the feature lens system in the testing taxonomy and the feature lens in our generated taxonomy should be considered the same according to common knowledge, even though they do not match exactly. We therefore determine whether two features match based on the overlap between them rather than on exact matching.

Let $MFT = \{F_{MFT}, L_{MFT}\}$ be the testing taxonomy, with $F_{MFT}$ being a set of standard features given by domain experts and $L_{MFT}$ being a set of links in the testing taxonomy. For a given link $(f_{Fp}, f_{Fc}) \in L$ in the constructed product feature taxonomy and two features $f_{Mp}, f_{Mc} \in F_{MFT}$ in the testing taxonomy, the link $(f_{Fp}, f_{Fc})$ is considered matched with $(f_{Mp}, f_{Mc})$, and therefore represents a correct feature relation, if the following conditions are satisfied:

1. $W(f_{Mp}) \cap W(f_{Fp}) \neq \emptyset$ and $W(f_{Mc}) \cap W(f_{Fc}) \neq \emptyset$

2. There exists a path in $MFT$, $\langle f_{Mp}, f_1, \ldots, f_n, f_{Mc} \rangle$, such that $(f_{Mp}, f_1), (f_i, f_{i+1}), (f_n, f_{Mc}) \in L_{MFT}$, $i = 1, \ldots, n - 1$
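The two matching conditions can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: `words` stands in for $W(f)$ and uses a crude plural-stripping rule in place of real stemming, a direct link is treated as a valid path, and the small testing taxonomy (`mft_features`, `mft_links`) is a hypothetical fragment built from the examples mentioned in the text.

```python
from collections import deque


def words(feature: str) -> set[str]:
    """Assumed W(f): lower-cased words of a feature name, with a crude
    plural-stripping rule standing in for proper stemming."""
    tokens = feature.lower().split()
    return {t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens}


def has_path(links: set[tuple[str, str]], start: str, goal: str) -> bool:
    """Condition 2: a parent-to-child path of one or more links from start to goal."""
    children: dict[str, list[str]] = {}
    for parent, child in links:
        children.setdefault(parent, []).append(child)
    queue = deque(children.get(start, []))
    seen: set[str] = set()
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        if node not in seen:
            seen.add(node)
            queue.extend(children.get(node, []))
    return False


def link_matches(link, mft_features, mft_links) -> bool:
    """True if the constructed link (parent, child) satisfies both conditions."""
    f_parent, f_child = link
    # Condition 1: word overlap between constructed and testing features.
    parents = [m for m in mft_features if words(m) & words(f_parent)]
    kids = [m for m in mft_features if words(m) & words(f_child)]
    # Condition 2: some overlapping pair is connected by a path in MFT.
    return any(has_path(mft_links, mp, mc) for mp in parents for mc in kids)


# Hypothetical fragment of a testing taxonomy (illustration only).
mft_features = {"camera", "lens system", "shooting modes", "intelligent auto", "custom"}
mft_links = {
    ("camera", "lens system"),
    ("camera", "shooting modes"),
    ("shooting modes", "intelligent auto"),
    ("shooting modes", "custom"),
}

print(link_matches(("mode", "auto"), mft_features, mft_links))    # True
print(link_matches(("mode", "manual"), mft_features, mft_links))  # False
```

With this fragment, (mode, auto) matches (shooting modes, intelligent auto), while (mode, manual) finds no counterpart because "manual" shares no word with "custom".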

1 http://www.canon.com.au/Personal/Products/Camerasand-Accessories/Digital-Cameras/PowerShot-S100

2 http://www.nikonusa.com/en/Nikon-Products/Product/Compact-Digital-Cameras/26332/COOLPIX-S4300.html

3 http://www.usa.canon.com/cusa/support/consumer/digital cameras/powershot g series/powershot g3]Specifications

WEBIST 2014 - International Conference on Web Information Systems and Technologies

Figure 1: Constructed Feature Taxonomy.

Figure 2: Testing Feature Taxonomy.

We examine the testing taxonomy and the constructed taxonomy to identify all matched links in the constructed taxonomy. The traditional measures precision and recall are used to evaluate the correctness of the feature relations in the constructed feature taxonomy. Let $ML(FH)$ denote the set of matched links in the constructed taxonomy. Precision and recall are then defined as $Precision = |ML(FH)| / |L|$ and $Recall = |ML(FH)| / |L_{MFT}|$.
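The relation-level precision and recall defined above reduce to two ratios over link counts. The following sketch assumes the matched links have already been identified; the function name and the toy link sets are hypothetical and are not taken from the experiments.

```python
def relation_precision_recall(matched_links, constructed_links, testing_links):
    """Precision = |ML(FH)| / |L| and Recall = |ML(FH)| / |L_MFT|."""
    matched = len(matched_links)
    precision = matched / len(constructed_links) if constructed_links else 0.0
    recall = matched / len(testing_links) if testing_links else 0.0
    return precision, recall


# Toy data (illustration only): 4 constructed links, 5 testing links, 2 matched.
constructed = {("camera", "lens"), ("mode", "auto"), ("mode", "manual"), ("picture", "color")}
testing = {
    ("camera", "lens system"),
    ("camera", "shooting modes"),
    ("shooting modes", "intelligent auto"),
    ("shooting modes", "custom"),
    ("image sensor", "resolution"),
}
matched = {("camera", "lens"), ("mode", "auto")}

precision, recall = relation_precision_recall(matched, constructed, testing)
print(precision, recall)  # 0.5 0.4
```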

Table 4: Recall and Precision of Relation Evaluation.

           Relations in MFT   Relations in FH   Recall   Precision
Camera 1   75                 97                0.40     0.46
Camera 2   63                 97                0.57     0.65
Camera 3   71                 102               0.51     0.57

Table 4 reports the evaluation results, including the number of relations within the testing taxonomy, the number of relations within our generated taxonomy, and the recall and precision for the three cameras. The results show that our generated feature taxonomy correctly captures around 50% of the relationships, with recall ranging from 0.40 to 0.57 and precision from 0.46 to 0.65. Figure 1 and Figure 2 show a part of the feature taxonomy generated by our proposed approach and of the testing taxonomy built from the product specification available online given by domain experts, respectively. The comparison shows that our generated feature taxonomy identifies the relation between picture and resolution. Although the testing taxonomy uses the more technical term image sensor instead of picture, the two refer to the same attribute of the camera according to common knowledge. Similarly, (mode, auto) and (shooting modes, intelligent auto) indicate the same relationship between two features.

As mentioned above, online users and the manufacturers' experts may describe the same feature using entirely different terms. This lowers both the recall and the precision of our proposed approach in feature relationship identification. For instance, users may prefer “manual” to describe a specific camera mode option, whereas the manufacturers' experts usually use the term “custom” for this sub-feature, which belongs to “shooting modes”. In such a case, the two relations (mode, manual) and (shooting modes, custom) cannot be matched.

5 CONCLUSION AND FUTURE WORK

In this paper, we introduced a product feature taxonomy learning approach based on frequent patterns and association rules. The objective is not only to extract the product features mentioned in user reviews but also to identify the relationships between the extracted features. The results of our experiments indicate that the proposed approach is effective in identifying both correct features and the structural relationships between them. In particular, the feature relationships captured in the feature taxonomy provide more detailed information about products. This allows us to represent product profiles as multiple levels of features, rather than as a single level as most other methods do.

In the future, we plan to improve and evaluate our proposed product model by utilizing semantic similarity tools. For instance, the vocabulary mismatch can be handled by examining semantic similarity during the structural relation evaluation. In addition, we plan to develop a review recommender system that makes use of the proposed product model to identify high-quality reviews. The structural relations of the product model can help identify some characteristics of reviews, such as how a certain feature and its sub-features have been discussed and how many different features have been covered. Our system will therefore aim to recommend reviews based upon such criteria to help users make purchasing decisions.

REFERENCES

Abbasi, A., Chen, H., and Salem, A. (2008). Sentiment analysis in multiple languages: Feature selection for opinion classification in web forum. ACM Transactions on Information Systems, 26(3).

Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent dirichlet allocation. Journal of Machine Learning Research, 3:993–1022.

Ding, X., Liu, B., and Yu, P. S. (2008). A holistic lexicon-based approach to opinion mining. In Proceedings of the 2008 International Conference on Web Search and Data Mining, pages 231–240.

Hai, Z., Chang, K., Kim, J., and Yang, C. (2013). Identifying features in opinion mining via intrinsic and extrinsic domain relevance. IEEE Transactions on Knowledge and Data Engineering, pages 1–1.

ProductFeatureTaxonomyLearningbasedonUserReviews

191

Hofmann, T. (2001). Unsupervised learning by probabilistic latent semantic analysis. Machine Learning, 42(1–2):177–196.

Hu, M. and Liu, B. (2004a). Mining and summarizing customer reviews. In 10th ACM SIGKDD international conference on Knowledge discovery and data mining.

Hu, M. and Liu, B. (2004b). Mining opinion features in customer reviews. In Proceedings of the 19th national conference on Artificial intelligence.

Hu, W., Gong, Z., and Guo, J. (2010). Mining product features from online reviews. In IEEE International Conference on E-Business Engineering, pages 24–29.

Lau, R. Y., Lai, C. C., Ma, J., and Li, Y. (2009). Automatic domain ontology extraction for context-sensitive opinion mining. In Proceedings of the Thirtieth International Conference on Information Systems.

Lewis, D. D. (1992). An evaluation of phrasal and clustered representations on a text categorization task. In Proceedings of the 15th ACM International Conference on Research and Development in Information Retrieval, pages 177–196.

Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Efficient mining of association rules using closed itemset lattices. Information Systems, 24(1):25–46.

Popescu, A.-M. and Etzioni, O. (2005). Extracting product features and opinions from reviews. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 339–346.

Scaffidi, C., Bierhoff, K., Chang, E., Felker, M., Ng, H., and Jin, C. (2007). Red opal: Product-feature scoring from reviews. In Proceedings of the 8th ACM conference on Electronic commerce, pages 182–191.

Subrahmanian, V. S. and Reforgiato, D. (2008). Ava: Adjective-verb-adverb combinations for sentiment analysis. IEEE Intelligent Systems, pages 43–50.

Tang, J., Leung, H.-f., Luo, Q., Chen, D., and Gong, J. (2009). Towards ontology learning from folksonomies. In Proceedings of the 21st international joint conference on Artificial intelligence, pages 2089–2094.

Wright, A. (2009). Our sentiments, exactly. Communications of the ACM, 52(4):14–15.

Xu, Y., Li, Y., and Shaw, G. (2011). Representations for association rules. Data and Knowledge Engineering, 70(6):237–256.

Zhang, Y. and Zhu, W. (2013). Extracting implicit features in online customer reviews for opinion mining. In Proceedings of the 22nd international conference on World Wide Web companion, pages 103–104.

WEBIST2014-InternationalConferenceonWebInformationSystemsandTechnologies

192