Review-based Entity-ranking Refinement

Panagiotis Gourgaris, Andreas Kanavos, Christos Makris and Georgios Perrakis

Dept. of Computer Engineering and Informatics, University of Patras, Patras, Greece

Keywords:

Inference Network, Knowledge Extraction, Opinion Mining, Re-ranking Model, Searching and Browsing,

Text Mining, Web Information Filtering and Retrieval.

Abstract:

In this paper, we address the problem of entity ranking using opinions expressed in users’ reviews. There is an abundance of opinions on the web, including reviews of products and services. Specifically, we examine techniques that utilize clustering information to cope with the entity ranking problem. Building on this framework, we propose a probabilistic network scheme that employs a topic identification method so as to modify the ranking of results based on user personalization. The contribution lies in the construction of a probabilistic network which takes as input the belief of the user for each query (initially, all entities are equivalent) and produces a new ranking for the entities as output. We evaluated the implemented methodology through experiments on the OpinRank Dataset, where we observed improved retrieval performance compared to current re-ranking methods.

1 INTRODUCTION

The rapid development of web technologies and social networks has created a huge volume of reviews on products and services, as well as opinions on events and individuals. Opinions are an important part of human activity because of their effect on decision-making. Specifically, consumers are used to consulting other users’ reviews before purchasing a product or service. Businesses, in turn, are keenly interested in the opinions and reviews concerning their products or services, so that they can adjust their promotion and further development accordingly.

Concerning consumers, one has to read many reviews in order to form an overall assessment of a set of objects of a specific entity. From these reviews, enough opinions must be extracted to reach a well-founded conclusion about each of the objects. The purpose of this extraction is to classify the objects and then to single out those that are notable. This multitude of opinions clearly creates a challenge for both the consumer and entity ranking systems. As an example, suppose that we are interested in purchasing a smartphone; this leads us to search for reviews written by users based on their experience with the corresponding products. Potential consumers usually look for specific features (also called aspects) in a product, e.g. iOS or Android operating system, battery life, number of camera megapixels, etc. As a result, consumers usually consult the reviews of other users, searching for devices with positive opinions regarding these specific features and characteristics.

The above procedure is exhausting and time-consuming. Developing computational techniques that help users make use of all these opinions is therefore an important and interesting research challenge. In (Ganesan and Zhai, 2012), the authors describe the setup for an opinion-based entity ranking system. The intuition behind their work is that each entity can be represented by all of its review texts and that the users of such a system can state their preferences on several attributes during the evaluation process. Thus, we can expect that a user’s query would consist of preferences on multiple attributes. For the previous example regarding smartphones, one potential user query could be “ios system, large battery life, 32 megapixels”, expressing the user’s preferences on three different aspects of the entity. The problem of assessing entities can be transformed into a preference matching problem, where any standard information retrieval model can be employed. That is, given a query from the user, which consists of keywords and expresses the desired features that an entity should have, we can evaluate all candidate entities based on how well their opinions match the user’s preferences.

Gourgaris P., Kanavos A., Makris C. and Perrakis G.
Review-based Entity-ranking Refinement.
DOI: 10.5220/0005428604020410
In Proceedings of the 11th International Conference on Web Information Systems and Technologies (WEBIST-2015), pages 402-410
ISBN: 978-989-758-106-9
© 2015 SCITEPRESS (Science and Technology Publications, Lda.)

The setup presented in (Ganesan and Zhai, 2012) is an information retrieval approach which exploits the importance of aspect keywords in review texts. We investigate the behavior of entity ranking following this information retrieval approach. We use the ratings of aspects in order to identify entities with similar aspect reviews among users, and we use this information to produce a better entity ranking. Finally, we take into account that not all aspects are equally important in the assessment of the entities.

Furthermore, we enhance the work presented in (Makris et al., 2013) by proposing a semantically driven Bayesian Inference Network that incorporates semantic concepts (as extracted in (Makris and Panagopoulos, 2014)) so as to improve the ranking quality of documents. Bayesian Networks are increasingly being used in a variety of areas such as Web Searching (Acid et al., 2003), (Teevan, 2001), Bioinformatics (Niedermayer, 2008) and others. A major subclass of Bayesian Networks is the Bayesian Inference Network (BIN) (Turtle, 1991), which has been employed in various applications (Abdo et al., 2014), (Ma et al., 2006), (Teevan, 2001).

Building on this idea, we utilize schemes that take into account clustering information about the opinions emerging in reviews. We also propose a probabilistic network scheme (based on Inference Network modeling) that employs a topic identification method. The rest of the paper is organized as follows. In Section 2, related work as well as our contribution is presented. In Section 3, we present the extensions regarding ranking techniques. Subsequently, in Section 4, we describe our proposed re-ranking system. Section 5 presents our experimental results. Finally, Section 6 concludes the paper and discusses future steps and open problems.

2 RELATED WORK

As already stated, we address the problem of creating a ranked list of entities using users’ reviews and, at a later stage, presenting a re-ranked list according to the users’ selections. To this end, aspect-oriented or feature-based opinion mining, as defined in (Ganesan and Zhai, 2012), is employed. Along this line, each entity is represented by the total of its review texts and users express their queries as preferences on multiple aspects. Moreover, in (Ganesan and Zhai, 2012), the authors presented a setup for entity ranking where entities are evaluated depending on how well the opinions expressed in the reviews match the user’s preferences. They studied the use of various state-of-the-art retrieval models for this task, such as the BM25 retrieval function (Robertson and Zaragoza, 2009), the Dirichlet prior retrieval function (Zhai and Lafferty, 2001), as well as the PL2 function (Amati and van Rijsbergen, 2002). They also proposed new extensions over these models, including query aspect modeling (QAM) and opinion expansion; the latter introduces common praise words with positive meaning so as to favor texts, and correspondingly entities, with positive opinions on aspects.

In (Makris and Panagopoulos, 2014), the authors further improved this setup by developing schemes which take into account sentiment and clustering information about the opinions expressed in reviews; they also proposed the naive consumer model, an unsupervised schema that utilizes information from the web so as to assign a weight of importance to each of the features used for evaluating the entities.

Regarding reviews, a great deal of research has been devoted to classifying reviews into positive and negative ones, based on the overall sentiment they contain. Several supervised (Dave et al., 2003), (Pang and Lee, 2004), unsupervised (Nasukawa and Yi, 2003), (Turney and Littman, 2003), as well as hybrid (Pang and Lee, 2005), (Prabowo and Thelwall, 2009) techniques have been proposed. In addition, there has been much research on employing users’ reviews to provide ratings for the corresponding aspects (Lu et al., 2009), (Wang et al., 2010). These methods are relevant to the one proposed here since, with aspect-based analysis, the ratings of the different aspects can be extracted from the reviews. However, our approach differs in the applied methodology, as we do not explicitly utilize any of the modeling capabilities that these theories provide.

A closely related research area is opinion retrieval (Liu, 2012), which aims to identify documents that contain opinions. An opinion retrieval system is usually built on top of classical retrieval models; relevant documents are initially retrieved and opinion analysis techniques are then applied so as to return documents with emerging opinions. The field of expert finding can be considered another related research area. There, a ranked list of persons that can be regarded as experts on a certain topic (Fang and Zhai, 2007), (Wang et al., 2010) is retrieved. Similarly, we try to produce a ranked list of entities, but instead of evaluating the entities based on how well they match a topic, we use the opinions on the entities and observe how well they match the user’s preferences.

Concerning the ranking quality of documents, the authors in (Lee et al., 2011) enrich the semantics of user-specific information and documents, aiming at an efficient implementation of personalized searching strategies. They adopt a Bayesian Belief Network (BBN) as a strategy for personalized search, since it provides a clear formalism for embedding semantic concepts. Their approach is different from ours, as they use belief instead of inference networks and they employ the Open Directory Project Web directory. In (Abdo et al., 2014), the authors enhance BINs using relevance personalization information and multiple reference structures, applying their technique to similarity-based virtual screening and employing two distinct methods for carrying out BIN searching: reweighting the fragments in the reference structures and a group fusion algorithm. Our approach aims at a different application and employs semantic information as a distinct layer in the applied inference network.

Alongside this line of research, there are approaches that exploit information from past user queries and preferences. Relevant techniques range from simple systems implementing strategies that match users’ queries to collection results (Meng et al., 2002), to machine learning methods that exploit the outcomes of stored queries so as to permit more accurate rankings (Liu, 2011). There is also related work, though with a different focus from ours, (Brandt et al., 2011) that combines diversified and interactive retrieval under the label of dynamic ranked retrieval. In contrast, (Makris et al., 2013) initially proposed the transparent embedding of semantic knowledge bases to improve the re-ranking of search engine results; for this purpose, the authors created a new probabilistic model which takes as input different semantic knowledge bases. In this paper, we insert document clustering based on entity identification into the belief network, making it possible to identify the interest of users in thematic groups of results. This is a novel approach to exploiting semantic knowledge in belief networks applied to information retrieval problems.

The main contribution of the proposed method is the incorporation and further examination of clustering techniques regarding the opinions emerging in reviews, together with the proposal of a probabilistic network scheme based on Inference Network modeling that modifies the ranking of results as users select entities. The method propagates the user’s belief from the selected entity, through the constructed network, to the other entities that contain the senses of the selected one. The re-ranking of the results is based on a vector which contains a weight for each entity, representing the probability that the entity is relevant for the user. Our method constructs a probabilistic network from the entities, the clusters of the entities and the aspects; when users select an entity, the weights of the entities in the same cluster take larger values and these entities are therefore ranked higher. Detailed experiments with the proposed method show increased performance in comparison to the one-phase rankings (without the re-ranking module) of the result set.

3 EXTENSIONS

We present a methodology for extracting weights and thereby affecting the ranking list returned by the system. There are two perspectives we try to address. The first one concerns information extracted from the reviews that already exist in our database. The second deals with personalization from the viewpoint of the query introduced in real time by the user.

3.1 Opinion-aspect Query Expansion

We can apply opinion expansion to a query $q$ with the use of WordNet. We consider query $q_i$ as the part of the overall query concerning a single aspect.

In addition, we form $Q_{exp}$, which is the set of queries $\{q_1, q_2, \ldots, q_n\}$, where each query has value $q_i = \{t_i, st_{i1}, \ldots, st_{ik}\}$ and $t_i$ is treated holistically in the form of the term $q_i$. Also, $st_{ij}$ are the synonyms of the opinion term $t_i$ and $k$ is their number.

The aspects are handled in a similar way. Each time there is a reference to an aspect, the reference concerns its synonyms too. Finally, we introduce collections $Dq_i$, where $Dq_i$ is the collection of reviews that contain terms from $q_i$.
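The expansion step described above can be illustrated with a minimal Python sketch. It assumes a small hand-written synonym table in place of a real WordNet lookup, and whitespace tokenization of reviews; the names `SYNONYMS`, `expand_term`, `expand_query` and `matching_reviews` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set

# Hypothetical synonym table standing in for a WordNet lookup (assumption).
SYNONYMS: Dict[str, List[str]] = {
    "good": ["fine", "great"],
    "large": ["big"],
}


def expand_term(term: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the expanded sub-query: the term followed by its synonyms."""
    return [term] + list(synonyms.get(term, []))


def expand_query(terms: List[str], synonyms: Dict[str, List[str]]) -> List[List[str]]:
    """Build the expanded query set: one expanded sub-query per original term."""
    return [expand_term(t, synonyms) for t in terms]


def matching_reviews(sub_query: List[str], reviews: Dict[str, str]) -> Set[str]:
    """Return the ids of the reviews that contain at least one term of the sub-query."""
    wanted = set(sub_query)
    hits: Set[str] = set()
    for review_id, text in reviews.items():
        # Naive lowercase whitespace tokenization (assumption).
        if wanted & set(text.lower().split()):
            hits.add(review_id)
    return hits
```

For instance, expanding the terms `["good", "large"]` with the table above yields one sub-query per term, and `matching_reviews` then collects the reviews that mention any term of a given sub-query.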

3.2 Query Personalization

The intuition behind this method is that the aspects do not carry the same importance when applied to the query. More precisely, the same holds for each $q_i$. The assumption we make is the intuitive one that the first aspect entered by the user is the most important, so aspects introduced later have less importance.

3.3 RRR Weighting

Two weights are used as re-ranking factors, termed RRR1 and RRR2 throughout our study. For the RRR1 weight, we formulate the importance of a specific aspect $a_i$ in the original collection $C$. We use the quantity $|C_{f_i}|$, which is the number of documents containing $a_i$:

$$0 \le \log_2\left(1 + \frac{|C_{f_i}|}{|C|}\right) \le 1 \qquad (1)$$

For the second re-ranking factor, RRR2, we utilize the Poisson discrete distribution. It expresses the probability of a given number of events occurring in a fixed interval of time, if these events occur with a known average rate $\lambda$ and independently of the time since the last event:

$$P(X = k) = \frac{\lambda^k}{k!} \times e^{-\lambda} \qquad (2)$$

where $\lambda$ is a constant with value 1.8. The idea is that each aspect is introduced to the system with a given position $k$ in the query, and the event of its occurrence at position $k$ is what is evaluated. By selecting a fixed $\lambda$, we favour aspects that are between positions 1 and 2.
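As a rough illustration of the two weights, the following Python sketch computes RRR1 as a logarithm of one plus the fraction of documents containing the aspect, and RRR2 as the Poisson probability of the aspect's position in the query. The base-2 logarithm is an assumption (it is the base for which the weight stays between 0 and 1), as are the function names `rrr1` and `rrr2` and the convention that positions start at 1.

```python
import math


def rrr1(docs_with_aspect: int, collection_size: int) -> float:
    """Importance of an aspect in the collection: log2(1 + |C_f| / |C|).

    The ratio lies in [0, 1], so the base-2 logarithm keeps the weight in [0, 1].
    """
    if collection_size <= 0:
        raise ValueError("collection_size must be positive")
    if not 0 <= docs_with_aspect <= collection_size:
        raise ValueError("docs_with_aspect must lie between 0 and collection_size")
    return math.log2(1.0 + docs_with_aspect / collection_size)


def rrr2(position: int, lam: float = 1.8) -> float:
    """Poisson probability lam**k * exp(-lam) / k! for query position k."""
    if position < 0:
        raise ValueError("position must be non-negative")
    return lam ** position * math.exp(-lam) / math.factorial(position)
```

With the rate fixed at 1.8, `rrr2` gives its highest values at positions 1 and 2 and decreases afterwards, which matches the intention of favouring the aspects that the user enters first.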

4 RE-RANKING PROPOSED SYSTEM

In order for a user to utilize the services provided by our inference network model, they simply need to give their feedback concerning a specific query, consisting of a combination of different aspects. Every time the user clicks on a result (an entity in our example), the inference network is used to produce a new, improved ranking list according to their choice. The system works in real time and in parallel, re-arranging the initial order; the algorithm is presented in detail in the following sections. Both the re-ranking method and the network were inspired by (Makris et al., 2013).

4.1 The Re-Ranking Method

Our refinement method improves the initial ranking based on the user’s selections; the personalization re-organizing step is driven by the input produced by each user. This step runs iteratively every time the user makes a selection. In particular, the proposed network is used either in combination with the initial ranking returned by the search engine or with the previous ranking of the results in the re-ranking process (if we have a series of re-rankings). These options are expressed by the following equation, which was also used, though not entirely in the same way, in (Makris et al., 2013) (see also (Antoniou et al., 2012)):

$$NewRankingScore_i = (n - PreviousRank_i + 1) \times (1 + \beta \times R_i) \qquad (3)$$

where $R_i$ denotes the re-ranking weight provided by the network for entity $e_i$ (its computation is described in the next section), $PreviousRank_i$ stands for the previous rank position of entity $e_i$, obtained from the BM25 algorithm (IR model) incorporating our RRR weights, $n$ is the number of results retrieved and $\beta$ is a user-defined weight factor. Intuitively, the larger the factor $\beta$, the larger the rank changes produced by the re-ranking process.
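A minimal Python sketch of this scoring step follows. It assumes the previous ranking is given as a list of entity identifiers in rank order and the network weights as a dictionary; the names `new_ranking_scores` and `rerank` are hypothetical, and entities without a weight are assumed to have weight 0.

```python
from typing import Dict, List


def new_ranking_scores(
    previous_ranking: List[str], weights: Dict[str, float], beta: float
) -> Dict[str, float]:
    """Score each entity as (n - previous_rank + 1) * (1 + beta * R)."""
    n = len(previous_ranking)
    scores: Dict[str, float] = {}
    for rank, entity in enumerate(previous_ranking, start=1):
        r = weights.get(entity, 0.0)  # missing weight treated as 0 (assumption)
        scores[entity] = (n - rank + 1) * (1.0 + beta * r)
    return scores


def rerank(
    previous_ranking: List[str], weights: Dict[str, float], beta: float
) -> List[str]:
    """Return the entities ordered by decreasing new score."""
    scores = new_ranking_scores(previous_ranking, weights, beta)
    # sorted() is stable, so entities with equal scores keep their previous order.
    return sorted(previous_ranking, key=lambda e: scores[e], reverse=True)
```

With `beta = 0` the previous ranking is returned unchanged; with a larger `beta`, an entity with a high network weight can overtake entities that were ranked above it. The output of one call can be fed back as `previous_ranking` for the next user selection.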

Equation 3 is used for the composite case where the new ranking is combined with the previous ranking of search results. In the new ranking produced, the results are ordered according to the score calculated above. When the user selects further results, the same procedure is followed, with the difference that the ranking produced by the previous phase is used as input for the next reordering.

Even though we assume that most results selected by a user are relevant, our scheme smoothly incorporates the previous ranking and is hence robust to user misselections. A misselection of a result leads to the inclusion of its information in the ranking process, but it can still be made not to affect the produced ranking significantly.

4.2 Re-Ranking Weight Calculation

Our inference network, as depicted in Figure 1, consists of four component levels (two of which contain the same nodes): the entities level (implementing user personalization, assuming that each entity corresponds to a unique document), the clusters level, the aspects level (query level) and a fourth level that represents the entities together with the weights they are assigned by the re-ranking procedure. The final level can be considered to play the role of the query layer in the traditional inference network model, and its presence signifies that we are interested in modeling specific re-rankings based on users’ queries.

The proposed inference network is constructed once for the specific dataset and its structure does not change during re-ranking. The entities level contains a node $e_i$ for each entity in the query’s results. For each entity the user selects, we assign weights to the corresponding clusters $Cl_j$, as we incorporate a fuzzy clustering algorithm where each entity can be placed in more than one cluster (with a weight corresponding to its contribution to that specific cluster). The cluster nodes are implicitly connected with the aspects $a_m$; more precisely, at this level we take into consideration the probabilities described in the next section. Finally, from the aspect nodes, information is passed to the entity nodes belonging to the last level of our proposed network. The resulting network is a four-level, unidirectional graph in which information flows from the initial entity node of the first level to the cluster nodes and then, through the aspect nodes, to the entity nodes of the last level (from top to bottom).

Figure 1: Inference Network. Levels from top to bottom: entities level, clusters level, aspects level, entities level.

The contribution of our network is the addition of the aspects level (in other words, the query expansion) based on the user’s entity selection. As presented in Figure 1, the entity nodes are connected to their different aspects through directed arcs. The existence of a directed path between a cluster node and an aspect node denotes that this aspect appears in the respective cluster and, more specifically, that the aspect has been observed in the review collection of the specific entity. The entity nodes of the last level are the same as the nodes of the first level, with a slight difference: they represent the same entities but with regard to different aspects (we show this by connecting the aspect nodes with the entity nodes of the last level using different lines, to depict the fact that we are interested in different aspects). The aspects level models the needs of a user, that is, the aspects the user looks for. The aspect nodes are connected to every node of the last level representing an entity in which this aspect appears. In addition, these entity nodes have an accumulated belief probability that is used for the proposed re-ranking procedure. The value of this belief is estimated based on the different emerging aspects of the entity and denotes the conceptual similarity between the entity and the information need of the user.

4.3 Estimation of Probabilities and Rearrangement of Results

The proposed inference network is based on the one proposed in (Turtle, 1991), with a slight difference: there, an information retrieval model is proposed, while here a re-ranking model is used. More specifically, we employ it only as a weight propagation mechanism, using the machinery for computing the beliefs at the last level of the network, thus providing a set of prior weights for the entities of the first level. That is, as a user selects an entity, the corresponding entity is assigned a probability and, as a result, we compute the alteration of the beliefs at the last level. This belief is transferred through the network to the aspect nodes and then to the final layer representing the entity nodes, and alters the results through the re-ranking process.

To estimate the node probabilities of the constructed inference network, we begin by assigning weights to the root nodes (entities). We enable the entity node that matches the user’s selection; this entity triggers the cluster nodes it belongs to and, as a result, the arcs starting from this entity node take a prior probability. More specifically, each entity node has an initial probability that denotes the chance of the corresponding entity being selected by the user. For each entity $e_i$ this initial probability will be:

$$p(e_i) = \frac{1}{n}, \quad i = \{1, \ldots, n\} \qquad (4)$$

where $n$ is the number of the different entities. This probability changes to 1 for the selected entity, denoting the user’s belief. In addition, for our specific dataset and for each entity $e_i$ that participates in all clusters, the probability of a cluster $Cl_j$ will be:

$$p(Cl_j, e_1, \ldots, e_n) = \frac{w_{all_j}}{\sum_{j'=1}^{k} w_{all_{j'}}}, \quad j = \{1, \ldots, k\} \qquad (5)$$

$$w_{all_j} = \sum_{i=1}^{n} p(e_i) \times w_{ij} \qquad (6)$$

where $k$ is the number of the different clusters and $w_{ij}$ is the contribution of the entity $e_i$ to the cluster $Cl_j$.

The aspect probabilities at the third level are calculated as follows:

$$p(a_m | Cl_j) = \frac{wa_{mj}}{\sum_{j'=1}^{k} wa_{mj'}}, \quad m = \{1, \ldots, l\} \qquad (7)$$

$$wa_{mj} = \sum_{i=1}^{n} \frac{tf(a_m, e_i)}{\sum_{v=1}^{l} tf(a_v, e_i)} \times p(e_i) \times w_{ij} \qquad (8)$$

where $l$ is the number of the different aspects, $wa_{mj}$ is the weight of aspect $a_m$ in cluster $Cl_j$ and $tf(a_m, e_i)$ is the term frequency of aspect $a_m$ related to the entity $e_i$. The entity probabilities at the fourth level are calculated as follows:

$$p(e_i, a_1, \ldots, a_l) = \frac{\sum_{v=1}^{l} tf(a_v, e_i)}{\sum_{i'=1}^{n} \sum_{v=1}^{l} tf(a_v, e_{i'})} \qquad (9)$$

Finally, the overall probability of each entity, that is, the transfer of the user’s belief through the network, is calculated by the following equation:

$$p(e_i, a_1, \ldots, a_l, Cl_1, \ldots, Cl_k, e_1, \ldots, e_n) = p(e_i, a_1, \ldots, a_l) \times \prod_{m=1}^{l} \prod_{j=1}^{k} p(a_m | Cl_j) \times \prod_{j=1}^{k} p(Cl_j, e_1, \ldots, e_n) \times \prod_{i=1}^{n} p(e_i)$$

This is the belief which is then inserted into Equation 3, providing the user with a re-ordered list.
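The per-level quantities of this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the entity priors are taken to be uniform (1/n) and set to 1 for the selected entity, the fuzzy cluster memberships and term frequencies are supplied as plain nested lists, and the final product that combines the levels is left out. The function name `propagate_belief` and its parameters are hypothetical.

```python
from typing import List, Tuple


def propagate_belief(
    selected: int,
    membership: List[List[float]],  # membership[i][j]: weight of entity i in cluster j
    term_freq: List[List[float]],   # term_freq[i][m]: frequency of aspect m for entity i
) -> Tuple[List[float], List[float], List[List[float]], List[float]]:
    """Return entity priors, cluster probabilities, aspect-given-cluster
    probabilities and last-level entity probabilities.

    Assumes at least one positive membership and one positive term frequency.
    """
    n = len(membership)
    k = len(membership[0])
    l = len(term_freq[0])

    # Entity priors: uniform 1/n, raised to 1 for the selected entity.
    p_e = [1.0 / n] * n
    p_e[selected] = 1.0

    # Cluster level: prior-weighted memberships, normalised over the clusters.
    w_all = [sum(p_e[i] * membership[i][j] for i in range(n)) for j in range(k)]
    w_total = sum(w_all)
    p_cl = [w / w_total for w in w_all]

    # Aspect level: relative term frequency times prior times membership,
    # normalised over the clusters for each aspect.
    tf_total = [sum(row) for row in term_freq]
    p_a: List[List[float]] = []
    for m in range(l):
        wa_m = [
            sum(
                term_freq[i][m] / tf_total[i] * p_e[i] * membership[i][j]
                for i in range(n)
                if tf_total[i] > 0
            )
            for j in range(k)
        ]
        norm = sum(wa_m)
        p_a.append([w / norm if norm > 0 else 0.0 for w in wa_m])

    # Last level: share of each entity in the total aspect term frequency.
    grand_total = sum(tf_total)
    p_ea = [t / grand_total for t in tf_total]
    return p_e, p_cl, p_a, p_ea
```

In this sketch, selecting an entity shifts probability mass towards the clusters in which that entity has a high membership, which is the mechanism by which entities in the same cluster end up with larger weights.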

5 EXPERIMENTS

5.1 Experimental Setting

In order to assess the performance of our methods, we compared the initial ranking and the ranking of the results after the application of the proposed methods against the ideal ranking based on the aspect ratings that users had given. The experiments were carried out using the OpinRank Dataset, which was presented in (Ganesan and Zhai, 2012) and consists of entities accompanied by user reviews from two different domains (cars and hotels); the reviews come from the sites Edmunds.com and Tripadvisor.com respectively. For our evaluation, we use the reviews from the cars domain, which includes car models as well as the corresponding reviews for the years 2007 (227 models), 2008 (228 models) and 2009 (143 models). In our set of experiments, we perform 100 queries and evaluate how well our schemas produce the correct entity ranking, calculating the nDCG at the first 10, 20 and 50 results.

We used the Normalized Discounted Cumulative Gain (nDCG) measure (Järvelin and Kekäläinen, 2000), which quantifies the gain of a document based on its position in the result list. The nDCG measure is based on the relevance judgments of the documents in the result list. Formally, the nDCG is computed at position $p$ as:

$$DCG_p = rel_1 + \sum_{i=2}^{p} \frac{rel_i}{\log_2 i} \qquad (10)$$

$$nDCG_p = \frac{DCG_p}{IDCG_p} \qquad (11)$$

where $rel_i$ are the document relevance scores from reviews and $IDCG_p$ is the ideal DCG, i.e. the DCG value obtained when the documents are sorted using the relevance judgments.
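The measure can be sketched in Python as below. The sketch uses the common formulation in which the first result contributes its relevance undiscounted and the result at position i ≥ 2 is discounted by log2(i); the ideal ordering is obtained by sorting the same relevance scores in decreasing order. The function names `dcg_at` and `ndcg_at` are hypothetical.

```python
import math
from typing import Sequence


def dcg_at(rels: Sequence[float], p: int) -> float:
    """Discounted cumulative gain over the first p relevance scores."""
    top = list(rels[:p])
    if not top:
        return 0.0
    # The first result is undiscounted; position i >= 2 is divided by log2(i).
    return top[0] + sum(r / math.log2(i) for i, r in enumerate(top[1:], start=2))


def ndcg_at(rels: Sequence[float], p: int) -> float:
    """DCG at p divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at(sorted(rels, reverse=True), p)
    return dcg_at(rels, p) / ideal if ideal > 0 else 0.0
```

A ranking that already lists the entities in decreasing order of relevance obtains an nDCG of 1, and any other ordering obtains a lower value.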

We set up the experiment by posing each query and randomly selecting a relevant document. Then, using the selected result, we perform re-ranking of the result list. The re-ranking performance is assessed by computing nDCG for the initial ranking and for the ranking after the re-ranking step.

5.2 Results

From Equations 1, 2 and Section 4, we created Tables 1, 2 and 3 as well as the corresponding graphical representations in Figures 2 to 10 for the cars domain for the years 2007, 2008 and 2009. The experiments show that the RRR1 weighting scheme outperforms the formula that uses the Poisson discrete distribution. This does not necessarily mean that the Poisson distribution is always worse, but probably that the choice of λ has to be made carefully (perhaps using extra experiments), tailored to the dataset used and accompanied by classification techniques. The RRR3 results presented are produced by weighting the entities with both the RRR1 and RRR2 weights. As we had assumed, the re-ranking network performs better than all the other simple ranking methods (for example, an average nDCG@10 of 0.915 against 0.879 for plain BM25, Table 1).

Table 1: Average of nDCG@10 for years 2007, 2008 and 2009.

Method            nDCG
BM25              0.879
BM25 + RRR1       0.895
BM25 + RRR2       0.879
BM25 + RRR3       0.893
BM25 + Network    0.915

Table 2: Average of nDCG@20 for years 2007, 2008 and 2009.

Method            nDCG
BM25              0.879
BM25 + RRR1       0.894
BM25 + RRR2       0.878
BM25 + RRR3       0.893
BM25 + Network    0.913

Table 3: Average of nDCG@50 for years 2007, 2008 and 2009.

Method            nDCG
BM25              0.880
BM25 + RRR1       0.892
BM25 + RRR2       0.879
BM25 + RRR3       0.890
BM25 + Network    0.905

In Figures 2 to 10, we present the comparison of the performance of the simple BM25 model with the weighting scheme RRR1 as well as the re-ranking network. We can observe that, for all 3 years, the re-ranking model and the weighting scheme clearly outperform the classic BM25 model.

Figure 2: Measurements of the nDCG@10 for year 2007.
Figure 3: Measurements of the nDCG@20 for year 2007.
Figure 4: Measurements of the nDCG@50 for year 2007.
Figure 5: Measurements of the nDCG@10 for year 2008.
Figure 6: Measurements of the nDCG@20 for year 2008.
Figure 7: Measurements of the nDCG@50 for year 2008.
Figure 8: Measurements of the nDCG@10 for year 2009.
Figure 9: Measurements of the nDCG@20 for year 2009.
Figure 10: Measurements of the nDCG@50 for year 2009.

6 GENERAL CONCLUSIONS AND FUTURE WORK

In this work we have presented entity ranking techniques using opinions expressed in users’ reviews. We examined three weighting schemas that incorporate clustering information to cope with the entity ranking problem. By implementing the aspects expansion, we aim at more meaningful queries that incorporate additional semantic information; in this way we address the problem of the initially limited collection of corresponding queries. In addition, we propose a probabilistic network scheme that employs a topic identification method so as to modify the ranking of results based on user personalization. Based on this technique, our proposed system rearranges the results; without the need to store the navigation history of each specific user, we achieve a significant improvement in the ranking of the results in terms of user preferences. The proposed methods were evaluated through experiments with the OpinRank Dataset, where we observed improved retrieval performance compared to current ranking methods.

Regarding future work, an attractive issue is the study of an information retrieval model that favors texts (and correspondingly entities) with positive opinions on aspects while imposing penalties for negative opinions. Another interesting point is the incorporation of other knowledge bases besides WordNet, such as YAGO and Wikipedia, so as to further enhance our inference network by embedding additional semantic concepts in it. Further study of the Poisson weight is also planned. Ranking systems based on reviews give us the ability to determine in advance the form of all possible queries. Since the given aspects are extracted, the inserted queries are known a priori. Using this knowledge, we can examine additional characteristics of them and, as a result, incorporate new information retrieval weighting techniques into such ranking systems.

REFERENCES

Abdo, A., Leclere, V., Jacques, P., Salim, N., and Pupin, M.

(2014). Prediction of new bioactive molecules using

a bayesian belief network. In Journal of Chemical

Information and Modeling, Volume 54, Issue 1, pp.

30-36.

Acid, S., de Campos, L. M., Fernandez-Luna, J. M., and

Huete, J. F. (2003). An information retrieval model

based on simple bayesian networks. In International

Review-basedEntity-rankingRefinement

409

Journal of Intelligent Systems, Volume 18, pp. 251-

265.

Amati, G. and van Rijsbergen, C. J. (2002). Probabilis-

tic models of information retrieval based on measur-

ing the divergence from randomness. In ACM Trans-

actions on Information Systems (TOIS), Volume 20,

Number 4, pp. 357-389.

Antoniou, D., Plegas, Y., Tsakalidis A., Tzimas, G. and Vi-

ennas, E. (2012). Dynamic Reﬁnement of Search En-

gines Results Utilizing the User Intervention. In Jour-

nal of Systems and Software, Volume 85, pp. 1577-

1587.

Brandt, C., Joachims, T., Yue, Y., and Bank, J. (2011). Dy-

namic ranked retrieval. In WSDM, pp. 247-256.

Dave, K., Lawrence, S., and Pennock, D. M. (2003). Min-

ing the peanut gallery: Opinion extraction and seman-

tic classiﬁcation of product reviews. In International

Conference on World Wide Web (WWW), pp. 519-528.

Fang, H. and Zhai, C. (2007). Probabilistic models for ex-

pert ﬁnding. In European Conference on IR Research

(ECIR), pp. 418-430.

Ganesan, K. and Zhai, C. (2012). Opinion-based entity ran-

king. In Information Retrieval (IR), Volume 15, Issue

2, pp. 116-150.

Kalervo Jarvelin, J. K. (2000). Ir evaluation methods for

retrieving highly relevant documents. In SIGIR, pp.

41-48.

Lee, J.-W., Kim, H.-J., and Lee, S.-G. (2011). Exploit-

ing taxonomic knowledge for personalized search: A

bayesian belief network-based approach. In Journal of

Information Science and Engineering (JISE), Volume

27, pp. 1413-1433.

Liu, B. (2012). Sentiment Analysis and Opinion Mining.

Morgan and Claypool Publishers.

Liu, T.-Y. (2011). Learning to Rank for Information Re-

trieval. Springer.

Lu, Y., Zhai, C., and Sundaresan, N. (2009). Rated aspect

summarization of short comments. In International

Conference on World Wide Web (WWW), pp. 131-140.

Ma, W. J., Beck, J. M., Latham, P. E., and Pouget, A.

(2006). Bayesian inference with probabilistic popu-

lation codes. In Nature Neuroscience, Volume 9, pp.

1432-1438.

Makris, C., Plegas, Y., Tzimas, G., and Viennas, E. (2013).

Serfsin: Search engines results’ reﬁnement using a

sense-driven inference network. In WEBIST, pp. 222-

232.

Makris, C. and Panagopoulos, P. (2014). Improving

opinion-based entity ranking. In WEBIST, pp. 223-

230.

Meng, W., Yu, C. T., and Liu, K.-L. (2002). Building efﬁ-

cient and effective metasearch engines. In ACM Com-

puting Surveys, Volume 34, Issue 1, pp. 48-89.

Nasukawa, T. and Yi, J. (2003). Sentiment analysis: Captur-

ing favorability using natural language processing. In

International Conference on Knowledge Capture (K-

CAP), pp. 70-77.

Niedermayer, I. S. P. D. (2008). An introduction to bayesian

networks and their contemporary applications. In

Springer Studies in Computational Intelligence, pp.

117-130.

Pang, B. and Lee, L. (2004). A sentimental education:

Sentiment analysis using subjectivity summarization

based on minimum cuts. In Annual Meeting of the

Association for Computational Linguistics (ACL), pp.

271-278.

Pang, B. and Lee, L. (2005). Seeing stars: Exploiting class

relationships for sentiment categorization with respect

to rating scales. In Annual Meeting of the Association

for Computational Linguistics (ACL).

Prabowo, R. and Thelwall, M. (2009). Sentiment analy-

sis: A combined approach. In Journal of Informetrics

(JOI), Volume 3, Issue 2, pp. 143-157.

Robertson, S. E. and Zaragoza, H. (2009). The probabilistic

relevance framework: Bm25 and beyond. In Founda-

tions and Trends in Information Retrieval, Volume 3,

Issue 4, pp. 333-389.

Teevan, J. B. (2001). Improving information retrieval with

textual analysis: Bayesian models and beyond. In

Masters Thesis, MIT Press.

Turney, P. D. and Littman, M. L. (2003). Measuring praise

and criticism: Inference of semantic orientation from

association. In ACM Transactions on Information Sys-

tems (TOIS), Volume 21, Issue 4, pp. 315-346.

Turtle, H. R. (1991). Inference networks for document re-

trieval. In Doctoral Dissertation.

Wang, H., Lu, Y., and Zhai, C. (2010). Latent aspect rat-

ing analysis on review text data: A rating regression

approach. In SIGKDD International Conference on

Knowledge Discovery and Data Mining, pp. 783-792.

Zhai, C. and Lafferty, J. D. (2001). A study of smoothing

methods for language models applied to ad hoc infor-

mation retrieval. In SIGIR, pp. 334342.

WEBIST2015-11thInternationalConferenceonWebInformationSystemsandTechnologies
