AGGREGATION OF IMPLICIT FEEDBACKS FROM SEARCH
ENGINE LOG FILES
Ashok Veilumuthu and Parthasarathy Ramachandran
Department of Management Studies, Indian Institute of Science, Bangalore 560 012, India
Keywords:
Ranking Models, Feedback Aggregation, Implicit Feedbacks, Click Sequence, Partial Ordering.
Abstract:
The current approaches to information retrieval from search engines depend heavily on the web linkage
structure, which is a form of relevance judgment by the page authors. However, to overcome spamming
attempts and language semantics, it is important to also incorporate the users' feedback on the documents'
relevance for a particular query. Since users can hardly be motivated to give explicit/direct feedback on search
quality, it becomes necessary to consider implicit feedback that can be collected from search engine logs.
Though a number of implicit feedback measures have been proposed to improve search quality, no standard
methodology has yet been proposed to aggregate those implicit feedbacks meaningfully into a final ranking
of the documents. In this article, we propose an extension to the distance based ranking model to aggregate
different implicit feedbacks based on their expertise in ranking the documents. The proposed approach has
been tested on two implicit feedbacks, namely the click sequence and the time spent in reading a document, extracted from
actual log data of AlltheWeb.com. The results were found to be convincing and indicative of the possibility of
expertise based fusion of implicit feedbacks to arrive at a single ranking of documents for a given query.
1 INTRODUCTION
Search engines are information retrieval systems intended to help users locate the information they need
on the internet by querying. These queries are usually constructed of keywords that express the users' information
needs. Initially, traditional keyword similarity based techniques were used to retrieve the relevant
documents from the web. These techniques rely completely on the content of the documents, over which
only the authors have control. This allows authors to manipulate the search results by tampering with the
documents' content, which led to the failure of the traditional keyword based techniques. Later, in the late
90s, the research community realized the importance of utilizing the linkage structure that exists among the
web documents in the form of hyperlinks to improve the search results. Two seminal works
by (Brin and Page, 1998) and (Kleinberg, 1999) used the linkage structure of the web as a human annotation
of the quality of the documents. The web linkage structure captures the importance of pages to a large extent;
however, it captures only the collective relevance judgment given by the authors of the webpages and not that
of the readers/users. It is true that the collective judgement of authors is reliable information, but the end
users are better placed to judge the credibility of the documents presented to them. Therefore, the users'
feedback would be a valuable source of information to help improve the search results further.
A multitude of studies have proposed ways of obtaining user relevance feedback and methods to incorporate
it into the retrieval engine. The feedback information can be either explicit or implicit. In explicit feedback
based methods, the users are explicitly asked to register their feedback on the documents presented to them.
Such a strategy imposes an increased burden and cognitive load on the users. Further, many users may not be
motivated to provide this information (White et al., 2002). In the case of implicit feedbacks, the users'
interaction with the search system is recorded in the form of a log file, and its entries can be suitably
interpreted to infer the users' relevance judgement on the documents presented. For almost half a decade
there was a debate on substituting explicit feedback with implicit feedback; experiments by (White et al.,
2002) later concluded that implicit feedbacks can be used as a viable alternative to their explicit counterparts.
Since then, many attempts have been made to capture every possible user behavior during the search
and use it later as a proxy for relevance feedback. See, for example, (Kim et al., 2000; Kelly and
Belkin, 2004; Ramachandran, 2005; Agichtein et al., 2006; Veilumuthu and Ramachandran, 2007). Every
feedback has its own advantages and disadvantages, and it would therefore be useful to aggregate these
feedbacks based on their expertise in achieving the ideal ranking. Such an aggregation would help extract
the advantages of each feedback and result in a more accurate and unbiased ranking of documents.
A number of studies in the information retrieval literature have borrowed theories from various fields to
aggregate rankings from multiple sources. The first of this kind was proposed by (Dwork et al., 2001), who
formulated a Markov chain based aggregation of the individual rankings produced by multiple search engines
and studied the impact of local Kemenization in reducing spam. In that work all rankers are weighed equally,
ignoring the importance of the better rankers. (Lebanon and Lafferty, 2002) used distance based ranking
models to propose a formalism for supervised ensemble learning and analyzed the results for partial rankings.
An unsupervised rank aggregation approach using the distance based model was given by (Klementiev et al.,
2008). These two works hinted at the possibility of incorporating the expertise of the rankers while combining
their input rankings. All of these studies address supervised as well as unsupervised rank aggregation in the
context of metasearch, where the length of the top-k list is fixed; no study has been done in the context of
relevance feedback aggregation, where the partial orderings can be of any length. This motivated us to extend
the distance based ranking models to combine the various feedback rankings based on their expertise in
achieving the unknown ideal ranking.
In this paper, we propose a framework to aggregate multiple feedbacks, obtained in the form of partial
rankings from various user sessions, into a single consensus ranking. We extend the distance based ranking
model proposed by Mallows to make this unsupervised aggregation more meaningful. The proposed
aggregation framework has been examined on two implicit feedbacks, namely the click sequence and the
time spent reading a document, extracted from actual log data. The results are found to be encouraging and
confirm the feasibility of such an expertise based aggregation of feedback rankings. Though we discuss only
the aggregation of implicit feedbacks in this study, the construction does not prevent one from using it for
their explicit counterparts. The only requirement is that the feedbacks be easily convertible into partial orders
without much information loss.
2 DISTANCE BASED RANKING
MODELS
Given a set of elements, any meaningful ranking scheme will assume the existence of an ideal ordering of
the elements $\pi_0$ and will tend to arrange the elements in an order close to $\pi_0$. Therefore, it is highly
preferable for a ranking scheme to produce a ranking closer to $\pi_0$ than a ranking farther from it. That is,
the probability of obtaining a permutation $\pi$ should decrease as its distance from $\pi_0$ increases. This is
the basic intuition behind most of the ranking models proposed in the literature. In this paper, we use the
family of distance based ranking models first proposed by (Mallows, 1957) and extended to partial orders by
(Fligner and Verducci, 1986). The two main features that motivated this selection are: (1) it gives a
distributional view of the data and hence an effective way of representation, and (2) its distributional
parameters are easily interpretable; in particular, the dispersion parameter can be interpreted as the expertise
of the ranking scheme (Lebanon and Lafferty, 2002).
Let $X = \{x_1, \ldots, x_k\}$ be the set of items to be ranked by the judges, identified with the indexes
$1, \ldots, k$. We denote the ranking given by a judge by the permutation $\pi = (\pi(1), \ldots, \pi(k))$,
where $\pi(i)$ is the rank given to the item $x_i$ and $\pi^{-1}(i)$ is the index of the item assigned to rank
$i$. If $x_i$ is preferred to $x_j$, then $\pi(i) < \pi(j)$.

We will use $\pi$ and $\pi^{-1}$ as vectors, similar to (Lebanon and Lafferty, 2002), whose $i$th components
are $\pi(i)$ and $\pi^{-1}(i)$ respectively. Thus $\pi$ and $\pi^{-1}$ are the vectors representing the ranking
and the ordering over the set $X$ respectively. If the ranking is over the entire set $X$ then it forms a full
ranking $\pi$, but if the judge ranks only $p < k$ items in $X$ then the resultant ordering is a partial
ordering $\pi^{-1} = (\pi^{-1}(1), \cdots, \pi^{-1}(p))$. For brevity we represent this partial ordering
$\pi^{-1}$ as $\pi^{*}$.
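To make the notation concrete, the following short Python sketch (our own illustration; the variable names are not from the paper) builds the ordering vector $\pi^{-1}$ from a ranking vector $\pi$ and truncates it to a top-$p$ partial ordering $\pi^{*}$.

```python
# Illustrative sketch (our own variable names) of the notation: a full
# ranking pi, the corresponding ordering pi_inv, and a top-p partial
# ordering, for k = 5 items.
k = 5
pi = [3, 1, 2, 5, 4]        # pi[i-1] = rank given to item x_i

# pi_inv[j-1] = index of the item assigned to rank j
pi_inv = [0] * k
for item, rank in enumerate(pi, start=1):
    pi_inv[rank - 1] = item
print(pi_inv)               # [2, 3, 1, 5, 4]: item 2 is ranked first, ...

# A judge who reports only the top p = 3 preferences induces the partial
# ordering pi_star, i.e. the first p entries of the ordering vector.
p = 3
pi_star = pi_inv[:p]
print(pi_star)              # [2, 3, 1]
```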
2.1 Generalized Mallows’ Model
According to the generalized Mallows' model, for a given dispersion parameter $\theta$ and location
parameter $\pi_0$, the judges are assumed to generate their rankings $\pi$ from
$$P(\pi \mid \theta, \pi_0) = \frac{\exp\{-\theta\, d(\pi, \pi_0)\}}{\sum_{\pi} \exp\{-\theta\, d(\pi, \pi_0)\}}, \qquad \pi \in S_k,\ \theta \in \mathbb{R} \qquad (1)$$

where $d(\cdot, \cdot)$ is a right invariant distance metric, $\pi_0$ is a fixed ranking, $\theta$ is the
dispersion parameter and $\psi(\theta)$ denotes the normalizing constant in the denominator. When
$\theta > 0$, the fixed ranking $\pi_0$ is the modal ranking, and as $\theta$ approaches infinity the mass
gets concentrated at the single ranking $\pi_0$. When $\theta = 0$ the distribution is uniform, and for
$\theta < 0$, $\pi_0$ is an antimode. A $\theta \geq 0$ can be interpreted as the expertise of the judges.
The probability of a ranking $\pi$ decreases exponentially with increasing distance from the modal ranking
$\pi_0$.
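As a concrete illustration, the probability (1) can be evaluated for small $k$ by brute-force normalization over all $k!$ permutations, using the Kendall tau distance for $d(\cdot, \cdot)$. The following Python sketch is ours and purely illustrative; it is not code from the paper.

```python
# Minimal sketch of the Mallows model (1) for small k, using the Kendall
# tau distance and brute-force normalization over all k! permutations.
from itertools import permutations
from math import exp

def kendall_distance(pi, sigma):
    """Number of pairwise disagreements between two rank vectors,
    where pi[i] is the rank of item i."""
    k = len(pi)
    return sum(1
               for i in range(k) for j in range(i + 1, k)
               if (pi[i] - pi[j]) * (sigma[i] - sigma[j]) < 0)

def mallows_probability(pi, pi0, theta):
    """P(pi | theta, pi0) from equation (1), with explicit normalization."""
    k = len(pi0)
    num = exp(-theta * kendall_distance(pi, pi0))
    den = sum(exp(-theta * kendall_distance(list(perm), pi0))
              for perm in permutations(range(1, k + 1)))
    return num / den

pi0 = [1, 2, 3, 4]                                           # modal ranking
print(mallows_probability([1, 2, 3, 4], pi0, theta=1.0))     # highest mass
print(mallows_probability([4, 3, 2, 1], pi0, theta=1.0))     # farthest ranking
```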
(Fligner and Verducci, 1986) extended the above Mallows model to the presence of partial orders by viewing
it as a multi-stage ranking process. In the case of partial orderings, a judge reports only his top $p < k$
preferences, denoted by $\pi^{*} = (\pi^{-1}(1), \cdots, \pi^{-1}(p))$. They considered the partial ordering
$\pi^{*}$ as a censored observation from the Mallows distribution (1) and modeled the probability of observing
$\pi^{*}$ as the probability of getting a full ranking from the coset $S_{k-p}\pi$ of all $\pi$ consistent with
$\pi^{*}$.

Let $V_j$ be the number of adjacent transpositions, taken in the order of $\pi_0$, required to place the item
$\pi_0^{-1}(j)$ in the $j$th position. For example, if $\pi_0$ is the identity permutation then $V_j$ is the
number of adjacent transpositions required to place the item $j$ in the $j$th position. $V_1, \cdots, V_p$
depend on $\pi \in S_{k-p}\pi$ only through $\pi^{*}$. The remaining $V_{p+1}, \cdots, V_{k-1}$ take all of
their $(k-p)!$ possible values, and are thereby independent of $\pi^{*}$ and a function of $p$ and $\theta$
alone. The induced model on partial rankings can be expressed as (Fligner and Verducci, 1986):

$$P(\pi^{*} \mid \theta, \pi_0) = \frac{\exp\Bigl(-\theta \sum_{j=1}^{p} V_j(\pi^{*}, \pi_0)\Bigr)}{\prod_{j=1}^{p} \dfrac{1 - \exp\{-(k-j+1)\theta\}}{1 - \exp(-\theta)}} \qquad (2)$$
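The following sketch (our own helper names, using a stage-wise reading of the $V_j$ definition that is consistent with the multi-stage view above) computes the $V_j$ statistics of a top-$p$ partial ordering relative to $\pi_0$ and evaluates the censored probability (2).

```python
# Sketch of the V_j statistics and the censored probability (2) for a
# top-p partial ordering. pi0_order lists all k items in the ideal order,
# best first; pi_star lists the top-p items a judge reported, best first.
from math import exp

def v_statistics(pi_star, pi0_order):
    """V_j = number of still-unplaced items that pi_0 prefers to the item
    chosen for position j (a stage-wise reading of the definition)."""
    position = {item: pos for pos, item in enumerate(pi0_order)}
    remaining = set(pi0_order)
    v = []
    for item in pi_star:
        v.append(sum(1 for other in remaining
                     if position[other] < position[item]))
        remaining.remove(item)
    return v

def partial_order_probability(pi_star, pi0_order, theta):
    """P(pi_star | theta, pi0) from equation (2); requires theta > 0."""
    k = len(pi0_order)
    num = exp(-theta * sum(v_statistics(pi_star, pi0_order)))
    den = 1.0
    for j in range(1, len(pi_star) + 1):
        den *= (1 - exp(-(k - j + 1) * theta)) / (1 - exp(-theta))
    return num / den

print(partial_order_probability([2, 1, 3], [1, 2, 3, 4, 5], theta=0.8))
```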
3 AGGREGATION OF IMPLICIT
FEEDBACKS
As stated earlier, a feedback could be anything that imposes an ordering over a subset of documents for a
particular query. There are many implicit and explicit feedbacks proposed in the literature, and anything that
can be converted into a ranking with little effort is considered a feedback for our purpose. The users are
under no obligation to register their feedback over the entire document list and, therefore, it will be a partial
order. Since it is at the users' discretion, they might give their feedback on document sets of varying size.
This demands a modification to the model (2), in which the length of the generated partial orderings is fixed.
3.1 Model for Partial Orders of Varying
Length
For every length $p$, there exists a probability model (2). Since the length of the partial orders given in the
user sessions varies from one session to another, it follows a probability distribution $P$, which we assume
to be known. Let $p(\cdot)$ be the function which maps a partial ordering $\pi^{*}$ to its length. Then the
extended model can be written as:

$$P(\pi^{*} \mid \theta, \pi_0) = \frac{\exp\Bigl(-\theta \sum_{j=1}^{p(\pi^{*})} V_j(\pi^{*}, \pi_0)\Bigr)\, P(p(\pi^{*}))}{\prod_{j=1}^{p(\pi^{*})} \dfrac{1 - \exp\{-(k-j+1)\theta\}}{1 - \exp(-\theta)}} \qquad (3)$$

$P(p(\pi^{*}))$ is the probability of getting an ordering of length $p(\pi^{*})$. Note that this mass function
$P(\cdot)$ is independent of $\pi_0$ and $\theta$.
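Assuming the length distribution $P(p)$ is estimated empirically from the observed session lengths (an assumption of ours; the paper only requires $P$ to be known), the extension (3) simply rescales (2) by the probability of the observed length. A minimal sketch, reusing partial_order_probability() from the previous block:

```python
# Sketch of the varying-length model (3): scale the censored probability
# (2) by an empirical estimate of the length distribution P(p). Reuses
# partial_order_probability() from the previous sketch.
from collections import Counter

def length_distribution(partial_orders):
    """Empirical mass function P(p) of the observed ordering lengths."""
    counts = Counter(len(po) for po in partial_orders)
    total = sum(counts.values())
    return {length: c / total for length, c in counts.items()}

def extended_probability(pi_star, pi0_order, theta, length_pmf):
    """P(pi_star | theta, pi0) from equation (3)."""
    return (length_pmf[len(pi_star)]
            * partial_order_probability(pi_star, pi0_order, theta))
```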
Consider that there are $n$ judges giving their feedback on each of the $m$ feedback measures. Then the
ordering $\pi^{*(r)}_{s}$ obtained from the feedback $r$ given by judge $s$ can be considered to be generated
from the extended model (3). The log-likelihood function of the model is as follows:

$$L(\pi_0, \theta^{(r)}) = \sum_{s=1}^{n} \Biggl[\, p(\pi^{*(r)}_{s}) \ln\bigl[1 - \exp(-\theta^{(r)})\bigr] + \ln\bigl[P(p(\pi^{*(r)}_{s}))\bigr] - \theta^{(r)} \sum_{j=1}^{p(\pi^{*(r)}_{s})} V_j(\pi^{*(r)}_{s}, \pi_0) - \sum_{j=1}^{p(\pi^{*(r)}_{s})} \ln\bigl[1 - \exp\{-\theta^{(r)} (k-j+1)\}\bigr] \Biggr] \qquad (4)$$
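A sketch of the log-likelihood (4) for one feedback measure $r$, summed over the $n$ session-wise partial orderings observed for a query; it reuses the v_statistics() helper sketched earlier and is valid for $\theta > 0$:

```python
# Sketch of the log-likelihood (4) for one feedback measure, summed over
# the session-wise partial orderings. Reuses v_statistics() from the
# earlier sketch; requires theta > 0.
from math import exp, log

def log_likelihood(theta, pi0_order, partial_orders, length_pmf):
    k = len(pi0_order)
    ll = 0.0
    for pi_star in partial_orders:
        p = len(pi_star)
        ll += p * log(1 - exp(-theta))                # p * ln(1 - e^{-theta})
        ll += log(length_pmf[p])                      # ln P(p)
        ll -= theta * sum(v_statistics(pi_star, pi0_order))
        ll -= sum(log(1 - exp(-(k - j + 1) * theta))  # ln(1 - e^{-(k-j+1)theta})
                  for j in range(1, p + 1))
    return ll
```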
Estimation of $\theta^{(r)}$: The dispersion parameter $\theta^{(r)}$ for a fixed $\pi_0$ can be estimated by
solving the following equation:

$$\frac{\partial L}{\partial \theta^{(r)}} = \sum_{s=1}^{n} \Biggl[\, \frac{p(\pi^{*(r)}_{s})\, e^{-\theta^{(r)}}}{1 - e^{-\theta^{(r)}}} - \sum_{j=1}^{p(\pi^{*(r)}_{s})} V_j(\pi^{*(r)}_{s}, \pi_0) - \sum_{j=1}^{p(\pi^{*(r)}_{s})} \frac{(k-j+1)\, e^{-\theta^{(r)}(k-j+1)}}{1 - e^{-\theta^{(r)}(k-j+1)}} \Biggr] = 0 \qquad (5)$$
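In practice, $\theta^{(r)}$ can be estimated numerically. One option (a sketch, not the paper's implementation) is to maximize the log-likelihood above with a bounded scalar optimizer instead of solving (5) directly; the search bounds below are our own assumption.

```python
# Sketch: estimate theta^{(r)} for a fixed pi_0 by maximizing the
# log-likelihood (4) numerically. Reuses log_likelihood() from the
# previous sketch; the bounds are an assumption, not a value from the paper.
from scipy.optimize import minimize_scalar

def estimate_theta(pi0_order, partial_orders, length_pmf):
    result = minimize_scalar(
        lambda theta: -log_likelihood(theta, pi0_order,
                                      partial_orders, length_pmf),
        bounds=(1e-6, 10.0), method="bounded")
    return result.x
```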
Estimation of $\pi_0$: Since $\pi_0$ has been fixed in the estimation of $\theta^{(r)}$, the estimator
$\hat{\theta}^{(r)}$ will be a function of $\pi_0$, denoted by $\hat{\theta}^{(r)}(\pi_0)$. Therefore, the
ideal ranking estimate $\hat{\pi}_0$ can be obtained by iterating the above equation (5) over all $\pi_0$ and
subsequently substituting the $\hat{\theta}^{(r)}$ and $\pi_0$ values into the log-likelihood function (4) to
get the pair that maximizes it:

$$\hat{\pi}_0 = \arg\max_{\pi_0}\, L(\pi_0, \hat{\theta}^{(r)}(\pi_0)) \qquad (6)$$
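Since the experiments use document lists of only $k = 5$, the search in (6) can be carried out by brute-force enumeration of all candidate orderings, as in the sketch below (reusing estimate_theta() and log_likelihood() from the earlier sketches).

```python
# Sketch of the search (6): enumerate all candidate pi_0 (feasible only
# for small k), re-estimate theta for each, and keep the pair that
# maximizes the log-likelihood.
from itertools import permutations

def estimate_ideal_ordering(items, partial_orders, length_pmf):
    best_ll, best_pi0, best_theta = float("-inf"), None, None
    for candidate in permutations(items):
        pi0 = list(candidate)
        theta_hat = estimate_theta(pi0, partial_orders, length_pmf)
        ll = log_likelihood(theta_hat, pi0, partial_orders, length_pmf)
        if ll > best_ll:
            best_ll, best_pi0, best_theta = ll, pi0, theta_hat
    return best_pi0, best_theta
```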
3.2 Multiple Feedback Aggregation
Despite the unavailability of the ideal ordering of documents $\pi_0$, we assume its existence for all queries
and argue that it is possible to effectively estimate it through the partial orders
$\{\pi^{*(r)} : r = 1, \cdots, m\}$ induced by the observable feedbacks from the user sessions. These partial
orders are proxy information about the ideal ordering, but differ in their precision
$\{\theta^{(r)} : r = 1, \cdots, m\}$. The precision changes not only with the feedbacks but also with the
queries, because the feedbacks' ability to achieve the ideal ranking varies amongst themselves as well as
across queries. Let $\pi^{*} = (\pi^{*(1)}, \cdots, \pi^{*(m)})$ be the vector of all $m$ partial orders given
in a user session, where each component $\pi^{*(r)}$ of this vector follows an extended model (3) with
dispersion parameter $\theta^{(r)}$ and ideal ordering $\pi_0$. Let us denote the vector of all these $m$
dispersion parameters as $\theta = (\theta^{(1)}, \cdots, \theta^{(m)})$.

Formally, for a given ideal ordering $\pi_0$, the partial orders $\{\pi^{*(r)} : r = 1, \cdots, m\}$ extracted
from the $m$ feedbacks given in a user session are assumed to be conditionally independent. Hence, the
probability of getting $\pi^{*}$ for a given $\pi_0$ and $\theta$ is given by:

$$P(\pi^{*} \mid \theta, \pi_0) = \prod_{r=1}^{m} P(\pi^{*(r)} \mid \theta^{(r)}, \pi_0) \qquad (7)$$
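Under the conditional independence assumption (7), the log-likelihoods of the $m$ feedbacks simply add for a given candidate $\pi_0$, each with its own dispersion parameter. The following sketch evaluates this joint objective; searching it over candidate $\pi_0$ (as in the previous sketch) yields the unified estimate.

```python
# Sketch of the joint model (7): for a candidate pi_0, the per-feedback
# log-likelihoods add up, each with its own dispersion parameter.
# Reuses estimate_theta() and log_likelihood() from the earlier sketches.
def joint_log_likelihood(pi0_order, feedbacks, length_pmfs):
    """feedbacks: list of m lists of partial orderings (one list per
    feedback measure); length_pmfs: the matching length distributions."""
    total, thetas = 0.0, []
    for partial_orders, length_pmf in zip(feedbacks, length_pmfs):
        theta_hat = estimate_theta(pi0_order, partial_orders, length_pmf)
        thetas.append(theta_hat)
        total += log_likelihood(theta_hat, pi0_order,
                                partial_orders, length_pmf)
    return total, thetas
```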
3.3 Model Benefits
The main advantages of using the distance based
modeling framework for feedback aggregation are the
following:
1) Since the distance used in the proposed model (3) is the Kendall distance, the maximum likelihood
estimate of $\pi_0$ will be Kemeny optimal, and it enjoys the important rank aggregation properties such as
neutrality and consistency from the social choice literature, widely known as the Condorcet property (Dwork
et al., 2001).

2) It has been reported in (Fligner and Verducci, 1988) that the maximum likelihood estimator for $\pi_0$
can be obtained by arranging the elements based on the vector of average ranks $\bar{\pi}$. This makes the
model computationally viable for larger $n$; a sketch of this shortcut is given after this list.

3) Since sorting based on the average ranking $\bar{\pi}$ is an unbiased estimate of $\pi_0$, the method may
seem similar to Borda's method of rank aggregation in the case of a single implicit feedback. But if there is
more than one feedback ranking that needs to be jointly aggregated, then the proposed method has an edge
over Borda's method, in which the weights to be given to the individual ranking schemes are not obvious.
Being an unsupervised aggregation framework, it estimates the expertise of the individual feedbacks from
the given data rather than requiring it externally.
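A sketch of the average-rank shortcut mentioned in benefit 2 is given below: order the documents by their mean observed rank (cf. Fligner and Verducci, 1988). How documents that never appear in any partial ordering are handled is our own choice, not something specified in the paper.

```python
# Sketch of the average-rank shortcut: order items by their mean observed
# rank. Items that never appear are placed last (our own choice).
from collections import defaultdict

def average_rank_ordering(partial_orders, all_items):
    rank_sum, count = defaultdict(float), defaultdict(int)
    for pi_star in partial_orders:
        for rank, item in enumerate(pi_star, start=1):
            rank_sum[item] += rank
            count[item] += 1
    def mean_rank(item):
        return rank_sum[item] / count[item] if count[item] else float("inf")
    return sorted(all_items, key=mean_rank)

sessions = [[2, 1, 4], [2, 3], [1, 2, 3, 5], [2, 1]]
print(average_rank_ordering(sessions, [1, 2, 3, 4, 5]))   # [2, 1, 3, 4, 5]
```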
4 MODEL EVALUATION
Since the experimental evaluation of implicit feedback is severely hampered by the absence of an independent
evaluation of the documents, the experiments are aimed at establishing the benefits of the proposed framework
stated in Section 3.3.
In this experiment, a 24-hour log recorded on 6th February 2001 by AlltheWeb.com has been used. This
dataset was previously used by (Jansen and Spink, 2005) to study the emerging trends in web searching and
later by (Veilumuthu and Ramachandran, 2007) to verify the existence of incremental information in using
the click sequence and time based implicit feedback measures.

Each tuple in the dataset corresponds to a click event made by a user. The log contains the userID (masked
IP), the clickTime (i.e., the time at which the click was made), the query (masked) posed and the URL (masked)
to which the click was made. In the present study, only the log entries pertaining to two implicit feedbacks,
namely (1) the click sequence based ordering $\pi^{*(o)}_{s}$ and (2) the time based ordering
$\pi^{*(t)}_{s}$, as stated in (Veilumuthu and Ramachandran, 2007), are used. For more details on the order
extraction from log entries, the readers are referred to (Veilumuthu and Ramachandran, 2007). The top 10
non-trivial queries (omitting queries like “google”) that had a sufficient number of sessions (≥ 30) were
chosen for the study. These selected queries formed the query set Q. It is a known fact that the majority of
the sessions will be of length less than 3. Therefore, we picked only the top 5 of the document list formed by
the documents that appear in the top 3 positions in either of the rankings, under the acceptable assumption
that the documents ranked higher in any of the rankings represent the data much better than the others. A
sketch of how such click-sequence orderings can be read off the log tuples is given below.
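As an illustration only, the sketch below shows one straightforward way to read click-sequence orderings off log tuples of the form (userID, clickTime, query, URL); the actual extraction rules used in this study follow (Veilumuthu and Ramachandran, 2007), and the sample records here are made up.

```python
# Illustrative only: derive a click-sequence partial ordering per
# (user, query) session from log tuples (userID, clickTime, query, URL).
# The sample records below are made up; the real log fields are masked.
from collections import defaultdict

log = [
    ("u1", "2001-02-06 10:01:05", "q42", "URL33"),
    ("u1", "2001-02-06 10:01:40", "q42", "URL35"),
    ("u1", "2001-02-06 10:03:02", "q42", "URL31"),
    ("u2", "2001-02-06 11:12:10", "q42", "URL35"),
    ("u2", "2001-02-06 11:13:00", "q42", "URL34"),
]

sessions = defaultdict(list)
for user, click_time, query, url in sorted(log, key=lambda t: (t[0], t[2], t[1])):
    key = (user, query)
    if url not in sessions[key]:           # keep only the first click on each URL
        sessions[key].append(url)

# Each value is a click-sequence partial ordering for that session.
for (user, query), ordering in sessions.items():
    print(user, query, ordering)
```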
Table 1: Model parameters of the proposed and Borda's aggregation models.

Query   Order based                               Time based                                Unified
ID      Proposed π_0   θ̂^(o)   Borda's π_0       Proposed π_0   θ̂^(t)   Borda's π_0       π_0
Q1      1,2,3,5,4      1.23    1,2,3,5,4         1,2,3,5,4      0.94    1,2,3,5,4         1,2,3,5,4
Q2      2,1,3,4,5      0.25    2,1,4,3,5         2,1,4,3,5      0.21    2,1,4,3,5         2,1,3,4,5
Q3      2,5,3,1,4      0.65    2,5,3,1,4         5,2,3,1,4      0.48    5,2,3,1,4         2,5,3,1,4
Q4      4,5,2,1,3      0.51    4,5,2,1,3         4,2,5,1,3      0.47    4,2,5,1,3         4,5,2,1,3
Q5      5,3,4,1,2      0.59    5,3,4,1,2         5,3,4,1,2      0.48    5,3,4,1,2         5,3,4,1,2
Q6      5,4,3,1,2      0.43    5,4,3,1,2         4,5,3,2,1      0.24    3,5,4,1,2         5,4,3,1,2
Q7      1,4,2,3,5      0.28    1,4,2,3,5         1,4,2,3,5      0.33    1,4,2,3,5         1,4,2,3,5
Q8      4,5,1,2,3      0.36    4,5,1,2,3         4,5,1,3,2      0.26    4,5,3,1,2         4,5,1,2,3
Q9      2,5,3,4,1      0.72    2,5,3,4,1         2,5,3,4,1      0.69    2,5,3,4,1         2,5,3,4,1
Q10     2,3,4,1,5      0.52    2,3,4,5,1         2,3,4,5,1      0.48    2,3,5,4,1         2,3,4,5,1
4.1 Results and Discussion
The sessionwise feedback data for the click sequence based and time based feedbacks have been aggregated
through the proposed method and through Borda's method, and the results are tabulated in Table 1.

From Table 1, it can be observed that the $\pi_0$ obtained through the proposed method for the click sequence
based and time based feedbacks is the same as that of the respective Borda's aggregation in 8 out of 10 cases
in both schemes. This supports the fact that the maximum likelihood estimator for $\pi_0$ converges to the
ordering obtained by arranging the elements based on the vector of average ranks $\bar{\pi}$ (Fligner and
Verducci, 1988).

The partial orders obtained for a feedback from the various sessions are all observations from model (3).
For a particular feedback, the sessionwise observations all come from the same population and, therefore,
they all get equal weights in the aggregation process, resulting in an ordering that is the same as that of
Borda's aggregation. In contrast, when aggregating the observations from different feedbacks, which come
from populations differing in their dispersion parameter (precision), fusing them with equal weights would be
misleading. Borda's method nevertheless gives equal weights to the ranking schemes, ignoring the differences
in their accuracies. Even though the weighted Borda method proposes to use precision based weights, it is
less attractive because the precision values are not readily available and would need to be supplied externally,
which is impractical. In the proposed method, the weights (intuitively, the dispersion parameters) are assigned
inherently by the very construction of the model, and hence the feedback rankings are combined into a final
ranking of documents based on their expertise.
From Table 1, it can also be observed that the $\hat{\theta}^{(t)}$s of the time based aggregation are lower
than the respective $\hat{\theta}^{(o)}$s of the click sequence based aggregation. This can be supported by
the argument that the click sequence based information is ordinal whereas the time based information is
continuous. Therefore, the time based ranking is more sensitive, i.e., even a one-second difference in the
reading time will force a document to be ranked differently. Though it will have a modal ranking $\pi_0$,
the mass will not be concentrated heavily on one ranking. This very fact can be seen in the frequency Table 2
for a typical query. Since the time based rankings are widely dispersed, their aggregation parameter
$\theta^{(t)}$ is expected to be smaller than the aggregation parameter $\theta^{(o)}$ of the highly skewed
(biased) click sequence based ranking. This motivates the need for the proposed unified aggregation model
(Section 3.2), which can trade off between the two aggregations and fully extract the advantages of both
feedbacks.
Table 2: Order-based and time-based rank frequency distribution of the documents for a typical query.

         Order-based rank           Time-based rank
URLs     1    2    3   4   5        1    2   3   4   5
URL31    6    4    1   1   0        7    3   1   1   0
URL32    1    5    1   1   1        5    2   1   1   0
URL33    4   12    4   0   0        9    8   3   0   0
URL34   12    6    0   0   0        8    9   0   0   1
URL35   16    2    1   0   0       10    7   2   0   0
From Table 1, it can be seen that for relatively lower values of $\hat{\theta}^{(t)}$ (compared to
$\hat{\theta}^{(o)}$) the unified aggregation model produces a $\pi_0$ equal to that of the click sequence
based aggregation, and for relatively higher values of $\hat{\theta}^{(t)}$ it produces a $\pi_0$ that forms
a trade-off between the two implicit feedback rankings considered. As the log-likelihood of the combined
aggregation model is the summation of the log-likelihoods of its component rankers, it can break the ties
that emerge more frequently when estimating the $\pi_0$s by maximum log-likelihood. The preliminary results
hinted at the possibility of using the unified aggregation of feedbacks with comparably better performance,
but this needs to be tested with a larger dataset for better understanding and assurance.
5 CONCLUSIONS
In this article, we proposed a generalized framework to incorporate multiple feedbacks on page quality from
search engine log files to improve the result quality. Specifically, we considered the click sequence and the
time spent by users in reading a document as measures of a document's importance for a query. We proposed
an extension to the distance based ranking method to jointly aggregate the feedbacks based on their expertise.
This attempt is a precursor demonstrating the feasibility of unsupervised fusion of feedbacks, in the form of
partial orders, into a single ranking of documents that takes their varying levels of accuracy into account.
The experiments were conducted on actual search log data from AlltheWeb.com to demonstrate the strength
of the proposed model in meaningfully combining the feedbacks.
REFERENCES
Agichtein, E., Brill, E., and Dumais, S. (2006). Improving
web search ranking by incorporating user behavior in-
formation. In Procs. SIGIR ’06, pages 19–26, New
York, NY, USA. ACM.
Brin, S. and Page, L. (1998). The anatomy of a large-scale
hypertextual web search engine. Computer Networks
and ISDN Systems, 30(1-7):107–117.
Dwork, C., Kumar, R., Naor, M., and Sivakumar, D. (2001).
Rank aggregation methods for the web. In Procs
WWW ’01, pages 613–622, New York, NY, USA.
ACM.
Fligner, M. A. and Verducci, J. S. (1986). Distance based
ranking models. Journal of the Royal Statistical Soci-
ety. Series B (Methodological), 48(3):359–369.
Fligner, M. A. and Verducci, J. S. (1988). Multistage rank-
ing models. Journal of the American Statistical Asso-
ciation, 83(403):892–901.
Jansen, B. J. and Spink, A. (2005). An analysis of web
searching by european alltheweb.com users. Inf. Pro-
cess. Manage., 41(2):361–381.
Kelly, D. and Belkin, N. J. (2004). Display time as implicit
feedback: understanding task effects. In ACM SIGIR,
pages 377–384.
Kim, J., Oard, D., and Romanik, K. (2000). Using im-
plicit feedback for user modeling in internet and in-
tranet searching. Technical report, College of Library
and Information Services, University of Maryland at
College Park.
Kleinberg, J. M. (1999). Authoritative sources in a hyper-
linked environment. J. ACM, 46(5):604–632.
Klementiev, A., Roth, D., and Small, K. (2008). Unsu-
pervised rank aggregation with distance-based mod-
els. In Procs. ICML ’08, pages 472–479, New York,
NY, USA. ACM.
Lebanon, G. and Lafferty, J. D. (2002). Cranking: Com-
bining rankings using conditional probability models
on permutations. In Procs. ICML ’02, pages 363–370,
San Francisco, CA, USA. Morgan Kaufmann Publish-
ers Inc.
Mallows, C. L. (1957). Non-null ranking models. i.
Biometrika, 44(1/2):114–130.
Ramachandran, P. (2005). Discovering user preferences by
using time entries in click-through data to improve
search engine results. In Discovery Science, pages
383–385.
Veilumuthu, A. and Ramachandran, P. (2007). Discovering
implicit feedbacks from search engine log files. In
Discovery Science, pages 231–242.
White, R., Ruthven, I., and Jose, J. M. (2002). The use
of implicit evidence for relevance feedback in web
retrieval. In Proceedings of the 24th BCS-IRSG Eu-
ropean Colloquium on IR Research, pages 93–109.
Springer-Verlag.