AGGREGATION OF IMPLICIT FEEDBACKS FROM SEARCH
ENGINE LOG FILES
Ashok Veilumuthu and Parthasarathy Ramachandran
Department of Management Studies, Indian Institute of Science, Bangalore 560 012, India
Keywords:
Ranking Models, Feedback Aggregation, Implicit Feedbacks, Click Sequence, Partial Ordering.
Abstract:
The current approaches to information retrieval from search engines depend heavily on the web linkage
structure, which is a form of relevance judgment by the page authors. However, to overcome spamming
attempts and language semantics, it is important to also incorporate the users' feedback on the documents'
relevance for a particular query. Since users can hardly be motivated to give explicit/direct feedback on search
quality, it becomes necessary to consider implicit feedback that can be collected from search engine logs.
Though a number of implicit feedback measures have been proposed to improve search quality, no standard
methodology has yet been proposed to aggregate those implicit feedbacks meaningfully into a final ranking
of the documents. In this article, we propose an extension to the distance based ranking model to aggregate
different implicit feedbacks based on their expertise in ranking the documents. The proposed approach has
been tested on two implicit feedbacks, namely the click sequence and the time spent in reading a document, extracted from
actual log data of AlltheWeb.com. The results were found to be convincing and indicative of the possibility of
expertise based fusion of implicit feedbacks to arrive at a single ranking of documents for a given query.
1 INTRODUCTION
Search engines are information retrieval systems intended to help users locate the information they need
on the internet by querying. These queries are usually constructed of keywords that express the users' information
needs. Initially, traditional keyword similarity based techniques were used to retrieve the relevant
documents from the web. These techniques rely completely on the content of the documents, over which
only the authors have control. This allows authors to manipulate the search results by tampering with the
documents' content, which led to the failure of the traditional keyword based techniques. Later, in the late
90s, the research community realized the importance of utilizing the linkage structure that exists among the
web documents in the form of hyperlinks to improve the search results. Two seminal works
by (Brin and Page, 1998) and (Kleinberg, 1999) used the linkage structure of the web as a human annotation
of the quality of the documents. The web linkage structure captures the importance of pages to a large extent;
however, it captures only the collective relevance judgment given by the authors of the webpages and not that
of the readers/users. It is true that the collective judgement of authors is reliable information, but the end
users are better placed to judge the credibility of the documents presented to them. Therefore, the users'
feedback would be a valuable source of information to help improve the search results further.
A multitude of studies have proposed ways of obtaining user relevance feedback and methods to incorporate
it into the retrieval engine. The feedback information can be either explicit or implicit. In explicit feedback
based methods, the users are explicitly asked to register their feedback on the documents presented to them.
Such a strategy imposes an increased burden and cognitive load on the users. Further, many users may not be
motivated to provide this information (White et al., 2002). In the case of implicit feedbacks, the users'
interaction with the search system is recorded in the form of a log file, and its entries can be suitably
interpreted to infer the users' relevance judgement on the documents presented. For almost half a decade
there was a debate on substituting explicit feedback with implicit feedback; experiments by (White et al.,
2002) later concluded that implicit feedbacks can be used as a viable alternative to their explicit counterparts.
Since then, many attempts have been made to capture every possible user behavior during the search
and use it later as a proxy for relevance feedback. See, for example, (Kim et al., 2000; Kelly and
Belkin, 2004; Ramachandran, 2005; Agichtein et al., 2006; Veilumuthu and Ramachandran, 2007). Every
feedback has its own advantages and disadvantages, and it would therefore be useful to aggregate these
feedbacks based on their expertise in achieving the ideal ranking. Such an aggregation would help extract
the advantages of each feedback and result in a more accurate and unbiased ranking of documents.
A number of studies in the information retrieval literature have borrowed theories from various fields to
aggregate rankings from multiple sources. The first of this kind was proposed by (Dwork et al., 2001), who
formulated a Markov chain based aggregation of the individual rankings produced by multiple search engines
and studied the impact of local Kemenization in reducing spam. In that work all rankers are weighed equally,
ignoring the importance of the better rankers. (Lebanon and Lafferty, 2002) used distance based ranking
models to propose a formalism for supervised ensemble learning and analyzed the results for partial rankings.
An unsupervised rank aggregation approach using the distance based model was given by (Klementiev et al.,
2008). These two works hinted at the possibility of incorporating the expertise of the rankers while combining
their input rankings. All of these studies address supervised as well as unsupervised rank aggregation in the
context of metasearch, where the length of the top-k list is fixed; no study has been done in the context of
relevance feedback aggregation, where the partial orderings can be of any length. This motivated us to extend
the distance based ranking models to combine the various feedback rankings based on their expertise in
achieving the unknown ideal ranking.
In this paper, we propose a framework to aggregate multiple feedbacks, obtained in the form of partial
rankings from various user sessions, into a single consensus ranking. We extend the distance based ranking
model proposed by Mallows to make this unsupervised aggregation more meaningful. The proposed
aggregation framework has been examined on two implicit feedbacks, namely the click sequence and the
time spent reading a document, extracted from actual log data. The results are found to be encouraging and
confirm the feasibility of such an expertise based aggregation of feedback rankings. Though we discuss only
the aggregation of implicit feedbacks in this study, the construction does not prevent one from using it for
their explicit counterparts. The only requirement is that the feedbacks be easily convertible into partial orders
without much information loss.
2 DISTANCE BASED RANKING
MODELS
Given a set of elements, any meaningful ranking scheme will assume the existence of an ideal ordering of
the elements $\pi_0$ and will tend to arrange the elements in an order close to $\pi_0$. Therefore, it is highly
preferable for a ranking scheme to produce a ranking closer to $\pi_0$ than a ranking farther from it. That is,
the probability of obtaining a permutation $\pi$ should decrease as its distance from $\pi_0$ increases. This is
the basic intuition behind most of the ranking models proposed in the literature. In this paper, we use the
family of distance based ranking models first proposed by (Mallows, 1957) and extended to partial orders by
(Fligner and Verducci, 1986). The two main features that motivated this selection are: (1) it gives a
distributional view of the data and hence an effective way of representation, and (2) its distributional
parameters are easily interpretable; in particular, the dispersion parameter can be interpreted as the expertise
of the ranking scheme (Lebanon and Lafferty, 2002).
Let $X = \{x_1, \ldots, x_k\}$ be the set of items to be ranked by the judges, identified with the indexes
$1, \ldots, k$. We denote the ranking given by a judge by the permutation $\pi = (\pi(1), \ldots, \pi(k))$,
where $\pi(i)$ is the rank given to the item $x_i$ and $\pi^{-1}(i)$ is the index of the item assigned to rank
$i$. If $x_i$ is preferred to $x_j$, then $\pi(i) < \pi(j)$.

We will use $\pi$ and $\pi^{-1}$ as vectors, similar to (Lebanon and Lafferty, 2002), whose $i$th components
are $\pi(i)$ and $\pi^{-1}(i)$ respectively. Thus $\pi$ and $\pi^{-1}$ are the vectors representing the ranking
and the ordering over the set $X$ respectively. If the ranking is over the entire set $X$ then it forms a full
ranking $\pi$, but if the judge ranks only $p < k$ items in $X$ then the resultant ordering is a partial
ordering $\pi^{-1} = (\pi^{-1}(1), \cdots, \pi^{-1}(p))$. For brevity we represent this partial ordering
$\pi^{-1}$ as $\pi^{*}$.
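To make the notation concrete, the following short Python sketch (our own illustration; the variable names are not from the paper) builds the ordering vector $\pi^{-1}$ from a ranking vector $\pi$ and truncates it to a top-$p$ partial ordering $\pi^{*}$.

```python
# Illustrative sketch (our own variable names) of the notation: a full
# ranking pi, the corresponding ordering pi_inv, and a top-p partial
# ordering, for k = 5 items.
k = 5
pi = [3, 1, 2, 5, 4]        # pi[i-1] = rank given to item x_i

# pi_inv[j-1] = index of the item assigned to rank j
pi_inv = [0] * k
for item, rank in enumerate(pi, start=1):
    pi_inv[rank - 1] = item
print(pi_inv)               # [2, 3, 1, 5, 4]: item 2 is ranked first, ...

# A judge who reports only the top p = 3 preferences induces the partial
# ordering pi_star, i.e. the first p entries of the ordering vector.
p = 3
pi_star = pi_inv[:p]
print(pi_star)              # [2, 3, 1]
```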
2.1 Generalized Mallows’ Model
According to the generalized Mallows' model, for a given dispersion parameter $\theta$ and location
parameter $\pi_0$, the judges are assumed to generate their rankings $\pi$ from
$$P(\pi \mid \theta, \pi_0) = \frac{\exp\{-\theta\, d(\pi, \pi_0)\}}{\sum_{\pi} \exp\{-\theta\, d(\pi, \pi_0)\}}, \qquad \pi \in S_k,\ \theta \in \mathbb{R} \qquad (1)$$

where $d(\cdot, \cdot)$ is a right invariant distance metric, $\pi_0$ is a fixed ranking, $\theta$ is the
dispersion parameter and $\psi(\theta)$ denotes the normalizing constant in the denominator. When
$\theta > 0$, the fixed ranking $\pi_0$ is the modal ranking, and as $\theta$ approaches infinity the mass
gets concentrated at the single ranking $\pi_0$. When $\theta = 0$ the distribution is uniform, and for
$\theta < 0$, $\pi_0$ is an antimode. A $\theta \geq 0$ can be interpreted as the expertise of the judges.
The probability of a ranking $\pi$ decreases exponentially with increasing distance from the modal ranking
$\pi_0$.
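As a concrete illustration, the probability (1) can be evaluated for small $k$ by brute-force normalization over all $k!$ permutations, using the Kendall tau distance for $d(\cdot, \cdot)$. The following Python sketch is ours and purely illustrative; it is not code from the paper.

```python
# Minimal sketch of the Mallows model (1) for small k, using the Kendall
# tau distance and brute-force normalization over all k! permutations.
from itertools import permutations
from math import exp

def kendall_distance(pi, sigma):
    """Number of pairwise disagreements between two rank vectors,
    where pi[i] is the rank of item i."""
    k = len(pi)
    return sum(1
               for i in range(k) for j in range(i + 1, k)
               if (pi[i] - pi[j]) * (sigma[i] - sigma[j]) < 0)

def mallows_probability(pi, pi0, theta):
    """P(pi | theta, pi0) from equation (1), with explicit normalization."""
    k = len(pi0)
    num = exp(-theta * kendall_distance(pi, pi0))
    den = sum(exp(-theta * kendall_distance(list(perm), pi0))
              for perm in permutations(range(1, k + 1)))
    return num / den

pi0 = [1, 2, 3, 4]                                           # modal ranking
print(mallows_probability([1, 2, 3, 4], pi0, theta=1.0))     # highest mass
print(mallows_probability([4, 3, 2, 1], pi0, theta=1.0))     # farthest ranking
```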
(Fligner and Verducci, 1986) extended the above Mallows model to the presence of partial orders by viewing
it as a multi-stage ranking process. In the case of partial orderings, a judge reports only his top $p < k$
preferences, denoted by $\pi^{*} = (\pi^{-1}(1), \cdots, \pi^{-1}(p))$. They considered the partial ordering
$\pi^{*}$ as a censored observation from the Mallows distribution (1) and modeled the probability of observing
$\pi^{*}$ as the probability of getting a full ranking from the coset $S_{k-p}\pi$ of all $\pi$ consistent with
$\pi^{*}$.

Let $V_j$ be the number of adjacent transpositions, taken in the order of $\pi_0$, required to place the item
$\pi_0^{-1}(j)$ in the $j$th position. For example, if $\pi_0$ is the identity permutation then $V_j$ is the
number of adjacent transpositions required to place the item $j$ in the $j$th position. $V_1, \cdots, V_p$
depend on $\pi \in S_{k-p}\pi$ only through $\pi^{*}$. The remaining $V_{p+1}, \cdots, V_{k-1}$ take all of
their $(k-p)!$ possible values, and are thereby independent of $\pi^{*}$ and a function of $p$ and $\theta$
alone. The induced model on partial rankings can be expressed as (Fligner and Verducci, 1986):

$$P(\pi^{*} \mid \theta, \pi_0) = \frac{\exp\Bigl(-\theta \sum_{j=1}^{p} V_j(\pi^{*}, \pi_0)\Bigr)}{\prod_{j=1}^{p} \dfrac{1 - \exp\{-(k-j+1)\theta\}}{1 - \exp(-\theta)}} \qquad (2)$$
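The following sketch (our own helper names, using a stage-wise reading of the $V_j$ definition that is consistent with the multi-stage view above) computes the $V_j$ statistics of a top-$p$ partial ordering relative to $\pi_0$ and evaluates the censored probability (2).

```python
# Sketch of the V_j statistics and the censored probability (2) for a
# top-p partial ordering. pi0_order lists all k items in the ideal order,
# best first; pi_star lists the top-p items a judge reported, best first.
from math import exp

def v_statistics(pi_star, pi0_order):
    """V_j = number of still-unplaced items that pi_0 prefers to the item
    chosen for position j (a stage-wise reading of the definition)."""
    position = {item: pos for pos, item in enumerate(pi0_order)}
    remaining = set(pi0_order)
    v = []
    for item in pi_star:
        v.append(sum(1 for other in remaining
                     if position[other] < position[item]))
        remaining.remove(item)
    return v

def partial_order_probability(pi_star, pi0_order, theta):
    """P(pi_star | theta, pi0) from equation (2); requires theta > 0."""
    k = len(pi0_order)
    num = exp(-theta * sum(v_statistics(pi_star, pi0_order)))
    den = 1.0
    for j in range(1, len(pi_star) + 1):
        den *= (1 - exp(-(k - j + 1) * theta)) / (1 - exp(-theta))
    return num / den

print(partial_order_probability([2, 1, 3], [1, 2, 3, 4, 5], theta=0.8))
```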
3 AGGREGATION OF IMPLICIT
FEEDBACKS
As stated earlier, a feedback could be anything that imposes an ordering over a subset of documents for a
particular query. There are many implicit and explicit feedbacks proposed in the literature, and anything that
can be converted into a ranking with little effort is considered a feedback for our purpose. The users are
under no obligation to register their feedback over the entire document list and, therefore, it will be a partial
order. Since it is at the users' discretion, they might give their feedback on document sets of varying size.
This demands a modification to the model (2), in which the length of the generated partial orderings is fixed.
3.1 Model for Partial Orders of Varying
Length
For every length $p$, there exists a probability model (2). Since the length of the partial orders given in the
user sessions varies from one session to another, it follows a probability distribution $P$, which we assume
to be known. Let $p(\cdot)$ be the function which maps a partial ordering $\pi^{*}$ to its length. Then the
extended model can be written as:

$$P(\pi^{*} \mid \theta, \pi_0) = \frac{\exp\Bigl(-\theta \sum_{j=1}^{p(\pi^{*})} V_j(\pi^{*}, \pi_0)\Bigr)\, P(p(\pi^{*}))}{\prod_{j=1}^{p(\pi^{*})} \dfrac{1 - \exp\{-(k-j+1)\theta\}}{1 - \exp(-\theta)}} \qquad (3)$$

$P(p(\pi^{*}))$ is the probability of getting an ordering of length $p(\pi^{*})$. Note that this mass function
$P(\cdot)$ is independent of $\pi_0$ and $\theta$.
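Assuming the length distribution $P(p)$ is estimated empirically from the observed session lengths (an assumption of ours; the paper only requires $P$ to be known), the extension (3) simply rescales (2) by the probability of the observed length. A minimal sketch, reusing partial_order_probability() from the previous block:

```python
# Sketch of the varying-length model (3): scale the censored probability
# (2) by an empirical estimate of the length distribution P(p). Reuses
# partial_order_probability() from the previous sketch.
from collections import Counter

def length_distribution(partial_orders):
    """Empirical mass function P(p) of the observed ordering lengths."""
    counts = Counter(len(po) for po in partial_orders)
    total = sum(counts.values())
    return {length: c / total for length, c in counts.items()}

def extended_probability(pi_star, pi0_order, theta, length_pmf):
    """P(pi_star | theta, pi0) from equation (3)."""
    return (length_pmf[len(pi_star)]
            * partial_order_probability(pi_star, pi0_order, theta))
```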
Consider that there are $n$ judges giving their feedback on each of the $m$ feedback measures. Then the
ordering $\pi^{*(r)}_{s}$ obtained from the feedback $r$ given by judge $s$ can be considered to be generated
from the extended model (3). The log-likelihood function of the model is as follows:

$$L(\pi_0, \theta^{(r)}) = \sum_{s=1}^{n} \Biggl[\, p(\pi^{*(r)}_{s}) \ln\bigl[1 - \exp(-\theta^{(r)})\bigr] + \ln\bigl[P(p(\pi^{*(r)}_{s}))\bigr] - \theta^{(r)} \sum_{j=1}^{p(\pi^{*(r)}_{s})} V_j(\pi^{*(r)}_{s}, \pi_0) - \sum_{j=1}^{p(\pi^{*(r)}_{s})} \ln\bigl[1 - \exp\{-\theta^{(r)} (k-j+1)\}\bigr] \Biggr] \qquad (4)$$
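A sketch of the log-likelihood (4) for one feedback measure $r$, summed over the $n$ session-wise partial orderings observed for a query; it reuses the v_statistics() helper sketched earlier and is valid for $\theta > 0$:

```python
# Sketch of the log-likelihood (4) for one feedback measure, summed over
# the session-wise partial orderings. Reuses v_statistics() from the
# earlier sketch; requires theta > 0.
from math import exp, log

def log_likelihood(theta, pi0_order, partial_orders, length_pmf):
    k = len(pi0_order)
    ll = 0.0
    for pi_star in partial_orders:
        p = len(pi_star)
        ll += p * log(1 - exp(-theta))                # p * ln(1 - e^{-theta})
        ll += log(length_pmf[p])                      # ln P(p)
        ll -= theta * sum(v_statistics(pi_star, pi0_order))
        ll -= sum(log(1 - exp(-(k - j + 1) * theta))  # ln(1 - e^{-(k-j+1)theta})
                  for j in range(1, p + 1))
    return ll
```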
Estimation of $\theta^{(r)}$: The dispersion parameter $\theta^{(r)}$ for a fixed $\pi_0$ can be estimated by
solving the following equation:

$$\frac{\partial L}{\partial \theta^{(r)}} = \sum_{s=1}^{n} \Biggl[\, \frac{p(\pi^{*(r)}_{s})\, e^{-\theta^{(r)}}}{1 - e^{-\theta^{(r)}}} - \sum_{j=1}^{p(\pi^{*(r)}_{s})} V_j(\pi^{*(r)}_{s}, \pi_0) - \sum_{j=1}^{p(\pi^{*(r)}_{s})} \frac{(k-j+1)\, e^{-\theta^{(r)}(k-j+1)}}{1 - e^{-\theta^{(r)}(k-j+1)}} \Biggr] = 0 \qquad (5)$$
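In practice, $\theta^{(r)}$ can be estimated numerically. One option (a sketch, not the paper's implementation) is to maximize the log-likelihood above with a bounded scalar optimizer instead of solving (5) directly; the search bounds below are our own assumption.

```python
# Sketch: estimate theta^{(r)} for a fixed pi_0 by maximizing the
# log-likelihood (4) numerically. Reuses log_likelihood() from the
# previous sketch; the bounds are an assumption, not a value from the paper.
from scipy.optimize import minimize_scalar

def estimate_theta(pi0_order, partial_orders, length_pmf):
    result = minimize_scalar(
        lambda theta: -log_likelihood(theta, pi0_order,
                                      partial_orders, length_pmf),
        bounds=(1e-6, 10.0), method="bounded")
    return result.x
```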
Estimation of $\pi_0$: Since $\pi_0$ has been fixed in the estimation of $\theta^{(r)}$, the estimator
$\hat{\theta}^{(r)}$ will be a function of $\pi_0$, denoted by $\hat{\theta}^{(r)}(\pi_0)$. Therefore, the
ideal ranking estimate $\hat{\pi}_0$ can be obtained by iterating the above equation (5) over all $\pi_0$ and
subsequently substituting the $\hat{\theta}^{(r)}$ and $\pi_0$ values into the log-likelihood function (4) to
get the pair that maximizes it:

$$\hat{\pi}_0 = \arg\max_{\pi_0}\, L(\pi_0, \hat{\theta}^{(r)}(\pi_0)) \qquad (6)$$
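Since the experiments use document lists of only $k = 5$, the search in (6) can be carried out by brute-force enumeration of all candidate orderings, as in the sketch below (reusing estimate_theta() and log_likelihood() from the earlier sketches).

```python
# Sketch of the search (6): enumerate all candidate pi_0 (feasible only
# for small k), re-estimate theta for each, and keep the pair that
# maximizes the log-likelihood.
from itertools import permutations

def estimate_ideal_ordering(items, partial_orders, length_pmf):
    best_ll, best_pi0, best_theta = float("-inf"), None, None
    for candidate in permutations(items):
        pi0 = list(candidate)
        theta_hat = estimate_theta(pi0, partial_orders, length_pmf)
        ll = log_likelihood(theta_hat, pi0, partial_orders, length_pmf)
        if ll > best_ll:
            best_ll, best_pi0, best_theta = ll, pi0, theta_hat
    return best_pi0, best_theta
```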
3.2 Multiple Feedback Aggregation
Despite the unavailability of the ideal ordering of documents $\pi_0$, we assume its existence for all queries
and argue that it is possible to effectively estimate it through the partial orders
$\{\pi^{*(r)} : r = 1, \cdots, m\}$ induced by the observable feedbacks from the user sessions. These partial
orders are proxy information about the ideal ordering, but differ in their precision
$\{\theta^{(r)} : r = 1, \cdots, m\}$. The precision changes not only with the feedbacks but also with the
queries, because the feedbacks' ability to achieve the ideal ranking varies amongst themselves as well as
across queries. Let $\pi^{*} = (\pi^{*(1)}, \cdots, \pi^{*(m)})$ be the vector of all $m$ partial orders given
in a user session, where each component $\pi^{*(r)}$ of this vector follows an extended model (3) with
dispersion parameter $\theta^{(r)}$ and ideal ordering $\pi_0$. Let us denote the vector of all these $m$
dispersion parameters as $\theta = (\theta^{(1)}, \cdots, \theta^{(m)})$.

Formally, for a given ideal ordering $\pi_0$, the partial orders $\{\pi^{*(r)} : r = 1, \cdots, m\}$ extracted
from the $m$ feedbacks given in a user session are assumed to be conditionally independent. Hence, the
probability of getting $\pi^{*}$ for a given $\pi_0$ and $\theta$ is given by:

$$P(\pi^{*} \mid \theta, \pi_0) = \prod_{r=1}^{m} P(\pi^{*(r)} \mid \theta^{(r)}, \pi_0) \qquad (7)$$
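Under the conditional independence assumption (7), the log-likelihoods of the $m$ feedbacks simply add for a given candidate $\pi_0$, each with its own dispersion parameter. The following sketch evaluates this joint objective; searching it over candidate $\pi_0$ (as in the previous sketch) yields the unified estimate.

```python
# Sketch of the joint model (7): for a candidate pi_0, the per-feedback
# log-likelihoods add up, each with its own dispersion parameter.
# Reuses estimate_theta() and log_likelihood() from the earlier sketches.
def joint_log_likelihood(pi0_order, feedbacks, length_pmfs):
    """feedbacks: list of m lists of partial orderings (one list per
    feedback measure); length_pmfs: the matching length distributions."""
    total, thetas = 0.0, []
    for partial_orders, length_pmf in zip(feedbacks, length_pmfs):
        theta_hat = estimate_theta(pi0_order, partial_orders, length_pmf)
        thetas.append(theta_hat)
        total += log_likelihood(theta_hat, pi0_order,
                                partial_orders, length_pmf)
    return total, thetas
```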
3.3 Model Benefits
The main advantages of using the distance based
modeling framework for feedback aggregation are the
following:
1) Since the distance used in the proposed model (3) is the Kendall distance, the maximum likelihood
estimate of $\pi_0$ will be Kemeny optimal, and it enjoys the important rank aggregation properties such as
neutrality and consistency from the social choice literature, widely known as the Condorcet property (Dwork
et al., 2001).

2) It has been reported in (Fligner and Verducci, 1988) that the maximum likelihood estimator for $\pi_0$
can be obtained by arranging the elements based on the vector of average ranks $\bar{\pi}$. This makes the
model computationally viable for larger $n$; a sketch of this shortcut is given after this list.

3) Since sorting based on the average ranking $\bar{\pi}$ is an unbiased estimate of $\pi_0$, the method may
seem similar to Borda's method of rank aggregation in the case of a single implicit feedback. But if there is
more than one feedback ranking that needs to be jointly aggregated, then the proposed method has an edge
over Borda's method, in which the weights to be given to the individual ranking schemes are not obvious.
Being an unsupervised aggregation framework, it estimates the expertise of the individual feedbacks from
the given data rather than requiring it externally.
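A sketch of the average-rank shortcut mentioned in benefit 2 is given below: order the documents by their mean observed rank (cf. Fligner and Verducci, 1988). How documents that never appear in any partial ordering are handled is our own choice, not something specified in the paper.

```python
# Sketch of the average-rank shortcut: order items by their mean observed
# rank. Items that never appear are placed last (our own choice).
from collections import defaultdict

def average_rank_ordering(partial_orders, all_items):
    rank_sum, count = defaultdict(float), defaultdict(int)
    for pi_star in partial_orders:
        for rank, item in enumerate(pi_star, start=1):
            rank_sum[item] += rank
            count[item] += 1
    def mean_rank(item):
        return rank_sum[item] / count[item] if count[item] else float("inf")
    return sorted(all_items, key=mean_rank)

sessions = [[2, 1, 4], [2, 3], [1, 2, 3, 5], [2, 1]]
print(average_rank_ordering(sessions, [1, 2, 3, 4, 5]))   # [2, 1, 3, 4, 5]
```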
4 MODEL EVALUATION
Since the experimental evaluation of implicit feedback is severely hampered by the absence of an independent
evaluation of the documents, the experiments are aimed at establishing the benefits of the proposed framework
stated in Section 3.3.
In this experiment, a 24-hour log recorded on 6th February 2001 by AlltheWeb.com has been used. This
dataset was previously used by (Jansen and Spink, 2005) to study the emerging trends in web searching and
later by (Veilumuthu and Ramachandran, 2007) to verify the existence of incremental information in using
the click sequence and time based implicit feedback measures.

Each tuple in the dataset corresponds to a click event made by a user. The log contains the userID (masked
IP), the clickTime (i.e., the time at which the click was made), the query (masked) posed and the URL (masked)
to which the click was made. In the present study, only the log entries pertaining to two implicit feedbacks,
namely (1) the click sequence based ordering $\pi^{*(o)}_{s}$ and (2) the time based ordering
$\pi^{*(t)}_{s}$, as stated in (Veilumuthu and Ramachandran, 2007), are used. For more details on the order
extraction from log entries, the readers are referred to (Veilumuthu and Ramachandran, 2007). The top 10
non-trivial queries (omitting queries like “google”) that had a sufficient number of sessions (≥ 30) were
chosen for the study. These selected queries formed the query set Q. It is a known fact that the majority of
the sessions will be of length less than 3. Therefore, we picked only the top 5 of the document list formed by
the documents that appear in the top 3 positions in either of the rankings, under the acceptable assumption
that the documents ranked higher in any of the rankings represent the data much better than the others. A
sketch of how such click-sequence orderings can be read off the log tuples is given below.
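As an illustration only, the sketch below shows one straightforward way to read click-sequence orderings off log tuples of the form (userID, clickTime, query, URL); the actual extraction rules used in this study follow (Veilumuthu and Ramachandran, 2007), and the sample records here are made up.

```python
# Illustrative only: derive a click-sequence partial ordering per
# (user, query) session from log tuples (userID, clickTime, query, URL).
# The sample records below are made up; the real log fields are masked.
from collections import defaultdict

log = [
    ("u1", "2001-02-06 10:01:05", "q42", "URL33"),
    ("u1", "2001-02-06 10:01:40", "q42", "URL35"),
    ("u1", "2001-02-06 10:03:02", "q42", "URL31"),
    ("u2", "2001-02-06 11:12:10", "q42", "URL35"),
    ("u2", "2001-02-06 11:13:00", "q42", "URL34"),
]

sessions = defaultdict(list)
for user, click_time, query, url in sorted(log, key=lambda t: (t[0], t[2], t[1])):
    key = (user, query)
    if url not in sessions[key]:           # keep only the first click on each URL
        sessions[key].append(url)

# Each value is a click-sequence partial ordering for that session.
for (user, query), ordering in sessions.items():
    print(user, query, ordering)
```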
Table 1: Model parameters of the proposed and Borda's aggregation models.

Query   Order based                               Time based                                Unified
ID      Proposed π_0   θ̂^(o)   Borda's π_0       Proposed π_0   θ̂^(t)   Borda's π_0       π_0
Q1      1,2,3,5,4      1.23    1,2,3,5,4         1,2,3,5,4      0.94    1,2,3,5,4         1,2,3,5,4
Q2      2,1,3,4,5      0.25    2,1,4,3,5         2,1,4,3,5      0.21    2,1,4,3,5         2,1,3,4,5
Q3      2,5,3,1,4      0.65    2,5,3,1,4         5,2,3,1,4      0.48    5,2,3,1,4         2,5,3,1,4
Q4      4,5,2,1,3      0.51    4,5,2,1,3         4,2,5,1,3      0.47    4,2,5,1,3         4,5,2,1,3
Q5      5,3,4,1,2      0.59    5,3,4,1,2         5,3,4,1,2      0.48    5,3,4,1,2         5,3,4,1,2
Q6      5,4,3,1,2      0.43    5,4,3,1,2         4,5,3,2,1      0.24    3,5,4,1,2         5,4,3,1,2
Q7      1,4,2,3,5      0.28    1,4,2,3,5         1,4,2,3,5      0.33    1,4,2,3,5         1,4,2,3,5
Q8      4,5,1,2,3      0.36    4,5,1,2,3         4,5,1,3,2      0.26    4,5,3,1,2         4,5,1,2,3
Q9      2,5,3,4,1      0.72    2,5,3,4,1         2,5,3,4,1      0.69    2,5,3,4,1         2,5,3,4,1
Q10     2,3,4,1,5      0.52    2,3,4,5,1         2,3,4,5,1      0.48    2,3,5,4,1         2,3,4,5,1
4.1 Results and Discussion
The sessionwise feedback data for the click sequence based and time based feedbacks have been aggregated
through the proposed method and through Borda's method, and the results are tabulated in Table 1.

From Table 1, it can be observed that the $\pi_0$ obtained through the proposed method for the click sequence
based and time based feedbacks is the same as that of the respective Borda's aggregation in 8 out of 10 cases
in both schemes. This supports the fact that the maximum likelihood estimator for $\pi_0$ converges to the
ordering obtained by arranging the elements based on the vector of average ranks $\bar{\pi}$ (Fligner and
Verducci, 1988).

The partial orders obtained for a feedback from the various sessions are all observations from model (3).
For a particular feedback, the sessionwise observations all come from the same population and, therefore,
they all get equal weights in the aggregation process, resulting in an ordering that is the same as that of
Borda's aggregation. In contrast, when aggregating the observations from different feedbacks, which come
from populations differing in their dispersion parameter (precision), fusing them with equal weights would be
misleading. Borda's method nevertheless gives equal weights to the ranking schemes, ignoring the differences
in their accuracies. Even though the weighted Borda method proposes to use precision based weights, it is
less attractive because the precision values are not readily available and would need to be supplied externally,
which is impractical. In the proposed method, the weights (intuitively, the dispersion parameters) are assigned
inherently by the very construction of the model, and hence the feedback rankings are combined into a final
ranking of documents based on their expertise.
From Table 1, it can also be observed that the $\hat{\theta}^{(t)}$s of the time based aggregation are lower
than the respective $\hat{\theta}^{(o)}$s of the click sequence based aggregation. This can be supported by
the argument that the click sequence based information is ordinal whereas the time based information is
continuous. Therefore, the time based ranking is more sensitive, i.e., even a one-second difference in the
reading time will force a document to be ranked differently. Though it will have a modal ranking $\pi_0$,
the mass will not be concentrated heavily on one ranking. This very fact can be seen in the frequency Table 2
for a typical query. Since the time based rankings are widely dispersed, their aggregation parameter
$\theta^{(t)}$ is expected to be smaller than the aggregation parameter $\theta^{(o)}$ of the highly skewed
(biased) click sequence based ranking. This motivates the need for the proposed unified aggregation model
(Section 3.2), which can trade off between the two aggregations and fully extract the advantages of both
feedbacks.
Table 2: Order-based and time-based rank frequency distribution of the documents for a typical query.

         Order-based rank           Time-based rank
URLs     1    2    3   4   5        1    2   3   4   5
URL31    6    4    1   1   0        7    3   1   1   0
URL32    1    5    1   1   1        5    2   1   1   0
URL33    4   12    4   0   0        9    8   3   0   0
URL34   12    6    0   0   0        8    9   0   0   1
URL35   16    2    1   0   0       10    7   2   0   0
From Table 1, it can be seen that for relatively lower values of $\hat{\theta}^{(t)}$ (compared to
$\hat{\theta}^{(o)}$) the unified aggregation model produces a $\pi_0$ equal to that of the click sequence
based aggregation, and for relatively higher values of $\hat{\theta}^{(t)}$ it produces a $\pi_0$ that forms
a trade-off between the two implicit feedback rankings considered. As the log-likelihood of the combined
aggregation model is the summation of the log-likelihoods of its component rankers, it can break the ties
that emerge more frequently when estimating the $\pi_0$s by maximum log-likelihood. The preliminary results
hinted at the possibility of using the unified aggregation of feedbacks with comparably better performance,
but this needs to be tested with a larger dataset for better understanding and assurance.
5 CONCLUSIONS
In this article, we proposed a generalized framework to incorporate multiple feedbacks on page quality from
search engine log files to improve the result quality. Specifically, we considered the click sequence and the
time spent by users in reading a document as measures of a document's importance for a query. We proposed
an extension to the distance based ranking method to jointly aggregate the feedbacks based on their expertise.
This attempt is a precursor demonstrating the feasibility of unsupervised fusion of feedbacks, in the form of
partial orders, into a single ranking of documents that takes their varying levels of accuracy into account.
The experiments were conducted on actual search log data from AlltheWeb.com to demonstrate the strength
of the proposed model in meaningfully combining the feedbacks.
REFERENCES
Agichtein, E., Brill, E., and Dumais, S. (2006). Improving
web search ranking by incorporating user behavior in-
formation. In Procs. SIGIR ’06, pages 19–26, New
York, NY, USA. ACM.
Brin, S. and Page, L. (1998). The anatomy of a large-scale
hypertextual web search engine. Computer Networks
and ISDN Systems, 30(1-7):107–117.
Dwork, C., Kumar, R., Naor, M., and Sivakumar, D. (2001).
Rank aggregation methods for the web. In Procs
WWW ’01, pages 613–622, New York, NY, USA.
ACM.
Fligner, M. A. and Verducci, J. S. (1986). Distance based
ranking models. Journal of the Royal Statistical Soci-
ety. Series B (Methodological), 48(3):359–369.
Fligner, M. A. and Verducci, J. S. (1988). Multistage rank-
ing models. Journal of the American Statistical Asso-
ciation, 83(403):892–901.
Jansen, B. J. and Spink, A. (2005). An analysis of web
searching by european alltheweb.com users. Inf. Pro-
cess. Manage., 41(2):361–381.
Kelly, D. and Belkin, N. J. (2004). Display time as implicit
feedback: understanding task effects. In ACM SIGIR,
pages 377–384.
Kim, J., Oard, D., and Romanik, K. (2000). Using im-
plicit feedback for user modeling in internet and in-
tranet searching. Technical report, College of Library
and Information Services, University of Maryland at
College Park.
Kleinberg, J. M. (1999). Authoritative sources in a hyper-
linked environment. J. ACM, 46(5):604–632.
Klementiev, A., Roth, D., and Small, K. (2008). Unsu-
pervised rank aggregation with distance-based mod-
els. In Procs. ICML ’08, pages 472–479, New York,
NY, USA. ACM.
Lebanon, G. and Lafferty, J. D. (2002). Cranking: Com-
bining rankings using conditional probability models
on permutations. In Procs. ICML ’02, pages 363–370,
San Francisco, CA, USA. Morgan Kaufmann Publish-
ers Inc.
Mallows, C. L. (1957). Non-null ranking models. i.
Biometrika, 44(1/2):114–130.
Ramachandran, P. (2005). Discovering user preferences by
using time entries in click-through data to improve
search engine results. In Discovery Science, pages
383–385.
Veilumuthu, A. and Ramachandran, P. (2007). Discovering
implicit feedbacks from search engine log files. In
Discovery Science, pages 231–242.
White, R., Ruthven, I., and Jose, J. M. (2002). The use
of implicit evidence for relevance feedback in web
retrieval. In Proceedings of the 24th BCS-IRSG Eu-
ropean Colloquium on IR Research, pages 93–109.
Springer-Verlag.