Figure 4: A bar chart showing the performance differences
between explicit and implicit feedback at various nDCG
cutoffs. For nDCG@3, the score obtained with human relevance
judgements is just 1.6% higher than the score obtained with
click-through feedback. For nDCG@5, nDCG@10 and nDCG@20,
using CTR as ground truth achieves higher scores than
human judgements.
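
For readers unfamiliar with the cutoffs compared in Figure 4, the following is a minimal sketch of how nDCG@k can be computed for a single ranked list. It assumes the linear-gain formulation with a log2 rank discount; the exact variant used in our experiments is not restated here, and the graded labels in the example are hypothetical.

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k graded relevance values."""
    # rank is 0-based, so the discount for the top result is log2(2) = 1.
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))

def ndcg_at_k(gains, k):
    """DCG@k normalised by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0

# Hypothetical graded labels for one ranked result list (illustrative only).
ranked_gains = [3, 2, 3, 0, 1, 2]
scores = {k: ndcg_at_k(ranked_gains, k) for k in (3, 5, 10, 20)}
```

The same function can be applied with either human judgements or click-derived labels as the gains, which is the comparison the figure summarises. When a list is shorter than the cutoff, as in this example for k = 10 and k = 20, the two scores coincide.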

Future work may include mitigating the bias identified in
both approaches, e.g. by applying inverse propensity
scoring or by introducing more diversity into annotator
selection.
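
As an illustration of the inverse propensity idea mentioned above, the sketch below reweights logged clicks by the inverse of an assumed examination probability. The position-bias model (1/rank)^eta, the parameter `eta`, and the log format are all assumptions made for this example; they are not part of the present study.

```python
def examination_propensity(rank, eta=1.0):
    """Assumed position-bias model: chance that a result at 1-based rank is examined."""
    return (1.0 / rank) ** eta

def ips_weighted_clicks(click_log, eta=1.0):
    """Sum the clicks per document, each weighted by 1 / propensity of its display rank."""
    weights = {}
    for doc_id, rank, clicked in click_log:
        if clicked:
            weight = 1.0 / examination_propensity(rank, eta)
            weights[doc_id] = weights.get(doc_id, 0.0) + weight
    return weights

# Hypothetical log entries: (document id, 1-based display rank, clicked?)
log = [("a", 1, True), ("b", 4, True), ("a", 1, True), ("c", 2, False)]
debiased = ips_weighted_clicks(log)
```

Under these assumptions, a single click at rank 4 counts as much as four clicks at rank 1, which compensates for lower-ranked results being seen less often. In practice the propensities would need to be estimated from data rather than fixed in advance.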

Enterprise content is diverse and differs from one
organisation to the next. The generalisability of the
ENTRP-SRCH dataset is therefore limited. However, since
click-through feedback is cheap and abundant compared to
human relevance judgements, our findings on correlation
and ranking performance within our organisation may
present a crucial cost-saving opportunity to other
organisations considering which type of feedback they
should adopt for learning to rank in the context of
Enterprise Search.

This research was conducted with the financial support
of Science Foundation Ireland under Grant Agreement
No. 13/RC/2106_P2 at the ADAPT SFI Research Centre at
Trinity College Dublin. ADAPT, the SFI Research Centre
for AI-Driven Digital Content Technology, is funded by
Science Foundation Ireland through the SFI Research
Centres Programme.
