Enterprise Search: Learning to Rank with Click-Through Data as a Surrogate for Human Relevance Judgements
Colin Daly, Lucy Hederman
2023
Abstract
Learning to Rank (LTR) has traditionally used relevance judgements (i.e. human annotations) to create training data for ranking models. But gathering feedback in the form of relevance judgements is expensive, time-consuming and may be subject to annotator bias. Commercial web search providers have carried out much research into harnessing click-through data as a surrogate for relevance judgements. Its use in Enterprise Search (ES), however, has not been explored. If relevance feedback from click-through data correlates with human relevance judgements, we could dispense with small relevance judgement training sets and rely entirely on abundant click-through data. We performed a correlation analysis and compared the ranking performance of a ‘real world’ ES service of a large organisation using both relevance judgements and click-through data. We introduce and publish the ENTRP-SRCH dataset specifically for ES. We calculated a correlation coefficient of r = 0.704 (p<0.01). Additionally, the nDCG@3 ranking performance using relevance judgements is just 1.6% higher than when click-through data is used. Finally, we discuss ES implementation trade-offs between relevance judgements and implicit feedback and highlight potential preferences and biases of both end-users and expert annotators.
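The two measurements named in the abstract, a correlation coefficient between the two feedback sources and nDCG@3, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the toy scores and the linear-gain form of DCG are assumptions made for illustration only; the paper may use a different gain function or correlation procedure, and the toy numbers are not drawn from the ENTRP-SRCH dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def dcg_at_k(rels, k):
    """Discounted cumulative gain over the top k results (linear gain, assumed)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))


def ndcg_at_k(rels, k):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0


# Hypothetical toy data: one score per query-document pair.
human_judgements = [3, 2, 0, 1, 2]          # graded annotator labels (assumed scale)
click_scores = [0.9, 0.5, 0.1, 0.3, 0.4]    # click-derived relevance (assumed scale)

print("Pearson r:", pearson_r(human_judgements, click_scores))

# Relevance labels of the documents in the order a ranker returned them.
ranked_labels = [2, 3, 0, 1]
print("nDCG@3:", ndcg_at_k(ranked_labels, 3))
```

In this sketch, a model trained on click-through data and a model trained on relevance judgements would each produce a ranking, and their nDCG@3 values would then be compared against the same ground-truth labels.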
Paper Citation
in Harvard Style
Daly C. and Hederman L. (2023). Enterprise Search: Learning to Rank with Click-Through Data as a Surrogate for Human Relevance Judgements. In Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR; ISBN 978-989-758-671-2, SciTePress, pages 240-247. DOI: 10.5220/0012170200003598
in Bibtex Style
@conference{kdir23,
author={Colin Daly and Lucy Hederman},
title={Enterprise Search: Learning to Rank with Click-Through Data as a Surrogate for Human Relevance Judgements},
booktitle={Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2023},
pages={240-247},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012170200003598},
isbn={978-989-758-671-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Enterprise Search: Learning to Rank with Click-Through Data as a Surrogate for Human Relevance Judgements
SN - 978-989-758-671-2
AU - Daly C.
AU - Hederman L.
PY - 2023
SP - 240
EP - 247
DO - 10.5220/0012170200003598
PB - SciTePress