Combining Learning-to-Rank with Clustering

Efstathios Lempesis, Christos Makris


This paper aims to combine learning-to-rank methods with an existing clustering of the entities to be ranked. In recent years, learning-to-rank has attracted the interest of many researchers, and a large number of algorithmic approaches and methods have been published. Existing learning-to-rank methods aim to automatically construct a ranking model from training data. Usually, these methods do not take the structure of the data into consideration. Although a novel task named “Relational Ranking” tries to account for the inter-relationships between documents, it has restrictions and is difficult to apply in many real applications. To address this problem, we create a per-query clustering from our training data using state-of-the-art algorithms. Then, we experimentally verify the effect of clustering on the learning-to-rank methods.
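As a rough illustration of what a per-query clustering step might look like, the sketch below groups training rows by query, clusters each query's documents separately, and appends the cluster id to each feature vector before any ranker is trained. The abstract names neither the clustering algorithm nor how cluster information reaches the ranker, so the tiny k-means, the choice of appending the cluster id as a feature, and every name here (`kmeans`, `add_cluster_feature`, the toy `rows`) are assumptions made only for illustration.

```python
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Tiny k-means; centroids start at the first k points (deterministic)."""
    k = min(k, len(points))
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def add_cluster_feature(rows, k=2):
    """rows: list of (query_id, feature_vector, relevance_label) tuples.

    Clusters the documents of each query separately and appends the
    cluster id (hypothetical extra feature) to every feature vector.
    """
    by_query = defaultdict(list)
    for index, (qid, _, _) in enumerate(rows):
        by_query[qid].append(index)

    augmented = list(rows)
    for indices in by_query.values():
        labels = kmeans([rows[i][1] for i in indices], k)
        for i, label in zip(indices, labels):
            qid, features, relevance = rows[i]
            augmented[i] = (qid, list(features) + [float(label)], relevance)
    return augmented


# Toy training data (invented): two queries with 2-dimensional features.
rows = [
    ("q1", [0.0, 0.1], 2),
    ("q1", [0.1, 0.0], 1),
    ("q1", [5.0, 5.1], 0),
    ("q2", [1.0, 1.0], 1),
    ("q2", [9.0, 9.0], 0),
]
augmented = add_cluster_feature(rows, k=2)
for row in augmented:
    print(row)
```

The augmented rows could then be passed to any learning-to-rank trainer. Clustering within each query, rather than over the whole collection, keeps cluster ids meaningful only relative to that query's candidate documents.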


Paper Citation

in Harvard Style

Lempesis E. and Makris C. (2014). Combining Learning-to-Rank with Clustering. In Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 2: WEBIST, ISBN 978-989-758-024-6, pages 286-294. DOI: 10.5220/0004846802860294

in EndNote Style

JO - Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 2: WEBIST
TI - Combining Learning-to-Rank with Clustering
SN - 978-989-758-024-6
AU - Lempesis E.
AU - Makris C.
PY - 2014
SP - 286
EP - 294
DO - 10.5220/0004846802860294