Francisco Benjamim Filho, Raúl Pierre Renteria, Ruy Luiz Milidiú



The explosive growth and widespread accessibility of the Web have led to a surge of research activity in information retrieval on the WWW. The Web is a huge and rich environment in which pages can be viewed as a large community of elements connected through links created for a variety of reasons. The HITS approach introduces two basic concepts, hubs and authorities, which reveal some of the semantic information hidden in the links. In this paper, we review XHITS, a generalization of HITS that expands the model from two to several concepts, and present a new machine learning algorithm to calibrate an XHITS model. The new learning algorithm uses latent feature concepts. Furthermore, we provide some illustrative examples and empirical tests. Our findings indicate that the new learning approach provides a more accurate XHITS model.
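To make the hub and authority concepts concrete, the following is a minimal sketch of the standard HITS iteration only; it does not show XHITS or the learning algorithm presented in this paper. The function name `hits`, the `iterations` parameter, and the toy graph `toy_links` are assumptions introduced purely for illustration.

```python
import math


def hits(links, iterations=50):
    """Illustrative HITS iteration (not XHITS).

    links: dict mapping each page to the list of pages it links to.
    Returns two dicts: hub scores and authority scores.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]

        # Hub update: sum the authority scores of the pages linked to.
        hub = {p: sum(auth[d] for d in links.get(p, [])) for p in pages}

        # Normalize both score vectors to unit Euclidean length.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm

    return hub, auth


# Hypothetical toy graph: pages "a" and "b" both link to "c" and "d".
toy_links = {"a": ["c", "d"], "b": ["c", "d"], "c": [], "d": []}
hub, auth = hits(toy_links)
```

In this toy graph, "a" and "b" come out as hubs and "c" and "d" as authorities, which illustrates the mutual reinforcement between the two concepts that XHITS extends to several concepts.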


