A DOMAIN-RELATED AUTHORITY MODEL FOR WEB PAGES BASED ON SOURCE AND RELATED INFORMATION

Liu Yang, Chunping Li, Ming Gu

2010

Abstract

The Internet has become a major source for searching and acquiring information, but the authority of its resources is difficult to evaluate. In this paper we propose a domain-related authority model that calculates the authority of web pages in a specific domain using source and related information. These two factors, together with link structure, are the main components of our model. We also incorporate domain knowledge to adapt the model to the characteristics of the domain. Experiments on the finance domain show that our model provides good authority scores and rankings for web pages and helps people better understand the pages.



Paper Citation


in Harvard Style

Yang L., Li C. and Gu M. (2010). A DOMAIN-RELATED AUTHORITY MODEL FOR WEB PAGES BASED ON SOURCE AND RELATED INFORMATION. In Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: ICSOFT, ISBN 978-989-8425-22-5, pages 245-253. DOI: 10.5220/0002998702450253


in Bibtex Style

@conference{icsoft10,
author={Liu Yang and Chunping Li and Ming Gu},
title={A DOMAIN-RELATED AUTHORITY MODEL FOR WEB PAGES BASED ON SOURCE AND RELATED INFORMATION},
booktitle={Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: ICSOFT},
year={2010},
pages={245-253},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002998702450253},
isbn={978-989-8425-22-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: ICSOFT
TI - A DOMAIN-RELATED AUTHORITY MODEL FOR WEB PAGES BASED ON SOURCE AND RELATED INFORMATION
SN - 978-989-8425-22-5
AU - Yang L.
AU - Li C.
AU - Gu M.
PY - 2010
SP - 245
EP - 253
DO - 10.5220/0002998702450253