A NEW WORD-INTERSECTION CLUSTERING METHOD FOR INFORMATION FILTERING

Mao Lin Huang, Jun Lai, Ben Soh

Abstract

As use of the web grows globally and exponentially, it becomes increasingly hard for users to find the information they want, so good information filtering mechanisms are needed. This paper presents a new, efficient information filtering method based on word clusters. Traditional filtering methods consider only the relevance values of documents and therefore neglect the efficiency of document retrieval, which is also crucial. Our algorithm runs offline and clusters similar documents by the words they share, so that the efficiency of information filtering and retrieval can be improved.
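The abstract does not spell out the algorithm, so the following is only a minimal sketch of what clustering by shared words could look like, not the authors' method. It assumes a greedy single pass in which each cluster keeps the intersection of its members' word sets; the names `tokenize`, `cluster_by_shared_words`, `filter_documents` and the `min_shared` threshold are hypothetical, as are the sample documents.

```python
import re


def tokenize(text):
    """Return the set of distinct lowercase words in a document."""
    return set(re.findall(r"[a-z]+", text.lower()))


def cluster_by_shared_words(docs, min_shared=2):
    """Offline step (assumed): greedy single-pass clustering.

    Each cluster stores the intersection of its members' word sets.
    A document joins the first cluster it shares at least `min_shared`
    words with; otherwise it starts a new cluster.
    """
    clusters = []  # each item: {"words": set, "members": [doc ids]}
    for doc_id, text in docs.items():
        words = tokenize(text)
        for cluster in clusters:
            shared = cluster["words"] & words
            if len(shared) >= min_shared:
                cluster["words"] = shared  # shrink to the common words
                cluster["members"].append(doc_id)
                break
        else:
            clusters.append({"words": words, "members": [doc_id]})
    return clusters


def filter_documents(clusters, query, min_shared=1):
    """Online step (assumed): compare the query with cluster word sets only."""
    query_words = tokenize(query)
    hits = []
    for cluster in clusters:
        if len(cluster["words"] & query_words) >= min_shared:
            hits.extend(cluster["members"])
    return hits


# Hypothetical sample documents, for illustration only.
docs = {
    "d1": "web search with document clusters",
    "d2": "document clusters for web filtering",
    "d3": "collaborative filtering of movie ratings",
    "d4": "ratings based collaborative filtering algorithm",
}

clusters = cluster_by_shared_words(docs)
matches = filter_documents(clusters, "collaborative ratings")
```

In this sketch the efficiency gain would come from the online step: a query is compared against one word set per cluster rather than against every document, while the more expensive grouping is done ahead of time.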



Paper Citation


in Harvard Style

Huang M. L., Lai J. and Soh B. (2004). A NEW WORD-INTERSECTION CLUSTERING METHOD FOR INFORMATION FILTERING. In Proceedings of the First International Conference on E-Business and Telecommunication Networks - Volume 1: ICETE, ISBN 972-8865-15-5, pages 241-244. DOI: 10.5220/0001382402410244


in Bibtex Style

@conference{icete04,
author={Mao Lin Huang and Jun Lai and Ben Soh},
title={A NEW WORD-INTERSECTION CLUSTERING METHOD FOR INFORMATION FILTERING},
booktitle={Proceedings of the First International Conference on E-Business and Telecommunication Networks - Volume 1: ICETE},
year={2004},
pages={241-244},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001382402410244},
isbn={972-8865-15-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the First International Conference on E-Business and Telecommunication Networks - Volume 1: ICETE
TI - A NEW WORD-INTERSECTION CLUSTERING METHOD FOR INFORMATION FILTERING
SN - 972-8865-15-5
AU - Huang M. L.
AU - Lai J.
AU - Soh B.
PY - 2004
SP - 241
EP - 244
DO - 10.5220/0001382402410244