3. Han, Jiawei, and Micheline Kamber. 2001. Data Mining: Concepts and Techniques. San
Diego, CA: Academic Press.
4. He, Ji, Ah-Hwee Tan, and Chew-Lim Tan. 2000. Machine learning methods for Chinese
Web page categorization. Proceedings of the ACL’2000 2nd Chinese Language Processing
Workshop, pp. 93-100.
5. Hodges, Julia, and Jose Cordova. 1993. Automatically building a knowledge base through
natural language text analysis. International Journal of Intelligent Systems 8(9): 921-938.
6. Hodges, Julia, Shiyun Yie, Sonal Kulkarni, and Ray Reighart. 1997. Generation and
evaluation of indexes for chemistry articles. Journal of Intelligent Information Systems 7:
57-76.
7. Kupiec, Julian, Jan Pedersen, and Francine Chen. 1995. A trainable document summarizer.
Proceedings of the 18th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval, pp. 68-73.
8. Kushmerick, Nicholas, Edward Johnston, and Stephen McGuinness. 2001. Information
extraction by text classification. IJCAI-01 workshop on adaptive text extraction and
mining.
9. Larsen, Bjornar, and Chinatsu Aone. 1999. Fast and effective text mining using linear-time
document clustering. Proceedings of the 1999 International Conference on Knowledge
Discovery and Data Mining (KDD-99), pp. 16-22.
10. Lehnert, W., J. McCarthy, S. Soderland, E. Riloff, C. Cardie, J. Peterson, and F. Feng.
1993. UMass/Hughes: Description of the CIRCUS system used for MUC-5. Proceedings of
the Fifth Message Understanding Conference.
11. Li, Yonghong, and Anil K. Jain. 1998. Classification of text documents. Proceedings of the
14th International Conference on Pattern Recognition, pp. 1295-1297.
12. Lin, Shian-Hua, Meng Chang Chen, Jan-Ming Ho, and Yueh-Ming Huan. 2002. ACIRD:
Intelligent Internet Document Organization and Retrieval. IEEE Transactions on
Knowledge and Data Engineering 14(3): 599-614.
13. McCallum, Andrew, and Kamal Nigam. 1998. A comparison of event models for naïve
Bayes text classification. Proceedings of the AAAI-98 Workshop on Learning for Text
Categorization.
14. Meadow, Charles T., Bert R. Boyce, and Donald H. Kraft. 2000. Text Information
Retrieval Systems, 2nd edition. San Diego, CA: Academic Press.
15. Ng, Hwee Tou, Wei Boon Goh, and Kok Leong Low. 1997. Feature selection, perceptron
learning, and a usability case study for text categorization. Proceedings of the 20th Annual
International ACM SIGIR Conference on Research and Development in Information
Retrieval, pp. 67-73.
16. Sable, Carl, Kathy McKeown, and Vasileios Hatzivassiloglou. 2002. Using density
estimation to improve text categorization. Technical report no. CUCS-012-02, Department
of Computer Science, Columbia University.
17. Tang, Bo, and Julia Hodges. 2000. Web document classification with positional context.
Proceedings of the International Workshop on Web Knowledge Discovery and Data
Mining (WKDDM’2000).
18. Turney, P. 1997. Extraction of Keyphrases from Text: Evaluation of Four Algorithms.
Ottawa, Canada: National Research Council of Canada, Institute for Information
Technology. ERB-1051.
19. Wang, Yong. 2002. A comparative study of Web document classification methods. M.S.
project report, Mississippi State University.
20. Yang, Yiming. 1999. An evaluation of statistical approaches to text categorization. Journal
of Information Retrieval 1(1/2): 67-88.
21. Yang, Yiming, and Jan O. Pedersen. 1997. A comparative study on feature selection in text
categorization. Proceedings of the Fourteenth International Conference on Machine
Learning, pp. 412-420.