A Classification Method of Open-ended Questionnaires using Category-based Dictionary from Sampled Documents

Keiichi Hamada, Masanori Akiyoshi, Masaki Samejima, Hiroaki Oiso

2012

Abstract

This paper presents a method for classifying open-ended questionnaires using a category-based dictionary. Unlike other classification methods, the proposed method generates the category-based dictionary from a small set of categorized samples. The dictionary is then used to judge questionnaire categories with tf-idf (term frequency-inverse document frequency) and cotf-idf (co-occurrence tf-idf). Experiments on questionnaires about a university lecture show that 71% of these questionnaires are classified accurately.
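As a rough illustration of the idea, the following minimal sketch builds a category-based dictionary from a few categorized samples and scores a new response against it with plain tf-idf. It is not the authors' implementation: the abstract does not give the exact weighting formulas, so the sketch assumes that each category's pooled samples act as one document, that tokens are split on whitespace, and that a response is assigned to the category with the highest summed weight. The function names and the sample data are hypothetical, and the co-occurrence tf-idf term is omitted because its definition is not stated here.

```python
import math
from collections import Counter


def build_category_dictionary(samples):
    """Build {category: {term: tf-idf weight}} from (category, text) pairs.

    Assumption: all samples of one category are pooled into a single
    "document", so idf is computed over categories, not over samples.
    """
    counts = {}
    for category, text in samples:
        counts.setdefault(category, Counter()).update(text.lower().split())

    n_categories = len(counts)
    # Number of categories in which each term occurs at least once.
    category_freq = Counter(term for c in counts.values() for term in c)

    dictionary = {}
    for category, term_counts in counts.items():
        total = sum(term_counts.values())
        dictionary[category] = {
            term: (n / total) * math.log(n_categories / category_freq[term])
            for term, n in term_counts.items()
        }
    return dictionary


def classify(text, dictionary):
    """Return the category with the highest summed weight over the text's terms."""
    terms = text.lower().split()
    scores = {
        category: sum(weights.get(term, 0.0) for term in terms)
        for category, weights in dictionary.items()
    }
    return max(scores, key=scores.get)


# Hypothetical categorized samples about a university lecture.
samples = [
    ("pace", "the lecture was too fast"),
    ("pace", "slow down the lecture speed"),
    ("materials", "the slides were hard to read"),
    ("materials", "please share the slides and handouts"),
]
dictionary = build_category_dictionary(samples)
print(classify("the slides were too small", dictionary))
```

Terms that occur in every category (here "the") receive a weight of zero, so only category-specific vocabulary contributes to the score. A co-occurrence variant could, under the same assumptions, weight pairs of terms instead of single terms.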



Paper Citation


in Harvard Style

Hamada K., Akiyoshi M., Samejima M. and Oiso H. (2012). A Classification Method of Open-ended Questionnaires using Category-based Dictionary from Sampled Documents. In Proceedings of the 14th International Conference on Enterprise Information Systems - Volume 3: ICEIS, ISBN 978-989-8565-12-9, pages 193-198. DOI: 10.5220/0003971601930198


in Bibtex Style

@conference{iceis12,
author={Keiichi Hamada and Masanori Akiyoshi and Masaki Samejima and Hiroaki Oiso},
title={A Classification Method of Open-ended Questionnaires using Category-based Dictionary from Sampled Documents},
booktitle={Proceedings of the 14th International Conference on Enterprise Information Systems - Volume 3: ICEIS},
year={2012},
pages={193-198},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003971601930198},
isbn={978-989-8565-12-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Conference on Enterprise Information Systems - Volume 3: ICEIS
TI - A Classification Method of Open-ended Questionnaires using Category-based Dictionary from Sampled Documents
SN - 978-989-8565-12-9
AU - Hamada K.
AU - Akiyoshi M.
AU - Samejima M.
AU - Oiso H.
PY - 2012
SP - 193
EP - 198
DO - 10.5220/0003971601930198