A RETRIEVAL METHOD OF SIMILAR QUESTION ARTICLES FROM WEB BULLETIN BOARD

Yohei Sakurai, Soichiro Miyazaki, Masanori Akiyoshi

Abstract

This paper proposes a method for retrieving similar question articles from Web bulletin boards. The method is based on the cosine similarity between a user’s query sentence and the question sentences of articles. Because these sentences are mostly short, the conventional cosine similarity alone often cannot distinguish whether an article’s question sentence is similar to the user’s query sentence. To overcome this problem, our method modifies the elements of the word vectors used in the cosine similarity computation. The modification is derived from sentence structure and considers both the words common to the user’s query sentence and an article question sentence and the words they do not share. Experimental results indicate that the proposed method is effective.
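The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the vector elements are modified, so the function names (`term_vector`, `cosine`, `reweight`, `modified_cosine`), the whitespace tokenization, and the weights `common_weight` and `other_weight` are all assumptions chosen only to show the general shape of such a scheme. The sketch computes the conventional cosine similarity on term-frequency vectors, then a variant in which words shared by both sentences are scaled up and unshared words are scaled down.

```python
import math
from collections import Counter


def term_vector(sentence):
    """Term-frequency vector from lowercased whitespace tokens (assumed tokenization)."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Conventional cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def reweight(vec, common, common_weight, other_weight):
    """Scale each element by a weight that depends on whether the word is shared."""
    return {
        word: count * (common_weight if word in common else other_weight)
        for word, count in vec.items()
    }


def modified_cosine(query, article, common_weight=2.0, other_weight=0.5):
    """Cosine similarity after reweighting common and non-common words.

    The default weights are hypothetical, not values taken from the paper.
    """
    q = term_vector(query)
    a = term_vector(article)
    common = set(q) & set(a)
    return cosine(
        reweight(q, common, common_weight, other_weight),
        reweight(a, common, common_weight, other_weight),
    )


if __name__ == "__main__":
    query = "how do i reset my password"
    article = "how can i reset the password"
    print(cosine(term_vector(query), term_vector(article)))
    print(modified_cosine(query, article))
```

With both weights set to 1.0 the variant reduces to the conventional cosine similarity, which makes it easy to compare the two on the same sentence pairs.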



Paper Citation


in Harvard Style

Sakurai Y., Miyazaki S. and Akiyoshi M. (2006). A RETRIEVAL METHOD OF SIMILAR QUESTION ARTICLES FROM WEB BULLETIN BOARD. In Proceedings of the First International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-972-8865-69-6, pages 238-243. DOI: 10.5220/0001315202380243


in Bibtex Style

@conference{icsoft06,
author={Yohei Sakurai and Soichiro Miyazaki and Masanori Akiyoshi},
title={A RETRIEVAL METHOD OF SIMILAR QUESTION ARTICLES FROM WEB BULLETIN BOARD},
booktitle={Proceedings of the First International Conference on Software and Data Technologies - Volume 2: ICSOFT},
year={2006},
pages={238-243},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001315202380243},
isbn={978-972-8865-69-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the First International Conference on Software and Data Technologies - Volume 2: ICSOFT
TI - A RETRIEVAL METHOD OF SIMILAR QUESTION ARTICLES FROM WEB BULLETIN BOARD
SN - 978-972-8865-69-6
AU - Sakurai Y.
AU - Miyazaki S.
AU - Akiyoshi M.
PY - 2006
SP - 238
EP - 243
DO - 10.5220/0001315202380243