For factoid questions, the top-ranked candidate answer is returned if its rank exceeds the threshold for that question type. For list questions, all candidate answers whose rank exceeds the threshold are returned. For other questions, candidate answers that appear more than twice are returned. The rationale is that our patterns sometimes extract useless information; a fact that is important about a target is usually extracted more than once, whereas a useless fact should be extracted only once from the document set.
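The selection strategy above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name, the candidate-tuple layout, and the threshold values are all assumptions.

```python
# Hypothetical sketch of the per-question-type answer selection
# described above; names and thresholds are illustrative assumptions.

def select_answers(question_type, candidates, thresholds):
    """Pick final answers from ranked candidates.

    candidates: list of (answer, score, occurrence_count) tuples,
                sorted by score in descending order.
    thresholds: per-question-type score threshold,
                e.g. {"factoid": 0.5, "list": 0.5}.
    """
    if question_type == "factoid":
        # Return only the top-ranked candidate, and only if it
        # clears the threshold for this question type.
        if candidates and candidates[0][1] >= thresholds["factoid"]:
            return [candidates[0][0]]
        return []
    if question_type == "list":
        # Return every candidate that clears the threshold.
        return [a for a, score, _ in candidates
                if score >= thresholds["list"]]
    # "Other" questions: keep answers extracted more than twice,
    # on the assumption that useful facts recur across documents.
    return [a for a, _, count in candidates if count > 2]
```

The `count > 2` fallback encodes the redundancy argument from the text: a genuinely important fact tends to be re-extracted across the document set, while pattern noise usually appears only once.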
4 EVALUATION
The TREC question answering track provides the test data for evaluating system accuracy. It consists of sets of documents together with questions and answers related to those documents. We evaluate our system on these data; the results are shown in Table 1.
Our system is not yet ready for all the question types asked in the TREC QA track collection. This difficulty arises because we train our system mainly on the questions and answers themselves, and we do not have a question corpus large enough to cover classifications for every question type. As a result, accuracy suffers when the system attempts to answer such questions.
Table 1: Evaluation Results shown by Question Categories.
Question Type Success
Who 0.317
When 0.328
Why 0.245
How 0.265
Where 0.345
What 0.294
List 0.308
Others 0.145
Overall 0.281
5 CONCLUSIONS
We presented a system that extracts information about entities and events given a pool of questions related to that entity or event. We created categories for the questions and extracted rules to classify each question into one of these categories. The system also uses syntactic and part-of-speech features for question classification and answer extraction.
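A rule-based classifier of the kind described above can be sketched as a list of surface patterns tried in order. The patterns below are hypothetical stand-ins for the system's actual rules; only the category names are taken from Table 1.

```python
# Illustrative rule-based question classifier. The regular-expression
# rules here are assumptions; the real system's rule set is richer and
# also draws on syntactic and part-of-speech features.
import re

# Each rule maps a surface pattern on the question's opening words
# to one of the categories used in Table 1.
RULES = [
    (re.compile(r"^who\b", re.I), "Who"),
    (re.compile(r"^when\b", re.I), "When"),
    (re.compile(r"^why\b", re.I), "Why"),
    (re.compile(r"^how\b", re.I), "How"),
    (re.compile(r"^where\b", re.I), "Where"),
    (re.compile(r"^what\b", re.I), "What"),
    (re.compile(r"^(list|name)\b", re.I), "List"),
]

def classify_question(question):
    """Return the first matching category, or 'Others' as a fallback."""
    for pattern, category in RULES:
        if pattern.search(question):
            return category
    return "Others"
```

Trying rules in a fixed order and falling through to "Others" mirrors the evaluation breakdown in Table 1, where questions outside the main wh-categories are grouped under a catch-all class.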
Our system still needs improvement. We expect the main gains to come from an expanded question classification and from adding dependency features to answer finding (Li and Roth, 2005; Pinchak and Lin, 2006). We hope to continue this research and obtain further improvements.
REFERENCES
Chali, Y. and Dubien, S. (2004). University of Lethbridge’s
participation in TREC-2004 QA track. In Proceedings
of the Thirteenth Text REtrieval Conference.
Collins, M. (1996). A new statistical parser based on bi-
gram lexical dependencies. In Proceedings of ACL-
96, pages 184–191, Copenhagen, Denmark.
Harabagiu, S., Moldovan, D., Clark, C., Bowden, M.,
Williams, J., and Bensley, J. (2003). Answer min-
ing by combining extraction techniques with abduc-
tive reasoning. In Proceedings of the Twelfth Text RE-
trieval Conference, pages 375–382.
Li, X. and Roth, D. (2005). Learning question classifiers:
The role of semantic information. Journal of Natural
Language Engineering.
Moldovan, D., Harabagiu, S., Girju, R., Morarescu, P., Lac-
tusu, F., Novischi, A., Badulescu, A., and Bolohan, O.
(2002). LCC tools for question answering. In Pro-
ceedings of the Eleventh Text REtrieval Conference.
Moldovan, D., Harabagiu, S., Pasca, M., Mihalcea, R., Goodrum, R., Girju, R., and Rus, V. (1999). LASSO: A tool for surfing the answer net. In Proceedings of the 8th Text REtrieval Conference.
Pinchak, C. and Lin, D. (2006). A probabilistic answer type model. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics, pages 393–400.
Roth, D., Cumby, C., Li, X., Morie, P., Nagarajan, R., Riz-
zolo, N., Small, K., and Yih, W. (2002). Question-
answering via enhanced understanding of questions.
In Proceedings of the Eleventh Text REtrieval Confer-
ence.
Schone, P., Ciany, G., McNamee, P., Mayfield, J., Bassi, T., and Kulman, A. (2004). Question answering with QACTIS at TREC-2004. In Proceedings of the Thirteenth Text REtrieval Conference.
Sekine, S. (2002). Proteus Project OAK System (English sentence analyzer). http://nlp.nyu.edu/oak.
Voorhees, E. M. (2003a). Overview of the TREC 2002
Question Answering track. In Proceedings of the
Eleventh Text REtrieval Conference.
Voorhees, E. M. (2003b). Overview of the TREC 2003
Question Answering track. In Proceedings of the
Twelfth Text REtrieval Conference.
Witten, I., Moffat, A., and Bell, T. (1999). Managing Gigabytes: Compressing and Indexing Documents and Images. Morgan Kaufmann.
WEBIST 2008 - International Conference on Web Information Systems and Technologies