Automatic Generation of English Vocabulary Tests

Yuni Susanti, Ryu Iida, Takenobu Tokunaga


This paper presents a novel method for automatically generating English vocabulary tests using TOEFL vocabulary questions as a model. An English vocabulary question in TOEFL is a multiple-choice question consisting of four components: a target word, a reading passage, a correct answer and distractors. Given a target word, we generate a reading passage from texts retrieved from the Web, and then use that reading passage and the WordNet lexical dictionary to generate the question options, both the correct answer and the distractors. In a human evaluation, 45% of the responses from English teachers mistakenly judged questions generated by the proposed method to be human-generated. In addition, half of the machine-generated questions received an average rating of 3 or higher on a 5-point scale. This suggests that our machine-generated questions succeeded in capturing some characteristics of human-generated questions, and that half of them can be used in English tests.
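As an illustration only, the four components of a question and the rating criterion mentioned above could be represented as in the following minimal Python sketch. The class name `VocabularyQuestion`, the helper `share_rated_at_least`, and the example word, passage, options and ratings are all hypothetical and are not taken from the paper; the sketch does not reproduce the passage retrieval or the WordNet-based option generation.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class VocabularyQuestion:
    """Hypothetical container for the four components named in the abstract."""
    target_word: str
    passage: str
    correct_answer: str
    distractors: List[str]

    def options(self, seed: int = 0) -> List[str]:
        # Mix the correct answer with the distractors in a reproducible order.
        opts = [self.correct_answer, *self.distractors]
        random.Random(seed).shuffle(opts)
        return opts


def share_rated_at_least(avg_ratings: List[float], threshold: float = 3.0) -> float:
    """Fraction of questions whose average rating is at or above the threshold."""
    if not avg_ratings:
        return 0.0
    return sum(r >= threshold for r in avg_ratings) / len(avg_ratings)


# Made-up example data, for illustration only.
question = VocabularyQuestion(
    target_word="bright",
    passage="The bright student solved the puzzle quickly.",
    correct_answer="intelligent",
    distractors=["shiny", "loud", "heavy"],
)
print(question.options())
print(share_rated_at_least([2.0, 3.0, 4.5, 1.5]))
```

With such a structure, the "average rating of 3 or higher" criterion becomes a simple proportion over per-question average ratings.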


Paper Citation

in Harvard Style

Susanti Y., Iida R. and Tokunaga T. (2015). Automatic Generation of English Vocabulary Tests. In Proceedings of the 7th International Conference on Computer Supported Education - Volume 1: CSEDU, ISBN 978-989-758-107-6, pages 77-87. DOI: 10.5220/0005437200770087

in Bibtex Style

author={Yuni Susanti and Ryu Iida and Takenobu Tokunaga},
title={Automatic Generation of English Vocabulary Tests},
booktitle={Proceedings of the 7th International Conference on Computer Supported Education - Volume 1: CSEDU},

in EndNote Style

JO - Proceedings of the 7th International Conference on Computer Supported Education - Volume 1: CSEDU
TI - Automatic Generation of English Vocabulary Tests
SN - 978-989-758-107-6
AU - Susanti Y.
AU - Iida R.
AU - Tokunaga T.
PY - 2015
SP - 77
EP - 87
DO - 10.5220/0005437200770087