negatives (FN) and false positives (FP). A FN occurs
when an answer receives a lower score than it
deserves. A FP occurs when the system assigns more
marks to an answer than it deserves. In our system's
evaluation, the number of FN was much higher than
the number of FP: 35% of all errors were FN, while
only 25% were FP. The higher proportion of FN can
be explained by the difficulty of anticipating all
the possible paraphrases of an answer. If a correct
paraphrase is missing from the RAs, an SA that uses
it will result in a FN. The most relevant scenario
that accounts for the system's FP is that of students
who do not know the answer to the question but are
fortunate enough to write some words that match the
RA.
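
The FN and FP definitions above can be illustrated with a minimal sketch. It assumes that each answer has one system score and one teacher score on the same scale, and that the teacher score is taken as the score the answer deserves. The function name `classify_errors` and the `tolerance` parameter are hypothetical and are not part of the system described here.

```python
def classify_errors(system_scores, teacher_scores, tolerance=0.0):
    """Count false negatives (FN) and false positives (FP).

    Assumption: the teacher score is the score the answer deserves.
    FN: the system score is lower than the teacher score.
    FP: the system score is higher than the teacher score.
    Differences within `tolerance` are counted as correct.
    """
    if len(system_scores) != len(teacher_scores):
        raise ValueError("score lists must have the same length")

    counts = {"FN": 0, "FP": 0, "correct": 0}
    for system_score, teacher_score in zip(system_scores, teacher_scores):
        difference = system_score - teacher_score
        if difference < -tolerance:
            counts["FN"] += 1
        elif difference > tolerance:
            counts["FP"] += 1
        else:
            counts["correct"] += 1
    return counts


# Illustrative scores only, not data from the evaluation.
counts = classify_errors([2, 5, 3, 4], [3, 4, 3, 4])
print(counts)
```

A non-zero `tolerance` may be useful when small scoring differences should not be counted as errors.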
4 CONCLUSIONS AND FUTURE WORK
In this study, we proposed a system for free-text
answer assessment. In the proposed approach, each
question has several RAs that are automatically
generated by our system, based on each word and its
part-of-speech tag. Answers submitted by students
can thus be compared with several RAs. After the
word matching algorithm, which searches the RA for
words similar to those of the SA, is applied, the
similarity score is calculated from the weights of
the words common to the SA and the RA. The system
was tested in the context of History exams, and some
evaluation results were presented. Although the
evaluation results showed a good correlation (0.78)
between average teacher scores and system scores,
we think it is possible to improve the system's
results. We intend to do that by detecting combined
words (occurrences of n-grams) and by using as RAs
the SAs previously marked by the teacher with the
maximum score. In this way, the system will improve
continuously as more and more training examples
(RAs) are provided, which will permit a more
accurate assessment. In addition, teachers need
feedback on their teaching performance, and
students need feedback on their learning
performance. These goals will be addressed by the
feedback module that we intend to develop next.
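
The scoring idea summarized above (a similarity score computed from the weights of the words common to the SA and an RA, with several RAs per question) can be sketched as follows. This is only an illustration under stated assumptions: words are matched by exact lowercase equality rather than by the system's word matching algorithm, the score is normalized by the total weight of the RA, and the best score over all RAs is kept. The names `similarity`, `best_score`, `promote_to_reference` and the weight values are hypothetical.

```python
def similarity(student_answer, reference_answer, weights, default_weight=1.0):
    """Weighted overlap between an SA and one RA, in the range [0, 1].

    Assumption: exact lowercase word matching, normalized by the
    total weight of the RA words.
    """
    sa_words = set(student_answer.lower().split())
    ra_words = set(reference_answer.lower().split())
    total = sum(weights.get(word, default_weight) for word in ra_words)
    if total == 0:
        return 0.0
    common = sum(weights.get(word, default_weight)
                 for word in ra_words & sa_words)
    return common / total


def best_score(student_answer, reference_answers, weights):
    """Compare the SA with every RA and keep the highest similarity."""
    return max(
        (similarity(student_answer, ra, weights) for ra in reference_answers),
        default=0.0,
    )


def promote_to_reference(reference_answers, student_answer,
                         teacher_score, max_score):
    """Add an SA to the RAs when the teacher gave it the maximum score."""
    if teacher_score >= max_score and student_answer not in reference_answers:
        reference_answers.append(student_answer)
        return True
    return False


# Illustrative weights and answers only.
weights = {"revolution": 3.0, "1910": 2.0}
reference_answers = [
    "the revolution began in 1910",
    "republic proclaimed in 1910",
]
print(best_score("revolution in 1910", reference_answers, weights))
```

Taking the maximum over the RAs means that a new RA can only raise or preserve the score of an SA, which fits the intention of improving the system as more top-scoring SAs are added.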