quantity α to the difficulty factor rationalizes the
relative influence of each item on the overall score.
When α is added to P_Fi, the resulting value W_Si > P_Fi
decreases the relative influence of the high-end
weighting factors and correspondingly increases the
influence of the lower-end ones. The merit of the
proposed method lies in the fact that the weighting
factors which influence the final marks are mainly
determined by the performance of the students who
sat the examination.
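The construction of the empirical weighting factors can be sketched in a few lines of Python. This is a minimal illustration under our reading of equation (4), namely W_Si = P_Fi + α; the function and variable names are ours, not part of the published method:

import numpy as np

def empirical_weights(responses, alpha):
    """Empirical weighting factors W_Si = P_Fi + alpha.

    responses: binary (students x items) matrix, 1 = correct answer.
    P_Fi, the difficulty ratio of item i, is taken as the fraction
    of examinees who failed to answer the item correctly.
    """
    p_f = 1.0 - responses.mean(axis=0)  # difficulty ratio per item
    return p_f + alpha                  # the shift that rationalizes the weights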
Table 2 compares the results of 5 MCQ-based
examinations. <E_T> stands for the average mark of
the students who sat each examination, when the
mark of each individual student is given according to
equation (2) (W_T, tutor-defined weighting factors).
<E_S> stands for the average mark of the students
who sat each examination, when the mark of each
individual student is given according to equation (4)
(W_S, empirical weighting factors). The rightmost
column of Table 2 shows the value of the parameter
α for each of the examinations. It is seen that, for a
suitable value of α, there is good agreement between
the <E_T> and <E_S> values, as well as between the
percentages of students who passed the examination
(E_T and E_S).
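For a given α, the two average marks can be compared directly. The following sketch reuses empirical_weights from above, with synthetic data standing in for a real response matrix, and assumes that equations (2) and (4) define the scores as normalized weighted averages (the equations themselves are not reproduced in this section):

rng = np.random.default_rng(0)
responses = (rng.random((120, 20)) > 0.4).astype(float)  # synthetic: 120 students, 20 items
w_t = rng.integers(1, 4, size=20).astype(float)          # stand-in tutor-defined weights

def weighted_score(responses, weights):
    # Mark of each student as the weighted fraction of correctly answered items.
    return responses @ weights / weights.sum()

e_t = weighted_score(responses, w_t)                                # E_Tj, equation (2)
e_s = weighted_score(responses, empirical_weights(responses, 0.4))  # E_Sj, equation (4)
print(e_t.mean(), e_s.mean())  # <E_T> versus <E_S> for this trial alpha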
Our method serves as a tool for alerting the
examiner in the following two cases. The first
comes from the direct comparison of the expert
weighting factors W_Ti to the average difficulty ratio
of the ith item. When the difficulty ratio P_Fi
roughly follows the discrete values of W_Ti, this
can be considered a sign of success on the part of the
tutor. A low W_Ti combined with a high P_Fi for a
certain item is an indication of either a lack of clarity
or poor language use (McCoubrie, 2005).
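This first alarm can be sketched as follows; the quantile thresholds below are illustrative choices of ours, as the method does not prescribe specific cut-offs:

def flag_suspect_items(w_t, p_f, low_q=0.25, high_q=0.75):
    # Items whose tutor weight W_Ti is low while the difficulty ratio
    # P_Fi is high: candidates for unclear wording or poor language use.
    low_w = w_t <= np.quantile(w_t, low_q)
    high_p = p_f >= np.quantile(p_f, high_q)
    return np.flatnonzero(low_w & high_p)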
Low divergence between E_Tj and E_Sj is likewise an
indication of adequate agreement between the
judgment of the tutor and the actual performance of
the students (Fig. 1). When there is considerable
divergence between E_Sj and E_Tj, this is an
indicator that action should be taken by the tutor.
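A corresponding check on the second alarm might look as follows; the root-mean-square statistic and the 0.1 threshold are our illustrative choices, not values taken from the method:

def score_divergence(e_t, e_s):
    # Root-mean-square divergence between the two score sets of one examination.
    return float(np.sqrt(np.mean((e_s - e_t) ** 2)))

if score_divergence(e_t, e_s) > 0.1:  # illustrative threshold
    print("Considerable divergence: the tutor weighting may need revision.")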
4 CONCLUSIONS
The present publication suggests, as a proof of
concept, a method for scoring MCQ-based
examinations. The score of each student is obtained
as a weighted average of the items correctly
answered. Expert and empirical weighting factors are
both employed. The empirical weighting factor
depends on the difficulty ratio of each item, which
practically equals the percentage of students who
failed to answer a specific item. For each student two
scores are calculated: for the first, the expert
weighting factors are used, and for the second, the
empirical ones. The mathematical condition imposed
to ensure the best possible congruence between the
two sets of scores was the minimization of the sum of
the squared distances between the two scores over all
the examinees.
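In symbols, and under our reading of the two score definitions (equations (2) and (4) are not reproduced in this section, so the normalization below is an assumption), the condition reads:

\alpha^{*} = \arg\min_{\alpha} \sum_{j=1}^{N} \bigl( E_{Sj}(\alpha) - E_{Tj} \bigr)^{2},
\qquad
E_{Sj}(\alpha) = \frac{\sum_{i} (P_{Fi} + \alpha)\, x_{ij}}{\sum_{i} (P_{Fi} + \alpha)},

where x_ij = 1 if examinee j answered item i correctly and 0 otherwise, and N is the number of examinees.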
A more thorough study of the differences between
expert weighting factors and a posteriori difficulty
ratios is required in order to gain a better
understanding of the factors affecting the congruence
of the expert and empirical overall scores.
REFERENCES
Benvenuti, S., 2010. “Using MCQ-based assessment to
achieve validity, reliability and manageability in
introductory level large class assessment”, HE Monitor
No 10 Teaching and Learning beyond Formal Access:
Assessment through the Looking Glass, 21-33.
Bjork, E.L., Soderstrom, N.C., and Little, J.L., 2015. “Can
multiple-choice testing induce desirable difficulties?
Evidence from the laboratory and the classroom”, The
American Journal of Psychology, 128(2), 229-239.
Bull, J., & McKenna, C., 2004. Blueprint for computer-
assisted assessment : London, RoutledgeFalmer
Carroll, T., and Moody, L., 2006. “Teacher-made tests”,
Science Scope, 66-67.
Chan, N., and Kennedy, P.E., 2002. “Are Multiple-Choice
exams easier for economics students? A comparison of
multiple-choice and ‘equivalent’ constructed-response
exam questions”, Southern Economic Journal, 68(4),
957-971.
Cross, L.H., Ross, F.K., and Geller, E.S., 1980. “Using
Choice-Weighted Scoring of Multiple-Choice Tests for
Determination of Grades in College Courses”, The
Journal of Experimental Education, 48(4), 296-301.
Dascalu, C.G., Enache, A.M., Mavru, R.B., and Zegan,
G., 2015. “Computer-based MCQ Assessment for
Students in Dental Medicine – Advantages and
Drawbacks”, Procedia - Social and Behavioral
Sciences, 187, 22-27.
Donnelly, C., 2014. “The use of case based multiple choice
questions for assessing large group teaching:
implications on student’s learning”, Irish Journal of
Academic Practice, 3(1), 1-15.
Freeman, R. & Lewis, R. 1998. Planning and Implementing
Assessment: London: Kogan Page.
Gower, D.M., and Daniels, D.J., 1980. “Some factors
which influence the facility index of objective test items
in school Chemistry”, Studies in Educational
Evaluation, 6, 127-136.
Hameed, I.A., 2011. “Using Gaussian membership
functions for improving the reliability and robustness of