assesses the students’ program source code. In the
testing phase, the system gives the lecturer the
opportunity to view the students’ source code. The
system also asks the lecturer questions regarding
students’ program code as part of the assessment
process. The examiner answers each question by
choosing one of the listed options, such as awful,
poor, fair or very good, and the assessment process
then continues. The system also applies software
metrics to the students’ programs. Lastly, each
student receives feedback on their exercises.
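As an illustration of the software-metrics step, such a system might compute simple quantitative measures over each submission. The sketch below is a hypothetical example (not the metric set used by the system described above), assuming Python submissions and two basic metrics: non-blank line count and number of branch points.

```python
import ast

def simple_metrics(source: str) -> dict:
    """Illustrative metrics: non-blank lines and branch count."""
    tree = ast.parse(source)
    # Count branching constructs (if/for/while); an elif appears
    # as a nested If node, so it is counted as a branch too.
    branches = sum(isinstance(n, (ast.If, ast.For, ast.While))
                   for n in ast.walk(tree))
    lines = len([ln for ln in source.splitlines() if ln.strip()])
    return {"lines": lines, "branches": branches}

student_code = """
def grade(score):
    if score >= 90:
        return 'A'
    elif score >= 80:
        return 'B'
    else:
        return 'C'
"""

print(simple_metrics(student_code))  # → {'lines': 7, 'branches': 2}
```

Metrics of this kind give the examiner a quick overview of a submission’s size and complexity before any manual inspection.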
In the system developed by Joy et al. (2005), the
correctness, style and authenticity of the student
program code are assessed. The system is designed for
programming exercises. Students can submit their
programs using the BOSS system (a submission and
assessment system) (Joy et al., 2005). In the
feedback process, a lecturer tests and marks the
students’ submissions using BOSS. The system also
allows lecturers to get information on students’
results according to the automatic tests applied and to
view the original source code. Thus, the examiner can
then give further feedback in addition to the
system’s feedback. At the end of the assessment, the
student gets feedback including comments and a
score, rather than just a score.
2.1 Discussion of Related Work
In the related work section, five studies were
introduced in terms of their strengths and
weaknesses. Although some of them may provide
sufficient feedback if correctly applied, they are not
designed to significantly alleviate the workload of
the examiner. That is, providing feedback may
negatively affect the time the examiner takes to
assess student work. The examiner’s workload depends
largely on the approach taken by the assessment
system; in addition, it depends on the length of the
code script being assessed.
The systems of Wang et al. (2007), Sharma et
al. (2014) and Saikkonen et al. (2001) focus on the
structure of students’ program code, which can be
useful but is limited in terms of feedback. While the
systems of Wang et al. (2007) and Saikkonen et al.
(2001) focus on the whole code structure in their
own systems, the system of Sharma et al. (2014)
covers only the ordering of conditions in the ‘else-if’
structure. The aim of the standardisation of the code
structure in the system of Wang et al. (2007) is to
reduce the number of model answers. Furthermore,
the code structure is standardised in order to grade
students’ code rather than to provide comments
(feedback) on the code structure. On the other hand,
the system of Saikkonen et al. (2001) assesses the
return values instead of actual output strings because
of the differences in wording in students’ answers.
In other words, the system focuses on the execution
of the abstract syntax tree. Consequently, the system of Saikkonen
et al. (2001) may not provide comprehensive
feedback for students. Also, the Sharma et al. (2014)
system handles only ‘else-if’ structures; it is
effective for providing feedback to novice
programmers, but solely on that construct. However, the
theory behind the system could allow it to handle
other control structures, loops and functions in the
future. Moreover, the quality of feedback could have
been enhanced by the inclusion of a human in the
assessment process.
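To make the condition-ordering issue concrete, the following hypothetical sketch (not Sharma et al.’s actual algorithm) shows how a misordered ‘else-if’ chain changes behaviour even though every condition is individually correct:

```python
def grade_wrong(score):
    # Misordered: the weakest condition is tested first,
    # so it shadows the stricter one below it.
    if score >= 60:
        return "pass"
    elif score >= 80:
        return "merit"   # unreachable: any score >= 80 is also >= 60
    else:
        return "fail"

def grade_right(score):
    # Correctly ordered: the strictest condition comes first.
    if score >= 80:
        return "merit"
    elif score >= 60:
        return "pass"
    else:
        return "fail"

print(grade_wrong(85))   # pass  (wrong: shadowed by the first branch)
print(grade_right(85))   # merit
```

Detecting this kind of shadowing, and explaining it to the student, is exactly the sort of targeted feedback on ‘else-if’ chains that such a system can give novice programmers.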
The systems developed by Jackson (2000) and
Joy et al. (2005) highlight the importance of a
human in the assessment process for providing
comprehensive feedback. In the Jackson (2000) system
the examiner is part of the assessment process,
whereas in the Joy et al. (2005) system the examiner
is involved only after the automated assessment. In these systems,
humans check each student’s code separately.
Therefore, the systems cannot reduce the workload
of examiners significantly, although they can
provide sufficient feedback using these approaches.
One significant drawback of the Joy et al. (2005)
system is that, while examiners can give students
additional comments, those comments may be
inconsistent across submissions, as the system does
not check them. Automatically reusing the feedback
given for particular segments of code would have
allowed greater consistency and efficiency to be
achieved. On the other hand, in
the Jackson (2000) system, the examiner chooses one
comment from the suggested comments. However,
the system cannot provide comprehensive feedback
because the examiner cannot add comments to the
student’s code.
To conclude, the assessment studies discussed
were intended to provide sufficient feedback and
reduce the workload of the examiner. However, they
have generally focused on whole code segments rather
than on control structures, loops, functions, etc.
Thus, although some of them reduce the examiner’s
workload, they have generally provided superficial
feedback. Moreover, the discussed studies are
generally based on semantic similarity. The proposed
approach is also related to semantic and structural
similarity. The main difference is that the proposed
approach does not need model answer(s), whereas the
discussed studies do.
Therefore, the proposed approach parses the whole