to combine both approaches and provide a system in
which the final grade is a combination of the informa-
tion obtained from both analyzers.
In this paper we present a web-based application, named QUIMERA, that extends traditional
dynamic grading systems with static analysis, aiming
at the automatic evaluation and ranking of programming exercises in competitive learning or programming contest environments. QUIMERA provides a full
contest management system, as well as an automatic
judgement procedure.
The paper is organized into six sections. Besides
the Introduction and Conclusion, Section 2 introduces
the basic concepts and similar tools; Section 3 gives
an overview of QUIMERA, presenting its architecture
and main features; Section 4 provides more details
about each feature, illustrating QUIMERA functionali-
ties; and Section 5 presents the technical details un-
derlying the system implementation.
2 RELATED WORK
As previously mentioned, the approaches to developing
automatic grading systems can be divided into
static and dynamic (Danić et al., 2011).
Static approaches include systems that check the
submitted program against a provided scheme, to determine
its degree of similarity with respect to a set of characteristics.
In this category we have, for example, the Web-based
Automatic Grader (Zamin et al., 2006), which
evaluates programming exercises written in the Visual
Basic, C, or Java languages. One of its disadvantages
is that it cannot be used for testing the correctness
of programs that contain input and output operations
(Danić et al., 2011). Another important disadvantage
is that higher grades are assigned to solutions
that are more similar to the provided scheme, penalizing
different programming styles (Rahman et al.,
2008) that may be just as good or even better.
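The style bias just described can be illustrated with a small sketch (hypothetical code, not taken from any of the cited graders, which use their own, more elaborate matching schemes): the grade is proportional to how close the submission's token sequence is to the reference scheme, so a correct but stylistically different solution scores lower.

```python
import difflib
import re


def tokenize(source: str) -> list[str]:
    """Split source code into identifier and symbol tokens."""
    return re.findall(r"[A-Za-z_]\w*|[^\sA-Za-z_]", source)


def similarity_grade(submission: str, scheme: str, max_grade: int = 100) -> int:
    """Grade proportional to the similarity between submission and scheme."""
    ratio = difflib.SequenceMatcher(
        None, tokenize(submission), tokenize(scheme)).ratio()
    return round(max_grade * ratio)


scheme      = "int add(int a, int b) { return a + b; }"
same_style  = "int add(int x, int y) { return x + y; }"
other_style = "int add(int x, int y) { while (y != 0) { x++; y--; } return x; }"

# Both submissions compute the sum correctly, yet the stylistically
# different (iterative) one receives a lower grade:
print(similarity_grade(same_style, scheme))
print(similarity_grade(other_style, scheme))
```

A submission identical to the scheme receives the maximum grade, which is exactly the penalization of alternative programming styles discussed above.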
Dynamic approaches include systems that eval-
uate the submitted program by running it through
a set of predefined tests. One example is Online
Judge (http://uva.onlinejudge.org), implemented to help in the preparation for the
ACM-ICPC (Cheang et al., 2003). Another example
is Mooshak (http://mooshak.dcc.fc.up.pt/), a system originally developed for
managing programming contests and a reference tool
for competitive learning (Leal, 2003; Leal and Silva,
2008). Generally, these approaches use a simple string
comparison between the expected output and the output
actually produced to determine whether both values are
equal (the submitted program is considered correct
only if this condition holds); this strict
comparison is a limitation, and it is the main drawback
of these systems.
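The judging loop of such dynamic systems can be sketched as follows (a minimal, hypothetical harness; real judges such as Mooshak additionally sandbox the program and enforce time and memory limits):

```python
import subprocess
import sys


def judge(cmd: list[str], tests: list[tuple[str, str]]) -> str:
    """Run the submitted program on each test case and compare its
    standard output verbatim against the expected one: a single differing
    character is enough to yield 'Wrong Answer' (the strict string
    comparison discussed above)."""
    for stdin_data, expected in tests:
        result = subprocess.run(cmd, input=stdin_data,
                                capture_output=True, text=True, timeout=5)
        if result.stdout != expected:
            return "Wrong Answer"
    return "Accepted"


# A trivial echo program stands in for a student submission:
echo = [sys.executable, "-c", "print(input())"]
print(judge(echo, [("hello\n", "hello\n")]))  # Accepted
print(judge(echo, [("hello\n", "hello")]))    # Wrong Answer (newline differs)
```

A correct program that merely prints a trailing space, or uses a different line ending, is rejected outright, which is the strictness limitation noted above.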
There are also several other tools, used by instructors
at some modern educational institutions, that
facilitate the automatic grading of programming assignments
(Patil, 2010). Some of them are developed
as web applications where users can exercise
their programming skills by coding a solution for a
given problem, such as Practice-It (http://webster.cs.washington.edu:8080/practiceit/) or Better Programming
(http://www.betterprogrammer.com). One of the advantages of these systems is
that users get instantaneous feedback on their
answers, which helps them retry the problem until
they reach the correct solution. Another example is
WebCAT (http://web-cat.cs.vt.edu/), a tool focused on test-driven development
that supports the grading of assignments where students
are asked to submit their own test cases.
One major disadvantage of these traditional approaches
is their inability to analyze the way the
source code is written. This is especially relevant in
educational environments, where the instructor wants
to teach programming styles or good practices for a
specific paradigm or language. This feature would be
crucial to detect situations where the submitted solution
does not comply with the exercise rules. As
an example, consider a typical C programming exercise
that asks the student to implement a graph using
adjacency lists and to print the shortest path between
two given nodes. The aforementioned grading systems will
consider a solution implemented with an adjacency matrix
completely correct if the final output is equal
to the expected one; however, that solution is not acceptable
because it does not satisfy all the assignment
requirements. Even more dramatically, if the user
computes the shortest path by hand and the submitted
program only prints it, the solution will again be accepted,
because the evaluation system cannot detect
such an erroneous situation.
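The second loophole can be made concrete with a deliberately degenerate submission (sketched here in Python for brevity; the argument is identical for the C exercise above): the program hard-codes the answer, yet a purely output-based judge accepts it.

```python
def hardcoded_solution() -> str:
    # Shortest path computed by hand beforehand: no graph structure,
    # no shortest-path algorithm, just the literal answer.
    return "A -> C -> D"


expected_output = "A -> C -> D"

# Verdict of a judge that only compares the produced output with the
# expected one, as the dynamic systems described above do:
verdict = "Accepted" if hardcoded_solution() == expected_output else "Wrong Answer"
print(verdict)  # Accepted
```

Only an analysis of the source code itself could reveal that no graph or shortest-path computation is present.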
This means that traditional dynamic grading sys-
tems leave aside one important aspect when assessing
programming skills: the source code quality. However,
other tools, such as static code analyzers (see for
example Frama-C, http://frama-c.com/what_is.html, or Sonar, http://www.sonarsource.org/features/), are able to identify
the structure of the source code and extract complementary information
from it, in order to understand the way
the program is written and to discuss its quality in terms
of software metrics that can be easily computed. They
can be invoked after compilation and do not need any
ICEIS 2012 - 14th International Conference on Enterprise Information Systems