3.4 Example 3: Programming Item
Item generation for imperative or object-oriented programming tasks is even more challenging, as the solution requires a sequence of decisions. The following generated item, taken from an examination, measures the skills in programming in an imperative (PL/SQL) programming language.
Consider the following relation:
[enrolment] ([modulid],
[studentnumber], [status],
[enrolmentdate])
It should be ensured that a [enrolment]
to a [module] can only be [deleted] if
the [status] is "[enrolled]" and the
[deletion] is done within [14 days]
after the [enrolment date].
Implement this rule in Oracle and
provide test cases to validate it. Pay
attention to structured programming. In
case of failure, meaningful error
messages should be displayed.
In contrast to the second example task, this example question describes a real-world scenario that calls for implementing a rule in an imperative database program. However, the question does not state what kind of database program (stored function, stored procedure, or trigger) has to be implemented or how the implemented program has to be tested. To test the database program, the students must formulate data manipulation statements. Further constraints, such as structured programming and meaningful error messages, are added to the question text to ensure similar difficulty levels for evaluation purposes. Although solution hints for such programming tasks could be created according to the domain model, the generated solutions do not cover all possible structural solution variations.
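For illustration, one admissible solution variant is sketched below under the assumption that the placeholders are instantiated as shown in the item: a row-level BEFORE DELETE trigger on the enrolment relation, followed by data manipulation statements serving as test cases. The trigger name, the error numbers, and the sample values are illustrative assumptions, and a generated reference solution could equally use a stored procedure instead.

CREATE OR REPLACE TRIGGER trg_enrolment_delete
BEFORE DELETE ON enrolment
FOR EACH ROW
BEGIN
  -- Only enrolments whose status is 'enrolled' may be deleted.
  IF :OLD.status <> 'enrolled' THEN
    RAISE_APPLICATION_ERROR(-20001,
      'Deletion rejected: the enrolment status is not ''enrolled''.');
  END IF;
  -- The deletion must take place within 14 days after the enrolment date.
  IF SYSDATE > :OLD.enrolmentdate + 14 THEN
    RAISE_APPLICATION_ERROR(-20002,
      'Deletion rejected: more than 14 days have passed since the enrolment date.');
  END IF;
END;
/

-- Test cases formulated as data manipulation statements:
INSERT INTO enrolment (modulid, studentnumber, status, enrolmentdate)
VALUES (1, 1001, 'enrolled', SYSDATE);                             -- recent enrolment
INSERT INTO enrolment (modulid, studentnumber, status, enrolmentdate)
VALUES (2, 1001, 'enrolled', SYSDATE - 30);                        -- older than 14 days
DELETE FROM enrolment WHERE modulid = 1 AND studentnumber = 1001;  -- succeeds
DELETE FROM enrolment WHERE modulid = 2 AND studentnumber = 1001;  -- raises ORA-20002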
4 FIRST EXPERIENCES
Since 2021, the author has used the two-step workflow for the automatic generation of items with varying domain models. The generated items have been used in formative assessments of a database course at the author's university; each year, 148 ± 6 students participate in the formal assessment. In the first examination in 2021, question templates were used, each covering sequenced tasks in declarative (SQL) and imperative (PL/SQL) programming languages. Five question templates covering a whole examination were combined with 18 domain models, so that 90 assessment items were assigned individually to the participants in an open-book examination. Each question template contained problem-solving tasks, such as implementing a database schema, queries, and views, and implementing and testing a database program. However, similarities in the student solutions suggest that solutions to items generated from at least two question templates were exchanged between the participants during the examination.
Therefore, since 2022, the learning management system has been used to assign randomly generated test items from 13 item pools to the students. In this examination, the modelling items comprise the creation of installation scripts with more constraints for up to three relations, structurally comparable to example 1. The items testing the skills in declarative programming have had a higher difficulty level than example 2 due to more complex solutions containing join and set operations, nested queries, or the definition of views. The imperative programming items are comparable to example 3 above. Due to the limitations of the learning management system, assigning each student individual tasks as intended was impossible. Therefore, for each task, 18 domain models were combined with six to nine question templates, and the resulting items were assigned randomly to the examinees. Thus, each participant received items according to varying domain models in this examination. The assessments comprised up to 35% closed questions (single- and multiple-choice) and open, problem-based free-text questions.
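To give an impression of the structural complexity referred to above, the following sketch shows the kind of solution such items may call for: an installation script fragment with constraints and a view definition that combines a join, grouping, and a nested query. All relation, attribute, constraint, and view names are hypothetical and do not stem from the actual item pools; the enrolment attributes merely echo example 3 for readability.

-- Installation script fragment with constraints (cf. example 1); the referenced
-- module relation is assumed to exist already.
CREATE TABLE enrolment (
  modulid        NUMBER       NOT NULL,
  studentnumber  NUMBER       NOT NULL,
  status         VARCHAR2(20) DEFAULT 'enrolled'
                 CONSTRAINT chk_enrolment_status
                 CHECK (status IN ('enrolled', 'cancelled', 'completed')),
  enrolmentdate  DATE         DEFAULT SYSDATE NOT NULL,
  CONSTRAINT pk_enrolment PRIMARY KEY (modulid, studentnumber),
  CONSTRAINT fk_enrolment_module FOREIGN KEY (modulid) REFERENCES module (modulid)
);

-- Declarative solution combining a join, grouping, and a nested query (cf. example 2):
CREATE OR REPLACE VIEW popular_modules AS
SELECT m.modulid, m.title, COUNT(*) AS enrolment_count
FROM module m
JOIN enrolment e ON e.modulid = m.modulid
WHERE e.status = 'enrolled'
GROUP BY m.modulid, m.title
HAVING COUNT(*) > (SELECT AVG(cnt)
                   FROM (SELECT COUNT(*) AS cnt
                         FROM enrolment
                         GROUP BY modulid));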
However, using generated tasks with individually assigned domain models does not prevent students from exchanging solutions during the online examination. Two malpractice cases occurred in which students submitted the exact solution to open questions corresponding to another domain model or question template. Typical student errors could explain some other instances of similarities between solutions. Differences in the difficulty level did not appear, apart from two cases of readability issues due to grammatical flaws in the reading text.
Assuring the quality of the item templates is essential for successfully generating items using the proposed two-step workflow. Overall, reviewing the question templates is more demanding than reading plain text, as the descriptions in question templates are more abstract. Therefore, an expert-based evaluation has been carried out on the test corpora instead of a student-based or model-based evaluation. The model-based and the student-based evaluation methods are not applicable due to the absence of a mathematical model when varying the domain context. In this