
command line logs of students during a cyber exer-
cise in order to model their progress through the cy-
ber exercise. Based on a questionnaire for instructors,
they evaluated and assessed both a trainee graph and
a milestone graph in order to derive implications for
teaching practice. Macak et al. (Macak et al., 2022)
contributed an approach utilizing a process discovery
algorithm in order to discover participants’ processes
in cyber trainings. They used the Bash history of par-
ticipants and extended it with “hints taken” activities,
which mark points where participants requested predefined hints.
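To make the idea of deriving a process from command histories more concrete, the following is a minimal sketch of a directly-follows graph, a basic building block of many process discovery techniques. It is not the algorithm used in the cited work; the traces, the command names, and the function `directly_follows` are hypothetical and chosen only for illustration.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    dfg = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg

# Hypothetical traces: one list of activities per participant.
# "hint taken" is an activity added on top of the shell commands.
traces = [
    ["nmap", "ssh", "hint taken", "cat"],
    ["nmap", "hint taken", "ssh", "cat"],
    ["nmap", "ssh", "cat"],
]

dfg = directly_follows(traces)
for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
```

Edges with high counts indicate common paths through the exercise, while edges into "hint taken" show at which step participants tended to ask for help.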
In this regard, the cited works primarily offer
deeper insights into participants’ behavior rather than
merely identifying task completion. However, their
logs are predominantly limited to Bash commands.
While the exclusive analysis of Bash history provides
a comprehensive and clear view of participants’ be-
havior in exercises conducted solely within the Bash
environment, it is often insufficient for more com-
plex infrastructures. Cyber exercises take place on
complex platforms (i.e., Cyber Ranges (Leitner et al.,
2020), (Čeleda et al., 2015)), which simulate vir-
tual infrastructures often comprising multiple net-
works and servers (potentially even augmented by
physical components (Yamin et al., 2018)). Attacks
occur somewhere within the infrastructure, and par-
ticipants may have to connect to remote servers to
mitigate these incidents. In such environments, solely
analyzing the Bash history proves to be a limiting fac-
tor, as crucial actions also occur outside of a partici-
pant’s Bash history. Therefore, the improvement of
data collection during cyber exercises is a decisive
factor to allow more accurate evaluation of participant
performance (Henshel et al., 2016).
Andreolini et al. (Andreolini et al., 2020) present
an approach to discover and assess the performance
/ behavior of participants on cyber ranges. They col-
lect data such as command history, web browsing his-
tory, GUI interactions, and network events to define
events, add them to a graph and thereafter calculate
metrics like speed or precision. However, the authors
focus on graph development and evaluating the par-
ticipants’ performance instead of demonstrating how
to collect and process data from the exercise environ-
ment. Braghin et al. (Braghin et al., 2020) created a
hierarchy of categories and sub-categories of actions
participants can perform in a cyber exercise and took
advantage of the approach described in (Andreolini
et al., 2020) by adapting it to their purposes. They
used the resulting graph to algorithmically score par-
ticipants.
In addition to determining whether tasks have
been completed or not and developing graphs of par-
ticipants’ activities, there are plenty of technical
metrics that can be calculated and utilized for eval-
uation and scoring purposes. Possible metrics are, for
instance, the time until a certain command is executed
(Labuschagne and Grobler, 2017), the mean time per
action / task (Abbott et al., 2015), the number of ac-
tions per task (Abbott et al., 2015), the number of cor-
rectly identified attacks (Patriciu and Furtuna, 2009),
and many more. Maennel (Maennel, 2020)
performed an extensive literature review and deter-
mined potentially relevant metrics and argued how
they could be measured in order to reflect the learning
success of participants.
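As an illustration of how such technical metrics could be derived from a timestamped action log, consider the following minimal sketch. The log format, the task identifiers, and all function names are assumptions made for this example and are not taken from the cited works; mean time per action is approximated here as the mean gap between consecutive actions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log: (seconds since exercise start, task id, action).
events = [
    (12.0, "task1", "nmap 10.0.0.5"),
    (40.0, "task1", "ssh admin@10.0.0.5"),
    (95.0, "task2", "cat /var/log/auth.log"),
    (130.0, "task2", "iptables -A INPUT -s 10.0.0.9 -j DROP"),
    (150.0, "task2", "systemctl restart sshd"),
]

def time_until_command(events, prefix):
    """Seconds until the first action starting with `prefix`, or None."""
    for ts, _task, action in sorted(events):
        if action.startswith(prefix):
            return ts
    return None

def actions_per_task(events):
    """Number of logged actions for each task."""
    counts = defaultdict(int)
    for _ts, task, _action in events:
        counts[task] += 1
    return dict(counts)

def mean_time_per_action(events):
    """Mean gap in seconds between consecutive actions, or None."""
    timestamps = sorted(ts for ts, _task, _action in events)
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return mean(gaps) if gaps else None

print(time_until_command(events, "iptables"))
print(actions_per_task(events))
print(mean_time_per_action(events))
```

All three metrics depend on the log actually containing the relevant actions, which is why the completeness of data collection matters for scoring.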
While there is a substantial body of literature on
participant behavior in cyber exercises, existing ap-
proaches often have significant limitations. Many ap-
proaches focus narrowly on technical monitoring,
such as exclusively analyzing the terminal history
(e.g., (Mirkovic et al., 2020), (Macak et al., 2022)).
Other approaches emphasize behavioral representa-
tion and comparison using graph-based methods (e.g.,
(Andreolini et al., 2020), (Braghin et al., 2020)), with-
out going into detail about how the underlying data is col-
lected from systems. Our approach addresses this gap
by providing a comprehensive method for the targeted
monitoring of complex infrastructures, enabling the
collection of meaningful data and generating valuable
insights from it.
3 MONITORING POINTS
In the complex delivery of cyber exercises, the chal-
lenges lie not only in monitoring participant activ-
ities but also in deriving meaningful insights from the
abundance of generated data. Our approach aims to
address these challenges by leveraging the scenario-
based nature of cyber exercises (Wen et al., 2021) to
our advantage.
3.1 Concept of a Monitoring Point
In essence, we introduce the concept of monitor-
ing points, which are strategically positioned within
the exercise environment to gain enhanced insights.
These points serve as focal nodes, each designed to
examine specific predefined facets of a participant’s
system. Instead of casting a wide net across the entire
cyber exercise infrastructure or limiting our monitoring
to single components (e.g., bash history), we advo-
cate for targeted monitoring that focuses on compo-
nents and applications relevant to the cyber exercise
scenario and its learning objectives.
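The targeting idea can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the class `MonitoringPoint`, its fields, the host and component names, and the event format are all hypothetical and do not reflect the internal structure of a monitoring point as defined in this paper. It merely shows how a scenario-specific list of observation targets could narrow a broad event stream down to the relevant parts.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MonitoringPoint:
    """Hypothetical descriptor of one targeted observation."""
    name: str
    host: str
    component: str           # e.g. "bash_history" or "firewall_rules"
    learning_objective: str  # objective this observation supports

def select_events(events, points):
    """Keep only events from (host, component) pairs that are monitored."""
    watched = {(p.host, p.component) for p in points}
    return [e for e in events if (e["host"], e["component"]) in watched]

# Hypothetical scenario: two components matter for the learning objectives.
points = [
    MonitoringPoint("fw-rules", "gateway", "firewall_rules", "contain the attack"),
    MonitoringPoint("web-shell", "webserver", "bash_history", "investigate the host"),
]

# Hypothetical raw events from the whole infrastructure.
events = [
    {"host": "gateway", "component": "firewall_rules", "data": "DROP 10.0.0.9"},
    {"host": "webserver", "component": "bash_history", "data": "tail access.log"},
    {"host": "mailserver", "component": "syslog", "data": "cron job started"},
]

relevant = select_events(events, points)
print([e["host"] for e in relevant])
```

Tying each observation target to a learning objective keeps the collected data interpretable, since every retained event can be traced back to a reason for observing it.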
A monitoring point is a passive observer within
the exercise environment, comprising two fundamen-