To understand why misbehavior detection cannot be assessed
comprehensively with the confusion matrix, a new perspective on
misbehavior is needed. Attackers who send misleading or incorrect
information about the state of the world are not trying to get as
many packets accepted as possible; their goal is to alter the state
of the physical world in their favor. The success of an attack and,
in consequence, the ability of any misbehavior detection system
(MDS) to mitigate it can thus only be measured by considering the
effect of the communication on the physical world. This aspect has
rarely been considered, and it requires new tools to interactively
simulate cooperative intelligent transport systems (C-ITS) in the
presence of attackers.
2 RELATED WORK
Attacks and misbehavior detection in vehicular networks have been
studied extensively in recent years (van der Heijden et al., 2019;
van der Heijden, 2018). Many approaches to detecting attacks have
been published, but most focus on detecting very specific attacks.
Consequently, results are difficult to compare, and authors usually
do not share a common evaluation approach or even metrics that allow
for comparison. Moreover, implementations are often not available,
which makes reproducing results difficult.
Recently, the VeReMi dataset, which contains communication traces of
different attacks conducted in the LuST (Codeca et al., 2015) traffic
scenario, has been published and extended to improve evaluation
quality in the field and to provide a basis for comparison (van der
Heijden et al., 2018; Kamel et al., 2020). The simulation code is
based on the Veins vehicular network simulator (Sommer et al., 2011)
and is publicly available.
The choice of metrics for evaluating the quality of detection
mechanisms is a related topic under discussion in the community. The
authors of VeReMi propose using the precision/recall graph over all
reception events (van der Heijden et al., 2018) instead of the false
positive rate (FPR) and false negative rate (FNR) used by many
earlier works. While such metrics are easy to obtain in simulations,
the numbers are difficult to interpret. Although they seem applicable
to most data-centric approaches and permit comparison to some extent,
not all mechanisms can be measured accurately. The evaluation also
depends on further subtleties, such as the aggregation method and the
definition of when a message is considered malicious (van der Heijden
and Kargl, 2017), which further weakens comparability between
mechanisms. To
take the dispersion of detection errors across participants into
account, the Gini index of the FPRs and FNRs of different vehicles
has been proposed (van der Heijden et al., 2018). While the Gini
index can describe differences in classification performance between
vehicles, the underlying question of the individual importance of
packets and vehicles remains open. The similarity between C-ITS
and cyber-physical systems (CPS) has been noted and discussed before
in the context of misbehavior detection (van der Heijden et al.,
2016). However, that discussion did not consider the physical part
and mostly treated the two domains separately. Application behavior
metrics have been used to analyze the physical impact of attacks on
specific applications such as cooperative adaptive cruise control
(CACC) (van der Heijden, 2018), but they have not been explored
further due to concerns about dependencies on specific
implementations (van der Heijden and Kargl, 2017). This concept is
related to our approach; we generalize its application and
systematically show why these metrics are crucial for the assessment
of MDS.
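
As an illustration of the dispersion measure mentioned above, the following minimal Python sketch computes a Gini index over per-vehicle false positive rates. It assumes the standard discrete formulation on sorted values; the cited work may use a different variant. The function name `gini_index` and the sample rates are hypothetical and are not taken from any dataset.

```python
from typing import Sequence


def gini_index(values: Sequence[float]) -> float:
    """Gini index of non-negative values.

    Returns 0.0 when all values are equal and (n - 1) / n when a single
    entry carries the whole sum. Assumption: standard discrete formulation.
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    total = sum(values)
    if total == 0:
        # All rates are zero, so there is no dispersion to report.
        return 0.0
    ordered = sorted(values)
    # Closed form on ascending data, with ranks i = 1..n:
    #   G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-vehicle FPRs: evenly spread errors vs. errors
# concentrated on a single vehicle.
even_fprs = [0.1, 0.1, 0.1, 0.1]
skewed_fprs = [0.0, 0.0, 0.0, 0.4]
print(gini_index(even_fprs), gini_index(skewed_fprs))
```

Both sample lists have the same mean FPR, yet their Gini indices differ. This is the dispersion the measure is meant to expose, but it still says nothing about which vehicle's errors matter most.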
3 A NEW SECURITY MODEL FOR C-ITS
In the literature, several attacker models are used, and attackers
are usually described by their intentions and capabilities. These
differ greatly between possible attackers, but a single attacker can
perform multiple attacks within their capabilities. While individual
misbehavior detection mechanisms mostly focus on detecting specific
attacks, the discussion about evaluation metrics for misbehavior
detection tries to find a generic set of metrics that is independent
of both the mechanism and the attacker model. One property common to
most (but not all) attackers is that they send packets, and this is
what recent evaluation metrics are based on: the classification of
incoming packets as legitimate or malicious. Metrics based on the
confusion matrix of this classification have to assume that the
quality of a detection mechanism is related in some form to the
number of correct classifications, and that mechanisms that classify
better are also better at detecting attacks.
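
To make the packet-level view concrete, the following Python sketch derives a confusion matrix and the usual rates from per-packet classification results. It is an illustration under stated assumptions, not the evaluation code of any cited work: the names `ConfusionMatrix` and `confusion_from_events` and the event format `(is_malicious, flagged)` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the rate is undefined."""
    return numerator / denominator if denominator else float("nan")


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # malicious packets flagged as malicious
    fp: int  # legitimate packets flagged as malicious
    tn: int  # legitimate packets accepted
    fn: int  # malicious packets accepted

    @property
    def precision(self) -> float:
        return _ratio(self.tp, self.tp + self.fp)

    @property
    def recall(self) -> float:
        return _ratio(self.tp, self.tp + self.fn)

    @property
    def fpr(self) -> float:
        return _ratio(self.fp, self.fp + self.tn)

    @property
    def fnr(self) -> float:
        return _ratio(self.fn, self.fn + self.tp)


def confusion_from_events(events: Iterable[Tuple[bool, bool]]) -> ConfusionMatrix:
    """Count outcomes over (is_malicious, flagged) pairs, one per received packet."""
    tp = fp = tn = fn = 0
    for is_malicious, flagged in events:
        if is_malicious and flagged:
            tp += 1
        elif is_malicious:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return ConfusionMatrix(tp=tp, fp=fp, tn=tn, fn=fn)
```

Every packet contributes exactly one count here, regardless of its effect on the receiving vehicle. That equal weighting is the implicit assumption behind confusion-matrix metrics.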
We argue that this approach is too broad. While such metrics are
applicable to most relevant attacks and detection mechanisms, the
underlying assumption does not hold in many cases, and the
effectiveness of mechanisms is therefore not sufficiently reflected.
This becomes apparent when considering the success of attacks: an
attack is successful if the attacker’s goal is achieved. The
hypothesis that the
VEHITS 2021 - 7th International Conference on Vehicle Technology and Intelligent Transport Systems