possible safety and security in traffic. Systematic testing starts with unit and component tests, where correct behavior of each single entity is proven. The following integration tests aim at proving the correct interaction between the units and components up to the complete system. Finally, the complete system has to be tested in interaction with its future environment (Bourque et al., 2014).
Today, there are many different test methods, each best suited to different test goals. Unit and component testing can be achieved with Model-in-the-loop (MIL) and Software-in-the-loop (SIL) tests (Shokry and Hinchey, 2009; Albers et al., 2010). With each integration step, the complexity of the tests increases drastically, as the number and thereby the possible combinations of inputs, internal states and outputs increase. The focus of integration testing is the correct behavior of the software on the target hardware as well as the correct interaction between different units and components. Hardware integration can be tested with Hardware-in-the-loop (HIL) tests (Sax, 2008; Oral, 2013), whereas the correct interaction between software components can also be tested in a SIL environment. System-level testing requires at least the complete control chain plus the relevant vehicle environment that the vehicle reacts to and interacts with. Therefore, these tests are performed with prototype vehicles, either on the proving ground or in real traffic.
Testing, however, does not start with the release approval. It is crucial that the feature is extensively tested during development. Prototype vehicles offer the developers the possibility to experience the feature under realistic conditions. While these tests are valuable due to their high realism and the direct feedback they provide to the developer, they are time-consuming and costly. Since there are many iterations of testing and development, the time and resource costs of test iterations are critical. At the same time, their validity and completeness need to remain at the highest possible level. This gap can be filled by complementing the real-world tests with simulation approaches, which offer less realism but more scalability and, in particular, reproducibility of tests.
The realism and thereby the validity of the assertions made within a simulation environment strongly depend on the quality of the models used to substitute the real world. Depending on the use case, models for the vehicle, the road topology, the traffic and, e.g., other road users must be provided (Wachenfeld and Winner, 2016). One possible way to obtain large amounts of realistic data for the simulation is to reuse recorded real-world driving data from test campaigns and other road tests (Zofka et al., 2015; Langner et al., 2017). This driving data contains information about the road layout as well as about other vehicles and road users at the time of recording. With some intermediate processing, even closed-loop simulations can be achieved using the recorded data (Bach et al., 2017; de Gelder and Paardekooper, 2017).
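As a rough illustration of how such recorded data can drive a closed-loop simulation, consider the following minimal sketch. It is an assumption-laden toy, not the pipeline of the cited works: the CSV log format, the field name lead_vehicle_pos_m and the point-mass ego model are all placeholders.

    import csv

    def replay_closed_loop(log_path, controller, dt=0.1):
        """Replay recorded surrounding traffic around a re-simulated ego.

        Recorded objects are played back exactly as logged, while the
        ego vehicle is driven in closed loop by the feature under test.
        """
        ego_pos, ego_speed = 0.0, 25.0
        with open(log_path) as f:
            for row in csv.DictReader(f):
                lead_pos = float(row["lead_vehicle_pos_m"])  # recorded object
                # The feature under test (e.g. an ACC) reacts to the gap.
                accel = controller(ego_pos, ego_speed, lead_pos)
                ego_speed += accel * dt   # simple point-mass ego model
                ego_pos += ego_speed * dt
                yield ego_pos, ego_speed, lead_pos

Because the ego vehicle is re-simulated rather than replayed, its trajectory may diverge from the recording; the intermediate processing mentioned above is what keeps the played-back traffic plausible despite this divergence.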
However, for recurrent testing during development and application, even in a simulation environment, it is not efficient to use all test drives within the ever-growing data pool. A strategy for selecting representative test drives out of the data pool is required, as well as a method to extrapolate the results obtained on this representative sample to the complete data pool. One conceivable form such a strategy could take is sketched below.
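The following sketch illustrates the idea with an off-the-shelf clustering step. It is only an assumed, generic instantiation and not necessarily the approach pursued in this work; the drive feature vectors and the number of representatives are placeholder inputs.

    import numpy as np
    from sklearn.cluster import KMeans

    def select_representatives(drive_features, n_reps=10):
        """Cluster test drives by descriptive features and return one
        representative drive per cluster plus its pool share (weight)."""
        km = KMeans(n_clusters=n_reps, n_init=10).fit(drive_features)
        reps, weights = [], []
        for c in range(n_reps):
            members = np.flatnonzero(km.labels_ == c)
            # Representative: the drive closest to the cluster centroid.
            dist = np.linalg.norm(
                drive_features[members] - km.cluster_centers_[c], axis=1)
            reps.append(int(members[np.argmin(dist)]))
            weights.append(len(members) / len(drive_features))
        return reps, weights

A metric evaluated only on the representatives can then be extrapolated to the pool as the sum weighted by the cluster shares, e.g. pool_estimate = sum(w * metric(r) for r, w in zip(reps, weights)).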
For Verification and Validation (V&V), the purpose of testing is the safety and thereafter the release approval of the FUT. In order to achieve this, the feature's correct behavior in every conceivable situation has to be proven, e.g. by successfully completing each possible test once. Completeness of testing can be argued in several ways.
For one, stochastic measures can be applied. Metrics like fatalities, injuries or disengagements per x kilometers may give an indication of the system's safety (Shalev-Shwartz et al., 2017). However, Wachenfeld and Winner (Wachenfeld and Winner, 2016) show that billions of driven test kilometers are required for statistically valid assertions for higher SAE levels, due to the rarity of crashes or critical situations in real-world traffic.
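The order of magnitude can be illustrated with a simple zero-failure argument; the mean distance between fatalities $d_0$ used below is a rough, assumed figure for illustration, not the exact value from the cited study. If fatal crashes occur on average once every $d_0 \approx 2 \times 10^{8}$ km and a fleet drives a distance $d$ without observing any fatality, then under a Poisson assumption this observation supports the safety claim with confidence $1-\alpha$ only if

\[
P(\text{no fatality in } d) = e^{-d/d_0} \le \alpha
\quad\Longrightarrow\quad
d \ge -\ln(\alpha)\, d_0 \approx 3 \cdot 2 \times 10^{8}\,\mathrm{km}
\]

for $\alpha = 0.05$, i.e. already more than half a billion kilometers for a single software state, and considerably more if the system is required to be demonstrably safer than the human reference or if the test has to be repeated after every change.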
To counter the problem of rare occurrences of critical situations, scenario-based testing (Conrad et al., 2005) has been introduced. Here, test content is not randomly generated through driving in the real world but explicitly specified via scenarios. Each scenario represents a certain situation that is to be tested. Thus, rare situations can be tested explicitly, independent of their frequency in normal traffic. For the safety argument, the focus is set on critical scenarios, which are more relevant for the release approval (Junietz et al., 2017).
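To make the notion of a scenario concrete, the following sketch shows one possible parameterized template; the scenario type, its parameters and the enumerated values are purely illustrative and not taken from the cited works.

    from dataclasses import dataclass

    @dataclass
    class CutInScenario:
        """Parameterized description of a highway cut-in, to be
        instantiated and executed in a simulation environment."""
        ego_speed_mps: float           # initial ego velocity
        gap_at_cut_in_m: float         # longitudinal gap at lane-change start
        cut_in_speed_mps: float        # velocity of the cutting-in vehicle
        lane_change_duration_s: float  # duration of the lateral maneuver

    def critical_cut_ins():
        # Rare but critical parameter combinations are enumerated
        # explicitly, independent of their frequency in normal traffic.
        for gap in (5.0, 10.0, 15.0):
            for delta_v in (-5.0, -10.0):
                yield CutInScenario(33.0, gap, 33.0 + delta_v, 3.0)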
However, both approaches have little validity towards the assessment of the overall feature maturity in terms of passenger comfort and feature quality, as they focus only on safety-relevant aspects. Specific situations are either not considered at all or are cherry-picked, while the frequency of the situations is completely neglected. For a quality assessment, the frequency of the situations in real-world traffic has to be identified and taken into account. For instance, corner cases are less important for the driver experience than for the safety argument. In contrast, the frequent situations that occur more often than the corner cases have to be weighted higher for an overall comfort evaluation.
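A frequency-weighted aggregation makes this explicit; in the following sketch, the situation classes $s$, their observed real-world frequencies $f_s$ and the per-situation comfort scores $q_s$ are assumed inputs:

\[
Q = \frac{\sum_{s} f_s\, q_s}{\sum_{s} f_s}.
\]

A corner case with a very small $f_s$ barely influences $Q$, even though it may dominate the safety argument, whereas frequent everyday situations contribute strongly.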
In this work, we want to focus on the quality assessment