gation efforts to enhance the robustness of the classi-
fication function.
Our contributions in this paper are listed below:
• We propose an approach to estimate the trained
DNN model’s vulnerability to a particular mis-
classification. Along with it, we propose the crite-
ria to categorise the DNN model’s vulnerability to
a particular misclassification into one of the three
levels: low, moderate or high.
• We further argue that the ease with which the rate
of a particular misclassification can be kept un-
der control depends on the estimated value of the
DNN model’s vulnerability to it. In other words,
we highlight that it is easier to control the misclassifications to which the model is estimated to have low vulnerability than those to which it is estimated to be highly vulnerable. We validate our arguments empirically.
The practical examples and the corresponding experimentation conducted in the context of the above-mentioned contributions have been extensively discussed in the paper. For the experimentation, the libraries NumPy (Harris et al., 2020), Keras (Chollet et al., 2015), SciPy (Virtanen et al., 2020), scikit-learn (Pedregosa et al., 2011) and Matplotlib (Hunter, 2007) were used, along with other standard Python libraries. The remainder of the paper is
organized as follows. In Section 2, we present a brief
background to discuss the rationale behind our pro-
posed approach of estimating the DNN’s vulnerabil-
ity to a particular misclassification. In Section 3, we
present a detailed description of the approach, and an
experimental analysis in the context of traffic sign classification. Section 4 addresses the second part of our
contributions, i.e., an empirical investigation to show
that the set of misclassifications to which the DNN
model is ranked to be highly vulnerable are rather
difficult to manage, even after the implementation of
the measures to control the misclassification rate. Fi-
nally, Section 5 presents a conclusion along with a
brief overview on the possible future directions.
2 BACKGROUND
In the context of image classification, the objective is to
ensure that a classifier is able to correctly distinguish
between the target classes. In Tian et al. (2020), for
instance, the authors assess the classifier model’s abil-
ity to distinguish between any two classes by comput-
ing the confusion score. This score is based on mea-
suring the Euclidean distance between the neuron activation probability vectors corresponding to the two classes.
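As a minimal sketch (our own illustration rather than the exact procedure of Tian et al. (2020)), the distance underlying such a score could be computed as follows, assuming that the neuron activation probability vectors of the two classes have already been profiled:

import numpy as np

def activation_distance(act_prob_k1, act_prob_k2):
    # act_prob_k1, act_prob_k2: 1-D arrays in which entry i holds the
    # estimated probability that neuron i is activated for inputs of the
    # first (respectively, second) class. How these vectors are profiled
    # is described in Tian et al. (2020); here they are assumed to be given.
    act_prob_k1 = np.asarray(act_prob_k1, dtype=float)
    act_prob_k2 = np.asarray(act_prob_k2, dtype=float)
    # A smaller distance means the two classes excite the network in a
    # similar way, i.e., the class pair is easier to confuse.
    return np.linalg.norm(act_prob_k1 - act_prob_k2)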
Such analyses are usually performed by evaluating the trained model against a set of independent (test) images. However, such evaluations against a set of test images are not sufficient to capture the classifier's ability to distinguish between the classes (since the completeness of the test data remains a major challenge). As an additional assessment, the critical
misclassifications corresponding to the DNN model
can be identified by determining the set of possible
misclassifications to which the model is highly vul-
nerable. In Agarwal et al. (2021), it has been argued
that the classifier’s vulnerability to a particular mis-
classification (let us say, from the true class k_1 into an incorrect class k_2, or vice versa) can be assessed in terms of the similarity between the dominant visual characteristics of the corresponding two classes (i.e., k_1 and k_2). For instance, the dominant visual characteristics in the traffic signs are shape and color (Gao et al., 2006). By evaluating the overlap in terms of (a) the shape of the traffic sign board, and (b) the color combination of the border and the background, the similarity between any two traffic sign classes can be analysed a priori (Agarwal et al., 2021). The obtained measure of similarity between the classes k_1 and k_2 is recognized as a measure of the classifier's vulnerability to misclassifying an input image belonging to the class k_1 into the class k_2, or vice versa. Higher similarity between the two classes is considered to induce higher vulnerability to the corresponding misclassification.
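As an illustration only (the attribute values below are our assumptions, not the exact encoding of Agarwal et al. (2021)), such an a priori similarity check between two traffic sign classes can be sketched as follows:

# Dominant visual characteristics per class: board shape and the colors of
# the border and the background. The entries below are illustrative assumptions.
SIGN_ATTRIBUTES = {
    "general_caution": {"shape": "triangle", "border": "red",  "background": "white"},
    "slippery_road":   {"shape": "triangle", "border": "red",  "background": "white"},
    "speed_limit_50":  {"shape": "circle",   "border": "red",  "background": "white"},
    "ahead_only":      {"shape": "circle",   "border": "blue", "background": "blue"},
}

def class_similarity(k1, k2, attributes=SIGN_ATTRIBUTES):
    # Fraction of dominant visual characteristics shared by classes k1 and k2;
    # a higher value is read as higher vulnerability to confusing the pair.
    a, b = attributes[k1], attributes[k2]
    shared = sum(a[key] == b[key] for key in ("shape", "border", "background"))
    return shared / 3.0

print(class_similarity("general_caution", "slippery_road"))  # 1.0 (full overlap)
print(class_similarity("general_caution", "ahead_only"))     # 0.0 (no overlap)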
We illustrate it further with an example. In this re-
gard, we trained a DNN to classify the different traf-
fic signs from the German Traffic Sign Recognition
Benchmark (GTSRB) dataset (Stallkamp et al., 2012).
Mathematically, we represent the DNN (classifier) model as f : X ∈ R^(48×48×3) → ŷ ∈ {1, 2, ..., 43}, where ŷ is the predicted class for the input image X.
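For concreteness, a minimal tf.keras sketch with this input/output signature is shown below; the layers are placeholders and do not reproduce the architecture we actually trained, which is specified in Appendix A:

from tensorflow import keras

# Placeholder network mapping a 48x48x3 RGB image to one of the 43 GTSRB
# classes (softmax indices 0-42); only the input/output signature matches f.
model = keras.Sequential([
    keras.layers.Input(shape=(48, 48, 3)),
    keras.layers.Conv2D(32, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(43, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])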
The hyperparameters and the training details related
to it are provided in Appendix A. From the set of
10000 test images, we choose the 2207 images that belong to the danger sign type. We determine the percentage of these images that the DNN model f misclassifies into: (a) another danger sign, (b) a speed limit or a prohibitory sign, (c) a derestriction sign, and (d) a mandatory sign. The results are graphically presented in Figure 1.
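A sketch of how such per-category percentages can be obtained from the model's predictions is given below; the grouping of GTSRB class IDs into sign types is a placeholder assumption and should be checked against the dataset's class list:

import numpy as np

# Placeholder grouping of GTSRB class IDs (0-42) into sign types; these sets
# are illustrative assumptions, not the exact grouping used in our experiments.
SIGN_TYPES = {
    "danger":                     {11} | set(range(18, 32)),
    "speed_limit_or_prohibitory": set(range(0, 6)) | {7, 8, 9, 10, 15, 16},
    "derestriction":              {6, 32, 41, 42},
    "mandatory":                  set(range(33, 41)),
}

def misclassification_percentages(model, images, labels):
    # labels holds the true class IDs; only predictions that differ from the
    # true class are counted, grouped by the sign type of the predicted class.
    preds = np.argmax(model.predict(images), axis=1)
    total = len(labels)
    return {
        sign_type: 100.0 * sum(1 for p, y in zip(preds, labels)
                               if p != y and p in class_ids) / total
        for sign_type, class_ids in SIGN_TYPES.items()
    }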
We first consider the results plotted for the GTSRB original test images (i.e., the test images without any perturbations deliberately introduced by us). We observe a higher rate of misclassi-
fication of a danger sign into another danger sign, as
compared to the other misclassifications. All the traf-
fic signs that belong to the danger sign type have high
similarity due to their two dominant visual character-
istics being the same. This can perhaps be a potential