Neural Network Meta Classifier:
Improving the Reliability of Anomaly Segmentation
Jurica Runtas and Tomislav Petković
University of Zagreb, Faculty of Electrical Engineering and Computing, Unska 3, 10000 Zagreb, Croatia
Keywords:
Computer Vision, Semantic Segmentation, Anomaly Segmentation, Entropy Maximization, Meta
Classification, Open-Set Environments.
Abstract:
Deep neural networks (DNNs) are a contemporary solution for semantic segmentation and are usually trained
to operate on a predefined closed set of classes. In open-set environments, it is possible to encounter seman-
tically unknown objects or anomalies. Road driving is an example of such an environment in which, from a
safety standpoint, it is important to ensure that a DNN indicates it is operating outside of its learned semantic
domain. One possible approach to anomaly segmentation is entropy maximization, which is paired with a
logistic regression based post-processing step called meta classification, which is in turn used to improve the
reliability of detection of anomalous pixels. We propose to substitute the logistic regression meta classifier
with a more expressive lightweight fully connected neural network. We analyze advantages and drawbacks of
the proposed neural network meta classifier and demonstrate its better performance over logistic regression.
We also introduce the concept of informative out-of-distribution examples which we show to improve training
results when using entropy maximization in practice. Finally, we discuss the loss of interpretability and show
that the behavior of logistic regression and neural network is strongly correlated. The code is publicly avail-
able at https://github.com/JuricaRuntas/meta-ood.
1 INTRODUCTION
Semantic segmentation is a computer vision task in
which each pixel of an image is assigned into one of
predefined classes. An example of a real-world appli-
cation is an autonomous driving system where seman-
tic segmentation is an important component for vi-
sual perception of a driving environment (Biase et al.,
2021; Janai et al., 2020).
Deep neural networks (DNNs) are a contemporary
solution to the semantic segmentation task. DNNs
are usually trained to operate on a predefined closed
set of classes. However, this contradicts the nature of the environments in which the aforementioned autonomous driving systems are deployed.
Such systems operate in a so-called open-set environ-
ment where DNNs will encounter anomalies, i.e., ob-
jects that do not belong to any class from the prede-
fined closed set of classes used during training (Wong
et al., 2019).
From a safety standpoint, it is very important that
a DNN classifies pixels of any encountered anomaly
as anomalous and not as one of the predefined classes.
The presence of an anomaly indicates that a DNN is
operating outside of its learned semantic domain so a
corresponding action may be taken, e.g., there is an
unknown object on the road and an emergency brak-
ing procedure is initiated.
One approach to anomaly segmentation is entropy maximization (Chan et al., 2020). It is usually paired with a logistic regression based post-processing step called meta classification, which improves the reliability of the detection of anomalous pixels in the image and thus the subsequent anomaly detection.
In this paper, we explore the entropy maximization approach to anomaly segmentation, where we pro-
pose to substitute the logistic regression meta clas-
sifier with a lightweight fully connected neural net-
work. Such a network is more expressive than the lo-
gistic regression meta classifier, so we expect an im-
provement in anomaly detection performance. Then,
we provide additional analysis of the entropy maxi-
mization that shows that caution must be taken when
using it in practice in order to ensure its effectiveness.
To that end, we introduce the concept of informative
out-of-distribution examples which we show to im-
prove training results. Finally, we discuss the loss of
interpretability and show that the behavior of logistic
regression and neural network is strongly correlated,
suggesting that the loss of interpretability may not be
a significant drawback after all.
2 RELATED WORK
The task of identifying semantically anomalous re-
gions in an image is called anomaly segmentation or,
in the more general context, out-of-distribution (OoD)
detection. Regardless of a specific method used for
anomaly segmentation, the main objective is to obtain
an anomaly segmentation score map. The anomaly segmentation score map $a$ indicates the possibility of an anomaly being present at each pixel location, where a higher score indicates a more probable anomaly (Chan et al., 2022). Methods described in the literature differ in how such a map is obtained.
The methods described in the early works are
based on the observation that anomalies usually re-
sult in low confidence predictions allowing for their
detection. These methods include thresholding the
maximum softmax probability (Hendrycks and Gim-
pel, 2017), ODIN (Liang et al., 2018), uncertainty es-
timation through the usage of Bayesian methods such
as Monte-Carlo dropout (Gal and Ghahramani, 2016;
Kendall et al., 2015), ensembles (Lakshminarayanan
et al., 2017) and distance based uncertainty estima-
tion through Mahalanobis distance (Denouden et al.,
2018; Lee et al., 2018) or Radial Basis Function Net-
works (RBFNs) (Li and Kosecka, 2021; van Amers-
foort et al., 2020). These methods do not rely on
the utilization of negative datasets containing images
with anomalies so they are classified as anomaly seg-
mentation methods without outlier supervision.
However, methods such as entropy maximization (Chan et al., 2020) use entire images sampled from a negative dataset. Some methods cut and paste anomalies from images in the chosen negative dataset onto the in-distribution images (Bevandić et al., 2019; Bevandić et al., 2021; Grcić et al., 2022). The negative images are used to allow the model to learn a representation of the unknown; therefore, such methods belong to the category of anomaly segmentation methods with outlier supervision.
Finally, there are methods that use generative models for the purpose of anomaly segmentation (Biase et al., 2021; Blum et al., 2019; Grcić et al., 2021; Lis et al., 2019; Xia et al., 2020), usually through the means of reconstruction or normalizing flows, with or without outlier supervision. Current state-of-the-art anomaly segmentation methods (Ackermann et al., 2023; Rai et al., 2023; Nayal et al., 2023; Delić et al., 2024) utilize mask-based semantic segmentation (Cheng et al., 2021; Cheng et al., 2022).
3 METHODOLOGY
In this section, we describe a method for anomaly
segmentation called entropy maximization. Then, we
describe a post-processing step called meta classifi-
cation, which is used for improving the reliability of
anomaly segmentation. Finally, we describe our pro-
posed improvement to the original meta classifica-
tion approach (Chan et al., 2020). All methods de-
scribed in the following two subsections are intro-
duced and thoroughly described in (Chan et al., 2020;
Chan et al., 2022; Oberdiek et al., 2020; Rottmann
et al., 2018; Rottmann and Schubert, 2019).
3.1 Notation
Let $x \in [0, 1]^{H \times W \times 3}$ denote a normalized color image of spatial dimensions $H \times W$. Let $\mathcal{I} = \{1, 2, \ldots, H\} \times \{1, 2, \ldots, W\}$ denote the set of pixel locations. Let $\mathcal{C} = \{1, 2, \ldots, C\}$ denote the set of $|\mathcal{C}|$ predefined classes. We define a set of training data used to train a semantic segmentation neural network in a supervised manner as $\mathcal{D}_{in}^{train} = \{(x_j, m_j)\}_{j=1}^{N_{in}^{train}}$, where $N_{in}^{train}$ denotes the total number of in-distribution training samples and $m_j = (m_i)_{i \in \mathcal{I}} \in \mathcal{C}^{H \times W}$ is the corresponding ground truth segmentation mask of $x_j$. Let $F : [0, 1]^{H \times W \times 3} \to [0, 1]^{H \times W \times |\mathcal{C}|}$ be a semantic segmentation neural network that produces pixel-wise class probabilities for a given image $x$.
3.2 Anomaly Segmentation via Entropy
Maximization
Let $p_i(x) = (p_i(c|x))_{c \in \mathcal{C}} \in [0, 1]^{|\mathcal{C}|}$ denote a vector of probabilities such that $p_i(c|x)$ is the probability that the pixel at location $i \in \mathcal{I}$ of a given image $x \in \mathcal{D}_{in}$ belongs to the class $c \in \mathcal{C}$. We define $p(x) = (p_i(x))_{i \in \mathcal{I}} \in [0, 1]^{H \times W \times |\mathcal{C}|}$, the probability distribution over images in $\mathcal{D}_{in}$. When using $\mathcal{D}_{in}^{train}$ to train a semantic segmentation neural network $F$, one can interpret that the network is being trained to estimate $p(x)$, denoted by $\hat{p}(x)$. For a semantic segmentation network in the context of anomaly segmentation, it would be a desirable property if such a network could output a high prediction uncertainty for OoD pixels, which can in turn be quantified with a per-pixel entropy. For a given image $x \in [0, 1]^{H \times W \times 3}$ and
a pixel location $i \in \mathcal{I}$, the per-pixel prediction entropy is defined as
$$E_i\big(\hat{p}_i(x)\big) = -\sum_{c \in \mathcal{C}} \hat{p}_i(c|x) \log \hat{p}_i(c|x), \quad (1)$$
where $E_i\big(\hat{p}_i(x)\big)$ is maximized by the uniform (non-informative) probability distribution $\hat{p}_i(x)$, which makes it an intuitive uncertainty measure.
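As an illustration, the per-pixel entropy of Eq. (1) can be computed from a softmax output in a few lines. The following is a minimal PyTorch sketch; the function name and the numerical-stability constant are our choices, not taken from the reference implementation:

```python
import torch

def per_pixel_entropy(probs: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Eq. (1): entropy of the predicted class distribution at each pixel.

    probs: softmax output of shape (B, C, H, W), summing to 1 over dim 1.
    Returns a tensor of shape (B, H, W).
    """
    return -(probs * torch.log(probs + eps)).sum(dim=1)
```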
We define a set of OoD training samples as $\mathcal{D}_{out}^{train} = \{(x_j, m_j)\}_{j=1}^{N_{out}^{train}}$, where $N_{out}^{train}$ denotes the total number of such samples. In practice, $\mathcal{D}_{out}^{train}$ is a general-purpose dataset whose taxonomy is more diverse than the one found in the chosen domain-specific dataset $\mathcal{D}_{in}^{train}$, and it serves as a proxy for images containing anomalies.
It has been shown (Chan et al., 2020) that one can make the output of a semantic segmentation neural network $F$ have a high entropy at OoD pixel locations by employing a multi-criteria training objective defined as
$$L = (1-\lambda) \cdot \mathbb{E}_{(x,m) \sim \mathcal{D}_{in}^{train}}\Big[l_{in}\big(F(x), m\big)\Big] + \lambda \cdot \mathbb{E}_{(x,m) \sim \mathcal{D}_{out}^{train}}\Big[l_{out}\big(F(x), m\big)\Big], \quad (2)$$
where $\lambda \in [0, 1]$ is used for controlling the impact of each part of the overall objective.
When minimizing the overall objective defined by Eq. (2), the commonly used cross-entropy loss is applied to in-distribution training samples, defined as
$$l_{in}\big(F(x), m\big) = -\sum_{i \in \mathcal{I}} \sum_{c \in \mathcal{C}} \mathbb{1}_{m_i = c} \cdot \log \hat{p}_i(c|x), \quad (3)$$
where $\mathbb{1}_{m_i = c} \in \{0, 1\}$ is the indicator function, equal to one if the class index $c \in \mathcal{C}$ is, for a given pixel location $i \in \mathcal{I}$, equal to the class index $m_i$ defined by the ground truth segmentation mask $m$, and zero otherwise. For OoD training samples, a slightly modified cross-entropy loss defined as
$$l_{out}\big(F(x), m\big) = -\sum_{i \in \mathcal{I}_{out}} \sum_{c \in \mathcal{C}} \frac{1}{|\mathcal{C}|} \log \hat{p}_i(c|x) \quad (4)$$
is applied at pixel locations $i \in \mathcal{I}_{out}$ labeled as OoD in the ground truth segmentation mask $m$. It can be shown (Chan et al., 2020) that minimizing $l_{out}$ defined by Eq. (4) is equivalent to maximizing the per-pixel prediction entropy $E_i\big(\hat{p}_i(x)\big)$ defined by Eq. (1), hence the name entropy maximization. The anomaly segmentation score map $a$ can then be obtained by normalizing the per-pixel prediction entropy, i.e.,
$$a = (a_i)_{i \in \mathcal{I}} \in [0, 1]^{H \times W}, \qquad a_i = \frac{E_i\big(\hat{p}_i(x)\big)}{\log |\mathcal{C}|}. \quad (5)$$
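To make Eqs. (2)–(5) concrete, the following PyTorch sketch implements the two loss terms and the normalized entropy score map. It is an illustrative reconstruction under our own naming, not the reference implementation from the linked repository; in particular, we average rather than sum over pixels, which only rescales the objective:

```python
import torch
import torch.nn.functional as F

def entropy_max_loss(logits_in, targets_in, logits_out, ood_mask, lam=0.9):
    """Eq. (2): logits_in (B, C, H, W), targets_in (B, H, W) class indices,
    logits_out (B, C, H, W) for OoD proxy images, ood_mask (B, H, W) boolean."""
    # Eq. (3): standard cross-entropy over in-distribution pixels.
    l_in = F.cross_entropy(logits_in, targets_in)

    # Eq. (4): cross-entropy against the uniform distribution over classes,
    # applied only at OoD-labeled pixels; the mean over the class dimension
    # realizes the 1/|C| factor.
    log_probs = F.log_softmax(logits_out, dim=1)  # (B, C, H, W)
    l_out = -log_probs.mean(dim=1)[ood_mask].mean()

    return (1.0 - lam) * l_in + lam * l_out

def anomaly_score_map(logits):
    """Eq. (5): per-pixel prediction entropy normalized to [0, 1]."""
    probs = torch.softmax(logits, dim=1)
    ent = -(probs * torch.log(probs + 1e-12)).sum(dim=1)
    return ent / torch.log(torch.tensor(float(logits.shape[1])))
```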
3.3 Meta Classification
Meta classification is the task of discriminating between a false positive prediction and a true positive prediction. Training a network with the modified entropy maximization training objective increases the network's sensitivity towards predicting OoD objects and can result in a substantial number of false positive predictions (Chan et al., 2019; Chan et al., 2020). Applying meta classification to post-process the network's prediction has been shown to significantly improve the network's ability to reliably detect OoD objects. For a given image $x$, we define the set of pixel locations predicted as OoD as
$$\hat{\mathcal{I}}_{out}(x, a) = \big\{ i \in \mathcal{I} \mid a_i \geq t \big\}, \quad t \in [0, 1], \quad (6)$$
where $t$ represents a fixed threshold and $a$ is computed using Eq. (5). Based on $\hat{\mathcal{I}}_{out}(x, a)$, a set of connected components representing OoD object predictions, defined as $\hat{\mathcal{K}}(x, a) \subseteq \mathcal{P}\big(\hat{\mathcal{I}}_{out}(x, a)\big)$, is constructed. Note that $\mathcal{P}\big(\hat{\mathcal{I}}_{out}(x, a)\big)$ denotes the power set of $\hat{\mathcal{I}}_{out}(x, a)$.
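A small sketch of Eq. (6) and the construction of $\hat{\mathcal{K}}(x, a)$, assuming SciPy's connected-component labeling; the connectivity structure and the function name are our choices:

```python
import numpy as np
from scipy import ndimage

def ood_components(score_map: np.ndarray, t: float = 0.7):
    # Eq. (6): pixel locations whose anomaly score reaches the threshold t.
    ood_pixels = score_map >= t
    # Group the thresholded pixels into connected components, i.e., the
    # OoD object predictions K-hat(x, a); ndimage.label defaults to
    # 4-connectivity, which is an assumption on our part.
    labeled, num_components = ndimage.label(ood_pixels)
    return [np.argwhere(labeled == k) for k in range(1, num_components + 1)]
```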
A meta classifier is a lightweight model added on top of a semantic segmentation network $F$. After training $F$ for entropy maximization on the pixels of known OoD objects, a structured dataset of hand-crafted metrics is constructed. For every OoD object prediction $\hat{k} \in \hat{\mathcal{K}}(x, a)$, different pixel-wise uncertainty measures are derived solely from $\hat{p}(x)$, such as the normalized per-pixel prediction entropy of Eq. (1), the maximum softmax probability, etc. In addition to metrics derived from $\hat{p}(x)$, metrics based on the geometry of the OoD object prediction are also included, such as the number of pixels contained in $\hat{k}$, various ratios regarding interior and boundary pixels, the geometric center, geometric features regarding the neighborhood of $\hat{k}$, etc. (Chan et al., 2020; Rottmann et al., 2018).
After a dataset with the hand-crafted metrics is constructed, a meta classifier is trained to classify OoD object predictions into one of the following two sets,
$$\mathcal{C}_{TP}(x, a) = \big\{ \hat{k} \in \hat{\mathcal{K}}(x, a) \mid \mathrm{IoU}(\hat{k}, m) > 0 \big\} \quad \text{and} \quad \mathcal{C}_{FP}(x, a) = \big\{ \hat{k} \in \hat{\mathcal{K}}(x, a) \mid \mathrm{IoU}(\hat{k}, m) = 0 \big\}, \quad (7)$$
where $\mathcal{C}_{TP}$ represents the set of true positive OoD object predictions, $\mathcal{C}_{FP}$ the set of false positive OoD object predictions, and IoU represents the intersection over union of an OoD object prediction $\hat{k}$ with the corresponding ground truth segmentation mask $m$. Each $(x, m) \in \mathcal{D}_{out}^{meta}$ is an element of a dataset containing known OoD objects used to train a meta classifier.
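Note that the TP/FP labeling rule of Eq. (7) reduces to a simple overlap test, since $\mathrm{IoU}(\hat{k}, m) > 0$ holds exactly when the predicted component intersects at least one ground-truth OoD pixel. A minimal sketch under our own naming:

```python
import numpy as np

def is_false_positive(component_pixels: np.ndarray, gt_ood_mask: np.ndarray) -> bool:
    """Eq. (7): component_pixels is a (K, 2) array of (row, col) coordinates
    of one predicted OoD component; gt_ood_mask is an (H, W) boolean OoD mask.

    IoU > 0 iff the intersection is non-empty, so the union is not needed."""
    rows, cols = component_pixels[:, 0], component_pixels[:, 1]
    return not gt_ood_mask[rows, cols].any()
```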
During inference, the meta classifier predicts whether the OoD object predictions obtained from $F$ are false positives. Naturally, the prediction is made
without access to the ground truth segmentation mask $m$ and is based on the learned statistical and geometrical properties of the OoD object predictions obtained from the known unknowns. OoD object predictions classified as false positives are then removed, and the final prediction is obtained.
3.4 Neural Network Meta Classifier
In (Chan et al., 2020), the authors use logistic regression for the purpose of meta classifying predicted OoD objects. Their main argument for the use of logistic regression is that, since it is a linear model, it is possible to analyze the impact of each hand-crafted metric used as an input to the model with an algorithm such as Least Angle Regression (LARS) (Efron et al., 2004). However, we argue that even though it is desirable to have an interpretable model in order to analyze the relevance and the impact of its input, it is possible to achieve significantly better performance by employing a more expressive type of model such as a neural network.
Let $\hat{\mathcal{K}}$ be the set containing the OoD object predictions for every $(x, m) \in \mathcal{D}_{out}^{meta}$, defined as $\hat{\mathcal{K}} = \bigcup_{(x,m) \in \mathcal{D}_{out}^{meta}} \hat{\mathcal{K}}(x, a)$. We formally define the aforementioned hand-crafted metrics dataset as $\mu \in \mathbb{R}^{|\hat{\mathcal{K}}| \times N_m}$, where $N_m$ is the total number of hand-crafted metrics derived from each OoD object prediction.
We propose to employ a lightweight fully connected neural network instead of logistic regression as the meta classifier. Let $F_{meta} : \mu \to [0, 1]$ denote such a neural network. We can interpret $F_{meta}$ as outputting the probability that a given OoD object prediction is a false positive, based on the corresponding derived hand-crafted metrics according to Eq. (7). Let $p_F$ denote this probability. Since $F_{meta}$ is essentially a binary classifier, we can train it using the binary cross-entropy loss defined as
$$L_{meta} = -\sum_{i=1}^{N} \Big[ y_i \log p_{F_i} + (1 - y_i) \log\big(1 - p_{F_i}\big) \Big], \quad (8)$$
where $N$ represents the number of OoD object predictions included in a mini-batch and $y_i$ represents the ground truth label of a given OoD object prediction, equal to one if the prediction is a false positive and zero otherwise.
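A PyTorch sketch of $F_{meta}$ with the fully connected architecture used later in our experiments (75–75–75–1, see Table 1) and the loss of Eq. (8). The hidden-layer activation (ReLU) is our assumption, since the paper does not state it, and we fold the final sigmoid into BCEWithLogitsLoss for numerical stability; the hyperparameters anticipate Section 4.2:

```python
import torch
import torch.nn as nn

# F_meta: three fully connected layers of 75 neurons plus one output neuron,
# 17,176 parameters in total, matching Table 1.
meta_classifier = nn.Sequential(
    nn.Linear(75, 75), nn.ReLU(),  # ReLU is our assumption
    nn.Linear(75, 75), nn.ReLU(),
    nn.Linear(75, 75), nn.ReLU(),
    nn.Linear(75, 1),              # sigmoid folded into the loss below
)

# Eq. (8): binary cross-entropy; label 1 = false positive OoD prediction.
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(meta_classifier.parameters(), lr=1e-3, weight_decay=5e-3)

def training_step(metrics_batch: torch.Tensor, labels: torch.Tensor) -> float:
    """One mini-batch update. metrics_batch: (N, 75); labels: (N,) in {0, 1}."""
    optimizer.zero_grad()
    logits = meta_classifier(metrics_batch).squeeze(1)
    loss = loss_fn(logits, labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```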
4 EXPERIMENTS
In this section, we briefly describe the experimen-
tal setup and evaluate our proposed neural network
meta classifier.
4.1 Experimental Setup
For the purpose of entropy maximization, we use the DeepLabv3+ semantic segmentation model (Chen et al., 2018) with a WideResNet38 backbone (Wu et al., 2016) trained by Nvidia (Zhu et al., 2018). The model is pretrained on the Cityscapes dataset (Cordts et al., 2016) and is fine-tuned according to Eq. (2). We use the Cityscapes dataset as $\mathcal{D}_{in}^{train}$, containing 2,975 images, while for $\mathcal{D}_{out}^{train}$ we use a subset of the COCO dataset (Lin et al., 2014), which we denote as COCO-OoD. For the purpose of $\mathcal{D}_{out}^{train}$, we exclude images containing instances of classes that also appear in Cityscapes; after filtering, 46,751 images remain. The model is trained for 4 epochs on random square crops of height and width of 480 pixels; images whose height or width is smaller than 480 pixels are resized. Before each epoch, we randomly shuffle the 2,975 Cityscapes images together with 297 images randomly sampled from the 46,751 remaining COCO images. Hyperparameters are set according to the baseline (Chan et al., 2020): loss weight $\lambda = 0.9$, entropy threshold $t = 0.7$. The Adam optimizer (Kingma and Ba, 2017) is used with learning rate $\eta = 1 \times 10^{-5}$.
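The per-epoch mixing described above can be sketched as follows; the function name and structure are ours, and the actual pipeline in the repository may differ:

```python
import random

def epoch_file_list(cityscapes_files, coco_ood_files, n_coco=297, seed=None):
    """All 2,975 Cityscapes images plus 297 images sampled from the filtered
    COCO-OoD pool, shuffled together for one training epoch."""
    rng = random.Random(seed)
    mixed = list(cityscapes_files) + rng.sample(list(coco_ood_files), n_coco)
    rng.shuffle(mixed)
    return mixed
```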
4.2 Evaluation of Neural Network Meta
Classifier
We use (Chan et al., 2020) as a baseline. We substitute the logistic regression with a lightweight fully connected neural network whose architecture is shown in Table 1. The proposed meta classifier is trained on the hand-crafted metrics derived from OoD object predictions on the images in LostAndFound Test (Pinggera et al., 2016). The derived hand-crafted metrics, i.e., the corresponding OoD object predictions, are leave-one-out cross-validated according to Eq. (7). The meta classifier is trained using the Adam optimizer with learning rate $\eta = 1 \times 10^{-3}$ and weight decay $\gamma = 5 \times 10^{-3}$ for 50 epochs with a mini-batch size $N = 128$. Note that in our case the total number of hand-crafted metrics is $N_m = 75$. Also note that the logistic regression meta classifier has 76 parameters. The results are shown in Table 2 and Fig. 1. In our experiments, the improved performance is especially noticeable for OoD object predictions consisting of a very small number of pixels.
5 DISCUSSION

In this section, we introduce the notion of high and low informative OoD proxy images, and we show that the high informative proxy OoD images are the ones from which the semantic segmentation network can learn to reliably output high entropy on OoD pixels of images seen during inference. Then, we discuss the loss of interpretability, a drawback of using the proposed neural network meta classifier instead of the interpretable logistic regression meta classifier, and show that it may not be a significant drawback after all.

Figure 1: ROC and PR meta classifier curves for OoD object predictions of LostAndFound Test images. On the PR curve, random guessing is represented as a constant dashed red line whose value is equal to the ratio of the number of OoD objects and the total number of predicted OoD objects.

Table 1: Architecture of the neural network meta classifier. All layers are fully connected and a sigmoid activation is applied after the last layer. The total number of parameters is 17,176.

Layer          # of neurons   # of parameters
Input layer    75             5,700
1. layer       75             5,700
2. layer       75             5,700
Output layer   1              76

Table 2: Performance comparison of meta classifiers. Note that the given results are based on OoD object predictions obtained with entropy threshold t = 0.7 of Eq. (6).

Model Type   Logistic Regression       Neural Network
Source       Baseline    Reproduced    Ours
AUROC        0.9444      0.9342        0.9680
AUPRC        0.7185      0.6819        0.8418
5.1 On Outlier Supervision of the
Entropy Maximization
We introduce the notion of high informative and low informative proxy OoD images. What we mean by high and low informative is illustrated in Fig. 2. We have noticed empirically that high informative proxy OoD images have two important characteristics that differentiate them from low informative proxy OoD images: a spatially clear separation between objects and clear object boundaries.
Our conjecture is that low informative proxy OoD images have little to no impact on the entropy maximization training, or can even negatively impact the training procedure. On the other hand, high informative proxy OoD images are the ones from which the semantic segmentation network can learn to reliably output high entropy on OoD pixels of images seen during inference, denoted by $\mathcal{D}_{out} \setminus \mathcal{D}_{out}^{train}$, where $\setminus$ represents the set difference.
To investigate our conjecture, we perform the entropy maximization training on subsets of COCO-OoD. We consider it difficult to universally quantify the mentioned characteristics of high informative proxy OoD images; however, we notice a significant correlation between the percentage of labeled OoD pixels and the desirable properties found in high informative OoD proxy images. We use the COCO-OoD proxy to create two disjoint sets such that the first contains images from COCO-OoD that have at most 20% of pixels labeled as OoD (denoted as L-20%-OoD) and the second contains images from COCO-OoD that have at least 80% of pixels labeled as OoD (denoted as M-80%-OoD). Table 3 shows that performing the entropy maximization training using M-80%-OoD results in little to no improvement in comparison to the model trained exclusively on the in-distribution images. On the other hand, using L-20%-OoD produced even better results than the ones obtained with the usage of COCO-OoD.
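A sketch of how the two disjoint subsets can be derived from the ground truth masks; the threshold values follow the text, while the names and the mask format are our assumptions:

```python
def split_by_ood_fraction(masks, low=0.20, high=0.80):
    """Partition proxy images into L-20%-OoD and M-80%-OoD by the fraction
    of pixels labeled as OoD.

    masks: iterable of (image_id, boolean NumPy OoD mask) pairs.
    """
    l20, m80 = [], []
    for image_id, mask in masks:
        frac = mask.mean()          # fraction of OoD-labeled pixels
        if frac <= low:
            l20.append(image_id)    # at most 20% OoD pixels
        elif frac >= high:
            m80.append(image_id)    # at least 80% OoD pixels
    return l20, m80
```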
Figure 2: Examples of high informative (a) and low informative (b) proxy OoD images. The first row contains the proxy OoD images while the second row contains ground truth segmentation masks in which the white regions represent pixels labeled as OoD, for which Eq. (4) is applied.
Table 3: Results for the entropy maximization training using COCO-OoD subsets. Column DLV3+W38 contains the results obtained from the model used for fine-tuning (Zhu et al., 2018), which was trained exclusively on the in-distribution images. Other columns contain results obtained from the best model after performing the entropy maximization training numerous times with a given subset.

Metric               FPR95                                        AUPRC
Source               DLV3+W38  COCO-OoD  L-20%-OoD  M-80%-OoD    DLV3+W38  COCO-OoD  L-20%-OoD  M-80%-OoD
LostAndFound Test    0.35      0.15      0.09       0.13         0.46      0.75      0.78       0.48
Fishyscapes Static   0.19      0.17      0.12       0.31         0.25      0.64      0.73       0.25
5.2 Interpretability of Neural Network
Meta Classifier
A drawback of using a neural network as a meta classifier is the loss of interpretability. However, we attempt to further understand the performance of our proposed meta classifier. Fig. 3 shows the LARS path for the ten hand-crafted metrics most correlated with the response of logistic regression, i.e., the ones which contribute the most to classifying OoD object predictions. One can interpret LARS as a way of sorting the hand-crafted metrics by their impact on the response of logistic regression. Algorithm 1 offers a way to leverage this kind of reasoning in order to gain further insight into how the neural network meta classifier behaves in comparison to logistic regression. Note that we assume that LARS sorts the hand-crafted metrics in descending order with respect to the correlation. Fig. 4 shows the results of executing Algorithm 1 for both meta classifiers.
For the logistic regression meta classifier, after we take a subset of $\mu$ containing the 21 most correlated hand-crafted metrics according to LARS, adding the remaining hand-crafted metrics results in little to no improvement in performance. We can see that the neural network meta classifier exhibits a similar behavior, although in a more unstable manner. The obvious difference in performance can most likely be attributed to the fact that the neural network meta classifier is more expressive and better aggregates the hand-crafted metrics. We argue that the hand-crafted metrics having the most impact on the performance of the logistic regression meta classifier also do so in the case of the neural network meta classifier. Such insight could alleviate presumably the most significant drawback of using the neural network meta classifier instead of the logistic regression meta classifier: the loss of interpretability.
Figure 3: LARS path for the hand-crafted metrics at t = 0.7.
A detailed description of the hand-crafted metrics can be
found in (Chan et al., 2020).
Figure 4: Performance comparison of the logistic regression meta classifier and the neural network meta classifier when trained on subsets of the hand-crafted metrics dataset $\mu$. For each value $N_m$ on the x-axis, we train the meta classifiers on the subset of $\mu$ containing the first $N_m$ metrics having the most correlation with the response according to LARS.
Algorithm 1: Incremental meta classifier evaluation.
Input: $F_{meta}$, $\mu$, $N_m$
Output: lists of AUROC and AUPRC metrics
AUROC ← [ ];
AUPRC ← [ ];
MetricsSortedByCorrelation ← LARS($\mu$);
for $i = 1$ to $N_m$ do
    $\xi$ ← MetricsSortedByCorrelation[:$i$];
    initializeModel($F_{meta}$);
    trainModel($F_{meta}$, $\xi$);
    ($m_1$, $m_2$) ← evaluateModel($F_{meta}$, $\xi$);
    AUROC.append($m_1$);
    AUPRC.append($m_2$);
end
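For concreteness, Algorithm 1 can be realized with scikit-learn, whose lars_path returns the indices of the metrics in the order they enter the LARS active set, i.e., sorted by correlation with the response. The callables standing in for initializeModel, trainModel, and evaluateModel are placeholders, not part of the original method:

```python
from sklearn.linear_model import lars_path
from sklearn.metrics import roc_auc_score, average_precision_score

def incremental_evaluation(make_model, train, evaluate, mu, y):
    """mu: (|K-hat|, N_m) hand-crafted metrics dataset; y: binary labels
    (1 = false positive OoD prediction). Returns AUROC and AUPRC lists."""
    # Indices of the metrics in order of entry into the LARS active set.
    _, active, _ = lars_path(mu, y.astype(float))
    auroc, auprc = [], []
    for i in range(1, len(active) + 1):
        xi = mu[:, active[:i]]        # the i metrics most correlated with the response
        model = make_model()          # fresh initialization each round
        train(model, xi, y)
        scores = evaluate(model, xi)  # predicted probability of false positive
        auroc.append(roc_auc_score(y, scores))
        auprc.append(average_precision_score(y, scores))
    return auroc, auprc
```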
6 CONCLUSION
In this paper, we explored the anomaly segmenta-
tion method called entropy maximization which can
increase the network’s sensitivity towards predicting
OoD objects, but which can also result in a substantial
number of false positive predictions. Hence, the meta
classification post-processing step is applied in or-
der to improve the network’s ability to reliably detect
OoD objects. Our experimental results showed that employing the proposed neural network meta classifier yields significantly better performance than the logistic regression meta classifier.
Furthermore, we provided additional analysis of
the entropy maximization training which showed that
in order to ensure its effectiveness, caution must be
taken when choosing which images are going to be
used as proxy OoD images. Our experimental results
demonstrated that high informative proxy OoD im-
ages are the ones from which the semantic segmen-
tation network can learn to reliably output high en-
tropy on OoD pixels of images seen during inference
and are therefore more beneficial to the entropy max-
imization training in terms of how well a semantic
segmentation neural network can detect OoD objects
afterwards.
Finally, a drawback of using the neural network
meta classifier is the loss of interpretability. In our
attempt to further analyze the performance of the pro-
posed neural network meta classifier, we found that
the behavior of logistic regression and neural network
is strongly correlated, suggesting that the loss of inter-
pretability may not be a significant drawback after all.
REFERENCES
Ackermann, J., Sakaridis, C., and Yu, F. (2023). Maskomaly: Zero-shot mask anomaly segmentation.
Bevandić, P., Krešo, I., Oršić, M., and Šegvić, S. (2019). Simultaneous semantic segmentation and outlier detection in presence of domain shift. CoRR, abs/1908.01098.
Bevandić, P., Krešo, I., Oršić, M., and Šegvić, S. (2021). Dense outlier detection and open-set recognition based on training with noisy negative images. CoRR, abs/2101.09193.
Biase, G. D., Blum, H., Siegwart, R., and Cadena, C.
(2021). Pixel-wise anomaly detection in complex
driving scenes. CoRR, abs/2103.05445.
Blum, H., Sarlin, P., Nieto, J. I., Siegwart, R., and Ca-
dena, C. (2019). The fishyscapes benchmark: Mea-
suring blind spots in semantic segmentation. CoRR,
abs/1904.03215.
Chan, R., Rottmann, M., and Gottschalk, H. (2020). En-
tropy maximization and meta classification for out-
of-distribution detection in semantic segmentation.
CoRR, abs/2012.06575.
Chan, R., Rottmann, M., Hüger, F., Schlicht, P., and
Gottschalk, H. (2019). Metafusion: Controlled false-
negative reduction of minority classes in semantic seg-
mentation.
Chan, R., Uhlemeyer, S., Rottmann, M., and Gottschalk,
H. (2022). Detecting and learning the unknown in se-
mantic segmentation.
Chen, L., Zhu, Y., Papandreou, G., Schroff, F., and Adam,
H. (2018). Encoder-decoder with atrous separable
convolution for semantic image segmentation. CoRR,
abs/1802.02611.
Cheng, B., Misra, I., Schwing, A. G., Kirillov, A., and Gird-
har, R. (2022). Masked-attention mask transformer for
universal image segmentation.
Cheng, B., Schwing, A. G., and Kirillov, A. (2021). Per-
pixel classification is not all you need for semantic
segmentation.
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler,
M., Benenson, R., Franke, U., Roth, S., and Schiele,
B. (2016). The cityscapes dataset for semantic urban
scene understanding. CoRR, abs/1604.01685.
Delić, A., Grcić, M., and Šegvić, S. (2024). Outlier detection by ensembling uncertainty with negative objectness.
Denouden, T., Salay, R., Czarnecki, K., Abdelzad, V., Phan,
B., and Vernekar, S. (2018). Improving reconstruction
autoencoder out-of-distribution detection with maha-
lanobis distance. CoRR, abs/1812.02765.
Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R.
(2004). Least angle regression. The Annals of Statis-
tics, 32(2).
Gal, Y. and Ghahramani, Z. (2016). Dropout as a bayesian
approximation: Representing model uncertainty in
deep learning.
Grcić, M., Bevandić, P., and Šegvić, S. (2021). Dense anomaly detection by robust learning on synthetic negative data. ArXiv, abs/2112.12833.
Grcić, M., Bevandić, P., and Šegvić, S. (2022). DenseHybrid: Hybrid anomaly detection for dense open-set recognition.
Hendrycks, D. and Gimpel, K. (2017). A baseline for de-
tecting misclassified and out-of-distribution examples
in neural networks. In 5th International Conference
on Learning Representations, ICLR 2017, Toulon,
France, April 24-26, 2017, Conference Track Pro-
ceedings. OpenReview.net.
Janai, J., Güney, F., Behl, A., and Geiger, A. (2020).
Computer vision for autonomous vehicles: Problems,
datasets and state of the art. Found. Trends. Comput.
Graph. Vis., 12(1–3):1–308.
Kendall, A., Badrinarayanan, V., and Cipolla, R. (2015).
Bayesian segnet: Model uncertainty in deep convolu-
tional encoder-decoder architectures for scene under-
standing. CoRR, abs/1511.02680.
Kingma, D. P. and Ba, J. (2017). Adam: A method for
stochastic optimization.
Lakshminarayanan, B., Pritzel, A., and Blundell, C. (2017).
Simple and scalable predictive uncertainty estimation
using deep ensembles.
Lee, K., Lee, K., Lee, H., and Shin, J. (2018). A simple uni-
fied framework for detecting out-of-distribution sam-
ples and adversarial attacks.
Li, Y. and Kosecka, J. (2021). Uncertainty aware proposal
segmentation for unknown object detection. CoRR,
abs/2111.12866.
Liang, S., Li, Y., and Srikant, R. (2018). Enhancing
the reliability of out-of-distribution image detection
in neural networks. In 6th International Conference
on Learning Representations, ICLR 2018, Vancouver,
BC, Canada, April 30 - May 3, 2018, Conference
Track Proceedings. OpenReview.net.
Lin, T., Maire, M., Belongie, S. J., Bourdev, L. D., Girshick,
R. B., Hays, J., Perona, P., Ramanan, D., Dollár, P.,
and Zitnick, C. L. (2014). Microsoft COCO: common
objects in context. CoRR, abs/1405.0312.
Lis, K., Nakka, K. K., Fua, P., and Salzmann, M.
(2019). Detecting the unexpected via image resyn-
thesis. CoRR, abs/1904.07595.
Nayal, N., Yavuz, M., Henriques, J. F., and Güney, F.
(2023). Rba: Segmenting unknown regions rejected
by all.
Oberdiek, P., Rottmann, M., and Fink, G. A. (2020). De-
tection and retrieval of out-of-distribution objects in
semantic segmentation. CoRR, abs/2005.06831.
Pinggera, P., Ramos, S., Gehrig, S., Franke, U., Rother, C.,
and Mester, R. (2016). Lost and found: Detecting
small road hazards for self-driving vehicles. CoRR,
abs/1609.04653.
Rai, S. N., Cermelli, F., Fontanel, D., Masone, C., and Ca-
puto, B. (2023). Unmasking anomalies in road-scene
segmentation.
Rottmann, M., Colling, P., Hack, T., Hüger, F., Schlicht,
P., and Gottschalk, H. (2018). Prediction error meta
classification in semantic segmentation: Detection via
aggregated dispersion measures of softmax probabili-
ties. CoRR, abs/1811.00648.
Rottmann, M. and Schubert, M. (2019). Uncertainty mea-
sures and prediction quality rating for the semantic
segmentation of nested multi resolution street scene
images. CoRR, abs/1904.04516.
van Amersfoort, J., Smith, L., Teh, Y. W., and Gal, Y.
(2020). Simple and scalable epistemic uncertainty es-
timation using a single deep deterministic neural net-
work. CoRR, abs/2003.02037.
Wong, K., Wang, S., Ren, M., Liang, M., and Urtasun,
R. (2019). Identifying unknown instances for au-
tonomous driving. CoRR, abs/1910.11296.
Wu, Z., Shen, C., and van den Hengel, A. (2016). Wider or
deeper: Revisiting the resnet model for visual recog-
nition. CoRR, abs/1611.10080.
Xia, Y., Zhang, Y., Liu, F., Shen, W., and Yuille, A. L.
(2020). Synthesize then compare: Detecting fail-
ures and anomalies for semantic segmentation. CoRR,
abs/2003.08440.
Zhu, Y., Sapra, K., Reda, F. A., Shih, K. J., Newsam, S. D.,
Tao, A., and Catanzaro, B. (2018). Improving seman-
tic segmentation via video propagation and label re-
laxation. CoRR, abs/1812.01593.