
ACKNOWLEDGEMENTS
This research was partly funded by the Albert and Anneliese Konanz Foundation, the German Research Foundation under grant INST874/9-1, and the Federal Ministry of Education and Research Germany in the project M²Aind-DeepLearning (13FH8I08IA).