tecting misclassified and out-of-distribution examples
in neural networks. In ICLR, Toulon, France.
Hendrycks, D., Mazeika, M., Kadavath, S., and Song,
D. (2019). Using self-supervised learning can im-
prove model robustness and uncertainty. In Wallach,
H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F.,
Fox, E., and Garnett, R., editors, NeurIPS, volume 32,
pages 15637–15648, Vancouver, CA. CAI.
Hendrycks, D., Zhao, K., Basart, S., Steinhardt, J., and
Song, D. (2020). Natural adversarial examples. ArXiv,
abs/1907.07174.
Huang, H., Li, Z., Wang, L., Chen, S., Dong, B., and
Zhou, X. (2021). Feature space singularity for out-
of-distribution detection. In Espinoza, H., McDer-
mid, J., Huang, X., Castillo-Effen, M., Chen, X. C.,
Hernández-Orallo, J., Ó hÉigeartaigh, S., and Mallah,
R., editors, Workshop on SafeAI@AAAI, volume 2808
of CEUR Workshop Proceedings. ceur-ws.org.
Kim, H. (2020). Torchattacks: A PyTorch repository for
adversarial attacks. ArXiv, abs/2010.01950.
Krizhevsky, A. (2009). Learning multiple layers of features
from tiny images. Technical report, Univ of Toronto.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012).
ImageNet classification with deep convolutional neu-
ral networks. In Pereira, F., Burges, C. J. C., Bottou,
L., and Weinberger, K. Q., editors, NIPS, volume 25,
pages 1097–1105, Lake Tahoe, NV, USA. CAI.
Kurakin, A., Goodfellow, I. J., and Bengio, S. (2017). Ad-
versarial examples in the physical world. In ICLR,
Toulon, France.
LeCun, Y., Cortes, C., and Burges, C. (2010). MNIST
handwritten digit database. ATT Labs [Online],
http://yann.lecun.com/exdb/mnist, 2.
Lee, K., Lee, H., Lee, K., and Shin, J. (2018a). Training
confidence-calibrated classifiers for detecting out-of-
distribution samples. In ICLR, Vancouver, CA.
Lee, K., Lee, K., Lee, H., and Shin, J. (2018b). A simple
unified framework for detecting out-of-distribution
samples and adversarial attacks. In Bengio, S., Wal-
lach, H., Larochelle, H., Grauman, K., Cesa-Bianchi,
N., and Garnett, R., editors, NeurIPS, volume 31, page
7167–7177, Montreal, CA. CAI.
Lehmann, D. and Ebner, M. (2021). Layer-wise activation
cluster analysis of cnns to detect out-of-distribution
samples. In Farkas, I., Masulli, P., Otte, S., and
Wermter, S., editors, Proc of the 30th Int Conf on Ar-
tificial Neural Networks ICANN 2021, Lecture Notes
in CS, pages 214–226, Berlin, Germany. Springer.
Li, X. and Li, F. (2017). Adversarial examples detection in
deep networks with convolutional filter statistics. In
ICCV, pages 5775–5783, Venice, Italy. IEEE.
Liang, S., Li, Y., and Srikant, R. (2018). Enhancing the reli-
ability of out-of-distribution image detection in neural
networks. In ICLR, Vancouver, CA.
Lin, Z., Roy, S. D., and Li, Y. (2021). MOOD: Multi-level
out-of-distribution detection. In CVPR, pages 15308–
15318. IEEE.
Ma, X., Li, B., Wang, Y., Erfani, S. M., Wijewickrema, S.,
Schoenebeck, G., Houle, M. E., Song, D., and Bailey,
J. (2018). Characterizing adversarial subspaces using
local intrinsic dimensionality. In ICLR, Vancouver,
CA.
Machado, G. R., Silva, E., and Goldschmidt, R. R. (2021).
Adversarial machine learning in image classification:
A survey toward the defender’s perspective. ACM
Comput. Surv., 55(1):1–38.
MacQueen, J. B. (1967). Some methods for classification
and analysis of multivariate observations. In Le Cam,
L. M. and Neyman, J., editors, Berkeley Symp on
Math Stat and Prob, volume 1, pages 281–297. Univ
of Calif Press.
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and
Vladu, A. (2018). Towards deep learning models re-
sistant to adversarial attacks. In ICLR, Vancouver, CA.
McInnes, L., Healy, J., and Melville, J. (2018). UMAP:
Uniform manifold approximation and projection for
dimension reduction. ArXiv, abs/1802.03426.
Meng, D. and Chen, H. (2017). MagNet: A two-pronged de-
fense against adversarial examples. In SIGSAC, page
135–147, Dallas, TX, USA. ACM.
Metzen, J. H., Genewein, T., Fischer, V., and Bischoff, B.
(2017). On detecting adversarial perturbations. In
ICLR, Toulon, France.
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., and
Ng, A. Y. (2011). Reading digits in natural images
with unsupervised feature learning. In NIPS Workshop
on Deep Learning and Unsupervised Feature Learn-
ing.
Papernot, N. and McDaniel, P. (2018). Deep k-nearest
neighbors: Towards confident, interpretable and ro-
bust deep learning. ArXiv, abs/1803.04765.
Pearson, K. (1901). LIII. On lines and planes of closest
fit to systems of points in space. London, Edinburgh
Dublin Philos Mag J Sci, 2(11):559–572.
Sastry, C. S. and Oore, S. (2020). Detecting out-of-
distribution examples with gram matrices. In ICML,
volume 119, pages 8491–8501. PMLR.
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan,
D., Goodfellow, I. J., and Fergus, R. (2014). Intrigu-
ing properties of neural networks. In Bengio, Y. and
LeCun, Y., editors, ICLR, Banff, CA.
Zeiler, M. D. and Fergus, R. (2014). Visualizing and un-
derstanding convolutional networks. In Fleet, D., Pa-
jdla, T., Schiele, B., and Tuytelaars, T., editors, ECCV,
number PART 1 in Lecture Notes in CS, pages 818–
833, Zurich, CH. Springer.
Zhang, H., Dauphin, Y. N., and Ma, T. (2019). Fixup ini-
tialization: Residual learning without normalization.
ArXiv, abs/1901.09321.
Calculating the Credibility of Test Samples at Inference by a Layer-wise Activation Cluster Analysis of Convolutional Neural Networks