Chen, T., Kornblith, S., Swersky, K., Norouzi, M., and Hinton, G. E. (2020). Big self-supervised models are strong semi-supervised learners. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. F., and Lin, H., editors, Advances in Neural Information Processing Systems, volume 33, pages 22243–22255. Curran Associates, Inc.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE.
Goodfellow, I. J., Shlens, J., and Szegedy, C. (2015). Explaining and harnessing adversarial examples. In Bengio, Y. and LeCun, Y., editors, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778.
Hendrycks, D. and Gimpel, K. (2016). Gaussian error linear units (GELUs). arXiv preprint arXiv:1606.08415.
Hendrycks, D. and Gimpel, K. (2017). A baseline for detecting misclassified and out-of-distribution examples in neural networks. In International Conference on Learning Representations.
Hendrycks, D., Mazeika, M., and Dietterich, T. (2019a). Deep anomaly detection with outlier exposure. In International Conference on Learning Representations.
Hendrycks, D., Mazeika, M., Kadavath, S., and Song, D. (2019b). Using self-supervised learning can improve model robustness and uncertainty. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc.
Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the knowledge in a neural network. In NIPS Deep Learning and Representation Learning Workshop.
Hsu, Y.-C., Shen, Y., Jin, H., and Kira, Z. (2020). Generalized ODIN: Detecting out-of-distribution image without learning from out-of-distribution data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10951–10960.
Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian, Y., Isola, P., Maschinot, A., Liu, C., and Krishnan, D. (2020). Supervised contrastive learning. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. F., and Lin, H., editors, Advances in Neural Information Processing Systems, volume 33, pages 18661–18673. Curran Associates, Inc.
Kingma, D. P. and Ba, J. (2015). Adam: A method for stochastic optimization. In Bengio, Y. and LeCun, Y., editors, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
Krizhevsky, A. et al. (2009). Learning multiple layers of features from tiny images. Technical report, University of Toronto.
Lee, K., Lee, H., Lee, K., and Shin, J. (2018a). Training confidence-calibrated classifiers for detecting out-of-distribution samples. In International Conference on Learning Representations.
Lee, K., Lee, K., Lee, H., and Shin, J. (2018b). A simple unified framework for detecting out-of-distribution samples and adversarial attacks. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pages 7167–7177.
Liang, S., Li, Y., and Srikant, R. (2018). Enhancing the reliability of out-of-distribution image detection in neural networks. In International Conference on Learning Representations.
Loshchilov, I. and Hutter, F. (2017). SGDR: Stochastic gradient descent with warm restarts. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net.
Loshchilov, I. and Hutter, F. (2019). Decoupled weight
decay regularization. In International Conference on
Learning Representations.
Maas, A. L., Hannun, A. Y., and Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, volume 30, page 3. Citeseer.
Müller, R., Kornblith, S., and Hinton, G. E. (2019). When does label smoothing help? In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc.
Nair, V. and Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In ICML.
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., and Ng, A. Y. (2011). Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011.
Oberdiek, P., Rottmann, M., and Fink, G. A. (2020). Detection and retrieval of out-of-distribution objects in semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.
Pacheco, A. G. C., Sastry, C. S., Trappenberg, T., Oore, S., and Krohling, R. A. (2020). On out-of-distribution detection algorithms with deep neural skin cancer classifiers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.
Phuong, M. and Lampert, C. (2019). Towards understanding knowledge distillation. In Chaudhuri, K. and Salakhutdinov, R., editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 5142–5151. PMLR.
Ramachandran, P., Zoph, B., and Le, Q. V. (2018). Searching for activation functions. arXiv preprint arXiv:1710.05941.
Sastry, C. S. and Oore, S. (2020). Detecting out-of-distribution examples with gram matrices. In International Conference on Machine Learning, pages 8491–8501. PMLR.
What Matters for Out-of-Distribution Detectors using Pre-trained CNN?