Authors:
Thomas Duboudin¹; Emmanuel Dellandréa¹; Corentin Abgrall²; Gilles Hénaff² and Liming Chen¹
Affiliations:
¹ Univ. Lyon, École Centrale de Lyon, CNRS, INSA Lyon, Univ. Claude Bernard Lyon 1, Univ. Louis Lumière Lyon 2, LIRIS, UMR5205, 69134 Ecully, France
² Thales LAS France SAS, 78990 Élancourt, France
Keyword(s):
Domain Generalization, Out-of-Domain Generalization, Test-Time Adaptation, Shortcut Learning, PACS, Office-Home.
Abstract:
Deep neural networks often fail to generalize outside of their training distribution, particularly when only a single data domain is available during training. While test-time adaptation has yielded encouraging results in this setting, we argue that to reach further improvements, these approaches should be combined with training procedure modifications aimed at learning a more diverse set of patterns. Indeed, test-time adaptation methods usually have to rely on a limited representation because of the shortcut learning phenomenon: only a subset of the available predictive patterns is learned with standard training. In this paper, we first show that the combined use of existing training-time strategies and test-time batch normalization, a simple adaptation method, does not always improve upon test-time adaptation alone on the PACS benchmark. Furthermore, experiments on Office-Home show that very few training-time methods improve upon standard training, with or without test-time batch normalization. Therefore, we propose a novel approach that mitigates the shortcut learning behavior by having an additional classification branch learn less predictive and generalizable patterns. Our experiments show that our method improves upon the state-of-the-art results on both benchmarks and benefits test-time batch normalization the most.
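The core idea behind test-time batch normalization, as referenced in the abstract, is to replace the running statistics accumulated during training with statistics computed from the test batch itself, which re-centers activations shifted by a domain gap. Below is a minimal NumPy sketch of this mechanism only; it is not the authors' implementation, and the synthetic shifted batch is purely illustrative.

```python
import numpy as np

def batchnorm(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Affine batch normalization of activations x (batch, features)."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Running statistics estimated on the (source) training domain.
train_mean, train_var = 0.0, 1.0

# A test batch drawn from a shifted distribution, simulating a domain gap.
rng = np.random.default_rng(0)
x_test = rng.normal(loc=3.0, scale=2.0, size=(64, 8))

# Standard inference: stored training statistics leave the shift intact.
y_standard = batchnorm(x_test, train_mean, train_var)

# Test-time batch normalization: recompute statistics from the test batch,
# so the normalized activations match what downstream layers expect.
y_adapted = batchnorm(x_test, x_test.mean(axis=0), x_test.var(axis=0))

print(abs(y_standard.mean()))  # large: the domain shift survives
print(abs(y_adapted.mean()))   # near zero: activations are re-centered
```

In a real network this substitution is applied at every batch-normalization layer; the abstract's argument is that such adaptation can only exploit the patterns the training procedure actually learned, which motivates the paper's training-time modification.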