Additional Results
Symmetry Breaking. Figure 10 shows the parameter
update deviations from the constraints on gradients
induced by translation symmetry, along with the
associated performance after applying the symmetry
breaking regularization.
We see trends similar to those for the Leaky ReLU and
batch normalization optimizees. Specifically, L2O
breaks the constraints by a large margin early in the
training run.
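
As a rough illustration of the quantity being measured, the following sketch assumes the standard form of the translation symmetry constraint: if the loss is unchanged when a constant is added to every component of a parameter group (as for the biases feeding a softmax), then the gradient components of that group sum to zero. The deviation of an update is then taken to be the absolute value of the sum of its components. The function names, the toy softmax problem, and the coordinate-wise nonlinear update are hypothetical stand-ins chosen for illustration; they are not the L2O model or the exact metric used in Figure 10.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D vector of logits.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def cross_entropy_grad_wrt_bias(bias, features, label):
    # Logits are features + bias; the cross-entropy gradient with respect
    # to the bias is softmax(logits) - one_hot(label).
    p = softmax(features + bias)
    p[label] -= 1.0
    return p

def translation_deviation(update):
    # Assumed deviation measure: |<update, 1>|. It is zero when the update
    # respects the sum-to-zero constraint implied by translation symmetry.
    return abs(float(np.sum(update)))

rng = np.random.default_rng(0)
bias = rng.normal(size=5)
features = rng.normal(size=5)
grad = cross_entropy_grad_wrt_bias(bias, features, label=2)

# A plain gradient step is a scalar multiple of the gradient, so it
# inherits the constraint up to floating-point error.
sgd_update = -0.1 * grad

# Hypothetical coordinate-wise nonlinear update (illustrative only):
# rescaling each coordinate independently generally breaks the constraint.
nonlinear_update = -0.1 * np.sign(grad) * np.sqrt(np.abs(grad))

sgd_dev = translation_deviation(sgd_update)
nonlinear_dev = translation_deviation(nonlinear_update)
print(f"SGD deviation: {sgd_dev:.2e}")
print(f"Nonlinear update deviation: {nonlinear_dev:.2e}")
```

Under these assumptions, any update that is a scalar multiple of the gradient satisfies the constraint, whereas an update rule that transforms each coordinate independently has no such guarantee, which is the kind of deviation the figure tracks over training.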
