
and Gashler, 2015; Wang et al., 2020a). Recently, Ainsworth et al. (2023) showed that even deep recognition models such as ResNet-50 can be aligned by deterministically matching weights. We continue along this path and further explore how a more symmetric activation function can improve merging, for both recognition and generative models.
7 CONCLUSION
We have introduced a new class of activation functions called Conic Linear Units. Our contribution allows neural networks to possess infinite-order group symmetry beyond channel permutations, which was previously unattainable. This novel design addresses the apparent deficiency by incorporating soft alignment through optimal transport. Moreover, it outperforms baseline results in terms of image generation quality.
ACKNOWLEDGEMENT
This work was funded in part by the French government under management of Agence Nationale de la Recherche as part of the “Investissements d’avenir” program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute). This work was granted access to the HPC resources of IDRIS under the allocation 2022-AD011013178 made by GENCI, and supported by cloud TPU from Google’s TPU Research Cloud
(TRC). We thank Irène Waldspurger, Jalal Fadili, Gabriel Peyré and Grégoire Szymanski for insightful suggestions on the draft of the paper.
REFERENCES
Ainsworth, S., Hayase, J., and Srinivasa, S. (2023). Git
re-basin: Merging models modulo permutation sym-
metries. In The Eleventh International Conference on
Learning Representations.
Ashmore, S. and Gashler, M. (2015). A method for find-
ing similarity between multi-layer perceptrons by for-
ward bipartite alignment. In 2015 International Joint
Conference on Neural Networks (IJCNN), pages 1–7.
IEEE.
Bińkowski, M., Sutherland, D. J., Arbel, M., and Gretton, A. (2018). Demystifying MMD GANs. In International Conference on Learning Representations.
Bruna, J. and Mallat, S. (2013). Invariant scattering convo-
lution networks. IEEE transactions on pattern analy-
sis and machine intelligence, 35(8):1872–1886.
Choi, Y., Choi, M., Kim, M., Ha, J.-W., Kim, S., and Choo,
J. (2018). Stargan: Unified generative adversarial net-
works for multi-domain image-to-image translation.
In Proceedings of the IEEE conference on computer
vision and pattern recognition, pages 8789–8797.
Garipov, T., Izmailov, P., Podoprikhin, D., Vetrov, D. P., and
Wilson, A. G. (2018). Loss surfaces, mode connectiv-
ity, and fast ensembling of dnns. Advances in neural
information processing systems, 31.
Karras, T., Laine, S., and Aila, T. (2019). A style-based
generator architecture for generative adversarial net-
works. In Proceedings of the IEEE/CVF conference
on computer vision and pattern recognition, pages
4401–4410.
Krizhevsky, A. and Hinton, G. (2009). Learning multiple layers of features from tiny images. Technical report, University of Toronto.
LeCun, Y., Boser, B., Denker, J., Henderson, D., Howard,
R., Hubbard, W., and Jackel, L. (1989). Handwrit-
ten digit recognition with a back-propagation network.
Advances in neural information processing systems, 2.
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., and
Ommer, B. (2022). High-resolution image synthesis
with latent diffusion models. In Proceedings of the
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition, pages 10684–10695.
Ruiz, N., Li, Y., Jampani, V., Pritch, Y., Rubinstein, M., and
Aberman, K. (2022). Dreambooth: Fine tuning text-
to-image diffusion models for subject-driven genera-
tion.
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V.,
Radford, A., Chen, X., and Chen, X. (2016). Im-
proved techniques for training gans. In Lee, D.,
Sugiyama, M., Luxburg, U., Guyon, I., and Garnett,
R., editors, Advances in Neural Information Process-
ing Systems, volume 29. Curran Associates, Inc.
Shi, W., Caballero, J., Huszar, F., Totz, J., Aitken, A. P.,
Bishop, R., Rueckert, D., and Wang, Z. (2016). Real-
time single image and video super-resolution using
an efficient sub-pixel convolutional neural network.
In 2016 IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), pages 1874–1883. IEEE
Computer Society.
Singh, S. P. and Jaggi, M. (2020). Model fusion via optimal
transport. Advances in Neural Information Processing
Systems, 33:22045–22055.
Wang, H., Yurochkin, M., Sun, Y., Papailiopoulos, D.,
and Khazaeni, Y. (2020a). Federated learning with
matched averaging. In International Conference on
Learning Representations.
Wang, J., Chen, Y., Chakraborty, R., and Yu, S. X. (2020b).
Orthogonal convolutional neural networks. In Pro-
ceedings of the IEEE/CVF conference on computer vi-
sion and pattern recognition, pages 11505–11515.
Wang, Z., Bovik, A., Sheikh, H., and Simoncelli, E. (2004).
Image quality assessment: from error visibility to
structural similarity. IEEE Transactions on Image
Processing, 13(4):600–612.
Weiler, M. and Cesa, G. (2019). General E(2)-equivariant steerable CNNs. Advances in Neural Information Processing Systems, 32.