REFERENCES
Abdal, R., Qin, Y., and Wonka, P. (2019). Image2StyleGAN: How to embed images into the StyleGAN latent space? In Proceedings of the IEEE International Conference on Computer Vision, pages 4432–4441.
Adolphs, L., Daneshmand, H., Lucchi, A., and Hofmann, T. (2018). Local saddle point optimization: A curvature exploitation approach. arXiv preprint arXiv:1805.05751.
Alain, G., Roux, N. L., and Manzagol, P.-A. (2019). Negative eigenvalues of the Hessian in deep neural networks. arXiv preprint arXiv:1902.02366.
Arjovsky, M., Chintala, S., and Bottou, L. (2017). Wasserstein GAN. arXiv preprint arXiv:1701.07875.
Berard, H., Gidel, G., Almahairi, A., Vincent, P., and Lacoste-Julien, S. (2019). A closer look at the optimization landscapes of generative adversarial networks. arXiv preprint arXiv:1906.04848.
Chatzimichailidis, A., Keuper, J., Pfreundt, F.-J., and Gauger, N. R. (2019). GradVis: Visualization and second order analysis of optimization surfaces during the training of deep neural networks. In 2019 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC), pages 66–74. IEEE.
Chaudhari, P., Choromanska, A., Soatto, S., LeCun, Y., Baldassi, C., Borgs, C., Chayes, J., Sagun, L., and Zecchina, R. (2019). Entropy-SGD: Biasing gradient descent into wide valleys. Journal of Statistical Mechanics: Theory and Experiment, 2019(12):124018.
Dinh, L., Pascanu, R., Bengio, S., and Bengio, Y. (2017). Sharp minima can generalize for deep nets. CoRR, abs/1703.04933.
Draxler, F., Veschgini, K., Salmhofer, M., and Hamprecht, F. A. (2018). Essentially no barriers in neural network energy landscape. arXiv preprint arXiv:1803.00885.
Durall, R., Keuper, M., and Keuper, J. (2020). Watch your up-convolution: CNN based generative deep neural networks are failing to reproduce spectral distributions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7890–7899.
Durall, R., Pfreundt, F.-J., Köthe, U., and Keuper, J. (2019). Object segmentation using pixel-wise adversarial loss. In German Conference on Pattern Recognition, pages 303–316. Springer.
Fiez, T., Chasnov, B., and Ratliff, L. J. (2019). Convergence of learning dynamics in Stackelberg games. arXiv preprint arXiv:1906.01217.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014a). Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680.
Goodfellow, I. J., Vinyals, O., and Saxe, A. M. (2014b). Qualitatively characterizing neural network optimization problems. arXiv preprint arXiv:1412.6544.
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. C. (2017). Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems, pages 5767–5777.
Hochreiter, S. and Schmidhuber, J. (1997). Flat minima. Neural Computation, 9(1):1–42.
Iizuka, S., Simo-Serra, E., and Ishikawa, H. (2017). Globally and locally consistent image completion. ACM Transactions on Graphics (ToG), 36(4):107.
Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., and Wilson, A. G. (2018). Averaging weights leads to wider optima and better generalization. arXiv preprint arXiv:1803.05407.
Jastrzebski, S., Kenton, Z., Ballas, N., Fischer, A., Bengio, Y., and Storkey, A. (2018). On the relation between the sharpest directions of DNN loss and the SGD step length. arXiv preprint arXiv:1807.05031.
Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., and Aila, T. (2020). Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8110–8119.
Keskar, N. S., Mudigere, D., Nocedal, J., Smelyanskiy, M., and Tang, P. T. P. (2016). On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836.
Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Lanczos, C. (1950). An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. United States Government Press Office, Los Angeles, CA.
Mescheder, L., Nowozin, S., and Geiger, A. (2017). The numerics of GANs. In Advances in Neural Information Processing Systems, pages 1825–1835.
Nagarajan, V. and Kolter, J. Z. (2017). Gradient descent GAN optimization is locally stable. In Advances in Neural Information Processing Systems, pages 5585–5595.
Radford, A., Metz, L., and Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
Sagun, L., Bottou, L., and LeCun, Y. (2016). Eigenvalues of the Hessian in deep learning: Singularity and beyond. arXiv preprint arXiv:1611.07476.
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., and Chen, X. (2016). Improved techniques for training GANs. In Advances in Neural Information Processing Systems, pages 2234–2242.
Xue, Y., Xu, T., Zhang, H., Long, L. R., and Huang, X. (2018). SegAN: Adversarial network with multi-scale L1 loss for medical image segmentation. Neuroinformatics, 16(3-4):383–392.
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., and Huang, T. S. (2019). Free-form image inpainting with gated convolution. In Proceedings of the IEEE International Conference on Computer Vision, pages 4471–4480.