and c1_1 = c1_2 = 2000 in Table 9: the model evolved
most of the challenging features of the metal logo
without a heavy presence of artefacts. They improve on the results in Table 6 in most respects, which shows that adding both a deep style layer and a coarse layer that maintains the content font structure improves the final result. This can be seen in Figure 3c: the third block generates a higher style loss than the other two blocks, but produces an overall worse result than the fourth block, which maintains better readability.
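These per-layer coefficients enter a Gram-matrix style loss in the spirit of (Gatys et al., 2016). The following minimal PyTorch sketch is not the authors' exact implementation; the feature dictionaries and layer names are assumptions, and only the coefficient values come from the configuration above. It illustrates how such coefficients weight each layer's contribution:

import torch
import torch.nn.functional as F

def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    # Gram matrix of a (1, C, H, W) feature map, normalised by its size.
    _, c, h, w = features.size()
    f = features.view(c, h * w)
    return (f @ f.t()) / (c * h * w)

def weighted_style_loss(gen_feats: dict, style_feats: dict,
                        coeffs: dict) -> torch.Tensor:
    # Sum of per-layer Gram-matrix losses, each scaled by its coefficient c_l.
    loss = torch.zeros(())
    for layer, c_l in coeffs.items():
        loss = loss + c_l * F.mse_loss(gram_matrix(gen_feats[layer]),
                                       gram_matrix(style_feats[layer]))
    return loss

# Coefficients for the configuration discussed above.
coeffs = {"conv1_1": 2000.0, "conv1_2": 2000.0}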
3.5 MSG-Net
MSG-Net was introduced in (Zhang and Dana, 2017). We fine-tuned it on data that we scraped from the internet: 19 corporate logos (content) and 11 heavy metal logos (style). The style loss weight was set to 10000, the content loss weight to 1, and the learning rate to 1.0.
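A rough sketch of such a fine-tuning loop is shown below. Here msgnet, content_loss and style_loss are hypothetical handles for a pretrained MSG-Net generator and VGG-based perceptual losses, not the exact API of the reference implementation; only the three weight values are taken from the text.

import torch

def finetune_msgnet(msgnet, content_loss, style_loss, content_imgs, style_imgs,
                    style_weight=10000.0, content_weight=1.0, lr=1.0):
    # Optimizer choice is an assumption; the learning rate follows the text.
    optimizer = torch.optim.Adam(msgnet.parameters(), lr=lr)
    for content_img, style_img in zip(content_imgs, style_imgs):
        optimizer.zero_grad()
        # Assumption: the generator is conditioned on the current style image.
        msgnet.setTarget(style_img)
        output = msgnet(content_img)
        loss = (content_weight * content_loss(output, content_img)
                + style_weight * style_loss(output, style_img))
        loss.backward()
        optimizer.step()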
Although MSG-Net is more advanced than plain VGG16 (it has a fully convolutional architecture, learns weights to evolve an image with the transferred style, and uses more loss functions), it performs worse than the Network of Steel at sparse style transfer: it does not transfer any font style from the heavy metal logos onto the font of the corporate logos at all. MSG-Net only evolves some small, barely noticeable elements around the glyphs.
4 CONCLUSIONS
Sparse style transfer requires an approach different from that of other neural style transfer problems, due to the large number of artefacts and the merging and distortion of style and font elements.
In this publication we introduced the Network of Steel for sparse style transfer from heavy metal band logos to corporate logos. We showed that in order to synthesize a readable logo with heavy metal style elements, only one or two coarse layers and two or three deep layers of VGG16 are enough, instead of layers from all of its blocks. Our future work includes the following challenges:
1. Train a separate network for the loss coefficients,
2. Build a large database for training Networks of Steel for different heavy metal styles and corporate logos,
3. Design accuracy metrics applicable to this problem to complement visual comparison,
4. Use different, more challenging backgrounds, such as album covers; in this paper we only used a white background for heavy metal logos, which causes many artefacts.
We showed that conv1_2 is essential to maintaining an artefact-free background, and that layers from the third block of VGG16 learn style faster than deeper layers such as conv5_3 and conv4_3. Our approach is simpler and more robust than those of (Gatys et al., 2016) and (Zhang and Dana, 2017) for sparse style transfer. The whole deep fourth block (conv4_1, conv4_2, conv4_3) with loss coefficients of 200 and two coarse layers (conv1_1 and conv1_2) with loss coefficients of 2000 produces the best tradeoff between heavy metal style and the readability of the corporate logo.
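Expressed as the per-layer coefficients of the style loss sketched earlier, this recommended configuration reads as follows (all unlisted layers have zero weight; the values are the ones reported above):

# Best-tradeoff loss coefficients reported in the text.
BEST_COEFFS = {
    "conv1_1": 2000.0,  # coarse layers: preserve background and font structure
    "conv1_2": 2000.0,
    "conv4_1": 200.0,   # deep fourth block: carries the heavy metal style
    "conv4_2": 200.0,
    "conv4_3": 200.0,
}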
REFERENCES
Atarsaikhan, G., Iwana, B. K., and Uchida, S. (2018). Contained neural style transfer for decorated logo generation. In 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), pages 317–322. IEEE.

Azadi, S., Fisher, M., Kim, V. G., Wang, Z., Shechtman, E., and Darrell, T. (2018). Multi-content GAN for few-shot font style transfer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7564–7573.

Das, A., Roy, S., Bhattacharya, U., and Parui, S. K. (2018). Document image classification with intra-domain transfer learning and stacked generalization of deep convolutional neural networks. In 2018 24th International Conference on Pattern Recognition (ICPR), pages 3180–3185. IEEE.

Gatys, L. A., Ecker, A. S., and Bethge, M. (2016). Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2414–2423.

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680.

Hayashi, H., Abe, K., and Uchida, S. (2019). GlyphGAN: Style-consistent font generation based on generative adversarial networks. arXiv preprint arXiv:1905.12502.

Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1125–1134.

Ma, L., Jia, X., Sun, Q., Schiele, B., Tuytelaars, T., and Van Gool, L. (2017). Pose guided person image generation. In Advances in Neural Information Processing Systems, pages 406–416.

Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.