REFERENCES
Carlier, A., Danelljan, M., Alahi, A., and Timofte, R. (2020). DeepSVG: A hierarchical generative network for vector graphics animation. Advances in Neural Information Processing Systems, 33:16351–16361.
Deng, Y., Tang, F., Dong, W., Ma, C., Pan, X., Wang, L., and Xu, C. (2022). StyTr²: Image style transfer with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11326–11336.
Dumoulin, V., Shlens, J., and Kudlur, M. (2016). A learned representation for artistic style. arXiv preprint arXiv:1610.07629.
Efimova, V., Jarsky, I., Bizyaev, I., and Filchenkov, A. (2022). Conditional vector graphics generation for music cover images. arXiv preprint arXiv:2205.07301.
Frans, K., Soros, L. B., and Witkowski, O. (2021). CLIPDraw: Exploring text-to-drawing synthesis through language-image encoders. arXiv preprint arXiv:2106.14843.
Gatys, L. A., Ecker, A. S., and Bethge, M. (2015). A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576.
Jing, Y., Yang, Y., Feng, Z., Ye, J., Yu, Y., and Song, M. (2019). Neural style transfer: A review. IEEE Transactions on Visualization and Computer Graphics, 26(11):3365–3385.
Johnson, J., Alahi, A., and Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, pages 694–711. Springer.
Kettunen, M., Härkönen, E., and Lehtinen, J. (2019). E-LPIPS: Robust perceptual image similarity via random transformation ensembles. arXiv preprint arXiv:1906.03973.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6):84–90.
Kwon, G. and Ye, J. C. (2022). CLIPstyler: Image style transfer with a single text condition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18062–18071.
Li, C. and Wand, M. (2016). Combining Markov random fields and convolutional neural networks for image synthesis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2479–2486.
Li, T.-M., Lukáč, M., Gharbi, M., and Ragan-Kelley, J. (2020). Differentiable vector graphics rasterization for editing and learning. ACM Transactions on Graphics (TOG), 39(6):1–15.
Li, Y., Wang, N., Liu, J., and Hou, X. (2017). Demystifying neural style transfer. arXiv preprint arXiv:1701.01036.
Ma, X., Zhou, Y., Xu, X., Sun, B., Filev, V., Orlov, N., Fu, Y., and Shi, H. (2022). Towards layer-wise image vectorization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16314–16323.
Park, D. Y. and Lee, K. H. (2019). Arbitrary style transfer with style-attentional networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5880–5888.
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al. (2021). Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR.
Reddy, P., Gharbi, M., Lukáč, M., and Mitra, N. J. (2021). Im2Vec: Synthesizing vector graphics without vector supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7342–7351.
Schaldenbrand, P., Liu, Z., and Oh, J. (2022). StyleCLIPDraw: Coupling content and style in text-to-drawing translation. arXiv preprint arXiv:2202.12362.
Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Ulyanov, D., Lebedev, V., Vedaldi, A., and Lempitsky, V. (2016). Texture networks: Feed-forward synthesis of textures and stylized images. arXiv preprint arXiv:1603.03417.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
Wang, Y. and Lian, Z. (2021). DeepVecFont: Synthesizing high-quality vector fonts via dual-modality learning. ACM Transactions on Graphics (TOG), 40(6):1–15.
Yoo, J., Uh, Y., Chun, S., Kang, B., and Ha, J.-W. (2019). Photorealistic style transfer via wavelet transforms. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9036–9045.
Zhang, R., Isola, P., Efros, A. A., Shechtman, E., and Wang, O. (2018). The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 586–595.
Zhang, Y., Tang, F., Dong, W., Huang, H., Ma, C., Lee, T.-Y., and Xu, C. (2022). Domain enhanced arbitrary image style transfer via contrastive learning. In ACM SIGGRAPH 2022 Conference Proceedings, pages 1–8.