vation, one can differentiate the images generated
by the GAN from a real photo;
• The previous observation limits, or at least complicates, the separation of deepfake images in the proposed implementation;
• We found limitations in training the implementation with the FFHQ dataset, since its files and metadata are very large and we could not access them;
• Despite the limitation reported above, the pretrained weights made available for the FFHQ dataset were sufficient to generate a synthetic image using only the generator G (see the sketch after this list);
• We were also unable to add a conditional input to the generator, a step that must be overcome in future work.
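To make the fourth point concrete, here is a minimal sketch of sampling with only the generator, following the public API of the NVlabs stylegan2-ada-pytorch repository. The pickle path is a placeholder, and the snippet must run with that repository on the Python path, since the pickle references its modules:

```python
import pickle

import torch

with open('ffhq.pkl', 'rb') as f:       # placeholder path to the official pickle
    G = pickle.load(f)['G_ema'].cuda()  # exponential-moving-average generator only

z = torch.randn([1, G.z_dim]).cuda()    # random latent code
c = None                                # no class label: the FFHQ weights are unconditional
img = G(z, c)                           # NCHW float32 image in [-1, 1]
```

Because the FFHQ weights are unconditional, the label argument c stays None, which is precisely why adding a conditional input (the last point above) requires changes to the network itself.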
5 CONCLUSION
This work presents the first steps in employing the
StyleGAN2-ADA framework for the face swap task
in images. We began with a theoretical analysis of
the background of face swap techniques, then presented
the proposed methods, and finally identified the main
limitations and challenges observed in this process.
In implementing the model for synthetic face generation
in phase 1 of our pipeline, we obtained a set of realistic
facial images. The limitations found were the hardware
required for training (the authors of the StyleGAN2-ADA
architecture used eight high-performance NVIDIA GPUs
to train on the FFHQ facial dataset) and the fact that
a large part of the generated dataset contained image
artifacts. Even so, we observed that the generator
trained on this dataset was able to produce synthetic
images.
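Regarding the artifacts, one standard knob in the StyleGAN family is the truncation trick: a truncation_psi below 1 pulls samples toward the average face, trading diversity for fidelity. As an illustration rather than part of our reported pipeline, a batch-generation sketch in the style of the repository's generate.py, assuming G was loaded as in the earlier sketch:

```python
import PIL.Image
import torch

# Generate 100 samples; truncation_psi=0.7 reduces visible artifacts
# at the cost of sample diversity.
for seed in range(100):
    torch.manual_seed(seed)
    z = torch.randn([1, G.z_dim], device='cuda')
    img = G(z, None, truncation_psi=0.7, noise_mode='const')
    # Convert from [-1, 1] float NCHW to uint8 HWC and save.
    img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
    PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB').save(f'sample_{seed:04d}.png')
```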
Our experiments allowed us to identify several
challenges in developing this solution. The first is
the need for more detail in the generated synthetic
faces: this aspect can jeopardize face swap applications,
as we observed a visible loss of realism. There were
also technical challenges, such as adding a conditional
input to the generator, illustrated below.
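For reference, the general idea behind generator conditioning (in the style of Mirza and Osindero's conditional GAN) is to embed a class label and feed it to the generator alongside the latent code. A minimal, illustrative sketch, with all names and sizes hypothetical rather than taken from StyleGAN2-ADA:

```python
import torch
import torch.nn as nn

class ConditionalInput(nn.Module):
    """Embeds a class label and concatenates it with the latent code."""

    def __init__(self, z_dim=512, num_classes=10, embed_dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_classes, embed_dim)  # label -> vector

    def forward(self, z, labels):
        # The concatenated vector would feed the generator's
        # mapping/synthesis layers in place of z alone.
        return torch.cat([z, self.embed(labels)], dim=1)

cond = ConditionalInput()
z = torch.randn(4, 512)
labels = torch.randint(0, 10, (4,))
w_in = cond(z, labels)  # shape [4, 576]
```

StyleGAN2-ADA itself wires the label embedding into its mapping network, so integrating conditioning cleanly into that pipeline is the step we left for future work.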
The difficulty in accessing the dataset was a relevant
challenge in this research. In future work, we intend
to train with our own facial dataset, captured with
embedded hardware at 300x300 pixel resolution, for
comparison purposes; this dataset contains about
45,000 images. We also intend to reassess the strategy
for obtaining facial data, using a 3D facial generator
network to acquire synthetic facial images of the same
identity in all poses.
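Should we follow that plan, the captures will need resizing before training: to the best of our knowledge, the standard StyleGAN2-ADA tooling expects square, power-of-two resolutions. A minimal preprocessing sketch, with placeholder paths, that resizes our 300x300 crops to 256x256 before handing them to the repository's dataset_tool.py:

```python
from pathlib import Path

import PIL.Image

src, dst = Path('raw_faces'), Path('faces_256')  # placeholder directories
dst.mkdir(exist_ok=True)
for p in src.glob('*.png'):
    img = PIL.Image.open(p).convert('RGB')
    # Lanczos resampling from 300x300 down to a power-of-two size.
    img.resize((256, 256), PIL.Image.LANCZOS).save(dst / p.name)
```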
ACKNOWLEDGEMENTS
The authors would like to thank CAPES, CNPq and
the Federal University of Ouro Preto for supporting
this work. This study was financed in part by the
Coordenação de Aperfeiçoamento de Pessoal de Nível
Superior - Brasil (CAPES) - Finance Code 001, the
Conselho Nacional de Desenvolvimento Científico e
Tecnológico (CNPq) and the Universidade Federal de
Ouro Preto (UFOP).