qualitatively and quantitatively. Despite being one of
the most widely used neural network structures, GAN
still faces several difficulties. For instance, the
training process of GAN is highly unstable, and maintaining the balance between the generator and the discriminator is difficult, which leads to the problem of non-convergence. In addition, the conflict between the generator and the discriminator can also cause mode collapse, which significantly reduces the diversity of the generated images. WGAN provides a solution to this issue, but it still performs poorly on high-resolution datasets.
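To make this mitigation concrete, the following is a minimal sketch of a single WGAN critic update with weight clipping in the spirit of Arjovsky et al. (2017); the network sizes, learning rate, and clipping threshold are illustrative assumptions rather than a recommended configuration.

```python
# Minimal WGAN critic step with weight clipping (illustrative sizes).
import torch
import torch.nn as nn

critic = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
generator = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784))
opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-5)

real = torch.rand(32, 784)              # stand-in for a batch of real images
fake = generator(torch.randn(32, 64)).detach()

# The critic maximizes E[D(real)] - E[D(fake)], so we minimize its negative.
loss_c = -(critic(real).mean() - critic(fake).mean())
opt_c.zero_grad()
loss_c.backward()
opt_c.step()

# Weight clipping is WGAN's (crude) way of enforcing the Lipschitz constraint.
for p in critic.parameters():
    p.data.clamp_(-0.01, 0.01)
```

Because the critic outputs an unbounded score rather than a probability, its loss correlates with sample quality, which is what makes training more stable than under the original minimax objective.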
Despite the numerous challenges that remain to be overcome, the future of GAN is still promising. BigGAN, which appeared in recent years, has made great breakthroughs in high-quality image generation compared with early GAN, CGAN, and DCGAN. In addition, GAN has been extensively applied in many fields. For example, NVIDIA uses GAN to convert rough doodles into highly realistic landscapes and scenes (Park, 2019), and Duarte et al. developed Wav2Pix to generate realistic images of speakers' faces from raw speech (Duarte, 2019). The potential of GAN is still far from being fully realized.
4 CONCLUSIONS
GAN and its variants are among the most popular and promising generative models, applying game-theoretic concepts to generative problems. This study provides a detailed introduction to the history and basic concepts of GAN. The article then reviews the basic principles of GAN and three of its variants: CGAN, DCGAN, and BigGAN. Based on these principles, the article discusses the structural properties, advantages, and disadvantages of each model. After that, the article analyzes and compares the four models using qualitative and quantitative methods on the MNIST and CIFAR-10 datasets.
According to the experimental results, the performance of the four models, in descending order, is BigGAN, DCGAN, CGAN, and GAN, with BigGAN performing significantly better than the other three models in both the qualitative and the quantitative experiments. It is worth noting that CGAN produces class-targeted results despite its relatively poor overall performance, as sketched below. In future work, the limitations of GAN, such as training instability, will be taken as the research objective for the next stage, and the research will focus on providing feasible solutions to these problems. In addition, the latest GAN variants have achieved many breakthroughs in both the forms of data they handle and in model performance. The potential of GAN is far from being fully explored.
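For concreteness, the following is a minimal sketch of how a CGAN-style generator is conditioned on a class label to produce such targeted samples (Mirza, 2014); the layer sizes, embedding dimension, and MNIST-like output shape are illustrative assumptions rather than the exact configuration used in the experiments above.

```python
# Minimal CGAN-style conditional generator (illustrative sizes).
import torch
import torch.nn as nn

NUM_CLASSES, NOISE_DIM, IMG_DIM = 10, 100, 28 * 28

class ConditionalGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.label_emb = nn.Embedding(NUM_CLASSES, NUM_CLASSES)
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + NUM_CLASSES, 256),
            nn.ReLU(),
            nn.Linear(256, IMG_DIM),
            nn.Tanh(),
        )

    def forward(self, z, labels):
        # Concatenating the label embedding with the noise vector is what
        # lets the generator produce samples of a requested class.
        return self.net(torch.cat([z, self.label_emb(labels)], dim=1))

G = ConditionalGenerator()
z = torch.randn(16, NOISE_DIM)
target = torch.full((16,), 7, dtype=torch.long)   # request digit "7"
fake_digits = G(z, target).view(16, 1, 28, 28)
```

The discriminator receives the same label alongside the image, so the adversarial game is played per class; this is why CGAN can generate a chosen digit even though its overall sample quality lags behind DCGAN and BigGAN.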
REFERENCES
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B.,
Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2014).
Generative adversarial nets. Advances in neural
information processing systems, 27.
Durugkar, I., Gemp, I., & Mahadevan, S. (2016).
Generative multi-adversarial networks. arXiv preprint
arXiv:1611.01673.
Arjovsky, M., Chintala, S., & Bottou, L. (2017, July).
Wasserstein generative adversarial networks. In
International conference on machine learning (pp. 214-
223). PMLR.
Mirza, M., & Osindero, S. (2014). Conditional generative
adversarial nets. arXiv preprint arXiv:1411.1784.
Larsen, A. B. L., Sønderby, S. K., Larochelle, H., &
Winther, O. (2016, June). Autoencoding beyond pixels
using a learned similarity metric. In International
conference on machine learning (pp. 1558-1566).
PMLR.
Radford, A., Metz, L., & Chintala, S. (2015). Unsupervised
representation learning with deep convolutional
generative adversarial networks. arXiv preprint
arXiv:1511.06434.
Brock, A., Donahue, J., & Simonyan, K. (2018). Large
scale GAN training for high fidelity natural image
synthesis. arXiv preprint arXiv:1809.11096.
Cheng, K., Tahir, R., Eric, L. K., & Li, M. (2020). An
analysis of generative adversarial networks and variants
for image synthesis on MNIST dataset. Multimedia
Tools and Applications, 79, 13725-13752.
Yinka-Banjo, C., & Ugot, O. A. (2020). A review of
generative adversarial networks and its application in
cybersecurity. Artificial Intelligence Review, 53, 1721-
1736.
Park, T., Liu, M. Y., Wang, T. C., & Zhu, J. Y. (2019).
Semantic image synthesis with spatially-adaptive
normalization. In Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition
(pp. 2337-2346).
Duarte, A. C., Roldan, F., Tubau, M., Escur, J., Pascual, S.,
Salvador, A., ... & Giro-i-Nieto, X. (2019, May).
Wav2Pix: Speech-conditioned face generation using generative adversarial networks. In ICASSP 2019 (pp. 8633-8637). IEEE.