
to the public instead of the victim model. The key point of the proposed method is to force the dummy model to output a high confidence score only for a limited range of synthetic images and a low confidence score for real images. Owing to this property, the proposed method maintains recognition accuracy. We experimentally confirmed that the proposed method reduces the success rate of MIA to less than 30% while maintaining a recognition accuracy of more than 95%.
In future work, we will examine the relationship between the characteristics of the victim model and the hyper-parameter α, which controls the combination weights for the victim model and the dummy model, so that the best choice of α can be found easily. In doing so, we will focus on more practical (i.e., more accurate) victim models. Extending the proposed method to use a DDPM instead of a GAN to further reduce the risk of MIA is also an important direction for future work.
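To make the roles of the dummy model and α concrete, the sketch below shows one plausible form of the published score: a convex combination of the two models' softmax confidences. The function name, the use of softmax outputs, and the default value of α are illustrative assumptions rather than the exact formulation of the proposed method.

    import torch
    import torch.nn.functional as F

    def defended_confidence(victim_model, dummy_model, image, alpha=0.5):
        """Confidence scores exposed to the public by the defended system.

        victim_model, dummy_model: classifiers mapping an image batch to logits.
        alpha: hypothetical combination weight between the two models' outputs.
        """
        with torch.no_grad():
            p_victim = F.softmax(victim_model(image), dim=1)
            p_dummy = F.softmax(dummy_model(image), dim=1)
        # The dummy model is trained to give high confidence only for its
        # narrow range of synthetic faces and low confidence for real faces,
        # so real queries remain dominated by the victim model's scores,
        # while an MIA that climbs the combined confidence is pulled toward
        # synthetic-looking images instead of the training identities.
        return alpha * p_victim + (1.0 - alpha) * p_dummy

Under this assumption, α trades off fidelity to the victim model against the misleading influence of the dummy model, which is why its best value is expected to depend on the characteristics of the victim model.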
Furthermore, we need to experimentally examine the robustness of the proposed method against MIA methods other than Khosravy's. In particular, an attacker who is aware of the proposed defense might conduct an MIA using a set of face images morphed between real and synthetic faces, generated with a sophisticated morphing method such as that of (Schardong et al., 2024). Robustness against such an attack is an interesting question that should be examined in future work.
ACKNOWLEDGEMENTS
This study was partially supported by a JST CREST Grant (JPMJCR20D3).
REFERENCES
Cao, Q., Shen, L., Xie, W., Parkhi, O. M., and Zisserman, A. (2018). VGGFace2: A dataset for recognising faces across pose and age. In Proc. 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG2018), pages 67–74.
Fredrikson, M., Jha, S., and Ristenpart, T. (2015). Model
inversion attacks that exploit confidence information
and basic countermeasures. In Proc. The 22nd ACM
SIGSAC Conference on Computer and Communica-
tions Security, pages 1322–1333.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014a). Generative adversarial networks. In Proc. 28th International Conference on Neural Information Processing Systems (NeurIPS), pages 2672–2680.
Goodfellow, I., Shlens, J., and Szegedy, C. (2014b). Ex-
plaining and harnessing adversarial examples. In
Proc. 2014 International Conference on Learning
Representations (ICLR), pages 1–11.
He, Y., Meng, G., Chen, K., Hu, X., and He, J. (2022).
Towards security threats of deep learning systems:
A survey. IEEE Trans. on Software Engineering,
48:1743–1770.
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., and Hochreiter, S. (2017). GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Proc. 31st International Conference on Neural Information Processing Systems (NeurIPS), pages 6629–6640.
Ho, J., Jain, A., and Abbeel, P. (2020). Denoising dif-
fusion probabilistic models. In Proc. 34th Interna-
tional Conference on Neural Information Processing
Systems (NeurIPS), pages 6840–6851.
Ho, J. and Salimans, T. (2021). Classifier-free diffu-
sion guidance. In Proc. 2021 NeurIPS Workshop on
Deep Generative Models and Downstream Applica-
tions, pages 1–8.
Karras, T., Aittala, M., Hellsten, J., Laine, S., Lehtinen, J., and Aila, T. (2020). Training generative adversarial networks with limited data. In Proc. 34th International Conference on Neural Information Processing Systems (NeurIPS), pages 12104–12114.
Khosravy, M., Nakamura, K., Hirose, Y., Nitta, N., and
Babaguchi, N. (2021). Model inversion attack: Anal-
ysis under gray-box scenario on deep learning based
face recognition system. KSII Trans. on Internet and
Information Systems, 15(3):1100–1118.
Khosravy, M., Nakamura, K., Hirose, Y., Nitta, N., and Babaguchi, N. (2022). Model inversion attack by integration of deep generative models: Privacy-sensitive face generation from a face recognition system. IEEE Trans. on Information Forensics and Security, 17:357–372.
Liu, R., Wang, D., Ren, Y., Wang, Z., Guo, K., Qin,
Q., and Liu, X. (2024). Unstoppable attack: Label-
only model inversion via conditional diffusion model.
IEEE Trans. on Information Forensics and Security,
19:3958–3973.
Liu, X., Xie, L., Wang, Y., Zou, J., Xiong, J., Ying, Z., and
Vasilakos, A. V. (2020). Privacy and security issues in
deep learning: A survey. IEEE Access, 9:4566–4593.
Liu, Z., Luo, P., Wang, X., and Tang, X. (2015). Deep learn-
ing face attributes in the wild. In Proc. 2015 Interna-
tional Conference on Computer Vision (ICCV), pages
3730–3738.
Salem, A., Bhattacharya, A., Backes, M., Fritz, M., and Zhang, Y. (2020). Updates-Leak: Data set inference and reconstruction attacks in online learning. In Proc. 29th USENIX Security Symposium, pages 1290–1308.
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V.,
Radford, A., and Chen, X. (2016). Improved tech-
niques for training gans. In Proc. 30th International
Conference on Neural Information Processing Sys-
tems (NeurIPS), pages 2234–2242.
Schardong, G., Novello, T., Paz, H., Medvedev, I., da Silva, V., Velho, L., and Gonçalves, N. (2024). Neural implicit morphing of faces. In Proc. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 8395–8399.