Authors: Joshua Stock¹; Jens Wettlaufer²; Daniel Demmler¹ and Hannes Federrath¹
Affiliations: ¹ Security in Distributed Systems, Universität Hamburg, Germany; ² Institute of Electrical and Electronics Engineers (IEEE), U.S.A.
Keyword(s): Machine Learning, Privacy Attacks, Property Inference, Defense Mechanisms, Adversarial Training.
Abstract:
This work investigates and evaluates defense strategies against property inference attacks (PIAs), a privacy attack against machine learning models. While a considerable amount of research on defense mechanisms has been published for other privacy attacks such as membership inference, this is the first work focusing on defending against PIAs. One of the mitigation strategies we test in this paper is a novel proposal called property unlearning. Extensive experiments show that while this technique is very effective at defending against specific adversaries, it does not generalize, i.e., it cannot protect against a whole class of PIAs. To investigate the reasons behind this limitation, we present the results of experiments with the explainable AI tool LIME and the visualization technique t-SNE. These show how ubiquitous statistical properties of training data are within the parameters of a trained machine learning model. Hence, we develop the conjecture that post-training techniques like property unlearning might not suffice to provide the desired generic protection against PIAs. We conclude with a discussion of different defense approaches, a summary of the lessons learned, and directions for future work.
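To make the setting concrete, below is a minimal sketch of what a post-training property unlearning step against one specific adversary could look like. This is not the paper's exact algorithm: it assumes the white-box adversary is available as a differentiable meta-classifier `meta_clf` that predicts the sensitive property from the target model's flattened parameters, and all names, labels, and the optimization schedule are illustrative assumptions.

```python
# Hedged sketch (assumed setup, not the authors' published code):
# perturb the target model's parameters via gradient ascent on the
# adversary's loss, so this one meta-classifier can no longer infer
# the true property value from the weights.
import torch
import torch.nn.functional as F

def property_unlearning(target, meta_clf, true_prop, steps=50, lr=1e-3):
    """`target`: trained model to defend; `meta_clf`: differentiable
    adversary mapping a flat parameter vector to property logits;
    `true_prop`: the property label to be hidden (all hypothetical)."""
    # Freeze the adversary; only the target's parameters are updated.
    for p in meta_clf.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(target.parameters(), lr=lr)
    label = torch.tensor([true_prop])
    for _ in range(steps):
        opt.zero_grad()
        flat = torch.cat([p.view(-1) for p in target.parameters()])
        logits = meta_clf(flat)
        # Negated cross-entropy: minimizing this loss *maximizes* the
        # adversary's error on the true property label.
        loss = -F.cross_entropy(logits.unsqueeze(0), label)
        loss.backward()
        opt.step()
    # A practical defense would also constrain task accuracy, e.g. by
    # adding the original training loss on held-out data to `loss`.
    return target
```

Note how this formulation is tied to one concrete `meta_clf`; a different adversary trained on other shadow models may still succeed, which matches the abstract's observation that the defense does not generalize.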
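Similarly, the following rough illustration shows the kind of parameter-space analysis the abstract alludes to: train several shadow models on datasets that differ in one statistical property, then project their flattened weight vectors with t-SNE. The synthetic data, the architecture, and the property (a group ratio) are assumptions made for the sake of a runnable example, not the paper's experimental setup.

```python
# Hedged sketch: if the 2-D t-SNE embedding of model weights clusters
# by the training-data property, that property is visibly encoded in
# the parameters.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

def make_dataset(prop_ratio, n=1000):
    """Synthetic binary task; `prop_ratio` (hypothetical sensitive
    property) is the fraction of samples drawn from group A."""
    group = rng.random(n) < prop_ratio            # property: group share
    X = rng.normal(loc=group[:, None] * 0.5, scale=1.0, size=(n, 10))
    y = (X.sum(axis=1) + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

weights, labels = [], []
for ratio in (0.2, 0.8):                          # two property values
    for _ in range(15):                           # shadow models per value
        X, y = make_dataset(ratio)
        clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300,
                            random_state=int(rng.integers(1_000_000)))
        clf.fit(X, y)
        flat = np.concatenate([w.ravel()
                               for w in clf.coefs_ + clf.intercepts_])
        weights.append(flat)
        labels.append(ratio)

emb = TSNE(n_components=2, perplexity=10,
           random_state=0).fit_transform(np.array(weights))
# Inspect `emb`, e.g. scatter-plot colored by `labels`: separated
# clusters indicate the property leaks into the trained parameters.
```

If the embedding separates cleanly by property value, the property is pervasively encoded in the weights, which is the kind of evidence behind the conjecture that post-training defenses alone might not suffice.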