processing steps and retrain their shadow models on preprocessed training data as well. Intuitively, this would weaken the defense while incurring the same utility cost in the target models. Additionally, as the technique with the most potential for defending against PIAs on tabular data, the generation of artificial data could be explored further: one could adapt the synthesis algorithm such that statistical properties are arbitrarily modified in the generated data set. A similar goal is pursued by many bias prevention approaches in the area of fair ML.
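To illustrate this direction, the following is a minimal sketch of such a property-modifying synthesizer, assuming a tabular data set with a binary sensitive attribute and using simple independent bootstrapping of the remaining marginals (a real synthesis algorithm would additionally preserve inter-attribute correlations); all function and parameter names are hypothetical.

```python
import numpy as np
import pandas as pd

def synthesize_property_free(df, prop_col, target_ratio=0.5, n=None, seed=0):
    """Draw an artificial data set whose attributes follow the empirical
    marginals of df, while the binary sensitive attribute prop_col is
    forced to a chosen ratio (e.g., a fixed 50/50 split)."""
    rng = np.random.default_rng(seed)
    n = n or len(df)
    values = df[prop_col].unique()
    assert len(values) == 2, "sketch assumes a binary sensitive attribute"
    synth = {}
    for col in df.columns:
        if col == prop_col:
            # Fix the statistical property the PI adversary is after.
            synth[col] = rng.choice(values, size=n,
                                    p=[target_ratio, 1 - target_ratio])
        else:
            # Bootstrap every other attribute from its own marginal only;
            # inter-attribute correlations are ignored in this toy example.
            synth[col] = rng.choice(df[col].to_numpy(), size=n, replace=True)
    return pd.DataFrame(synth)
```

A defender would then train the target model on the synthetic data set only, so that the original property manifestation never reaches the model parameters.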
Adapting the Training Process. Another method stems from the related area of fair representation learning: the model is penalized for learning biased information via a regularization term in the loss function during training, e.g., (Creager et al., 2019). As a defense strategy against PIAs, one would need to introduce a loss term which expresses the current property manifestation within the model and causes the model to hide this information as well as possible. In theory, this would be a very effective way to prevent the property from being embedded in the model parameters. Since it would be incorporated into the training process, the side effects on the utility of the target model should be low.
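One possible instantiation of such a loss term, loosely inspired by the adversarial regularization of (Nasr et al., 2018), is sketched below in PyTorch. We assume the sensitive property can be summarized as one binary label per training batch and let a small critic network stand in for the PI adversary; the architectures, the weight lam and all names are hypothetical and not part of the original works.

```python
import torch
import torch.nn as nn

class Target(nn.Module):
    """Target model that also exposes its hidden representation z."""
    def __init__(self, d_in, d_hidden, n_classes):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.head = nn.Linear(d_hidden, n_classes)

    def forward(self, x):
        z = self.body(x)
        return self.head(z), z

target = Target(d_in=20, d_hidden=64, n_classes=2)
critic = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))

opt_t = torch.optim.Adam(target.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
task_loss_fn = nn.CrossEntropyLoss()
prop_loss_fn = nn.BCEWithLogitsLoss()
lam = 0.1  # weight of the property-hiding regularizer (hypothetical value)

def training_step(x, y, prop_label):
    # prop_label: float tensor of shape [1], 1.0 iff the property holds
    # for the batch's underlying data set.
    # 1) Train the critic to infer the property from the representation.
    _, z = target(x)
    opt_c.zero_grad()
    prop_loss_fn(critic(z.detach()).mean(0), prop_label).backward()
    opt_c.step()

    # 2) Train the target: minimize the task loss while maximizing the
    #    critic's loss, i.e. hide the property manifestation.
    logits, z = target(x)
    loss = task_loss_fn(logits, y) - lam * prop_loss_fn(critic(z).mean(0),
                                                        prop_label)
    opt_t.zero_grad()
    loss.backward()
    opt_t.step()
    return loss.item()
```

Note that this sketch only removes property information from the hidden representation; a complete defense would have to argue that the remaining parameters leak no property information either.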
Post-Training Methods. (Liu et al., 2022) experiment with knowledge distillation (KD) as a defense against privacy attacks such as membership inference. The idea is to decrease the number of neurons in an ANN in order to lower its memory capacity. Unfortunately, the authors do not consider PIAs; it would be interesting to see the impact of KD on the success rate of such attacks.
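As a starting point for such an experiment, a standard KD setup could look like the following PyTorch sketch, in which a large trained teacher (the original target model) is compressed into a student with fewer neurons; the architectures, temperature T and mixing weight alpha are hypothetical and not taken from (Liu et al., 2022).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# The released model would be the small student rather than the teacher.
teacher = nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 2))
student = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T, alpha = 4.0, 0.7  # softmax temperature and loss mixing weight

def distill_step(x, y):
    with torch.no_grad():
        soft_targets = F.softmax(teacher(x) / T, dim=1)
    student_logits = student(x)
    # Soft loss: match the teacher's softened output distribution.
    soft_loss = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                         soft_targets, reduction="batchmean") * T * T
    # Hard loss: keep fitting the original labels.
    hard_loss = F.cross_entropy(student_logits, y)
    loss = alpha * soft_loss + (1 - alpha) * hard_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Measuring the PIA success rate on the student versus the teacher would then quantify the defensive effect of the reduced capacity.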
9 CONCLUSION
In this paper, we performed the first extensive analysis of different defense strategies against white-box
property inference attacks. This analysis includes a
series of thorough experiments on property unlearn-
ing, a novel approach which we have developed as a
dedicated PIA defense mechanism. Our experiments
show the strengths of property unlearning when de-
fending against a dedicated adversary instance and
also highlight its limits, in particular its lack of generalization. We elaborated on the reasons for this limitation and concluded with the conjecture that statistical properties of training data are deep-seated in the trained parameters of ML models. This allows PI
adversaries to focus on different parts of the param-
eters when inferring such properties, but also opens
up possibilities for much simpler attacks, as we have
shown via t-SNE model parameter visualizations.
Apart from property unlearning as a post-training defense, we have also tested different training data
preprocessing methods (see full paper version (Stock
et al., 2022)). Although most of them were not di-
rectly targeted at the sensitive property of the training
data, some methods have shown promising results. In
particular, we believe that generating a property-free,
artificial data set based on the distribution of an orig-
inal training data set could be a candidate for a PIA
defense with a very good privacy-utility tradeoff.
ACKNOWLEDGEMENTS
We wish to thank Anshuman Suri for valuable discus-
sions and we are grateful to the anonymous reviewers
of previous versions of this work for their feedback.
REFERENCES
Ateniese, G., Mancini, L. V., Spognardi, A., Villani, A.,
Vitali, D., and Felici, G. (2015). Hacking smart ma-
chines with smarter ones: How to extract meaningful
data from machine learning classifiers. IJSN.
Creager, E., Madras, D., Jacobsen, J.-H., Weis, M., Swer-
sky, K., Pitassi, T., and Zemel, R. (2019). Flexibly fair
representation learning by disentanglement. In ICML.
Dwork, C., McSherry, F., Nissim, K., and Smith, A. (2006).
Calibrating noise to sensitivity in private data analysis.
In TCC.
Fredrikson, M., Jha, S., and Ristenpart, T. (2015). Model
inversion attacks that exploit confidence information
and basic countermeasures. In CCS.
Ganju, K., Wang, Q., Yang, W., Gunter, C. A., and Borisov,
N. (2018). Property inference attacks on fully con-
nected neural networks using permutation invariant
representations. In CCS.
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998).
Gradient-based learning applied to document recogni-
tion. Proceedings of the IEEE.
Liu, Y., Wen, R., He, X., Salem, A., Zhang, Z., Backes,
M., De Cristofaro, E., Fritz, M., and Zhang, Y. (2022).
ML-Doctor: Holistic risk assessment of inference at-
tacks against machine learning models. In USENIX
Security.
Mahloujifar, S., Ghosh, E., and Chase, M. (2022). Property
inference from poisoning. In S&P.
Melis, L., Song, C., De Cristofaro, E., and Shmatikov, V.
(2019). Exploiting unintended feature leakage in col-
laborative learning. In S&P.
Nasr, M., Shokri, R., and Houmansadr, A. (2018). Machine
Learning with Membership Privacy using Adversarial
Regularization. In CCS.
Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik,
Z. B., and Swami, A. (2017). Practical black-box at-
tacks against machine learning. In ASIACCS.