Balancing Act: Navigating the Privacy-Utility Spectrum in Principal Component Analysis

Saloni Kwatra, Anna Monreale, Francesca Naretto

2024

Abstract

Federated learning (FL) has attracted extensive research since it was proposed. It allows distributed clients to learn collaboratively without sharing their raw data with a central aggregator (if one is present) or with other clients in a peer-to-peer architecture. However, each participating client shares model information learned from its data with the other clients in the FL process or with the central aggregator. This sharing makes the approach vulnerable to various attacks, including data reconstruction attacks. Our research focuses on Principal Component Analysis (PCA), as it is a widely used dimensionality reduction technique. To perform PCA in a federated setting, distributed clients share local eigenvectors computed from their respective data with the aggregator, which then combines them and returns global eigenvectors. Previous studies on attacks against PCA have demonstrated that revealing eigenvectors can lead to membership inference and, when coupled with knowledge of the data distribution, to data reconstruction attacks. Consequently, our objective in this work is to increase the privacy of eigenvectors while preserving their utility. To obtain protected eigenvectors, we use k-anonymity and generative networks. In our experiments, we conduct a complete privacy and utility analysis of the original and protected eigenvectors. For the utility analysis, we apply HIERARCHICAL CLUSTERING, a RANDOM FOREST regressor, and a RANDOM FOREST classifier to the protected and original eigenvectors. Applying HIERARCHICAL CLUSTERING to the original and protected datasets and eigenvectors shows that the height at which clusters are merged declines from 250 for the original CALIFORNIA-HOUSING data to 150 for its synthetic version. For the k-anonymous version of the CALIFORNIA-HOUSING data, the height lies between 150 and 250. To evaluate the privacy risks of the federated PCA system, we act as an attacker and conduct a data reconstruction attack.
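
The federated PCA exchange described in the abstract (clients send local eigenvectors, the aggregator combines them into global eigenvectors) can be illustrated with a minimal Python sketch. The abstract does not state how the aggregator combines the local eigenvectors, so the rule used here (stack the local eigenvectors scaled by the square root of their eigenvalues, then take the leading left singular vectors) is an assumption for illustration only. The function names `local_pca` and `aggregate` and the synthetic client data are likewise hypothetical.

```python
import numpy as np

def local_pca(X, k):
    """Client side: top-k eigenvalues and eigenvectors of the local covariance."""
    Xc = X - X.mean(axis=0)                  # centre the local data
    cov = Xc.T @ Xc / (len(X) - 1)           # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)         # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]       # indices of the k largest
    return vals[order], vecs[:, order]

def aggregate(local_results, k):
    """Aggregator side (ASSUMED rule, not taken from the paper):
    stack eigenvectors scaled by sqrt(eigenvalue) and run an SVD."""
    stacked = np.hstack([
        vecs * np.sqrt(np.clip(vals, 0, None))   # scale each column
        for vals, vecs in local_results
    ])
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :k]                          # global eigenvectors

# Hypothetical data: three clients, five features with decreasing variance.
rng = np.random.default_rng(0)
scales = np.diag([5.0, 3.0, 1.0, 0.5, 0.1])
clients = [rng.normal(size=(200, 5)) @ scales for _ in range(3)]

# Only eigenvalues and eigenvectors leave each client, never the raw rows.
global_vecs = aggregate([local_pca(X, 2) for X in clients], k=2)
```

Even in this sketch, the shared eigenvectors carry the dominant directions of each client's data, which is the information the attacks mentioned in the abstract exploit.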



Paper Citation


in Harvard Style

Kwatra S., Monreale A. and Naretto F. (2024). Balancing Act: Navigating the Privacy-Utility Spectrum in Principal Component Analysis. In Proceedings of the 21st International Conference on Security and Cryptography - Volume 1: SECRYPT; ISBN 978-989-758-709-2, SciTePress, pages 850-857. DOI: 10.5220/0012855000003767


in Bibtex Style

@conference{secrypt24,
author={Saloni Kwatra and Anna Monreale and Francesca Naretto},
title={Balancing Act: Navigating the Privacy-Utility Spectrum in Principal Component Analysis},
booktitle={Proceedings of the 21st International Conference on Security and Cryptography - Volume 1: SECRYPT},
year={2024},
pages={850-857},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012855000003767},
isbn={978-989-758-709-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 21st International Conference on Security and Cryptography - Volume 1: SECRYPT
TI - Balancing Act: Navigating the Privacy-Utility Spectrum in Principal Component Analysis
SN - 978-989-758-709-2
AU - Kwatra S.
AU - Monreale A.
AU - Naretto F.
PY - 2024
SP - 850
EP - 857
DO - 10.5220/0012855000003767
PB - SciTePress