Authors:
Wisam Abbasi, Paolo Mori and Andrea Saracino
Affiliation:
Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa, Italy
Keyword(s):
Data Privacy, Data Utility, Explainable AI, Privacy-Preserving Data Analysis, Trustworthy AI.
Abstract:
In this paper, we present a novel privacy-preserving, machine-learning-based data analysis model for tabular datasets, which defines a general trade-off optimization criterion over data privacy, model explainability, and data utility, aiming to find the optimal compromise among them. Our approach regulates the privacy parameter of the privacy-preserving mechanism applied to the analysis algorithms and explainability techniques: it explores all candidate configurations of the privacy parameter and selects the one with the maximum achievable privacy gain and explainability similarity while minimizing the loss in data utility. To validate the methodology, we conducted experiments with multiple classifiers on a binary classification problem over the Adult dataset, a well-known tabular dataset with sensitive attributes, using (ε,δ)-differential privacy as the privacy mechanism together with multiple model explanation methods. The results demonstrate the effectiveness of our approach in selecting an optimal configuration that achieves the dual objective of safeguarding data privacy and providing model explanations of comparable quality to those generated from real data. Furthermore, the proposed method preserved the quality of the analyzed data, leading to accurate predictions.
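The selection procedure described above can be sketched as a grid search over the privacy parameter ε. This is a minimal illustration, not the paper's implementation: the scoring functions, the normalization of privacy gain, the equal weights, and the candidate measurements are all hypothetical placeholders for the metrics the authors actually compute.

```python
# Hedged sketch: sweep candidate epsilon values and pick the configuration
# maximizing a combined score of privacy gain, data utility, and similarity
# between explanations on private vs. real data. All numbers are illustrative.

def privacy_gain(eps, eps_max=10.0):
    # Smaller epsilon means stronger privacy; normalize to [0, 1].
    return 1.0 - eps / eps_max

def trade_off_score(eps, utility, explanation_similarity, weights=(1.0, 1.0, 1.0)):
    # Weighted sum of the three objectives (hypothetical criterion).
    w_p, w_u, w_e = weights
    return (w_p * privacy_gain(eps)
            + w_u * utility
            + w_e * explanation_similarity)

# Hypothetical per-epsilon measurements: (model accuracy, explanation similarity)
candidates = {
    0.5: (0.74, 0.62),
    1.0: (0.79, 0.71),
    2.0: (0.82, 0.80),
    5.0: (0.84, 0.88),
}

best_eps = max(candidates, key=lambda e: trade_off_score(e, *candidates[e]))
print(best_eps)  # the epsilon with the best overall trade-off
```

In practice, the utility term would come from evaluating the differentially private classifier, and the similarity term from comparing feature attributions (e.g., from an explanation method) computed on private versus real data.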