Author:
Josep Domingo-Ferrer
Affiliation:
Universitat Rovira i Virgili, Department of Computer Science and Mathematics, CYBERCAT-Center for Cybersecurity Research of Catalonia, UNESCO Chair in Data Privacy, Av. Països Catalans 26, 43007 Tarragona, Catalonia, Spain
Keyword(s):
Privacy, Big Data, Anonymization, Privacy Models, k-anonymity, Differential Privacy, Randomized Response, Post-randomization, t-closeness, Permutation, Deniability.
Related Ontology Subjects/Areas/Topics:
Data and Application Security and Privacy; Data Protection; Database Security and Privacy; Information and Systems Security; Personal Data Protection for Information Systems; Privacy; Privacy Enhancing Technologies; Security and Privacy for Big Data; Security in Information Systems
Abstract:
The big data explosion opens unprecedented analysis and inference possibilities that may even enable modeling the world and forecasting its evolution with great accuracy. The dark side of such a data bounty is that it complicates the preservation of individual privacy: a substantial part of big data is obtained from the digital track of our activity. We focus here on the privacy of the subjects on whom big data are collected. Unless anonymization approaches suitable for big data are found, two extreme positions will become increasingly common: nihilists, who claim that privacy is dead in the big data world, and fundamentalists, who want privacy even at the cost of sacrificing big data analysis. In this article we identify requirements that privacy models should satisfy to be applicable to big data. We then examine how well the two main privacy models (k-anonymity and ε-differential privacy) satisfy those requirements. Neither model is entirely satisfactory, although k-anonymity seems more amenable to big data protection. Finally, we highlight connections between these two privacy models and other privacy models that might yield synergies for tackling big data: the principles underlying all those models are deniability and permutation. Future research attempting to adapt the current privacy models to big data and/or design new models will have to adhere to those two underlying principles. As a side result, the above inter-model connections allow gauging the actual protection afforded by differential privacy when ε is not sufficiently small.
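The deniability principle invoked in the abstract is easiest to see in classical randomized response, which also satisfies ε-differential privacy with ε = ln(p/(1−p)) when each respondent reports their true bit with probability p. The sketch below is illustrative only (the function names, the choice p = 0.75, and the simulated population are assumptions, not material from the article):

```python
import math
import random

def randomized_response(true_value: bool, p: float, rng: random.Random) -> bool:
    """Report the true bit with probability p, its negation otherwise.

    Each respondent can plausibly deny any reported value, since the
    mechanism itself may have flipped it.
    """
    return true_value if rng.random() < p else not true_value

def estimate_proportion(responses: list, p: float) -> float:
    """Unbiased estimate of the true 'yes' proportion pi.

    E[observed] = p*pi + (1-p)*(1-pi), so
    pi = (observed - (1-p)) / (2p - 1).
    """
    observed = sum(responses) / len(responses)
    return (observed - (1 - p)) / (2 * p - 1)

rng = random.Random(42)
p = 0.75
# Worst-case likelihood ratio between the two true values is p/(1-p),
# hence epsilon = ln(p/(1-p)) = ln 3 here.
epsilon = math.log(p / (1 - p))

# Simulated population with a 30% true 'yes' rate (illustrative).
true_bits = [rng.random() < 0.3 for _ in range(100_000)]
reports = [randomized_response(b, p, rng) for b in true_bits]
pi_hat = estimate_proportion(reports, p)  # close to 0.30
```

Note how the trade-off the abstract discusses appears directly in the parameter: a larger p (hence larger ε) improves the estimate but weakens each individual's deniability, and as ε grows the differential privacy guarantee becomes correspondingly hollow.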