Authors:
Aderonke Busayo Sakpere and Anne V. D. M. Kayem
Affiliation:
University of Cape Town, South Africa
Keyword(s):
Data Anonymity, Streaming Data, Crime Reporting, Privacy Enhancing Model, k-Anonymity, Information Loss.
Related Ontology Subjects/Areas/Topics:
Information and Systems Security; Information Assurance; Information Hiding; Privacy Enhancing Technologies
Abstract:
Mobile crime reporting systems have emerged as an effective and efficient approach to crime data collection in developing countries. The collection of this data has raised the need to analyse or mine the data to deduce patterns that are helpful in addressing crime. Since data analytics expertise is limited in developing nations, outsourcing the data to a third-party service provider is a cost-effective management strategy. However, crime data is inherently privacy sensitive and must be protected from "honest-but-curious" service providers. In order to speed up real-time analysis of the data, streaming data can be used instead of static data. Streaming data anonymity schemes based on k-anonymity offer fast privacy preservation and query processing but rely on buffering schemes that incur high information loss rates on intermittent data streams. In this paper, we propose a scheme for adjusting the size of the buffer based on data arrival rates and use k-anonymity to enforce data privacy. Furthermore, in order to handle buffered records that are unanonymizable, we use a heuristic that works by either delaying the unanonymized record(s) to the next buffering cycle or incorporating the record(s) into a cluster of anonymized records with similar privacy constraints. The advantage of this approach to streaming-data anonymization is two-fold. First, we ensure privacy of the data through k-anonymization, and second, we ensure minimal information loss from the unanonymized records, thereby offering the opportunity for high query result accuracy on the anonymized data. Results from our prototype implementation demonstrate that our proposed scheme enhances privacy for data analytics. With varied data privacy requirement levels, we incur an average information loss in delay of 1.95% compared to other solutions that average a loss of 12.7%.
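The buffering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sizing rule in buffer_size, the record layout, and the names buffer_size, anonymize_buffer, key and released_clusters are all assumptions made for illustration. The abstract states only that the buffer size follows the data arrival rate and that unanonymizable records are either delayed to the next cycle or merged into an existing anonymized cluster.

```python
from collections import defaultdict


def buffer_size(arrival_rate, window_seconds, k, max_size=1000):
    """Hypothetical sizing rule (an assumption, not from the paper):
    hold roughly the records expected in one window, but never fewer than k."""
    expected = int(arrival_rate * window_seconds)
    return max(k, min(expected, max_size))


def anonymize_buffer(records, k, key, released_clusters):
    """Process one buffering cycle.

    records: list of dicts, each with a 'payload' field (assumed layout).
    key: function mapping a record to its generalized quasi-identifier.
    released_clusters: set of quasi-identifiers already published with
        at least k records in an earlier cycle.
    Returns (published, delayed).
    """
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)

    published, delayed = [], []
    for qi, members in groups.items():
        if len(members) >= k:
            # The group satisfies k-anonymity on its own.
            released_clusters.add(qi)
            published.extend((qi, m["payload"]) for m in members)
        elif qi in released_clusters:
            # Merge into a cluster that was already anonymized.
            published.extend((qi, m["payload"]) for m in members)
        else:
            # Not anonymizable yet: carry over to the next cycle.
            delayed.extend(members)
    return published, delayed


# Usage: two cycles with k = 2, generalizing each record to its area only.
clusters = set()
by_area = lambda r: (r["area"],)
cycle1 = [
    {"area": "North", "payload": "theft"},
    {"area": "North", "payload": "assault"},
    {"area": "South", "payload": "burglary"},
]
out1, carry = anonymize_buffer(cycle1, 2, by_area, clusters)
cycle2 = carry + [
    {"area": "North", "payload": "fraud"},
    {"area": "South", "payload": "theft"},
]
out2, carry2 = anonymize_buffer(cycle2, 2, by_area, clusters)
```

In this sketch the single "South" record is held back in the first cycle and published in the second once another "South" record arrives, while the lone "North" record in the second cycle joins the cluster released earlier.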