Adaptive Buffer Resizing for Efficient Anonymization of Streaming Data with Minimal Information Loss

Aderonke Busayo Sakpere, Anne V. D. M. Kayem

2015

Abstract

Mobile crime reporting systems have emerged as an effective and efficient approach to crime data collection in developing countries. Collecting this data has raised the need to analyse or mine it to deduce patterns that are helpful in addressing crime. Since data analytics expertise is limited in developing nations, outsourcing the data to a third-party service provider is a cost-effective management strategy. However, crime data is inherently privacy sensitive and must be protected from "honest-but-curious" service providers. To speed up real-time analysis, the data can be processed as a stream rather than as a static data set. Streaming data anonymity schemes based on k-anonymity offer fast privacy preservation and query processing, but they rely on buffering schemes that incur high information loss rates on intermittent data streams. In this paper, we propose a scheme that adjusts the size of the buffer based on data arrival rates and uses k-anonymity to enforce data privacy. Furthermore, to handle buffered records that are unanonymizable, we use a heuristic that either delays the unanonymized record(s) to the next buffering cycle or incorporates the record(s) into a cluster of anonymized records with similar privacy constraints. The advantage of this approach to streaming-data anonymization is two-fold. First, we ensure privacy of the data through k-anonymization. Second, we ensure minimal information loss from the unanonymized records, thereby offering the opportunity for high query result accuracy on the anonymized data. Results from our prototype implementation demonstrate that our proposed scheme enhances privacy for data analytics. Across varied data privacy requirement levels, we incur an average information loss in delay of 1.95%, compared to an average of 12.7% for other solutions.
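
The two ideas in the abstract, sizing the buffer from the arrival rate and handling records that cannot form a group of k, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the rule "keep the buffer open for about k / rate seconds", the single numeric quasi-identifier, and the `reusable_range` parameter (standing in for a previously published cluster) are all assumptions made for illustration.

```python
def buffer_duration(arrival_rate, k, min_seconds=1.0, max_seconds=60.0):
    """Seconds to keep the buffer open so that about k records are expected.

    Assumed rule (not from the paper): duration = k / arrival_rate,
    clamped to [min_seconds, max_seconds].
    """
    if arrival_rate <= 0:
        return max_seconds  # no traffic observed: wait as long as allowed
    return max(min_seconds, min(max_seconds, k / arrival_rate))


def anonymize_buffer(values, k, reusable_range=None):
    """Generalize numeric values into ranges that each cover k records.

    Returns (published, delayed). `published` is a list of
    (value, (low, high)) pairs; `delayed` holds leftover records that are
    carried over to the next buffering cycle.
    """
    ordered = sorted(values)
    full = len(ordered) - len(ordered) % k  # records that fit into full groups
    published = []
    for start in range(0, full, k):
        group = ordered[start:start + k]
        span = (group[0], group[-1])  # generalized range for this group
        published.extend((v, span) for v in group)

    delayed = []
    for v in ordered[full:]:
        # Leftover record: join an existing cluster if it fits, else delay it.
        if reusable_range and reusable_range[0] <= v <= reusable_range[1]:
            published.append((v, reusable_range))
        else:
            delayed.append(v)
    return published, delayed


if __name__ == "__main__":
    ages = [25, 31, 22, 40, 38, 29, 33]
    print(buffer_duration(arrival_rate=0.5, k=3))
    print(anonymize_buffer(ages, k=3))
    print(anonymize_buffer(ages, k=3, reusable_range=(30, 40)))
```

In this sketch a slow stream lengthens the buffering window, up to a cap, so that a group of k can still form. A leftover record is published only when an existing generalized range already covers it. Otherwise it waits for the next cycle rather than being suppressed.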


Paper Citation


in Harvard Style

Sakpere, A. B. and Kayem, A. V. D. M. (2015). Adaptive Buffer Resizing for Efficient Anonymization of Streaming Data with Minimal Information Loss. In Proceedings of the 1st International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-081-9, pages 191-201. DOI: 10.5220/0005288901910201


in Bibtex Style

@conference{icissp15,
author={Aderonke Busayo Sakpere and Anne V. D. M. Kayem},
title={Adaptive Buffer Resizing for Efficient Anonymization of Streaming Data with Minimal Information Loss},
booktitle={Proceedings of the 1st International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2015},
pages={191-201},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005288901910201},
isbn={978-989-758-081-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - Adaptive Buffer Resizing for Efficient Anonymization of Streaming Data with Minimal Information Loss
SN - 978-989-758-081-9
AU - Sakpere, Aderonke Busayo
AU - Kayem, Anne V. D. M.
PY - 2015
SP - 191
EP - 201
DO - 10.5220/0005288901910201