Anonylitics: From a Small Data to a Big Data Anonymization System for Analytical Projects

Alexandra Pomares-Quimbaya, Alejandro Sierra-Múnera, Jaime Mendoza-Mendoza, Julián Malaver-Moreno, Hernán Carvajal, Victor Moncayo

2019

Abstract

When a company requires analytical capabilities over data that might include sensitive information, it is important to use a solution that protects the sensitive portions while preserving the data's usefulness. An analysis of existing anonymization approaches found that some of them only permit disclosing aggregated information about large groups, or require knowing in advance the type of analysis to be performed, which is not viable in Big Data projects; others have low scalability, which makes them infeasible for large data sets. Another group of works is presented only theoretically, without any evidence of evaluation or testing in real environments. To fill this gap, this paper presents Anonylitics, an implementation of the k-anonymity principle for small and Big Data settings, intended for contexts where small or large data sets must be disclosed for applying supervised or unsupervised techniques. Anonylitics improves on available implementations of k-anonymity by using a hybrid approach during the creation of the anonymized blocks, maintaining the data types of the original attributes, and guaranteeing scalability when used with large data sets. Considering the diverse infrastructure and data volumes managed by current companies, Anonylitics was implemented in two versions: a centralized version, for companies that have small data sets, or large data sets but good vertical infrastructure capabilities; and a Big Data version, for companies with large data sets and horizontal infrastructure capabilities. Evaluation on different data sets with diverse protection requirements demonstrates that our solution maintains the utility of the data, guarantees its privacy and has good time-complexity performance.


Paper Citation


in Harvard Style

Pomares-Quimbaya A., Sierra-Múnera A., Mendoza-Mendoza J., Malaver-Moreno J., Carvajal H. and Moncayo V. (2019). Anonylitics: From a Small Data to a Big Data Anonymization System for Analytical Projects. In Proceedings of the 21st International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-372-8, pages 61-71. DOI: 10.5220/0007685200610071


in Bibtex Style

@conference{iceis19,
author={Alexandra Pomares-Quimbaya and Alejandro Sierra-Múnera and Jaime Mendoza-Mendoza and Julián Malaver-Moreno and Hernán Carvajal and Victor Moncayo},
title={Anonylitics: From a Small Data to a Big Data Anonymization System for Analytical Projects},
booktitle={Proceedings of the 21st International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2019},
pages={61--71},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007685200610071},
isbn={978-989-758-372-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 21st International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Anonylitics: From a Small Data to a Big Data Anonymization System for Analytical Projects
SN - 978-989-758-372-8
AU - Pomares-Quimbaya A.
AU - Sierra-Múnera A.
AU - Mendoza-Mendoza J.
AU - Malaver-Moreno J.
AU - Carvajal H.
AU - Moncayo V.
PY - 2019
SP - 61
EP - 71
DO - 10.5220/0007685200610071