according to the evaluation by standard benchmarks.
Further, the clustering algorithm MONDRIAN has
been adapted to perform well in microaggregation
applications. Two variants, MONDRIAN V and
MONDRIAN V2D, have been presented. Both achieve
lower information loss than MONDRIAN and operate
at different trade-offs between running time and data
quality. For lower-dimensional data, the MONDRIAN
technique can achieve almost the same data quality
as much more time-consuming algorithms.
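To make the splitting idea concrete, the following is a minimal Python sketch of the classical median-split strategy of MONDRIAN (LeFevre et al., 2006) applied to microaggregation, together with the SSE/SST ratio commonly used as information loss. It is an illustration under assumptions, not the MONDRIAN V or MONDRIAN V2D variants: the split rule (widest attribute range, cut at the median), the loss measure, and all function names are chosen here for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(cluster):
    """Component-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))


def mondrian_partition(points, k):
    """Recursively split points into clusters of size k .. 2k-1.

    Assumed split rule (hypothetical, for illustration): pick the
    attribute with the widest value range and cut at the median.
    Requires len(points) >= k.
    """
    if len(points) < 2 * k:
        return [points]  # a further split would leave a part smaller than k
    dims = range(len(points[0]))
    widest = max(
        dims,
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[widest])
    mid = len(ordered) // 2  # both halves contain at least k points
    return mondrian_partition(ordered[:mid], k) + mondrian_partition(ordered[mid:], k)


def information_loss(clusters):
    """SSE / SST: within-cluster squared error relative to total squared error."""
    all_points = [p for c in clusters for p in c]
    g = centroid(all_points)
    sst = sum(sq_dist(p, g) for p in all_points)
    sse = sum(sq_dist(p, centroid(c)) for c in clusters for p in c)
    return sse / sst if sst > 0 else 0.0
```

Each record would then be replaced by the centroid of its cluster. Since every cluster holds at least k records, the result is k-anonymous with respect to the aggregated attributes. Sorting at every level gives a running time of roughly O(n log² n) for this sketch, well below quadratic.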
Combining both advantages, the data quality
of ONA* and the performance of MONDRIAN V, we
have designed new classes of algorithms MONA_ρ and
MONA 2D_ρ that achieve high quality data
anonymization even for huge databases where quadratic
time would be far too expensive.
What could be the next steps in further improving
microaggregation techniques? An obvious question
is whether there are even better splitting rules
for MONDRIAN. On the other hand, is it possible
to decrease the information loss further by spending
more than quadratic time? By design, any improvement
here could be applied to the MONA_ρ approach
to get fast solutions with better quality. How well
k-anonymous microaggregation can be approximated
is still wide open. There is hope to achieve
approximation guarantees for ONA* by carefully designing
an initial clustering similar to the k-means++ algorithm
(Arthur and Vassilvitskii, 2007). We plan to
investigate this issue in more detail.
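For reference, the seeding step of k-means++ (Arthur and Vassilvitskii, 2007) can be sketched as follows. This shows only the standard D² sampling for ordinary k-means. How such a seeding would have to be adapted to respect the minimum cluster size of k-anonymous microaggregation is exactly the open issue named above and is not addressed here. The function and parameter names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, num_seeds, rng=None):
    """Choose initial centers by D^2 sampling (k-means++ seeding).

    The first center is drawn uniformly at random. Each further center
    is drawn with probability proportional to its squared distance to
    the nearest center chosen so far.
    """
    rng = rng or random.Random()
    seeds = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest seed so far
    d2 = [sq_dist(p, seeds[0]) for p in points]
    while len(seeds) < num_seeds:
        if sum(d2) == 0:
            break  # every point coincides with a seed; nothing left to pick
        nxt = rng.choices(points, weights=d2, k=1)[0]
        seeds.append(nxt)
        d2 = [min(d, sq_dist(p, nxt)) for d, p in zip(d2, points)]
    return seeds
```

In a microaggregation setting one might, as an assumption, request about n/k seeds for n records and then grow clusters around them. Whether this yields a provable approximation guarantee is not claimed.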
REFERENCES
Anwar, N. (1993). Micro-aggregation-the small aggregates
method. Technical report, Internal report. Luxem-
bourg: Eurostat.
Arthur, D. and Vassilvitskii, S. (2007). k-means++: The
advantages of careful seeding. In Proceedings of the
eighteenth annual ACM-SIAM symposium on Discrete
algorithms, pages 1027–1035. Society for Industrial
and Applied Mathematics.
Defays, D. and Nanopoulos, P. (1993). Panels of enterprises
and confidentiality: the small aggregates method. In
Proceedings of the 1992 symposium on design and
analysis of longitudinal surveys, pages 195–204.
Domingo-Ferrer, J., Martínez-Ballesté, A., Mateo-Sanz,
J. M., and Sebé, F. (2006). Efficient multivariate
data-oriented microaggregation. The VLDB Jour-
nal—The International Journal on Very Large Data
Bases, 15(4):355–369.
Domingo-Ferrer, J. and Mateo-Sanz, J. M. (2002).
Reference data sets to test and compare SDC
methods for protection of numerical microdata.
https://web.archive.org/web/20190412063606/http://neon.vb.cbs.nl/casc/CASCtestsets.htm.
Domingo-Ferrer, J. and Torra, V. (2005). Ordinal, continu-
ous and heterogeneous k-anonymity through microag-
gregation. Data Mining and Knowledge Discovery,
11(2):195–212.
Dwork, C., McSherry, F., Nissim, K., and Smith, A. (2006).
Calibrating noise to sensitivity in private data analy-
sis. In Theory of cryptography conference, pages 265–
284. Springer.
LeFevre, K., DeWitt, D. J., and Ramakrishnan, R.
(2006). Mondrian multidimensional k-anonymity. In
22nd International conference on data engineering
(ICDE’06), pages 25–25. IEEE.
Li, N., Li, T., and Venkatasubramanian, S. (2007).
t-closeness: Privacy beyond k-anonymity and l-
diversity. In 2007 IEEE 23rd International Confer-
ence on Data Engineering, pages 106–115. IEEE.
Li, N., Qardaji, W., and Su, D. (2012). On sam-
pling, anonymization, and differential privacy or, k-
anonymization meets differential privacy. In Pro-
ceedings of the 7th ACM Symposium on Information,
Computer and Communications Security, pages 32–
33. ACM.
Lichman, M. (2013). UCI machine learning repository.
http://archive.ics.uci.edu/ml.
Lloyd, S. P. (1982). Least squares quantization in PCM.
IEEE transactions on information theory, 28(2):129–
137.
Machanavajjhala, A., Kifer, D., Gehrke, J., and Venkita-
subramaniam, M. (2007). l-diversity: Privacy beyond
k-anonymity. ACM Transactions on Knowledge Dis-
covery from Data (TKDD), 1(1):3.
Oganian, A. and Domingo-Ferrer, J. (2001). On the com-
plexity of optimal microaggregation for statistical dis-
closure control. Statistical Journal of the United Na-
tions Economic Commission for Europe, 18(4):345–
353.
Rebollo-Monedero, D., Forné, J., Pallarès, E., and
Parra-Arnau, J. (2013). A modification of the Lloyd
algorithm for k-anonymous quantization. Information
Sciences, 222:185–202.
Samarati, P. (2001). Protecting respondents' identities in
microdata release. IEEE transactions on Knowledge and
Data Engineering, 13(6):1010–1027.
Soria-Comas, J., Domingo-Ferrer, J., and Mulero, R.
(2019). Efficient near-optimal variable-size microag-
gregation. In International Conference on Modeling
Decisions for Artificial Intelligence, pages 333–345.
Springer.
Sweeney, L. (2002). k-anonymity: A model for protecting
privacy. International Journal of Uncertainty, Fuzzi-
ness and Knowledge-Based Systems, 10(05):557–570.
Thaeter, F. and Reischuk, R. (2018). Improving anonymiza-
tion clustering. In Langweg, H., Meier, M., Witt,
B. C., and Reinhardt, D., editors, SICHERHEIT 2018,
pages 69–82, Bonn. Gesellschaft für Informatik e.V.
Thaeter, F. and Reischuk, R. (2020). Hardness of
k-anonymous microaggregation. Discrete Applied
Mathematics.
SECRYPT 2021 - 18th International Conference on Security and Cryptography