Utility of Anonymised Data in Decision Tree Derivation

Jack Davies, Jianhua Shao

2022

Abstract

Privacy Preserving Data Publishing (PPDP) is a practice for anonymising microdata so that it can be shared publicly. Much work has been done on developing data anonymisation methods, but relatively little on examining how useful anonymised data is in supporting data analysis. This paper evaluates the utility of k-anonymised data in decision tree derivation and examines how accurately some commonly used metrics estimate this utility. Our results suggest that, whilst classification accuracy loss is minimal in most common scenarios, using a small selection of simple metrics when calibrating a k-anonymisation could help significantly improve decision tree classification accuracy for anonymised data.
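
As a rough illustration of the k-anonymity property referred to above, the sketch below checks whether every combination of quasi-identifier values appears at least k times in a table, before and after a simple generalisation step. It is not the authors' method: the helper names (`generalise_age`, `is_k_anonymous`), the choice of quasi-identifiers, and the toy records are all assumptions made for this example.

```python
from collections import Counter


def generalise_age(age, width=10):
    """Replace an exact age with a range such as '30-39' (hypothetical generalisation)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(size >= k for size in groups.values())


# Toy records, invented for illustration only.
raw = [
    {"age": 31, "postcode": "CF10", "label": "yes"},
    {"age": 35, "postcode": "CF10", "label": "no"},
    {"age": 42, "postcode": "CF24", "label": "yes"},
    {"age": 47, "postcode": "CF24", "label": "yes"},
]

# Generalising the age column merges records into larger equivalence classes.
generalised = [{**r, "age": generalise_age(r["age"])} for r in raw]

qi = ["age", "postcode"]
raw_ok = is_k_anonymous(raw, qi, 2)                  # every raw record is unique
generalised_ok = is_k_anonymous(generalised, qi, 2)  # two classes of size 2
```

In this toy table the raw records fail the check for k = 2, while the generalised records pass it. The class label column is left untouched, which is why a decision tree can still be trained on the generalised table, although with coarser attribute values.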


Paper Citation


in Harvard Style

Davies J. and Shao J. (2022). Utility of Anonymised Data in Decision Tree Derivation. In Proceedings of the 8th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-553-1, pages 273-280. DOI: 10.5220/0010778300003120


in Bibtex Style

@conference{icissp22,
author={Jack Davies and Jianhua Shao},
title={Utility of Anonymised Data in Decision Tree Derivation},
booktitle={Proceedings of the 8th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2022},
pages={273-280},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010778300003120},
isbn={978-989-758-553-1},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 8th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - Utility of Anonymised Data in Decision Tree Derivation
SN - 978-989-758-553-1
AU - Davies J.
AU - Shao J.
PY - 2022
SP - 273
EP - 280
DO - 10.5220/0010778300003120