On the Effect of Dataset Size and Composition for Privacy Evaluation

Danai Georgiou, Carlos Franzreb, Tim Polzehl

2025

Abstract

Speaker anonymization is the practice of concealing a speaker’s identity and is commonly used for privacy protection in voice biometrics. As proposed by the Voice Privacy Challenge (VPC), Automatic Speaker Verification (ASV) is currently the de facto standard for privacy evaluation: speaker embeddings are extracted from speech samples and compared using a trained PLDA back-end model. We implement this ASV system to systematically explore the influence of two factors on ASV performance: a) the number of speakers to be evaluated, and b) the number of utterances per speaker to be compared. The experiments cover the privacy evaluation of StarGANv2-VC and kNN-VC on the LibriSpeech dataset. The results indicate that the validity and reliability of privacy scores inherently depend on the evaluation dataset. They further show that limiting the number of speakers and utterances per speaker can reduce the evaluation time by 99% while keeping the reliability of the scores at a comparable level.
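
To make the two factors named in the abstract concrete, the following is a minimal sketch of how an evaluation set could be reduced along both of them, and how strongly that shrinks the number of comparisons. It is not the authors' implementation: the function names, the toy speaker and utterance identifiers, and the assumption that every utterance is compared with every other one are all illustrative, and the trial lists used in the paper may be built differently.

```python
import random


def subsample_eval_set(utts_by_speaker, n_speakers, n_utts_per_speaker, seed=0):
    """Pick n_speakers at random, then up to n_utts_per_speaker utterances each.

    Hypothetical helper: utts_by_speaker maps a speaker ID to a list of
    utterance IDs. Sorting before sampling keeps the result reproducible.
    """
    rng = random.Random(seed)
    speakers = rng.sample(sorted(utts_by_speaker), n_speakers)
    subset = {}
    for spk in speakers:
        utts = sorted(utts_by_speaker[spk])
        subset[spk] = rng.sample(utts, min(n_utts_per_speaker, len(utts)))
    return subset


def count_trials(utts_by_speaker):
    """Number of unordered utterance pairs under exhaustive pairwise scoring.

    Assumption for illustration only: every utterance is compared with every
    other utterance, regardless of speaker.
    """
    total = sum(len(utts) for utts in utts_by_speaker.values())
    return total * (total - 1) // 2


# Toy data: 40 speakers with 100 utterances each (not the LibriSpeech layout).
full = {
    f"spk{i:02d}": [f"spk{i:02d}_utt{j:03d}" for j in range(100)]
    for i in range(40)
}
small = subsample_eval_set(full, n_speakers=10, n_utts_per_speaker=10)

print(count_trials(full))   # 7998000
print(count_trials(small))  # 4950
```

Because the number of pairs grows quadratically with the number of utterances, even a moderate reduction in speakers and utterances per speaker removes most of the comparisons in this toy setting. Whether the resulting scores stay reliable is the empirical question the paper investigates; the sketch says nothing about that.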

Paper Citation


in Harvard Style

Georgiou D., Franzreb C. and Polzehl T. (2025). On the Effect of Dataset Size and Composition for Privacy Evaluation. In Proceedings of the 11th International Conference on Information Systems Security and Privacy - Volume 2: ICISSP; ISBN 978-989-758-735-1, SciTePress, pages 510-517. DOI: 10.5220/0013152900003899


in Bibtex Style

@conference{icissp25,
author={Danai Georgiou and Carlos Franzreb and Tim Polzehl},
title={On the Effect of Dataset Size and Composition for Privacy Evaluation},
booktitle={Proceedings of the 11th International Conference on Information Systems Security and Privacy - Volume 2: ICISSP},
year={2025},
pages={510-517},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013152900003899},
isbn={978-989-758-735-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Conference on Information Systems Security and Privacy - Volume 2: ICISSP
TI - On the Effect of Dataset Size and Composition for Privacy Evaluation
SN - 978-989-758-735-1
AU - Georgiou D.
AU - Franzreb C.
AU - Polzehl T.
PY - 2025
SP - 510
EP - 517
DO - 10.5220/0013152900003899
PB - SciTePress