XPCA Gen: Extended PCA Based Tabular Data Generation Model
Sreekala Padinjarekkara, Jessica Alecci, Mirela Popa
2024
Abstract
The proposed method, XPCA Gen, introduces a novel approach for synthetic tabular data generation that utilises relevant patterns present in the data. It does so using principal components obtained through XPCA (a probabilistic interpretation of standard PCA) decomposition of the original data. Since new data points are obtained by synthesizing the principal components, the generated data is an accurate, noise-reduced representation of the original data with a good diversity of data points. The experimental results obtained on benchmark datasets (e.g. CMC, PID) demonstrate its performance on ML utility metrics (accuracy, precision, recall), showing its ability to capture inherent patterns in the dataset. Alongside the ML utility metrics, a high Hausdorff distance indicates diversity in the generated data without compromising statistical properties. Moreover, unlike complex neural network approaches, the method is not data-hungry. Overall, XPCA Gen emerges as a promising solution for data privacy preservation and robust model training with diverse samples.
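To make the general idea concrete, the sketch below generates synthetic rows by sampling in principal-component space and then measures how far the synthetic set lies from the original with a Hausdorff distance. It is an illustration under stated assumptions, not the paper's method: it uses plain PCA via SVD rather than XPCA, draws component scores from independent Gaussians, and runs on a made-up numeric table. The function names `pca_generate` and `hausdorff` are hypothetical.

```python
import numpy as np

def pca_generate(X, n_components, n_samples, rng):
    """Sample synthetic rows from a Gaussian fitted in plain-PCA score space.

    Assumption: standard PCA stands in for XPCA here.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions; S holds the singular values.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    # Standard deviation of the scores along each retained component.
    score_std = S[:n_components] / np.sqrt(len(X) - 1)
    # Draw new scores independently per component, then map back to feature space.
    scores = rng.normal(size=(n_samples, n_components)) * score_std
    return scores @ components + mean

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two point sets (rows are points)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

rng = np.random.default_rng(0)
# Toy numeric table standing in for a real dataset (hypothetical data).
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
X_syn = pca_generate(X, n_components=3, n_samples=200, rng=rng)
print(X_syn.shape, hausdorff(X, X_syn))
```

Keeping fewer components than features discards the low-variance directions, which is one way to read the abstract's claim that the generated data suppresses noise; a larger Hausdorff distance means at least some synthetic points lie further from every original point.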
Paper Citation
in Harvard Style
Padinjarekkara S., Alecci J. and Popa M. (2024). XPCA Gen: Extended PCA Based Tabular Data Generation Model. In Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM; ISBN 978-989-758-684-2, SciTePress, pages 141-151. DOI: 10.5220/0012568600003654
in Bibtex Style
@conference{icpram24,
author={Sreekala Padinjarekkara and Jessica Alecci and Mirela Popa},
title={XPCA Gen: Extended PCA Based Tabular Data Generation Model},
booktitle={Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2024},
pages={141-151},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012568600003654},
isbn={978-989-758-684-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - XPCA Gen: Extended PCA Based Tabular Data Generation Model
SN - 978-989-758-684-2
AU - Padinjarekkara S.
AU - Alecci J.
AU - Popa M.
PY - 2024
SP - 141
EP - 151
DO - 10.5220/0012568600003654
PB - SciTePress