Tab-VAE: A Novel VAE for Generating Synthetic Tabular Data
Syed Tazwar, Max Knobbout, Enrique Hortal Quesada, Mirela Popa
2024
Abstract
Variational Autoencoders (VAEs) suffer from the well-known problem of overpruning, or posterior collapse, which arises from strong regularization when working in a sufficiently high-dimensional latent space. When VAEs are used to generate tabular data, one-hot encoding of categorical variables dramatically expands the dimensionality of the feature space, making multi-class categorical data challenging to model. In this paper, we propose Tab-VAE, a novel VAE-based approach to generating synthetic tabular data that tackles this challenge by introducing a sampling technique for categorical variables at inference. A detailed review of the current state-of-the-art models shows that most tabular data generation approaches draw their methodology from Generative Adversarial Networks (GANs), while the simpler, more stable VAE approach is overlooked. Our extensive evaluation of Tab-VAE against other leading generative models shows that Tab-VAE significantly improves on state-of-the-art VAEs. It also shows that Tab-VAE outperforms the best GAN-based tabular data generators, paving the way for a powerful and less computationally expensive tabular data generation model.
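The abstract does not spell out the sampling technique, so the following is only an illustrative sketch and not the authors' implementation. It shows two ideas the abstract touches on: how one-hot encoding inflates the feature dimension, and how decoding a categorical column by sampling from a softmax distribution differs from always taking the most probable category. The column cardinalities, the logits, and all function names are hypothetical, and the softmax-sampling decoder is an assumption about what "sampling at inference" could look like.

```python
import math
import random


def one_hot_width(cardinalities):
    """Total number of columns after one-hot encoding each categorical column."""
    return sum(cardinalities)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_argmax(logits):
    """Deterministic decoding: always return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def decode_sample(logits, rng):
    """Stochastic decoding: draw an index from the softmax distribution.

    Assumption: this is one plausible form of sampling at inference,
    not necessarily the scheme used in Tab-VAE.
    """
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical table: three categorical columns with 5, 40 and 300 levels.
    print(one_hot_width([5, 40, 300]))  # 345 one-hot columns from 3 variables

    rng = random.Random(0)        # fixed seed for reproducibility
    logits = [2.0, 1.0, 0.5]      # hypothetical decoder outputs for one column
    draws = [decode_sample(logits, rng) for _ in range(1000)]
    print("argmax always yields:", decode_argmax(logits))
    print("sampled category counts:", [draws.count(i) for i in range(3)])
```

Argmax decoding can only ever emit the single most probable category for a given decoder output, whereas sampling lets less probable categories appear in proportion to their predicted probability, which is one reason a sampling step may help reproduce the marginal distribution of multi-class columns.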
Paper Citation
in Harvard Style
Tazwar S., Knobbout M., Hortal Quesada E. and Popa M. (2024). Tab-VAE: A Novel VAE for Generating Synthetic Tabular Data. In Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM; ISBN 978-989-758-684-2, SciTePress, pages 17-26. DOI: 10.5220/0012302400003654
in Bibtex Style
@conference{icpram24,
author={Syed Tazwar and Max Knobbout and Enrique Hortal Quesada and Mirela Popa},
title={Tab-VAE: A Novel VAE for Generating Synthetic Tabular Data},
booktitle={Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2024},
pages={17-26},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012302400003654},
isbn={978-989-758-684-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Tab-VAE: A Novel VAE for Generating Synthetic Tabular Data
SN - 978-989-758-684-2
AU - Tazwar S.
AU - Knobbout M.
AU - Hortal Quesada E.
AU - Popa M.
PY - 2024
SP - 17
EP - 26
DO - 10.5220/0012302400003654
PB - SciTePress