Impact of Using GAN Generated Synthetic Data for the Classification of Chemical Foam in Low Data Availability Environments
Toon Stuyck, Eric Demeester
2024
Abstract
One of the main challenges of using machine learning in the chemical sector is a lack of high-quality labeled data. Data on certain events, e.g. an anomaly during a production process, can be extremely rare or very costly to generate. Even when data is available, it often requires highly educated observers to annotate it correctly. The performance of supervised classification algorithms can be drastically reduced when they are confronted with limited amounts of training data. Data augmentation is typically used to increase the amount of available training data, but it carries a risk of overfitting or loss of information. In recent years, Generative Adversarial Networks have been able to generate realistic-looking synthetic data, even from small amounts of training data. This paper demonstrates the feasibility of using synthetic data generated by a Generative Adversarial Network to improve classification results, via a comparison with and without standard augmentation methods such as scaling and rotation. It also proposes a methodology for combining original and synthetic data to achieve the best classifier result and for quantitatively verifying the generalization of the classifier using an explainable AI method. The proposed methodology compares favourably to using no augmentation or standard augmentation methods in the case of classification of chemical foam.
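The abstract does not say how original and synthetic data are combined, so the following is only an illustrative sketch of one plausible approach: keep every original sample and add synthetic samples until they make up a chosen fraction of the training set. The function name `mix_real_and_synthetic` and the `synthetic_fraction` parameter are hypothetical and not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def mix_real_and_synthetic(
    real: Sequence[T],
    synthetic: Sequence[T],
    synthetic_fraction: float,
    seed: int = 0,
) -> List[Tuple[T, str]]:
    """Build a shuffled training list from real and synthetic samples.

    Hypothetical helper (not the paper's method): all real samples are
    kept, and synthetic samples are added so that they form roughly
    `synthetic_fraction` of the result, capped by how many are available.
    Each item is tagged with its origin so the mix can be inspected later.
    """
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")

    rng = random.Random(seed)

    # Solve n_syn / (n_real + n_syn) = synthetic_fraction for n_syn.
    n_syn = round(len(real) * synthetic_fraction / (1.0 - synthetic_fraction))
    n_syn = min(n_syn, len(synthetic))

    # Draw synthetic samples without replacement.
    chosen = rng.sample(list(synthetic), n_syn)

    mixed = [(x, "real") for x in real] + [(x, "synthetic") for x in chosen]
    rng.shuffle(mixed)
    return mixed
```

Sweeping `synthetic_fraction` over a few values and evaluating the classifier on a held-out set of original data only would be one way to look for the best-performing mix under these assumptions.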
Paper Citation
in Harvard Style
Stuyck T. and Demeester E. (2024). Impact of Using GAN Generated Synthetic Data for the Classification of Chemical Foam in Low Data Availability Environments. In Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM; ISBN 978-989-758-684-2, SciTePress, pages 620-627. DOI: 10.5220/0012305300003654
in Bibtex Style
@conference{icpram24,
author={Toon Stuyck and Eric Demeester},
title={Impact of Using GAN Generated Synthetic Data for the Classification of Chemical Foam in Low Data Availability Environments},
booktitle={Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2024},
pages={620-627},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012305300003654},
isbn={978-989-758-684-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Impact of Using GAN Generated Synthetic Data for the Classification of Chemical Foam in Low Data Availability Environments
SN - 978-989-758-684-2
AU - Stuyck T.
AU - Demeester E.
PY - 2024
SP - 620
EP - 627
DO - 10.5220/0012305300003654
PB - SciTePress