Towards Fairness in Machine Learning: Balancing Racially Imbalanced Datasets Through Data Augmentation and Generative AI

Anthonie Schaap; Sofoklis Kitharidis; Niki van Stein

doi:10.5220/0013002600003837

2024

Abstract

Existing AI models trained on facial images are often heavily biased towards certain ethnic groups because their training data contain unrealistic ethnicity splits. This study examines ethnic biases in facial recognition AI models that result from skewed dataset representations. Various data augmentation and generative AI techniques were evaluated to mitigate these biases, with fairness metrics used to measure improvements. Our methodology included balancing training datasets with synthetic data generated through Generative Adversarial Networks (GANs), targeting underrepresented ethnic groups. Experimental results indicate that these interventions effectively reduce bias, enhancing the fairness of AI models across different ethnicities. This research contributes practical approaches for adjusting dataset imbalances in AI systems, ultimately improving the reliability and ethical deployment of facial recognition technologies.

Paper Citation

in Harvard Style

Schaap A., Kitharidis S. and van Stein N. (2024). Towards Fairness in Machine Learning: Balancing Racially Imbalanced Datasets Through Data Augmentation and Generative AI. In Proceedings of the 16th International Joint Conference on Computational Intelligence - Volume 1: NCTA; ISBN 978-989-758-721-4, SciTePress, pages 584-592. DOI: 10.5220/0013002600003837

in Bibtex Style

@conference{ncta24,
author={Anthonie Schaap and Sofoklis Kitharidis and Niki van Stein},
title={Towards Fairness in Machine Learning: Balancing Racially Imbalanced Datasets Through Data Augmentation and Generative AI},
booktitle={Proceedings of the 16th International Joint Conference on Computational Intelligence - Volume 1: NCTA},
year={2024},
pages={584--592},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013002600003837},
isbn={978-989-758-721-4},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Joint Conference on Computational Intelligence - Volume 1: NCTA
TI - Towards Fairness in Machine Learning: Balancing Racially Imbalanced Datasets Through Data Augmentation and Generative AI
SN - 978-989-758-721-4
AU - Schaap A.
AU - Kitharidis S.
AU - van Stein N.
PY - 2024
SP - 584
EP - 592
DO - 10.5220/0013002600003837
PB - SciTePress