The Imbalance Data Handling of XGBoost in Insurance Fraud Detection
Nathanael Averro, Hendri Murfi, Gianinna Ardaneswari
2023
Abstract
Insurance fraud is an emerging problem that threatens the insurance industry because of the potentially severe losses it causes. Many conventional measures have been used to detect fraud, such as releasing blacklists and investigating every claim more deeply, but these measures tend to be costly. For that reason, machine learning has been proposed as a decision support system for detecting potential insurance fraud. Insurance fraud detection problems often involve data with imbalanced classes. This paper examines the imbalanced class handling of XGBoost in predicting insurance fraud. Our simulation shows that the weighted-XGBoost outperforms the other approaches in handling the imbalanced class problem. The imbalance-XGBoost models are quite reliable in improving on the base models: they achieve up to a 28% improvement in the recall score on the minority class compared to the basic XGBoost model. The precision score of both imbalance-XGBoost models decreases, while the weighted-XGBoost model improves the precision and recall scores simultaneously.
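The abstract reports precision and recall on the minority (fraud) class and refers to class weighting. The following is a minimal sketch of how those quantities are commonly computed, not the authors' implementation. The function names and toy labels are hypothetical. The negative-to-positive ratio shown here is the heuristic that the XGBoost documentation suggests for its `scale_pos_weight` parameter; the abstract does not state whether the weighted-XGBoost model in the paper uses it.

```python
from collections import Counter


def class_ratio_weight(labels):
    """Return (number of negatives) / (number of positives).

    Assumption: labels are 0 (legitimate) and 1 (fraud). This ratio is a
    common heuristic for up-weighting the minority class.
    """
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive (fraud) examples in labels")
    return counts[0] / counts[1]


def minority_precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels, made up for illustration only: 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]

weight = class_ratio_weight(y_true)
precision, recall = minority_precision_recall(y_true, y_pred)
print(weight, precision, recall)
```

With an imbalanced label set, a classifier can score high accuracy while missing most fraud cases, which is why recall on the minority class is the figure the abstract highlights.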
Paper Citation
in Harvard Style
Averro N., Murfi H. and Ardaneswari G. (2023). The Imbalance Data Handling of XGBoost in Insurance Fraud Detection. In Proceedings of the 12th International Conference on Data Science, Technology and Applications - Volume 1: DATA; ISBN 978-989-758-664-4, SciTePress, pages 460-467. DOI: 10.5220/0012126900003541
in Bibtex Style
@conference{data23,
author={Nathanael Averro and Hendri Murfi and Gianinna Ardaneswari},
title={The Imbalance Data Handling of XGBoost in Insurance Fraud Detection},
booktitle={Proceedings of the 12th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2023},
pages={460-467},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012126900003541},
isbn={978-989-758-664-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 12th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - The Imbalance Data Handling of XGBoost in Insurance Fraud Detection
SN - 978-989-758-664-4
AU - Averro N.
AU - Murfi H.
AU - Ardaneswari G.
PY - 2023
SP - 460
EP - 467
DO - 10.5220/0012126900003541
PB - SciTePress