The Comparisons of the Machine Learning Models in Credit Card Default under Imbalance and Multi-features Dataset

Zhongtian Yu

2022

Abstract

The COVID-19 pandemic caused a severe crisis in global financial markets and greatly weakened the risk tolerance of banks around the world, which makes better risk management in banks necessary. Advances in machine learning have made models easier to implement and predictions more accurate. In risk management, machine learning models allow banks to predict potential risks more accurately and give them more opportunities to avoid those risks. China has been the fastest country to recover under COVID-19, but previous studies lack analysis of data from the Bank of China. The paper therefore processes data from the Bank of China to train five models (support vector machine, decision tree, logistic regression, bagging and random forest) and selects the best model by three criteria: effectiveness, efficiency and stability. To achieve the best classification, the paper also tests how much feature selection improves each of the five models. To keep the results fair and general, SMOTE is used to address the class imbalance in the data, and grid search is used to obtain the best parameters for each model, so that the choice of parameters does not distort the comparison between models. The decision tree model performs better than the others once complexity and training time are taken into account, and feature selection does not improve the performance of the tree model.
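The abstract names SMOTE as the remedy for class imbalance without describing it. As a rough illustration only, the core idea (creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours) can be sketched in plain Python. This is not the paper's code: the function name `smote`, its parameters, and the toy data are assumptions made for this example, and a real study would more likely use an established library implementation.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        # Place the new sample at a random position on the segment base -> other.
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data, made up for illustration: 3 defaulters vs. 9 non-defaulters.
defaulters = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
non_defaulters = [[float(x), float(x % 3)] for x in range(5, 14)]

# Oversample the minority class until both classes have the same size.
balanced_defaulters = defaulters + smote(
    defaulters, n_new=len(non_defaulters) - len(defaulters), k=2
)
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the original minority data occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic samples.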


Paper Citation


in Harvard Style

Yu Z. (2022). The Comparisons of the Machine Learning Models in Credit Card Default under Imbalance and Multi-features Dataset. In Proceedings of the International Conference on Big Data Economy and Digital Management - Volume 1: BDEDM, ISBN 978-989-758-593-7, pages 864-872. DOI: 10.5220/0011355600003440


in Bibtex Style

@conference{bdedm22,
author={Zhongtian Yu},
title={The Comparisons of the Machine Learning Models in Credit Card Default under Imbalance and Multi-features Dataset},
booktitle={Proceedings of the International Conference on Big Data Economy and Digital Management - Volume 1: BDEDM},
year={2022},
pages={864-872},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011355600003440},
isbn={978-989-758-593-7},
}


in EndNote Style

TY - CONF

JO - Proceedings of the International Conference on Big Data Economy and Digital Management - Volume 1: BDEDM
TI - The Comparisons of the Machine Learning Models in Credit Card Default under Imbalance and Multi-features Dataset
SN - 978-989-758-593-7
AU - Yu Z.
PY - 2022
SP - 864
EP - 872
DO - 10.5220/0011355600003440