Variable Importance Analysis in Default Prediction using Machine Learning Techniques
Başak Gültekin, Betül Erdoğdu Şakar
2018
Abstract
In this study, several data mining techniques were applied to a real credit data set from a public bank to provide automated and objective credit scoring. A two-step methodology was used: first determining the variables to include in the model, then choosing the model that classifies a potential credit application as “bad credit (default)” or “good credit (not default)”. The phrases “bad credit” and “good credit” are used as class labels because this is how they are used in banking jargon in Turkey. For this two-step procedure, variable selection algorithms such as Random Forest and Boruta and machine learning algorithms such as Logistic Regression, Random Forest, and Artificial Neural Network were tried. At the end of the feature selection phase, the CRA_Score and III_Score variables were identified as the most important variables; occupation and bank product number were also found to be predictors. In the classification phase, the Neural Network model was the best model, with higher accuracy and lower average squared error than the others, and the Random Forest model performed better than the Logistic Regression model.
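The two criteria used above to compare classifiers, accuracy and average squared error, can be illustrated with a minimal sketch. It assumes that average squared error means the mean squared difference between the predicted default probability and the 0/1 class label, and that a 0.5 probability cutoff defines the predicted class; the paper may define either differently. The function names, the cutoff, and the toy numbers are all hypothetical and are not taken from the study's data or results.

```python
def accuracy(labels, probs, threshold=0.5):
    """Share of cases whose thresholded prediction matches the 0/1 label.

    The 0.5 cutoff is an assumption made for this illustration.
    """
    hits = sum((p >= threshold) == bool(y) for y, p in zip(labels, probs))
    return hits / len(labels)


def average_squared_error(labels, probs):
    """Mean of (label - predicted probability)^2 over all cases (assumed definition)."""
    return sum((y - p) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical toy data: 1 = "bad credit" (default), 0 = "good credit" (not default).
labels = [1, 0, 0, 1, 0]
model_a = [0.9, 0.2, 0.1, 0.6, 0.3]  # made-up predicted default probabilities
model_b = [0.6, 0.4, 0.6, 0.4, 0.3]  # made-up predictions from a second model

for name, probs in (("model_a", model_a), ("model_b", model_b)):
    acc = accuracy(labels, probs)
    ase = average_squared_error(labels, probs)
    print(f"{name}: accuracy={acc:.3f}, average squared error={ase:.3f}")
```

Under these assumptions, a model is preferred when its accuracy is higher and its average squared error is lower. The second measure also rewards probabilities that are close to the true label, not just correct class assignments.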
Paper Citation
in Harvard Style
Gültekin B. and Erdoğdu Şakar B. (2018). Variable Importance Analysis in Default Prediction using Machine Learning Techniques. In Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-318-6, pages 56-62. DOI: 10.5220/0006872400560062
in Bibtex Style
@conference{data18,
author={Başak Gültekin and Betül Erdoğdu Şakar},
title={Variable Importance Analysis in Default Prediction using Machine Learning Techniques},
booktitle={Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2018},
pages={56-62},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006872400560062},
isbn={978-989-758-318-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Variable Importance Analysis in Default Prediction using Machine Learning Techniques
SN - 978-989-758-318-6
AU - Gültekin B.
AU - Erdoğdu Şakar B.
PY - 2018
SP - 56
EP - 62
DO - 10.5220/0006872400560062