Variable Importance Analysis in Default Prediction using Machine Learning Techniques

Başak Gültekin, Betül Erdoğdu Şakar

Abstract

In this study, different data mining techniques were applied to a real credit data set from a public bank to provide automated and objective credit scoring. A two-step methodology was used: first determining the variables to be included in the model, and then deciding on the model that classifies a potential credit application as “bad credit” (default) or “good credit” (not default). The phrases “bad credit” and “good credit” are used as class labels because this is how they are used in banking jargon in Turkey. For this two-step procedure, variable selection algorithms such as Random Forest and Boruta, and machine learning algorithms such as Logistic Regression, Random Forest, and Artificial Neural Network, were tried. At the end of the feature selection phase, the CRA_Score and III_Score variables were determined to be the most important variables. Occupation and bank product number were also found to be predictor variables. In the classification phase, the Neural Network model was the best model, with higher accuracy and a low average squared error; the Random Forest model also performed better than the Logistic Regression model.
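The two-step idea (rank variables by importance, then classify with the retained ones) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: it swaps in permutation importance on a hand-written logistic regression instead of Random Forest or Boruta, and it uses synthetic data. The feature names `score_a`, `score_b`, and `noise`, the coefficients in `make_data`, and the 0.02 selection threshold are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical feature names; they are NOT the paper's variables.
FEATURES = ["score_a", "score_b", "noise"]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n, rng):
    """Synthetic applicants: two informative scores and one pure-noise column."""
    X, y = [], []
    for _ in range(n):
        a, b, noise = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        p_default = sigmoid(-2.0 * a - 1.0 * b)  # assumed ground truth
        X.append([a, b, noise])
        y.append(1 if rng.random() < p_default else 0)  # 1 = "bad credit"
    return X, y


def predict_proba(model, row):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b)


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Logistic regression fitted by full-batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            err = predict_proba((w, b), row) - label
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, X, y):
    hits = sum((predict_proba(model, row) >= 0.5) == (label == 1)
               for row, label in zip(X, y))
    return hits / len(y)


def permutation_importance(model, X, y, rng, repeats=5):
    """Mean drop in accuracy when one column is shuffled, per feature."""
    base = accuracy(model, X, y)
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            total += base - accuracy(model, X_perm, y)
        drops.append(total / repeats)
    return drops


rng = random.Random(0)
X_train, y_train = make_data(1000, rng)
X_test, y_test = make_data(1000, rng)

# Step 1: rank variables on held-out data and keep those above the threshold.
model = fit_logistic(X_train, y_train)
base_acc = accuracy(model, X_test, y_test)
importances = permutation_importance(model, X_test, y_test, rng)
selected = [name for name, imp in zip(FEATURES, importances) if imp > 0.02]

# Step 2: refit the classifier using only the selected variables.
keep = [FEATURES.index(name) for name in selected]
reduced_model = fit_logistic([[r[j] for j in keep] for r in X_train], y_train)
reduced_acc = accuracy(reduced_model, [[r[j] for j in keep] for r in X_test], y_test)

for name, imp in zip(FEATURES, importances):
    print(f"{name}: mean accuracy drop {imp:.3f}")
print("selected:", selected, "| accuracy after selection:", round(reduced_acc, 3))
```

Measuring importance on held-out rows rather than on the training rows keeps the ranking from rewarding variables the model has merely memorised. In practice a library implementation, such as a tree-ensemble importance measure or a Boruta package, would replace the hand-written pieces here.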


Paper Citation


in Harvard Style

Gültekin B. and Erdoğdu Şakar B. (2018). Variable Importance Analysis in Default Prediction using Machine Learning Techniques. In Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-318-6, pages 56-62. DOI: 10.5220/0006872400560062


in Bibtex Style

@conference{data18,
author={Başak Gültekin and Betül Erdoğdu Şakar},
title={Variable Importance Analysis in Default Prediction using Machine Learning Techniques},
booktitle={Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2018},
pages={56-62},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006872400560062},
isbn={978-989-758-318-6},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Variable Importance Analysis in Default Prediction using Machine Learning Techniques
SN - 978-989-758-318-6
AU - Gültekin B.
AU - Erdoğdu Şakar B.
PY - 2018
SP - 56
EP - 62
DO - 10.5220/0006872400560062