term deposit. The dataset is multivariate with 41,188 instances (4,640 subscriptions), 21 attributes (5 real, 5 integer, and 11 object), and no missing values; it is available at https://archive.ics.uci.edu/ml/datasets/Bank+Marketing (Moro et al., 2014).
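The reported shape and class counts can be checked directly with pandas; in the sketch below, the archive URL and the CSV path inside the zip are assumptions about the UCI repository layout, not details given in the text.

import io
import urllib.request
import zipfile

import pandas as pd

# Assumed UCI mirror of the Bank Marketing data (bank-additional.zip);
# the file path inside the archive is likewise an assumption.
URL = ("https://archive.ics.uci.edu/ml/machine-learning-databases/"
       "00222/bank-additional.zip")

with urllib.request.urlopen(URL) as resp:
    archive = zipfile.ZipFile(io.BytesIO(resp.read()))

# The full file is semicolon-separated.
df = pd.read_csv(archive.open("bank-additional/bank-additional-full.csv"),
                 sep=";")

print(df.shape)                  # expected: (41188, 21)
print((df["y"] == "yes").sum())  # expected: 4640 subscriptions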
The Default of Credit Card Clients dataset contains information on default payments, demographic factors, credit data, payment history, and bill statements of credit card clients in Taiwan from April to September 2005. The classification goal is to predict whether the client is credible. The dataset is multivariate with 30,000 instances (6,636 positive instances), 24 integer attributes, and no missing values; it is available at https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients (Bache and Lichman, 2013).
The Kaggle Credit Card dataset is a modified version of Default of Credit Card Clients, covering the same period. Both datasets have the same classification goal: predict whether the client is credible. However, Kaggle Credit Card has more features (31 attributes, all numerical) and a lower number of positive credible-client instances. The dataset has 284,807 instances, of which only 492 are positive; it is highly unbalanced, with the positive class accounting for 0.172% of all instances. It is available at https://www.kaggle.com/uciml/default-of-credit-card-clients-Dataset (Dal Pozzolo et al., 2015).
The Statlog German Credit dataset contains categorical and symbolic attributes, including credit history, purpose, personal client data, nationality, and other information. The goal is to classify clients, using a set of attributes, as good or bad credit risks. We used an alternative version of the dataset provided by Strathclyde University, in which the file was edited and several indicator variables were added to make it suitable for algorithms that cannot cope with categorical variables. Several attributes that are ordered categorically (such as attribute 17) were coded as integers. The dataset is multivariate with 1,000 instances (300 classified as Bad), 24 integer attributes, and no missing values; it is available at https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data) (Hofmann, 1994).
The Statlog Australian Credit Approval dataset is used for the analysis of credit card operations. All attribute names and values were anonymized to protect data privacy. The dataset is multivariate with 690 instances (307 labeled as 1), 14 attributes (3 real and 11 integer), and no missing values; it is available at http://archive.ics.uci.edu/ml/datasets/statlog+(australian+credit+approval) (Quinlan, 1987).
For each dataset, we preprocessed the attributes, sampled the data, and split it into 90% for training and 10% for testing. After splitting, we employed stratified ten-fold cross-validation with fifteen seeds (55, 67, 200, 245, 256, 302, 327, 336, 385, 407, 423, 456, 489, 515, 537) and nine predictive methods. First, the methods were run with the scikit-learn default hyperparameters, and the F1 score and AUROC metrics were measured. Statistical tests were then performed on the measured metrics to rank differences among methods. Finally, we employed Optuna to optimize the hyperparameters and ran the classification methods again.
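A minimal sketch of this evaluation protocol for a single method is given below; the synthetic data, the choice of Logistic Regression as the placeholder classifier, and the averaging of per-fold scores are illustrative assumptions rather than details taken from the study.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, train_test_split

SEEDS = [55, 67, 200, 245, 256, 302, 327, 336, 385,
         407, 423, 456, 489, 515, 537]

# Placeholder data; in the study each dataset is preprocessed
# and sampled before this step.
X, y = make_classification(n_samples=2000, weights=[0.9], random_state=0)

f1s, aucs = [], []
for seed in SEEDS:
    # 90%/10% train/test split, then stratified 10-fold
    # cross-validation on the training portion.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.1, stratify=y, random_state=seed)
    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    for tr_idx, va_idx in skf.split(X_tr, y_tr):
        clf = LogisticRegression()  # scikit-learn defaults
        clf.fit(X_tr[tr_idx], y_tr[tr_idx])
        proba = clf.predict_proba(X_tr[va_idx])[:, 1]
        f1s.append(f1_score(y_tr[va_idx], clf.predict(X_tr[va_idx])))
        aucs.append(roc_auc_score(y_tr[va_idx], proba))

print(f"F1 = {np.mean(f1s):.3f}, AUROC = {np.mean(aucs):.3f}")

The per-fold scores collected this way could then feed the statistical ranking step, for instance with scipy.stats.friedmanchisquare, although this excerpt does not name the specific test used.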
The main scikit-learn default hyperparameters used to test the different methods are as follows (a sketch instantiating the nine methods appears after the list):
• GaussianNB: priors=None and var_smoothing=1e-09.
• Logistic Regression: C=1.0, fit_intercept=True, intercept_scaling=1, max_iter=100, penalty='l2', random_state=None, solver='warn', and tol=0.0001.
• kNN: algorithm='auto', leaf_size=30, metric='minkowski', n_neighbors=5, p=2, and weights='uniform'.
• SVC: C=1.0, cache_size=200, decision_function_shape='ovr', degree=3, kernel='rbf', shrinking=True, and tol=0.001.
• Decision Tree: criterion='gini', min_samples_split=2, and splitter='best'.
• Random Forest: bootstrap=True, criterion='gini', min_samples_leaf=1, min_samples_split=2, and n_estimators='warn'.
• Gradient Boosting: criterion='friedman_mse', learning_rate=0.1, loss='deviance', max_depth=3, min_samples_leaf=1, min_samples_split=2, n_estimators=100, subsample=1.0, tol=0.0001, and validation_fraction=0.1.
• XGBoost: base_score=0.5, booster='gbtree', learning_rate=0.1, max_depth=3, and n_estimators=100.
• Multilayer Perceptron: activation='relu', hidden_layer_sizes=(100,), learning_rate='constant', max_iter=200, solver='adam', and tol=0.0001.
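Under the assumption that constructors called without arguments reproduce these defaults in the library versions used (values such as solver='warn' and n_estimators='warn' were version-dependent placeholders in older scikit-learn releases), a minimal sketch collecting the nine default-configured methods could look like this:

from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (RandomForestClassifier,
                              GradientBoostingClassifier)
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier

# Default-configured instances of the nine methods; each library
# supplies its own defaults, matching the values listed above for
# the versions used in the study.
METHODS = {
    "GaussianNB": GaussianNB(),
    "Logistic Regression": LogisticRegression(),
    "kNN": KNeighborsClassifier(),
    "SVC": SVC(),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
    "Gradient Boosting": GradientBoostingClassifier(),
    "XGBoost": XGBClassifier(),
    "Multilayer Perceptron": MLPClassifier(),
}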
We used Optuna to optimize the hyperparameters of the methods, running one study with 100 trials and the following search ranges:
• GaussianNB: none.
• Logistic Regression: C range: 1e-10 to 1e10.
• kNN: n_neighbors range: 1 to 100; distance range: 1 to 10.
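A sketch of how such a range could be expressed as an Optuna objective is shown below, using kNN as the example; the synthetic placeholder data and the reading of the distance range as the Minkowski power parameter p are assumptions, not details stated in the text.

import optuna
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data standing in for a preprocessed 90% training split.
X_train, y_train = make_classification(n_samples=500, random_state=0)

def objective(trial):
    # Search ranges from the list above; interpreting the distance
    # range as the Minkowski parameter p is an assumption.
    n_neighbors = trial.suggest_int("n_neighbors", 1, 100)
    p = trial.suggest_int("p", 1, 10)
    clf = KNeighborsClassifier(n_neighbors=n_neighbors, p=p)
    return cross_val_score(clf, X_train, y_train,
                           cv=10, scoring="f1").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)  # one study, 100 trials
print(study.best_params)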