Liquidity Stress Detection in the European Banking Sector

Richard Heuver

and Ron Triepels

1,2

De Nederlandsche Bank, Amsterdam, The Netherlands

Tilburg University, Tilburg, The Netherlands

Keywords:

Liquidity Stress, Risk Monitoring, Financial Market Infrastructures, Large-value Payment Systems.

Abstract:

Liquidity stress constitutes an ongoing threat to ﬁnancial stability in the banking sector. A bank that manages

its liquidity inadequately might ﬁnd itself unable to meet its payment obligations. These liquidity issues, in

turn, can negatively impact the liquidity position of many other banks due to contagion effects. For this reason,

central banks carefully monitor the payment activities of banks in ﬁnancial market infrastructures and try to

detect early-warning signs of liquidity stress. In this paper, we investigate whether this monitoring task can be

performed by supervised machine learning. We construct probabilistic classiﬁers that estimate the probability

that a bank faces liquidity stress. The classiﬁers are trained on a dataset consisting of various payment features

of European banks and which spans several known stress events. Our experimental results show that the

classiﬁers detect the periods in which the banks faced liquidity stress reasonably well.

1 INTRODUCTION

It is the nature of banks to attract deposits and pro-

vide loans. The maturity mismatch of short-term de-

posits versus long-term loans makes banks vulnerable

to liquidity risk. Liquidity risk is ”the risk that a ﬁrm

will not be able to meet efﬁciently both expected and

unexpected current and future cash ﬂow and collat-

eral needs” (BIS, 2008). When a bank does not man-

age its liquidity adequately, it might ﬁnd itself unable

to fulﬁll its short-term payment obligations and face

bankruptcy. These liquidity issues, in turn, can spread

across a payment system and affect the liquidity po-

sition of many other banks. For this reason, central

banks closely monitor the payment activities of banks

and try to anticipate early signs of liquidity stress.

In recent years, the payment data generated by Fi-

nancial Market Infrastructures (FMIs) has become an

important new source to detect liquidity risks. FMIs

are often called the ﬁnancial backbone of our modern

society. Their main purpose is to facilitate the clear-

ing, settlement, and recording of monetary and other

ﬁnancial transactions. The most important FMIs are

the Large-Value Payment Systems (LVPSs) which are

developed and maintained by central banks to process

high-value payments and administer monetary policy.

The transaction log generated by such systems pro-

vides detailed insight into the payment behavior of

banks and can be analyzed to detect cases were banks

manage their liquidity in an unsafe manner.

Several unsupervised methods have been pro-

posed for this purpose based on traditional statistics,

see e.g. (Heijmans and Heuver, 2014), and unsu-

pervised machine learning, see e.g. (Triepels et al.,

2018). The idea behind these methods is to derive the

patterns by which banks usually manage their liquid-

ity from the transaction log of an LVPS and search for

cases where the current payment behavior of banks

deviates from their expected patterns. Such anoma-

lies can be due to a bank facing liquidity stress which

forces it to change its payment behavior.

However, a drawback of these unsupervised meth-

ods is that it can be difﬁcult to determine what kind

of patterns are learned about the payment behavior of

banks. In addition, when there is a signiﬁcant devi-

ation between the expected and current payment be-

havior of a bank, it is often not clear whether this de-

viation is due to the bank facing liquidity stress or

whether the bank needs to pay some unusual one-time

payments that do not pose a real threat to its liquidity

position on the long-term.

This paper aims to investigate whether liquidity

stress at banks can also be detected by supervised

machine learning. In supervised machine learning, a

model is trained from a labeled training set contain-

ing explicit examples of the output to be predicted. A

supervised machine learning model can be trained to

detect whether a bank is likely facing liquidity stress

by learning the patterns that are characteristic for a

266

Heuver, R. and Triepels, R.

Liquidity Stress Detection in the European Banking Sector.

DOI: 10.5220/0007395602660274

In Proceedings of the 11th International Conference on Agents and Artiﬁcial Intelligence (ICAART 2019), pages 266-274

ISBN: 978-989-758-350-6

stressed and non-stressed bank. These patterns can be

derived from historical labeled payment data of a pre-

selected set of banks that faced known stress events

such as a takeover or bank run.

There are two main challenges to make this super-

vised method work in practice. First, stress events at

banks are quite rare and typically last for only a few

days which makes it difﬁcult to learn the patterns of

a stressed bank. Second, there is currently not much

data recorded about stress events at banks, and such

data is difﬁcult to obtain.

In this paper, we show how these challenges can

be addressed. We construct several probabilistic clas-

siﬁers that estimate the probability that a bank faces

liquidity stress. The classiﬁers are trained on a dataset

that describes the payment behavior of several Euro-

pean banks over the past ten years. We elaborate on

how to deal with the imbalance between stress and

non-stress examples by training the classiﬁers based

on a weighted loss function. Furthermore, we dis-

cuss how we labeled the dataset by searching online

for news articles about stress events at the banks. Al-

though the quality of the stress classes is not ideal,

we will show that the classiﬁers detect liquidity stress

reasonably well.

2 RELATED RESEARCH

Many papers have studied the problem of predicting

the emergence of a ﬁnancial crisis in a country. Fi-

nancial crises are predicted from historical panel data

consisting of macroeconomic variables such as the

GDP growth rate of a country. Traditionally, this has

been done by a logit model (Demirg-Kunt and Detra-

giache, 1998) or by the signal extraction method of

(Kaminsky and Reinhart, 1999). An extensive com-

parison of these two methods can be found in (Davis

and Karim, 2008). More recent papers have also ex-

plored how machine learning can be applied to predict

ﬁnancial crises. For example, (Chamon et al., 2007)

applied a random forest to predict the emergence of

a capital account crisis based on a wide range of

macroeconomic features categorized by four sectors

(external, ﬁscal, ﬁnancial, and corporate). Moreover,

(Fioramanti, 2008) constructed a multi-layer percep-

tron network to predict the emergence of a sovereign

debt crisis based on a large set of internal, external,

and debt related features of a country.

There are also many papers that have studied the

problem of predicting how well a bank performs. This

problem is called bank performance prediction. Typi-

cally, the performance of a bank is predicted from ﬁ-

nancial ratios that are derived from the ﬁnancial state-

ments of banks such as their balance sheets. Early

papers on this topic predict bank performance based

on statistical methods such as discriminant analysis,

see (Beaver, 1966; Altman, 1968). In recent years,

machine learning is also becoming increasingly pop-

ular in this research area. Several machine learning

techniques have been applied to perform bank per-

formance prediction including neural networks (Tam,

1991) and support-vector machines (Min and Lee,

2005). An extensive overview of these techniques can

be found in (Kumar and Ravi, 2007).

The problem studied in this paper is similar to

bank performance prediction. However, we predict

stress at banks based on features derived from the pay-

ment data generated by FMIs. We call this problem

liquidity stress detection. The use of payment data

has some advantages over ﬁnancial statements. Un-

like ﬁnancial statements, payment data can be made

available in near real-time, provides detailed insight

into the liquidity management of banks, and cannot

be easily manipulated (e.g. by window dressing).

3 LIQUIDITY STRESS

DETECTION

In this section, we formalize the problem of liquidity

stress detection (section 3.1 until 3.3). Furthermore,

we discuss how this problem can be solved by a lo-

gistic regression model (section 3.4) and multi-layer

perceptron network (section 3.5).

3.1 Notation

Let B = {b

, . . . , b

} be a set of n banks and T =

, . . . ,t

} a set of l time intervals. The time intervals

are consecutive, equally spaced, and collectively span

the operating time of the ﬁnancial system (e.g. by

days or hours). Furthermore, let:

(k)

= [x

(k)

, . . . , x

(k)

]

(1)

be a column vector of m payment features of bank b

at time interval t

. Each feature vector describes the

payment behavior of a bank at a particular time in-

terval and includes features related to the bank’s liq-

uidity position, payment ﬂows, and collateral. We de-

note the set of all feature vectors by X . Finally, let

(k)

∈ {0, 1} be the stress class that indicates whether

bank b

faces liquidity stress at time interval t

3.2 Classiﬁcation Problem

Our goal is to construct a probabilistic classiﬁer that

classiﬁes a feature vector by whether or not the cor-

Liquidity Stress Detection in the European Banking Sector

267

responding bank faces liquidity stress. We can deﬁne

this classiﬁer as a probability function:

f : X → [0, 1] (2)

where:

f (x

(k)

) = P(y

(k)

= 1|x

(k)

) (3)

is the conditional probability that bank b

faces liq-

uidity stress given that we observe feature vector x

(k)

at time interval t

. A bank is classiﬁed as facing liq-

uidity stress if f (x

(k)

) is high, i.e.:

φ(x

(k)

, ζ) =

(

1, if f (x

(k)

) ≥ ζ

0, otherwise

(4)

Here, ζ ∈ [0, 1] is a threshold close to one that deter-

mines how conﬁdent the classiﬁer needs to be to clas-

sify a feature vector as belonging to a stressed bank.

3.3 Model Assumptions

Throughout this paper, we consider the case of esti-

mating f under the following two assumptions:

1. Payment features and stress classes are indepen-

dent of each bank, i.e. (x

(a)

, y

(a)

) is independent

of (x

(b)

, y

(b)

) for each time interval t

∈ T and

a 6= b.

2. Payment features and stress classes are time

invariant, i.e. (x

(k)

, y

(k)

) is independent of

(k)

, y

(k)

) for each bank b

∈ B and a 6= b.

These assumptions do not hold in practice but greatly

simplify the detection of liquidity stress. It is well

known that liquidity issues can spread across banks

by contagion effects. Moreover, a bank that is cur-

rently facing liquidity stress has a higher probability

to be also stressed in the next time intervals since liq-

uidity issues can take quite some time before they are

resolved. We will show that, even by making these

strong simpliﬁcations, we can detect liquidity stress

quite well.

Not every classiﬁer is suitable to estimate f . We

want a classiﬁer that produces well-calibrated prob-

abilities and can deal with severely imbalanced data.

In our experiments, the probability of a feature vec-

tor belonging to a stressed bank was less than 0.1

percent. Such severely imbalanced data harms the

performance of many classiﬁers. A logistic regres-

sion model or multi-layer perception network is par-

ticularly suited to estimate f . Both types of mod-

els are probabilistic and have found to produce well-

calibrated probabilities in practice (Niculescu-Mizil

and Caruana, 2005). Moreover, they can be easily

adapted to deal with unbalanced data.

3.4 Logistic Regression

Logistic Regression (LR) is a simple probabilistic bi-

nary classiﬁer. It is an extension of multiple linear

regression to the case where the response variable of

the regression model is a binary variable.

We consider the following LR model to detect liq-

uidity stress:

ˆy

(k)

= σ(wx

(k)

+ b) (5)

where, w is a m-dimensional row vector of weights, b

is a bias term, and σ is the sigmoid function:

σ(x) =

1 + e

−x

(6)

The sigmoid function in equation 5 rescales the linear

combination to the range (0, 1). The output ˆy

(k)

is an

estimate of f (x

(k)

3.5 Multi-Layer Perceptron

A Multi-Layer Perceptron (MLP) network is a type

of feed-forward neural network. It is similar to an

LR model with the exception that it processes input

features through one or more hidden layers consisting

of non-linear computational units. These additional

hidden layers enable the MLP network to learn a non-

linear mapping from its inputs to its outputs.

We focus on an MLP network consisting of mul-

tiple hidden layers and a single sigmoid output. Let δ

be the number of layers or depth of the network and s

the size of the i-th layer. We denote the output of the

i-th layer by h

. The ﬁrst layer of the network is the

input layer, and the output of this layer is the feature

vector that is presented to the network:

= x

(k)

(7)

The input is processed through the hidden layers. The

output of the i-th layer is:

= ψ

)

i−1

+ b

) for 1 < i < δ (8)

where, W

is a s

by s

i−1

matrix of weights, b

is a s

dimensional column vector of bias terms, and ψ

)

(x)

is a set of s

non-linear activation functions that are

applied to each element of x. Usually, ψ is taken to be

the hyperbolic tanh function (LeCun et al., 1998) or

rectiﬁed linear function (Glorot et al., 2011). Finally,

the output of the last hidden layer h

δ−1

is processed

through a single sigmoid unit:

ˆy = h

= σ(w

δ−1

+ b

) (9)

ICAART 2019 - 11th International Conference on Agents and Artiﬁcial Intelligence

268

where w

is a s

-dimensional row vector of weights

and b is a bias term. The output h

of the ﬁnal layer

is an estimate of f (x

(k)

3.6 Model Estimation

The parameters of the LR model and MLP network

can be estimated from a historical dataset of features

vectors with known stress classes. Let Θ be the set

of parameters to be optimized. Moreover, let D =

, d

, . . . } be a dataset of tuples where each tuple:

= (x

(k)

, y

(k)

) (10)

consists of a feature vector x

(k)

and corresponding

stress class y

(k)

. We ﬁnd optimal values for the pa-

rameters by minimizing the mean cross entropy. The

cross entropy of a single feature vector is:

J (x

(k)

, y

(k)

) = −y

(k)

log ˆy

(k)

(11)

− (1 − y

(k)

)log(1 − ˆy

(k)

) (12)

The cross entropy averaged over all feature vectors in

D is:

J (D) =

|D|

∑

d∈D

J (x

(k)

, y

(k)

) (13)

The optimal values of the parameters are found by

solving the following optimization problem:

∗

= arg min

J (D) (14)

Usually, such a problem is solved by gradient-based

optimization in conjunction with back-propagation

(Werbos, 1982; Bottou, 2004).

However, an issue in our application is that the

stress classes are highly imbalanced which makes it

difﬁcult to solve equation 14 by gradient-based opti-

mization. One way to deal with this issue is to opti-

mize a weighted version of cross entropy which also

takes into account the probability of a feature vector

belonging to a stressed bank. The weighted cross en-

tropy of a single feature vector is:

(k)

, y

(k)

) = −ay

(k)

log ˆy

(k)

(15)

− (1 − a)(1 − y

(k)

)log(1 − ˆy

(k)

) (16)

where a ∈ (0, 1) is the importance that we assign to

predicting the stress class correct. We set a equal to

the probability that a feature vector does not belong

to a stressed bank. In this way, when there are fewer

stress examples in the dataset, there is a higher incen-

tive for the classiﬁers to assign a high probability to

the stress examples.

4 EXPERIMENTAL SETUP

In this section, we elaborate on a series of experi-

ments that were conducted to determine how well the

LR model and MLP network detect liquidity stress in

real-world data. We discuss the characteristics of the

data (section 4.1 until 4.4), the implementation of the

classiﬁers (section 4.5), and the performance evalua-

tion procedure (section 4.6).

4.1 Data Sources and Features

We created a dataset of feature vectors which de-

scribes the payment behavior of all European banks

on a daily basis over the last ten years. It is compiled

from data generated by three important FMIs of the

Eurosystem: its large-value payment system called

TARGET2

, collateral management system, and min-

imum reserve system.

The majority of features were derived from TAR-

GET2. Based on the transaction log of this payment

system, we calculated for each bank its:

• Daily net value of payments (i.e. total inﬂow mi-

nus total outﬂow)

• Daily net number of transactions (i.e. total num-

ber of incoming payments minus total number of

outgoing payments)

• Daily net payment time within the day weighted

by value (i.e. the weighted payment time of in-

coming payments minus the weighted payment

time of outgoing payments)

• Daily net payment time within the day weighted

by the number of transactions (i.e. the weighted

payment time of incoming payments minus the

weighted payment time of outgoing payments)

These payment features were calculated for each pay-

ment type separately. Each payment settled in TAR-

GET2 has an associated type that describes the nature

of the payment, e.g. a customer payment, inter-bank

payment, or administrative payment. Furthermore,

we calculated for each bank its:

• Daily end-of-day account balance

• Daily minimum account balance (i.e. the lowest

value within the day)

We also derived features that describe the activities

of the banks on the interbank money market. To de-

rive these features, we applied the Furﬁne algorithm

(Furﬁne, 1999) on the transaction log of TARGET2

to classify each transaction as a regular transaction or

More information about TARGET2 can be found in

(ECB, 2018).

Liquidity Stress Detection in the European Banking Sector

269

money market transaction

. Accordingly, based on

the subset of money market transactions, we calcu-

lated for each bank its:

• Daily number of money market counterparties

• Daily HHI-index (Hirschman, 1945) of money

market counterparties weighted by the value of

the money market loans

• Daily spread of the weighted borrowing rate to

EONIA (i.e. the difference between the money

market rate of the bank and the EONIA)

These money-market features were calculated for the

case in which a bank is the lender as well as the bor-

rower in a money-market transaction.

Besides the payment features and money-market

features, we also derived features from the European

collateral management system. This system records

the amount of collateral European banks have de-

posited at the Eurosystem and how much of this col-

lateral is available for banks to make payments during

the day. From collateral data of this system, we cal-

culated for each bank its:

• Daily average haircut on all collateral

• Daily value of collateral before the haircut

• Daily value of collateral after the haircut

Banks cannot use collateral within the Eurosystem at

the market value. Instead, the value of collateral is de-

creased by a certain percentage to account for poten-

tial credit risk that the European Central Bank could

face should a bank default and its collateral needs to

be sold. This percentage is called the haircut.

Finally, we derived data about the minimum re-

serve requirement of each bank from the Europen

minimum reserve system. The minimum reserve re-

quirement is the average amount of liquidity that a

bank must keep on its settlement accounts during the

maintenance period. Based on this data and the trans-

action data of TARGET2, we calculated for each bank

the relative difference between its end-of-day balance

and its minimum reserve requirement.

4.2 Data Normalization

However, a problem with the features is that they each

are on a different scale. In addition, the range of each

feature depends on the type of bank (e.g. small or

large) for which it is calculated. Hence, we cannot

easily compare the feature vectors of one bank with

the feature vectors of another bank.

See (Arciero et al., 2016) for more information on how

the Furﬁne algorithm can be applied to identify money mar-

ket transactions in the transaction log of TARGET2.

To address this issue, we re-scaled the feature vec-

tors of each bank separately by z-normalization. The

normalized value of a feature was calculated by:

˜x

(k)

i j

(k)

i j

− ¯x

(k)

(17)

where, ¯x

(k)

and s

(k)

are respectively the sample mean

and standard deviation of the j-th feature in a feature

vector estimated for bank b

. Normalized feature ˜x

(k)

i j

represents the number of standard deviations x

(k)

i j

de-

viates from the mean s

(k)

of bank b

. Notice that, af-

ter performing this normalization, the features of each

bank have zero mean and unit variance.

4.3 Stress Classes

Obtaining data about the periods in which banks faced

liquidity stress is difﬁcult. A good indicator of liquid-

ity stress is when a bank requests emergency liquidity

assistance. When a bank faces liquidity stress, it can

turn to the central bank that acts as a lender of last re-

sort and request liquidity in exchange for collateral of

lesser quality. Although data about the use of emer-

gency liquidity by European banks is recorded, we

were not able to obtain it because such data is highly

conﬁdential and could not be made available for re-

search purposes.

Instead, we obtained the stress classes by perform-

ing an online news analysis. We asked payment ex-

perts in the Eurosystem which banks suffered from

severe liquidity stress recently. The experts provided

us with a short-list of seven banks

. For each bank on

this list, we searched for evidence of liquidity stress

on Wikipedia and in online news articles of several

national and international ﬁnancial newspapers (e.g.

the Financial Times). All noteworthy events that we

found about the banks that could indicate possible liq-

uidity stress were organized in a detailed timeline.

Based on the timeline, we assigned the feature

vector of each bank at each day in the analysis period

to one of the following stress codes:

1. No stress - if we could not ﬁnd any evidence of

liquidity stress at the bank at the given day

2. Possibly stress - if we could ﬁnd some evidence

of liquidity stress at the bank at the given day but

which was not that severe (e.g. shares of the bank

dropped or a staff member had been ﬁred)

Because of conﬁdentiality reasons, we cannot disclose

the names of these banks. Instead, we will refer to the banks

by letter, i.e. bank A, bank B, and so on.

ICAART 2019 - 11th International Conference on Agents and Artiﬁcial Intelligence

270

Bank A Bank B

. . .

Bank F

Bank G

Experiment 1

Test

Train / Fold 1

. . .

Train / Fold 5 Train / Fold 6

Experiment 2

Train / Fold 1

Test

. . .

Train / Fold 5 Train / Fold 6

Experiment 6

Train / Fold 1 Train / Fold 2

. . .

Test

Train / Fold 6

Experiment 7

Train / Fold 1 Train / Fold 2

. . .

Train / Fold 6

Test

Figure 1: The partitioning of the dataset during the experiments. In each experiment, the dataset was partitioned in a training

set and test set. Accordingly, the training set was further partitioned in six cross-validation folds. The test set and the cross-

validation folds each contained only the feature vectors of a particular bank. The experiments were repeated such that the

feature vectors of each bank are used exactly once for testing.

3. Stress - if we could ﬁnd clear evidence of liquidity

stress at the bank at the given day (e.g. the start of

a bank run or rumors of a takeover)

4. Bankrupt - if the bank is no longer operating and

only participates in the ﬁnancial system for ad-

ministrative reasons

In most cases, the stress codes of the banks started at

1 (no stress) and incrementally increased over time to

stress code 4 (bankrupt).

Finally, the feature vectors were labeled based on

the stress codes. Feature vectors with stress code 1

(no stress) were assigned the no stress class and fea-

ture vectors with stress code 3 (stress) were assigned

the stress class. No stress class was assigned to the

feature vectors with stress code 2 (possibly stress) and

stress code 4 (bankrupt). These unlabeled feature vec-

tors were not used for model training but only for out-

of-sample prediction.

4.4 Data Partitioning

We performed a series of experiments to determine

how well liquidity stress at the banks can be detected

out-of-sample. During each experiment, the dataset

was partitioned into a separate training set and test

set. A classiﬁer was trained on the training set which

contained the feature vectors of all banks except for

one bank. The feature vectors of this holdout bank

were put in the test set and used to evaluate how well

the classiﬁer performs. We repeated this experiment

seven times such that the feature vectors of each bank

were used exactly once for testing. Accordingly, we

measured the average performance of the classiﬁer

over the experiments. Figure 1 provides a schematic

overview of the experiments.

4.5 Model Implementation

We trained the LR model described in section 3.4 and

two variations of the MLP network described in sec-

tion 3.5 on each training set. The MLP networks have

one hidden layer with either a hyperbolic tangent ac-

tivation or rectiﬁer linear activation. We will refer to

these networks as MLP1 and MLP2 respectively. All

classiﬁers were optimized by stochastic gradient de-

scent with momentum and mini-batches. This proce-

dure was performed for a ﬁxed number of epochs with

a constant learning rate. To avoid over-ﬁtting, we ap-

plied L2 weight decay (Krogh and Hertz, 1992) on all

weights of the MLP networks.

The MLP networks have some hyper-parameters

that needed to be tuned. These parameters include the

number of units in the hidden layer and the amount

of weight decay. We optimized these parameters by

a variation of k-fold cross-validation using R package

caret (Kuhn, 2008). During the validation procedure,

the training set was partitioned in six folds that each

contained the feature vectors of a particular bank. An

MLP network was trained on ﬁve of the folds having

a particular conﬁguration of hidden units and weight

decay. Accordingly, the loss function of the network

was evaluated on the holdout fold. This process was

repeated until each conﬁguration of parameters was

evaluated once on each fold. Finally, we choose the

conﬁguration for which the network achieved the low-

est loss averaged over all holdout folds.

4.6 Evaluation Metrics

The performance of each classiﬁer was evaluated by

calculating its precision, recall and F

-score on each

Liquidity Stress Detection in the European Banking Sector

271

Table 1: The precision, recall, and F

-score of the classiﬁers in case they are trained based on regular cross entropy. In all

experiments, a threshold of ζ = 0.9 was used to generate alarms for liquidity stress.

Precision Recall F

ζ = 0.9 LR MLP1 MLP2 LR MLP1 MLP2 LR MLP1 MLP2

Bank A 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Bank B 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Bank C 1.00 0.00 0.00 0.25 0.00 0.00 0.40 0.00 0.00

Bank D 1.00 1.00 1.00 0.20 0.20 0.20 0.33 0.33 0.33

Bank E 0.80 0.00 1.00 0.09 0.00 0.09 0.16 0.00 0.16

Bank F 1.00 1.00 1.00 0.40 0.10 0.30 0.57 0.18 0.46

Bank G 0.80 0.83 1.00 0.40 0.50 0.40 0.53 0.63 0.57

Average 0.66 0.40 0.57 0.19 0.11 0.14 0.28 0.16 0.22

Table 2: The precision, recall, and F

-score of the classiﬁers in case they are trained based on weighted cross entropy. In all

experiments, a threshold of ζ = 0.9 was used to generate alarms for liquidity stress.

Precision Recall F

ζ = 0.9 LR MLP1 MLP2 LR MLP1 MLP2 LR MLP1 MLP2

Bank A 0.02 0.00 0.00 0.17 0.00 0.00 0.04 0.00 0.00

Bank B 0.67 1.00 1.00 0.40 0.40 0.20 0.50 0.57 0.33

Bank C 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Bank D 0.33 0.50 1.00 0.20 0.20 0.20 0.25 0.29 0.33

Bank E 1.00 1.00 1.00 0.96 0.48 0.89 0.98 0.65 0.94

Bank F 0.40 0.17 1.00 0.40 0.50 0.40 0.40 0.26 0.57

Bank G 0.86 1.00 0.88 0.60 0.60 0.70 0.71 0.75 0.78

Average 0.47 0.52 0.70 0.39 0.31 0.34 0.41 0.36 0.42

test set. For a given threshold ζ, precision is the prob-

ability that a feature vector belongs to a stressed bank

given that a classiﬁer predicted liquidity stress:

Precision(ζ) = P(y

(k)

= 1| ˆy

(k)

≥ ζ) (18)

In contrast, recall is the probability that a classiﬁer

predicts liquidity stress given that a feature vector be-

longs to a stressed bank:

Recall(ζ) = P( ˆy

(k)

≥ ζ|y

(k)

= 1) (19)

The F

-score is the harmonic mean of precision and

recall:

(ζ) = 2 ·

Precision(ζ) · Recall(ζ)

Precision(ζ) + Recall(ζ)

(20)

It constitutes an overall measure to compare the per-

formance of a set of competing classiﬁers. We deter-

mined the classiﬁer that achieved the highest F

-score

averaged over all test sets.

5 RESULTS

Table 1 shows the performance evaluation of the clas-

siﬁers in case they are trained by regular cross en-

tropy. The results for weighted cross entropy are

shown in Table 2. A threshold of ζ = 0.9 was used in

all experiments. This means that the classiﬁers raised

an alarm when they were at least 90% sure that a bank

was facing liquidity stress.

A few things stand out in these tables. We see

that the classiﬁers detect liquidity stress much better

when they are trained by weighted cross entropy. The

average F

-score of the models increases when they

are trained by weighted cross entropy instead of cross

entropy. This increase can be attributed to the fact that

weighted cross entropy incentives the models more to

assign a high probability to the stress cases.

Moreover, we see that MLP2 detects liquidity

stress overall the best. The network achieves an av-

erage F

-score of 0.42 with an average precision of

70% and average recall of 34%. These measures im-

ply that more than two-thirds of all alarms generated

by the network were correct and the network detected

ICAART 2019 - 11th International Conference on Agents and Artiﬁcial Intelligence

272

100

Time

Stress Probability (%)

Stress Codes

1 − No Stress 2 − Possibly Stress 3 − Stress

Figure 2: The out-of-sample predictions of MLP2 for bank E. The background colors highlight the stress codes that were

assigned to the feature vectors of the bank by our news analysis. The classiﬁer picks up an early-warning sign of liquidity

stress in the ’Possibly Stress’ period, quite some time before newspapers reported the stress in the ’Stress’ period. Note that,

because of conﬁdentiality reasons, we were not allowed to include the original ﬁgure in this paper. This ﬁgure is artiﬁcially

generated and closely resembles the original ﬁgure.

more than one-third of all stress cases. The LR model

achieved a similar F

-score but generated alarms for

liquidity stress with much lower precision.

The break-down of the performance metrics by

bank in Table 1 and 2 also shows that the classiﬁers

do not perform well on every bank. In particular, they

have difﬁculties detecting liquidity stress at bank A

and C. We suspect that the classiﬁers do not perform

well on these banks because the stress classes of these

banks are of poor quality. Our news analysis provides

only a rough approximation of the level of stress that

banks experience. News items can be incorrect, im-

precise, or incomplete. Also, many stress events do

not become known to the general public or cannot

be observed in payment data. All these factors could

have caused the wrong stress class being assigned to

the feature vectors of these banks.

Another factor that negatively impacts the perfor-

mance of the classiﬁers is the relatively small train-

ing sets on which they were trained. There are many

forms of liquidity stress that a bank can face. For ex-

ample, a bank can face liquidity problems for only a

few days which results in a bank-run or face long-

term solvency issues that eventually lead to a state

takeover. It is unlikely that the classiﬁers were able

to learn to recognize all these different forms of stress

from data of only seven stressed banks.

We also determined whether the classiﬁers detect

liquidity stress before the stress became known to the

general public. This was done, similarly as in the

performance evaluation, by classifying out-of-sample

the feature vectors of a holdout bank. However, this

time, we also classiﬁed the features vectors that are

assigned stress code 2 (possibly stress) by our news

analysis. If the classiﬁers assign these feature vectors

to the stress class, then they likely pick up an early-

warning sign of liquidity stress.

Figure 2 depicts the out-of-sample predictions of a

bank. It shows that the classiﬁer detects an early sign

of liquidity stress in the ’possibly stress’ period, quite

some time before the stress became publicly known in

the ’stress’ period. We checked whether we could ﬁnd

the same stress sign by the method of (Heijmans and

Heuver, 2014), i.e. by studying simple plots of the

features one-at-a-time, but were unable to spot any

irregularities. Hence, the classiﬁer must have found

that a combination of features that is characteristic for

a bank that is facing liquidity stress.

6 CONCLUSIONS

We conclude that liquidity stress at banks can be rea-

sonably well detected by supervised machine learn-

ing. Our best classiﬁer generated alarms for liquidity

stress with a precision of 70% and a recall of 34%. In

some cases, the classiﬁer identiﬁed signs of liquidity

stress well before the stress was reported by ﬁnan-

cial newspapers. Most of these signs remained un-

detectable when studying simple plots of the features

one-at-a-time. Although our method needs some fur-

ther improvements to be used in practice, we believe

Liquidity Stress Detection in the European Banking Sector

273

that it is a promising new tool for central banks to

monitor the ﬁnancial activities of banks.

There are several ways in which our method can

be improved. Our classiﬁers were trained on data of

only seven banks. It is to be expected that the classi-

ﬁers perform much better when they are trained on a

larger dataset containing a more diverse set of stress

events. Moreover, the generation of the stress classes

by our news analysis still involved a lot of manual

work. Future research could investigate whether this

step can be automated by applying techniques of nat-

ural language processing or sentiment analysis.

REFERENCES

Altman, E. I. (1968). Financial ratios, discriminant anal-

ysis and the prediction of corporate bankruptcy. The

Journal of Finance, 23(4):589–609.

Arciero, L., Heijmans, R., Heuver, R., Massarenti, M., Pi-

cillo, C., and Vacirca, F. (2016). How to measure

the unsecured money market: the Eurosystem’s im-

plementation and validation using TARGET2 data. In-

ternational Journal of Central Banking.

Beaver, W. H. (1966). Financial ratios as predictors of fail-

ure. Journal of Accounting Research, pages 71–111.

BIS (2008). Principles for sound liquidity risk management

and supervision. Bank for International Settlements.

Bottou, L. (2004). Stochastic learning. In Advanced Lec-

tures on Machine Learning, pages 146–168.

Chamon, M., Manasse, P., and Prati, A. (2007). Can we pre-

dict the next capital account crisis? IMF Staff Papers,

54(2):270–305.

Davis, E. P. and Karim, D. (2008). Comparing early warn-

ing systems for banking crises. Journal of Financial

stability, 4(2):89–120.

Demirg-Kunt, A. and Detragiache, E. (1998). The deter-

minants of banking crises in developing and devel-

oped countries. Staff Papers (International Monetary

Fund), 45(1):81–109.

ECB (2018). TARGET2 user information guide v12.0. ECB

Website.

Fioramanti, M. (2008). Predicting sovereign debt crises

using artiﬁcial neural networks: a comparative ap-

proach. Journal of Financial Stability, 4(2):149–164.

Furﬁne, C. H. (1999). The microstructure of the federal

funds market. Financial Markets, Institutions & In-

struments, 8(5):24–44.

Glorot, X., Bordes, A., and Bengio, Y. (2011). Deep sparse

rectiﬁer neural networks. In Proceedings of the Four-

teenth International Conference on Artiﬁcial Intelli-

gence and Statistics, pages 315–323.

Heijmans, R. and Heuver, R. (2014). Is this bank ill? the

diagnosis of doctor TARGET2. The Journal of Finan-

cial Market Infrastructures, 2 nr 3:3–36.

Hirschman, A. O. (1945). National power and the structure

of foreign trade. University of California Press.

Kaminsky, G. L. and Reinhart, C. M. (1999). The

twin crises: the causes of banking and balance-of-

payments problems. The American Economic Review,

89(3):473–500.

Krogh, A. and Hertz, J. A. (1992). A simple weight decay

can improve generalization. In Advances in Neural

Information Processing Systems 4, pages 950–957.

Kuhn, M. (2008). Building predictive models in R using the

caret package. Journal of Statistical Software, 28.

Kumar, P. R. and Ravi, V. (2007). Bankruptcy predic-

tion in banks and ﬁrms via statistical and intelligent

techniques–a review. European Journal of Opera-

tional Research, 180(1):1–28.

LeCun, Y., Bottou, L., Orr, G. B., and M

uller, K. R. (1998).

Efﬁcient BackProp, pages 9–50. Springer Berlin Hei-

delberg.

Min, J. H. and Lee, Y.-C. (2005). Bankruptcy prediction

using support vector machine with optimal choice of

kernel function parameters. Expert Systems with Ap-

plications, 28(4):603–614.

Niculescu-Mizil, A. and Caruana, R. (2005). Predicting

good probabilities with supervised learning. In Pro-

ceedings of the 22nd International Conference on Ma-

chine Learning, pages 625–632.

Tam, K. Y. (1991). Neural network models and the predic-

tion of bank bankruptcy. Omega, 19(5):429–445.

Triepels, R., Daniels, H., and Heijmans, R. (2018). De-

tection and explanation of anomalous payment behav-

ior in real-time gross settlement systems. In Lec-

ture Notes in Business Information Processing, vol-

ume 321, pages 145–161.

Werbos, P. J. (1982). Applications of advances in nonlin-

ear sensitivity analysis. In System Modeling and Op-

timization, pages 762–770.

ICAART 2019 - 11th International Conference on Agents and Artiﬁcial Intelligence

274