to some model of normality (such as a Gaussian Mix-
ture Model, or GMM), which is summarised in Sec-
tion 1.4. This process is automatic, and requires only
the selection of a probabilistic novelty threshold (e.g.,
P(x) ≤ 0.99) in order to achieve accurate identifica-
tion of patient deterioration.
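As a concrete illustration of GMM-based novelty scoring, the sketch below evaluates a univariate two-component GMM density and flags observations as novel when the density falls below a threshold. All parameters (component weights, means, standard deviations, and the density threshold) are illustrative assumptions, and for simplicity the sketch thresholds the model density directly rather than the cumulative probability P(x) used above.

```python
# Minimal sketch of GMM-based novelty scoring. All parameters are
# illustrative; the paper thresholds a cumulative probability P(x),
# whereas here the model density is thresholded directly for simplicity.
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# A two-component GMM, hypothetically fitted to "normal" heart-rate data.
weights = [0.7, 0.3]
means = [70.0, 85.0]   # beats per minute
stds = [5.0, 8.0]

def gmm_pdf(x):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * gaussian_pdf(x, m, s)
               for w, m, s in zip(weights, means, stds))

DENSITY_THRESHOLD = 1e-4  # illustrative novelty threshold

def is_novel(x):
    """Flag x as novel when its density under the model is very low."""
    return gmm_pdf(x) < DENSITY_THRESHOLD

print(is_novel(72.0))   # False: typical heart rate, well-modelled
print(is_novel(150.0))  # True: far from both mixture components
```

In practice the mixture parameters would be fitted to training data (e.g. by expectation-maximisation) rather than set by hand as here.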
1.3 Contributions in this Paper
Our previously proposed work has a number of limitations:
1. The system described in (Clifton et al., 2009b)
uses EVT for determining when multivariate test
data are “extreme” with respect to a model of nor-
mality. In this case, a fully multimodal model
is allowed, such as a GMM comprised of many
Gaussian kernels. However, it is a numerical algorithm requiring extensive sampling, making it unsuitable for on-line learning of models that are frequently updated.
2. The system described in (Hugueny et al., 2009)
provides a closed-form solution to the problems
posed in (1) such that sampling is avoided, but is
valid only for unimodal multivariate models con-
sisting of a single Gaussian kernel. In practice,
such single-kernel models are too simple to de-
scribe the distribution of training data accurately.
Thus, there is a need for an EVT algorithm that
allows multimodal, multivariate models of normality
to be constructed, overcoming the unimodal limita-
tion of (2), while being computationally light-weight,
overcoming the heavy sampling-based limitation of
(1). This paper proposes such a method, described
in Section 2, illustrates its use with synthetic data in
Section 3, and presents results from a large patient
monitoring investigation in Section 4.
1.4 Classical Extreme Value Theory
If we have a probability distribution F(x) describing some univariate data, classical EVT
(Embrechts et al., 1997) provides a distribution de-
scribing where the most “extreme” of m points drawn
from that distribution will lie. For example, if we
draw m samples from a univariate Gaussian distribu-
tion, EVT provides a distribution that describes where
the largest of those m samples will lie. It also pro-
vides a distribution that describes where the smallest
of those m samples will lie. These distributions de-
termined by EVT are termed the Extreme Value Dis-
tributions (EVDs). The EVDs tell us where the most
“extreme” data generated from our original distribution will lie under “normal” conditions after observing m data. Thus, if we observe data which are more extreme than we would expect (as determined by the EVDs), we can classify these data as “abnormal” and generate an alarm. This process lies at the heart
of using EVT for patient monitoring, where we can
classify observed vital signs as “extreme” if EVT de-
termines that they lie further than one would expect
under “normal” conditions (given by the EVDs).
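The alarm-generation logic described above can be sketched empirically: estimate where the maximum of m “normal” observations typically lies, place a threshold on that distribution of maxima, and flag any window whose maximum exceeds it. The window size, trial count, and 0.99 alarm level below are illustrative assumptions, and the EVD of the maximum is estimated here by simple Monte Carlo rather than by any closed form.

```python
# Sketch (not the paper's algorithm): estimate the distribution of the
# maximum of m "normal" N(0, 1) samples by Monte Carlo, then use it to
# flag abnormal windows. Window size and alarm level are illustrative.
import random

random.seed(0)
m = 100          # observations per monitoring window
trials = 20000   # Monte Carlo trials estimating the EVD of the maximum

# Where does the largest of m standard-normal samples typically lie?
maxima = sorted(max(random.gauss(0.0, 1.0) for _ in range(m))
                for _ in range(trials))

# Alarm threshold: the 99th percentile of the maximum's distribution
# (roughly 3.7 for m = 100, well above a "typical" extreme sample).
threshold = maxima[int(0.99 * trials)]

def is_abnormal(window_max: float) -> bool:
    """Flag a window whose maximum exceeds what "normal" EVD mass allows."""
    return window_max > threshold

print(is_abnormal(5.0))  # True: far beyond where maxima are expected
print(is_abnormal(2.0))  # False: unremarkable for the largest of 100
```

Note that a value of 2.0 would be a mild outlier for a single observation, yet is entirely expected as the largest of 100 observations; this is exactly the distinction the EVDs capture.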
Though classical EVT is defined only for univari-
ate data, we present a generalisation of EVT to mul-
tivariate, multimodal models as described later in this
paper.
To state this introduction more formally, consider {x_m}, a set of m independent and identically distributed random variables (iid rvs), which are univariate, and where each x_i ∈ R is drawn from some underlying distribution F(x). We define the maximum of this set of m samples to be M_m = max(x_1, x_2, ..., x_m). EVT tells us the distribution of where to expect this maximum, M_m, and, by a symmetrical argument, the distribution of the minimum in our dataset. The fundamental theorem of EVT, the Fisher-Tippett theorem (Fisher and Tippett, 1928), shows that the distribution of the maximum, M_m, depends on the form of the distribution F(x), and that this distribution of M_m can take only one of three well-known asymptotic forms in the limit m → ∞: the Gumbel, Fréchet, or Weibull distributions.
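The reason the maximum has a tractable distribution at all is the identity P(M_m ≤ x) = F(x)^m for m iid samples: the maximum is below x exactly when every sample is. The sketch below verifies this exact result against simulation for the uniform distribution (the choice of m, x, and trial count is illustrative):

```python
# The distribution function of the maximum of m iid samples is F(x)^m:
# the maximum lies below x iff all m samples do. Quick check against
# simulation for F = Uniform(0, 1), where F(x)^m = x**m exactly.
import random

random.seed(1)
m, trials = 10, 50000
x = 0.8

exact = x ** m  # P(M_m <= x) for Uniform(0, 1)

empirical = sum(max(random.random() for _ in range(m)) <= x
                for _ in range(trials)) / trials

print(round(exact, 3))                # 0.107
print(abs(exact - empirical) < 0.01)  # True: simulation agrees
```

The Fisher-Tippett theorem then characterises the limiting shape of F(x)^m, suitably normalised, as m → ∞.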
The Fisher-Tippett theorem also holds for the distribution of minima, as minima of {x_m} are maxima of {−x_m}. EVDs of minima are therefore the same as EVDs of maxima, with a reversed x-axis. The Gumbel, Fréchet, and Weibull distributions are all special cases of the Generalised Extreme Value (GEV) distribution,

H^+_GEV(x; γ) = exp(−[1 + γx]^(−1/γ)),    (1)

where γ is a shape parameter. The cases γ → 0, γ > 0, and γ < 0 give the Gumbel, Fréchet, and Weibull distributions, respectively. In the above, the superscript ‘+’ indicates that this is the EVD describing the maximum of the m samples generated from F(x).
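A direct evaluation of the GEV distribution function in Eq. (1) makes the three cases concrete; the sketch below is ours (function and parameter names are illustrative), and checks that small γ reproduces the Gumbel limit exp(−exp(−x)).

```python
# Sketch of the GEV distribution function of Eq. (1),
#   H(x; gamma) = exp(-[1 + gamma*x]**(-1/gamma)),
# with gamma -> 0 giving the Gumbel limit exp(-exp(-x)).
import math

def gev_cdf(x: float, gamma: float) -> float:
    """GEV distribution function for maxima; support needs 1 + gamma*x > 0."""
    if gamma == 0.0:                       # Gumbel limit
        return math.exp(-math.exp(-x))
    t = 1.0 + gamma * x
    if t <= 0.0:
        # Outside the support: below it for Frechet (gamma > 0, CDF = 0),
        # above it for Weibull (gamma < 0, CDF = 1).
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))

# A very small shape parameter approaches the Gumbel case.
print(abs(gev_cdf(1.0, 1e-8) - gev_cdf(1.0, 0.0)) < 1e-6)  # True
```

The bounded support for γ ≠ 0 reflects the Fréchet distribution's finite lower endpoint and the Weibull distribution's finite upper endpoint.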
1.5 Redefining “Extrema”
Classical univariate EVT (uEVT), as described above,
cannot be directly applied to the estimation of multi-
variate EVDs. In the case of patient monitoring, for example, our data will be multivariate, where each dimension of the data corresponds to a different channel of measurement (heart-rate, respiration-rate, SpO2, etc.). In this multivariate case, we no longer wish to
answer the question “how is the sample of greatest
magnitude distributed?”, but rather “how is the most
improbable sample distributed?” This will allow us,
as will be shown in Section 2, to generalise uEVT to
BIOSIGNALS 2010 - International Conference on Bio-inspired Systems and Signal Processing