dimensional, comprising heart rate (HR), breathing
rate (BR), peripheral arterial oxygen saturation
(SpO$_2$), and the arithmetic mean of systolic and
diastolic blood pressures (the systolic-diastolic aver-
age, or SDA), was acquired from 332 patients in a
step-down unit, and contains over 18,000 hours of
patient data (Tarassenko, 2005). It is impractical for
clinical experts to annotate such a large data-set in
its entirety.
The approach taken with the data-set was to de-
termine retrospectively which periods of patient data
exceeded standard “medical emergency team”
(MET) criteria (Smith, 2008). The latter are stan-
dard thresholds on each vital sign that, if exceeded,
should result in the clinical review of the patient.
Periods of patient data that exceeded the MET crite-
ria for at least four minutes were shown to a panel of
clinicians, who then determined which periods were
due to artefact (such as a sensor becoming detached
from the patient), and which were sufficiently abnormal
to require patient review. We assign the latter periods
of “abnormal” patient condition the class label $C_2$,
and assign examples of “normal” patient condition the
class label $C_1$.
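The retrospective screening step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the heart-rate limits, the sampling period, and the function name `exceedance_periods` are assumptions introduced here, and the limits are not the MET criteria of (Smith, 2008). Only the four-minute minimum duration comes from the text.

```python
from typing import List, Tuple

# Hypothetical limits for illustration only; NOT the published MET criteria.
HR_LIMITS = (40.0, 140.0)  # beats per minute (assumed)


def exceedance_periods(values: List[float],
                       limits: Tuple[float, float],
                       sample_period_s: float,
                       min_duration_s: float = 240.0) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs (end exclusive) of runs in which every
    sample lies outside `limits` and the run lasts at least `min_duration_s`."""
    low, high = limits
    periods = []
    start = None
    for i, v in enumerate(values):
        outside = v < low or v > high
        if outside and start is None:
            start = i  # a candidate run begins
        elif not outside and start is not None:
            if (i - start) * sample_period_s >= min_duration_s:
                periods.append((start, i))
            start = None
    # Handle a run that is still open at the end of the record.
    if start is not None and (len(values) - start) * sample_period_s >= min_duration_s:
        periods.append((start, len(values)))
    return periods


# One HR sample per minute (assumed): a 4-minute run, then a 2-minute run.
hr = [80, 150, 150, 150, 150, 80, 150, 150, 80]
periods = exceedance_periods(hr, HR_LIMITS, sample_period_s=60.0)
print(periods)
```

Only the periods returned by such a screen would then be passed to the clinical panel for labelling as artefact or as requiring review.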
As vital signs can take more extreme values
during periods of abnormal physiology, the distributions
of data from periods labelled $C_2$ have heavier
tails than the distributions of data from “normal”
patients. However, there is significant overlap be-
tween the distributions of data from the two classes.
Additionally, some types of physiological abnormality
are represented in the data-set less frequently than
others; e.g., apnoea and bradycardia (low BR and HR,
respectively) are under-represented in comparison with
tachypnoea and tachycardia (high BR and HR,
respectively). We found that, because of this imbalance,
a linear classifier trained using such data correctly
classifies the majority of the well-represented
tachypnoea and tachycardia data, but misclassifies the
under-represented apnoea and bradycardia data.
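The effect of this imbalance on a linear classifier can be illustrated with a one-dimensional toy example, in which a linear decision rule reduces to a single threshold on HR. This is a sketch under stated assumptions: the synthetic heart rates, class sizes, and the exhaustive fitting routine `fit_linear_1d` are invented for illustration and are not taken from the data-set or the classifier used in the paper.

```python
import random

random.seed(0)

# Synthetic heart rates in beats per minute; all values are illustrative assumptions.
normal = [random.gauss(80, 8) for _ in range(500)]   # class C1
tachy = [random.gauss(140, 8) for _ in range(200)]   # well-represented C2
brady = [random.gauss(35, 4) for _ in range(10)]     # under-represented C2

data = [(x, 0) for x in normal] + [(x, 1) for x in tachy + brady]


def predict(x, w, b):
    """Linear decision rule: label 1 ("abnormal") when w * x + b > 0."""
    return 1 if w * x + b > 0 else 0


def fit_linear_1d(data):
    """Exhaustively pick the sign w and offset b that minimise training errors."""
    best = None
    thresholds = sorted(x for x, _ in data)
    for w in (1.0, -1.0):
        for t in thresholds:
            b = -w * t
            errors = sum(predict(x, w, b) != y for x, y in data)
            if best is None or errors < best[0]:
                best = (errors, w, b)
    return best[1], best[2]


w, b = fit_linear_1d(data)
tachy_recall = sum(predict(x, w, b) for x in tachy) / len(tachy)
brady_recall = sum(predict(x, w, b) for x in brady) / len(brady)
print("tachycardia recall:", tachy_recall)
print("bradycardia recall:", brady_recall)
```

Because a single threshold can separate only one tail from the "normal" data, minimising the total error favours the larger tachycardia cluster and sacrifices the bradycardia cluster.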
Similarly, a non-linear classifier trained using the
original labels incorrectly includes the distribution
of bradycardia and apnoea data within its decision
boundary, and hence misclassifies test data from this
region of data space as belonging to class $C_1$.
The remainder of this paper investigates methods
for refining class labels $C_1$ and $C_2$, such that they
may be used to construct a multi-class classifier that
successfully classifies under-represented types of
“abnormal” data. We will illustrate the procedure
using bivariate analysis, such that the decision
boundary of a classifier may be examined. The
application to the full multivariate data-set (e.g., 4-
dimensional in this example) is considered in
Section 4.
3 REFINING CLINICAL LABELS TO OPTIMISE THE CLASSIFIER
The left-hand plot of Figure 1 shows all of the data
from the two classes in the bivariate space of HR
and BR. Clusters of $C_2$ data corresponding to apnoea,
tachypnoea, bradycardia, and tachycardia may
be seen in the figure, although data from class $C_1$
often overlap with those clusters. Intuitively, we
wish to increase the separation between the two
classes such that a classifier trained using those
labels results in a decision boundary that correctly
classifies data from all modes of class $C_2$. We propose
a method for doing so using an estimate of the
probability density function (pdf) of the entire data-
set.
3.1 Defining a Multivariate Distribution to Estimate Labels
We approximated the pdf of the whole data-set using
a Parzen windows estimator (Bishop, 2006), after
reducing the size of the data-set to 400 prototype
patterns using k-means clustering with k = 400 cluster
centres. The covariance $\sigma^2$ of the 400 kernels in
the pdf was set using the heuristic proposed in
(Bishop, 2006). Given some data-point $x'$, its density
$\kappa_{x'} = p(x')$ defines a contour on the pdf. We then
define a probability $P[\kappa_{x'}]$ as follows:
$$P[\kappa_{x'}] = \int_{\{x \,:\, \kappa_{x'} \le p(x) \le \kappa_m\}} p(x)\, dx \qquad (1)$$
where $\kappa_m = \max[p]$, the density at the mode of the
pdf, $p$. Thus $P[\kappa_{x'}]$ is the probability mass
obtained by integrating the pdf from its highest point
down to the probability density contour $\kappa_{x'}$. This
represents the probability that some random data-point
$x$ distributed according to $p$ will take a density value
at least as high as $\kappa_{x'}$; i.e.,
$P[\kappa_{x'}] \equiv P[\,p(x) \ge \kappa_{x'}\,]$. Thus, as $x'$
varies throughout the data space, its probability
density will vary over the range $[0, \kappa_m]$, and
$P[\kappa_{x'}]$ will correspondingly vary over the range
$[0, 1]$, taking the value 0 at the mode and approaching
1 as the density falls towards 0.
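The quantity $P[\kappa_{x'}]$ can be approximated numerically. The sketch below assumes that the prototype centres have already been obtained (for example by k-means), that the kernels are isotropic Gaussians with a given variance `sigma2`, and that the integral is estimated by Monte Carlo sampling from the Parzen mixture. The sampling scheme, the function names, and the example values are assumptions made here; the source does not state how the integral was evaluated.

```python
import math
import random


def parzen_density(x, centres, sigma2):
    """Parzen window estimate with isotropic Gaussian kernels of variance sigma2."""
    d = len(x)
    norm = (2.0 * math.pi * sigma2) ** (d / 2.0)
    total = 0.0
    for c in centres:
        sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        total += math.exp(-sq / (2.0 * sigma2)) / norm
    return total / len(centres)


def sample_parzen(centres, sigma2, rng):
    """Draw one point from the mixture: pick a kernel uniformly, add Gaussian noise."""
    c = rng.choice(centres)
    s = math.sqrt(sigma2)
    return [ci + rng.gauss(0.0, s) for ci in c]


def prob_above_contour(x_query, centres, sigma2, n_samples=2000, seed=0):
    """Monte Carlo estimate of P[p(x) >= p(x_query)] for x drawn from p."""
    rng = random.Random(seed)
    kappa = parzen_density(x_query, centres, sigma2)
    hits = 0
    for _ in range(n_samples):
        x = sample_parzen(centres, sigma2, rng)
        if parzen_density(x, centres, sigma2) >= kappa:
            hits += 1
    return hits / n_samples


# Hypothetical prototype centres in a standardised (HR, BR) space.
centres = [[0.0, 0.0], [0.5, 0.2], [-0.4, 0.3]]
print(prob_above_contour([0.1, 0.1], centres, sigma2=0.25))  # close to the bulk
print(prob_above_contour([3.0, 3.0], centres, sigma2=0.25))  # far in the tail
```

For a single two-dimensional kernel with unit variance the result can be checked analytically, since the mass inside the contour through a point at radius r is 1 - exp(-r^2 / 2).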
We define a threshold $T$ on $P[\kappa_{x'}]$, and consider
which data have $P[\kappa_{x'}] \ge T$ for varying values of $T$.
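Selecting data by this threshold is a simple filter once $P[\kappa_{x'}]$ has been computed for each data-point. The following minimal sketch uses made-up probability values and a hypothetical helper `select_by_threshold`, purely to show how the selected set shrinks as $T$ increases.

```python
def select_by_threshold(p_values, threshold):
    """Return indices of data-points whose P[kappa_x'] is at least `threshold`."""
    return [i for i, p in enumerate(p_values) if p >= threshold]


# Hypothetical P[kappa_x'] values for six data-points (illustrative only).
p_values = [0.05, 0.40, 0.72, 0.91, 0.98, 0.999]

for T in (0.5, 0.9, 0.99):
    print(T, select_by_threshold(p_values, T))
```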
As described above, we expect data that lie fur-
thest from the mode of the distribution of the whole
BIOSIGNALS 2011 - International Conference on Bio-inspired Systems and Signal Processing