2 PREVIOUS WORK
The prioritisation mechanism proposed in this paper
is founded on time series classification. Many time
series classification approaches have been proposed.
One of the most popular, and that used with respect to
the work presented in this paper, is k Nearest Neigh-
bour (kNN) classification. The fundamental idea of
kNN classification is to compare a previously unseen
time series, which we wish to label, with a “bank”
of time series whose labels are known, identify the k
most similar and use the labels from the k most simi-
lar to label the previously unseen time series. Usually
k = 1 is adopted because it avoids the need for any
conflict resolution.
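The kNN procedure described above can be sketched as follows. This is a minimal illustration and not the implementation used in this paper: the function names, the class labels and the use of Euclidean distance as a placeholder measure are all assumptions made for the example.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Series = Sequence[float]


def euclidean(a: Series, b: Series) -> float:
    """Placeholder distance measure; requires equal-length series."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length series")
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_classify(
    query: Series,
    bank: List[Tuple[Series, str]],
    k: int = 1,
    distance: Callable[[Series, Series], float] = euclidean,
) -> str:
    """Label `query` using the k most similar labelled series in `bank`."""
    if not 1 <= k <= len(bank):
        raise ValueError("k must be between 1 and the size of the bank")
    # Rank the bank by similarity to the previously unseen series.
    ranked = sorted(bank, key=lambda item: distance(query, item[0]))
    # Majority vote over the k nearest labels. With k = 1 there is a single
    # vote, so no conflict resolution is needed. For k > 1, ties go to the
    # label encountered first (an arbitrary choice made for this sketch).
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical bank of labelled series (values and labels are made up).
bank = [
    ([1.0, 2.0, 3.0], "high"),
    ([10.0, 11.0, 12.0], "low"),
    ([1.0, 2.0, 4.0], "high"),
]
print(knn_classify([1.0, 2.0, 3.2], bank, k=1))
```

Passing the distance measure as a parameter keeps the classifier independent of the choice of similarity measure.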
Time series classification using kNN entails two
challenges: (i) the data format for the input time se-
ries, and (ii) the nature of the similarity (distance)
measure to be used to establish similarity (Wang et al.,
2013). There are two popular time series formats:
(i) instance-based and (ii) feature-based. Using the
instance-based format the original time series format
is maintained. Using the feature-based representation,
properties of the time series are used (Wang et al.,
2008). For the work presented in this paper the
instance-based format was used. There are a number of
similarity measure options including Euclidean, Man-
hattan and Minkowski distance measurement, but Dy-
namic Time Warping (DTW) is considered to be the
most effective with respect to the instance-based for-
mat, and offers the additional advantage that the time
series considered do not have to be of the same length
(Wang et al., 2013). For the work presented in this
paper DTW was adopted.
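To illustrate why DTW does not require series of equal length, the classic dynamic programming formulation is sketched below. It is an assumption-laden example rather than the implementation used here: it handles univariate series only, uses the absolute difference as the local cost, and applies no warping window.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """DTW distance between two non-empty series, in O(len(a) * len(b)) time."""
    if not a or not b:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    # prev[j] is the cost of the best warping path aligning the points of
    # `a` processed so far with the first j points of `b`.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)  # local cost (an assumption of this sketch)
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


# Series of different lengths can be compared directly.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In the example call the repeated value in the second series is absorbed by the warping, so the two series are treated as identical in shape.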
The recent success of deep learning offers a more
substantive way of processing time series than in the
case of traditional models. Among the many deep learning
techniques, Recurrent Neural Networks (RNNs)
are considered an effective way of classifying time
series, because they allow for the processing of vari-
able length inputs and outputs by maintaining state
information across time steps. There are many ex-
amples in the literature where RNNs have been used
with respect to time series classification; see for ex-
ample (Choi et al., 2016; Esteban et al., 2015). Long
Short-Term Memory (LSTM) networks are a popular
form of RNN. The advantage of RNNs in general,
and LSTMs in particular, is that they have been shown to
be more accurate than kNN with respect to time series
classification. However, kNN does not require
significant training or large amounts of training data
as in the case of RNNs (LSTMs). There are many
variations of LSTMs (Greff et al., 2016). In this pa-
per, the standard “vanilla” LSTM setup was used.
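The way an LSTM maintains state across time steps, and so accepts inputs of variable length, can be sketched as below. This is a hedged illustration only: it shows the forward pass of a commonly used LSTM cell formulation without peephole connections, with random untrained weights. The exact configuration used in this paper is not specified in this section, and all names, sizes and input values are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Params = Dict[str, List[List[float]]]
GATES = ("i", "f", "o", "g")  # input, forget and output gates; candidate cell


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def init_params(n_in: int, n_hidden: int, seed: int = 0) -> Tuple[Params, Dict[str, Vector]]:
    """Random, untrained weights; each gate acts on the concatenation [x, h]."""
    rng = random.Random(seed)
    weights = {
        g: [[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hidden)]
            for _ in range(n_hidden)]
        for g in GATES
    }
    biases = {g: [0.0] * n_hidden for g in GATES}
    return weights, biases


def lstm_step(x: Sequence[float], h: Vector, c: Vector,
              weights: Params, biases: Dict[str, Vector]) -> Tuple[Vector, Vector]:
    """One time step: returns the new hidden state and the new cell state."""
    z = list(x) + list(h)

    def affine(gate: str) -> Vector:
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(weights[gate], biases[gate])]

    i = [sigmoid(a) for a in affine("i")]
    f = [sigmoid(a) for a in affine("f")]
    o = [sigmoid(a) for a in affine("o")]
    g = [math.tanh(a) for a in affine("g")]
    # The cell state mixes retained memory (f * c) with new information (i * g).
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def encode(series: Sequence[Sequence[float]], weights: Params,
           biases: Dict[str, Vector], n_hidden: int) -> Vector:
    """Run the cell over a series of any length; return the final hidden state."""
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    for x in series:
        h, c = lstm_step(x, h, c, weights, biases)
    return h


weights, biases = init_params(n_in=3, n_hidden=4)
short_series = [[0.2, 0.1, 0.4]] * 2   # two time steps, three values each
long_series = [[0.3, 0.1, 0.4]] * 7    # seven time steps
print(len(encode(short_series, weights, biases, 4)),
      len(encode(long_series, weights, biases, 4)))
```

Whatever the length of the input series, the final hidden state has a fixed size, and it is this fixed-size summary that a classification layer would typically consume.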
3 APPLICATION DOMAIN
The work presented in this paper is focused on the
Urea and Electrolytes (U&E) test, a commonly used
test to detect abnormalities of blood chemistry, pri-
marily kidney (renal) function and dehydration. A
U&E test is usually performed to confirm normal kid-
ney function or to exclude a serious imbalance of
biochemical salts in the bloodstream. The U&E test
data considered in this paper comprised, for each test,
measurement of levels of: (i) Sodium (So), (ii) Potassium
(Po), (iii) Urea (Ur), (iv) Creatinine (Cr), and
(v) Bicarbonate (Bi). The measurement of each is referred
to as a “task”; thus we have five tasks per test.
In other words each U&E test results in five pathology
values. It is suggested that U&E pathology results can
be prioritised more precisely if the trend in the
historical records is taken into consideration, thereby
supporting more efficient treatment of patients at
potential risk of renal function conditions. Given a
new set of pathology values for a U&E test we wish
to determine the priority to be associated with this set
of values.
4 FORMALISM
In the context of the foregoing, the assumption is that
the training data comprises a set of pathology results,
$D = \{P_1, P_2, \dots\}$, where the class (event) label $c$ for
each pathology record $P_j \in D$ is known. As the focus
of the work is U&E test data, which comprises five
tasks (components), each record $P_j \in D$ is of the form:

$$P_j = \langle \mathit{Id}, \mathit{Date}, \mathit{Gender}, T_{So}, T_{Po}, T_{Ur}, T_{Cr}, T_{Bi}, c \rangle \tag{1}$$

where $T_{So}$ to $T_{Bi}$ are five multi-variate time series
representing, in sequence, pathology results for the five
tasks typically found in a U&E test: Sodium (So),
Potassium (Po), Urea (Ur), Creatinine (Cr) and
Bicarbonate (Bi); and $c$ is the class label taken from a set of
classes $C$. Each time series $T_i$ has three dimensions:
(i) pathology result, (ii) normal low and (iii) normal
high. The normal low and high dimensions indicate
a “band” in which pathology results are expected to
fall. These values are less volatile than the pathology
result values themselves, but do change for each
patient over time. Thus each time series $T_i$ comprises a
sequence of tuples of the form $\langle v, nl, nh \rangle$ (pathology
result, normal low and normal high respectively).
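One possible in-memory representation of a record of the form given in Equation (1) is sketched below. The field names, the Python types chosen for Id, Date and Gender, the class label and all sample values are assumptions made for illustration; only the overall structure follows the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, NamedTuple


class Reading(NamedTuple):
    """One point of a task time series: the tuple <v, nl, nh>."""
    v: float   # pathology result
    nl: float  # normal low
    nh: float  # normal high

    def in_band(self) -> bool:
        """True if the result falls inside the normal band."""
        return self.nl <= self.v <= self.nh


@dataclass
class PathologyRecord:
    """A record P_j: identifying fields, five task series and a class label."""
    record_id: str
    test_date: date
    gender: str
    t_so: List[Reading]  # Sodium
    t_po: List[Reading]  # Potassium
    t_ur: List[Reading]  # Urea
    t_cr: List[Reading]  # Creatinine
    t_bi: List[Reading]  # Bicarbonate
    c: str               # class label drawn from the set of classes C

    def tasks(self) -> Dict[str, List[Reading]]:
        """The five task series keyed by their abbreviations."""
        return {"So": self.t_so, "Po": self.t_po, "Ur": self.t_ur,
                "Cr": self.t_cr, "Bi": self.t_bi}


# Made-up values, for illustration only.
record = PathologyRecord(
    record_id="p001",
    test_date=date(2020, 1, 15),
    gender="F",
    t_so=[Reading(140.0, 133.0, 146.0), Reading(131.0, 133.0, 146.0)],
    t_po=[Reading(4.1, 3.5, 5.3)],
    t_ur=[Reading(5.0, 2.5, 7.8)],
    t_cr=[Reading(70.0, 50.0, 130.0)],
    t_bi=[Reading(25.0, 22.0, 29.0)],
    c="hypothetical-label",
)
print(sorted(record.tasks()))
print([r.in_band() for r in record.t_so])
```

Storing the normal low and normal high values alongside each result reflects the observation that the band itself can change for a patient over time.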
To derive the class label for each record $P_j \in D$,
reference was made to the outcome event(s) associ-
ated with each patient. For the evaluation presented
later in this paper, three outcome events were consid-
ered: (i) Emergency Patient (EP), an In-Patient (IP)
KDIR 2021 - 13th International Conference on Knowledge Discovery and Information Retrieval