The system model $f(y_t \mid u_t, d(t-1))$, obtained after "excluding" the unknown parameters via identification, is essentially a predictor of the output $y_t$. Its performance depends only weakly on overestimation of the structure of the respective regressors, but it is significantly influenced by the assumption that the component weights are time invariant. This assumption allows independent jumps between active (best describing) components irrespective of $u_t$ and $d(t-1)$. The condition is met in some applications, but it is unrealistic in the technical applications considered here: usually, the system is described by just a subset of the components (often a single one) for some period of time. In this situation, the output prediction based on the whole mixture is poor. The problem can be overcome by detecting the active components and using them for the time periods in question.
3 BASIC IDEA
A particular short data record that is to be processed contains too few samples for a meaningful identification. The basic idea consists in rearranging the data and identifying them by a dynamic Bayesian mixture. The mixture, or selected components of it, is then used for prediction.
3.1 Rearrangement of data
To illustrate the method, let us simulate a simple example of $n_r = 10$ one-dimensional data records, each consisting of $n_d = 5$ samples generated by the model
\[ y_k = a_1 y_{k-1}, \tag{2} \]
where $a_1 = 0.6$, $y_1 = 75$ and $k = 2, \dots, 5$ is the discrete time index.
The data can be depicted by the mesh plot shown in the upper graph of Fig. 1. For the sake of identification, the individual records can be merged into a single vector with $l = n_r \cdot n_d$ items, as shown in the lower graph of the same figure. The overall sample index is then $t = 1, \dots, l$.
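The rearrangement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `simulate_record`, `merge_records` and `to_record_index` are hypothetical and do not come from the paper, and all ten records are taken to be identical because the example is deterministic.

```python
def simulate_record(a1=0.6, y1=75.0, n_d=5):
    """Generate one record from y_k = a1 * y_{k-1}, k = 2..n_d, starting at y_1."""
    record = [y1]
    for _ in range(n_d - 1):
        record.append(a1 * record[-1])
    return record


def merge_records(records):
    """Concatenate the records into a single vector indexed by t = 1..l."""
    return [y for record in records for y in record]


def to_record_index(t, n_d):
    """Map the overall index t (1-based) to (record number, k), both 1-based."""
    return (t - 1) // n_d + 1, (t - 1) % n_d + 1


n_r, n_d = 10, 5
records = [simulate_record(n_d=n_d) for _ in range(n_r)]
merged = merge_records(records)  # l = n_r * n_d items
```

The helper `to_record_index` recovers the record number and the within-record index $k$ from the overall index $t$, which is convenient when predictions on the merged vector have to be related back to individual records.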
3.2 Identification – deterministic case
The Mixtools package (Nedoma et al., 2002) was used for mixture identification. For the given deterministic example, the result matched expectations exactly: the mixture is composed of two components, one corresponding to the model dynamics (2) and the other modelling the transitions between records.
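Why two components are to be expected can be seen directly in the merged data. The sketch below is not the Mixtools identification; it merely splits the consecutive pairs $(y_{t-1}, y_t)$ of the merged vector using the known record length, which a mixture estimator would not be given. Variable names are illustrative assumptions.

```python
# Deterministic example: every record is 75, 45, 27, 16.2, 9.72 (up to rounding).
a1, y1, n_r, n_d = 0.6, 75.0, 10, 5
record = [y1 * a1 ** k for k in range(n_d)]
merged = record * n_r

within, boundary = [], []
for i in range(1, len(merged)):      # 0-based position i corresponds to t = i + 1
    pair = (merged[i - 1], merged[i])
    if i % n_d == 0:                 # y_t is the first sample of a new record
        boundary.append(pair)
    else:                            # both samples belong to the same record
        within.append(pair)

# Within a record the ratio y_t / y_{t-1} equals a1 (model (2)).
ratios = [cur / prev for prev, cur in within]
```

Of the 49 consecutive pairs, 40 follow the dynamics (2) and 9 are jumps from the last sample of one record back to $y_1 = 75$ of the next, which is the behaviour the second component has to describe.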
[Figure 1 plots. Upper panel, "Data transformation": mesh plot of the data over record number (1 to 10) and k (sample index, 1 to 5). Lower panel: the merged data vector over t (overall sample index, 1 to 50), with the within-record index k repeated along the top axis.]
Figure 1: Data rearrangement. Short data records generated
by a simple model shown on the upper mesh plot are merged
into the single vector shown on the lower graph.
3.3 Prediction and evaluation criterion
The mixture was identified in order to obtain a useful prediction. One-step-ahead prediction is considered for the sake of simplicity. For a model of order $m$, the prediction is made at $(n_d - m)$ time instants of a single data record ($y_{c;k}$ denotes the prediction by the $c$-th component):
\[ y_{p;k} = \sum_{c=1}^{n_c} \alpha_c \, y_{c;k}, \qquad k = m + 1, \dots, n_d. \tag{3} \]
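A minimal Python sketch of the weighted prediction (3) follows. The two component predictors and the weights 0.8 and 0.2 are hypothetical values chosen only to make the example concrete; they are not the parameters or weights identified in the paper, and the function names are assumptions as well.

```python
def mixture_prediction(weights, component_predictions):
    """Evaluate y_{p;k} = sum_c alpha_c * y_{c;k} for one time instant."""
    if len(weights) != len(component_predictions):
        raise ValueError("one weight per component prediction is required")
    return sum(a * y for a, y in zip(weights, component_predictions))


def predict_record(record, weights, components, m=1):
    """One-step-ahead predictions for k = m+1..n_d (n_d - m values).

    Each component is a callable that maps the m preceding samples
    to its own prediction y_{c;k}.
    """
    predictions = []
    for k in range(m, len(record)):  # 0-based k corresponds to time k + 1
        past = record[k - m:k]
        predictions.append(
            mixture_prediction(weights, [c(past) for c in components])
        )
    return predictions


# Hypothetical components: the dynamics (2) and a constant "transition" level.
dynamics = lambda past: 0.6 * past[-1]
transition = lambda past: 75.0

record = [75.0, 45.0, 27.0, 16.2, 9.72]
whole_mixture = predict_record(record, [0.8, 0.2], [dynamics, transition])
dynamics_only = predict_record(record, [1.0, 0.0], [dynamics, transition])
```

With all weight on the dynamics component the predictions reproduce the record from $k = 2$ onwards, while any nonzero weight on the transition component pulls every within-record prediction towards 75.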
The predictions are merged in the same way as the original data, forming a vector $y_p(t)$, $t = 1, \dots, l$. To evaluate the prediction quality, the following modified quadratic criterion $E_s$ is used (the subscript $s$ stands for the selected instants $t$ at which the predictions are evaluated):
\[ E_s = \frac{1}{l} \sum_{t=1}^{l} (y_t - y_{p;t})^2. \tag{4} \]
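The criterion (4) can be computed as in the sketch below. How the selected instants enter the sum is an assumption made here: instants outside the selection are skipped, while the normalisation by $l$ is kept as written in (4). The function name and the `selected` argument (0-based positions) are hypothetical.

```python
def quadratic_criterion(y, y_pred, selected=None):
    """E_s = (1/l) * sum over selected instants of (y_t - y_{p;t})^2.

    `selected` holds 0-based positions; by default every instant is used.
    The sum is always divided by l = len(y), as in (4) (assumed reading).
    """
    l = len(y)
    if len(y_pred) != l:
        raise ValueError("data and predictions must have the same length")
    if selected is None:
        selected = range(l)
    return sum((y[i] - y_pred[i]) ** 2 for i in selected) / l


# Small check with made-up numbers: only the middle prediction is off by 6.
y = [75.0, 45.0, 27.0]
y_pred = [75.0, 51.0, 27.0]
e_all = quadratic_criterion(y, y_pred)                    # 36 / 3 = 12.0
e_sel = quadratic_criterion(y, y_pred, selected=[0, 2])   # 0.0
```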
Fig. 2 shows the original data and the predictions for three randomly chosen consecutive records. In the upper graph, the whole mixture, i.e. both components in this case, was used for prediction. The prediction is evidently poor ($E_s = 52.9$). The lower graph shows the predictions calculated from the selected component only. In this deterministic case the component matches the model (2) exactly, and thus the prediction is perfect ($E_s = 0$).
3.4 Component selection
The criterion for selecting the components to be used for prediction is crucial for the principle described above.
IDENTIFICATION AND PREDICTION OF MULTIPLE SHORT RECORDS BY DYNAMIC BAYESIAN MIXTURES