A Modification of Training and Recognition Algorithms for Recognition
of Abnormal Behavior of Dynamic Systems
Victor Shcherbinin and Valery Kostenko
Department of Computational Mathematics and Cybernetics, Moscow State University, Moscow, Russia
Keywords:
Genetic Algorithm, Machine Learning, Supervised Learning, Recognition Algorithm, Dynamic System,
Training Set, Algebraic Approach.
Abstract:
We consider the problem of automatic construction of algorithms for recognition of abnormal behavior segments in phase trajectories of dynamic systems. The recognition algorithm is constructed using a set of examples of normal and abnormal behavior of the system. We use the axiomatic approach to abnormal behavior recognition to construct abnormal behavior recognizers. In this paper we propose a modification of the genetic recognizer construction algorithm and a novel DTW-based recognition algorithm within this approach. The proposed modification reduces the search space of the training algorithm and gives the recognition algorithm more information about phase trajectories. Experimental evaluation shows that the proposed modification reduces the number of recognition errors by an order of magnitude and the training time by a factor of 2 compared to the existing recognizer and recognizer construction algorithm.
1 INTRODUCTION
Consider a dynamic system whose state can be observed by reading data from sensors surrounding the system. The sensor readings are obtained with a fixed frequency $1/\tau$.
A multidimensional phase trajectory in the space of sensor readings is an ordered set of vectors $X = (x_1, x_2, \ldots, x_k)$, where $x_i \in \mathbb{R}^s$ is the vector of sensor readings at time $t = t_0 + i \cdot \tau$.
We assume that at any given moment of time the
system can be in one of two states:
Normal state. In this state, the system is fully
functional, and there are no signs that it is going
to lose any of its functionality any time soon.
Abnormal state. In this state, the system is not
fully functional or is going to lose some of its
functions soon.
The behavior (trajectory in the space of observed
parameters) that precedes a transition of the system
from normal state to abnormal state is called abnormal behavior. We suppose that there are $L$ classes of abnormal behavior, each characterized by a phase trajectory $X^l_{Anom}$ called a reference trajectory.
The observed phase trajectory X of the system can
have segments of abnormal behavior which are dis-
torted compared to the reference trajectories. The
distortions can be classified as amplitude distortions
and time distortions. We say that a segment of abnor-
mal behavior is distorted by amplitude compared to
a reference trajectory if values in some points of the
segment differ from those in the corresponding points
of the reference trajectory. We say that a segment of
abnormal behavior is distorted by time compared to a
reference trajectory if there are missing or extra points
in the segment compared to the reference trajectory.
An example of an amplitude distortion is a stationary
noise.
The problem of recognizing abnormal behavior can be defined as follows. We have
An observed multidimensional trajectory $X$;
A set of $L$ classes of abnormal behavior, for each of which a reference trajectory $X^l_{Anom}$ is defined;
Recognition accuracy requirement:
$$e_I \le const_1, \quad e_{II} \le const_2 \tag{1}$$
Here $e_I$ is the number of type I errors, $e_{II}$ is the number of type II errors, $const_1$ and $const_2$ are given numerical constraints.
We need to recognize abnormal behavior of the
system, i. e. to find abnormal behavior segments in
the trajectory X and abnormal behavior class number
for each segment found.
Shcherbinin V. and Kostenko V..
A Modification of Training and Recognition Algorithms for Recognition of Abnormal Behavior of Dynamic Systems.
DOI: 10.5220/0004552201030110
In Proceedings of the 5th International Joint Conference on Computational Intelligence (ECTA-2013), pages 103-110
ISBN: 978-989-8565-77-8
Copyright © 2013 SCITEPRESS (Science and Technology Publications, Lda.)
This problem belongs to the class of pattern recognition problems. A wide variety of methods are used in the pattern recognition field, including methods based on artificial neural networks (Haykin, 1998), the k-nearest neighbour algorithm (Cover and Hart, 1967), algorithms based on Singular Spectrum Analysis (Hassani, 2007), etc. However, application of these methods to this particular problem is complicated by the presence of non-linear amplitude and time distortions of abnormal behavior segments in the observed phase trajectory $X$. To overcome these difficulties, which stem from the properties of the dynamic systems in question, a parametric family of recognition algorithms based on the algebraic approach was introduced in (Kovalenko et al., 2005). This parametric family builds on the idea of using the algebraic approach to label planar configurations described in (Rudakov and Chekhovich, 2003). A genetic training algorithm for the parametric family was suggested in (Kovalenko et al., 2010). Results from (Kostenko and Shcherbinin, 2013) show that this parametric family of recognition algorithms demonstrates high tolerance to non-linear amplitude and time distortions of abnormal behavior segments compared to other approaches. In this paper we describe a modification of the recognition algorithms from this parametric family and a modification of the genetic training algorithm from (Kovalenko et al., 2010).
2 CONSTRUCTION OF AN
ABNORMAL BEHAVIOR
RECOGNITION ALGORITHM
USING A SET OF EXAMPLES
We call a set of the dynamic system's trajectories $TS = \{X\}$, obtained under different operating conditions or via simulation of the system, a set of examples. Each trajectory $X$ from $TS$ includes sections of normal and abnormal behavior. For each $X \in TS$ the starting point, the end point and the abnormal behavior class number of each abnormal behavior segment are given.
The set of examples $TS$ is divided into three non-overlapping parts: a set of reference trajectories $\{X^l_{Anom}\}_{l=1}^{L}$, a training set $\widetilde{TS}$, and a validation set $\widehat{TS}$. The training set $\widetilde{TS}$ and the validation set $\widehat{TS}$ have the same size and contain trajectories that include both abnormal and normal behavior segments.
Suppose we are given an objective function $\varphi(e_I, e_{II}) : \mathbb{Z}_+ \times \mathbb{Z}_+ \to \mathbb{R}_+$ which is non-decreasing w.r.t. both its arguments. The problem of automatic construction of an abnormal behavior recognition algorithm from a set of examples is formulated as follows (Kovalenko et al., 2005). Given
a set of reference trajectories $\{X^l_{Anom}\}_{l=1}^{L}$,
a training set $\widetilde{TS}$,
a validation set $\widehat{TS}$,
an objective function $\varphi(e_I, e_{II})$,
produce a recognition algorithm $Al$ that satisfies the following conditions:
1. $Al$ should show a limited number of type I and type II errors on the training set $\widetilde{TS}$:
$$e_I(Al, \widetilde{TS}) \le const_1, \quad e_{II}(Al, \widetilde{TS}) \le const_2 \tag{2}$$
Here $e_i(Al, TS)$ is the number of type $i$ errors that $Al$ makes on the trajectories from $TS$.
2. $Al$ should minimize the objective function $\varphi(e_I, e_{II})$ on the validation set $\widehat{TS}$:
$$Al = \arg\min_{Al} \varphi(e_I(Al, \widehat{TS}), e_{II}(Al, \widehat{TS})) \tag{3}$$
3. The computational complexity $\Theta_{Al}(m)$ of the recognition algorithm $Al$ on any trajectory of length less than or equal to $m$ should be limited by a given function $\theta(m)$:
$$\Theta_{Al}(m) \le \theta(m) \tag{4}$$
The function $\theta(m)$ is determined by the system operation rate and the available processing power.
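As a rough illustration of conditions (2) and (3), the sketch below first filters candidate recognizers by their error counts on the training set and then picks the one with the lowest objective value on the validation set. The `Candidate` record, the weights inside `phi` and all error counts are hypothetical; the paper does not prescribe a particular objective function.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    train_e1: int   # type I errors on the training set
    train_e2: int   # type II errors on the training set
    val_e1: int     # type I errors on the validation set
    val_e2: int     # type II errors on the validation set


def phi(e1: int, e2: int) -> float:
    """Hypothetical objective function, non-decreasing in both arguments."""
    return 1.0 * e1 + 10.0 * e2


def pick(candidates: List[Candidate], const1: int, const2: int) -> Optional[Candidate]:
    # Condition (2): bounded error counts on the training set.
    feasible = [c for c in candidates
                if c.train_e1 <= const1 and c.train_e2 <= const2]
    if not feasible:
        return None
    # Condition (3): minimal objective value on the validation set.
    return min(feasible, key=lambda c: phi(c.val_e1, c.val_e2))


candidates = [
    Candidate("A", train_e1=2, train_e2=0, val_e1=3, val_e2=1),
    Candidate("B", train_e1=1, train_e2=1, val_e1=5, val_e2=0),
    Candidate("C", train_e1=9, train_e2=0, val_e1=0, val_e2=0),
]
best = pick(candidates, const1=5, const2=1)
print(best.name)  # B
```

Candidate C has the best validation score but violates the training-set constraint, so it is never considered; this mirrors the order in which the two conditions are stated.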
The problem definition described here corre-
sponds to the classic definition of the problem of
learning from examples (a. k. a. supervised learning
problem) described in (Vorontsov, 2004) and (Vapnik,
1998).
3 AXIOMATIC APPROACH TO
ABNORMAL BEHAVIOR
RECOGNITION
In this section we describe the parametric family of al-
gorithms for recognition of abnormal behavior of dy-
namic systems introduced in (Kovalenko et al., 2005).
3.1 Basic Notions
Let $X = (x_1, x_2, \ldots, x_k)$ be a one-dimensional trajectory, $x_t \in \mathbb{R}$.
An elementary condition ec = ec(t, X, p) is a
function defined on a point t and its neighborhood on
a trajectory X. It depends on a set of parameters p and
takes either true value or false value.
IJCCI2013-InternationalJointConferenceonComputationalIntelligence
104
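An example of an elementary condition is
$$ec(t, X, p) = \begin{cases} true, & \text{if } \forall i \in [t - l, t + r]: \ a \le x_i \le b, \\ false, & \text{otherwise.} \end{cases} \tag{5}$$
Here $p = \{a, b, l, r\}$ is the set of parameters of this elementary condition, $a, b \in \mathbb{R}$, $a < b$, $l, r \in \mathbb{N}_+$.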
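The elementary condition (5) can be sketched in a few lines of Python. The function name `ec_in_range`, the 0-based indexing and the handling of windows that reach past the ends of the trajectory (out-of-range indices are skipped) are assumptions made for this illustration; the paper does not specify boundary behavior.

```python
from typing import Sequence


def ec_in_range(t: int, x: Sequence[float],
                a: float, b: float, l: int, r: int) -> bool:
    """Condition (5): true iff a <= x[i] <= b for every i in [t - l, t + r].

    Assumption: indices outside the trajectory are ignored.
    """
    lo = max(0, t - l)
    hi = min(len(x) - 1, t + r)
    return all(a <= x[i] <= b for i in range(lo, hi + 1))


trajectory = [0.1, 0.4, 0.5, 0.45, 2.0, 0.3]
print(ec_in_range(2, trajectory, a=0.0, b=1.0, l=1, r=1))  # True: x[1..3] lie in [0, 1]
print(ec_in_range(3, trajectory, a=0.0, b=1.0, l=1, r=1))  # False: x[4] = 2.0
```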
We have introduced the concept of an elementary
condition for one-dimensional trajectories. However,
an s-dimensional trajectory can be regarded as a col-
lection of s one-dimensional trajectories. We intro-
duce elementary conditions for multidimensional tra-
jectories by adding to the elementary condition the
number of one-dimensional trajectory to which it is
applied as a parameter.
Let $X = (x_1, x_2, \ldots, x_k)$ be a multidimensional trajectory, $x_i \in \mathbb{R}^s$.
An axiom $a = a(t, X)$ is a function defined as a Boolean formula over a set of elementary conditions defined on a point $t$ and its neighborhood on a multidimensional trajectory $X$:
$$a(t, X) = \bigvee_{i=1}^{p} \bigwedge_{j=1}^{q} ec_{ij}(t, X, p_{ij}) \tag{6}$$
We call a finite collection of axioms $As = \{a_1, a_2, \ldots, a_m\}$ an axiom system if it meets the condition:
$$\forall X \ \forall x_t \in X \ \exists! \, a_i \in As : a_i(t, X) = true \tag{7}$$
I. e. for any point $t$ in any trajectory $X$ there exists one and only one axiom $a_i$ in the axiom system $As$ that is true at point $t$.
A marking of a trajectory $X = (x_1, x_2, \ldots, x_k)$ by an axiom system $As = \{a_1, a_2, \ldots, a_m\}$ is a finite sequence
$$J = (j_1, j_2, \ldots, j_k)$$
of numbers of axioms from $As$, such that $a_{j_t}$ is true at point $t$ of trajectory $X$.
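To make the notions of axiom, axiom system and marking concrete, here is a minimal sketch for one-dimensional trajectories. An axiom is stored in the form (6) as a list of conjunctions of elementary conditions; the helper names (`holds`, `mark`, `below`, `at_least`) and the three threshold axioms are invented for this example. The function `mark` raises an error when condition (7) is violated.

```python
from typing import Callable, List, Sequence

Condition = Callable[[int, Sequence[float]], bool]  # ec(t, X) with bound parameters
Axiom = List[List[Condition]]                       # disjunction of conjunctions, as in (6)


def holds(axiom: Axiom, t: int, x: Sequence[float]) -> bool:
    return any(all(ec(t, x) for ec in conj) for conj in axiom)


def mark(axiom_system: List[Axiom], x: Sequence[float]) -> List[int]:
    """Return the marking J: for each point, the 1-based number of the true axiom."""
    marking = []
    for t in range(len(x)):
        true_axioms = [j for j, ax in enumerate(axiom_system, start=1)
                       if holds(ax, t, x)]
        if len(true_axioms) != 1:
            raise ValueError(f"condition (7) violated at point {t}: {true_axioms}")
        marking.append(true_axioms[0])
    return marking


def below(c: float) -> Condition:
    return lambda t, x: x[t] < c


def at_least(c: float) -> Condition:
    return lambda t, x: x[t] >= c


system = [
    [[below(0.0)]],                    # a_1: x_t < 0
    [[at_least(0.0), below(1.0)]],     # a_2: 0 <= x_t < 1
    [[at_least(1.0)]],                 # a_3: x_t >= 1
]
print(mark(system, [-0.5, 0.2, 0.7, 1.5, 0.9]))  # [1, 2, 2, 3, 2]
```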
3.2 The Recognition Algorithm
We define our parametric family of recognition algo-
rithms S as a family of algorithms which recognize
abnormal behavior segments in trajectory X by per-
forming the following steps:
1. Perform marking of the reference trajectories $\{X^l_{Anom}\}_{l=1}^{L}$ corresponding to different classes of abnormal behavior by an axiom system $As$.
2. Perform marking of the trajectory $X$ by the axiom system $As$. We denote the marking of trajectory $X$ as $J$.
3. Perform fuzzy search for reference trajectory markings in the marking $J$.
Each of the recognition algorithms from the para-
metric family S is defined by an axiom system and an
algorithm for searching for the markings of reference
trajectories in the marking of the observed trajectory.
The use of fuzzy search algorithms for searching
markings allows us to tackle time distortions. Algo-
rithms based on DTW (Keogh and Pazzani, 2001) are
used for marking search.
3.3 Construction of the Recognition
Algorithm
Construction of a recognition algorithm is performed
in two stages:
1. The recognition algorithm is constructed for each
pair (preprocessing algorithm, marking search al-
gorithm): parameters of the preprocessing algo-
rithm and marking search algorithm are deter-
mined, an axiom system is constructed.
2. From all constructed solutions we pick the algorithm that shows the lowest value of the objective function $\varphi(e_I, e_{II})$ on the validation set $\widehat{TS}$.
Local optimization algorithms are used to adjust the
parameters of preprocessing algorithm and marking
search algorithm. The greatest difficulty is posed by
construction of an axiom system.
4 AXIOM SET
TRANSFORMATION
In the existing axiom system construction algorithms
described in (Kovalenko et al., 2010), (Kostenko and
Shcherbinin, 2013) an axiom system As is constructed
by first constructing a set of axioms as for which the
condition (7) is not guaranteed to hold. Then a trans-
formation is applied to this set transforming it into an
axiom system which is guaranteed to satisfy (7).
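The cited works use a transformation of the axiom set called prioritizing of axioms. Formally, prioritizing of axioms is given by
$$As = \{a'_1, \ldots, a'_m, a'_0\}, \tag{8}$$
where
$$\begin{aligned}
a'_1 &= a_1 \\
a'_2 &= a_2 \wedge \neg a_1 \\
a'_3 &= a_3 \wedge \neg a_2 \wedge \neg a_1 \\
&\ \ \vdots \\
a'_m &= a_m \wedge \neg a_{m-1} \wedge \ldots \wedge \neg a_2 \wedge \neg a_1 \\
a'_0 &= \neg a_m \wedge \neg a_{m-1} \wedge \ldots \wedge \neg a_2 \wedge \neg a_1
\end{aligned} \tag{9}$$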
AModificationofTrainingandRecognitionAlgorithmsforRecognitionofAbnormalBehaviorofDynamicSystems
105
One can readily see that (7) is satisfied for As.
Note that in order to apply this transformation to an
axiom set as one needs to choose an order (i. e. pri-
ority) on the axiom set. That means that there are m!
ways to apply this transformation to a set of m axioms.
This increases the search space of axiom system con-
struction algorithms since they search for an axiom
system which minimizes the objective function.
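The dependence of prioritizing on the chosen order can be seen in a small sketch. Given the truth values of the original axioms at one point, listed in priority order, `prioritized_label` returns the number of the single axiom from (9) that is true there. The function name and the toy truth values are assumptions made for illustration.

```python
from itertools import permutations
from math import factorial
from typing import Sequence


def prioritized_label(truth: Sequence[bool]) -> int:
    """1-based position of the first true axiom in priority order; 0 if none is true."""
    for j, value in enumerate(truth, start=1):
        if value:
            return j
    return 0


# Two of the three original axioms are true at some point t.
truth_at_t = {"a1": True, "a2": True, "a3": False}

winners = set()
for order in permutations(truth_at_t):
    j = prioritized_label([truth_at_t[name] for name in order])
    winners.add(order[j - 1] if j else None)

print(sorted(winners))   # ['a1', 'a2']
print(factorial(3))      # 6 possible priority orders for m = 3
```

Depending on the order, the same point is labeled by a different original axiom, which is why the order becomes part of the search space.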
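In this paper we propose a different type of transformation which we call superset construction. The idea is to map to point $t$ not one axiom from $as$ but the whole subset of axioms from $as$ which are true at point $t$. Formally we can construct an axiom for each subset of $as$ in the following way:
$$As = \{a'_0, \ldots, a'_{2^m - 1}\}, \tag{10}$$
where
$$\begin{aligned}
a'_0 &= \neg a_1 \wedge \neg a_2 \wedge \ldots \wedge \neg a_{m-1} \wedge \neg a_m \\
a'_1 &= \neg a_1 \wedge \neg a_2 \wedge \ldots \wedge \neg a_{m-1} \wedge a_m \\
a'_2 &= \neg a_1 \wedge \neg a_2 \wedge \ldots \wedge a_{m-1} \wedge \neg a_m \\
a'_3 &= \neg a_1 \wedge \neg a_2 \wedge \ldots \wedge a_{m-1} \wedge a_m \\
&\ \ \vdots \\
a'_{2^m - 1} &= a_1 \wedge a_2 \wedge \ldots \wedge a_{m-1} \wedge a_m
\end{aligned} \tag{11}$$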
One can readily see that the result of this transformation is guaranteed to satisfy (7), i. e. this transformation can be used in construction algorithms in place of prioritizing of axioms. The proposed transformation has the following properties:
Each axiom $a' \in As$ bijectively corresponds to a subset $\widetilde{as} \subseteq as$. The fact that $a'$ is true at a point $t$ of a trajectory $X$ is equivalent to the fact that each axiom from the corresponding subset $\widetilde{as}$ is true at $t$ and each axiom from $as \setminus \widetilde{as}$ is false at $t$. Marking search algorithms can exploit the structure of the subsets corresponding to the elements of the marking to improve the recognition quality.
There is only one way to apply superset construction to an axiom set. That means that axiom system construction algorithms that use this transformation have a smaller search space than those that use prioritizing of axioms.
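Read as binary numbers, the indices in (11) encode which original axioms are true: $a_1$ is the most significant bit and $a_m$ the least significant one. The sketch below uses this reading to compute the index of the axiom from (11) that is true at a point and to recover the corresponding subset. The function names are hypothetical.

```python
from typing import Sequence, Set


def superset_label(truth: Sequence[bool]) -> int:
    """Index n of the axiom a'_n from (11) for the truth values (a_1, ..., a_m)."""
    n = 0
    for value in truth:              # a_1 first, so it ends up as the top bit
        n = (n << 1) | int(value)
    return n


def subset_of(n: int, m: int) -> Set[int]:
    """Numbers j of the original axioms a_j in the subset encoded by a'_n."""
    return {j for j in range(1, m + 1) if (n >> (m - j)) & 1}


truth = (False, True, True)          # a_2 and a_3 are true, a_1 is false
n = superset_label(truth)
print(n, sorted(subset_of(n, 3)))    # 3 [2, 3]
```

Since every truth vector maps to exactly one index, condition (7) holds by construction and no ordering of the axioms has to be chosen.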
5 GENETIC AXIOM SYSTEM
CONSTRUCTION ALGORITHM
The genetic axiom system construction algorithm described in (Kovalenko et al., 2010) uses prioritizing of axioms as the axiom set transformation. An individual in this algorithm is an ordered set of axioms. In this paper we propose a modification of this algorithm that uses superset construction as the axiom set transformation. The two principal modifications of the algorithm are:
Modification of the structure of population individuals. An individual in the new algorithm is an unordered set of axioms, since we do not need to choose an order on the axiom set to apply superset construction.
Modification of mutation and crossover operations. Since the structure of an individual is changed, we need to adjust the mutation and crossover operations. In particular, we need to remove priority changing from these operations.
5.1 General Scheme of the Algorithm
The goal of the genetic axiom system construction algorithm is to construct an axiom system that minimizes the objective function $\varphi(e_I, e_{II})$ on the validation set $\widehat{TS}$ given a fixed preprocessing algorithm and a fixed marking search algorithm.
The general scheme of the proposed algorithm is
as follows:
1. Generation of the initial population.
2. Iterative optimization of the population:
(a) Mutation of individuals.
(b) Crossover of individuals and expansion of the
population.
(c) Selection of individuals and reduction of the
population.
(d) Checking of termination condition: iterative
process is repeated until the termination con-
dition is met.
An individual in the population $Pl$ is an (unordered) set of axioms:
$$Pl = \{as_i\}, \quad as_i = \{a^i_1, a^i_2, \ldots, a^i_{m_i}\}$$
To compute the objective function $\varphi(e_I, e_{II})$ for an individual axiom set $as_i$ we apply the superset construction transformation described in section 4 to $as_i$, deriving an axiom system $As_i$, run the recognition algorithm as described in subsection 3.2 using the axiom system $As_i$, and calculate the number of type I and type II errors and the objective function $\varphi(e_I, e_{II})$.
During generation of the initial population in-
dividual axiom sets are generated randomly. Ax-
ioms are randomly constructed from a finite set
of elementary conditions using boolean operations
{and, or, not}. The used set of elementary conditions
is defined by the user. Axiom sets are randomly con-
structed from generated axioms.
IJCCI2013-InternationalJointConferenceonComputationalIntelligence
106
The algorithm ends if any of the following conditions is met:
We found an axiom set $as$ for which the value of the objective function $\varphi(e_I, e_{II})$ is 0.
The number of iterations of the algorithm exceeded a predefined value $I_{all}$.
The number of iterations without a decrease of the lowest objective function value in the population exceeded a predefined value $I_{stop}$.
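The general scheme and the three termination conditions can be sketched as a generic loop. This is a simplified skeleton under stated assumptions, not the authors' implementation: individuals are opaque values, `mutate` and `crossover` are caller-supplied hypothetical callbacks, and selection is reduced to keeping the `n_max` best individuals (the selection described in the paper also keeps some randomly chosen ones).

```python
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def evolve(population: List[T],
           objective: Callable[[T], float],
           mutate: Callable[[T, random.Random], T],
           crossover: Callable[[T, T, random.Random], T],
           n_max: int, i_all: int, i_stop: int,
           rng: random.Random) -> T:
    best = min(population, key=objective)
    iterations = stale = 0
    # Stop on a zero objective, on the iteration limit, or on stagnation.
    while objective(best) > 0 and iterations < i_all and stale < i_stop:
        mutated = [mutate(ind, rng) for ind in population]
        children = [crossover(*rng.sample(mutated, 2), rng) for _ in mutated]
        # Simplified selection: keep only the n_max best individuals.
        population = sorted(mutated + children, key=objective)[:n_max]
        if objective(population[0]) < objective(best):
            best, stale = population[0], 0
        else:
            stale += 1
        iterations += 1
    return best


# Toy problem: individuals are integers and the objective is the absolute value.
result = evolve([40, -35, 17, 22],
                objective=abs,
                mutate=lambda x, r: x + r.choice((-1, 0, 1)),
                crossover=lambda a, b, r: (a + b) // 2,
                n_max=4, i_all=200, i_stop=50,
                rng=random.Random(0))
print(abs(result) <= 17)  # True: the best individual never gets worse
```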
5.2 Genetic Algorithm Operations
The selection operation has two parameters: $N^{max}_{as}$, the maximal number of axiom sets in the population, and $p \in [0, 1]$, the fraction of best axiom sets that survive selection.
During selection the next generation population is formed from the $\lfloor N^{max}_{as} \cdot p \rfloor$ axiom sets with the lowest objective function value and $N^{max}_{as} - \lfloor N^{max}_{as} \cdot p \rfloor$ axiom sets chosen randomly from the current population.
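A possible reading of the selection step in Python follows. One detail is an assumption of this sketch: the random part is drawn without replacement from the individuals that are not already among the best, so that no individual is selected twice. The text only says that the remaining individuals are chosen randomly from the current population.

```python
import math
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def select(population: List[T], objective: Callable[[T], float],
           n_max: int, p: float, rng: random.Random) -> List[T]:
    ranked = sorted(population, key=objective)
    n_best = math.floor(n_max * p)
    survivors = ranked[:n_best]               # best individuals always survive
    rest = ranked[n_best:]
    n_random = min(n_max - n_best, len(rest))
    # Assumption: the random part is sampled from the non-elite individuals.
    return survivors + rng.sample(rest, n_random)


population = list(range(10))                  # toy individuals, objective = value
chosen = select(population, objective=lambda v: v, n_max=5, p=0.4,
                rng=random.Random(0))
print(chosen[:2], len(chosen))                # [0, 1] 5
```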
Mutation and crossover are defined on three lev-
els: elementary condition level, axiom level, axiom
set level.
5.2.1 Elementary Condition Level
The mutation operation at this level alters parameter values of an elementary condition:
$$ec(t, X, p) \to \begin{cases} ec(t, X, p) & \text{with probability } 1 - P^{mut}_{ec} \\ ec(t, X, p') & \text{with probability } P^{mut}_{ec} \end{cases} \tag{12}$$
Here $p' = m_{ec}(p, \Delta^{mut}_{ec})$ is a new set of parameter values of elementary condition $ec$, and $m_{ec}(p, \Delta^{mut}_{ec})$ is a mutation function that alters the parameters of $ec$. This function is specific for each type of elementary condition.
Parameters of this operation are $P^{mut}_{ec}$, the mutation probability, and $\Delta^{mut}_{ec}$, the degree of mutation (it determines how much the parameters of the elementary condition change).
The crossover operation produces an elementary condition from two parent elementary conditions of the same type:
$$(ec_1(t, X, p_1), ec_2(t, X, p_2)) \to ec_{12}(t, X, p_{12}) \tag{13}$$
The new elementary condition $ec_{12}$ has the same type as its parents, and each parameter value is inherited from one of the parents. For each parameter, the parent from which to inherit its value is chosen randomly. The probability of crossover of an elementary condition $P^{cr}_{ec}$ is a parameter of this operation.
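For the range condition (5), mutation and crossover at the elementary condition level could look as follows. The concrete mutation function (shifting the bounds by a random share of the interval width) and the repair step that restores the order of the bounds are assumptions of this sketch; the paper only states that the mutation function is specific to each type of elementary condition.

```python
import random
from typing import Dict

Params = Dict[str, float]   # parameters a, b, l, r of condition (5)


def mutate_ec(p: Params, p_mut: float, degree: float,
              rng: random.Random) -> Params:
    """With probability p_mut, shift the bounds a and b by at most degree * (b - a)."""
    if rng.random() >= p_mut:
        return dict(p)
    width = p["b"] - p["a"]
    a = p["a"] + rng.uniform(-degree, degree) * width
    b = p["b"] + rng.uniform(-degree, degree) * width
    return {**p, "a": min(a, b), "b": max(a, b)}


def crossover_ec(p1: Params, p2: Params, rng: random.Random) -> Params:
    """Inherit every parameter from a randomly chosen parent, then restore a <= b."""
    child = {name: rng.choice((p1[name], p2[name])) for name in p1}
    if child["a"] > child["b"]:
        child["a"], child["b"] = child["b"], child["a"]
    return child


rng = random.Random(1)
p1 = {"a": 0.0, "b": 1.0, "l": 1, "r": 2}
p2 = {"a": 0.5, "b": 2.0, "l": 3, "r": 0}
print(mutate_ec(p1, p_mut=0.0, degree=0.5, rng=rng) == p1)  # True: no mutation
child = crossover_ec(p1, p2, rng)
print(child["l"] in (1, 3), child["a"] <= child["b"])        # True True
```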
5.2.2 Axiom Level
The mutation operation at this level works on axioms:
$$a \to \begin{cases} a & \text{with probability } 1 - P^{mut}_{a} \\ m_a(a, \Delta^{mut}_{a}) & \text{with probability } P^{mut}_{a} \end{cases} \tag{14}$$
Here $m_a(a, \Delta^{mut}_a)$ is a mutation function that randomly adds a new elementary condition to the axiom, removes an elementary condition from the axiom, replaces an elementary condition with a randomly generated one, or changes the boolean operation between two elementary conditions (i. e. replaces and with or or vice-versa, adds or removes not).
Parameters of this operation are $P^{mut}_a$, the mutation probability, and $\Delta^{mut}_a$, the degree of mutation (i. e. the fraction of affected elementary conditions).
The crossover operation at this level generates a new axiom containing a combination of elementary conditions of the two parent axioms:
$$(a_1, a_2) \to a_{12} \tag{15}$$
Each elementary condition in $a_{12}$ is either inherited from one of its parents or generated as a result of crossover between two elementary conditions of the same type randomly selected from different parents. The probability of crossover of an axiom $P^{cr}_a$ is a parameter of this operation.
5.2.3 Axiom Set Level
The mutation operation at this level is defined as follows:
$$as \to \begin{cases} as & \text{with probability } 1 - P^{mut}_{as} \\ m_{as}(as, \Delta^{mut}_{as}) & \text{with probability } P^{mut}_{as} \end{cases} \tag{16}$$
Here $m_{as}(as, \Delta^{mut}_{as})$ is a mutation function that adds an axiom to or removes an axiom from the axiom set $as$.
Parameters of this operation are $P^{mut}_{as}$, the mutation probability, and $\Delta^{mut}_{as}$, the degree of mutation (i. e. the fraction of affected axioms).
The crossover operation at this level generates a new axiom set from two parent axiom sets:
$$(as_1, as_2) \to as_{12}$$
Each axiom in $as_{12}$ is either inherited from one of its parents or generated as a result of crossover between two axioms randomly selected from different parents. The probability of crossover of an axiom set $P^{cr}_{as}$ is a parameter of this operation.
AModificationofTrainingandRecognitionAlgorithmsforRecognitionofAbnormalBehaviorofDynamicSystems
107
5.3 Selection of Parameters of Genetic
Algorithm Operations
Each individual axiom set from the population and
also each axiom from these axiom sets has param-
eters associated with it which control the mutation
and crossover operations. To select these parame-
ters for each individual on each step of the genetic
algorithm we follow (Kovalenko et al., 2010) and in-
troduce functions that evaluate population individuals
and their parts and use them to adjust parameters of
genetic algorithm operations for axiom sets and ax-
ioms. We call these functions evaluation functions.
We define an evaluation function $M_{as}$ for axiom sets as follows:
$$M_{as} = c_1 e_I + c_2 e_{II} + c_3 \frac{\varphi_s(e_I, e_{II})}{\varphi_{min}(e_I, e_{II})} \tag{17}$$
Here $e_I$, $e_{II}$ are the numbers of type I and type II errors; $\varphi_s(e_I, e_{II})$ is the objective function value for the axiom set $as$ on the training set $\widetilde{TS}$; $\varphi_{min}(e_I, e_{II})$ is the lowest objective function value in the population; $c_i$ are given positive constants.
We define an evaluation function $M_a$ for axioms as follows:
$$M_a = c_4 M_{as} + c_5 \frac{Sec(\widetilde{TS}) - num_a}{Sec(\widetilde{TS})} + c_6 \frac{L - ref_a}{L} \tag{18}$$
Here $M_{as}$ is the evaluation function value for the axiom set that contains axiom $a$; $Sec(\widetilde{TS})$ is the number of abnormal behavior segments in the training set $\widetilde{TS}$; $num_a$ is the number of points in the training set $\widetilde{TS}$ on which $a$ is true; $L$ is the number of reference trajectories; $ref_a$ is the number of points in reference trajectories on which $a$ is true; $c_i$ are given constants.
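A worked numerical instance of the evaluation functions (17) and (18) may help. All constants and counts below are invented for illustration. Note that (17) divides by the lowest objective value in the population; a zero value there would already have triggered the first termination condition of the algorithm, so the sketch does not guard against it.

```python
def evaluate_axiom_set(e1: int, e2: int, phi_s: float, phi_min: float,
                       c1: float, c2: float, c3: float) -> float:
    """Evaluation function (17) for an axiom set."""
    return c1 * e1 + c2 * e2 + c3 * phi_s / phi_min


def evaluate_axiom(m_as: float, sec: int, num_a: int, n_classes: int, ref_a: int,
                   c4: float, c5: float, c6: float) -> float:
    """Evaluation function (18) for an axiom inside an axiom set with value m_as."""
    return (c4 * m_as
            + c5 * (sec - num_a) / sec
            + c6 * (n_classes - ref_a) / n_classes)


# Hypothetical values: 3 type I errors, 1 type II error, objective 4 vs. best 2.
m_as = evaluate_axiom_set(e1=3, e2=1, phi_s=4.0, phi_min=2.0, c1=1.0, c2=1.0, c3=1.0)
print(m_as)   # 6.0

m_a = evaluate_axiom(m_as, sec=10, num_a=4, n_classes=2, ref_a=1,
                     c4=1.0, c5=1.0, c6=1.0)
print(round(m_a, 3))   # 7.1
```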
Parameters of mutation and crossover operations for axiom sets and axioms are determined according to the value of evaluation functions:
$$[P^{mut}_{as}, \Delta^{mut}_{as}, P^{cr}_{as}] = F_1(M_{as}) \tag{19}$$
$$[P^{mut}_{a}, \Delta^{mut}_{a}, P^{cr}_{a}, P^{mut}_{ec}, \Delta^{mut}_{ec}, P^{cr}_{ec}] = F_2(M_a) \tag{20}$$
Functions $F_1$ and $F_2$ are chosen so as to satisfy the following conditions:
All parameters of mutation and crossover operations should take a value within $[0, 1]$.
All parameters of the mutation operation, $P^{mut}_{ec}$, $\Delta^{mut}_{ec}$, $P^{mut}_{a}$, $\Delta^{mut}_{a}$, $P^{mut}_{as}$, $\Delta^{mut}_{as}$, should be directly proportional to the corresponding evaluation function.
All parameters of the crossover operation, $P^{cr}_{ec}$, $P^{cr}_{a}$, $P^{cr}_{as}$, should be inversely proportional to the corresponding evaluation function.
Adjusting the parameters of mutation and crossover operations at every step for each individual improves the convergence of the algorithm and yields better results.
6 DTW-BASED MARKING
SEARCH ALGORITHM
In section 4 we point out that if an axiom system $As$ is the result of superset construction from an axiom set $as = \{a_1, a_2, \ldots, a_m\}$, each axiom $a' \in As$ corresponds to a subset of $as$. We can say that the marking $J$ of a trajectory $X$ by the axiom system $As$ corresponds to a sequence $\tilde{J}$ of sets $\tilde{j}_t \subseteq as$:
$$\tilde{J} = (\tilde{j}_1, \tilde{j}_2, \ldots, \tilde{j}_k) \tag{21}$$
Here $\tilde{j}_t = \{ j : a_j(t, X) = true \}$. We will denote the set sequence $\tilde{J}$ corresponding to marking $J$ as $s(J)$.
In this paper we propose a marking search algo-
rithm that analyses axiom set sequences correspond-
ing to markings. The proposed algorithm is based on
DTW distance between two finite sequences of arbi-
trary elements (Keogh and Pazzani, 2001).
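For reference, a textbook DTW distance over sequences of arbitrary elements can be written as below. This is the plain dynamic-programming formulation without normalization or warping-window constraints; whether the authors apply any such refinements is not stated, so this sketch should be read as an assumption. The element distance `dist` is supplied by the caller.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def dtw(a: Sequence[T], b: Sequence[T], dist: Callable[[T, T], float]) -> float:
    """Classic DTW: minimal accumulated element distance over all warping paths."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)          # row for the empty prefix of a
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cur.append(dist(x, y) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]


def trivial(s1: frozenset, s2: frozenset) -> float:
    return 0.0 if s1 == s2 else 1.0


reference = [frozenset({1}), frozenset({2})]
stretched = [frozenset({1}), frozenset({1}), frozenset({2})]
print(dtw(stretched, reference, trivial))  # 0.0: a pure time distortion costs nothing
```

The example shows why DTW suits time-distorted segments: repeating an element of the reference marking does not increase the distance.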
The goal of a marking search algorithm is to find
segments which correspond to abnormal behavior tra-
jectory markings in the observed trajectory marking.
The algorithm uses sliding window approach and con-
sists in the following:
1. We choose the point in the observed trajectory marking $J$ from which we start recognition:
$$t = \min_{l = 1, \ldots, L} (N^{min}_l) \tag{22}$$
Here
$N^{min}_l = \lfloor (1 - s) \cdot len(X^l_{Anom}) \rfloor$ is the minimal window length for recognition of class $l$ abnormal behavior segments. We use not one but several window lengths to better tackle time distortions.
$s \in (0, 1)$ is a parameter that controls the minimal and maximal window lengths (relative to the length of the reference trajectory).
2. For each class $l$ whose reference trajectory marking length is not greater than $t$ we check the following conditions:
$$\begin{aligned}
DTW(s(J^l_{Anom}), s(J_{t - N^{min}_l : t})) &\le p_l \\
DTW(s(J^l_{Anom}), s(J_{t - N^{min}_l - 1 : t})) &\le p_l \\
&\ \ \vdots \\
DTW(s(J^l_{Anom}), s(J_{t - N^{max}_l + 1 : t})) &\le p_l
\end{aligned} \tag{23}$$
Here
$N^{max}_l = \lceil (1 + s) \cdot len(X^l_{Anom}) \rceil$ is the maximal window length for recognition of class $l$ abnormal behavior segments.
$J_{t - N^{min}_l : t}$ is the marking of the segment of the observed trajectory from point $(t - N^{min}_l)$ to point $t$.
$p_l$ is a parameter which determines how close the marking of a segment should be to the reference trajectory marking for it to be considered an abnormal behavior segment.
If any of the conditions (23) is met, the corresponding segment is considered a class $l$ abnormal behavior segment.
3. Move to the next point ($t \leftarrow t + 1$). If $t = len(X) + 1$ then the algorithm stops. Otherwise the algorithm proceeds from item 2.
Note that we compute all DTW distances from (23) in one go, i. e. with time complexity $O(N^{max}_l \cdot len(X^l_{Anom}))$, using the approach described in (Müller, 2007).
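One way to obtain all distances for a fixed end point in a single pass is sketched below; it is an assumption of this example and not necessarily the exact procedure from (Müller, 2007). DTW is unchanged when both sequences are reversed, and the accumulated cost matrix holds the DTW distance for every prefix of the second sequence, so running DTW on the reversed reference and the reversed longest window yields the distances for all window lengths at once. Indices here are 0-based with half-open windows, a single class is searched, and the names `dtw_all_suffixes` and `find_segments` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def dtw_all_suffixes(ref: Sequence[T], window: Sequence[T],
                     dist: Callable[[T, T], float]) -> Dict[int, float]:
    """Return {n: DTW(ref, window[-n:])} for n = 1..len(window) in one pass."""
    r, w = list(reversed(ref)), list(reversed(window))
    inf = float("inf")
    prev = [0.0] + [inf] * len(w)
    for x in r:
        cur = [inf]
        for j, y in enumerate(w, start=1):
            cur.append(dist(x, y) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    # The last row holds DTW(reversed ref, first n elements of reversed window).
    return {n: prev[n] for n in range(1, len(w) + 1)}


def find_segments(marking: Sequence[T], ref: Sequence[T], s: float,
                  threshold: float,
                  dist: Callable[[T, T], float]) -> List[Tuple[int, int]]:
    n_min = max(1, math.floor((1 - s) * len(ref)))
    n_max = math.ceil((1 + s) * len(ref))
    hits = []
    for t in range(n_min, len(marking) + 1):       # t is the exclusive window end
        window = marking[max(0, t - n_max):t]
        costs = dtw_all_suffixes(ref, window, dist)
        ok = [n for n in range(n_min, len(window) + 1) if costs[n] <= threshold]
        if ok:
            best = min(ok, key=costs.get)          # closest matching window length
            hits.append((t - best, t))
    return hits


same = lambda a, b: 0.0 if a == b else 1.0
observed = [0, 0, 1, 2, 2, 3, 0, 0]
print(find_segments(observed, ref=[1, 2, 3], s=0.5, threshold=0.0, dist=same))
# [(2, 6)]
```

The reference marking `[1, 2, 3]` is found as the stretched segment `[1, 2, 2, 3]` occupying positions 2 to 5 of the observed marking.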
7 EXPERIMENTAL EVALUATION
During experiments we used the genetic axiom sys-
tem construction algorithm described in section 5 to-
gether with DTW-based marking search algorithm de-
scribed in section 6. The experiments were conducted
on artificial data generated by using a software pro-
gram that can generate a set of precedents with given
characteristics and given reference abnormal behav-
ior trajectories. The values in the points of normal be-
havior segments were generated so that they obey the
Gaussian distribution. The abnormal behavior seg-
ments were generated as stretched or squeezed ref-
erence abnormal behavior trajectories with Gaussian
noise applied to them.
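The data generation described above can be imitated with a short sketch. The generator used in the experiments is not described in detail, so everything here is an assumption: linear interpolation is used for stretching and squeezing, the time distortion is a uniformly drawn factor, normal behavior is standard Gaussian noise, and the names `resample` and `make_trajectory` are hypothetical.

```python
import random
from typing import List, Tuple


def resample(ref: List[float], factor: float) -> List[float]:
    """Stretch (factor > 1) or squeeze (factor < 1) ref by linear interpolation."""
    n = max(2, round(len(ref) * factor))
    out = []
    for k in range(n):
        pos = k * (len(ref) - 1) / (n - 1)     # position on the original time axis
        i = int(pos)
        frac = pos - i
        nxt = ref[min(i + 1, len(ref) - 1)]
        out.append(ref[i] * (1 - frac) + nxt * frac)
    return out


def make_trajectory(ref: List[float], n_normal: int, time_dist: float,
                    noise_std: float,
                    rng: random.Random) -> Tuple[List[float], Tuple[int, int]]:
    """Gaussian normal behavior, one distorted copy of ref, normal behavior again."""
    factor = 1.0 + rng.uniform(-time_dist, time_dist)
    segment = [v + rng.gauss(0.0, noise_std) for v in resample(ref, factor)]
    before = [rng.gauss(0.0, 1.0) for _ in range(n_normal)]
    after = [rng.gauss(0.0, 1.0) for _ in range(n_normal)]
    start = len(before)
    return before + segment + after, (start, start + len(segment))


reference = [0.0, 1.0, 2.0, 3.0, 4.0]
trajectory, (start, end) = make_trajectory(reference, n_normal=20, time_dist=0.1,
                                           noise_std=0.05, rng=random.Random(0))
print(start, end - start)   # segment position and its distorted length
```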
We compared the results and training time of recognizers constructed by the existing algorithm described in (Kovalenko et al., 2010) with those of recognizers constructed by the algorithm proposed in this paper.
7.1 Distance Functions
To be able to compute the DTW distance between two sequences of subsets of $as$ we need to define a distance function on $2^{as}$, i. e. a function $d : 2^{as} \times 2^{as} \to [0, 1]$ which measures the degree of difference between two subsets of $as$. We used the following distance functions during experiments:
1. Trivial distance function:
$$d(as_1, as_2) = \begin{cases} 1, & as_1 \ne as_2 \\ 0, & as_1 = as_2 \end{cases} \tag{24}$$
2. Distance function based on Jaccard coefficient (Tan et al., 2005):
$$d(as_1, as_2) = \begin{cases} 1 - \dfrac{|as_1 \cap as_2|}{|as_1 \cup as_2|}, & \text{if } as_1 \cup as_2 \ne \emptyset \\ 0, & \text{otherwise} \end{cases} \tag{25}$$
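3. Normalized Hamming metric (Hamming, 1950):
$$d(as_1, as_2) = 1 - \frac{|\overline{as_1 \,\triangle\, as_2}|}{|as|} \tag{26}$$
where $\triangle$ is the symmetric difference and the overline denotes the complement in $as$, i. e. $d$ is the fraction of axioms of $as$ that belong to exactly one of the two subsets.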
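4. The following distance function proposed by the authors:
$$d(as_1, as_2) = \begin{cases} 1 - \dfrac{|as_1 \cap as_2|}{\min(|as_1|, |as_2|)}, & \text{if } as_1 \cap as_2 \ne \emptyset \\ 1, & \text{if } as_1 \cap as_2 = \emptyset, \ as_1 \cup as_2 \ne \emptyset \\ 0, & \text{otherwise} \end{cases} \tag{27}$$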
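The four distance functions are easy to compare on a small example. Subsets of the axiom set are represented as frozensets of axiom numbers. For the Hamming metric the sketch assumes the standard normalized form (size of the symmetric difference divided by the total number of axioms `m`); the function names are hypothetical.

```python
from typing import FrozenSet

AxiomSubset = FrozenSet[int]   # numbers of the axioms from `as` that are true at a point


def d_trivial(s1: AxiomSubset, s2: AxiomSubset) -> float:
    return 0.0 if s1 == s2 else 1.0


def d_jaccard(s1: AxiomSubset, s2: AxiomSubset) -> float:
    union = s1 | s2
    return 1.0 - len(s1 & s2) / len(union) if union else 0.0


def d_hamming(s1: AxiomSubset, s2: AxiomSubset, m: int) -> float:
    """Assumed form: share of the m axioms that belong to exactly one subset."""
    return len(s1 ^ s2) / m


def d_proposed(s1: AxiomSubset, s2: AxiomSubset) -> float:
    common = s1 & s2
    if common:
        return 1.0 - len(common) / min(len(s1), len(s2))
    return 1.0 if (s1 | s2) else 0.0


x, y = frozenset({1}), frozenset({1, 2, 3})
print(d_trivial(x, y))            # 1.0
print(round(d_jaccard(x, y), 3))  # 0.667
print(d_hamming(x, y, m=4))       # 0.5
print(d_proposed(x, y))           # 0.0
```

The last line highlights a property of the proposed function: when one subset is contained in the other, the distance is zero, whereas the other functions still report a difference.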
7.2 Results
The results for a dataset with time and amplitude distortions of abnormal segments relative to reference trajectories of up to 10%, test trajectory length of 3000 points and 2 abnormal behavior classes are shown in Table 1. The results show that we were able to achieve better recognition quality and notably higher training speed when we used the Hamming metric and the proposed metric compared to the existing algorithm: the number of type I errors dropped from 17 to at most 1, and the training time from 2 h 12 min to at most 17 min. The proposed metric behaved slightly better. Other metrics demonstrated results that were worse than the results for the existing algorithm.
Table 1: Results for a dataset with time and amplitude distortions up to 10%, test trajectory length of 3000 points and 2 abnormal behavior classes. $e_I$ is the number of type I errors, $e_{II}$ is the percentage of type II errors.

Algorithm            e_I   e_II   Training time
Existing algorithm   17    0%     2 h 12 min
Trivial metric       6     5%     51 min
Jaccard metric       27    0%     1 h 17 min
Hamming metric       1     0%     17 min
Proposed metric      0     0%     15 min
The results for a dataset with higher time and amplitude distortions (up to 30%), test trajectory length of 3000 points and 2 abnormal behavior classes are shown in Table 2. The results show that we were able to achieve better recognition quality and less than half the training time when we used the Hamming metric and the proposed metric compared to the existing algorithm. The proposed metric again behaved slightly better.
Table 2: Results for a dataset with time and amplitude distortions up to 30%, test trajectory length of 3000 points and 2 abnormal behavior classes.

Algorithm            e_I   e_II   Training time
Existing algorithm   18    0%     2 h 33 min
Trivial metric       29    0%     1 h 33 min
Jaccard metric       52    0%     1 h 18 min
Hamming metric       6     0%     54 min
Proposed metric      1     0%     53 min
Overall, the results show that recognizers constructed with the proposed algorithm and using either the Hamming metric or the metric proposed by the authors achieve better recognition quality while requiring less training time, even in the presence of amplitude and time distortions of up to 30%.
8 CONCLUSIONS
This paper considers the problem of automatic construction of algorithms that recognize segments of abnormal behavior in multidimensional phase trajectories of dynamic systems. The recognizers are constructed using a set of examples of normal and abnormal behavior of the system. We employ the axiomatic approach to abnormal behavior recognition to construct recognizers of abnormal behavior. In this paper we propose a modification of the way a set of axioms is transformed into an axiom system during recognizer construction. This modification entails modification of the training and recognition algorithms. We present a modified genetic recognizer construction algorithm and a DTW-based marking search algorithm.
Experimental evaluation of the proposed algorithms shows that they decreased the number of errors by an order of magnitude compared to the old training and recognition algorithms, and that recognizer training took less than half of the time required by the old algorithms.
REFERENCES
Cover, T. and Hart, P. (1967). Nearest neighbor pattern clas-
sification. Information Theory, IEEE Transactions on,
13(1):21–27.
Hamming, R. W. (1950). Error detecting and error correct-
ing codes. Bell System Tech. J., 29:147–160.
Hassani, H. (2007). Singular spectrum analysis: Method-
ology and comparison. Journal of Data Science,
5(2):239–257.
Haykin, S. (1998). Neural Networks: A Comprehensive
Foundation. Prentice Hall PTR, Upper Saddle River,
NJ, USA, 2nd edition.
Keogh, E. J. and Pazzani, M. J. (2001). Derivative dynamic
time warping. In First SIAM International Conference
on Data Mining (SDM2001).
Kostenko, V. A. and Shcherbinin, V. V. (2013). Training
methods and algorithms for recognition of nonlinearly
distorted phase trajectories of dynamic systems. Opti-
cal Memory and Neural Networks, 22:8–20.
Kovalenko, D. S., Kostenko, V. A., and Vasin, E. A. (2005).
Investigation of applicability of algebraic approach to
analysis of time series. In Proceedings of II Interna-
tional Conference on Methods and Tools for Informa-
tion Processing, pages 553–559. (in Russian).
Kovalenko, D. S., Kostenko, V. A., and Vasin, E. A. (2010).
A genetic algorithm for construction of recognizers
of anomalies in behaviour of dynamical systems. In
Proceedings of 5th IEEE Int. Conf. on Bio Inspired
Computing: Theories and Applications, pages 258–
263. IEEE Press.
Müller, M. (2007). Information Retrieval for Music and
Motion. Springer-Verlag New York, Inc., Secaucus,
NJ, USA.
Rudakov, K. V. and Chekhovich, Y. V. (2003). Algebraic
approach to the problem of synthesis of trainable al-
gorithms for trend revealing. Doklady Mathematics,
67(1):127–130.
Tan, P.-N., Steinbach, M., and Kumar, V. (2005). Introduc-
tion to Data Mining, (First Edition). Addison-Wesley
Longman Publishing Co., Inc., Boston, MA, USA.
Vapnik, V. (1998). Statistical Learning Theory. Wiley-
Interscience.
Vorontsov, K. V. (2004). Combinatorial substantiation of
learning algorithms. Journal of Comp. Maths Math.
Phys, 44(11):1997–2009.
IJCCI2013-InternationalJointConferenceonComputationalIntelligence
110