is more important. Indeed, applying the backward window makes it
possible to extend continuation patterns with patterns containing
events that are unordered within an interval of the window size. Thus,
where GSP finds no frequent pattern, our approach searches the
redundant information exposed by the backward window and extracts a
frequent pattern. The approach presented in (Hirate and Yamana, 2006)
takes as input a sequence database, a minsupp value, time constraints
and a time function used to align timestamps. That approach and ours
use the same pattern extraction process, based on the PrefixSpan
algorithm. The main difference lies in the amount of data considered.
Whereas the sliding window groups events gradually, relative to the
window size, the level function only groups events whose timestamps
fall in the same level. We therefore obtain more frequent patterns,
because the sliding window gradually groups events that are close in
time. Concrete results are presented in the next section.
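The contrast between the two grouping strategies can be illustrated with a minimal sketch. It is not the implementation of either approach: the function names, the integer-division level function and the forward-looking window are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Event = Tuple[int, str]  # (timestamp, item)


def group_by_level(events: List[Event], step: int) -> Dict[int, List[str]]:
    """Group events whose timestamps fall in the same fixed level.

    Assumption: the level of a timestamp t is t // step.
    """
    groups: Dict[int, List[str]] = {}
    for t, item in sorted(events):
        groups.setdefault(t // step, []).append(item)
    return groups


def group_by_sliding_window(events: List[Event], ws: int) -> List[List[str]]:
    """For each event, collect the items of all events at most ws later."""
    ordered = sorted(events)
    windows = []
    for i, (t, _) in enumerate(ordered):
        windows.append([item for u, item in ordered[i:] if u - t <= ws])
    return windows


events = [(0, "a"), (90, "b"), (110, "c"), (250, "d")]
levels = group_by_level(events, step=100)
windows = group_by_sliding_window(events, ws=100)
print(levels)   # b and c fall in different levels
print(windows)  # b and c share a window
```

In this toy input, events b and c are only 20 time units apart, yet the fixed levels separate them, whereas a window anchored at b contains both. This is the kind of additional grouping that can yield more frequent patterns.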
Algorithm 2: Principal.
Input: SDB, minsupp, mingap, maxgap, min_whole_interval,
       max_whole_interval, ws, Patterns
Patterns = null;
Find frequent items in SDB;
foreach frequent item I do
    prefix = (0, I);
    SDB|(0,I) = Projection(SDB, (0,I), 0);
    foreach pair (δt_f, I_f) in Find Frequent Pairs(SDB|(0,I), C1, C2) do
        newprefix = concat(prefix, (δt_f, I_f));
        if newprefix satisfies min_whole_interval and max_whole_interval then
            SDB|(0,I)|(δt_f,I_f) = Projection(SDB|(0,I), (δt_f, I_f), ws);
            Projection*(SDB|(0,I)|(δt_f,I_f), minsupp, mingap, maxgap,
                        min_whole_interval, max_whole_interval,
                        newprefix, Patterns);
            if newprefix ∉ Patterns then
                Add newprefix to Patterns;
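The whole-interval test applied to newprefix in Algorithm 2 can be sketched as follows. This is only an illustration under a stated assumption: each time value in a prefix is taken to be the gap to the previous event, so the whole interval is the sum of the gaps. The helper names are hypothetical.

```python
from typing import Tuple

Pair = Tuple[int, str]          # (gap to the previous event, item)
Prefix = Tuple[Pair, ...]       # ((0, item), (gap, item), ...)


def concat(prefix: Prefix, pair: Pair) -> Prefix:
    """Return a new prefix extended with one (gap, item) pair."""
    return prefix + (pair,)


def whole_interval(prefix: Prefix) -> int:
    """Time elapsed between the first and the last event of the prefix.

    Assumption: gaps are relative to the previous event, so they add up.
    """
    return sum(gap for gap, _ in prefix)


def satisfies_whole_interval(prefix: Prefix, min_whole: int, max_whole: int) -> bool:
    """Check the min_whole_interval and max_whole_interval constraints."""
    return min_whole <= whole_interval(prefix) <= max_whole


prefix = concat(concat(((0, "a"),), (10, "b")), (10, "c"))
print(whole_interval(prefix))                    # 20
print(satisfies_whole_interval(prefix, 5, 30))   # True
print(satisfies_whole_interval(prefix, 25, 30))  # False
```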
5 EXPERIMENTS
In this section, we present a qualitative evaluation of our approach.
First, we describe the data used in our experiments. Then, we detail a
performance evaluation of the processes used by the approaches, to
motivate the method we chose to implement our work. Finally, we compare
our implementation with a GSPM implementation of the pattern-growth
process.
Algorithm 3: Projection*
Input: SDB, minsupp, mingap, maxgap, min_whole_interval,
       max_whole_interval, prefix, Patterns
foreach pair (f(t), I) in Find Frequent Pairs(SDB, mingap, maxgap) do
    newprefix = concat(prefix, (f(t), I));
    if newprefix satisfies min_whole_interval and max_whole_interval then
        if support(f(t), I) ≥ minsupp then
            Projection*(SDB|(f(t)_p, I_p), minsupp, mingap, maxgap,
                        min_whole_interval, max_whole_interval,
                        newprefix, Patterns);
            Add newprefix to Patterns;
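The recursive growth performed by Projection* can be approximated by the following simplified sketch. It is not the implementation evaluated in this paper: it matches exact gaps instead of applying a time function f(t), ignores the window ws and min_whole_interval, projects on the first matching occurrence only, and measures each gap from the previous event. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Event = Tuple[int, str]                    # (timestamp, item)
Pattern = Tuple[Tuple[int, str], ...]      # ((0, item), (gap, item), ...)
Projected = List[Tuple[int, List[Event]]]  # (anchor timestamp, remaining events)


def project(entries: Projected, gap: int, item: str) -> Projected:
    """Keep the events following the first `item` found exactly `gap` after the anchor."""
    result = []
    for anchor, suffix in entries:
        for i, (t, it) in enumerate(suffix):
            if it == item and t - anchor == gap:
                result.append((t, suffix[i + 1:]))
                break
    return result


def grow(entries: Projected, prefix: Pattern, minsupp: int, mingap: int,
         maxgap: int, max_whole: int, patterns: Dict[Pattern, int]) -> None:
    """Extend `prefix` with every frequent (gap, item) pair, then recurse."""
    counts: Counter = Counter()
    for anchor, suffix in entries:
        # A set counts each (gap, item) pair at most once per sequence.
        counts.update({(t - anchor, it) for t, it in suffix
                       if mingap <= t - anchor <= maxgap})
    for (gap, item), support in counts.items():
        new_prefix = prefix + ((gap, item),)
        whole = sum(g for g, _ in new_prefix)
        if support < minsupp or whole > max_whole:
            continue
        patterns[new_prefix] = support
        grow(project(entries, gap, item), new_prefix,
             minsupp, mingap, maxgap, max_whole, patterns)


def mine(sdb: List[List[Event]], minsupp: int, mingap: int,
         maxgap: int, max_whole: int) -> Dict[Pattern, int]:
    """Find frequent items, then grow a pattern from each of them."""
    patterns: Dict[Pattern, int] = {}
    items = Counter(it for seq in sdb for it in {i for _, i in seq})
    for item, support in items.items():
        if support < minsupp:
            continue
        prefix = ((0, item),)
        patterns[prefix] = support
        entries = []
        for seq in sdb:
            for i, (t, it) in enumerate(seq):
                if it == item:
                    entries.append((t, seq[i + 1:]))
                    break
        grow(entries, prefix, minsupp, mingap, maxgap, max_whole, patterns)
    return patterns


sdb = [
    [(0, "a"), (10, "b"), (20, "c")],
    [(5, "a"), (15, "b"), (25, "c")],
    [(0, "b"), (10, "c")],
]
result = mine(sdb, minsupp=2, mingap=0, maxgap=15, max_whole=30)
for pattern, support in result.items():
    print(pattern, support)
```

On this toy database the sketch reports, among others, the pattern ((0, a), (10, b), (10, c)) with support 2. Lowering max_whole to 15 prunes it, since its whole interval is 20.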
Data Description. We applied our algorithms to real aeronautical data
covering the life history of six aircraft of the same type. These data
record missions, reports made on different parts of the vehicles, and
the execution of equipment maintenance tasks. They are organized as
temporal sequences. A sequence is built by accumulating the successive
events that occurred on an aircraft between two occurrences of a
specific maintenance task. The preprocessed sequences, taken from all
vehicles and each ending with the application of the same maintenance
task, are lists of temporal events preceding the execution of that
task. Extracting patterns from this database consists in identifying
common usages that lead to the application of this maintenance task. It
makes it possible to distinguish maintenance operations that share
common root causes. Table 1 shows a sample of the sequence history for
the task op_m1.

Table 1: Sample of preprocessed sequences.
ID   Sequence
S1   ⟨(t=0, taxi, sale), (t=223, PARAPUBLIC, sandy), (t=300, EMS, normal),
     (t=330, report_1), (t=490, PARAPUBLIC, normal), (t=520, op_m1)⟩
S2   ⟨(t=0, PARAPUBLIC, sandy), (t=190, taxi, normal), (t=324, OEM, salt),
     (t=500, op_m1)⟩
S3   ⟨(t=0, EMS, normal), (t=190, taxi, salt), (t=340, PARAPUBLIC, normal),
     (t=390, report_1), (t=400, op_m1)⟩

We used a GSP implementation available in WEKA¹, which does not
implement any time constraints. We also modified an implementation of
(Fournier-Viger et al., 2008)² to obtain the GSPM implementation. We
modified the same
¹ http://www.cs.waikato.ac.nz/~ml/weka/
² http://www.philippe-fournier-viger.com/spmf
TIME CONSTRAINTS EXTENSION ON FREQUENT SEQUENTIAL PATTERNS