the number of occurrences of $m_{i,n-1}$ (i.e. the
n-ary relation $m_{i,n}$ without the last binary
relation $R_{n-1,n}(C_{n-1}, C_n)$).
Definition 6. Cover Rate. The cover rate $T_c(m_{i,n})$
of an n-ary relation $m_{i,n}$ is the ratio of the
number of occurrences of $m_{i,n}$ to the number of
occurrences of the final class $C_n$ of the n-ary
relation $m_{i,n}$.
When an n-ary relation $m_{i,n}$ satisfies these
criteria, $m_{i,n}$ is called a signature (Benayadi and
Le Goc, 2008b). For $T_a = 25\%$ and $T_c = 20\%$, all
the n-ary relations of the set M of the illustrative
example are signatures (S = M). These signatures are
the only relations (patterns) that are linked with the
car system.
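The signature test can be sketched in a few lines of Python. This is a minimal illustration, not the TOM4L implementation. It assumes that the anticipation rate is the ratio of the occurrences of m_{i,n} to the occurrences of m_{i,n-1}, and that a relation is a signature when both rates reach their thresholds (a "greater than or equal" comparison is assumed). The names `RelationCounts`, `anticipation_rate`, `cover_rate` and `is_signature` are hypothetical, and the counts used in the example are invented and do not come from the car example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationCounts:
    """Occurrence counts for one n-ary relation m_{i,n} (hypothetical container)."""
    occurrences: int              # occurrences of m_{i,n}
    prefix_occurrences: int       # occurrences of m_{i,n-1}
    final_class_occurrences: int  # occurrences of the final class C_n


def anticipation_rate(c: RelationCounts) -> float:
    # Assumed reading: occurrences of m_{i,n} over occurrences of m_{i,n-1}.
    if c.prefix_occurrences == 0:
        return 0.0
    return c.occurrences / c.prefix_occurrences


def cover_rate(c: RelationCounts) -> float:
    # Definition 6: occurrences of m_{i,n} over occurrences of C_n.
    if c.final_class_occurrences == 0:
        return 0.0
    return c.occurrences / c.final_class_occurrences


def is_signature(c: RelationCounts, ta: float = 0.25, tc: float = 0.20) -> bool:
    # Assumption: both rates must be at least equal to their thresholds.
    return anticipation_rate(c) >= ta and cover_rate(c) >= tc


# Invented counts, for illustration only.
frequent = RelationCounts(occurrences=3, prefix_occurrences=4,
                          final_class_occurrences=6)
rare = RelationCounts(occurrences=1, prefix_occurrences=10,
                      final_class_occurrences=4)
print(is_signature(frequent), is_signature(rare))
```

The default thresholds mirror the values 25% and 20% quoted above. Raising either threshold shrinks the set S of signatures relative to M.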
4 DISCUSSION
To evaluate the performance of the TOM4L process, we
report the results obtained on the car example
(Section 2) by the TOM4L process and by three popular
timed data mining algorithms: Winepi (Mannila et al.,
1997), AprioriAll (Agrawal and Srikant, 1995) and
Minepi (Mannila et al., 1997). The comparison shows
that the TOM4L process outperforms Winepi, AprioriAll
and Minepi in terms of both the number of discovered
patterns and their accuracy. Table 1 and Figure 3
show the difference in the number of discovered
patterns. Furthermore, TOM4L discovers patterns which
are consistent with the structural model of the car
system, while most of the patterns discovered by
Winepi, AprioriAll and Minepi contradict this
structural model.
In addition, Winepi, AprioriAll and Minepi each
require a set of parameters to be tuned, so the
discovered patterns depend on the values of these
parameters (Mannila, 2002). To obtain interesting
patterns, one must find the ideal set of parameters,
which requires some a priori knowledge about the car
system, whereas acquiring such knowledge is precisely
the global aim of Data Mining techniques.
Other experiments were made on sequences generated by
complex dynamic processes, such as a blast furnace
process. They show that the TOM4L approach converges
towards a minimal set of operational relations and
outperforms Winepi, AprioriAll and Minepi.
5 CONCLUSIONS
This paper presents the basis of the TOM4L process
for discovering temporal knowledge from timed messages
generated by a monitored dynamic process. The TOM4L
process is based on four steps: (1) a stochastic
representation of a given set of sequences, from which
is induced (2) a minimal set of timed binary
relations; (3) an abductive reasoning is then used to
build a minimal set of n-ary relations, which is used
to find (4) the most representative n-ary relations
according to the given set of sequences. The induction
and the abductive reasoning are based on an
interestingness measure of the timed binary relations
that allows eliminating the relations having no
meaning according to the given set of sequences. Our
experiment on a very simple illustrative process, the
car system, shows that the TOM4L process outperforms
the approaches from the literature.
REFERENCES
Agrawal, R. and Srikant, R. (1995). Mining sequential pat-
terns. Proceedings of the 11th International Confer-
ence on Data Engineering (ICDE95), pages 3–14.
Benayadi, N. and Le Goc, M. (2008a). Discovering
temporal knowledge from a crisscross of timed
observations. In Proceedings of the 18th European
Conference on Artificial Intelligence (ECAI’08),
University of Patras, Patras, Greece.
Benayadi, N. and Le Goc, M. (2008b). Using a measure of
the crisscross of series of timed observations to dis-
cover timed knowledge. In Proceedings of the 19th
International Workshop on Principles of Diagnosis
(DX’08), Blue Mountains, Australia.
Blachman, N. M. (1968). The amount of information that
y gives about x. IEEE Transactions on Information
Theory, IT-14.
Le Goc, M. (2006). Notion d’observation pour le diagnostic
des processus dynamiques: Application à Sachem et à
la découverte de connaissances temporelles. HDR,
Faculté des Sciences et Techniques de Saint Jérôme.
Mannila, H. (2002). Local and global methods in data min-
ing: Basic techniques and open problems. 29th In-
ternational Colloquium on Automata, Languages and
Programming.
Mannila, H., Toivonen, H., and Verkamo, A. I. (1997). Dis-
covery of frequent episodes in event sequences. Data
Mining and Knowledge Discovery, 1(3):259–289.
Shannon, C. E. (1949). Communication in the presence of
noise. Institute of Radio Engineers, 37.