all time values by fitting kernel density estimators and
computing their local minima. For every local minimum,
BUTLA creates a subevent. This preprocessing is intended
to remove the need for split operations.
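The preprocessing step described above can be illustrated with a minimal sketch. It is not BUTLA's actual implementation: the Gaussian kernel, the fixed bandwidth, the grid-based minimum search, and the function names (gaussian_kde, local_minima, split_into_subevents) are all assumptions made for illustration. The sketch also assumes that the local minima act as split points between subevents.

```python
import bisect
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def local_minima(samples, bandwidth, grid_size=200):
    """Find local minima of the density on an evenly spaced grid (assumed search strategy)."""
    density = gaussian_kde(samples, bandwidth)
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / (grid_size - 1)
    xs = [lo + i * step for i in range(grid_size)]
    ys = [density(x) for x in xs]
    return [
        xs[i]
        for i in range(1, grid_size - 1)
        if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]
    ]


def split_into_subevents(samples, minima):
    """Label each time value with the index of the interval between minima it falls into."""
    return [bisect.bisect_right(minima, t) for t in samples]


# Hypothetical time values of one event, forming two clearly separated groups.
times = [1.0, 1.2, 0.9, 1.1, 10.0, 10.3, 9.8, 10.1]
minima = local_minima(times, bandwidth=1.0)
labels = split_into_subevents(times, minima)
```

With these example values, the density has a single valley between the two groups, so the time values of the event are divided into two subevents.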
Klerx et al. (2014) present a Probabilistic Deterministic
Timed-Transition Automaton (PDTTA) and an
algorithm for learning PDTTAs. Unlike BUTLA, the
learning algorithm does not split events, and unlike
RTI+, it does not split transitions based on time values.
Instead, it learns the event structure using any
state-of-the-art algorithm (e.g., ALERGIA (Carrasco and
Oncina, 1994)) and approximates the time values per
transition via kernel density estimators. Hence, it models
the time behavior in more detail and is easier to learn,
but it cannot detect temporal substructures.
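The idea of approximating time values per transition via kernel density estimators can be sketched as follows. This is an illustrative sketch under stated assumptions, not the PDTTA learning algorithm itself: the transition keys, the Gaussian kernel, the fixed bandwidth, and the function name fit_transition_kdes are hypothetical, and the event structure is assumed to have been learned already.

```python
import math
from collections import defaultdict


def fit_transition_kdes(observations, bandwidth):
    """Group time values by transition and build one Gaussian KDE per transition."""
    by_transition = defaultdict(list)
    for transition, delay in observations:
        by_transition[transition].append(delay)

    def make_density(samples):
        norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
        return lambda x: norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return {t: make_density(s) for t, s in by_transition.items()}


# Hypothetical observations: ((source state, symbol), time value).
observations = [
    (("q0", "a"), 2.0), (("q0", "a"), 2.2), (("q0", "a"), 1.9),
    (("q1", "b"), 30.0), (("q1", "b"), 31.0),
]
kdes = fit_transition_kdes(observations, bandwidth=0.5)

# Density of a time value close to the training data versus one far away from it.
typical = kdes[("q0", "a")](2.0)
unusual = kdes[("q0", "a")](30.0)
```

Because each transition keeps its own density, a time value that is common for one transition can still receive a very low density on another. The time values within a single transition are never split, which is consistent with the remark that temporal substructures are not detected.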
8 CONCLUSION
RTI+ is an efficient algorithm that learns PDRTAs
from timed sequences. We have revealed a deficit of
RTI+ in learning broadened intervals for time values.
Combined with the independence of symbol and time
probability distributions, this deficit leads to wrong
predictions of sequences. We have found that
two of the three types of gaps cause the broadened
intervals and developed our novel IDA procedure to remove
those gaps. IDA has been integrated into the RTI+
algorithm, which we now call Extended RTI+ (eRTI+).
We have shown that IDA is an effective way to elim-
inate the disadvantage of the independent time and
symbol probability distributions used by RTI+. For
our experiment with an artificial example PDRTA, IDA
was able to identify and remove all gaps in intervals.
IDA was also able to improve the results in the
experiment with ATM fraud detection. Although IDA did
not work optimally on this real-world data, we are
confident that this result can be improved further. IDA is a
very flexible and adaptable procedure. As mentioned
in Section 6, in the future we want to apply IDA after
RTI+ has terminated instead of integrating it into the
procedure. Furthermore, we plan to replace our
statistical outlier detection with established clustering
algorithms and density estimation procedures.
REFERENCES
Ankerst, M., Breunig, M. M., Kriegel, H.-P., and Sander,
J. (1999). OPTICS: Ordering Points to Identify the
Clustering Structure. In SIGMOD’99, ACM Interna-
tional Conference on Management of Data, pages 49–
60. ACM.
Carrasco, R. C. and Oncina, J. (1994). Learning Stochas-
tic Regular Grammars by Means of a State Merging
Method. In ICGI’94, 2nd International Colloquium
on Grammatical Inference and Applications, pages
139–152. Springer.
Dima, C. (2001). Real-Time Automata. Journal of Automata,
Languages and Combinatorics, 6(1):3–23.
Drucker, H., Burges, C. J. C., Kaufman, L., Smola, A. J.,
and Vapnik, V. (1997). Support Vector Regression Ma-
chines. In NIPS’96, 9th Neural Information Processing
Systems Conference, pages 155–161. MIT Press.
Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. (1996). A
Density-Based Algorithm for Discovering Clusters in
Large Spatial Databases with Noise. In KDD’96, 2nd
International Conference on Knowledge Discovery and
Data Mining, pages 226–231. AAAI Press.
Klerx, T., Anderka, M., Kleine Büning, H., and Priesterjahn,
S. (2014). Model-Based Anomaly Detection for Dis-
crete Event Systems. In ICTAI’14, 26th IEEE Interna-
tional Conference on Tools with Artificial Intelligence,
pages 665–672. IEEE Computer Society.
Lang, K. J., Pearlmutter, B. A., and Price, R. A. (1998). Re-
sults of the Abbadingo One DFA Learning Competition
and a New Evidence-Driven State Merging Algorithm.
In ICGI’98, 4th International Colloquium Conference
on Grammatical Inference, pages 1–12. Springer.
Leys, C., Ley, C., Klein, O., Bernard, P., and Licata, L.
(2013). Detecting Outliers: Do Not Use Standard
Deviation around the Mean, Use Absolute Deviation
around the Median. Journal of Experimental Social
Psychology, 49(4):764–766.
Maier, A. (2015). Identification of Timed Behavior Mod-
els for Diagnosis in Production Systems. PhD thesis,
University of Paderborn.
Parzen, E. (1962). On Estimation of a Probability Den-
sity Function and Mode. The Annals of Mathematical
Statistics, 33(3):1065–1076.
Pelleg, D. and Moore, A. W. (2000). X-means: Extending
K-means with Efficient Estimation of the Number of
Clusters. In ICML’00, 7th International Conference on
Machine Learning, pages 727–734. Morgan Kaufmann
Publishers Inc.
Tukey, J. W. (1977). Exploratory Data Analysis. Pearson.
Verwer, S., de Weerdt, M., and Witteveen, C. (2010). A
Likelihood-Ratio Test for Identifying Probabilistic De-
terministic Real-Time Automata from Positive Data.
In ICGI’10, 10th International Colloquium Conference
on Grammatical Inference, pages 203–216. Springer.
Verwer, S., de Weerdt, M., and Witteveen, C. (2012). Efficiently
Identifying Deterministic Real-Time Automata from
Labeled Data. Machine Learning, 86(3):295–333.
ICPRAM 2017 - 6th International Conference on Pattern Recognition Applications and Methods