press them. Its algorithm is hence specific to this
assumption. We compared CkTail with Assess and,
as expected, we showed that Assess builds imprecise
models when event logs include communications.
6 CONCLUSION
This paper has proposed CkTail, an approach that
learns models of communicating systems from event
logs. Our algorithm improves model precision by
identifying the dependency relations among
components and by better detecting sessions in event
logs to extract traces. Unlike CSight, which targets
the same kind of systems, CkTail requires only one
event log as input. It then builds execution traces
while trying to recognise complete sessions with
respect to 4 constraints, whereas the other
approaches rely on one or two rules for trace
segmentation. The constraints used by CkTail are
specific to communicating systems and restrict the
trace generation w.r.t. the association of
requests/responses, time delay, data dependency, and
component identification. In addition, CkTail infers
DAGs showing the component dependencies. These DAGs
offer another viewpoint on the component interactions
and system architecture, and they may be used for
different purposes, e.g., testability measurement or
security analysis.
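As a rough illustration of two of the ideas summarised above, the following Python sketch splits a log into candidate sessions using a time-delay rule only, and derives dependency edges from requests. It is not the CkTail algorithm: the event tuple layout, the function names, the `max_delay` threshold and the rule "the sender of a request depends on its receiver" are all assumptions made for this example, and the three other constraints are ignored.

```python
from collections import defaultdict

# Hypothetical event layout (an assumption, not CkTail's log format):
# (timestamp, sender, receiver, kind) with kind in {"req", "resp"}.

def split_sessions(events, max_delay):
    """Cut the log wherever two consecutive events are more than
    max_delay apart. Only the time-delay idea is illustrated here."""
    sessions, current, last_ts = [], [], None
    for ev in sorted(events, key=lambda e: e[0]):
        if last_ts is not None and ev[0] - last_ts > max_delay:
            sessions.append(current)
            current = []
        current.append(ev)
        last_ts = ev[0]
    if current:
        sessions.append(current)
    return sessions

def dependency_edges(events):
    """Map each component to the set of components it sends requests to.
    Assumed rule: a request sender depends on the request receiver."""
    deps = defaultdict(set)
    for _ts, sender, receiver, kind in events:
        if kind == "req":
            deps[sender].add(receiver)
    return dict(deps)

# Toy log with two bursts of activity separated by a long pause.
events = [
    (0, "A", "B", "req"), (1, "B", "C", "req"),
    (2, "C", "B", "resp"), (3, "B", "A", "resp"),
    (20, "A", "B", "req"), (21, "B", "A", "resp"),
]
sessions = split_sessions(events, max_delay=5)
edges = dependency_edges(events)
```

On this toy log the pause between timestamps 3 and 20 exceeds `max_delay`, so the log is cut there, and the edges A to B and B to C form a small acyclic dependency graph.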
As future work, we first plan to evaluate CkTail
on further kinds of systems, e.g., Web service
compositions. The trace analysis step relies upon some
assumptions for finding sessions in event logs when
these sessions are not identified by means of a
session mechanism. However, if sessions are clearly
identified in messages, these assumptions can be
relaxed and the algorithm simplified. We will
investigate this possibility in future work by
redesigning the first step of CkTail so that it also
supports session identification.
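To make the simplification concrete, here is a minimal Python sketch of trace extraction when every message already carries a session identifier. The dictionary keys `session` and `ts` and the function name are hypothetical, chosen for the example; this is not part of CkTail as described in the paper.

```python
from collections import defaultdict

def traces_by_session(events, key="session"):
    """Group events by an explicit session identifier and order each
    group by timestamp. Field names are assumptions for illustration."""
    groups = defaultdict(list)
    for ev in events:
        groups[ev[key]].append(ev)
    return [sorted(group, key=lambda e: e["ts"]) for group in groups.values()]

log = [
    {"ts": 2, "session": "s1", "msg": "b"},
    {"ts": 1, "session": "s1", "msg": "a"},
    {"ts": 3, "session": "s2", "msg": "c"},
]
traces = traces_by_session(log)
```

With explicit identifiers, no heuristic on delays or request/response matching is needed to delimit sessions, which is why the assumptions of the trace analysis step could be relaxed in that case.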
ACKNOWLEDGEMENT
Research supported by the French Project
VASOC (Auvergne-Rhône-Alpes Region):
https://vasoc.limos.fr/
REFERENCES
Ammons, G., Bodík, R., and Larus, J. R. (2002). Mining
specifications. SIGPLAN Not., 37(1):4–16.
Beschastnikh, I., Brun, Y., Ernst, M. D., and
Krishnamurthy, A. (2014). Inferring models of concurrent
systems from logs of their behavior with CSight. In
Proceedings of the 36th International Conference on
Software Engineering, ICSE 2014, pages 468–479,
New York, NY, USA. ACM.
Biermann, A. and Feldman, J. (1972). On the synthesis of
finite-state machines from samples of their behavior.
Computers, IEEE Transactions on, C-21(6):592–597.
Fu, Q., Lou, J.-G., Wang, Y., and Li, J. (2009). Execu-
tion anomaly detection in distributed systems through
unstructured log analysis. 2009 Ninth IEEE Interna-
tional Conference on Data Mining, pages 149–158.
Groz, R., Li, K., Petrenko, A., and Shahbaz, M. (2008).
Modular system verification by inference, testing and
reachability analysis. In Suzuki, K., Higashino, T.,
Ulrich, A., and Hasegawa, T., editors, Testing of Soft-
ware and Communicating Systems, pages 216–233,
Berlin, Heidelberg. Springer Berlin Heidelberg.
Krka, I., Brun, Y., Popescu, D., Garcia, J., and
Medvidovic, N. (2010). Using dynamic execution traces and
program invariants to enhance behavioral model
inference. In Proceedings of the 32nd ACM/IEEE
International Conference on Software Engineering - Volume
2, ICSE ’10, pages 179–182, New York, NY, USA.
ACM.
Lo, D., Mariani, L., and Santoro, M. (2012). Learning
extended FSA from software: An empirical assessment.
Journal of Systems and Software, 85(9):2063–2076.
Selected papers from the 2011 Joint Working IEEE/IFIP
Conference on Software Architecture (WICSA
2011).
Lorenzoli, D., Mariani, L., and Pezzè, M. (2008).
Automatic generation of software behavioral models. In
Proceedings of the 30th International Conference on
Software Engineering, ICSE’08, pages 501–510, New
York, NY, USA. ACM.
Makanju, A., Zincir-Heywood, A. N., and Milios, E. E.
(2012). A lightweight algorithm for message type ex-
traction in system application logs. IEEE Transactions
on Knowledge and Data Engineering, 24(11):1921–
1936.
Mariani, L. and Pastore, F. (2008). Automated identification
of failure causes in system logs. In Software Reliabil-
ity Engineering, 2008. ISSRE 2008. 19th International
Symposium on, pages 117–126.
Mariani, L., Pastore, F., and Pezzè, M. (2011).
Dynamic analysis for diagnosing integration faults. IEEE
Transactions on Software Engineering, 37(4):486–
508.
Messaoudi, S., Panichella, A., Bianculli, D., Briand, L., and
Sasnauskas, R. (2018). A search-based approach for
accurate identification of log message formats. In Pro-
ceedings of the 26th Conference on Program Compre-
hension, ICPC ’18, pages 167–177, New York, NY,
USA. ACM.
Ohmann, T., Herzberg, M., Fiss, S., Halbert, A., Palyart,
M., Beschastnikh, I., and Brun, Y. (2014). Behavioral
resource-aware model inference. In Proceedings of
the 29th ACM/IEEE International Conference on Au-
tomated Software Engineering, ASE ’14, pages 19–
30, New York, NY, USA. ACM.
Pastore, F., Micucci, D., and Mariani, L. (2017). Timed k-
tail: Automatic inference of timed automata. In 2017