Termination of the Chandy-Lamport Algorithm:
Checkpoints. The algorithm by Szymanski, Shy,
and Prywes (the SSP algorithm, for short) detects
when each process has reached its termination condi-
tion. Running both the Chandy-Lamport and the SSP
algorithms, each process can thus locally and anony-
mously detect when all processes have computed their
local snapshot.
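The counter rule at the heart of the SSP algorithm can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not taken from the text above: rounds are synchronous, each process knows an upper bound on the network diameter, and the local condition ("my local snapshot is computed") never reverts once it holds. The names `ssp_round` and `run_ssp` are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[int, List[int]]  # adjacency lists of an undirected graph


def ssp_round(graph: Graph, done: Dict[int, bool],
              counters: Dict[int, int]) -> Dict[int, int]:
    """One synchronous round: every node recomputes its counter
    from the counters of the previous round."""
    new_counters = {}
    for v, neighbours in graph.items():
        if not done[v]:
            # The local condition does not hold yet: reset the counter.
            new_counters[v] = -1
        else:
            # 1 + minimum over the closed neighbourhood (v and its neighbours).
            new_counters[v] = 1 + min(counters[w] for w in [v, *neighbours])
    return new_counters


def run_ssp(graph: Graph, done: Dict[int, bool],
            diameter_bound: int, rounds: int) -> Set[int]:
    """Run the given number of rounds and return the nodes whose counter
    has reached the diameter bound, i.e. the nodes that can locally
    conclude that every process satisfies its condition."""
    counters = {v: -1 for v in graph}
    for _ in range(rounds):
        counters = ssp_round(graph, done, counters)
    return {v for v, c in counters.items() if c >= diameter_bound}


# A path 0 - 1 - 2 - 3 (diameter 3) in which every process is done.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
all_done = {v: True for v in path}
print(run_ssp(path, all_done, diameter_bound=3, rounds=4))
```

In this sketch a counter value k at a node means that all processes within distance k satisfied their condition, so no identifiers are needed; a single process that is not done keeps the counters of all others below the bound.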
Combining these two algorithms gives us a fundamental
debugging feature: it defines checkpoints for
the distributed algorithm of interest. First, in case of
system failure, the computation can be restarted from
the last valid checkpoint. Second, our debugger can
offer step-by-step forward and rewind functionalities.
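The two uses of checkpoints described above (restart after a failure, and stepping forward or backward) can be sketched with a simple log of recorded global states. The class `CheckpointLog` and its method names are hypothetical and are not part of the ViSiDiA API; the sketch also keeps all checkpoints in one place, whereas the approach of this paper is distributed.

```python
import copy
from typing import Any, Dict, List

GlobalState = Dict[str, Any]  # hypothetical: process id -> recorded local state


class CheckpointLog:
    """Ordered list of checkpoints with a cursor for stepping."""

    def __init__(self) -> None:
        self._states: List[GlobalState] = []
        self._cursor = -1  # index of the checkpoint currently selected

    def record(self, state: GlobalState) -> None:
        """Store a completed snapshot as the newest checkpoint and select it."""
        self._states.append(copy.deepcopy(state))
        self._cursor = len(self._states) - 1

    def _move(self, delta: int) -> GlobalState:
        if not self._states:
            raise LookupError("no checkpoint recorded yet")
        # Clamp the cursor so it always points at an existing checkpoint.
        last = len(self._states) - 1
        self._cursor = min(max(self._cursor + delta, 0), last)
        return copy.deepcopy(self._states[self._cursor])

    def rewind(self) -> GlobalState:
        """Step back one checkpoint (stays on the oldest one)."""
        return self._move(-1)

    def forward(self) -> GlobalState:
        """Step forward one checkpoint (stays on the newest one)."""
        return self._move(+1)

    def restart_from_last(self) -> GlobalState:
        """Select the most recent checkpoint, e.g. after a failure."""
        return self._move(len(self._states))


log = CheckpointLog()
for step in (1, 2, 3):
    log.record({"p0": step})
print(log.rewind())             # state recorded at step 2
print(log.restart_from_last())  # state recorded at step 3
```

Deep copies are used so that a computation resumed from a checkpoint cannot modify the stored state.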
Global Predicates. From the execution checkpoints,
we can run the SSP algorithm once again to evaluate
GP (graph invariants). The most obvious predicate is
the termination of the monitored algorithm.
Hence, ViSiDiA provides semi-automatic debugging:
user control is required to react to the outcome of a GP
evaluation (e.g., a detected predicate could indicate a
system failure).
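As an illustration of evaluating a predicate on a checkpoint, the sketch below checks termination on a recorded global state. The `Snapshot` layout (local states plus messages recorded in transit on each channel, as in a Chandy-Lamport snapshot) and the labels "active" and "passive" are assumptions made for this example. The check is written centrally for readability; in the approach described above, the evaluation is carried out in a distributed way.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Snapshot:
    # process id -> recorded local state ("active" or "passive" in this sketch)
    states: Dict[str, str]
    # (sender, receiver) -> messages recorded as in transit on that channel
    channels: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)


def is_terminated(snap: Snapshot) -> bool:
    """Termination predicate: every process is passive and
    no message is in transit on any channel."""
    all_passive = all(s == "passive" for s in snap.states.values())
    channels_empty = all(not msgs for msgs in snap.channels.values())
    return all_passive and channels_empty


finished = Snapshot(states={"p": "passive", "q": "passive"},
                    channels={("p", "q"): [], ("q", "p"): []})
pending = Snapshot(states={"p": "passive", "q": "passive"},
                   channels={("p", "q"): ["token"], ("q", "p"): []})
print(is_terminated(finished), is_terminated(pending))
```

The second snapshot shows why channel states matter: all processes are passive, yet a message in transit could reactivate its receiver.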
A more elegant approach, still starting from the
execution checkpoints, is to apply an adaptation of the
algorithm by Mazurkiewicz (Mazurkiewicz, 1997), which
gives a distributed way to compute graph coverings.
More precisely, each process can compute a graph
of which the network graph is a covering. From
this graph, predicates can be locally analyzed and
verified; processes can then automatically react to
any system state.
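To make the notion of covering concrete, the sketch below checks the standard definition for simple undirected graphs: a map from the vertices of G onto the vertices of H is a covering if, around every vertex, it sends the neighbours one-to-one onto the neighbours of the image. It only verifies a given map and does not implement the adapted Mazurkiewicz algorithm; the names `is_covering` and `cycle` are hypothetical.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]  # adjacency lists of a simple undirected graph


def cycle(n: int) -> Graph:
    """Cycle on vertices 0..n-1 (n >= 3)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def is_covering(g: Graph, h: Graph, phi: Dict[int, int]) -> bool:
    """Return True if phi is a covering map from g onto h."""
    # phi must reach exactly the vertices of h.
    if {phi[v] for v in g} != set(h):
        return False
    for v, neighbours in g.items():
        images = [phi[w] for w in neighbours]
        # Injective on the neighbourhood of v ...
        if len(set(images)) != len(images):
            return False
        # ... and onto the neighbourhood of phi(v).
        if set(images) != set(h[phi[v]]):
            return False
    return True


# A 6-cycle covers a 3-cycle by reducing vertex numbers modulo 3.
c6, c3 = cycle(6), cycle(3)
print(is_covering(c6, c3, {v: v % 3 for v in c6}))
```

In this example, a process in an anonymous 6-cycle cannot distinguish its network from a 3-cycle by local observation, which is why processes reason about a graph covered by the network rather than about the network itself.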
4 CONCLUSIONS
In this paper, we presented a new design of the Vi-
SiDiA platform for the simulation and visualization
of distributed algorithms. We added debugging fea-
tures with a fully-distributed approach in the context
of anonymous and asynchronous networks. These are
made effortlessly accessible to users: the ViSiDiA API
contains new primitives, and the GUI offers visualiza-
tion of debugging information along with algorithm
execution. We also introduced a new method to build
our debugger.
Our proposal helps in monitoring a distributed
system, determining its global state from local
information, and detecting failures. We set up a
checkpoint and rollback-recovery system, and implemented
a semi-automatic debugger. The need for user oversight
can be removed by computing local graph coverings.
We plan to focus on this technique, and to visual-
ize the graphs within each process in a multi-scale ap-
proach. Finally, our theoretical basis can be extended
to rewriting rules and mobile agents.
REFERENCES
Bauderon, M., Gruner, S., Métivier, Y., Mosbah, M., and
Sellami, A. (2001). Visualization of distributed
algorithms based on labeled rewriting systems. In
GT-VMT’01, volume 50 of ENTCS, pages 229–239.
Bauderon, M. and Mosbah, M. (2003). A unified
framework for designing, implementing and visualizing
distributed algorithms. ENTCS, 72(3):13–24.
Ben-Ari, M. (2001). Interactive execution of distributed al-
gorithms. J. Educ. Resour. Comput., 1.
Carr, S., Fang, C., Jozwowski, T., Mayo, J., and Shene, C.-
K. (2003). Concurrent mentor: A visualization system
for distributed programming education. In PDPTA’03.
Chalopin, J., Métivier, Y., and Morsellino, T. (2011). On
snapshots and stable properties detection in
anonymous fully distributed systems. Submitted.
Chandy, K. M. and Lamport, L. (1985). Distributed snap-
shots: Determining global states of distributed sys-
tems. ACM Trans. Comput. Syst., 3(1):63–75.
Chang, X. (1999). Network simulations with OPNET, pages
307–314. ACM.
Derbel, B. and Mosbah, M. (2003). Distributing the
execution of a distributed algorithm over a network. In
INFOVIS’03, pages 485–490.
Guerraoui, R. and Ruppert, E. (2005). What can be imple-
mented anonymously? In DISC, pages 244–259.
Koldehofe, B., Papatriantafilou, M., and Tsigas, P. (2003).
Integrating a simulation-visualisation environment in
a basic distributed systems course: a case study using
LYDIAN. In ITiCSE’03, pages 35–39. ACM.
Matocha, J. and Camp, T. (1998). A taxonomy of dis-
tributed termination detection algorithms. Journal of
Systems and Software, 43(3):207–221.
Mazurkiewicz, A. (1997). Distributed enumeration. Inf.
Processing Letters, 61:233–239.
Moses, Y., Polunsky, Z., Tal, A., and Ulitsky, L. (1998).
Algorithm visualization for distributed environments.
In INFOVIS’98, pages 71–78.
Pongor, G. (1993). Omnet: Objective modular network
testbed. In MASCOTS ’93, pages 323–326.
Raynal, M. (1988). Networks and distributed computation.
MIT Press.
Stasko, J. T. and Kraemer, E. (1993). A methodology for
building application-specific visualizations of parallel
programs. J. Parallel Distrib. Comput., 18:258–264.
Szymanski, B., Shy, Y., and Prywes, N. (1985).
Synchronized distributed termination. IEEE Transactions on
Software Engineering, SE-11(10):1136–1140.
Tel, G. (2000). Introduction to distributed algorithms.
Cambridge University Press.
Yamashita, M. and Kameda, T. (1996). Computing on
anonymous networks: Part I - characterizing the
solvable cases. IEEE TPDS, 7(1):69–89.
FULLY-DISTRIBUTED DEBUGGING AND VISUALIZATION OF DISTRIBUTED SYSTEMS IN ANONYMOUS
NETWORKS