FULLY-DISTRIBUTED DEBUGGING AND VISUALIZATION OF DISTRIBUTED SYSTEMS IN ANONYMOUS NETWORKS

Cédric Aguerre, Thomas Morsellino and Mohamed Mosbah

LaBRI, CNRS, Université de Bordeaux, 351, Cours de la Libération, 33405 Talence, France

Keywords:

Distributed Algorithm, Visualization, Debugging, Anonymous Network, Snapshot, Global Predicate Evaluation.

Abstract:

The debugging of distributed algorithms is a major challenge which greatly benefits from the help of an interactive and informative human-computer interface. In this paper we present ViSiDiA, a platform for the visualization, simulation and debugging of distributed algorithms. Our approach respects real-life constraints such as process anonymity and privacy, and network synchronicity. We propose a new fully-distributed method for the debugging and monitoring of distributed systems, based on the computation of global states and global predicates from local information in anonymous and asynchronous networks. We show how the debug information can be visualized concurrently with the algorithm execution.

1 INTRODUCTION

The analysis and understanding of distributed algorithms involved in complex information systems are fundamental. These algorithms must be proved, implemented, debugged and tested in a context where several processes collaborate on the execution of a common task. The main issues concern concurrent access to resources, critical failure detection, and process communication strategy.

In recent years, several tools have addressed the simulation and visualization of distributed algorithms (Moses et al., 1998; Stasko and Kraemer, 1993; Koldehofe et al., 2003; Ben-Ari, 2001; Carr et al., 2003; Pongor, 1993; Chang, 1999). Most of them assume that processes have unique identifiers or have particular knowledge of the network. However, in large and heterogeneous networks, processes may not have unique identifiers or may not wish to divulge them for privacy reasons (Guerraoui and Ruppert, 2005). We thus focus on fully-distributed debugging issues in anonymous networks.

In this paper, we present a new design of ViSiDiA, a tool for simulating and visualizing distributed algorithms. We add debugging features to both the simulation and visualization parts. We are interested in a way of visualizing debug information along with algorithm execution, and we present our theoretical approach for debugging.

2 THE VISIDIA PLATFORM

Concept. ViSiDiA (Visualization and Simulation of Distributed Algorithms) is a tool for simulating and visualizing the execution of distributed algorithms (Bauderon et al., 2001; Derbel and Mosbah, 2003; Bauderon and Mosbah, 2003), used as an educational and research utility.

Distributed networks can be defined using an editor in ViSiDiA. Algorithms are run along with a visualization which respects the sequencing of events. The user can interact with the network whilst the simulation executes. Altogether, this makes it possible to study algorithms, to detect errors or to compute complexity.

ViSiDiA encompasses a simulation core, a Graphical User Interface (GUI), and an Application Programming Interface (API) for implementing distributed algorithms with a set of simple primitives.

Illustrative Example. For the sake of clarity, we here illustrate visual components using a standard Broadcast algorithm applied to a small distributed network with a message passing model (Yamashita and Kameda, 1996). The network is modeled as a graph, whose nodes and edges correspond to autonomous processes and to communication links, respectively. The graph is such that one node is labeled A, all the others being labeled N. We call X-node a node with label X.

http://visidia.labri.fr

Aguerre C., Morsellino T. and Mosbah M.
FULLY-DISTRIBUTED DEBUGGING AND VISUALIZATION OF DISTRIBUTED SYSTEMS IN ANONYMOUS NETWORKS.
DOI: 10.5220/0003861807640767
In Proceedings of the International Conference on Computer Graphics Theory and Applications (IVAPP-2012), pages 764-767
ISBN: 978-989-8565-02-0
© 2012 SCITEPRESS (Science and Technology Publications, Lda.)

Figure 1: Overview of the visual interface. (a): Settings are accessible on top and left panels. Algorithm executes on the network. The user can observe changes on node and edge states, as well as transiting messages. A button on left panel launches the debugger. (b-c): Debugging results displayed during algorithm execution. (b): monitoring process 2 with the state of its incoming channels. (c): some evaluated predicates such as algorithm termination.

The algorithm proceeds as follows. At any time, each A-node sends a message to each of its neighbors, and each N-node receiving a message changes its label to A and propagates the message to its neighbors. When a message is delivered, the edge between the sender and the receiver takes a special marked state.

The algorithm terminates when no N-nodes remain. All marked edges and connected nodes form a spanning tree of the initial graph.
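The behaviour described above can be reproduced outside ViSiDiA with a small sequential simulation. The following is only a sketch under stated assumptions: the class BroadcastSketch, the adjacency-list representation and the single FIFO queue standing in for the asynchronous network are hypothetical and are not part of the ViSiDiA API. As in Algorithm 1, an edge is marked when an N-node receives its first message on it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative stand-alone simulation; none of these names belong to the ViSiDiA API.
class BroadcastSketch {

    /**
     * Runs the broadcast from the single A-node `source` on an undirected graph
     * given as adjacency lists. Returns the marked edges as {sender, receiver} pairs.
     */
    static List<int[]> run(List<List<Integer>> adj, int source) {
        char[] label = new char[adj.size()];
        Arrays.fill(label, 'N');
        label[source] = 'A';

        // Assumption: one global FIFO queue of {sender, receiver} models the network.
        Deque<int[]> inTransit = new ArrayDeque<>();
        for (int nb : adj.get(source)) {
            inTransit.add(new int[] {source, nb});
        }

        List<int[]> marked = new ArrayList<>();
        while (!inTransit.isEmpty()) {
            int[] msg = inTransit.poll();
            int sender = msg[0];
            int receiver = msg[1];
            if (label[receiver] == 'N') {
                // First message received: become an A-node and mark the edge.
                label[receiver] = 'A';
                marked.add(new int[] {sender, receiver});
                // Propagate the wave to every neighbor except the sender.
                for (int nb : adj.get(receiver)) {
                    if (nb != sender) {
                        inTransit.add(new int[] {receiver, nb});
                    }
                }
            }
            // Messages reaching an A-node are simply dropped in this sketch.
        }
        return marked;
    }

    public static void main(String[] args) {
        // A 4-cycle 0-1-2-3-0 in which node 0 is the A-node.
        List<List<Integer>> adj = List.of(
            List.of(1, 3), List.of(0, 2), List.of(1, 3), List.of(2, 0));
        List<int[]> tree = run(adj, 0);
        System.out.println("marked edges: " + tree.size());
    }
}
```

On a connected graph with n nodes, every node other than the source is relabeled exactly once, so the sketch marks n - 1 edges, which is consistent with the spanning tree property stated above.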

Visual Semantics. Nodes are circles whose inner part represents a label according to a user-modifiable color palette. In Figure 1(a), inner parts of A-nodes and N-nodes are filled with red and green, respectively. A node's outer part indicates whether the node is selected (red) or not (black). Below a node appear its properties: an id, a label, or any user-defined value. The user controls which information is displayed.

Edges are line segments, or arrows if they are oriented. Default visualization uses thin black lines. Thick lines indicate marked edges. A selected edge is red, additional colors representing user-defined states. Other properties, if any, appear close to the edge.

Messages are textual information sliding from a sender node to a receiver node. Different colors are used according to the message types defined by the running algorithm. In the case of the Broadcast algorithm, there are wave messages (in red) and acknowledgements (in blue). The user can switch message display on and off during the simulation.

Dynamic changes in color, position and thickness of graph elements give an instantaneous global perception of the running algorithm. To access local or detailed information, the simulation speed can be adjusted or the simulation can be paused, edge/node properties can be modified, and the message display can be adapted to fit user requirements.

The Simulation Core. A node owns an autonomous thread which only operates on its own properties and its connected edges. As the network is anonymous, each process holds a copy of the simulated algorithm.

A unique simulation console manages requests from processes. For example, a process can ask it to send a message. The console is then responsible for delivering the message, pushing it into the receiver's FIFO-based mailbox. An event/acknowledgement system ensures the order of requests.

2.1 Monitoring and Debugging Feature

Message Passing. Consider a distributed system in

which a process crashes: the corresponding thread

is deadlocked, whereas its neighbors are still waiting


for its termination. We want our debugger to propagate the information “a neighbor has crashed”. As the debugging operator cannot be the deadlocked thread itself, we associate with each process a second thread used for debugging.

A process thus owns two threads: one for algorithm execution, the other for debugging. Both execution and debug messages now transit through the console. A process dispatches each message to the appropriate thread according to its type. The debugging thread monitors both the algorithm thread and its incoming messages. If the algorithm fails, all messages are processed by the debugging thread (Figure 2).

Figure 2: Message passing at process scale. A process contains a mailbox and two threads: one for algorithm execution, the other for debugging. Left: In the normal case, messages are routed according to their type. Right: If the algorithm crashes, the debugging thread takes over.
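The routing rule of Figure 2 can be sketched in a few lines. This is a minimal illustration and not the ViSiDiA implementation: the class ProcessRouterSketch, the Kind and Msg types and the markAlgorithmCrashed method are hypothetical names, and the two queues merely stand for the inboxes that the algorithm thread and the debugging thread would consume.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch of per-process message routing; all names are hypothetical.
class ProcessRouterSketch {
    enum Kind { EXECUTION, DEBUG }

    static final class Msg {
        final Kind kind;
        final String payload;

        Msg(Kind kind, String payload) {
            this.kind = kind;
            this.payload = payload;
        }
    }

    // Inboxes consumed by the algorithm thread and by the debugging thread.
    final Queue<Msg> algorithmInbox = new ArrayDeque<>();
    final Queue<Msg> debugInbox = new ArrayDeque<>();

    // Assumption: set by the debugging thread once it observes that the algorithm failed.
    private boolean algorithmCrashed = false;

    void markAlgorithmCrashed() {
        algorithmCrashed = true;
    }

    /** Routes one message taken from the process mailbox. */
    void route(Msg m) {
        if (m.kind == Kind.EXECUTION && !algorithmCrashed) {
            algorithmInbox.add(m);
        } else {
            // Debug messages, and every message once the algorithm has crashed.
            debugInbox.add(m);
        }
    }

    public static void main(String[] args) {
        ProcessRouterSketch p = new ProcessRouterSketch();
        p.route(new Msg(Kind.EXECUTION, "Wave"));
        p.route(new Msg(Kind.DEBUG, "marker"));
        p.markAlgorithmCrashed();
        p.route(new Msg(Kind.EXECUTION, "Wave"));
        System.out.println(p.algorithmInbox.size() + " " + p.debugInbox.size());
    }
}
```

Keeping the crash flag on the routing side means the debugging thread needs no cooperation from the deadlocked algorithm thread in order to take over its traffic.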

Visual Components. The ViSiDiA graphical interface contains a button to launch the debugger during an algorithm execution. Debug information shows up as a tree view for processes and incoming channels (Figure 1(b)), and some evaluated global predicates are listed in another panel (Figure 1(c)). This information can also be visualized when the mouse hovers over a node. See Section 3 for details on the nature of the debug information.

API Extension. Debugging features have been added to the ViSiDiA API. Algorithm developers can thus monitor the value of specified variables and test whether a global predicate holds by adding only a few lines of code. In the case of Broadcast (Algorithm 1), we follow changes in the value of process labels (registerVariable method). We have created a global predicate sp and we tell the debugger to use it (addGlobalPredicate method). Finally, we request feedback on algorithm termination (setTerminated method).

3 DEBUGGING ALGORITHMS

A debugger needs network snapshots, composed of the states of both processes and communication channels. Such global snapshots are computed using only the local information that processes exchange. These snapshots are then exploited to evaluate global predicates (GP, for short), i.e., properties which remain true once they are verified (Tel, 2000). The main motivation for GP evaluation is to react to particular situations which can occur in distributed systems.

Our solution (Chalopin et al., 2011) is a combination of the Chandy-Lamport algorithm (Chandy and Lamport, 1985) with an algorithm by Szymanski, Shy, and Prywes (Szymanski et al., 1985) to compute snapshots and to evaluate GP anonymously.

Computing Snapshots. The Chandy-Lamport algorithm determines global snapshots in which each process has computed its local snapshot within finite time. Once local snapshots are computed, this knowledge is fully distributed over the system and is then exploited by processes. From this knowledge, one can simulate a global clock (Raynal, 1988); nevertheless, this does not enable iterated computation of snapshots. Another way to exploit this knowledge is based on wave algorithms: a message is passed to each process by a single initiator according to the network topology (Matocha and Camp, 1998). These solutions are not available in the context of anonymous networks with no distinguished processes and no particular knowledge of the topology.
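The local rules that each process follows in the Chandy-Lamport algorithm can be sketched as below. This is an illustration under stated assumptions and not the debugger's actual code: the class SnapshotSketch and its methods are hypothetical, channels are assumed to be FIFO and are identified by local port numbers only (no process identifiers are needed), and sending a marker is reduced to a counter.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the local Chandy-Lamport rules for one process.
// All names are hypothetical; channels are identified by local port numbers only.
class SnapshotSketch {
    private final int ports;              // number of incident channels
    private String state;                 // current local state of the algorithm
    private boolean taken = false;        // has the local state been recorded?
    private String recordedState;         // local state saved in the snapshot
    private final boolean[] recording;    // recording[p]: incoming channel p still recorded
    private final List<List<String>> channelState = new ArrayList<>();
    private int markersReceived = 0;
    private int markersSent = 0;

    SnapshotSketch(int ports, String initialState) {
        this.ports = ports;
        this.state = initialState;
        this.recording = new boolean[ports];
        for (int p = 0; p < ports; p++) {
            channelState.add(new ArrayList<>());
        }
    }

    void setState(String s) { state = s; }

    /** Any process may start a snapshot spontaneously. */
    void initiate() {
        if (!taken) takeLocalSnapshot(-1);
    }

    private void takeLocalSnapshot(int markerPort) {
        taken = true;
        recordedState = state;
        for (int p = 0; p < ports; p++) {
            recording[p] = (p != markerPort); // the marker's channel is recorded as empty
            markersSent++;                    // a real process would send a marker on port p
        }
    }

    /** A marker arrives on the given port. */
    void onMarker(int port) {
        if (!taken) takeLocalSnapshot(port);
        recording[port] = false;
        markersReceived++;
    }

    /** An algorithm message arrives on the given port. */
    void onMessage(int port, String msg) {
        if (taken && recording[port]) {
            channelState.get(port).add(msg); // in transit when the snapshot was taken
        }
    }

    boolean isComplete() { return taken && markersReceived == ports; }
    String getRecordedState() { return recordedState; }
    List<String> getRecordedChannel(int port) { return channelState.get(port); }
    int getMarkersSent() { return markersSent; }

    public static void main(String[] args) {
        SnapshotSketch p = new SnapshotSketch(2, "N");
        p.onMarker(0);           // first marker: record state "N", watch port 1
        p.onMessage(1, "Wave");  // recorded as in transit on port 1
        p.setState("A");
        p.onMarker(1);           // closes the recording of port 1
        System.out.println(p.getRecordedState() + " " + p.getRecordedChannel(1)
            + " complete=" + p.isComplete());
    }
}
```

A process knows that its own local snapshot is complete once a marker has arrived on every port, but it cannot tell from this alone whether the other processes have finished theirs.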

    // Message types shared by all processes (static members of the Broadcast class).
    static Message wave = new Message("Wave", true);
    static Message ack = new Message("Ack", true, Color.blue);

    // Code executed by each process.
    int arity = getArity();
    String label = getProperty("label");
    registerVariable("labelstart", label);                  // debug
    addGlobalPredicate(sp);                                 // debug

    if (label.compareTo("A") == 0) {
        // An A-node sends the wave on each of its ports.
        for (int neighbor = 0; neighbor < arity; neighbor++) {
            sendTo(neighbor, Broadcast.wave);
        }
    } else {
        // An N-node waits for a message, acknowledges it and becomes an A-node.
        Door door = receiveMessage();
        int doorNum = door.getNum();
        sendTo(doorNum, Broadcast.ack);
        putProperty("label", new String("A"));
        registerVariable("labelin progress", label);        // debug
        setDoorState(new MarkedState(true), doorNum);
        // Propagate the wave on every port except the one it arrived on.
        for (int neighbor = 0; neighbor < arity; neighbor++) {
            if (neighbor != doorNum) {
                sendTo(neighbor, Broadcast.wave);
            }
        }
    }
    registerVariable("labelend", label);                    // debug
    setTerminated(true);                                    // debug

Algorithm 1: Example of a Broadcast algorithm written in Java using the ViSiDiA API. Unmarked lines (black in the original layout) are the algorithm as written without any debug procedure. Lines marked "// debug" (blue in the original layout) are the only 5 lines of code needed to enable debugging this algorithm on the visual interface.

IVAPP 2012 - International Conference on Information Visualization Theory and Applications


Termination of the Chandy-Lamport Algorithm: Checkpoints. The algorithm by Szymanski, Shy, and Prywes (the SSP algorithm, for short) detects when each process has reached its termination condition. Running both the Chandy-Lamport and the SSP algorithms, each process can thus locally and anonymously detect when all processes have computed their local snapshot.

This combination of algorithms gives us a fundamental debugging feature: it defines checkpoints for the distributed algorithm of interest. First, in case of system failure, the computation can be restarted from the last valid checkpoint. Second, our debugger can offer step-by-step forward and rewind functionalities.
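The counter rule on which this kind of detection relies can be illustrated by a small synchronous simulation. This is a simplified sketch in the spirit of SSP, not the algorithm of (Chalopin et al., 2011): it assumes synchronous rounds, a stable local condition done[v] (for instance "my local snapshot is computed"), and an upper bound on the network diameter known to every process. The class SspSketch and its methods are hypothetical names. Each process v keeps an integer a[v], equal to -1 while its condition is false and otherwise to 1 plus the minimum of a over v and its neighbors; no identifiers are used.

```java
import java.util.Arrays;

// Simplified synchronous sketch of an SSP-style counter rule; all names are hypothetical.
class SspSketch {

    /**
     * One synchronous round. adj[v] lists the neighbors of v, done[v] is the
     * (stable) local condition of v, and a holds the counters of the previous round.
     */
    static int[] round(int[][] adj, boolean[] done, int[] a) {
        int[] next = new int[a.length];
        for (int v = 0; v < a.length; v++) {
            if (!done[v]) {
                next[v] = -1;
                continue;
            }
            int min = a[v]; // closed neighborhood: v itself is included
            for (int w : adj[v]) {
                min = Math.min(min, a[w]);
            }
            next[v] = 1 + min;
        }
        return next;
    }

    /** v concludes that every process is done once its counter reaches the bound. */
    static boolean detects(int[] a, int v, int diameterBound) {
        return a[v] >= diameterBound;
    }

    public static void main(String[] args) {
        int[][] adj = {{1}, {0, 2}, {1}};      // path 0 - 1 - 2, diameter 2
        boolean[] done = {true, true, false};  // process 2 has not finished yet
        int[] a = {-1, -1, -1};
        for (int r = 0; r < 3; r++) {
            a = round(adj, done, a);
        }
        System.out.println(Arrays.toString(a) + " detects=" + detects(a, 0, 2));
        done[2] = true;                        // process 2 finishes
        for (int r = 0; r < 3; r++) {
            a = round(adj, done, a);
        }
        System.out.println(Arrays.toString(a) + " detects=" + detects(a, 0, 2));
    }
}
```

In this sketch a counter value k at process v guarantees that every process within distance k of v satisfies the condition, which is why reaching the diameter bound is enough for a local, anonymous decision. On the path of the example, no process detects anything while process 2 has not finished.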

Global Predicates. From execution checkpoints, we can run the SSP algorithm once again to evaluate GP (graph invariants). The most obvious predicate is the termination detection of the monitored algorithm. Hence, in ViSiDiA we obtain semi-automatic debugging; user control is required to react to GP evaluation (e.g., a detected predicate could be a system failure).

A more elegant approach is, still from the execution checkpoints, to apply an adaptation of the algorithm by Mazurkiewicz (Mazurkiewicz, 1997), which gives a distributed way to compute graph coverings. More precisely, each process can compute a graph of which the network graph is a covering. From this graph, predicates can be locally analyzed and verified; processes can then automatically react to any system state.

4 CONCLUSIONS

In this paper, we presented a new design of the ViSiDiA platform for the simulation and visualization of distributed algorithms. We added debugging features with a fully-distributed approach in the context of anonymous and asynchronous networks. These are made easily accessible to users: the ViSiDiA API contains new primitives, and the GUI offers visualization of debugging information along with algorithm execution. We also introduced a new method to build our debugger.

Our proposal helps in monitoring a distributed system, determining its global state from local information and detecting failures. We set up a checkpoint and rollback recovery system, and implemented a semi-automatic debugger. User oversight can be relaxed by computing local graph coverings.

We plan to focus on this technique, and to visualize the graphs within each process in a multi-scale approach. Finally, our theoretical basis can be extended to rewriting rules and mobile agents.

REFERENCES

Bauderon, M., Gruner, S., Métivier, Y., Mosbah, M., and Sellami, A. (2001). Visualization of distributed algorithms based on labeled rewriting systems. In GT-VMT’01, volume 50 of ENTCS, pages 229–239.

Bauderon, M. and Mosbah, M. (2003). A unified framework for designing, implementing and visualizing distributed algorithms. ENTCS, 72(3):13–24.

Ben-Ari, M. (2001). Interactive execution of distributed algorithms. J. Educ. Resour. Comput., 1.

Carr, S., Fang, C., Jozwowski, T., Mayo, J., and Shene, C.-K. (2003). Concurrent mentor: A visualization system for distributed programming education. In PDPTA’03.

Chalopin, J., Métivier, Y., and Morsellino, T. (2011). On snapshots and stable properties detection in anonymous fully distributed systems. Submitted.

Chandy, K. M. and Lamport, L. (1985). Distributed snapshots: Determining global states of distributed systems. ACM Trans. Comput. Syst., 3(1):63–75.

Chang, X. (1999). Network simulations with OPNET, pages 307–314. ACM.

Derbel, B. and Mosbah, M. (2003). Distributing the execution of a distributed algorithm over a network. In INFOVIS’03, pages 485–490.

Guerraoui, R. and Ruppert, E. (2005). What can be implemented anonymously? In DISC, pages 244–259.

Koldehofe, B., Papatriantafilou, M., and Tsigas, P. (2003). Integrating a simulation-visualisation environment in a basic distributed systems course: a case study using lydian. In ITiCSE’03, pages 35–39. ACM.

Matocha, J. and Camp, T. (1998). A taxonomy of distributed termination detection algorithms. Journal of Systems and Software, 43(3):207–221.

Mazurkiewicz, A. (1997). Distributed enumeration. Inf. Processing Letters, 61:233–239.

Moses, Y., Polunsky, Z., Tal, A., and Ulitsky, L. (1998). Algorithm visualization for distributed environments. In INFOVIS’98, pages 71–78.

Pongor, G. (1993). Omnet: Objective modular network testbed. In MASCOTS ’93, pages 323–326.

Raynal, M. (1988). Networks and distributed computation. MIT Press.

Stasko, J. T. and Kraemer, E. (1993). A methodology for building application-specific visualizations of parallel programs. J. Parallel Distrib. Comput., 18:258–264.

Szymanski, B., Shy, Y., and Prywes, N. (1985). Synchronized distributed termination. IEEE Transactions on Software Engineering, SE-11(10):1136–1140.

Tel, G. (2000). Introduction to distributed algorithms. Cambridge University Press.

Yamashita, M. and Kameda, T. (1996). Computing on anonymous networks: Part I - characterizing the solvable cases. IEEE TPDS, 7(1):69–89.
