of RDD transformations until they really need to be
evaluated for passing their changes to RDD actions
(which produce non-RDD values). In agent-based
graph programming, agents travel or propagate over a
graph while changing each data item. This in turn means
that, if we use the same strategy as Spark’s dataset
immutability, we need to take a snapshot every time
an agent changes a vertex. Furthermore, unlike
Spark’s RDD (i.e., a collection of data items), a graph
needs more disk space to store its serialized data
at a checkpoint and more time to deserialize
it upon a rollback. Taking these overheads into
consideration, we implemented interactive parallelization in
the MASS library as follows:
1. Maintaining Only One Snapshot of Computation:
MASS users are expected to commit their
operations to an in-memory graph once they no
longer intend to roll back beyond that checkpoint.
Keeping a single snapshot saves secondary storage space.
2. Maintaining a History of Previous MASS
Function Calls: The MASS library will keep
recording any MASS functions invoked since the
last snapshot was taken, so that MASS can rebuild
any past graph structure between the snapshot and
the latest graph state.
3. Rolling Back Computation by Re-executing
Functions in History: Upon a user-specified rollback,
the MASS library re-applies previous
function calls to the snapshot in chronological
order all the way to the rollback point. While
this rollback scheme needs substantial time to
rebuild a past graph, normal computation can
run faster because it does not continuously write
snapshots of ongoing executions to disk.
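The three points above can be illustrated with a minimal Java sketch. It is not the MASS implementation: ToyGraph, ReplaySession, and all of their methods are hypothetical names introduced only for illustration. The sketch assumes that the logged operations are deterministic, so that replaying them on a copy of the snapshot reproduces the same graph.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical stand-in for an in-memory graph: vertex id -> neighbor ids. */
class ToyGraph {
    final Map<Integer, Set<Integer>> adj = new HashMap<>();

    void addEdge(int from, int to) {
        adj.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        adj.computeIfAbsent(to, k -> new HashSet<>());
    }

    int edgeCount() {
        int n = 0;
        for (Set<Integer> neighbors : adj.values()) {
            n += neighbors.size();
        }
        return n;
    }

    /** Deep copy, used when taking or restoring the snapshot. */
    ToyGraph copy() {
        ToyGraph g = new ToyGraph();
        adj.forEach((v, ns) -> g.adj.put(v, new HashSet<>(ns)));
        return g;
    }
}

/** Keeps one snapshot plus a log of the calls made since it was taken. */
class ReplaySession {
    private ToyGraph snapshot;
    private ToyGraph current;
    private final List<Consumer<ToyGraph>> history = new ArrayList<>();

    ReplaySession(ToyGraph initial) {
        snapshot = initial.copy();
        current = initial.copy();
    }

    /** Run an operation on the live graph and record it for later replay. */
    void apply(Consumer<ToyGraph> op) {
        op.accept(current);
        history.add(op);
    }

    /** Commit: replace the single snapshot and drop the older history. */
    void checkpoint() {
        snapshot = current.copy();
        history.clear();
    }

    /** Rebuild the state that existed right after the first `step` logged calls. */
    void rollback(int step) {
        if (step < 0 || step > history.size()) {
            throw new IllegalArgumentException("no such step: " + step);
        }
        ToyGraph g = snapshot.copy();
        for (int i = 0; i < step; i++) {
            history.get(i).accept(g); // replay in chronological order
        }
        history.subList(step, history.size()).clear(); // discard undone calls
        current = g;
    }

    ToyGraph current() {
        return current;
    }
}
```

For instance, after three logged addEdge calls, rollback(1) rebuilds a graph that contains only the first edge. The cost of a rollback grows with the number of replayed calls, while apply itself never writes to disk, which is the trade-off described in item 3.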
At the highest level, InMASS is simply a wrapper
class that initializes a JShell window and receives
MASS statements from users. The challenges of
implementing InMASS, however, revolve around (1)
making JShell function properly on all cluster nodes
in a distributed environment and (2) deciding how to
save and reload computation state for checkpoint and
rollback functionality.
To address issue 1, we customized a Java class
loader named InMASSLoader that facilitates distribution
of new classes’ bytecode from the MASS master
to worker nodes. Once all computing nodes are
aware of the new classes, they use the new
MASSObjectInputStream and MASSObjectOutputStream functions
to assist in serialization and deserialization of these
dynamic classes.
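The general technique behind this approach can be sketched with standard Java APIs: a class loader that defines classes from bytecode it has received, and an object input stream that resolves class names through that loader. This is only an assumption-laden illustration, not the InMASSLoader or MASSObjectInputStream code. The names ReceivedBytecodeLoader, LoaderAwareInputStream, and DynamicClassCodec are hypothetical, and the network transfer of bytecode between nodes is omitted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical loader that defines classes from bytecode received at run time. */
class ReceivedBytecodeLoader extends ClassLoader {
    private final Map<String, byte[]> received = new ConcurrentHashMap<>();

    ReceivedBytecodeLoader(ClassLoader parent) {
        super(parent);
    }

    /** Record bytecode shipped from another node (the transport is omitted). */
    void register(String className, byte[] bytecode) {
        received.put(className, bytecode);
    }

    /** Reached only after the parent loader has failed to find the class. */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] code = received.get(name);
        if (code == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, code, 0, code.length);
    }
}

/** Object stream that resolves class names through a chosen loader first. */
class LoaderAwareInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default lookup, which also handles primitive types.
            return super.resolveClass(desc);
        }
    }
}

/** Round-trip helpers; checked exceptions are wrapped to keep call sites short. */
class DynamicClassCodec {
    static byte[] encode(Object value) {
        try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(value);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object decode(byte[] data, ClassLoader loader) {
        try (ObjectInputStream in =
                 new LoaderAwareInputStream(new ByteArrayInputStream(data), loader)) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The default ObjectInputStream resolves classes through the loader of the calling code, which does not know about classes declared interactively. Routing resolution through a loader that holds the received bytecode is one way to let a worker deserialize instances of such classes.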
To address issue 2, we had Agents and Places in-
herit AgentsInternal and PlacesInternal serializable
classes to facilitate serialization and deserialization of
all agents and places data. Then, each MASS worker
process gathers all hash tables containing all Agents
and Places instances, and holds them in one single
object named MState. This is the object to be saved
and updated on checkpoint and rollback. (Note that
users can choose where a checkpoint is stored: active
memory, a temporary disk location, or a specified file
on disk.) To let users roll back to states
other than the original checkpoint, the MASS master
process prepares the MHistory object to keep a log of
all API calls to Agents and Places and to store their
bytecode to enable re-execution on demand. Consequently,
when a user requests a rollback to “step 5”, for
example, the original snapshot is loaded from
MState, and MHistory then re-executes the first five
API calls that follow the snapshot.
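The choice of checkpoint storage can be pictured with the following Java sketch. It only assumes what the text states, namely that a worker bundles its tables into one serializable object and that the checkpoint may live in memory or in a file. WorkerState, CheckpointStore, MemoryStore, FileStore, and StateCodec are hypothetical names, and the String values stand in for real agent and place data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

/** Hypothetical per-worker state: every table bundled into one object. */
class WorkerState implements Serializable {
    private static final long serialVersionUID = 1L;
    final HashMap<Integer, String> places = new HashMap<>();
    final HashMap<Integer, String> agents = new HashMap<>();
}

/** Storage back end for the single checkpoint. */
interface CheckpointStore {
    void save(WorkerState state);
    WorkerState load();
}

/** Keeps the serialized checkpoint in active memory. */
class MemoryStore implements CheckpointStore {
    private byte[] bytes;

    public void save(WorkerState state) {
        bytes = StateCodec.encode(state);
    }

    public WorkerState load() {
        if (bytes == null) {
            throw new IllegalStateException("no checkpoint taken");
        }
        return StateCodec.decode(bytes);
    }
}

/** Keeps the serialized checkpoint in a file, temporary or user-specified. */
class FileStore implements CheckpointStore {
    private final Path file;

    FileStore(Path file) {
        this.file = file;
    }

    /** Convenience factory for the temporary-location option. */
    static FileStore temporary() {
        try {
            Path p = Files.createTempFile("checkpoint", ".bin");
            p.toFile().deleteOnExit();
            return new FileStore(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public void save(WorkerState state) {
        try {
            Files.write(file, StateCodec.encode(state));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public WorkerState load() {
        try {
            return StateCodec.decode(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Plain Java serialization of the state object to and from bytes. */
class StateCodec {
    static byte[] encode(WorkerState state) {
        try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(state);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static WorkerState decode(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (WorkerState) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because both stores serialize the state rather than keep a reference to it, changes made after save do not leak into the checkpoint, and load always returns the state as it was when the checkpoint was taken.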
3.4 Graph and Agent Visualization
To facilitate visualization and validation of agent ac-
tivities over a distributed graph, we have extended
the existing MASS-Cytoscape integration, based on
the following three implementation strategies: (1) al-
lowing users to focus on programming their graph
application, (2) following the OSGi framework to
modularize MASS-related plugins, and (3) allowing
large-scale graphs to be visualized by retrieving par-
tial graphs from MASS.
3.4.1 Usability Enhancement
Figure 4 presents an overview of the MASS-Cytoscape
architecture. The left side of the figure shows the
user’s two points of interaction: the JShell window
for running their code in MASS and the MASS Control
Panel for managing their data flow and visualization
in Cytoscape.
The MASS Control Panel serves three main func-
tions. First, it provides a single point of interaction for
the user by internally managing the data transfer plu-
gins: import-network, export-network, and import-
agents. Second, it provides the ability to manipu-
late the MASS Configuration tables that inform the
data transfer plugins of how to find the MASS com-
putation and what data to pull back into Cytoscape.
Lastly, it provides the interface and logic for visualiz-
ing agent movement through manipulation of the Cy-
toscape data tables and network view.
In MASS, the user application must start the
CytoscapeListener class to open a TCP-based
communication port for MASS-Cytoscape communication.
This listener then handles each request from
Cytoscape by first parsing the request, then obtaining
a reference to the corresponding GraphPlaces method,
and finally invoking that method and returning the
results to the requesting Cytoscape plugin. Internally,
ICAART 2022 - 14th International Conference on Agents and Artificial Intelligence