Enhancing the Performances of D-MASON

A Motivating Example

Michele Carillo, Gennaro Cordasco, Rosario De Chiara, Francesco Raia, Vittorio Scarano and Flavio Serrapica

ISISLab, Dipartimento di Informatica, Università degli Studi di Salerno, Salerno, Italy

Dipartimento di Psicologia, Seconda Università degli Studi di Napoli, Caserta, Italy

Keywords:

Agent-based Simulation, Load-balancing, Visualization of Distributed Models, Performance Evaluation.

Abstract:

Agent-based simulation models are an increasingly popular tool for research and management in many diverse fields. In executing such simulations, “speed” is one of the most general and important issues, and the traditional answer is to invest resources in a dedicated installation of computers, with highly specialized parallel applications, devoted to achieving extreme computational performance.

In this paper we present our experience with D-MASON, a distributed version of MASON, a well-known and popular library for writing and running agent-based simulations. D-MASON introduces parallelization at the framework level, so that scientists who use the framework (domain experts with limited knowledge of distributed programming) need be only minimally aware of the distribution.

The framework allowed only a static decomposition of the work among workers and was not able to cope with load imbalance among them; it therefore incurred serious performance degradation when, for example, many of the agents were concentrated in one specific part of the space. We devised two strategies to improve the balancing and enhance the synchronization among workers. We present their design principles and the experimental tests that validate our approach.

1 INTRODUCTION

Agent-Based simulation Models (ABMs) are an increasingly popular tool for research and management in many diverse fields such as biology, ecology, economics and political science. In some fields, such as the social sciences, ABMs are seen as a key instrument (López-Paredes et al., 2012) of the generative approach (Epstein, 2007), essential for understanding complex social phenomena. The relevance and effectiveness of ABMs have recently been recognized in policy making and economics as well (The Economist, 2010; Organisation for Economic Co-operation and Development, Global Science Forum, 2009).

The computer science community has responded to the need for tools and platforms that help the development and testing of new models in each specific field by providing tools, libraries and frameworks that speed up and simplify the task of (massive) simulations. Several important issues in evaluating different platforms for ABMs, well identified in the reviews (Berryman, 2008; Najlis et al., 2001; Railsback et al., 2006), are speed of execution, flexibility, reproducibility, documentation, open-source availability and facilities for recording and analyzing data.

Our work is based on D-MASON, a parallel version of the MASON library for writing and running simulations of ABMs. D-MASON addresses, in particular, the speed of execution without harming the other features that characterize MASON. The intent of D-MASON is to provide an effective and efficient way of parallelizing MASON ABMs: effective because with D-MASON you can do more (e.g. faster and/or larger simulations) than you can with MASON; efficient because, in order to obtain this additional computing power, the developer only has to make incremental modifications to the MASON ABMs already written, without re-designing them.

While D-MASON is efficient and its scalability has been shown to be high (Cordasco et al., 2011; Cordasco et al., 2012), no load balancing mechanism is available, which prevents any kind of dynamic adaptation to possible imbalance in the spatial decomposition of the world where the agents are located. Such a mechanism is extremely important in a class of simulations where “spatially defined” goals have to be pursued (i.e. a specific position must be searched for and located by the agents) and where all the agents are likely to gather, thereby loading one node of the distributed system more than the others.

Carillo M., Cordasco G., De Chiara R., Raia F., Scarano V. and Serrapica F. Enhancing the Performances of D-MASON - A Motivating Example. DOI: 10.5220/0004060501370143. In Proceedings of the 2nd International Conference on Simulation and Modeling Methodologies, Technologies and Applications (SIMULTECH-2012), pages 137-143. ISBN: 978-989-8565-20-4. © 2012 SCITEPRESS (Science and Technology Publications, Lda.)

In this paper we describe how the parallelization of a specific simulation gave rise to two modifications of D-MASON that had measurable positive effects on both performance and scalability.

Distributed Simulations. Research in the many fields that use simulation toolkits for ABMs is often conducted interactively, since the “generative” paradigm described in (Epstein, 2007) prescribes an iterative methodology in which models are designed, tested and refined until an outcome is generated with a simple generative approach. In this context, given that domain scientists often are not computer scientists, they usually do not have access to high-performance computing systems for long periods; they have to perform preliminary studies within their limited resources and only later (if needed) move to extensive testing at large supercomputing centers. In the social sciences, for example, the need for “the capacity to model and make up in parallel, reactive and cognitive systems, and the means to observe their interactions and emerging effects” (Conte and Castelfranchi, 1995) has clearly outlined, since 1995, the need for flexible yet powerful tools.

In this scenario, D-MASON’s goal is to offer such scientists a setting where a traditional MASON program can first be run on one desktop, but can immediately harness the power of other desktops in the same laboratory (available, perhaps, during off-peak hours) by using D-MASON, thereby scaling up the size of the simulations they can treat or significantly reducing the time needed for each iteration.

Of course, since the resulting distributed system, which collects hardware from research labs, administration offices, etc., is highly heterogeneous in nature, the challenge tackled by D-MASON is also how to use all the hardware efficiently without an impact on the “legitimate” user (i.e., the owner of the desktop), in terms of both performance and installation/customization of the machine. On the other hand, one of the objectives pursued by D-MASON is that a MASON program should not be very different from the corresponding D-MASON program, so that the scientist can easily modify it to run over an increasing number of hosts.

The need for efficiency among Agent-Based modeling tools is well recognized in the literature: many reviews of state-of-the-art frameworks (Berryman, 2008; Najlis et al., 2001; Railsback et al., 2006) place “speed” upfront as one of the most general and important issues. While considerable work has been done to allow the distribution of agents over several computing nodes (see (Collier and North, 2011; Mengistu et al., 2008; Pawlaszczyk and Strassburger, 2009)), D-MASON takes a different approach in principle: distribution is introduced at the framework level, so that scientists who use the framework (domain experts with limited knowledge of computer programming and systems) can be unaware of it. Whereas several works in this field (Collier and North, 2011; Mengistu et al., 2008; Pawlaszczyk and Strassburger, 2009) directly affect the implementation and the architecture of a distributed agent model (dealing with lazy synchronization, etc.), D-MASON’s approach concentrates on the upper layer of the simulation framework, thereby hiding, as much as possible, the details of the architecture. In this way, D-MASON provides a certain degree of backward compatibility with pre-existing MASON applications, making it cost-effective to port a sequential implementation to a distributed setting.

Outline of the Paper. The rest of the paper is orga-

nized as follows: Section 2 introduces D-MASON. In

Section 3, we brieﬂy discuss our motivating example

Ants Foraging and introduce the two improvements

on D-MASON. In Section 4 we report on and discuss

the tests we performed on the enhanced version of D-

MASON. In Section 5, we conclude and discuss some

possible extensions of this work.

2 MASON AND D-MASON

Before presenting the features of D-MASON, we will,

brieﬂy, introduce MASON.

MASON. The MASON toolkit is a discrete-event simulation core and visualization library written in Java, designed to be used for a wide range of ABMs. The toolkit is composed of two independent layers: the simulation layer and the visualization layer. The simulation layer is the core of MASON and mainly consists of an event scheduler and a variety of fields which hold agents in a given simulation space. MASON is mainly based on step-able agents: a step-able agent is a computational entity which may be scheduled to perform some action (step), and which can interact (communicate) with other agents. The visualization layer permits both visualization and manipulation of the model.

SIMULTECH2012-2ndInternationalConferenceonSimulationandModelingMethodologies,Technologiesand

Applications

138

Figure 1: Field partitioning.

D-MASON. D-MASON adds a new layer named D-simulation, which extends the MASON simulation layer. The new layer adds features that allow the distribution of the simulation work on multiple, even heterogeneous, machines (workers). Notice that the new layer does not alter the existing layers in any way. Moreover, it has been designed to enable the porting of existing applications to distributed platforms in a transparent and easy way.

From a functional point of view, the D-MASON architecture is divided into three blocks: Management, Workers and Communication. The Management layer provides a master application which is used for coordinating the workers, handling the bootstrap and running the simulation. D-MASON is based on a master/workers paradigm which exploits a space partitioning approach: the master partitions the space to be simulated (the field) into regions (see Figure 1). Each region, together with the agents contained in it, is assigned to a worker; each worker is then in charge of simulating the agents that belong to the assigned region, handling the migration of agents, and managing the synchronization between neighboring regions (this information exchange is required in order to let the simulation run consistently). Workers communicate by using a publish-subscribe mechanism.
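The space partitioning described above can be illustrated with a minimal Java sketch. It assumes a rectangular field split into a uniform grid of regions with coordinates in the range [0, width] × [0, height]; the class and method names (`FieldPartition`, `regionOf`, `migrates`) are hypothetical and are not part of the D-MASON API.

```java
/** Hypothetical sketch: uniform grid partitioning of a 2D field into regions. */
final class FieldPartition {
    private final double width, height;
    private final int rows, cols;

    FieldPartition(double width, double height, int rows, int cols) {
        if (width <= 0 || height <= 0 || rows <= 0 || cols <= 0)
            throw new IllegalArgumentException("dimensions must be positive");
        this.width = width;
        this.height = height;
        this.rows = rows;
        this.cols = cols;
    }

    /** Row-major index of the region containing (x, y); assumes 0 <= x <= width, 0 <= y <= height. */
    int regionOf(double x, double y) {
        // Clamp so that points on the far border fall into the last region.
        int col = Math.min(cols - 1, (int) (x / width * cols));
        int row = Math.min(rows - 1, (int) (y / height * rows));
        return row * cols + col;
    }

    /** True when a move from (x0, y0) to (x1, y1) crosses a region border, i.e. the agent must migrate. */
    boolean migrates(double x0, double y0, double x1, double y1) {
        return regionOf(x0, y0) != regionOf(x1, y1);
    }
}
```

In such a scheme a worker only needs to exchange agents and border state with the workers that own adjacent regions, which is what keeps communication local.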

D-MASON uses a standard approach to achieve

a consistent local synchronization of the distributed

simulations: each step is associated with a ﬁxed state

of the simulation. Regions are simulated step by step.

Since the step i of region r is computed by using the

states i−1 of r’s neighborhood, the step i of a region

cannot be executed until the states i−1 of its neigh-

borhood have been computed and delivered. In other

words, each region is synchronized with its neighbor-

hood before each simulation step.
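The synchronization rule above (step i of a region may run only after every neighbor has delivered its state i−1) can be sketched in a few lines of Java. This is an illustrative sketch under stated assumptions, not D-MASON's implementation: the class `RegionSync`, its methods, and the convention that step 0 needs no prior delivery are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the per-step local synchronization rule. */
final class RegionSync {
    private final Set<Integer> neighbors;
    // step number -> neighbors whose state for that step has been delivered
    private final Map<Long, Set<Integer>> delivered = new HashMap<>();

    RegionSync(Set<Integer> neighbors) {
        this.neighbors = new HashSet<>(neighbors);
    }

    /** Record that a neighbor's state for the given step has arrived. */
    void deliver(int neighborId, long step) {
        if (!neighbors.contains(neighborId))
            throw new IllegalArgumentException("not a neighbor: " + neighborId);
        delivered.computeIfAbsent(step, s -> new HashSet<>()).add(neighborId);
    }

    /** Step i may run only once every neighbor has delivered its state i-1. */
    boolean canExecute(long step) {
        if (step == 0) return true; // assumption: the initial state is known to everyone
        Set<Integer> got = delivered.getOrDefault(step - 1, Collections.<Integer>emptySet());
        return got.containsAll(neighbors);
    }
}
```

A worker would poll or block on `canExecute(i)` before simulating step i, so a slow neighbor delays only the regions adjacent to it at that step.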

3 D-MASON IMPROVEMENTS

A Motivating Example: Ants Forage. The Ants Foraging Model is described in (Panait and Luke, 2004b; Panait and Luke, 2004a; Panait and Luke, 2004c). Ants Forage is an agent-based simulation of the Ants Foraging Model and can be found in the standard distribution of MASON (from http://cs.gmu.edu/~eclab/projects/mason/). In Ants Forage the space is represented by a grid of square cells containing some food sources, a nest and a number of optional obstacles. When the simulation starts, the ants leave the nest in search of food. Each ant occupying a cell may move to any of the eight neighboring cells that is not occupied by either another ant or an obstacle. When an ant reaches a food source it becomes laden with food and begins its search for the nest. When it has reached the nest again, the ant leaves the food and begins searching for food again. The goal is to maximize the rate of food brought to the nest from the food sources. A key factor of the ants foraging simulation is that there is no direct communication among agents: once an ant finds a food source, it does not communicate its location to the other members of the colony. Instead, indirect communication is provided by pheromone trails: ants release two different kinds of pheromone on the cells they cross, depending on whether they are heading to the nest or to the food. Pheromones evaporate over time, so each cell is associated with a level of pheromone that depends on time. Ants take local decisions that depend on their state (going to the nest or going to the food) and on the kind and level of pheromone located in adjacent cells.

Figure 2: Ants foraging (left to right): ants leave the nest; some of the ants have found the food and head toward the nest; all the ants walk along the shortest path between the food and the nest. The grey gradient represents the different levels of pheromone.
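The pheromone mechanism described above can be made concrete with a small Java sketch of one pheromone field. Everything specific here is an assumption made for illustration: the multiplicative evaporation rule, the `retention` parameter, and the "move toward the strongest adjacent cell" choice are hypothetical simplifications (they ignore obstacles and occupied cells), and the class `PheromoneGrid` is not taken from MASON.

```java
/** Hypothetical sketch of one pheromone field with multiplicative evaporation. */
final class PheromoneGrid {
    private final double[][] level;
    private final double retention; // assumed fraction of pheromone kept per step, in (0, 1]

    PheromoneGrid(int width, int height, double retention) {
        if (width <= 0 || height <= 0)
            throw new IllegalArgumentException("grid must be non-empty");
        if (retention <= 0.0 || retention > 1.0)
            throw new IllegalArgumentException("retention must be in (0, 1]");
        this.level = new double[width][height];
        this.retention = retention;
    }

    /** An ant crossing cell (x, y) adds pheromone to it. */
    void deposit(int x, int y, double amount) { level[x][y] += amount; }

    double get(int x, int y) { return level[x][y]; }

    /** Applied once per simulation step: every cell loses part of its pheromone. */
    void evaporate() {
        for (double[] column : level)
            for (int y = 0; y < column.length; y++)
                column[y] *= retention;
    }

    /** Among the up-to-eight cells adjacent to (x, y), returns {x, y} of the strongest one. */
    int[] strongestNeighbor(int x, int y) {
        int[] best = null;
        for (int dx = -1; dx <= 1; dx++) {
            for (int dy = -1; dy <= 1; dy++) {
                int nx = x + dx, ny = y + dy;
                boolean self = dx == 0 && dy == 0;
                boolean outside = nx < 0 || ny < 0 || nx >= level.length || ny >= level[0].length;
                if (self || outside) continue;
                if (best == null || level[nx][ny] > level[best[0]][best[1]])
                    best = new int[] {nx, ny};
            }
        }
        return best; // null only for a 1x1 grid
    }
}
```

Ants Forage keeps two such fields, one per kind of pheromone, and both change at every step, which is why they count among the dynamic fields that must be synchronized between workers.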

Figure 2 depicts the three typical phases of an ants foraging simulation, from left to right: at the beginning the ants leave the nest and wander around, so the area around it is overcrowded; then, after a phase of equilibrium during which the ants are spread almost evenly over the field looking for food, some of the ants find it and return to the nest; the last phase is characterized, once again, by an uneven distribution of the ants on the field, along the shortest path between the nest and the food source.

Simulations of this kind, characterized by a dynamically unbalanced distribution of the agents throughout the steps, motivate the need for a specific load balancing policy that measures the amount of computational work needed by each worker to carry out the simulation and, consequently, takes decisions about how to re-distribute the work among workers. Ants Forage is based on 5 different specialized fields: one for the ants, two for the pheromone levels, one for the obstacles and one for the food sources. The first three fields are dynamic and updated at every step. Ants Forage has pushed us to consider two needs, addressed in the following two subsections: more efficient communication among workers and load balancing.

3.1 Enhancing Communication

The performance of a distributed system like D-MASON is strongly bound by the performance of the slowest component in the system whenever the various components (i.e. workers) need to be synchronized. In D-MASON the synchronization of the fields immediately follows the simulation phase and is carried out sequentially by running along the list of the dynamic fields (static fields are not synchronized). Each field synchronization carries some overhead due to both the communication channel and the barrier, the mechanism which allows the different workers to “wait” for the slowest one to complete its work and to begin the next step at the same time. Hence, synchronization phases add dependencies between the operations carried out by workers that harm the parallelization process (Amdahl, 1967).
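For reference, Amdahl's law (Amdahl, 1967) bounds the speedup achievable on n workers when only a fraction p of the per-step work can proceed in parallel. The symbols S, p and n are introduced here for illustration and are not used elsewhere in the paper:

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}}, \qquad \lim_{n \to \infty} S(n) = \frac{1}{1 - p}
```

For example, if synchronization keeps 10% of each step serial (p = 0.9), the speedup cannot exceed 10 regardless of the number of workers, which is why shrinking the synchronization phase matters.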

A reasonable solution to this waste of computing power has been adopted by moving to a multi-threaded communication phase during which all the updates on the fields are transmitted in parallel, reducing the communication overhead. A single synchronization phase is performed at the end of all the communications.
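The idea of transmitting all field updates in parallel and synchronizing once can be sketched with the standard `java.util.concurrent` executor API. This is a minimal sketch under stated assumptions, not D-MASON's code: the class `ParallelFieldSync` and the use of a `Callable<String>` per field (returning a hypothetical acknowledgement) are illustrative choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: send all dynamic-field updates in parallel, then synchronize once. */
final class ParallelFieldSync {

    /** Runs one publishing task per dynamic field and returns when all of them have completed. */
    static List<String> publishAll(List<Callable<String>> updates) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, updates.size()));
        try {
            List<String> acks = new ArrayList<>();
            // invokeAll blocks until every task is done: this is the single synchronization point.
            for (Future<String> f : pool.invokeAll(updates)) {
                acks.add(f.get());
            }
            return acks;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while synchronizing fields", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a field update failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With three dynamic fields, as in Ants Forage, the per-step communication cost approaches that of the slowest single field update instead of the sum of all three.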

3.2 Load Balancing

The Need for Load Balancing. As described before, D-MASON uses a space partitioning approach where the fields are subdivided into regions assigned to workers; this approach limits the communication among the workers. Indeed, since each agent interacts only within a small area around it, the communication is limited to local messages (messages between workers managing neighboring spaces, etc.). The problem with this approach is that agents can migrate between regions, and consequently the association between workers and agents changes during the simulation. Moreover, load balancing is not guaranteed and needs to be addressed by the application. To better exploit the computing power provided by the workers of the system, it is necessary to design the system so that the simulation always evolves in parallel, avoiding bottlenecks. Since the simulation is synchronized after each step, the system advances at the speed of the slowest peer in the system. For this reason it is necessary to design the system so as to balance the load between the workers.

Our Approach. The choice of the partitioning strategy is important for the efficiency of the whole system. Two key factors need to be considered: (i) static vs dynamic partitioning; (ii) the granularity of the world decomposition. Dynamic partitioning can be useful, for instance, when the workload of the simulation changes over time, as in the case of Ants foraging. In such cases, in order to balance the workload across the workers, the system can adapt the partitioning step by step. Unfortunately, the management of dynamic regions requires a large amount of communication between workers, which consumes bandwidth and introduces latency. Similarly, the granularity of the world decomposition (that is, the region size and, consequently, the number of regions into which a given space is partitioned) determines a trade-off between load balancing and communication overhead. The finer the granularity adopted, the better the balancing that, ideally, can be reached by the system. However, due to the regions’ interdependency and system synchronizations, fine granularity usually entails a huge amount of communication, which may harm the overall performance.

Based on the above considerations, we decided to opt for a system that allows dynamic partitioning with two levels of granularity. At each step every worker compares the number of agents it has to simulate with the ideal number of agents per region, that is, the total number of ants divided by the number of regions the field is split into. When the ratio between these two values is above a given threshold, the worker decides to move to a finer granularity by splitting its region.
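The split test just described reduces to a single ratio comparison, sketched below in Java. The class `SplitPolicy` and its method names are hypothetical; only the rule itself (local load divided by ideal load, compared against a threshold) comes from the text.

```java
/** Hypothetical sketch of the per-worker split test described in the text. */
final class SplitPolicy {
    private final double splitThreshold;

    SplitPolicy(double splitThreshold) {
        if (splitThreshold <= 0.0)
            throw new IllegalArgumentException("threshold must be positive");
        this.splitThreshold = splitThreshold;
    }

    /** Ideal load: the total number of agents divided evenly among all regions. */
    static double idealAgentsPerRegion(long totalAgents, int regions) {
        if (regions <= 0) throw new IllegalArgumentException("regions must be positive");
        return (double) totalAgents / regions;
    }

    /** A worker splits its region when local load / ideal load is above the threshold. */
    boolean shouldSplit(long localAgents, long totalAgents, int regions) {
        double ideal = idealAgentsPerRegion(totalAgents, regions);
        return ideal > 0.0 && localAgents / ideal > splitThreshold;
    }
}
```

For instance, with 9,000 agents over 9 regions the ideal load is 1,000 agents, so a worker holding 4,000 agents would split under a threshold of 3.0, while one holding exactly 3,000 would not.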

The balancing phase is depicted in Figure 3: on the left the field is partitioned into 9 (3 × 3) regions, which is the coarse-grained subdivision of the work, while the middle image depicts the fine-grained subdivision of the work. The last image shows what happens when a worker decides to split its region: in this particular case the worker in charge of the central region decomposes the region into 9 sub-regions and then assigns 8 of these sub-regions to its 8 neighboring workers. Please note that each sub-region is assigned in a way that minimizes the communication between the neighbors.
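One plausible reading of this assignment is that each outer sub-region goes to the neighbor lying in the same direction, so that it stays adjacent to the region that worker already simulates. The Java sketch below encodes that reading; it is an assumption, not a description of D-MASON's code. The class `SubRegionAssignment` is hypothetical, as is the rule that a border region keeps any sub-region for which no neighbor exists.

```java
/** Hypothetical sketch: which worker simulates each of the 3x3 sub-regions after a split. */
final class SubRegionAssignment {

    /**
     * Returns a 3x3 table of worker indices (row-major over a rows x cols grid of regions).
     * Sub-region (i, j) goes to the neighbor lying in the same direction, so the
     * centre sub-region (1, 1) stays with the splitting worker at (row, col).
     */
    static int[][] assign(int row, int col, int rows, int cols) {
        int[][] owner = new int[3][3];
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++) {
                int r = row + i - 1;
                int c = col + j - 1;
                boolean exists = r >= 0 && r < rows && c >= 0 && c < cols;
                // Assumption: with no neighbor in that direction, the owner keeps the sub-region.
                owner[i][j] = exists ? r * cols + c : row * cols + col;
            }
        }
        return owner;
    }
}
```

For the central region of a 3 × 3 field this yields nine distinct owners, matching the right-hand image of Figure 3, where the overloaded worker keeps one sub-region and hands the other eight to its eight neighbors.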

Symmetrically, when a worker notices that the workload among the split subregions is below a given threshold, it may decide to merge the subregions back, returning to the initial (coarser) subdivision (Figure 3 (left)).

Figure 3: (Left to right): coarse grained partitioning, finer grained partitioning and balancing phase.

4 TESTING ENHANCED D-MASON

We performed a number of tests of the enhanced ver-

sion of D-MASON in order to assess the effectiveness

of the two improvements described above.

Setting of the Experiments. Simulations were conducted on two different configurations of hosts/workers: for the enhanced communication scheme we conducted a series of tests on a single host (CPU i7, 8GB RAM), while for the load balancing experiments we also performed the tests on a network of 6 machines: a master machine, one communication server and 4 hosts, each running an equal share of the workers. In the load balancing tests, each region is simulated by a dedicated Java Virtual Machine (JVM). The communication is managed by a dedicated host running an Apache ActiveMQ server. Master, workers and the communication server are connected by a standard 100 Mbit LAN.

The DAnts Testbed. We performed our tests on DAnts, the distributed version of Ants Forage, considering more than 32 different test settings. Each setting is characterized by the choice of the following parameters: the number of ants (the size of the field is determined by the number of agents in order to maintain a fixed density), the number of regions, and the kind of synchronization/load balancing policy.

4.1 Discussion of Results

In the following we briefly discuss the results. We decided to test the two improvements incrementally: in the first set of tests we tested the communication enhancement, while in the second batch we added the load balancing.

Enhanced Communication. The rationale behind this test is to check the new communication mechanism against the previous one. Figure 4 reports the results for two square partitionings, 4×4 and 6×6. In each plot the X-axis represents the increasing number of ants, while the Y-axis reports the performance of the system in terms of simulation steps per second.

In both test settings the improved communication strategy works more efficiently than the older one. The gap between the curves narrows as the number of agents increases because the simulation time grows proportionally to the number of agents, while our improvement affects only the communication phase.

Load balancing. The batch of tests we performed simulates 100,000 ants running in two settings: 1 host and 6 hosts. Each test lasted 40,000 simulation steps. In both settings we used 3.0 as the split threshold (i.e., a worker decides to split its region when the number of agents in the region is 3.0 times the ideal number of agents per region) and 1.5 as the merge threshold.
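To make the thresholds concrete, the following worked example computes the agent counts at which a region would split and merge. It assumes, purely for illustration, the 3 × 3 partitioning of Figure 3, and it assumes that the merge threshold is applied to the same load ratio as the split threshold; the symbols N (total ants), R (number of regions) and n* (ideal ants per region) are introduced here.

```latex
n^{*} = \frac{N}{R} = \frac{100\,000}{9} \approx 11\,111, \qquad
n_{\text{split}} = 3.0 \, n^{*} \approx 33\,333, \qquad
n_{\text{merge}} = 1.5 \, n^{*} \approx 16\,667
```

Under these assumptions a region would be split once it holds more than about 33,333 ants, and merged back once its load falls below about 16,667 ants. Keeping the two thresholds apart helps prevent a region from oscillating between the two granularities when its load hovers around a single value.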

Figure 5 shows the results. In each plot the X-axis represents the partitioning, while the Y-axis reports the performance of the system in terms of simulation steps per second. The results are encouraging and show that the load balancing policy is effective in mitigating the imbalance. A first consideration is that with load balancing it is possible “to do more” with D-MASON even on a single machine. In the multiple-machine setting we obtain even better improvements because in the previous setting the load balancing was performed on workers running on a single host, while in this setting each host serves a smaller number of workers. Clearly, as the number of regions increases, the finer granularity of the subdivision naturally balances the load better but, on the other hand, the communication overhead increases accordingly, limiting the overall improvement.

Figure 4: Multi message vs single message.

Figure 5: D-MASON load balancing effectiveness.

5 CONCLUSIONS AND FUTURE WORK

This paper reports our experience with D-MASON, a distributed version of MASON. D-MASON has been developed with the purpose of improving the performance of MASON by letting the computational work be distributed among several machines, thereby harvesting the unused CPU power that is usually largely available in installations such as laboratories.

This work has been motivated by the development of the distributed version of Ants Forage. We observed that simulations of this kind have common characteristics that needed to be better addressed by D-MASON: (i) the simulation is based on more than one field dynamically updated during the simulation; (ii) agents are not evenly distributed over the space and often accumulate in some zones. We have presented two strategies that deal with these issues and we have validated their effectiveness with several experimental tests.

Some work still needs to be done. For instance, the load balancing policy has to be tuned in order to better exploit the work subdivision: the control of the thresholds is the key to leveraging this mechanism. D-MASON is available at http://www.isislab.it/projects/dmason/. The project will soon be released under a Free and Open Source Software license.

REFERENCES

The Economist (2010). Agents of change.

Amdahl, G. M. (1967). Validity of the single processor approach to achieving large scale computing capabilities. In Proceedings of AFIPS ’67, pages 483–485.

Berryman, M. (2008). Review of Software Platforms for Agent Based Models. Tech. Rep. DSTO-GD-0532, Australian Government, Department of Defence.

Collier, N. and North, M. (2011). A platform for large-scale agent-based modeling. In W. Dubitzky, K. Kurowski, and B. Schott, eds., Large-Scale Computing Techniques for Complex System Simulations, Wiley.

SIMULTECH2012-2ndInternationalConferenceonSimulationandModelingMethodologies,Technologiesand

Applications

142

Conte, R. and Castelfranchi, C. (1995). Cognitive and Social Action. UCL Press.

Cordasco, G., De Chiara, R., Mancuso, A., Mazzeo, D., Scarano, V., and Spagnuolo, C. (2011). A Framework for distributing Agent-based simulations. In Ninth Inter. Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar’2011).

Cordasco, G., De Chiara, R., Mancuso, A., Mazzeo, D., Scarano, V., and Spagnuolo, C. (2012). D-MASON: A Distributed Framework for Agent-based simulations. Simulation SI: Advancing Simulation Theory and Practice with Distributed Computing, (Submitted).

Epstein, J. M. (2007). Generative Social Science: Studies in Agent-Based Computational Modeling. Princeton University Press.

Organisation for Economic Co-operation and Development, Global Science Forum (2009). Applications of complexity science for public policy: new tools for finding unanticipated consequences and unrealized opportunities.

López-Paredes, A., Edmonds, B., and Klügl, F. (2012). Editorial of the special issue: Agent based simulation of complex social systems. Simulation, 88(1):4–6.

Mengistu, D., Tröger, P., Lundberg, L., and Davidsson, P. (2008). Scalability in Distributed Multi-Agent Based Simulations: The JADE Case. In Proc. of FGCNS ’08, volume 5, pages 93–99.

Najlis, R., Janssen, M. A., and Parker, D. C. (2001). Software tools and communication issues. In Parker, D. C., Berger, T., and Manson, S. M., editors, Proc. Agent-Based Models of Land-Use and Land-Cover Change Workshop, pages 17–30.

Panait, L. and Luke, S. (2004a). Ant foraging revisited. In Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems (ALIFE-IX).

Panait, L. and Luke, S. (2004b). Learning ant foraging behaviors. In Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems (ALIFE-IX).

Panait, L. and Luke, S. (2004c). A pheromone-based utility model for collaborative foraging. In Proceedings of AAMAS 2004.

Pawlaszczyk, D. and Strassburger, S. (2009). Scalability in distributed simulations of agent-based models. In Proc. of WSC 2009, pages 1189–1200.

Railsback, S. F., Lytinen, S. L., and Jackson, S. K. (2006). Agent-based simulation platforms: Review and development recommendations. Simulation, 82:609–623.

EnhancingthePerformancesofD-MASON-AMotivatingExample

143