A Network Dispersion Problem for Non-communicating Agents
Martin N. Simanjuntak
Warwick Business School, The University of Warwick, Coventry CV4 7AL, U.K.
1 THE RESEARCH PROBLEM
Is coordination possible when agents are unable to communicate? This is an abstract problem that can only be made concrete by specifying a well-defined goal for the coordination process. For example, the problem of rendezvous of two or more agents on a network has been extensively studied for non-communicating agents (see, e.g., Alpern and Gal, 2003). A less studied and more recent problem is the opposite one of achieving spatial dispersion. If there are $n$ agents at $k$ locations, we say that they are dispersed if no location has more than one agent (this of course requires that $k \geq n$). In the case $n = k$ (the main case studied here), dispersion implies coverage, that is, every location has at least one agent.
Alpern and Reyniers (2002) initiated the study of
spatial dispersion of agents in the context of non-
communicating agents who could move between any
two locations, based only on the knowledge of the
populations at all locations.
The research problem addressed in this paper is to see how the problem of Alpern and Reyniers changes if a network (graph) structure is imposed on the set of locations: an agent only knows the population at his current node (location), and he can only stay still or move to an adjacent node. He can see the number of arcs at his node, but if he moves he must choose among them equiprobably. As we are dealing with non-communicating agents with a limited amount of global knowledge, we shall limit our discussion to simple and myopic strategies. In particular, our agents adopt a common (Markovian) stunted random walk strategy, completely specified by the probability $p$ of staying still for another period at the same node. Given such strategies, how long will it take, on average, for the agents to achieve dispersion? The optimization can be simply with respect to a single $p$, or with respect to $p_i$, where $i$ is the population at the agent's current node. For example, in the trivial case of a network with two nodes, where both agents start at the same node, the optimal value of $p$ is $1/2$ and the expected time to dispersion is $T = 2$ periods.
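To see where these values come from: in each period, dispersion on the two-node network occurs exactly when one agent stays and the other moves, an event of probability $2p(1-p)$, so the time to dispersion is geometric with mean

$$T(p) = \frac{1}{2p(1-p)}, \qquad \min_{0<p<1} T(p) = T(\tfrac{1}{2}) = 2.$$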
We shall introduce the network dispersion
problems by way of the following illustration.
Suppose that at 11am the guards at the Louvre have
called a wildcat strike for noon. The Louvre by law
must be kept open. We know that there are $n$ rooms, but at short notice we don't have a map. We can hire $n$ guards within the hour, but they may speak different languages and will not be able to
communicate with each other. We have a single
broadcasting line for communication to all these new
guards, but we cannot give different guards different
instructions as we do not even know their names.
Once all the rooms have guards, we can broadcast a
command to stay still. To keep things simple, we
assume that time is divided into periods and that
each guard visits only one location in each period.
All we can broadcast is a single number $p$, the
probability that a guard should stay still for that
period. We want this to occur before the word gets
out that this is a good time to steal a painting, so we
want to minimise the expected time to reach full
coverage of all the rooms. How can we define a
common strategy for the guards that is simple yet
still effective? And how fast on average will they
arrive at the goal of getting as dispersed as possible?
2 STATE OF THE ART
Our agents are not the smart rational individuals that
are normally assumed in game theory models.
Instead, this type of agent is normally studied in the
domains of Artificial Intelligence and Biology
(social animals). Therefore the works most relevant to our study come from those two research communities.
After the paper of Alpern and Reyniers (2002),
Grenager et al. (2002) generalized the specific
problem to a general class of dispersion games. In this type of game, agents win a positive payoff only
when they choose distinct actions. Problems with
similar characteristics have been studied in many
research areas. Here we list some of them:
1. Load balancing: a set of tasks is to be
assigned to a set of machines; to minimise the overall processing time, the load needs to be distributed as evenly as possible (see, e.g., Azar et al., 1999).
2. Niche selection: this problem has been studied
in economics as well as in ecology. In
economics, it models a situation in which
producers wish to occupy one of a number of
different market niches, and they prefer to occupy niches with less competition. A few
examples are the Santa Fe problem (Arthur,
1994) and the class of minority games (Challet
and Zhang, 1997). In ecology, this problem is
also called habitat selection, and an example of
this is how animals choose feeding patches
with low population density with respect to
food supply (see, e.g., Houston and
McNamara, 1997). In particular, Fretwell and
Lucas (1969) introduced the equilibrium notion
of the ideal free distribution for habitat
distribution in birds. That is a particular
analogue of our notion of dispersion, where
some nodes have greater capacity (habitat is
more supportive of foragers).
3. Congestion games: proposed initially by
Rosenthal (1973), in this class of games
individuals seek facilities or locations of low
population density, due to the monotonically
increasing cost of using the facilities. One of
the subclasses of congestion games that has
received significant attention is routing games,
in which a set of players in a network choose
routes with low congestion levels (see, e.g., Qiu et al., 2003).
4. Multi-agent area coverage: a team of agents
seek to cover an entire area (which may or may
not be known a priori), and therefore it is preferred that each agent covers a different part of the area. This problem has been studied in
several research communities, including
robotics/agents, sensor networks, and
computational geometry. For a recent survey in
the fields of robotics, see Galceran and
Carreras (2013).
3 OBJECTIVES
Given a graph $G$ with $k$ nodes, we want to see which (Markovian) stunted random walk strategies, if simultaneously adopted by all agents, lead to the least expected time $T$ to dispersion. In all the problems considered, the same optimization criterion is used. We consider both (i) a random initial placement of the $n$ agents onto the $k$ nodes of $G$, and (ii) an initial placement of all the agents onto a single node $j$ of $G$. In the latter case, it is of interest to see where the best place to introduce the agents is.
We will also consider, and compare, two types of stunted random walk strategies. First, a simple type, where there is a common parameter $p$ representing the probability of staying still for the next period. Second, a probability vector $\mathbf{p}$ in which $p_i$ (the $i$-th entry of the vector) gives the probability of staying still (rather than randomly moving), depending on the number $i$ of agents currently at your node (called the agent population of the node).
A conjecture of Alpern and Reyniers (2002) in a related context says that if you are alone at your node then you should not move, and no one should move to your node. While this might work in the context of a complete graph, it clearly fails on the line graph $L_n$: when there are many agents at node 1 and one agent at node 2, the agents at node 1 could never disperse. However, the weaker property of setting $p_1 = 1$, without the added assumption that no one moves to a node with current population 1, seems to hold for some graphs.
Our objective is to determine the minimum dispersal time and the optimal levels of $p$ or of $\mathbf{p}$ for several classes of networks, including line networks $L_n$, cycle networks $C_n$, and complete graphs $K_n$. We also wish to compare, on $L_n$, the efficiency of an initial placement of all agents at a node $j$, as $j$ varies. It would seem likely that it is best to place them as close as possible to the centre.
From now on we assume that the graphs we
consider have n nodes and that there are n agents.
4 METHODOLOGY
To evaluate the expected dispersal time $T$ on a graph $G$ as a function of the stationary probability $p$ (or of the vector $\mathbf{p} = (p_1, \ldots, p_n)$) we use two distinct methodologies. For small graphs (small $n$), we use the theory of absorption times for Markov chains. The states of the chain are the population
ANetworkDispersionProblemforNon-communicatingAgents
67
distributions, allowing for symmetry. For example, the state $[m_1, m_2, m_3]$ says there are $m_1$ agents at node 1, $m_2$ at node 2 and $m_3$ at node 3. By the symmetry of nodes 1 and 3 we may assume $m_1$ is at least as large as $m_3$. Thus the states when $G = L_3$ are $[3,0,0]$, $[2,1,0]$, $[2,0,1]$, $[1,2,0]$, $[0,3,0]$, and the absorbing (dispersed) state $[1,1,1]$. In general, the set $S_n$ of all states for the network dispersion problem on $L_n$ can be described as

$$S_n = \left\{ s = [m_1, m_2, \ldots, m_n] : m_1 \geq m_n,\ \sum_{i=1}^{n} m_i = n \right\}.$$
When $n$ is larger than 3, the number of states in this type of analysis becomes too large for Markovian analysis, so another methodology is needed: simulation. We use the software application Mathematica to simulate the movement of agents adopting a common $p$-strategy, and minimise the dispersal time numerically. For purposes of comparison, and to check our simulation program, we analyse the line graph $L_3$ both ways.
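The state sets $S_n$ above are easy to enumerate mechanically. Here is a minimal Python sketch (illustrative only; our computations use Mathematica, and the function name is ours) that canonicalises each population vector against its reflection:

```python
from itertools import product

def states(n):
    """Enumerate S_n: population vectors [m_1, ..., m_n] summing to n,
    taken up to the reflection symmetry of the line graph L_n."""
    seen = set()
    for m in product(range(n + 1), repeat=n):
        if sum(m) != n:
            continue
        canonical = max(m, m[::-1])  # lexicographic max of the two reflections
        seen.add(canonical)
    return sorted(seen, reverse=True)

print(states(3))
# [(3, 0, 0), (2, 1, 0), (2, 0, 1), (1, 2, 0), (1, 1, 1), (0, 3, 0)]
```

For $n = 3$ this recovers exactly the five transient states and the absorbing state listed above.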
5 STAGE OF RESEARCH –
RESULTS SO FAR
5.1 Markovian Analysis for $L_3$
We first analyse the problem on the line graph $L_3$ of three nodes using Markov chain theory. We define six states for the $L_3$ problem (up to symmetry): $s_1 = [3,0,0]$, $s_2 = [2,1,0]$, $s_3 = [2,0,1]$, $s_4 = [1,2,0]$, $s_5 = [0,3,0]$, and the absorbing state $s_6 = \hat{s} = [1,1,1]$. We place no restrictions on the movement of the agents, other than that an agent who chooses to move in the next period can only move to a node adjacent to his current node. This implies that the absorbing state is accessible from every other state (although reaching it may take more than one or two steps). A Markov chain with this characteristic is called an absorbing Markov chain, for which theoretical formulas have been devised to calculate the expected time to absorption. In this subsection we follow the procedures explained in Grinstead and Snell (1997) to find the time to absorption, or more generally, the expected dispersal time $T$.
Definition (Absorbing Markov Chain) (Grinstead
and Snell, 1997): A Markov chain is absorbing if it
has at least one absorbing state, and if from every
state it is possible to go to an absorbing state (not
necessarily in one step).
In an absorbing Markov chain, the probability of absorption is 1. This means that for our problem the process will eventually end at the absorbing state $\hat{s} = [1,1,1]$. However, there are two special situations in which our Markov chain is no longer absorbing: when the value of $p$, the probability of staying still, is set to either 0 or 1. If $p$ is set to 0, three of the states ($[3,0,0]$, $[2,0,1]$ and $[0,3,0]$) cannot access $\hat{s}$, whereas if $p$ is set to 1, then all states become absorbing states, and thus do not communicate with each other. Therefore, the absorbing Markov chain theory only applies when we restrict the range of $p$ to $0 < p < 1$. For the cases $p = 0$ and $p = 1$, we describe a separate approach below.
5.1.1 Simple Stunted Random Walk
We begin by considering simple strategies based on a single probability $p$ of staying still. In each period, each agent independently decides whether to remain at his current node or to move; if he chooses to move, he observes the arcs at his node and chooses among them equiprobably. In the case of a line graph this is elementary: an agent at an end moves to the unique adjacent node; otherwise he moves left or right equiprobably. We wish to find the strategy parameter $p = p(s)$ that minimises the expected time $T = T(s, p)$ to reach the unique absorbing state $\hat{s} = [1,1,1]$ from the state $s$; we denote this minimum time by $\bar{T}(s) = \min_p T(s, p)$. Sometimes we wish to start from a random state, where each state occurs with the probability it would have if each of the $n$ agents were placed independently and uniformly at random on one of the $n$ nodes. In this case the expected absorption time is denoted $T = T(r, p)$, where $r$ stands for the random initial state. To calculate $T$, we first define the transition matrix $Q = Q(p)$, the $6 \times 6$ matrix comprising the transition probabilities between all states of the chain.
In the transition matrix $Q$, the entry $q_{ij}$ gives the probability of moving from state $s_i$ to state $s_j$ in one step. For example, for the transition probability from $s_3$ to $s_2$ we have $q_{32} = p^2(1-p)$: if the current state is $s_3 = [2,0,1]$, then we reach the state $s_2 = [2,1,0]$ in the next period if the two agents at the left end stay still (probability $p^2$) and the agent at the right end moves to the middle (probability $1-p$).
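The entry $q_{32}$ can be checked mechanically by enumerating the joint stay/move choices of the three agents. A small sympy sketch (illustrative only, not part of the paper's toolchain):

```python
from itertools import product
import sympy as sp

p = sp.symbols('p')

# In s3 = [2,0,1] every agent sits at an end of L3, so a mover's only
# option is the middle node.  Sum the probability of every joint
# stay/move outcome that lands on s2 = [2,1,0].
homes = [1, 1, 3]                        # starting node of each agent
q32 = sp.Integer(0)
for moves in product([0, 1], repeat=3):  # 0 = stay (prob p), 1 = move (prob 1-p)
    pops = [2, 0, 1]
    prob = sp.Integer(1)
    for home, mv in zip(homes, moves):
        prob *= (1 - p) if mv else p
        if mv:
            pops[home - 1] -= 1          # leave the home node...
            pops[1] += 1                 # ...for the middle node
    if pops == [2, 1, 0]:
        q32 += prob
print(sp.expand(q32))                    # p**2 - p**3, i.e. q32 = p**2*(1 - p)
```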
We then construct the matrix $\bar{Q}$, the $5 \times 5$ upper-left submatrix of $Q$ consisting of the transition probabilities among the transient states (all states except $\hat{s}$). We wish to determine the column vector $\boldsymbol{\tau} = \boldsymbol{\tau}(p)$ whose $i$-th entry $\tau_i = T(s_i, p)$ is the expected dispersal time from state $s_i$; we can then minimise each entry, for each $i$, with respect to the parameter $p$. According to the theory of absorbing Markov chains (Grinstead and Snell, 1997), the vector is given by the formula

$$\boldsymbol{\tau} = (I - \bar{Q})^{-1}\mathbf{c},$$

where $I$ is the identity matrix and $\mathbf{c}$ is the column vector all of whose entries are 1. In the simple stunted random walk strategy, each entry of $\boldsymbol{\tau}(p)$ works out to an explicit rational function of $p$.
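This inversion can also be mirrored numerically. The following Python sketch (an illustrative stand-in for our Mathematica code; all names are ours) builds the transition probabilities for $L_3$ by enumerating agent moves, solves $(I - \bar{Q})\boldsymbol{\tau} = \mathbf{c}$, and grid-searches $p$ over $(0,1)$, avoiding the non-absorbing endpoints $p = 0, 1$:

```python
from itertools import product
from math import comb
import numpy as np

TRANSIENT = [(3, 0, 0), (2, 1, 0), (2, 0, 1), (1, 2, 0), (0, 3, 0)]  # s1..s5
ABSORBING = (1, 1, 1)                                                # s6

def step_distribution(state, p):
    """Next-state distribution on L3: each agent stays w.p. p; a mover at an
    end goes to the middle, a mover at the middle goes left/right w.p. 1/2."""
    m1, m2, m3 = state
    dist = {}
    for k1, k3 in product(range(m1 + 1), range(m3 + 1)):  # movers off the ends
        pr_ends = (comb(m1, k1) * (1 - p)**k1 * p**(m1 - k1)
                   * comb(m3, k3) * (1 - p)**k3 * p**(m3 - k3))
        for a in range(m2 + 1):              # middle agents moving left
            for b in range(m2 - a + 1):      # middle agents moving right
                pr_mid = (comb(m2, a) * comb(m2 - a, b)
                          * ((1 - p) / 2)**(a + b) * p**(m2 - a - b))
                nxt = (m1 - k1 + a, m2 - a - b + k1 + k3, m3 - k3 + b)
                nxt = max(nxt, nxt[::-1])    # identify reflected states
                dist[nxt] = dist.get(nxt, 0.0) + pr_ends * pr_mid
    return dist

def tau(p):
    """tau = (I - Qbar)^(-1) c, valid for 0 < p < 1."""
    idx = {s: i for i, s in enumerate(TRANSIENT)}
    Qbar = np.zeros((5, 5))
    for s in TRANSIENT:
        for t, pr in step_distribution(s, p).items():
            if t != ABSORBING:
                Qbar[idx[s], idx[t]] += pr
    return np.linalg.solve(np.eye(5) - Qbar, np.ones(5))

grid = np.linspace(0.001, 0.999, 999)
times = np.array([tau(p) for p in grid])
for i, s in enumerate(TRANSIENT):            # minimise each row over the grid
    j = times[:, i].argmin()
    print(s, round(times[j, i], 4), round(grid[j], 4))
```

The grid minima should reproduce Table 1 below to the grid's resolution; for $s_2$ and $s_4$ the minimiser sits at the $p = 0$ boundary, which is treated separately in the text.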
Before we present the results for the optimal stunted random walk strategy, we return to the two special situations in which the chain is not absorbing, namely when the value of $p$ is either 0 or 1. For $p = 1$ it is quite obvious that, if no agent ever moves, we will never reach the equidistribution state. We therefore focus on the case $p = 0$. The transition matrix in this case is

$$Q(0) = \begin{pmatrix} 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & \tfrac{1}{2} & 0 & 0 & 0 & \tfrac{1}{2} \\ \tfrac{1}{4} & 0 & \tfrac{3}{4} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
From the transition matrix above, we can see that the equidistribution state $s_6 = \hat{s}$ is only accessible (not necessarily directly) from two states, $s_2$ and $s_4$, meaning that it is only possible to achieve dispersal if the agents start from state $s_2 = [2,1,0]$ or $s_4 = [1,2,0]$. From $s_4$, when all agents move, the one at the left end must move to the centre, and the two at the centre equiprobably move to different ends (reaching state $s_6$) or to the same end (state $s_2$). We can now simplify the chain into the following diagram:

Figure 1: The state transition diagram for $L_3$ with $p = 0$.
Now we can calculate the expected time to reach $s_6$ from those two states:

$$T(s_2, 0) = 1 + T(s_4, 0), \qquad T(s_4, 0) = 1 + \tfrac{1}{2}\,T(s_2, 0),$$
ANetworkDispersionProblemforNon-communicatingAgents
69
which gives the values $T(s_2, 0) = 4$ and $T(s_4, 0) = 3$: substituting the first equation into the second yields $T(s_4, 0) = 1 + \tfrac{1}{2}(1 + T(s_4, 0))$, hence $T(s_4, 0) = 3$ and $T(s_2, 0) = 4$. These values happen to be the optimal expected times from $s_2$ and $s_4$, as the expected times for other values of $p$ exceed 4 and 3, respectively. In other words, for $i = 2$ or $i = 4$, the minimising value of $p$ is 0. For the other values of $i$, we minimise the expression in each row of the vector $\boldsymbol{\tau}$ with respect to $p$ in $(0,1)$.
We can now list our results on the optimal expected times using the simple stunted random walk strategy, together with the accompanying values of $p$ (Table 1).

Table 1: Optimal expected times for simple stunted random walk on $L_3$.
Starting state    T        p
Random, r         4.2274   0.2332
s_1 = [3,0,0]     6.2125   0.2415
s_2 = [2,1,0]     4        0
s_3 = [2,0,1]     5.3333   0.5000
s_4 = [1,2,0]     3        0
s_5 = [0,3,0]     5.2836   0.3219
We can see that, on average, the agents achieve their goal fastest by starting from state $s_4$ and setting their probability $p$ of staying still to zero. It is unlikely, though, that setting $p = 0$ would be optimal in larger line graphs; here it is mainly due to the simplicity of the graph. Other starting positions that give good values are random placement ($T = 4.2274$) and all agents at the middle node ($T = 5.2836$). These values provide a benchmark for checking the validity of our simulation techniques on $L_3$.
5.1.2 Compound Stunted Random Walk
In the compound stunted random walk model on $L_3$, we allow the stationarity probability (of remaining at the current node) to depend on the population of agents at that node: if the total number of agents at your node is $i$, then you stay at that node with probability $p_i$, and otherwise move randomly to a neighbouring node. In this case we get a transition matrix $Q$ whose elements are functions of the three probabilities $p_1$, $p_2$ and $p_3$. For example, the transition probability from $[1,2,0]$ to $[1,1,1]$ is given by $p_1 \cdot 2p_2(1 - p_2) \cdot \tfrac{1}{2} = p_1 p_2 (1 - p_2)$: the agent who is alone at the left must stay still, which happens with probability $p_1$; one of the two at the middle must move while the other stays, which happens with probability $2p_2(1 - p_2)$; finally, the one who moves must move to the unoccupied end (probability $\tfrac{1}{2}$ in a random walk).
Then, using the same formula for $\boldsymbol{\tau}$ as in the previous section, we get the dispersal time from each state $s_i$ as a function of $p_1$, $p_2$ and $p_3$. Minimising these five expressions (and an averaged expression for the top line, corresponding to the random start $r$) with respect to all three probabilities gives the solution to the minimum problem shown in Table 2. Note that, except for the two states discussed separately above, the optimal value of $p_1$ is 1. This says that if you are the only agent at your node, stay still. This is similar to a conjecture in Alpern and Reyniers (2002) for a related problem, that singleton agents should stay still and other agents should not move to their locations.
Table 2: Optimal expected times for compound stunted random walk on $L_3$.

Starting state    T        p_1      p_2      p_3
Random, r         3.4006   1.0000   0.4646   0.1331
s_1 = [3,0,0]     5.3672   1.0000   0.4987   0.0831
s_2 = [2,1,0]     4        0        0        any
s_3 = [2,0,1]     2.7405   1.0000   0.5696   0.0537
s_4 = [1,2,0]     3        0        0        any
s_5 = [0,3,0]     4.3076   1.0000   0.4973   0.1425
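For completeness, a compound variant of the earlier numerical sketch (again illustrative only, with the population-dependent stay probabilities supplied as a dictionary) can be used to spot-check Table 2:

```python
from itertools import product
from math import comb
import numpy as np

TRANSIENT = [(3, 0, 0), (2, 1, 0), (2, 0, 1), (1, 2, 0), (0, 3, 0)]

def step_dist(state, stay):
    """As in the simple model, but an agent at a node currently holding
    i agents stays with probability stay[i]."""
    m1, m2, m3 = state
    q1, q2, q3 = (stay.get(m, 0.0) for m in state)
    dist = {}
    for k1, k3 in product(range(m1 + 1), range(m3 + 1)):
        pr = (comb(m1, k1) * (1 - q1)**k1 * q1**(m1 - k1)
              * comb(m3, k3) * (1 - q3)**k3 * q3**(m3 - k3))
        for a in range(m2 + 1):
            for b in range(m2 - a + 1):
                prm = (comb(m2, a) * comb(m2 - a, b)
                       * ((1 - q2) / 2)**(a + b) * q2**(m2 - a - b))
                nxt = (m1 - k1 + a, m2 - a - b + k1 + k3, m3 - k3 + b)
                nxt = max(nxt, nxt[::-1])    # identify reflected states
                dist[nxt] = dist.get(nxt, 0.0) + pr * prm
    return dist

def tau(p1, p2, p3):
    stay = {1: p1, 2: p2, 3: p3}
    idx = {s: i for i, s in enumerate(TRANSIENT)}
    Qbar = np.zeros((5, 5))
    for s in TRANSIENT:
        for t, pr in step_dist(s, stay).items():
            if t != (1, 1, 1):
                Qbar[idx[s], idx[t]] += pr
    return np.linalg.solve(np.eye(5) - Qbar, np.ones(5))

# Coarse grid search for the s3 = [2,0,1] start (index 2 in TRANSIENT).
grid = np.linspace(0.02, 0.98, 25)
best = min((tau(a, b, c)[2], a, b, c)
           for a in grid for b in grid for c in grid)
print(best)  # T close to 2.7405, with p1 at the top of the grid (cf. p1 = 1)
```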
5.2 Simulation Analysis for $L_3$
For line graphs $L_n$ with large $n$ ($n > 3$), the Markov solution technique is not feasible, so we use simulation. To check the validity of the simulation technique, we apply it also to the solved case of $L_3$; we shall see that we get a close fit to the Markovian solution. For the simulation programme we modify the state formulation by numbering the three agents: the state $(i, j, k)$ says that agent 1 is at node $i$, agent 2 at node $j$ and agent 3 at node $k$. For example, in state $(1, 3, 2)$ the first agent is at location 1, the second at location 3 and the third at location 2, so this state is actually one of our absorbing states, in which the agents occupy distinct locations. There are in total $3^3 = 27$ states in this formulation.
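A Monte Carlo sketch of this formulation in Python (our programme is written in Mathematica; this illustrative stand-in shows the simulation logic and assumes nothing beyond the model above):

```python
import random

def run(start, p, rng):
    """One run on L3 with numbered agents: start = (i, j, k) gives the nodes
    of agents 1-3.  Returns the number of periods until all agents occupy
    distinct nodes.  All agents move simultaneously each period."""
    pos = list(start)
    t = 0
    while len(set(pos)) < 3:                 # not yet dispersed
        for a in range(3):
            if rng.random() < p:             # stay still
                continue
            # an end agent's only neighbour is the middle node 2;
            # a middle agent picks an end equiprobably
            pos[a] = 2 if pos[a] in (1, 3) else rng.choice((1, 3))
        t += 1
    return t

def estimate_T(start, p, runs=50000, seed=0):
    rng = random.Random(seed)
    return sum(run(start, p, rng) for _ in range(runs)) / runs

# All agents at the middle node (population state s5 = [0,3,0]):
print(estimate_T((2, 2, 2), 0.32))  # should land near the exact 5.2836
```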
5.2.1 Simple Stunted Random Walk
We perform 50000 simulation runs using simple stunted random walk strategies. We find the results are similar to those of the exact method using Markov chain theory (see Table 3). We also show the comparison between the two methods for the case of initial random placement, $T(r, p)$, in Figure 2. The graph shows that our simulation produces outcomes which closely approximate those calculated by the exact method.
Table 3: Simulation results of optimal expected times, simple stunted random walk on $L_3$ (50000 runs).

Starting state    T        p
Random, r         4.1950   0.2350
s_1 = [3,0,0]     6.2000   0.2300
s_2 = [2,1,0]     3.9760   0.0600
s_3 = [2,0,1]     5.3110   0.4900
s_4 = [1,2,0]     2.9810   0.0400
s_5 = [0,3,0]     5.2220   0.3100
Figure 2: Comparison of the expected times between the
two methods (exact method and simulation) for the case of
random initial placement T(r,p).
5.2.2 Compound Stunted Random Walk
From Table 4 below, we can see that the simulation also gives results similar to those of the exact method shown in Table 2. In particular, the simulation handles absorbing and non-absorbing chains in the same way, unlike the exact method, where the two classes of Markov chain required different approaches.
Table 4: Simulation results of optimal expected times, compound stunted random walk on $L_3$ (10000 runs).

Starting state    T        p_1      p_2      p_3
Random, r         3.3115   1.0000   0.4900   0.1200
s_1 = [3,0,0]     5.3159   0.9900   0.4900   0.1000
s_2 = [2,1,0]     3.9385   0.0000   0.0400   0.6100
s_3 = [2,0,1]     2.7392   0.9900   0.5400   0.0600
s_4 = [1,2,0]     2.9532   0.0000   0.0000   0.0600
s_5 = [0,3,0]     4.2666   0.9900   0.4700   0.1600
6 EXPECTED OUTCOME – FURTHER RESEARCH
There are several directions in which this research will be carried further. The most obvious is to continue the investigation to larger line graphs, such as $L_4$, $L_5$, and eventually the class $L_n$ in general. We also plan to study other classes of graphs, in particular the class $K_n$ of complete graphs and the class $C_n$ of cycle graphs. As both are vertex-transitive, all initial placements of the agents at a single node are equivalent, which will simplify the analysis.
A second direction of research is to analyse how strategies work on unknown graphs (mazes) or random graphs. When sending robots to search a large house whose interconnecting room structure is unknown, what is a good stationary probability $p$ to program them with? Another possibility is that the manager of the agents observes the state and then broadcasts a 'state-dependent' value of $p$ or of the vector $\mathbf{p}$.
Finally, we could give our agents some limited eyesight, so that, for example, they know the current population of all nodes adjacent to their own. The walk they choose in this case need not be uniformly random, as neighbouring nodes are distinguished by their populations; presumably an agent would be more likely to move to a less populated node.
REFERENCES
Alpern, S., Gal, S., 2003. The theory of search games and
rendezvous. Kluwer Academic Publishers, Boston.
Alpern, S., Reyniers, D., 2002. Spatial Dispersion as a
Dynamic Coordination Problem. Theory Decis. 53,
29–59.
Arthur, W. B., 1994. Inductive Reasoning and Bounded
Rationality. Am. Econ. Rev. 84, 406–411.
Azar, Y., Broder, A., Karlin, A., Upfal, E., 1999. Balanced
Allocations. SIAM J. Comput. 29, 180–200.
Challet, D., Zhang, Y.-C., 1997. Emergence of cooperation and organization in an evolutionary game. Physica A: Statistical Mechanics and its Applications 246, 407–418.
Fretwell, S. D., Lucas Jr., H. L., 1969. On territorial
behavior and other factors influencing habitat
distribution in birds. Acta Biotheor. 19, 16–36.
Galceran, E., Carreras, M., 2013. A survey on coverage
path planning for robotics. Rob. Auton. Syst. 61, 1258–
1276.
Grenager, T., Powers, R., Shoham, Y., 2002. Dispersion
games: general definitions and some specific learning
results, in: AAAI-02 Proceedings. pp. 398–403.
Grinstead, C. M., Snell, J. L., 1997. Introduction to Probability, 2nd rev. ed. American Mathematical Society.
Houston, A., McNamara, J., 1997. Patch choice and
population size. Evol. Ecol. 11, 703–722.
Qiu, L., Yang, Y. R., Zhang, Y., Shenker, S., 2003. On
Selfish Routing in Internet-like Environments, in:
Proceedings of the 2003 Conference on Applications,
Technologies, Architectures, and Protocols for
Computer Communications. ACM, New York, NY,
USA, pp. 151–162.
Rosenthal, R. W., 1973. A class of games possessing pure-
strategy Nash equilibria. Int. J. Game Theory 2, 65–
67.
ICAART2014-DoctoralConsortium
72