of the divided areas are left unattended. For example,
the agents proposed by Kato and Sugawara (Kato
and Sugawara, 2013) may generate split (and often
fragmented) RAs because each agent decides locally.
This kind of inefficient division decreases overall
performance. Conversely, agents are sometimes forced
to generate disconnected RAs. For example, if rooms
have narrow entrance doors or are reached through
narrow passages that must be allocated to only one
agent (an actual example is shown later), splitting
the RAs is unavoidable. In such an environment, if
we impose the constraint that each RA must be
connected, the divided working areas cannot be balanced.
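The distinction between a connected and a split RA can be made concrete with a small sketch. The code below is only an illustration under our own assumptions, not the procedure used in this paper: the function name count_fragments, the adjacency-dictionary representation, and the toy environment are all hypothetical. It counts the connected fragments of an RA by breadth-first search over the nodes the agent owns.

```python
from collections import deque


def count_fragments(ra_nodes, adjacency):
    """Count connected fragments of the subgraph induced by ra_nodes.

    ra_nodes:  iterable of node ids owned by one agent (hypothetical format).
    adjacency: dict mapping each node id to its neighbouring node ids.
    """
    remaining = set(ra_nodes)
    fragments = 0
    while remaining:
        # Start a new fragment from any node not yet reached.
        start = remaining.pop()
        fragments += 1
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency.get(node, ()):
                # Only follow edges that stay inside this agent's RA.
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    queue.append(neighbour)
    return fragments


# Hypothetical toy environment: a corridor 0-1-2-3-4 with a small room
# (node 5) that can only be entered from corridor node 1.
adjacency = {
    0: [1],
    1: [0, 2, 5],
    2: [1, 3],
    3: [2, 4],
    4: [3],
    5: [1],
}

ra_a = {0, 1, 2}  # contiguous part of the corridor
ra_b = {3, 4, 5}  # rest of the corridor plus the room behind node 1

print(count_fragments(ra_a, adjacency))
print(count_fragments(ra_b, adjacency))
```

In this toy case the second agent's RA is necessarily split, because the only entrance to the room belongs to the first agent. Requiring every RA to be connected would force the room onto the first agent and unbalance the sizes of the two RAs.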
Therefore, we propose a method for partitioning
the entire area into a number of RAs for individual
agents to obtain a balanced workload without
unnecessarily splitting the areas. We experimentally
demonstrated that our proposed method performed
better than the conventional method
(Kato and Sugawara, 2013). We also investigated
the performance when the environment had
many obstacles and when the environment characteristics
were not uniform, and showed that in all such cases
our method outperformed the conventional method.
2 RELATED WORK
There are a number of studies on multi-agent continuous
patrolling problems (Huang et al., 2019). We
can classify them roughly into two types based on the
methods for solving them. In the first type of method, the
agents share the entire area to move around without
dividing the area into smaller subareas. Carrillo and
Rapp (Carrillo and Rapp, 2020) proposed a method
to identify patrolling policies for multiple agents with
limited visibility regions and non-deterministic
patrolling paths. Yoneda et al. (Yoneda et al., 2013)
proposed a method in which agents individually
determine their exploration algorithms using reinforcement
learning to contribute more toward the shared
goal. They also assumed that agents operate
intermittently depending on their battery charge: each
agent decides whether to explore the environment further
or return to its charging base depending on its remaining
battery capacity. Our study also assumes a charging base,
to which agents have to return before their batteries run out.
Sugiyama et al. (Sugiyama et al., 2019), who also
used the model of cyclic charging activities, considered
the cycle of agent patrolling while shifting the time
phase to visit each location. However, because the
environments are not partitioned, multiple agents may
patrol the same area, which is redundant and hence
increases unnecessary actions.
In the second type of method for solving continuous
patrolling problems, the environment is divided fairly
into a number of subareas, each of which is
allocated to one agent so that the workload is
balanced. In the method proposed by Nasir et al. (Nasir
et al., 2016), the leader agent divides the environment
and allocates the areas to explore to individual
member agents. However, this requires centralized
control by the leader agent, and hence a failure of the
leader affects the entire system. Ahmadi and Stone
(Ahmadi and Stone, 2006) introduced area division based
on boundary relationships between agents. If there
were overlapping locations, they were transferred to
the agents that frequently visited those areas. Elor
and Bruckstein (Elor and Bruckstein, 2009) proposed
a model based on balloon expansion so that individual
agents can fairly divide the environment into subareas
of the same size in a bottom-up manner. However,
given the obstacles and non-uniform structures in the
environment, divisions of the same size are not always
fair from the viewpoint of agents’ workload. More-
over, these methods did not consider the constraints
due to battery capacity.
By contrast, Kato and Sugawara (Kato and Sugawara,
2013) introduced a constraint on battery capacity
and proposed a partitioning method for a fair
workload among agents. As in the current study, their
method for area partitioning is based on expansion
power, which reflects the degree to which an
agent has completed the work in its RA. However,
because their method generated many fragments of
RAs (Kato and Sugawara, 2013), its performance
often decreased, or the method could not be applied,
in complicated environments. In this study, agents divide
their RAs according to the shape of the environment so
that the method can be applied even to
complex environments. We also reduce unnecessary
fragments of the areas of responsibility to improve
the efficiency of event collection/observation in the
multi-agent patrolling problem.
3 BACKGROUND AND PROBLEM
3.1 Environment
We introduce discrete time $t \geq 0$, whose unit is a
step. The multi-agent continuous cooperative patrolling
problem can be expressed by $(G, A, P)$, where
$A = \{1, \dots, N\}$ is a set of agents, $G = (V, E)$ is a
connected graph embeddable into two-dimensional
Euclidean space, $V = \{v_1, \dots, v_x\}$ is the set of nodes,
$E$ is the set of edges $e_{v_i,v_j}$ connecting two nodes
ICAART 2021 - 13th International Conference on Agents and Artificial Intelligence