Multiagent Planning Supported by
Plan Diversity Metrics and Landmark Actions
Jan Tožička¹, Jan Jakubův¹, Karel Durkota¹, Antonín Komenda² and Michal Pěchouček¹
¹Agent Technology Center, Department of Computer Science, Czech Technical University, Prague, Czech Republic
²Technion - Israel Institute of Technology, Haifa, Israel
Keywords:
Multiagent Planning, Diverse Planning, Planning with Landmarks.
Abstract:
Problems of domain-independent multiagent planning for cooperative agents in deterministic environments
can be tackled by a well-known initiator–participants scheme from classical multiagent negotiation protocols.
In this work, we use the approach to describe a multiagent extension of the Generate-And-Test principle dis-
tributively searching for a coordinated multiagent plan. The generate part uses a novel plan quality estimation
technique based on metrics borrowed from the field of diverse planning. The test part builds upon planning
with landmarks by compilation to classical planning. Finally, the proposed multiagent planning approach was
experimentally analyzed on one newly designed domain and one classical benchmark domain. The results
show which combination of plan quality estimation and diversity metrics provides the best planning efficiency.
1 INTRODUCTION
Multiagent planning is a specific form of distributed
planning and problem solving, which was summa-
rized by (Durfee, 1999). Multiagent planning re-
search and literature focused mostly on the coordina-
tion part of the problem while the synthesis part deal-
ing with a particular ordering of actions was studied
in the area of classical planning.
Coordination was, for instance, studied in the well-known Generalized Partial Global Planning by (Decker and Lesser, 1992), or with additional domain-specific information as in TALplanner by (Doherty and Kvarnström, 2001). The first fusion of
the coordination and synthesis parts for domain-
independent multiagent planning with deterministic
actions was proposed by (Brafman and Domshlak,
2008). The approach was based on the classical plan-
ning formalism STRIPS (Fikes and Nilsson, 1971) ex-
tended to multiagent settings denoted as MA-STRIPS.
(Brafman and Domshlak, 2008) also proposed a solu-
tion for the coordination part of the problem by trans-
lation to a Distributed Constraint Satisfaction Prob-
lem (DCSP). Since the paper was focused primar-
ily on theoretical analysis of computational complex-
ity of MA-STRIPS problems, several algorithmic ap-
proaches appeared later in other papers, e.g., in (Nissim et al., 2010) or (Torreño et al., 2012).
In this paper, we propose a novel algorithmic ap-
proach to multiagent planning for problems described
in MA-STRIPS based on the principle of classical
multiagent negotiation protocols such as Contract Net, with
one agent acting as an initiator and the rest acting as
participants. The approach can be seen as a proto-
col describing distribution of the Generate-And-Test
Search which was a base principle also in multia-
gent planners described by (Nissim et al., 2010) (us-
ing DCSP for the coordination part and classical plan-
ner for the generation part) and by (Pellier, 2010) (us-
ing backtracking search for the coordination part and
planning graphs for the generation part).
The contribution of our work lies in the way plan candidates are generated and tested. Our generative process uses an estimation of the quality of generated
plans based on metrics of diverse planning (particu-
larly from (Bhattacharya et al., 2010) and (Srivastava
et al., 2007)). In other words, the idea is to generate
good-quality plans and avoid low-quality ones. The
quality measure is based on the history of answers
of the participants who were trying to extend the ini-
tial plan. It can therefore be understood as the initiator agent learning to generate plan candidates which are more likely to be extended by the participant agents into a final solution.
The testing part utilizes planning with landmarks
(similarly as used by (Nissim et al., 2010)). The dif-
ference is that we translate a planning problem with
landmarks into an ordinary planning problem, which
can then be solved by a classical planner. Usually, landmarks are incorporated into planners in the form of special heuristics, as in (Richter and Westphal, 2010). However, our translation enables a straightforward incorporation of externally defined landmarks, which is required by the proposed planning protocol.
Finally, we provide an experimental evaluation of the planner on a newly designed planning domain (Tools) and on the Rovers planning domain from the International Planning Competition, extended for multiagent planning.
2 PLANNING MODEL
We consider a number of cooperative and coordi-
nated agents featuring distinct sets of capabilities (ac-
tions) which concurrently plan and execute their local
plans in order to achieve a joint goal. The environ-
ment wherein the agents act is classical with deter-
ministic actions. The following formal preliminaries
compactly restate the MA-STRIPS problem (Brafman
and Domshlak, 2008) required for the following sec-
tions.
2.1 Planning Problem
An MA-STRIPS planning problem P is defined as a quadruple P = ⟨P, A, I, G⟩, where P is a set of propositions or facts, A is a set of agents, I is an initial state, and G is a set of goals. We use α and β to range over agents in A.
An action an agent can perform is a triple a = ⟨a_pre, a_add, a_del⟩ of subsets of P, where a_pre is the set of preconditions, a_add is the set of add effects, and a_del is the set of delete effects. We define functions pre(a), add(a), and del(a) such that for any action a it holds that a = ⟨pre(a), add(a), del(a)⟩. Moreover, let eff(a) = add(a) ∪ del(a).
The set A contains agents. We identify an agent with its capabilities, that is, an agent α = {a_1, ..., a_n} is characterized by a finite repertoire of actions it can perform in the environment. A state s = {p_1, ..., p_m} ⊆ P is a finite set of facts, and we say that p_i holds in s. When no confusion can arise, we use A also to denote the set of all actions of P, that is, when we write a ∈ A, then A is to be considered a shortcut for ⋃A, the union of the action sets of all the agents.
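The later sketches in this paper manipulate these objects directly, so the following minimal Python sketch shows one possible representation of them. It is our own illustration, not part of the MA-STRIPS formalism; all class and field names are illustrative choices.

# A minimal sketch of the MA-STRIPS structures defined above.
from dataclasses import dataclass
from typing import Dict, FrozenSet

Fact = str

@dataclass(frozen=True)
class Action:
    name: str
    pre: FrozenSet[Fact]      # preconditions a_pre
    add: FrozenSet[Fact]      # add effects a_add
    delete: FrozenSet[Fact]   # delete effects a_del

    def eff(self) -> FrozenSet[Fact]:
        # eff(a) = add(a) ∪ del(a)
        return self.add | self.delete

@dataclass
class MAStripsProblem:
    facts: FrozenSet[Fact]                  # P
    agents: Dict[str, FrozenSet[Action]]    # A: agent name -> capabilities
    init: FrozenSet[Fact]                   # I
    goal: FrozenSet[Fact]                   # G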
Example 1. We shall demonstrate the definitions of this section on a simple logistics problem involving three locations, Prague, Brno, and Ostrava, and a Crown to be delivered from Prague to Ostrava. A Plane can travel from Prague to Brno and back. Similarly, a Truck provides a connection between Brno and Ostrava.
The set of facts P contains (1) facts to describe po-
sitions of Plane and Truck like Plane-at-Prague and
Truck-at-Ostrava, and (2) facts to describe position
of the Crown like Crown-in-Brno and Crown-in-Truck.
The initial state and the goal are given as follows.
I = {Plane-at-Prague, Truck-at-Brno, Crown-in-Prague}
G = {Crown-in-Ostrava}
Agents can execute actions to:
1. load and unload the Plane or the Truck, like load_Plane@Prague and unload_Truck@Ostrava. The action load_Plane@Prague has preconditions Plane-at-Prague and Crown-in-Prague, one add effect Crown-in-Plane, and it deletes Crown-in-Prague. Other actions are defined similarly.
2. fly the Plane and drive the Truck between allowed destinations, like fly_Brno→Prague and drive_Brno→Ostrava. For example, drive_Brno→Ostrava has precondition Truck-at-Brno and it adds Truck-at-Ostrava while removing Truck-at-Brno.
Agent Plane is defined as being capable of executing the following actions.

Plane = { fly_Prague→Brno, fly_Brno→Prague,
          load_Plane@Prague, load_Plane@Brno,
          unload_Plane@Prague, unload_Plane@Brno }

Agent Truck is defined similarly. The agent set A is then simply {Plane, Truck}.
2.2 Problem Projections
MA-STRIPS problems distinguish between public and internal facts and actions. Let facts(a) = pre(a) ∪ add(a) ∪ del(a) and similarly facts(α) = ⋃_{a∈α} facts(a). The α-internal and public subsets of all facts P, denoted P_α-int and P_pub respectively, are subsets of P such that the following hold.

P_pub ⊇ ⋃_{α≠β} (facts(α) ∩ facts(β))
P_α-int = facts(α) \ P_pub
P_α = P_α-int ∪ P_pub
The set P_pub contains all the facts that are used in actions of at least two different agents. The set can possibly contain also other facts, that is, some facts mentioned in the actions of one agent only. This definition of public facts differs from other definitions in the literature (Brafman and Domshlak, 2008), where P_pub is defined using equality instead of superset (⊇); that is, our definition gives partial freedom in what is treated as public, which allows us to experiment with extensions of the set of public facts. For the purpose of this paper, however, the definition with equality can be considered without any effect on our results. We suppose that P_pub is an arbitrary but fixed set which
MultiagentPlanningSupportedbyPlanDiversityMetricsandLandmarkActions
179
satisfies the above condition. The set P_α-int of α-internal facts contains facts mentioned only in the actions of agent α, but possibly not all of them. The set P_α contains the facts relevant to agent α.
Example 2. In our running example the set facts(Plane) contains Plane-at-Prague, Plane-at-Brno, Crown-in-Prague, Crown-in-Plane, and Crown-in-Brno. The only fact shared by the two agents is Crown-in-Brno, but later on we will also require G ⊆ P_pub, so we have the following.

P_pub = {Crown-in-Brno, Crown-in-Ostrava}
P_Plane-int = {Plane-at-Prague, Plane-at-Brno, Crown-in-Prague, Crown-in-Plane}
P_Plane = P_pub ∪ P_Plane-int

The set P_Truck is defined analogously.
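As a concrete reading of these definitions, the following sketch (reusing the illustrative types from the previous listing) computes facts(α), the smallest admissible P_pub, and P_α-int; the goal facts are added to P_pub in line with the requirement G ⊆ P_pub used above.

def facts_of(action: Action) -> FrozenSet[Fact]:
    # facts(a) = pre(a) ∪ add(a) ∪ del(a)
    return action.pre | action.add | action.delete

def agent_facts(agent: FrozenSet[Action]) -> FrozenSet[Fact]:
    # facts(α) = ⋃_{a∈α} facts(a)
    result = frozenset()
    for a in agent:
        result |= facts_of(a)
    return result

def public_facts(problem: MAStripsProblem) -> FrozenSet[Fact]:
    # Smallest set satisfying P_pub ⊇ ⋃_{α≠β} (facts(α) ∩ facts(β)),
    # extended with the goal facts (we require G ⊆ P_pub).
    agents = list(problem.agents.values())
    shared = frozenset()
    for i, alpha in enumerate(agents):
        for beta in agents[i + 1:]:
            shared |= agent_facts(alpha) & agent_facts(beta)
    return shared | problem.goal

def internal_facts(agent: FrozenSet[Action], p_pub: FrozenSet[Fact]) -> FrozenSet[Fact]:
    # P_α-int = facts(α) \ P_pub
    return agent_facts(agent) - p_pub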
The projection a^α of an action a to agent α is an action defined as follows.

a^α = ⟨pre(a) ∩ P_α, add(a) ∩ P_α, del(a) ∩ P_α⟩
Example 3. In our example we can compute the action projections below. To save space we write (fly_Prague→Brno)^Plane as fly_Prague→Brno^Plane and so on.

fly_Prague→Brno^Plane = fly_Prague→Brno
fly_Prague→Brno^Truck = ⟨∅, ∅, ∅⟩
load_Truck@Brno^Plane = ⟨{Crown-in-Brno}, ∅, {Crown-in-Brno}⟩
unload_Truck@Ostrava^Plane = ⟨∅, {Crown-in-Ostrava}, ∅⟩
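The α-projection of an action is a pointwise intersection with P_α; a sketch over the illustrative Action type follows. The projection keeps the action name so that a projection image can be traced back to its origin, which is an assumption of our sketches rather than part of the formalism.

def project_action(action: Action, p_alpha: FrozenSet[Fact]) -> Action:
    # a^α = ⟨pre(a) ∩ P_α, add(a) ∩ P_α, del(a) ∩ P_α⟩
    return Action(name=action.name,
                  pre=action.pre & p_alpha,
                  add=action.add & p_alpha,
                  delete=action.delete & p_alpha)

Applied with P_Truck to fly_Prague→Brno, all three components intersect to ∅, reproducing fly_Prague→Brno^Truck = ⟨∅, ∅, ∅⟩ from Example 3.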
The set α_pub of public actions of agent α is defined as α_pub = {a | a ∈ α, eff(a) ∩ P_pub ≠ ∅}, and the set α_int of internal actions of agent α as α_int = α \ α_pub. The set A_pub of all public actions of problem P is defined as A_pub = ⋃_{α∈A} α_pub, and the set A_α of all actions relevant to agent α is A_α = α_int ∪ {a^α | a ∈ A_pub}. Note that a^α = a for any a ∈ α. Hence in the definition of A_α we do not need to project internal actions, and the only actions which are affected by α-projection are the public actions of agents other than α.
Example 4. In our example we have the following public and relevant actions.

Plane_pub = { load_Plane@Brno, unload_Plane@Brno }
A_Plane = { fly_Prague→Brno, fly_Brno→Prague,
            load_Plane@Prague, unload_Plane@Prague,
            load_Plane@Brno^Plane, unload_Plane@Brno^Plane,
            load_Truck@Brno^Plane, unload_Truck@Brno^Plane,
            load_Truck@Ostrava^Plane, unload_Truck@Ostrava^Plane }

Note that A_Plane has ten actions while A_Truck has only eight, because load_Plane@Prague and unload_Plane@Prague are private to Plane.
In an MA-STRIPS problem P, all the agents operate on a shared global state. The projection P^α of a problem P to agent α is a classical STRIPS problem where the agent has an internal copy of the global state. The previously defined relevant actions A_α contain (1) the internal actions of agent α, (2) the public actions of α, and (3) projections of the public actions of other agents, which emulate the effects of external actions on the internal state. The projection P^α of P is defined as follows.

P^α = ⟨P_α, A_α, I ∩ P_α, G⟩
Example 5. In our example we have

I ∩ P_Plane = {Plane-at-Prague, Crown-in-Prague}
P^Plane = ⟨P_Plane, A_Plane, I ∩ P_Plane, G⟩

The projection P^Truck is defined similarly.
In the rest of this paper we consider only problems where all the facts of the goal state G are public, that is, G ⊆ P_pub, which is common in the literature (Nissim and Brafman, 2012). This ensures that any agent is able to find a local solution fulfilling the goal, if one exists. It is then up to the agent negotiation to extend this local solution to a valid plan. Moreover, we suppose that two different agents do not execute the same action, that is, we suppose that the sets α_i are pairwise disjoint (Brafman and Domshlak, 2008).
2.3 Plans and Solutions
A plan π is a sequence of actions ⟨a_1, ..., a_k⟩. A plan π defines an order in which the actions are executed by their unique owner agents. It is supposed that independent actions can be executed in parallel. A plan π is called a solution of P when it contains actions from A and a sequential execution of the actions from π by their respective owners transforms the initial state I into a state in which all the goals from G hold (that is, G is a subset of the resulting state). Let sol(P) denote the set of all solutions of MA-STRIPS problem P. Similarly, let sol(P^α) denote the set of all solutions of the classical STRIPS problem P^α.
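Operationally, this solution test amounts to simulating the sequential execution; a small sketch (sequential execution only, using the illustrative types from Section 2.1):

def is_solution(plan, problem) -> bool:
    # Execute the actions of π in order from I; π solves P when every
    # precondition holds at execution time and G holds in the final state.
    state = set(problem.init)
    for action in plan:
        if not action.pre <= state:   # some precondition does not hold
            return False
        state = (state - action.delete) | action.add
    return problem.goal <= state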
Example 6. Let us consider the following plans.

π_0 = ⟨load_Plane@Prague, fly_Prague→Brno, unload_Plane@Brno,
       load_Truck@Brno, drive_Brno→Ostrava, unload_Truck@Ostrava⟩
π_1 = ⟨unload_Truck@Ostrava^Plane⟩
π_2 = ⟨unload_Plane@Brno^Truck, load_Truck@Brno^Truck,
       drive_Brno→Ostrava, unload_Truck@Ostrava^Truck⟩

It is easy to check that π_0 is a solution of our example MA-STRIPS problem P. Plan π_1 is a solution of the projection P^Plane because the projection unload_Truck@Ostrava^Plane of Truck's public action simply produces the goal state out of the blue. Finally, clearly π_2 ∈ sol(P^Truck).
ICAART2014-InternationalConferenceonAgentsandArtificialIntelligence
180
A public plan of problem P is a plan that contains only actions from A_pub, that is, it contains only public actions of P. A public plan can be seen as a solution outline that captures the execution order of public actions while ignoring the agents' internal actions. For a solution π of P we construct the public projection π^pub by removing internal actions, that is, by restricting π to A_pub. Hence π^pub is a public plan of P. For a solution π of P^α, the public projection π^pub is constructed similarly by removing internal actions and additionally by translating projection images back to their projection origins. That is to say, π^pub is composed of public actions from A_pub rather than of their projections, which are what appear in π. Thus π^pub is again a public plan of P.
Example 7. In our example we know that π_0 ∈ sol(P), π_1 ∈ sol(P^Plane), and π_2 ∈ sol(P^Truck). Thus we can construct the following public plans.

π_0^pub = ⟨unload_Plane@Brno, load_Truck@Brno, unload_Truck@Ostrava⟩
π_1^pub = ⟨unload_Truck@Ostrava⟩
π_2^pub = ⟨unload_Plane@Brno, load_Truck@Brno, unload_Truck@Ostrava⟩

Note that π_0^pub = π_2^pub.
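Computing π^pub is a simple filtering step; the sketch below assumes (as in our projection sketch above) that a projection image keeps the name of its origin, which makes the back-translation a lookup.

def public_projection(plan, public_by_name):
    # π^pub: drop internal actions and map projection images back to
    # their origins; `public_by_name` maps the names of actions in
    # A_pub to the original public actions.
    return [public_by_name[a.name] for a in plan if a.name in public_by_name]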
2.4 Public Plan Extensibility
We want to construct a solution of P from solutions of the agent projections P^α. But not all projection solutions can be easily composed into a solution of P. The concept of public plan extensibility helps us to select projection solutions which are conducive to our purpose. In this section we use σ to range over public plans to improve readability.

Definition 1. Let σ be a public plan of P. We say that σ is internally extensible if there is π ∈ sol(P) such that π^pub = σ. Similarly, we say that σ is internally α-extensible if there is π ∈ sol(P^α) such that π^pub = σ.
Example 8. In our example it is clear that π_0^pub is internally extensible because it was constructed from a solution of P. For the same reason we see that π_1^pub is internally Plane-extensible and π_2^pub is internally Truck-extensible. It is easy to see that π_2^pub is also internally Plane-extensible. However, π_1^pub is not internally Truck-extensible, because Truck needs to execute other public actions prior to unload_Truck@Ostrava.
The following lemma states that a solution of problem P can be constructed from a public plan σ which is internally α-extensible for all the involved agents. The constructive proof suggests an algorithm to construct such a solution.

Lemma 1. Let a public plan σ of P be given. Public plan σ is internally extensible if and only if σ is internally α-extensible for every agent α that owns some action from σ.
Proof. Case (⇒) is straightforward. When σ is internally extensible, there is π ∈ sol(P) such that π^pub = σ. We can construct the projection π^α of π to agent α by removing the internal actions of agents other than α, and by applying the projection a^α to each remaining action a. It holds that π^α ∈ sol(P^α) and also (π^α)^pub = σ. Thus σ is internally α-extensible.

To prove case (⇐), let us suppose that α_1, ..., α_n are all the agents that own some action in σ. For every i, σ is internally α_i-extensible and thus there is π_i such that π_i ∈ sol(P^α_i) and π_i^pub = σ. Now we construct a solution π of P from the projection solutions π_i as follows. We split each π_i by the public actions from σ and we join the corresponding internal parts of the different plans together. Then we construct π from σ by adding the joined parts between the corresponding public actions in σ. Note that we do not need to do a reverse projection, because for an action a internal to agent α it holds that a^α = a. Clearly π^pub = σ and it is not hard to prove that π ∈ sol(P). Hence σ is internally extensible.
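The (⇐) direction is constructive and is what an implementation can use to assemble the final plan. A sketch under the lemma's assumptions (each π_i satisfies π_i^pub = σ, and internal actions need no reverse projection since a^α = a); as in the earlier sketches, public actions are identified by name:

def merge_projection_solutions(sigma, projection_solutions):
    # Split every projection solution π_i by the public actions of σ
    # into |σ|+1 internal segments, then interleave: the joined internal
    # segments of all agents, followed by the next public action of σ.
    public_names = {a.name for a in sigma}

    def split_by_public(pi):
        segments, current = [], []
        for a in pi:
            if a.name in public_names:
                segments.append(current)   # close segment before a public action
                current = []
            else:
                current.append(a)
        segments.append(current)           # trailing internal segment
        return segments                    # len(segments) == len(sigma) + 1

    split = [split_by_public(pi) for pi in projection_solutions]
    merged = []
    for k in range(len(sigma) + 1):
        for segments in split:
            merged.extend(segments[k])     # internal parts of all agents
        if k < len(sigma):
            merged.append(sigma[k])        # the k-th public action of σ
    return merged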
The consequence of the lemma is that to ensure that P has a solution it is enough to find a solution π ∈ sol(P^α) for some agent α such that π^pub is internally extensible.
Example 9. We have seen previously that π_2^pub is internally Truck-extensible and also internally Plane-extensible. Hence we know that there is some solution of P even without knowing π_0. On the other hand, we know that π_1^pub is not internally Truck-extensible and thus π_1^pub is not internally extensible.
Some public plans of P can be extended to a valid solution of P, but it might require inserting also public actions into σ. The following definition captures this notion, which will be used in the following sections.

Definition 2. Let σ be a public plan of P. We say that σ is publicly extensible if there is a public plan σ′ of P which is internally extensible and σ is a subsequence of σ′.
Example 10. We have seen that π_1^pub is not internally extensible; however, it is still publicly extensible because it is a subsequence of π_0^pub.

Similarly we define that σ is publicly α-extensible. A projection solution π ∈ sol(P^α) is called internally extensible (or publicly extensible) when the corresponding public plan π^pub is so.
MultiagentPlanningSupportedbyPlanDiversityMetricsandLandmarkActions
181
3 CONFIRMATION SCHEME
In this section we present a multiagent planning algorithm which effectively iterates over the solutions of one selected agent (the initiator) in order to find a solution whose public projection is internally α-extensible for all the other agents (the participants). The confirmation algorithm is a sound and complete multiagent planning algorithm (see Theorem 2).
Algorithm 1: Multiagent planning algorithm with iterative deepening.

input : multiagent planning problem P
output: a solution π of P when a solution exists

Function MultiPlanIterative(P) is
    l_max ← 1
    loop
        π ← MultiPlan(P, l_max)
        if π ≠ ∅ then
            return π
        end
        l_max ← l_max + 1
    end
end
We suppose that we have a separate agent capable of running planning algorithms for each agent mentioned in a given problem P. Procedure MultiPlanIterative from Algorithm 1 is the main entry point of our algorithms, both in this and the following sections. This procedure is initially executed by one of the agents, called the initiator. It takes a problem P as its only argument and iteratively calls procedure MultiPlan(P, l_max) to find a solution of P of length l_max, increasing l_max by one on a failure. In this way we ensure completeness of our algorithm, because we enumerate the infinite set of all plans in a way that does not miss any solution. To simplify the presentation, we restrict our attention to those problems P which actually have a solution, that is, sol(P) ≠ ∅.
Algorithm 2 presents the implementation of MultiPlan in the confirmation scheme. We suppose that SinglePlan(P, F, l_max) implements a sound and complete classical planner which returns a solution of (an initiator projection of) P of length l_max which is not in F. Moreover, we suppose that SinglePlan always terminates and that it returns ∅ when there is no such solution.

Initially, we set F to ∅. Then we invoke SinglePlan to obtain a solution of P denoted as π. Afterwards, we ask the participant agents whether or not the public plan π^pub is internally α-extensible; how the participant agents fulfill this task is described in Section 5.1. When the answers from all of the agents are affirmative, π is returned as a result.
Algorithm 2: MultiPlan(P, l_max) in the confirmation scheme. Function SinglePlan(P, F, l_max) returns a plan of length l_max solving problem P omitting the forbidden plans from F, or ∅ if there is no such plan. Method AskAllAgents(π^pub) asks all agents α mentioned in the plan whether they consider the public plan π^pub to be internally α-extensible, and returns OK if all agents reply YES.

input : problem P and a maximum plan length l_max
output: a solution π of P when a solution exists

Function MultiPlan(P, l_max) is
    F ← ∅
    loop
        π ← SinglePlan(P, F, l_max)
        if π = ∅ then
            return ∅
        end
        reply ← AskAllAgents(π^pub)
        if reply = OK then
            return π
        end
        F ← F ∪ {π}
    end
end
Otherwise, π is added to the set of forbidden plans F and SinglePlan is called to compute a different solution.
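Put together, the confirmation scheme is a short driver loop. The following sketch mirrors Algorithms 1 and 2; single_plan, ask_all_agents, and public_projection_of are assumed interfaces standing for the classical planner, the participants' confirmation, and the π^pub construction from Section 2.3.

def multi_plan_iterative(problem, single_plan, ask_all_agents, public_projection_of):
    # Algorithm 1: iterative deepening over the maximum plan length.
    l_max = 1
    while True:
        pi = multi_plan(problem, l_max, single_plan, ask_all_agents, public_projection_of)
        if pi is not None:
            return pi
        l_max += 1

def multi_plan(problem, l_max, single_plan, ask_all_agents, public_projection_of):
    # Algorithm 2: generate local solutions, forbid the rejected ones.
    forbidden = set()
    while True:
        pi = single_plan(problem, forbidden, l_max)
        if pi is None:                                  # no further solution of length l_max
            return None
        if ask_all_agents(public_projection_of(pi)):    # all participants reply YES
            return pi
        forbidden.add(tuple(pi))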
The following theorem states that the (public projection of the) plan returned by the confirmation algorithm is internally extensible to a solution of P (soundness), and that the algorithm finds an internally extensible solution when there is one (completeness). It is easy to construct a solution of P given an internally extensible plan.

Theorem 2. Let procedure SinglePlan in MultiPlan (Alg. 2) be sound and complete. Then algorithm MultiPlanIterative (Alg. 1) with the confirmation procedure MultiPlan is sound and complete.
Proof. To prove soundness, let us suppose that π is the result of MultiPlanIterative. Public plan π^pub was confirmed by each agent α to be internally α-extensible. Thus, by Lemma 1, it is internally extensible, and following the lemma proof we can reconstruct the whole solution of P.

Let us prove completeness. During each loop iteration in MultiPlan one plan is added to F. There are only finitely many plans of length l_max, and thus algorithm MultiPlan always terminates because SinglePlan is sound and complete. When P is solvable, some internally extensible solution π has to be eventually returned by SinglePlan, because SinglePlan is complete. This solution is then the result of MultiPlan (and hence
ICAART2014-InternationalConferenceonAgentsandArtificialIntelligence
182
MultiPlanIterative) because, as a solution of P, it has to be confirmed by all the participants.
4 GENERATING PLANS USING
DIVERSE PLANNING
In the previous section we have supposed that function SinglePlan(P, F, l_max) selects an arbitrary solution of P of length l_max which is distinct from all the previous solutions stored in F. In this section we present an improved version of SinglePlan which selects a solution based on an evaluation of the qualities of previously found solutions.
Section 4.1 defines the notion of plan metrics, which describe how much two plans differ. Based on these metrics, Section 4.2 defines a notion of the relative quality of a plan, based on the evaluation of previously considered solutions which were rejected by at least one of the participant agents. Finally, Section 4.3 describes the improved version of function SinglePlan.
4.1 Plan Metrics
While planning looks for a single solution of a problem, the goal of diverse planning is to find several different solutions. There are two main approaches to defining how much two plans differ. Firstly, the difference of two plans can be defined by their membership in the same homotopy class (Bhattacharya et al., 2010). Another approach defines a distance between plans. The distance can be defined either on (i) actions and their relations, or on (ii) states that the execution of a plan goes through, or on (iii) causal links between actions and goals (Srivastava et al., 2007). In this paper, we use two metrics of type (i), that is, distance metrics defined on actions and their mutual positions in the plan.
4.1.1 Different Actions Metric
The Different Actions Metric counts the ratio of actions which are contained in only one of the plans. It is defined as follows. Let π_A \ π_B denote the plan π_A with all the actions from π_B removed.

δ_A(π_A, π_B) = (|π_A \ π_B| + |π_B \ π_A|) / (|π_A| + |π_B|)

This metric considers neither the ordering of actions nor the fact that some of the actions can occur in a plan multiple times. Nevertheless, it is very simple to evaluate.
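A sketch of δ_A; plans are sequences of hashable action identifiers, and removal follows the definition above (an occurrence of an action survives in π_A \ π_B only if the action does not occur in π_B at all):

def different_actions_metric(pi_a, pi_b) -> float:
    # δ_A(π_A, π_B) = (|π_A \ π_B| + |π_B \ π_A|) / (|π_A| + |π_B|)
    set_a, set_b = set(pi_a), set(pi_b)
    only_a = sum(1 for a in pi_a if a not in set_b)   # |π_A \ π_B|
    only_b = sum(1 for b in pi_b if b not in set_a)   # |π_B \ π_A|
    return (only_a + only_b) / (len(pi_a) + len(pi_b))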
4.1.2 Levenshtein Distance Metric
The Levenshtein Distance Metric (Levenshtein, 1966) is a general distance metric defined on two sequences. Let trim(π) be the plan π with the last action removed. Moreover, let diff(π_A, π_B) be 1 if the last actions of π_A and π_B differ and 0 otherwise. Then the Levenshtein metric δ_L(π_A, π_B) is defined as follows.

δ_L(π, ∅) = |π|
δ_L(∅, π) = |π|
δ_L(π_A, π_B) = min{ δ_L(trim(π_A), π_B) + 1,
                     δ_L(π_A, trim(π_B)) + 1,
                     δ_L(trim(π_A), trim(π_B)) + diff(π_A, π_B) }

This metric describes how many elementary operations have to be performed to convert one plan into the other. The elementary operations are adding an action to the plan, removing an action from the plan, and replacing one action in the plan by another action.
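The recursive definition above translates directly into the standard dynamic-programming formulation; a sketch:

def levenshtein_metric(pi_a, pi_b) -> int:
    # δ_L via dynamic programming; dist[i][j] is the distance between
    # the first i actions of π_A and the first j actions of π_B.
    m, n = len(pi_a), len(pi_b)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i                     # δ_L(π, ∅) = |π|
    for j in range(n + 1):
        dist[0][j] = j                     # δ_L(∅, π) = |π|
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diff = 0 if pi_a[i - 1] == pi_b[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,          # remove
                             dist[i][j - 1] + 1,          # add
                             dist[i - 1][j - 1] + diff)   # replace
    return dist[m][n]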
4.2 Plan Quality Estimation
In Algorithm 2, the initiator agent generates its local solution π and asks the participant agents to check whether π^pub can be extended to a solution of their local problems. Each participant either accepts or rejects π^pub. Based on their replies, we can define the quality Q(π) of π as the ratio of the number of participants accepting π^pub to the total number of participants.

Q(π) = (# of participants accepting π^pub) / (# of all participants)

Hence a plan π with Q(π) = 1 is accepted by all of the participants and the algorithm successfully terminates.
Once we have a plan π′ whose quality has already been established, we can define the relative quality Δ(π, π′) of an arbitrary π with respect to π′ using a selected metric δ on plans as follows.

Δ(π, π′) = Q(π′)(1 − δ(π, π′)) + (1 − Q(π′))δ(π, π′)

The relative quality Δ(π, π′) is high when either Q(π′) is high and π is close to π′, or when Q(π′) is low and π is distant from π′. In the other cases the value is close to zero.
Suppose we have a set of plans Π whose qualities have already been established. Then we can compute the relative quality Δ(π, Π) of an arbitrary plan π with respect to Π in several ways. In this work we use the following two quality estimators.
MultiagentPlanningSupportedbyPlanDiversityMetricsandLandmarkActions
183
4.2.1 Average Quality Estimator

The average estimator Δ_avg(π, Π) is defined as the average of the relative qualities of π with respect to the plans from Π.

Δ_avg(π, Π) = ( Σ_{π′∈Π} Δ(π, π′) ) / |Π|

4.2.2 Minimal Quality Estimator

The minimal estimator Δ_min(π, Π) is defined as the minimal relative quality.

Δ_min(π, Π) = min_{π′∈Π} Δ(π, π′)
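A sketch of both estimators. Here `rated` pairs each previously generated plan π′ with its established quality Q(π′), and `delta` is one of the two metrics above; we assume δ is normalized into [0, 1] (δ_A is by construction; δ_L can be divided by a plan-length bound), so that relative qualities are comparable across plans.

def relative_quality(pi, pi_ref, q_ref, delta) -> float:
    # Δ(π, π′): high for plans near accepted plans or far from rejected ones.
    d = delta(pi, pi_ref)
    return q_ref * (1.0 - d) + (1.0 - q_ref) * d

def estimate_avg(pi, rated, delta) -> float:
    # Δ_avg(π, Π): average relative quality over Π (assumes Π nonempty).
    return sum(relative_quality(pi, p, q, delta) for p, q in rated) / len(rated)

def estimate_min(pi, rated, delta) -> float:
    # Δ_min(π, Π): minimal relative quality over Π.
    return min(relative_quality(pi, p, q, delta) for p, q in rated)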
4.3 Generating Diverse Plans
During the execution of Algorithm 2, the initiator agent remembers the qualities Q of generated but rejected plans, that is, it remembers the qualities of all the plans from F. We suppose that Q is updated with every call to AskAllAgents. Additionally, the initiator computes the following statistics about actions.

Q(a) = the average quality of plans containing a
Q(a, a′) = the average quality of plans containing a before a′

The function SinglePlan executed repeatedly by the initiator is described in Algorithm 3. It calls DiversePlan to generate a fixed number n of local solutions. Function DiversePlan works as follows. Firstly, it generates a solution candidate using roulette wheel selection (Bäck, 1996) based on the average action qualities Q(a). These actions are then presorted using the statistics about action ordering Q(a, a′). Note that two actions are swapped only if the difference of the statistics is larger than some threshold Δ_Q (0.1 in our experiments). This ordering step allows the algorithm to find a correct solution faster, but the price for it is the loss of completeness of the SinglePlan procedure.
Once a solution candidate is generated, the initiator α tests whether this sequence of actions is publicly α-extensible, that is, whether it can be extended into a local solution. If so, the solution is added to the set of diverse plans. This process is repeated until the required number of local solutions is found. In our implementation, this process is further extended: occasionally, instead of a roulette selection, actions which have not been used often are chosen. In this way the algorithm gathers further information about unused actions. Finally, function SinglePlan selects the diverse plan with the maximum relative quality.
Algorithm 3: SinglePlan(P, F, l_max) uses DiversePlan(P, F, n, l_max) to generate n different solutions to the problem P and then selects the best one using the metric Δ(π, F). The generation of different plans is based on roulette wheel selection driven by the quality evaluation received from the other agents.

input : classical STRIPS problem P, the set F of forbidden plans, and a maximum plan length l_max
output: a solution π of P when a solution exists

Function SinglePlan(P, F, l_max) is
    /* n is a constant */
    Π_div ← DiversePlan(P, F, n, l_max)
    π ← argmax_{π∈Π_div} Δ(π, F)
    return π
end

input : problem P and the number n of requested solutions
output: a set of diverse solutions

Function DiversePlan(P, F, n, l_max) is
    Π ← ∅
    while |Π| < n do
        A ← GetRandomActions(P, l_max)
        π′ ← OrderActions(A)
        π ← CreatePublicExtension(P, π′)
        if π ≠ ∅ and π ∉ F and |π| ≤ l_max then
            Π ← Π ∪ {π}
        end
    end
    return Π
end

Function GetRandomActions(P, l_max) is
    n ← RandomInt(1 ... min(l_max, |A|))
    A ← ∅
    while |A| < n do
        A ← A ∪ {a : roulette wheel selection by Q(a)}
    end
    return A
end

Function OrderActions(A) is
    π ← A
    for i = 2 .. |π| do
        if Q(π_i, π_{i−1}) − Q(π_{i−1}, π_i) > Δ_Q then
            SwapActions(π_{i−1}, π_i)
        end
    end
    return π
end
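The roulette wheel selection inside GetRandomActions is standard fitness-proportionate sampling (Bäck, 1996); a minimal sketch, with Q(a) given as a dictionary of nonnegative action qualities:

import random

def roulette_select(quality_by_action):
    # Pick an action with probability proportional to its quality Q(a).
    actions = list(quality_by_action)
    total = sum(quality_by_action[a] for a in actions)
    r = random.uniform(0.0, total)
    acc = 0.0
    for a in actions:
        acc += quality_by_action[a]
        if acc >= r:
            return a
    return actions[-1]   # guard against floating-point round-off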
5 FROM THEORY TO PRACTICE
We have implemented the algorithms described in the
previous sections taking advantage of several existing
techniques and systems. An overall scheme of the ar-
ICAART2014-InternationalConferenceonAgentsandArtificialIntelligence
184
Figure 1: Architecture of the planner.
chitecture of our planner is sketched in Figure 1. An input problem P described in PDDL is translated into SAS+ using the translator script which is a part of the Fast Downward system (http://www.fast-downward.org/). Our Multi-SAS script then splits the SAS+ representation of the problem P into the agents' projections P^α, using a user-provided selection of the public facts P_pub. The initiator then computes a public extension of actions to create a solution to its own projection of the problem. The participants are then requested to check whether they consider it to be internally α-extensible.

The next part of this section demonstrates how the public and internal extensions can be easily verified using any standard STRIPS planner.
5.1 Computing Plan Extensions
In our algorithms, agents are asked whether a provided sequence of actions can be extended into a solution by adding other actions into the sequence. Technically, this is similar to planning with landmarks (Brafman and Domshlak, 2008). In this section we describe our algorithm to solve this problem. Based on this solution, we describe how an initiator agent computes public extensions of a given sequence and how participant agents check whether a sequence of public actions is internally extensible.

Suppose we are given a classical STRIPS planning problem P = ⟨P, A, I, G⟩ together with a sequence σ = ⟨a_1, ..., a_n⟩ of actions built from the facts P. The planning problem with landmarks is the task of finding a solution π of the problem ⟨P, A ∪ {a_1, ..., a_n}, I, G⟩ such that σ is a subsequence of π, that is, such that all the actions from σ are used in π in the proposed order. Note that an action a_i might or might not be in A.
Definition 3. A planning problem with landmarks is a pair ⟨P, σ⟩ where P = ⟨P, A, I, G⟩ is a classical STRIPS problem and σ = ⟨a_1, ..., a_n⟩ is a sequence of actions built from the facts of P.

A solution π of ⟨P, σ⟩ is a solution of the classical STRIPS problem ⟨P, A ∪ {a_1, ..., a_n}, I, G⟩ such that σ is a subsequence of π.
We solve a planning problem with landmarks by translating ⟨P, σ⟩ into a classical STRIPS problem P_σ such that the solutions of P_σ are in a direct correspondence with the solutions of the original problem with landmarks. Firstly, we take a set P_marks of n + 1 facts distinct from P, denoted as follows.

P_marks = {mark_0, ..., mark_n}

The meaning of fact mark_i is that the landmark actions a_1, ..., a_i have already been used in the correct order and that the action a_{i+1} can be used now. We will ensure that only one fact from P_marks can hold in any reachable state. We will add mark_0 to the initial state and we will require mark_n to be in the goal.
Definition 4. Let P = ⟨P, A, I, G⟩ and σ = ⟨a_1, ..., a_n⟩ and P_marks = {mark_0, ..., mark_n} such that P and P_marks are disjoint be given. For every action a_i let us define an action b_i as follows.

b_i = ⟨ pre(a_i) ∪ {mark_{i−1}},
        add(a_i) ∪ {mark_i},
        del(a_i) ∪ {mark_{i−1}} ⟩

The translation of the planning problem with landmarks ⟨P, σ⟩ into a classical STRIPS problem P_σ is defined as follows.

P_σ = ⟨ P ∪ P_marks, A ∪ {b_1, ..., b_n}, I ∪ {mark_0}, G ∪ {mark_n} ⟩
Basically, we take action a_i and we add mark_{i−1} to its preconditions, removing mark_{i−1} when a_i is used. Moreover, a use of a_i enables the use of the next action a_{i+1} from the list σ by adding mark_i to the effects. It is easy to show the following property.
Lemma 3. Let ⟨P, σ⟩ be a planning problem with landmarks. When π is a solution of P_σ, then π with the b_i's changed back to the a_i's is a solution of ⟨P, σ⟩. Moreover, when there is a solution of ⟨P, σ⟩, then there is a solution of P_σ.
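The compilation is purely syntactic; the following sketch implements Definition 4 over the illustrative Action type from Section 2.1 (the classical-problem record and the mark-fact naming are our own illustrative choices).

from dataclasses import dataclass

@dataclass
class StripsProblem:          # classical ⟨P, A, I, G⟩
    facts: frozenset
    actions: frozenset
    init: frozenset
    goal: frozenset

def compile_landmarks(problem: StripsProblem, sigma) -> StripsProblem:
    # Build P_σ: add the mark facts and replace each landmark a_i by b_i.
    marks = [f"mark{i}" for i in range(len(sigma) + 1)]
    landmark_actions = []
    for i, a in enumerate(sigma, start=1):
        landmark_actions.append(Action(
            name=f"{a.name}_landmark{i}",
            pre=a.pre | {marks[i - 1]},        # b_i requires mark_{i-1}
            add=a.add | {marks[i]},            # b_i produces mark_i
            delete=a.delete | {marks[i - 1]},  # b_i consumes mark_{i-1}
        ))
    return StripsProblem(
        facts=problem.facts | frozenset(marks),
        actions=problem.actions | frozenset(landmark_actions),
        init=problem.init | {marks[0]},        # I ∪ {mark_0}
        goal=problem.goal | {marks[-1]},       # G ∪ {mark_n}
    )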
Recall that every agent α is equipped with its local projection P^α of problem P, that is, a classical STRIPS problem defined as follows.

P^α = ⟨P_α, A_α, I ∩ P_α, G⟩

The set A_α of local actions consists of the α-internal actions α_int and the projections of public actions.

A_α = α_int ∪ {a^α | a ∈ A_pub}
MultiagentPlanningSupportedbyPlanDiversityMetricsandLandmarkActions
185
Figure 2: A scheme of the Tool problem.
In Algorithm 3, the initiator agent is asked, via function CreatePublicExtension(P, π′), to find a solution of its local projection P^α that has the given action sequence π′ as a subsequence, that is, its public extension. The initiator can simply solve the planning problem with landmarks ⟨P^α, π′⟩, as shown in the following Theorem 4. Note that in this case the landmarks from π′ are also in the set A_α of actions of P^α.
Theorem 4. A plan π is publicly α-extensible to a solution of P^α if and only if the planning problem with landmarks ⟨P^α, π⟩ is solvable. Moreover, a solution of the planning problem with landmarks serves as a proof of the extensibility, and vice versa.

Proof. It is quite straightforward to translate each plan π′ proving public extensibility of plan π to a solution of the planning problem with landmarks ⟨P^α, π⟩, and vice versa.
In Algorithm 2, the participant agents are asked, via the call AskAllAgents(π^pub), to establish whether π^pub is internally α-extensible to a solution of P^α. A participant can simply check the solvability of the planning problem with landmarks ⟨a_1^α, ..., a_n^α⟩, as shown in the following Theorem 5. Note that in this case the landmark actions are not in α_int.
Theorem 5. A plan π = ⟨a_1, ..., a_n⟩ is internally α-extensible to a solution of P^α if and only if the planning problem ⟨P_α, α_int, I ∩ P_α, G⟩ with landmarks ⟨a_1^α, ..., a_n^α⟩ is solvable. Moreover, a solution of the planning problem with landmarks serves as a proof of the extensibility, and vice versa.
6 EXPERIMENTS
For our experiments, we have designed the Tool Prob-
lem that allows us to observe a smooth transition in
the complexity of the problem.
We focused our experiments on the following criteria: (1) a comparison of the different estimators, and (2) the average number of iterations required to find a solution.
6.1 Tool Problem
In the Tool Problem, the goal is that each of N agents performs its public doGoal action, as shown in Figure 2. However, this action must first be preceded by the agent's internal useTool action. Only the initiator agent can provide tools, using the handTool action. Formally, there are N tools tool1, ..., toolN, and N + 1 agents (the initiator and N participants). In the initial state, none of the participants has its tool and the initiator has all of them. However, the initiator does not know that the participants need them. One of the possible solutions is as follows.
1. handTool(initiator, tool1)
   ...
N. handTool(initiator, toolN)
N+1. useTool(participant1, tool1)
   ...
2N. useTool(participantN, toolN)
2N+1. doGoal(participant1, tool1)
   ...
3N. doGoal(participantN, toolN)
Other permutations of the plan also form a valid
solution.
6.2 Results
Let us present our results for the Tool Problem with 2, 4, 6, 8, 10, and 12 tools. The graphs in Figures 3, 4, and 5 show the results of running our experiments 50 times.

Estimator Average Errors. Firstly, we compare both estimators presented in this paper: the Average Estimator (titled AVG in the graphs) and the Minimal Estimator (MIN). Each estimator is tested with two different distance metrics: the Different Actions Metric (DIFF) and the Levenshtein Distance Metric (LEV). Figure 3 demonstrates the progress of the estimator errors for the Tool Problem with 10 tools. Errors are computed from the average of 50 runs. As shown in the graph, the Average Estimator with the Different Actions Metric converges quickly to a very low error and thus it seems to be the best choice for this problem.
ICAART2014-InternationalConferenceonAgentsandArtificialIntelligence
186
Figure 3: Progress of the average error of plan qualities computed by different estimators for the Tool Problem with 10 tools.
Figure 4: Average errors of plan qualities computed from the first 80 iterations for the Tool Problem with a variable number of tools.
Figure 4 shows the average error of each estimator during the first 80 iterations for different sizes of the Tool Problem. We can see that the Average Estimator with the Different Actions Metric again shows the lowest errors in all cases, and furthermore, that its error decreases with increasing problem complexity.
Results for Tool Problem. Table 1 shows how many Tool Problems of different sizes have been solved during 50 runs using different plan generation techniques. We can see that most of the approaches perform better than a random generation of plans².
² We have implemented a simple baseline version of SinglePlan by translating the planning problem into a SAT problem instance and by calling an external SAT solver to solve it. It is easy to instruct a SAT solver to compute a solution different from the previously found solutions.
Figure 5: The number of Generate-And-Test iterations needed to solve different sizes of the Tool Problem using random generation of plans (RND) and generation driven by the Average Estimator with the Levenshtein Distance Metric (AVG+LEV). The graph shows the median (line in the rectangle) and the 25% and 75% quantiles (lower and upper bounds of the rectangle) of the results.
AVG+LEV again shows the best performance. Figure 5 shows a more detailed distribution of its results in comparison to the baseline random generation. This graph shows a significant improvement over the baseline solution, and that more complex instances of the Tool Problem can be solved using this technique.
Results for Rover Problem. Classical planners are compared at the International Planning Competition on a well-defined set of problems called the IPC problems. Unfortunately, most of these problems are by their nature single-agent problems and there is no standard way to convert them into a multiagent setting. Nevertheless, some of the problems are by their nature multiagent and fulfill all the requirements we have specified above in this article. One of these problems is called Rovers; its goal is to plan actions for multiple robotic rovers on Mars that need to collect samples and transmit their data back to Earth via a shared base.
Table 2 shows that we were able to solve some problem instances very quickly when the first plan generated by the initiator (rover0) was internally α-extensible for all the other agents and thus formed a solution of the problem. When the first generated plan was not a solution of the problem, the search for a solution usually timed out, because it requires a planner to find out that a problem has no solution. This constitutes a challenge for state-of-the-art planners, which usually perform best on problems which actually have a solution. When there is no solution
MultiagentPlanningSupportedbyPlanDiversityMetricsandLandmarkActions
187
Table 1: Percentage of successfully solved instances of the Tool Problem for different numbers of tools. Comparison of a reference random plan generator (RND) and different combinations of estimators and plan distance metrics.

           2     3     4     5     6     7     8     9    10    11    12
RND      100%  100%  100%  100%   98%   80%   54%   40%   16%    6%   10%
MIN+DIF  100%  100%  100%   98%   60%   36%   22%    6%   16%    2%    0%
MIN+LEV  100%  100%  100%   94%   96%  100%   90%   96%   68%  100%   76%
AVG+DIF  100%  100%  100%  100%   94%   88%   84%   72%   68%   70%   78%
AVG+LEV  100%  100%  100%  100%  100%  100%  100%  100%   96%   98%   82%
Table 2: Number of iterations needed to successfully solve Rovers problems from the IPC collection of planning problems. Problems marked by – were not solved because the problem was too large for the test of public extensibility and FD did not finish in a reasonable time. Two experiments did not finish because of an error in the FD planner during the test of internal extensibility (marked as E-P). The value 0 means that a solution was found immediately and successfully confirmed by all the participants without any need for negotiation.

# rovers  3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
iteration 2 0 0 0 0 0 E-P 0 0 E-P
then the planners usually get stuck in an exhaustive search of the whole plan space. Nevertheless, our planner was able to quickly solve a few of the harder instances of the problem, even faster than other multiagent planners (Nissim and Brafman, 2012).
7 FINAL REMARKS
We have proposed a novel approach to planning for MA-STRIPS problems based on the Generate-And-Test principle and an initiator–participants protocol scheme. We have experimentally compared various combinations of plan quality estimators and plan distance metrics improving the efficiency of the plan generation approach. Additionally, we have validated the principle of planning with landmarks by compilation to a classical planning problem, used as the testing part of the planner. The results show that the principle is viable and that the best combination of estimator and metric for the designed domain is averaging with the action difference metric.

In future work, we plan to test the planner on more planning domains, as it is designed from the beginning to be domain-independent, and to reinforce the plan generation process with elements of backtracking search. Additionally, the approach hinges on efficient solving of plan-(non)existence problems with landmarks (the plan extensibility problem), therefore we will analyze how to improve on that as well.
ACKNOWLEDGEMENTS
This research was supported by the Czech Science
Foundation (grant no. 13-22125S) and in part by a
Technion fellowship.
REFERENCES
Bäck, T. (1996). Evolutionary algorithms in theory and practice: evolution strategies, evolutionary programming, genetic algorithms. Oxford University Press, Oxford, UK.
Bhattacharya, S., Kumar, V., and Likhachev, M. (2010).
Search-based path planning with homotopy class con-
straints. In Felner, A. and Sturtevant, N. R., editors,
SOCS. AAAI Press.
Brafman, R. and Domshlak, C. (2008). From One to Many:
Planning for Loosely Coupled Multi-Agent Systems.
In Proceedings of ICAPS’08, volume 8, pages 28–35.
Decker, K. and Lesser, V. (1992). Generalizing the Par-
tial Global Planning Algorithm. International Jour-
nal on Intelligent Cooperative Information Systems,
1(2):319–346.
Doherty, P. and Kvarnström, J. (2001). TALplanner: A temporal logic-based planner. AI Magazine, 22(3):95–102.
Durfee, E. H. (1999). Distributed problem solving and plan-
ning. In Weiß, G., editor, A Modern Approach to
Distributed Artificial Intelligence, chapter 3. The MIT
Press, San Francisco, CA.
Fikes, R. and Nilsson, N. (1971). STRIPS: A new approach
to the application of theorem proving to problem solv-
ing. In Proceedings of the 2nd International Joint
Conference on Artificial Intelligence, pages 608–620.
Levenshtein, V. (1966). Binary Codes Capable of Cor-
recting Deletions, Insertions and Reversals. Soviet
Physics Doklady, 10:707.
ICAART2014-InternationalConferenceonAgentsandArtificialIntelligence
188
Nissim, R. and Brafman, R. I. (2012). Multi-agent A* for
parallel and distributed systems. In Proceedings of
AAMAS’12, pages 1265–1266.
Nissim, R., Brafman, R. I., and Domshlak, C. (2010). A
general, fully distributed multi-agent planning algo-
rithm. In Proceedings of AAMAS, pages 1323–1330.
Pellier, D. (2010). Distributed planning through graph
merging. In Filipe, J., Fred, A. L. N., and Sharp, B.,
editors, ICAART (2), pages 128–134. INSTICC Press.
Richter, S. and Westphal, M. (2010). The LAMA planner: guiding cost-based anytime planning with landmarks. J. Artif. Int. Res., 39(1):127–177.
Srivastava, B., Nguyen, T. A., Gerevini, A., Kambhampati,
S., Do, M. B., and Serina, I. (2007). Domain indepen-
dent approaches for finding diverse plans. In Veloso,
M. M., editor, IJCAI, pages 2016–2022.
Torreño, A., Onaindia, E., and Sapena, O. (2012). An approach to multi-agent planning with incomplete information. In ECAI, pages 762–767.
MultiagentPlanningSupportedbyPlanDiversityMetricsandLandmarkActions
189