an adequate interface to the higher layers, but also
provide the necessary expressivity required to model,
e.g., robot-object interaction. For each component selected by the adaptation layer, one hybrid automaton
is introduced, and the various automata can interact
via synchronization or shared variables.
Using a variant of HA with inputs according to
(Stursberg, 2006), a hybrid automaton modeling the
system is given by HA = (X, U, Z, inv, Θ, g, f), with
X as the continuous state space, U the input space,
Z the set of discrete locations, inv the assignment
of invariant sets for the continuous variables to the
discrete locations, and Θ the set of discrete
transitions. A mapping g : Θ → 2^X associates a
guard set with each transition. The discrete-time
continuous dynamics f defining the system behavior
is x(t_{k+1}) = f(z(t_k), x(t_k), u(t_k)). At any time t_k, the
pair of discrete and continuous state forms the current
hybrid state s(t_k) = (z(t_k), x(t_k)). Hybrid automata
for modeling relevant components of the environment
are introduced without input sets U, since these
components are not directly controllable.
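To make the definition concrete, the following is a minimal Python sketch of such a hybrid automaton and its discrete-time step semantics. All names (`HybridAutomaton`, `step`, the "approach"/"dock" example) are illustrative assumptions, not part of the paper's formalism; states and inputs are scalars for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Minimal encoding of HA = (X, U, Z, inv, Theta, g, f):
# locations Z are strings, continuous states x and inputs u are floats,
# invariants inv and guards g are predicates over x.
@dataclass
class HybridAutomaton:
    locations: List[str]                                          # Z
    inv: Dict[str, Callable[[float], bool]]                       # invariant per location
    transitions: Dict[Tuple[str, str], Callable[[float], bool]]   # Theta with guards g
    f: Callable[[str, float, float], float]                       # discrete-time dynamics

    def step(self, z: str, x: float, u: float) -> Tuple[str, float]:
        """One discrete-time step: take an enabled transition if its guard
        holds, then apply the continuous update x(t_{k+1}) = f(z, x, u)."""
        for (src, dst), guard in self.transitions.items():
            if src == z and guard(x):
                z = dst
                break
        x_next = self.f(z, x, u)
        assert self.inv[z](x_next), "invariant violated in location " + z
        return z, x_next

# Toy example: a robot base that switches from 'approach' to 'dock'
# once it is closer than 0.5 to the goal at x = 0.
ha = HybridAutomaton(
    locations=["approach", "dock"],
    inv={"approach": lambda x: x >= 0.0, "dock": lambda x: x >= 0.0},
    transitions={("approach", "dock"): lambda x: x < 0.5},
    f=lambda z, x, u: x if z == "dock" else x + u,   # hold position once docked
)

z, x = "approach", 2.0
for _ in range(5):
    z, x = ha.step(z, x, -0.5)   # drive toward the goal
print(z, x)  # prints: dock 0.0
```

The hybrid state s(t_k) = (z, x) is simply the pair threaded through `step`; an environment automaton would be the same structure with the input argument dropped.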
Model Predictive Control. In order to generate con-
trol trajectories for the HA, the principle of model pre-
dictive control (MPC) is used (Morari et al., 1989).
Considering the control problem not only as a motion
planning problem, as is typically done in the robotics
domain, e.g. (LaValle, 2006), but as an MPC problem
has the following advantages: (1) an optimal solution
for a given cost function and time horizon is
computed, (2) model-based predictions of the behavior
of the system and the environment lead to more
reliable and robust results, and (3) a set of differential
and dynamic constraints can be considered relatively
easily. The MPC scheme solves, at any discrete point
of time t_k, an optimization problem over a finite
prediction horizon to obtain a sequence of optimal
control inputs to the system. The optimization problem
considers the dynamics of the system and the following
additional constraints: a sequence of forbidden regions
φ_{F,k} = {F_k, F_{k+1}, ..., F_{k+p}} and a sequence of
goal regions φ_{G,k} = {G_k, G_{k+1}, ..., G_{k+p}} are
specified over the prediction horizon p·(t_k − t_{k−1}).
Here, F_k denotes a state region that the system must
not enter, and G_k is the state set into which the system
should be driven. The constrained optimization problem
can be formulated as:
  min_{φ_{u,k}}  J(φ_{s,k}, φ_{e,k}, φ_{u,k}, p, φ_{G,k})
  s.t.  s(t_j) ∉ F_j   ∀ j ∈ {k, ..., k+p}
        u_min ≤ u(t_j) ≤ u_max
        φ_{s,k} ∈ Φ_s  and  φ_{e,k} ∈ Φ_e
where φ_{s,k} and φ_{e,k} are the predicted state trajectories
of the system and the environment, which must be
contained in the sets of feasible runs Φ_s and Φ_e. No
state s(t_j) contained in φ_{s,k} may lie in a forbidden
region F_j. u_min and u_max are the limits on the
control inputs u(t_j) and thus on the control trajectory
φ_{u,k}. A possible solution technique for the above
optimization problem is the following sequential
one: (1) an optimizer selects a trajectory φ_{u,k}, (2)
the models of the system and the environment are
simulated for this choice, leading (possibly) to feasible
φ_{s,k} and φ_{e,k}, (3) the cost function J is evaluated for
these trajectories, and (4) the result reveals whether φ_{u,k}
should be further modified to improve the costs
or whether the optimization has sufficiently converged.
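The sequential technique above can be sketched, under illustrative assumptions, as a simple random-shooting optimizer: 1-D integrator dynamics, a quadratic cost pulling the state toward the goal-set centers, and interval-shaped forbidden regions. All names (`simulate`, `goal_centers`, `forbidden`, `sequential_mpc`) are hypothetical; the paper does not fix a particular optimizer.

```python
import random

HORIZON = 5                               # prediction horizon p
U_MIN, U_MAX = -1.0, 1.0                  # input bounds u_min, u_max
goal_centers = [0.0] * HORIZON            # phi_G,k: drive the state toward 0
forbidden = [(2.5, 3.0)] * HORIZON        # phi_F,k: interval the state must avoid

def simulate(x0, controls):
    """Step (2): simulate the system model for a chosen control trajectory."""
    xs, x = [], x0
    for u in controls:
        x = x + u                         # simple integrator dynamics
        xs.append(x)
    return xs

def feasible(xs):
    """Constraint s(t_j) not in F_j for all j over the horizon."""
    return all(not (lo <= x <= hi) for x, (lo, hi) in zip(xs, forbidden))

def cost(xs, controls):
    """Step (3): evaluate the cost function J for the trajectories."""
    return (sum((x - g) ** 2 for x, g in zip(xs, goal_centers))
            + 0.1 * sum(u ** 2 for u in controls))

def sequential_mpc(x0, iterations=2000, seed=0):
    """Steps (1)-(4): repeatedly select phi_u,k, simulate, evaluate the cost,
    and keep the best feasible candidate found so far."""
    rng = random.Random(seed)
    best_u, best_J = None, float("inf")
    for _ in range(iterations):
        u = [rng.uniform(U_MIN, U_MAX) for _ in range(HORIZON)]  # step (1)
        xs = simulate(x0, u)                                     # step (2)
        if not feasible(xs):
            continue
        J = cost(xs, u)                                          # step (3)
        if J < best_J:                                           # step (4)
            best_u, best_J = u, J
    return best_u, best_J

u_star, J_star = sequential_mpc(x0=2.0)
```

In an actual implementation, the random sampling in step (1) would be replaced by a gradient-based or evolutionary optimizer, and `simulate` by the hybrid automata of system and environment.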
Knowledge Base and Learning. In order to improve
computational efficiency, the MPC scheme is enhanced
by a knowledge base in which assignments of action
sequences to situations are stored. The objective of the
learning unit with a knowledge base is to reduce the
computational effort by replacing or efficiently
initializing the optimization. The situations in the
knowledge base are formulated as tuples
δ = (φ_{s,k}, φ_{e,k}, φ_{G,k}, φ_{F,k}, φ_{u,k}, J), where s(t_k) and
e(t_k) again denote the current states of the system and
the environment, and φ_{u,k}, φ_{G,k}, and φ_{F,k} are the
sequences of control inputs, goal sets, and forbidden
sets, respectively. For a given situation at the current
time t_k, the learning unit infers a proper control
strategy φ^L_{u,k} by similarity comparison¹, if such a
strategy exists. Otherwise, the optimization is carried
out and the result is stored in the knowledge base.
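A minimal sketch of this lookup, assuming situations are compared by a small distance in the underlying hybrid state space (cf. the footnote on similarity): the class, threshold, and distance measure below are illustrative choices, not the paper's specification.

```python
def distance(sit_a, sit_b):
    """Distance between two situations: here simply the gap between the
    current system and environment states; goal and forbidden sets could
    be compared analogously."""
    (s_a, e_a), (s_b, e_b) = sit_a, sit_b
    return abs(s_a - s_b) + abs(e_a - e_b)

class KnowledgeBase:
    def __init__(self, threshold=0.2):
        self.entries = []           # list of (situation, controls, J)
        self.threshold = threshold  # maximum distance still counted as similar

    def lookup(self, situation):
        """Return a stored control strategy phi^L_u,k for the most similar
        known situation, or None if nothing is close enough."""
        best = min(self.entries, key=lambda e: distance(e[0], situation),
                   default=None)
        if best is not None and distance(best[0], situation) <= self.threshold:
            return best[1]
        return None

    def store(self, situation, controls, J):
        """After a fresh optimization, store the result for later reuse."""
        self.entries.append((situation, controls, J))

kb = KnowledgeBase()
kb.store((1.0, 0.0), [-1.0, -1.0, 0.0], J=1.2)
print(kb.lookup((1.1, 0.05)))   # similar situation -> [-1.0, -1.0, 0.0]
print(kb.lookup((5.0, 2.0)))    # dissimilar -> None, optimization must run
```

A reused strategy can either be applied directly or serve as the initial guess of the optimizer, which matches the "replacing or efficiently initializing" objective stated above.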
6 APPLICATION TO A KITCHEN
SCENARIO
The presented hierarchical architecture is applied to
a service robot in a kitchen scenario. The task of the
robot is to lay a table (see Fig. 3), i.e. the robot is
expected to drive back and forth between a table and
a kitchenette to position plates and cutlery on the
table. The scenario obviously poses a very challenging
planning task, as the robot has to decide which object
to take, how to move to the desired place at the table,
and how to avoid collisions with humans moving in
the same space – this is the motivation for employing
a decomposition-based planning approach. To simplify
the upcoming presentation, the
¹ Similarity is here defined by small distances of the
quantities specifying a situation in the underlying hybrid
state space.
EFFICIENT PLANNING OF AUTONOMOUS ROBOTS USING HIERARCHICAL DECOMPOSITION