• Tool (the object which is used by the hand to enhance the quality of some actions): it touches the hand all the time.
Action categories are based upon the objects with which the hand interacts. These fall into three categories:
1. Actions with main support: In this category the main object is always in touch with the main support; an example is shown in Fig. 2a.
2. Actions without main support: In this category the main object is lifted from the main support; an example is shown in Fig. 2b.
3. Actions with load and container: In this category a container with load, e.g. a glass filled with water, is used; an example is shown in Fig. 2c.
Several actions usually exist for each group. A more detailed list of actions is shown in Tab. 1. The full definition of the ontology is available online (http://www.dpi.physik.uni-goettingen.de/cns/index.php?page=ontology-of-manipulation-actions).
Now we can define the layers of the ontology.
Layer 1) SEC-based Object Relations at Start: The individual graphical panels in Fig. 2 represent the columns of a Semantic Event Chain (which reflect the transitions of object relations and are the necessary conditions for successful execution). Fig. 2b shows a pick and place action; its corresponding SEC is shown in the upper part of Fig. 3. The first column shows the SEC-defined pre-conditions. If and only if these touching relations are not violated can the action commence. But this is not yet sufficient.
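To make this first layer concrete, the following is a minimal sketch of such a pre-condition check, assuming a simple encoding in which a SEC column maps object-pair names to 'T' (touching) or 'N' (not touching); the pair names and the pick-and-place column below are illustrative, not the actual SEC of Fig. 3.

def sec_precondition_holds(scene_relations, sec_first_column):
    """Return True iff every relation demanded by the SEC's first column
    is satisfied by the observed touching relations of the scene."""
    return all(scene_relations.get(pair) == rel
               for pair, rel in sec_first_column.items())

# Hypothetical pre-conditions of a pick-and-place SEC (first column):
# the hand is free and the main object rests on its support.
pick_place_precondition = {
    ("hand", "main"): "N",
    ("main", "support"): "T",
}

observed = {("hand", "main"): "N", ("main", "support"): "T"}
print(sec_precondition_holds(observed, pick_place_precondition))  # True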
Layer 2) Object Topologies: All actions are performed on the main object, and this is only possible if the SEC pre-conditions hold and if the main object appears in the scene with certain topological connections to other objects. The middle part of Fig. 3 shows which topologies are permitted for pick and place.
Remarkably, there are only three possible topological relations to which all scenes that include the main object can be reduced. To achieve this, the complete connectivity graph of who-touches-whom is reduced to those subgraphs that contain the main object. Each subgraph consists of at least the main object and the support and, if directly touching neighbors exist, only one directly touching neighbor (Fig. 4). There are three cases:
1. The main object has only one touching relation.
The touched object is a support, e.g. a table (see
Fig. 4, left). A real world example is shown in
Fig. 7b; the blue plate is on top of the board and
the board becomes the support.
2. The main object has two touching relations. One
is a support, the second one is another object,
which is also touching the support (see Fig. 4,
middle). In Fig. 7b, the apple touches its support
(green plate) and the yellow pedestal which is on
the same support.
3. The main object has two touching relations. It touches its support and another object, which does not touch the support (see Fig. 4, right). In Fig. 7b, the pedestal is on top of the green plate and the jar is on top of the pedestal (but does not touch the green plate).
These subgraphs determine the remaining precon-
ditions. For example, a tower structure as shown in
Fig. 4 (right graph) is not allowed for pick and place
and pushing actions.
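The reduction and the classification into the three cases can be sketched roughly as follows, assuming the scene is given as an undirected who-touches-whom graph (a set of touching pairs) together with labels for the main object and its support; the object names are illustrative.

def touching_neighbors(edges, node):
    """All objects directly touching the given node."""
    return {b if a == node else a for a, b in edges if node in (a, b)}

def classify_topology(edges, main, support):
    """Reduce the connectivity graph to the subgraph around the main object
    and return which of the three permitted cases applies (or None)."""
    neighbors = touching_neighbors(edges, main)
    if neighbors == {support}:
        return 1                      # case 1: main touches only its support
    others = neighbors - {support}
    if support in neighbors and len(others) == 1:
        other = next(iter(others))
        if support in touching_neighbors(edges, other):
            return 2                  # case 2: neighbor also rests on the support
        return 3                      # case 3: neighbor is stacked, no support contact
    return None                       # not reducible to one of the three cases

# Example resembling Fig. 7b: apple on a plate, next to a pedestal on the same plate.
edges = {("apple", "plate"), ("pedestal", "plate"), ("apple", "pedestal")}
print(classify_topology(edges, main="apple", support="plate"))  # 2

A pick and place or pushing action would then only be admissible if the returned case is among those permitted for that action (case 3, the tower structure, would be rejected).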
Layer 3) Movement Primitives: SEC pre-conditions
and topological pre-conditions define the first two lay-
ers of the ontology. The third and last layer is a set of
movement primitives, which are needed to execute the
action.
For the pick and place action, the primitives are
shown at the bottom of Fig. 3. The complete list of
primitives for all actions is shown on the web page.
How to fill these abstract primitives with execution-relevant parameters will be described later; the actions are then executed in the same way as in (Aein et al., 2013).
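Purely for illustration, such an abstract primitive chain for pick and place could be represented as below before it is filled with execution-relevant parameters; the primitive names, their order, and the parameter placeholders are hypothetical and do not reproduce the actual list of Fig. 3, of which only move(object, T) is discussed explicitly in the text.

# Hypothetical abstract primitive chain for pick and place (names are placeholders).
pick_and_place_plan = [
    ("move",    {"target": "main", "transform": "T_grasp"}),   # approach the main object
    ("grasp",   {"target": "main"}),                           # close the gripper
    ("move",    {"target": "goal", "transform": "T_place"}),   # carry to the goal pose
    ("ungrasp", {"target": "main"}),                           # release the object
]

for name, params in pick_and_place_plan:
    print(name, params)   # later replaced by calls to the robot controller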
One primitive shall be explained in more detail: the move(object, T) primitive sends a command to the robot to move to a pose which is determined by applying transform T to the pose of object. The transform T has two parts, a vector p which specifies the translation and a matrix R which specifies the rotation. For example, when we want to grasp the main object, we perform a move(main, T) primitive to move the robot arm end effector to a proper pose for grasping. Since we want the end effector to reach the main object, the vector p in this case is equal to zero. However, the rotation part R needs to be set such that the robot approaches the main object from a proper angle. This is necessary to avoid possible collisions with other objects near the main object.
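A minimal sketch of this primitive is given below, assuming poses are represented as 4x4 homogeneous matrices and that T is composed in the object frame; this convention, the placeholder poses, and the top-down grasp rotation are assumptions for illustration only.

import numpy as np

def move(object_pose, R, p):
    """Return the end-effector target pose: transform T = (R, p) applied to the
    pose of the referenced object (composition in the object frame)."""
    T = np.eye(4)
    T[:3, :3] = R            # rotation part of T
    T[:3, 3] = p             # translation part of T
    return object_pose @ T

# Grasping the main object: p = 0 (the end effector should reach the object
# itself), R chosen so that the approach direction avoids nearby obstacles,
# here e.g. a straight top-down grasp.
main_pose = np.eye(4)                      # placeholder pose of the main object
R_top_down = np.array([[1.0,  0.0,  0.0],
                       [0.0, -1.0,  0.0],
                       [0.0,  0.0, -1.0]])
target_pose = move(main_pose, R_top_down, p=np.zeros(3))
print(target_pose)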
2.2 Algorithm for Execution-Preparation
Fig. 5 shows an overview of the algorithm used for
robotic execution of the above defined actions. Most
components rely on existing methods and will not be
described in detail.
We start with (1) an RGB-D recorded scene which
is (2) segmented using the LCCP algorithm (Stein