Articulated Object Modeling based on Visual and Haptic Observations
Wei Wang¹, Vasiliki Koropouli¹, Dongheui Lee¹ and Kolja Kühnlenz²,³
¹Institute of Automatic Control Engineering (LSR), Technische Universität München, D-80290 München, Germany
²Institute of Advanced Study (IAS), Technische Universität München, D-80290 München, Germany
³Bayerisches Landesamt für Maß und Gewicht, D-80638 München, Germany
Keywords: Articulated Object Modeling, Object Skeletonization, Vision-based Articulated Object Manipulation.
Abstract: Manipulation of articulated objects constitutes an important and hard challenge for robots. This paper proposes an approach to model articulated objects by integrating visual and haptic information. Line-shaped skeletonization based on depth image data is realized to extract the skeleton of an object given different configurations. Using observations of the extracted object's skeleton topology, the kinematic joints of the object are characterized and localized. Haptic data, in the form of the task-space force required to manipulate the object, are collected by kinesthetic teaching and learned by Gaussian Mixture Regression in the object's joint state space. Following modeling, manipulation of the object is realized by first identifying the current object joint states from visual observations and second generalizing the learned force to accomplish the new task.
1 INTRODUCTION
Most tasks in human daily life require manipulation of articulated objects of one or more degrees of freedom. Characteristic examples of such tasks are door opening, drawer pulling and rotating a water tap. Manipulation of articulated objects is a great challenge for robots, which are required to recognize an articulated object mostly by vision and make a decision about how to manipulate it. By making robots capable of manipulating articulated objects, they could participate more actively in human life, help humans with dangerous or difficult tasks, and assist elderly people in daily life.
Many previous works on articulated object modeling mainly focus on identifying the kinematic characteristics of articulated objects using different types of sensor systems. In (Sturm et al., 2011), an approach is presented to learn kinematic models of articulated objects from observations; it does not, however, allow for object identification and ignores kinematic joint localization and constraints in the object. In (Katz and Brock, 2008), kinematic task-relevant knowledge is acquired and learned in the object's joint state space. This is realized via interaction with the environment and, finally, a kinematic model of the object is incrementally built.
However, only visual data is employed and information about the dynamic properties of the object is not taken into account for manipulation.

Figure 1: A 7 DoF robotic arm manipulates a car's door (single revolute joint articulated object).

In (Huang et al., 2012), the joint axis positions of an articulated object are estimated given different object configurations from depth image data. This aims at providing the grasping point and position trajectory to the robot.
Figure 2: Proposed framework, consisting of an articulated object model building stage (object skeletonization from vision data over frames 1 to N, check whether the object has a single or multiple joints, joint type recognition, joint localization, and haptic learning for manipulation via kinesthetic teaching, yielding the articulated object model) and an object recognition for manipulation stage (object joint state estimation from the current frame's vision data, combined with the task goal, for task execution).
However, recognition of the object configuration is not considered.
All previous works lack a framework for recognition of different articulated objects. In addition, they do not allow the robot to estimate the current joint states of the object and adapt its manipulation behavior accordingly. Moreover, previous works do not account for learning the force that is required to operate an object. For example, opening a completely closed door and opening a semi-closed door are two different tasks which require different manipulating forces. Some other works focused on learning manipulation of articulated mechanisms by learning force control skills while ignoring the mechanism structure of the object. In these works (Kalakrishnan et al., 2011), (Lutscher et al., 2010), no visual information is used to recognize the object and characterize the number and type of joints and the constraints that apply to each joint of the object. Therefore, these approaches cannot generalize to objects with different structures or configurations.
All works on articulated objects so far focus either on using visual data for object characterization without learning manipulation force, or on learning manipulation force skills without analyzing the articulation characteristics of the object. Learning manipulation of even a single-joint articulated object is a challenging problem, since the articulation characteristics of the object have to be extracted before an appropriate manipulation force can be learned. We thus first seek to solve the problem for single-joint articulated objects and will extend the approach to multiple-joint objects in future work. In this paper, a framework for learning manipulation skills for single-joint articulated objects is proposed, which consists of (a) skeletonization of the object, (b) joint number estimation based on the object skeleton trace from different visual frames, (c) characterization of the joint type, and (d) learning of the Cartesian force which is required for manipulation. In particular, visual data are employed to build the object skeleton and estimate the current state of the object's joint. The trace of the skeleton nodes over time is employed to determine whether the object has a single joint or multiple joints. In addition, haptic data in the form of Cartesian-space forces are captured from multiple human demonstrations by kinesthetic teaching and learned in the object's joint state space. Generalization of the manipulation force can be realized based on the current joint state and the task goal.
This paper is organized as follows. In Section 2, we define our problem and propose a method to skeletonize an articulated object and learn the manipulation force. In Section 3, the experimental setup and results are presented.
2 PROPOSED APPROACH
To manipulate articulated objects, information about both the structure and the kinematic and dynamic properties of the object is required. An articulated object can be described by its number and type of joints, its link properties and the kinematic relationships between neighboring links. Basic geometry features used for rigid object modeling and recognition, such as the Viewpoint Feature Histogram (VFH) (Rusu et al., 2010), are not suitable for deformable objects. Moreover, these approaches require complete depth information of the object. Since articulated objects can lie in a practically huge number of different configurations, capturing information about all these potential configurations is practically infeasible. For this reason, object skeletonization is the most suitable method for extracting the structure and kinematic constraints of an object.
We define the model of an articulated object as

$$\mathrm{Obj} = (S,\ J_m(T, P, C),\ \mathbf{f}), \quad m = 1, \ldots, M \tag{1}$$

where $S$ represents the skeleton of the object, which is used for object recognition, $J_m$ the joint descriptor of the $m$-th joint, $T$ the joint type, $P$ the joint position and $C$ the joint constraints. $\mathbf{f}(J_1, \ldots, J_M)$ is the Cartesian force which is needed to manipulate the object, where $J_1, \ldots, J_M$ are the joint descriptors of the articulated object and $M$ is the number of joints.
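For concreteness, the model (1) can be organized as a small data structure. The following Python sketch is purely illustrative; the names and types are our own choices, not a prescribed implementation:

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple
import numpy as np

@dataclass
class JointDescriptor:
    """One joint J_m of Eq. (1): type T, position P and constraints C."""
    joint_type: str                    # T: "revolute" or "prismatic"
    position: np.ndarray               # P: joint location in 3-D space
    constraints: Tuple[float, float]   # C: e.g. (min, max) angle or extension

@dataclass
class ArticulatedObjectModel:
    """Obj = (S, J_m(T, P, C), f) of Eq. (1)."""
    skeleton: List[np.ndarray]               # S: skeleton node positions
    joints: List[JointDescriptor]            # J_1, ..., J_M
    force_policy: Optional[Callable] = None  # f(joint states, task goal)
```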
Investigating multiple-joint objects is highly complicated and implies sufficient modeling of all individual joints of the object. For this reason, we focus in this paper on modeling single-joint articulated objects, where visual and haptic information is integrated for highly efficient object manipulation.
VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications
254
Figure 3: Skeletonization steps of a multi-joint articulated object (phone arm): (a) object extraction based on RANSAC plane estimation; (b) vector field generation based on the optimized cutting plane (iteration #0 to iteration #n); (c) vector field smoothing based on Gaussian-weighted nearest neighbors; (d) line-shape skeleton extraction based on vector field clustering; (e) skeleton topology estimation based on line growth.
The framework presented here can, however, be extended to modeling multiple-joint objects, which will be presented in future work. Manipulation force constitutes part of an object's model since it indicates the dynamic properties of the object. This force is critical to the success of a robotic task and depends on the object's current joint states. The manipulating force can be represented by $\mathbf{f} = \pi(s_{J_m}, e)$, $m = 1, \ldots, M$, where $\pi$ is a force generation policy, $s_{J_m}$ the state of the $m$-th joint, which may represent the angle of a rotational joint or the length of a prismatic joint, and $e$ the task goal.
Fig. 2 shows the framework which is used to model a single-joint articulated object. The framework consists of two main components: building a database of articulated objects' models and recognizing an incoming object based on visual and haptic information. The modeling stage can be divided into two parts, where the first part involves vision-based object skeleton extraction and the second part consists of identification of the object's dynamic properties by teaching the robot the appropriate force to operate the object. The kinematic joint properties $(T, P, C)$ of a joint $J$ are estimated from observation of the skeleton $S$ across multiple configurations. Using learning from demonstrations, the appropriate force $\mathbf{f}$ is learned in the object's joint space. During generalization, the robot observes the object and extracts its current joint state. The force is generated based on the task goal, such as the position or joint angle the object should finally reach, and the current joint state.
2.1 Object Skeletonization
A point cloud, in terms of depth image data of an object, is used for skeletonization of the articulated object. This is realized by observing multiple frames of the object's kinematic links. The skeleton of the object is extracted, which allows to recognize the object and estimate its current joint states. Based on the extracted object skeleton and the location of the skeleton nodes, the object is classified as a single- or multi-joint object. Skeleton models which represent the medial axis of a 3D model are widely used for object reconstruction and arterial object analysis. In (Tagliasacchi et al., 2009), a rotational symmetry axis is used for estimating the object skeleton points. This work requires the full-range point cloud of the object and assumes that the object model is pipe-like. Instead, in this paper, a novel method for skeletonization of articulated objects is presented, which is not restricted to pipe-like configurations but can identify objects of more abstract structures, such as plane-like structures. The phone arm shown in Fig. 3 and the car door shown in Fig. 5 are two examples of objects with different types of structure, the former pipe-like and the latter plane-like.
2.1.1 Vector Field Generation
Firstly, the Random Sample Consensus (RANSAC)-based plane fitting algorithm is used to extract the object point cloud from the background (Rusu et al., 2010), as shown in Fig. 3(a) and Fig. 5(b). The vector field represents the best local rotational symmetry of each point in the extracted object point cloud. Our method extracts the vector field using an optimized cutting plane. Based on RANSAC plane estimation with a certain number of iteration steps $T_c$, the vector field over the data points is generated. The best cutting plane $C_c = \mathrm{plane}[x_i, v_i]$, which goes through the point $x_i$ with the normal $\hat{v}_i$, is estimated by minimizing the number of inliers which are within the distance $d_c$. In addition, these points should also be in the same cluster $N_i$ of the related point $x_i$, obtained using the geometric nearest neighbors:

$$\hat{v}_i = \underset{v \in \mathbb{R}^3,\ \|v\|=1}{\arg\min}\ \mathrm{num}\left(\left\{ j \in N_i \ \middle|\ \| c_j - C_c^{(t)} \| \leq d_c;\ x_j \in X_{\mathrm{raw}} \right\}\right),$$

where $t \in [1, T_c]$ is the iteration index. Fig. 3(b) shows the result, where the circles show the iteration steps.
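The search over candidate cutting planes can be realized in several ways; the following Python sketch shows one plausible RANSAC-style reading, in which $T_c$ random unit normals are tested per point and the one producing the fewest plane inliers among the geometric neighbors $N_i$ is kept. The parameter names $d_c$ and $T_c$ follow the text; everything else is an assumption:

```python
import numpy as np

def estimate_cutting_normal(points, i, neighbor_idx, d_c=0.01, T_c=100,
                            rng=np.random.default_rng(0)):
    """For point x_i, test T_c random unit normals and keep the one whose
    plane through x_i contains the fewest neighbors within distance d_c."""
    x_i = points[i]
    nbrs = points[neighbor_idx]                  # cluster N_i of x_i
    best_v, best_count = None, np.inf
    for _ in range(T_c):
        v = rng.normal(size=3)
        v /= np.linalg.norm(v)                   # candidate unit normal
        dist = np.abs((nbrs - x_i) @ v)          # point-to-plane distances
        count = np.sum(dist <= d_c)              # inliers of plane[x_i, v]
        if count < best_count:
            best_v, best_count = v, count
    return best_v
```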
ArticulatedObjectModelingbasedonVisualandHapticObservations
255
Figure 4: Traces (X, Y, Z in meters) of skeleton nodes #1 to #5 through visual frames #1 to #11: black lines present the skeleton topology; each skeleton node trace is shown by a different-color solid line.
Note that the direction of the optimized cutting plane could be inverted; this, however, does not influence the final results. The directions are reoriented based on the base plane coefficients.
A Gaussian-weighted method is developed for smoothing the vector field. The point $x_i$ with normal $v_i$ has a neighbor cluster $X_i$ with $n$ points, which is determined by the distance threshold $d_s$. The weight function $w$ is defined as a Gaussian contribution, decided by each neighbor's 3-D distance with respect to the point $x_i$:

$$w_j = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{1}{2\sigma^2}\|x_j - x_i\|^2\right), \qquad v_{i:\mathrm{new}} = \frac{\sum_{j=1}^{n} w_j v_j}{\sum_{j=1}^{n} w_j}, \quad x_j \in X_i. \tag{2}$$

In our case, the standard deviation $\sigma = 1$ is used. Fig. 3(c) and Fig. 5(c) show the smoothed vector fields over objects of different shapes.
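A minimal Python sketch of the smoothing step (2) follows; the constant factor $1/(2\pi\sigma^2)$ is omitted because it cancels in the normalized weighted sum:

```python
import numpy as np

def smooth_vector_field(points, normals, d_s=0.05, sigma=1.0):
    """Eq. (2): replace each normal v_i by the Gaussian-weighted average
    of the normals of its neighbors within radius d_s."""
    smoothed = np.empty_like(normals)
    for i, x_i in enumerate(points):
        d2 = np.sum((points - x_i) ** 2, axis=1)
        mask = d2 <= d_s ** 2                     # neighbor cluster X_i
        w = np.exp(-0.5 * d2[mask] / sigma ** 2)  # weights w_j (constants cancel)
        v = (w[:, None] * normals[mask]).sum(axis=0) / w.sum()
        smoothed[i] = v / np.linalg.norm(v)       # renormalize the direction
    return smoothed
```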
2.1.2 Line-shape Skeleton Estimation
The skeleton of the object is described by lines and linked nodes, named skeleton nodes. After smoothing, the vector field is clustered using the nearest-neighbor clustering method (Wang et al., 2011), which considers both positions and directions. Meanwhile, the final skeletal point positions are extracted as the centers of the raw object points that lie within a distance threshold of the cutting plane through the related vector point. In this way, skeletal points can also be extracted from planar objects. In contrast, (Tagliasacchi et al., 2009) minimize the sum of squared distances from the point to the related normals, which causes the positions of the skeletal points of a planar object to go to infinity. The best line $l$ is then extracted by minimizing the sum of distances to the extracted skeletal points. The line detection result is shown in Fig. 3(d).
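The text does not fix the line estimator; a standard least-squares choice, sketched below under that assumption, fits the line through the centroid along the principal direction of the skeletal points:

```python
import numpy as np

def fit_skeleton_line(skeletal_pts):
    """Fit a line minimizing the sum of squared distances to the skeletal
    points (PCA through the centroid); returns a point and a direction."""
    c = skeletal_pts.mean(axis=0)                # point on the line
    _, _, Vt = np.linalg.svd(skeletal_pts - c)   # principal directions
    d = Vt[0]                                    # dominant direction
    return c, d                                  # line: c + t * d
```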
2.1.3 Skeleton Topology Extraction
The line detection result presented in Fig. 3(d) does not constitute the whole skeleton of the object, since some skeleton points have been filtered out by the clustering step. For this reason, a line growth algorithm, sketched below, is used to estimate the whole skeleton topology. All detected lines grow in both the positive and the negative direction to cover the whole skeleton. A line stops growing when it

(i) reaches the edge of the object point cloud; its endpoint is then viewed as a skeleton root node, such as Node 1 and Node 5 in Fig. 4;

(ii) meets another skeleton line; its endpoint is then characterized as a skeleton link node, such as Nodes 2, 3 and 4 in Fig. 4.

These points are clustered and merged using 3-D Euclidean clustering (Wang et al., 2011). Then the whole set of object skeleton nodes is extracted, and the root and link nodes indicate the topology of the object skeleton. The results are shown in Fig. 3(e) and Fig. 5(c). Different colored points represent the different estimated skeleton nodes and the dashed line links represent the skeleton topology.
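A schematic sketch of this line growth procedure, with illustrative step and radius parameters, could look as follows (the object point cloud and the sample points of the other detected lines are assumed to be indexed by k-d trees):

```python
import numpy as np
from scipy.spatial import cKDTree

def grow_line(p0, direction, cloud_tree, other_lines_tree,
              step=0.01, support_r=0.02, merge_r=0.02, max_steps=10000):
    """Grow a detected line from p0 along +/-direction until it leaves
    the point cloud (root node) or meets another line (link node)."""
    nodes = []
    for sign in (+1.0, -1.0):
        p = p0.copy()
        for _ in range(max_steps):
            p_next = p + sign * step * direction
            if not cloud_tree.query_ball_point(p_next, support_r):
                nodes.append(("root", p))        # left the point cloud
                break
            if other_lines_tree.query_ball_point(p_next, merge_r):
                nodes.append(("link", p_next))   # met another skeleton line
                break
            p = p_next
    return nodes
```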
2.1.4 Kinematic Joint Number Determination
As shown in Fig. 4 and Fig. 5(d), the object skeleton topology is extracted frame by frame for different configurations of the articulated object. The dashed lines represent the object skeleton topology and the traces of the different extracted skeleton nodes are shown as different colored solid lines. The traces of the skeleton nodes across frames make the dynamic behavior of the object apparent. From frames 1 to 8, it is obvious that the observation patterns of nodes 3 to 5 differ from the patterns from frames 8 to 11. That these two kinds of patterns in the skeleton topology of the object change implies that the estimated object is not a single-joint articulated object. The skeleton node $S_0$ is viewed as the base node, and the Euclidean distances to the other nodes are computed as $E_i = \|S_0 - S_i\|$, $i \in [1, n]$. These are used to calculate the difference cost function $\mathrm{DIF}_j$ between the current frame $j$ and the previous frame $j-1$ as follows:

$$\mathrm{DIF}_j = \sum_{i=1}^{n} \frac{\left|E_i^{\,j} - E_i^{\,j-1}\right|}{E_i^{\,j-1}}, \quad j \in [1, F] \tag{3}$$
VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications
256
where $F$ is the number of frames. At frame 9, $\mathrm{DIF}_9$ increases significantly, which means that this articulated object contains multiple kinematic joints. In comparison, as shown in Fig. 5(d), the car door is a single-joint articulated object.
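The check of Eq. (3) is straightforward to implement. A sketch, with an illustrative decision threshold, follows:

```python
import numpy as np

def is_multi_joint(node_traces, threshold=0.5):
    """Evaluate Eq. (3). node_traces has shape (F, n+1, 3): the positions
    of the base node S_0 and n further skeleton nodes over F frames.
    A large jump in DIF_j indicates more than one kinematic joint; the
    threshold value here is illustrative, not from the paper."""
    # E_i per frame: distance of every node to the base node S_0
    E = np.linalg.norm(node_traces[:, 1:, :] - node_traces[:, :1, :], axis=2)
    DIF = np.sum(np.abs(E[1:] - E[:-1]) / E[:-1], axis=1)  # shape (F-1,)
    return bool(np.any(DIF > threshold)), DIF
```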
Once the joint number has been determined from the object skeleton topology observations of different demonstrations, the kinematic joint characterization and localization can be carried out with different strategies. For a single-joint articulated object, the trajectory of one of the object skeleton nodes represents the whole object motion pattern and can be used for its kinematic joint characterization. For a multi-joint articulated object, on the other hand, all skeleton node trajectories need to be analyzed hierarchically to extract the properties of all kinematic joints.
2.2 Kinematic Joint Characterization
The kinematic joints of the articulated object are distinguished into two types, prismatic and revolute (Sturm et al., 2011). Given the positional trajectories of the end-effector of the object, it is rather straightforward to discriminate between the two types of joints. The position vector of a point $A$ of an articulated object which is moving in 3D space can be expressed as $\vec{g} = g_x \hat{x} + g_y \hat{y} + g_z \hat{z}$. If only one positional component is non-zero, the joint is prismatic. The positional components are digitized as follows: if a component is different from zero, it is assigned the value 1, otherwise the value 0. The digitized components $g_x$, $g_y$ and $g_z$ can be the input to a Boolean logic scheme which is equivalent to the numerical computation given by
$$Y = (g_x g_y + g_z - g_x g_y g_z)(g_x + g_y - g_x g_y). \tag{4}$$

With digitized inputs, (4) evaluates to 1 exactly when at least two of the components are non-zero, and to 0 otherwise.
By applying (4) at each time step and taking the average $\bar{Y}$ of all outputs $Y(n)$, where $n$ is the time index, we deduce whether the joint is revolute or prismatic. If $\bar{Y} = 0$, the joint is prismatic; if $\bar{Y} \neq 0$, the joint is revolute. In case a joint is revolute, and thus causes a rotational movement, the angle range of the joint is estimated. The positional data of the end-effector of the articulated object are recorded during demonstrations of the task. The angle range is computed by $\theta(n) = \arctan(\bar{g}_i(n)/\bar{g}_j(n))$, where $n = 1, \ldots, N$ is the time index and $\bar{g}_i$ and $\bar{g}_j$ are the two non-zero average positional trajectories in directions $i$ and $j$.
Since many demonstrations are available, the average positional trajectories are computed as $\bar{g}_i(n) = \frac{1}{K}\sum_{k=1}^{K} g_i^{(k)}(n)$ and $\bar{g}_j(n) = \frac{1}{K}\sum_{k=1}^{K} g_j^{(k)}(n)$, where $g_a^{(b)}$ is the position of demonstration $b$ in direction $a$ and $K$ is the number of demonstrations of the task.
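Putting the digitization and Eq. (4), as reconstructed above, together, joint type classification can be sketched in a few lines; the tolerance eps is an illustrative choice:

```python
import numpy as np

def classify_joint(positions, eps=1e-3):
    """Digitize the end-effector displacement components per time step
    and average Eq. (4); Y_bar == 0 indicates a prismatic joint,
    otherwise a revolute joint. positions has shape (N, 3)."""
    disp = np.abs(positions - positions[0])   # displacement per time step
    g = (disp > eps).astype(float)            # digitized components, (N, 3)
    gx, gy, gz = g[:, 0], g[:, 1], g[:, 2]
    Y = (gx * gy + gz - gx * gy * gz) * (gx + gy - gx * gy)  # Eq. (4)
    return "prismatic" if np.mean(Y) == 0 else "revolute"
```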
2.3 Learning Force Skills
We wish to extract an average expert behavior for a task based on multiple demonstrations (Lee and Ott, 2011). Since the speed of the demonstrator varies from trial to trial and the demonstrations are not time-aligned, they are first time-aligned by Dynamic Time Warping. The force policy of a task is then extracted from the multiple demonstrations using the probabilistic approach proposed in (Calinon et al., 2007). This approach consists of Gaussian Mixture Modeling and Regression and estimates a smooth generalized version of the demonstrated signals which captures all the important features of the task.
Time-aligned data pairs $d_i = \{s_i, \mathbf{f}_i\}$, $i = 1, \ldots, N$ are considered, where $N$ is the number of data points in each demonstration, $s_i$ are the input joint states and $\mathbf{f} \in \mathbb{R}^{D \times N}$ represents the force data, $D$ being the dimensionality of $\mathbf{f}$. A mixture of $L$ Gaussian functions is considered with probability density function $p(d_i) = \sum_{l=1}^{L} p(l)\, p(d_i \mid l)$, where $p(d_i \mid l)$ is a conditional probability density function and $p(l) = \pi_l$ is the prior of the $l$-th distribution. We model the mapping from joint angles to endpoint forces by a mixture of $L$ Gaussian functions. It is

$$p(d_i \mid l) = \frac{1}{\sqrt{(2\pi)^D |\Sigma_l|}} \exp\left(-\frac{1}{2}(\xi_i - \mu_l)^T \Sigma_l^{-1} (\xi_i - \mu_l)\right)$$

where $\{\pi_l, \mu_l, \Sigma_l\}$ is the parameter set of the Gaussian functions, represented by the prior probability, the mean and the covariance matrix. The parameters of the mixture are estimated using the Expectation-Maximization (EM) algorithm. Following learning of the mixture parameters, a generic form of the signals $\mathbf{f}_i$ is reconstructed using Gaussian Mixture Regression (GMR). The states $s_i$ are employed as inputs and the output vectors $\hat{\mathbf{f}}_i$ are estimated by regression. The mean and covariance matrix of the $l$-th Gaussian component are partitioned as

$$\mu_l = \{\mu_{s,l},\ \mu_{f,l}\}, \qquad \Sigma_l = \begin{pmatrix} \Sigma_{s,l} & \Sigma_{sf,l} \\ \Sigma_{fs,l} & \Sigma_{f,l} \end{pmatrix}.$$

The conditional expectation and covariance of the signal $\mathbf{f}_l$ given $s$ are $\hat{\mathbf{f}}_l = \mu_{f,l} + \Sigma_{fs,l}(\Sigma_{s,l})^{-1}(s - \mu_{s,l})$ and $\hat{\Sigma}_{f,l} = \Sigma_{f,l} - \Sigma_{fs,l}(\Sigma_{s,l})^{-1}\Sigma_{sf,l}$. Finally, the conditional expectation and covariance of $\mathbf{f}$ given $s$ for a mixture of $L$ Gaussian components are defined by $\hat{\mathbf{f}} = \sum_{l=1}^{L} \beta_l \hat{\mathbf{f}}_l$ and $\hat{\Sigma}_{\mathbf{f}} = \sum_{l=1}^{L} \beta_l^2 \hat{\Sigma}_{f,l}$, where $\beta_l = p(s \mid l)/\sum_{j=1}^{L} p(s \mid j)$ is the responsibility of the $l$-th Gaussian for $s$. The task force profile $\mathbf{f}$ is learnt in the joint space $s$, which is represented by the angle $\theta$.
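This scheme is compact enough to sketch directly. The following Python sketch is an illustration rather than the original implementation; it uses scikit-learn's GaussianMixture for the EM step and writes out the GMR equations explicitly, with an arbitrary illustrative component count of L = 5:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def learn_force_policy(s, f, L=5):
    """Fit L Gaussians on pairs d_i = {s_i, f_i} (s: joint angles, shape
    (N,); f: forces, shape (N, D)) and return a GMR policy s -> f_hat(s)."""
    data = np.column_stack([s, f])
    gmm = GaussianMixture(n_components=L, covariance_type="full",
                          random_state=0).fit(data)
    mu, Sig, pi = gmm.means_, gmm.covariances_, gmm.weights_

    def policy(s_q):
        # beta_l ~ pi_l * N(s_q | mu_{s,l}, Sigma_{s,l}); the constant
        # factors cancel in the normalization below.
        beta = np.array([pi[l] / np.sqrt(Sig[l, 0, 0])
                         * np.exp(-0.5 * (s_q - mu[l, 0]) ** 2 / Sig[l, 0, 0])
                         for l in range(L)])
        beta /= beta.sum()
        # f_hat_l = mu_{f,l} + Sigma_{fs,l} Sigma_{s,l}^{-1} (s_q - mu_{s,l})
        f_hat = sum(beta[l] * (mu[l, 1:]
                    + Sig[l, 1:, 0] / Sig[l, 0, 0] * (s_q - mu[l, 0]))
                    for l in range(L))
        return f_hat

    return policy
```

A policy obtained as policy = learn_force_policy(theta, forces) can then be queried at the currently observed joint angle to generate the force to apply.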
3 EXPERIMENTAL RESULTS
This paper focuses on skeletonization and manipulation of a single-joint articulated object.
ArticulatedObjectModelingbasedonVisualandHapticObservations
257
Figure 5: Skeletonization of a car door which has a single revolute kinematic joint: (a) data acquisition with different configurations; (b) object extraction based on RANSAC plane estimation; (c) vector field and extracted skeleton topology; (d) skeleton node traces (X, Y, Z in meters; 5 skeleton nodes, 7 frames) through different frames.
Figure 6: Angle state space (joint angle over time, for door opening and door closing) estimated based on the position of the car's door handle. The joint angles are expressed in degrees. The time step is equal to 1 ms.
We demonstrate the performance of the proposed method in a pitstop scenario where a single-joint car door is to be recognized and manipulated. A model of the door, represented by (1), is built, which contains the skeleton topology, the kinematic descriptor of the door's joint and the end-effector force required for manipulation.
The point cloud of the door is acquired by a Kinect sensor (http://www.primesense.com) which is mounted on top of the robot, as shown in Fig. 1. These data are used for skeletonization of the door and estimation of the skeleton node traces over different frames, shown in Fig. 5. The skeletonization of the object is realized partially based on the Point Cloud Library (http://www.pointclouds.org). We wish to learn manipulation skills in terms of the force which is required to open or close this single-joint car door.
The appropriate force is demonstrated to the robot by kinesthetic teaching and learned from multiple demonstrations of a task using the proposed approach. Several demonstrations of a door-opening-and-closing task are provided to a 7 DoF robotic arm. Task-space forces as well as end-effector positional trajectories are captured during the demonstrations.
Figure 7: Learning the generalized 2-dimensional force profile ($F_x$, $F_y$ in N, over the joint angle in degrees) of a task in joint angle space, given 3 task demonstrations: (a) door opening, (b) door closing.
Following task-space force learning, generalization is required to situations where the initial door position may differ, based on the task goal such as opening or closing. To this end, the force constraints of the task are learnt with respect to the door's joint states. The current joint states are estimated using the current frame's visual data.
The skeletonization of the car door is shown in Fig. 5, where the door is recognized as a single-joint articulated object using (3). We observe that the trace of the skeleton node exhibits the same motion pattern as the robot arm's end-effector trajectory. The current joint state of the door can be obtained from the skeleton topology position and the learned rotational joint model of the door. Every demonstration consists of a door-opening and a door-closing phase without any interruption between the two phases. The different start and end points of each trial are due to a slight sliding movement of the robot end-effector along the handle of the door. Given the manipulation trajectory, the type of joint is first identified using the algorithm described in Section 2.2.
VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications
258
Figure 8: Generalized forces $F_x$ and $F_y$ (in N, over time) for door opening and closing where the door is initially open at 8 degrees. The time step is equal to 1 ms.
door’s joint is characterized as revolute and estimate
the joint space constrains which is computed, see Fig.
6. This angle space constitutes the input state space in
terms of which the force trajectories are learned from
multiple demonstrations. Fig. 7 shows learning of the
2-dimensional force for a door opening-closing task
from 3 demonstrations by using the method described
in Section 2.3. The force is learned separately for the
two phases of the task. Following learning, we desire
to generalize the force generation policy to different
tasks with different current state. More specifically,
the case is considered where the car door is already
open at 8 degrees and the force profile is estimated
which needs to be exerted in order to open the door
completely and close it afterwards. Fig. 8 shows the
generalized force for this task where the two phases,
opening and closing.
4 CONCLUSIONS
In this paper, we propose a method for articulated object modeling by combining visual and haptic data. Visual processing contributes to recognizing the object and identifying its structure, more specifically its skeleton topology, the number and type of joints, as well as the current joint states. Haptic data represented by force are learned from multiple task demonstrations in order to be able to operate the articulated mechanism. The forces are encoded with respect to the joint states so that the system can generalize to new situations where the initial object configuration, and thus the joint state, differs. The proposed method is demonstrated in the manipulation of a single-joint car door. Future work will focus on modeling a wide variety of objects, including objects with more than one joint.
ACKNOWLEDGEMENTS
This work is supported in part by the DFG excellence initiative research cluster "Cognition for Technical Systems (CoTeSys)", see www.cotesys.org, the FP7 EU-STREP "Interactive Urban Robot (IURO)", see www.iuro-project.eu, the Institute for Advanced Study (IAS), Munich, and the China Scholarship Council (CSC), see http://en.csc.edu.cn.
REFERENCES
Calinon, S., Guenter, F., and Billard, A. (2007). On learning, representing, and generalizing a task in a humanoid robot. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 37(2):286–298.

Huang, X., Walker, L., and Birchfield, S. (2012). Occlusion-aware reconstruction and manipulation of 3d articulated objects. In Proc. of the IEEE International Conference on Robotics and Automation (ICRA), pages 1365–1371, St. Paul, Minnesota.

Kalakrishnan, M., Righetti, L., Pastor, P., and Schaal, S. (2011). Learning force control policies for compliant manipulation. In Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 4639–4644.

Katz, D. and Brock, O. (2008). Manipulating articulated objects with interactive perception. In Proc. of the IEEE International Conference on Robotics and Automation (ICRA), pages 272–277, Pasadena, CA.

Lee, D. and Ott, C. (2011). Incremental kinesthetic teaching of motion primitives using the motion refinement tube. Autonomous Robots, 31(2):115–131.

Lutscher, E., Lawitzky, M., Cheng, G., and Hirche, S. (2010). A control strategy for operating unknown constrained mechanisms. In Proc. of the IEEE International Conference on Robotics and Automation (ICRA), pages 819–824, Anchorage, Alaska, USA.

Rusu, R. B., Bradski, G., Thibaux, R., and Hsu, J. (2010). Fast 3d recognition and pose using the viewpoint feature histogram. In Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 2155–2162, Taipei, Taiwan.

Sturm, J., Stachniss, C., and Burgard, W. (2011). A probabilistic framework for learning kinematic models of articulated objects. Journal of Artificial Intelligence Research, 41(2):477–526.

Tagliasacchi, A., Zhang, H., and Cohen-Or, D. (2009). Curve skeleton extraction from incomplete point cloud. ACM Transactions on Graphics, 28(3):71.

Wang, W., Brščić, D., He, Z., Hirche, S., and Kühnlenz, K. (2011). Real-time human body motion estimation based on multi-layer laser scans. In Proc. of the International Conference on Ubiquitous Robots and Ambient Intelligence, pages 297–302, Incheon, Korea.
ArticulatedObjectModelingbasedonVisualandHapticObservations
259