Integration of an Autonomous System with Human-in-the-Loop for
Grasping an Unreachable Object in the Domestic Environment
Jaeseok Kim, Raffaele Limosani and Filippo Cavallo
The Biorobotics Institute, Sant’Anna School of Advanced Studies, Viale Rinaldo Piaggio, Pontedera, Pisa, Italy
Keywords:
Mobile Manipulation, Autonomous System with Human-in-the-Loop, Assistive Robotics, Grasping an
Unreachable Object, Service Robotics.
Abstract:
In recent years, autonomous robots have proven capable of solving tasks in complex environments. In particular, robot manipulation in activities of daily living (ADL) for service robots has been widely developed. However, grasping an unreachable object in domestic environments still presents difficulty. To perform such applications better, we developed an autonomous system with human-in-the-loop that combines the cognitive skills of a human operator with autonomous robot behaviors. In this work, we present techniques for integrating the system for assistive mobile manipulation and new strategies to support users in the domestic environment. We demonstrate that the robot can grasp multiple objects of random sizes at known and unknown table heights. Specifically, we developed three strategies for manipulation. We also demonstrated these strategies using two intuitive interfaces, a visual interface in rviz and a voice user interface with speech recognition. Moreover, the robot can select strategies automatically in random scenarios, which makes the robot intelligent and able to make decisions independently in the environment. We demonstrated that our robot shows the capabilities for employment in domestic environments to perform actual tasks.
1 INTRODUCTION
In the home environment, perception is used to recognize a variety of objects; however, a service robot might not be able to detect all of the objects in every circumstance. In other words, when the robot attempts to recognize multiple objects on a table, it is difficult to detect all of them because many variables, such as the shape of each object and the robot hand position, must be considered. However, if a human can support the judgment of the robot, the robot can acquire the specific object needed quite easily.
To find solutions for performing tasks in the home environment, many types of service robots have been developed. In particular, robots that provide fully autonomous support for activities of daily living (ADL) for older people have been developed. One representative service robot is Care-O-bot, which was developed with basic technologies for delivery, navigation, and monitoring for users (Schraft et al., 1998). Moreover, in recent years, several projects have featured different robots that integrate smart home technology for healthcare, shopping, garbage collection (Cavallo et al., 2014), and communication with users by gesture and speech (Torta et al., 2012).
Despite the enhanced functionalities of service robots, we still face several challenges of ADL in the domestic environment. Particularly for tasks such as grasping objects iteratively in these environments, the capabilities of current robots are still lacking. To solve these problems, robots typically focus on either a fully autonomous or a fully teleoperated system. However, many limitations in perception and manipulation remain. A possible solution to overcome these issues is an autonomous system with human-in-the-loop, in which a human operator controls the robot from a remote site with a high level of abstraction.
In this paper, we present the development of the integration of an autonomous system with human-in-the-loop for grasping an unreachable object in the domestic environment. The main contributions of this paper are the following:
Multi-object segmentation was implemented to support grasping point detection for grasping an object with two different grasp poses using a depth camera.
Three mobile manipulation strategies were developed for picking and placing unreachable objects at various and unknown table heights.
Figure 1: The mobile manipulation task is operated with human capabilities applied to Sheridan's four-stage model: (a) information acquisition, (b) information analysis, (c) decision and action selection, and (d) action implementation.
Perception, visual and voice interfaces, and motion planning were integrated as functionalities of the framework of the system for grasping an unreachable object.
2 RELATED WORK
Mobile manipulation tasks for grasping an object in the domestic environment have been studied extensively over decades (Dogar and Srinivasa, 2011; Kitaev et al., 2015; Stilman et al., 2007; Özgür and Akın; Fromm and Birk, 2016).
Grasping objects is a frequent task for autonomous manipulation systems in the domestic environment. However, grasping unreachable (occluded) objects is still one of the open issues in this environment. To solve this issue, pushing manipulation systems for grasping an unreachable object have been developed. Dogar et al. (Dogar and Srinivasa, 2011) suggest a framework to generate a sequence of pushing actions to manipulate a target object. However, the pushing action system needs adequate space in which to shift or remove objects. To solve this problem, a sequence of manipulation actions to grasp the objects has been suggested. Stilman et al. (Stilman et al., 2007) use a sampling-based planner to take away the blocking objects in a simulation. Moreover, Fromm et al. (Fromm and Birk, 2016) propose a method to plan strategies for a sequence of manipulation actions. Based on this previous work, we considered a sequence of manipulation actions for grasping unreachable objects. However, the discussed literature assumes that the robot already knows the target object before the manipulation starts. To address target-object selection during robot manipulation, user interfaces operated by an autonomous system with human-in-the-loop have been developed.
In many prior works, user interfaces have been developed for object selection in an autonomous system with human-in-the-loop. For remote selection of an object by people at home, a graphical point-and-click interface system was developed (Pitzer et al., 2011). The interface allows a person to drag, translate, and rotate the view to select a target object. In addition, an interface has been used to generate waypoints for the desired gripper position to conduct grasping tasks (Leeper et al., 2012). These interface systems address the object selection problem using human capabilities. However, the interface systems in those papers only consider grasping reachable objects, which are not occluded, and only simple task planning is applied. In addition, to select the target object, a human operator must concentrate on the visual display and take time when selecting an object. For these reasons, we developed three mobile manipulation strategies for grasping occluded objects with different grasp poses, supporting grasp planning with intuitive user interfaces for object selection.
3 SYSTEM ARCHITECTURE
The goal of our work is to develop a robotic system
that will be able to help people in ADL. In particular,
we studied the scenario in which a user needs a par-
ticular object located on a table and asks the assistant
robot to find and bring the object.
In fact, the preferred way to achieve this scenario is to operate the robotic system fully automatically. However, the capabilities to recognize a target object, calculate eligible grasp poses, and generate task plans for complex tasks are still open research problems. The mobile manipulation task, which is part of the robotic system, should therefore be automated.
Figure 2: A simple pictorial diagram with variables such as the neck angle ($N_\theta$), table height ($T_h$), and object height ($O_h$). See the text for how these variables are used in the equations.
Parasuraman et al. (Parasuraman et al., 2000) proposed a model for different levels of automation that provides a framework made up of four classes: (1) information acquisition, (2) information analysis, (3) decision and action selection, and (4) action implementation. In the current work, we adapted this framework to our robotic system (see Figure 1).
4 AUTONOMOUS SYSTEM WITH
HUMAN-IN-THE-LOOP
DESCRIPTION
Based on Sheridan's four-stage model described above, the decision and action selection stage was taken as the starting point for the autonomous system with human-in-the-loop concept. We aimed to develop a system that is operated with minimum human effort. The human operator contributes by interpreting the environment from the camera images and by choosing a low level of automation in the third stage of Sheridan's model.
The final objective of the mobile manipulation task is to pick up an object from a table of unknown height and bring it to a user. In our system, once the robot finds a table, it measures the table height and adjusts its neck angle accordingly. Then, the robot detects and segments multiple objects that are positioned in two rows (front and back). Once the robot completes extraction of the objects, colored with several RGB colors, the user selects a row and a target object using the voice and visual interfaces. Based on the table height and the row and object information (height, length, weight, and distance), the robot selects one of the three mobile manipulation strategies that we developed for grasping an object in the back row. The strategies were designed with different grasp poses according to the table heights.
For grasping an object, we followed two scenarios. In the first scenario, the user employed a known table height fixed at 70, 80, 90, or 100 cm. In the second scenario, the user employed an unknown table height, which is measured by the robot itself, and the robot decides empirically which strategy is better for grasping. From several experimental trials, the empirical results suggest three strategy modes for better grasping:
$$\text{mode} = \begin{cases} 1 & T_h < 77\,\text{cm} \\ 2 & T_h < 87\,\text{cm} \\ 3 & T_h < 100\,\text{cm} \end{cases} \qquad (1)$$
where $T_h$ is the table height measured by the camera. For grasping objects and controlling the arm, we used the Point Cloud Library (PCL) (Rusu and Cousins, 2011) and the MoveIt library (Chitta et al., 2012).
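As a concrete illustration of this mode selection, the following is a minimal Python sketch of Equation 1; the function name and the behavior for tables above 100 cm are illustrative assumptions, not part of the implemented system.

```python
def select_strategy(table_height_cm: float) -> int:
    """Select a manipulation strategy mode from the measured
    table height T_h (Equation 1). Heights are in centimeters."""
    if table_height_cm < 77:
        return 1   # first strategy: direct top grasp
    if table_height_cm < 87:
        return 2   # second strategy: remove front object, side grasp
    if table_height_cm < 100:
        return 3   # third strategy: hidden-object handling
    raise ValueError("table height outside the supported range")
```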
5 IMPLEMENTATION OF THE AUTONOMOUS SYSTEM WITH HUMAN-IN-THE-LOOP
The goal of the autonomous system with human-in-the-loop is to provide support that improves the quality of human life. Thus, we considered grasping an object from a table of unknown height in the domestic environment, shown schematically in Figure 2 with our robot. In fact, many studies on grasping objects (Dogar and Srinivasa, 2011; Kitaev et al., 2015; Fromm and Birk, 2016) were tested using a fixed table height and viewpoint. However, if the viewpoint changes, it becomes difficult for the robot to detect objects on the table. Therefore, to overcome the difficulty of detecting objects from a different viewpoint, we adjusted the neck angle of the robot based on the table height. Before applying the fixed neck angle, the robot needs to find a table. Thus, the initial neck angle was set at the lowest position to find a low table. After the table was segmented by PCL, a point (calculated by averaging all the coordinate points on the top surface of the table) was expressed relative to the base frame of the robot (see Figure 4), and only its z-axis value was used to calculate the table height. This value was stored for changing the neck angle and choosing the strategy. Next, the neck angle of the robot was adjusted by linear interpolation. To interpolate the neck angle, we set the maximum and minimum ranges of the neck angle and table height. The linearly interpolated neck angle helped the robot detect multiple objects easily.
$$N_{\theta,d} = N_{\theta,min} + (T_h - T_{h,min})\,\frac{N_{\theta,max} - N_{\theta,min}}{T_{h,max} - T_{h,min}} \qquad (2)$$
where $N_{\theta,d}$ is the desired neck angle, $N_{\theta,max}$ and $N_{\theta,min}$ are the maximum and minimum neck angles, respectively, and $T_{h,max}$ and $T_{h,min}$ are the maximum and minimum table heights, respectively. $T_h$ is the current table height shown in Figure 2.
Figure 3: Flow chart of the autonomous system with human-in-the-loop; it includes image preprocessing, multi-object segmentation, object selection, and action planning with the developed strategies.
To establish an appropriate area for grasping an object, the robot secures its workspace using a laser sensor to measure the distance between the table and the robot base. Once the robot judges that the workspace is appropriate for the manipulation, it starts performing the given task. This process is represented in the flow chart in Figure 3.
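To make Equation 2 concrete, the sketch below linearly interpolates the neck angle from the measured table height. The range limits used here are hypothetical placeholders; the paper does not report the exact minimum and maximum values configured on the robot.

```python
def interpolate_neck_angle(t_h: float,
                           t_h_min: float = 0.70, t_h_max: float = 1.00,
                           n_min: float = -0.9, n_max: float = -0.3) -> float:
    """Linear interpolation of the desired neck angle (Equation 2).

    t_h: measured table height in meters; n_min/n_max: neck angle
    range in radians. All limit values here are illustrative only.
    """
    # Clamp the measured height so the output stays within the joint range.
    t_h = max(t_h_min, min(t_h, t_h_max))
    return n_min + (t_h - t_h_min) * (n_max - n_min) / (t_h_max - t_h_min)
```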
5.1 Multi-object Segmentation
In our work, we adapted and modified the approach of Trevor et al. (Trevor et al., 2013) to segment multiple objects using organized point cloud segmentation. We used a depth camera (Xtion) to acquire depth data instead of an RGB camera to increase depth accuracy. For each point P(x,y), a label L(x,y) is assigned. Points belonging to the same segment are assigned the same label based on the Euclidean clustering comparison function (see (Trevor et al., 2013) for more details). To segment the objects accurately, some large segments, such as the plane surface, are excluded. In addition, if the distance between two points in the same label set exceeds a threshold, one of the points is discarded to increase object segmentation speed. The clustering area and the point-count threshold for each object were chosen experimentally; clusters containing between 1500 and 10000 points were kept.
Figure 4: Real-time multi-object segmentation on the table, visualized in rviz (Hershberger et al., 2011).
To distinguish between multiple objects easily, the objects were colored with six RGB colors. The result of this segmentation process is presented in Figure 4. This process is also described in the flow chart shown in Figure 3.
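The idea behind this organized segmentation can be sketched as follows. This is a simplified numpy re-implementation of the connected-component labeling concept for illustration, not the optimized PCL code used on the robot; the distance threshold is an assumed value, while the 1500 to 10000 point cluster-size range mirrors the values reported above.

```python
import numpy as np
from collections import deque

def segment_organized_cloud(points, dist_thresh=0.02,
                            min_pts=1500, max_pts=10000):
    """Label connected components in an organized point cloud.

    points: (H, W, 3) array of xyz coordinates, NaN for invalid pixels.
    Neighboring pixels closer than dist_thresh (meters, assumed value)
    are merged into one segment; segments outside [min_pts, max_pts]
    are omitted from the returned dictionary.
    """
    h, w, _ = points.shape
    labels = np.full((h, w), -1, dtype=int)
    valid = ~np.isnan(points).any(axis=2)
    segments, next_label = {}, 0
    for sy in range(h):
        for sx in range(w):
            if not valid[sy, sx] or labels[sy, sx] != -1:
                continue
            # Flood-fill one segment starting from the seed pixel (sy, sx).
            queue, members = deque([(sy, sx)]), [(sy, sx)]
            labels[sy, sx] = next_label
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if (0 <= ny < h and 0 <= nx < w and valid[ny, nx]
                            and labels[ny, nx] == -1
                            and np.linalg.norm(points[ny, nx] - points[y, x]) < dist_thresh):
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
                        members.append((ny, nx))
            if min_pts <= len(members) <= max_pts:
                segments[next_label] = members
            next_label += 1
    return labels, segments
```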
5.2 Human Object Selection
For our robotic platform, object selection was done in two ways: 1) voice and 2) visual. We also believe that a combination of these two methods could be easily accessible even for very old people who cannot move. Our interface platform includes a tablet for the voice user interface (see Figure 5(b)) and rviz on a PC for the visualization interface (see Figure 5(a)). The visualization interface, which includes RGB colors and depth information of the environment, was provided for the selection system; the voice user interface is based on speech recognition (Sak et al., 2015).
The selection system consisted of three steps:
1. Select one of the rows of multiple objects
2. Choose an object desired in the same row
3. Choose an object desired in a different row
First, the user selects the front or back row of the table. After identification of the row, the target object selection is done (see Figure 3).
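A minimal sketch of how a recognized utterance could be mapped to the row and object selection steps is shown below; the command vocabulary, function name, and return convention are hypothetical, since the paper does not specify the grammar used with the speech recognizer.

```python
def parse_selection(utterance: str, num_objects_per_row: int = 3):
    """Map a recognized utterance such as 'back row, second object'
    to a (row, index) selection. Vocabulary is illustrative only."""
    words = utterance.lower().split()
    row = "back" if "back" in words else ("front" if "front" in words else None)
    ordinals = {"first": 0, "second": 1, "third": 2}
    index = next((i for w, i in ordinals.items() if w in words), None)
    if row is None or index is None or index >= num_objects_per_row:
        return None  # selection incomplete: ask the user to repeat
    return row, index
```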
5.3 Action Planning & Execution
Grasping one object among multiple objects is still a challenge. Thus, we tried to find a grasping point for simply shaped objects such as bottles or boxes, which are common household objects in the domestic environment. In addition, the grasping point was used to generate possible hand poses relative to the object for action planning.
To extract the grasping point from each object, we used the 3D centroid function in PCL and configured the grasp poses.
Figure 5: (a) Visualization interface system, (b) Voice user
interface system.
Figure 6: (a) Top grasp pose, (b) Side grasp pose.
In our case, we characterized two types of grasp poses:
Top pose: the robot hand is aligned to the object in the vertical plane (along the x- and z-axes), and the robotic hand opens and closes along the x- or y-axis (see Figure 6(a)).
Side pose: the grasp is defined in the horizontal plane (along the x- and y-axes), and the opening and closing direction of the robotic hand is the same as above (see Figure 6(b)).
To grasp the object, we used a motion planning library, which provides collision avoidance (including self-collisions) and joint-limit avoidance for the robot arm in the domestic environment. This motion planning library (MoveIt) was used for executing the three mobile manipulation strategies. During motion planning, the position of the robot hand plays an important role in grasping. For this reason, a pre-grasp position (an offset from the target object for the two grasp poses) was defined. After the pre-grasp position was reached, the palm of the robot hand approached the surface of the target object to grasp it. Based on these techniques, the strategies were enhanced to avoid collisions between the robot arm and the robot body during operation (Ciocarlie et al., 2014) (see Figure 3).
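The grasp-point computation described here can be illustrated as follows. The centroid mirrors the 3D centroid idea from PCL, while the approach directions and the 10 cm offset are hypothetical examples of the pre-grasp offset, not values reported in the paper.

```python
import numpy as np

def grasp_point(cluster_xyz: np.ndarray) -> np.ndarray:
    """3D centroid of an object cluster (N x 3), used as the grasp point."""
    return cluster_xyz.mean(axis=0)

def pre_grasp_position(centroid: np.ndarray, pose: str,
                       offset: float = 0.10) -> np.ndarray:
    """Offset the grasp point along the approach direction.

    'top' approaches from above (+z); 'side' approaches horizontally
    along -x. The 10 cm offset is an illustrative value.
    """
    approach = {"top": np.array([0.0, 0.0, 1.0]),
                "side": np.array([-1.0, 0.0, 0.0])}[pose]
    return centroid + offset * approach
```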
5.4 Developed Mobile Manipulation
Strategies
Three mobile manipulation strategies were conceived to grasp an object that is out of the robot's direct reach. A set of six objects, arranged in two rows, was placed in front of the robot (see Figure 5(a)).
Figure 7: The first strategy for manipulation: (a) the robot moves close to the table; (b) the mobile platform is rotated to grasp the target object; (c) the top grasp pose is used to grasp the object directly; (d) after grasping the object, the robot arm returns to the initial position.
We consider grasping objects placed in the back row because grasping front-row objects is an easy task that we have already addressed. Before starting the strategies for grasping an object, we need to accomplish three steps. The first step is initialization of the robot arm. The next step is transforming the coordinates of the multiple objects from the camera frame to the robot base frame for manipulation, as sketched below. The last step is computing the pre-grasp position based on the table height. These three steps are described in Algorithm 1 (lines 2 to 5). These steps alone are sufficient to grasp an object on a table, but grasping back-row objects always fails due to the obstruction caused by the front-row objects. For these reasons, we developed three mobile manipulation strategies for grasping an object in the back row.
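The second step, transforming object coordinates from the camera frame to the robot base frame, reduces to a homogeneous transform. In the sketch below, the 4x4 matrix T_base_camera is assumed to come from the robot's calibration (e.g., the ROS tf tree); the helper itself is illustrative.

```python
import numpy as np

def camera_to_base(points_cam: np.ndarray,
                   T_base_camera: np.ndarray) -> np.ndarray:
    """Transform object centroids from the camera frame to the robot
    base frame. points_cam: (N, 3); T_base_camera: (4, 4) homogeneous
    transform obtained from calibration (placeholder here)."""
    n = points_cam.shape[0]
    homogeneous = np.hstack([points_cam, np.ones((n, 1))])  # (N, 4)
    return (T_base_camera @ homogeneous.T).T[:, :3]
```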
The First Strategy.
The objective of the first strategy was to grasp an object on a table approximately 70 cm high, directly from the back row, to reduce manipulation time. The mobile platform was rotated to a pre-defined angle, and a specific neck angle that supports segmentation of objects in the back row was set. When objects of the same size are detected, the visualized object sizes differ because the distance from the camera to each object is different. In addition, the objects in the front row obstruct the view during grasping, and it is difficult to detect the entire size of the back-row objects. For this reason, a linear interpolation function (of the same form as Equation 2) with different variables was developed.
Figure 8: The second strategy, in which the robot performs a lateral grasp to pick up an object in the back row: (a) the object in front of the selected object is removed; (b) the object is placed in an empty place; (c) the robot grasps the target object with a side grasp pose; (d) once the object is grasped, the arm returns to the initial position.
The output of the interpolation is a value added to the z-axis position to establish a stable grasping point. After the interpolation, the top grasp pose was applied to grasp the object directly. The first strategy in the actual environment was performed as shown in Figure 7(a)-(d) and is described in Algorithm 1 (lines 22 to 25).
The Second Strategy.
The first mobile manipulation strategy was useful for grasping objects in the back row. However, ensuring stable grasping remained a challenge. For this reason, we developed a new strategy for grasping objects in the back row that compensates for inadequate object segmentation and provides robust, stable grasping. The objective of the second strategy was to grasp objects from an 80 cm high table while ensuring good stability. To pick up the objects, an algorithm for removing the objects in front was conceived. The key point of the strategy is that when the user selects the back row and the target object, the robot also calculates the centroid of the front-row object. Then, the robot lifts the object out of the front row and places it in an empty place on the table. First, to find the object in front, a function was implemented that searches for the object nearest to the target object. After the object in the front row was found, the pre-defined grasp position was applied. To ensure a stable grasp, the side grasp pose was introduced. Then, once the front-row object was grasped (see Figure 8(a)), a pre-defined place was located at the right edge of the table (see Figure 8(b)).
Algorithm 1: The three mobile manipulation strategies.

Input: Joint position q, initial pose x_init, object row O_row, information on all objects O_all, centroid of the selected and transformed object O_cen, desired object x_d, grasp pose x_grasp, new centroid of the selected and transformed object O_newcen, z-axis offset z_add, table height T_h
Output: Goal pose x_goal

 1: while the manipulator has not finished the given task do
 2:     InitializationJacoArm(q);
 3:     TransformAllObjects;
 4:     x_grasp ← Pre-definedGraspPose(x_init, O_row, T_h);
 5:     if O_row.back = True then
 6:         if T_h <= 100 then
 7:             // The Third Strategy starts
 8:             Update ObjectStates & MoveMobileBase;
 9:             O_newcen ← SearchObjectUsingAxis(O_cen);
10:             x_d ← CalculateTargetObject(O_newcen);
11:             ManipulationBasedOnModeSelection;
12:             x_goal ← x_d · x_grasp;
13:         end
14:         if T_h <= 87 then
15:             // The Second Strategy starts
16:             Search NearestObject(O_cen);
17:             Search Pre-definedEmptySpace(O_all);
18:             x_d ← CalculateTargetObject(O_cen);
19:             x_goal ← x_d · x_grasp;
20:         end
21:         if T_h <= 77 then
22:             // The First Strategy starts
23:             Rotate MobileBase & Set the Angle of Neck;
24:             z_add ← InterpolationAddGraspAxis(O_cen);
25:             x_goal ← x_cen · x_grasp · z_add;
26:         end
27:     end
28: end
Since the robot arm is mounted on the right side of the body, we considered a place to the right to be the most accessible. After the object was placed on the table, the arm returned to the initial position and the robot started grasping the target object with the side grasp pose (see Figure 8(c,d)). The entire process is represented in Algorithm 1 (lines 15 to 19), and the nearest-object search is sketched below.
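The nearest-object search of Algorithm 1 (line 16) might look like the following sketch; the representation of the front-row centroids as a list of xyz arrays is an assumption for illustration.

```python
import numpy as np

def nearest_front_object(target_cen: np.ndarray,
                         front_centroids: list) -> int:
    """Return the index of the front-row object closest to the target
    centroid in the x-y plane (Algorithm 1, line 16)."""
    dists = [np.linalg.norm(c[:2] - target_cen[:2]) for c in front_centroids]
    return int(np.argmin(dists))
```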
The Third Strategy.
The first and second mobile manipulation strategies were helpful for grasping objects in the back row, but the robot might still fail to accomplish the task.
Figure 9: Experimental setup with multiple objects and an adjustable-height table.
For example, if the robot faced a table higher than its visual field, or if objects in the front row were taller than objects in the back row, the robot could not detect objects in the back row. For this reason, the third strategy was developed to grasp the hidden object. The third strategy is used in the particular situation of a hidden object, when the first and second strategies cannot perform the grasping task. In this strategy, human support is exploited to overcome the difficulty of object selection. To conduct a feasibility study for the third strategy, the hidden-object case was evaluated according to the decision of the user. The object selection method in the third strategy is not the same as in the previous strategies because the user cannot see the object in the back using the visual interface. However, the user already knows the location of the target object on the table and selects the back row and an object in the front using the voice interface. After the user selects both the row and the object, the process of the third strategy, which is similar to that of the second strategy, is implemented. The basic difference between these two strategies is that the third strategy updates the state of the objects. After the object in front is placed in the empty space, the robot must rediscover the target object. To find the object, the state of the scene is updated using the multi-object segmentation function. In addition, the y-axis information is used to find the target object because the target object is located collinear with the front object, as sketched below. The simplified algorithm is described in Algorithm 1 (lines 7 to 11).
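The y-axis search of Algorithm 1 (line 9), used to rediscover the target after re-segmentation, might look like this sketch; the 5 cm tolerance is a hypothetical value.

```python
import numpy as np

def find_target_by_y(removed_front_cen: np.ndarray,
                     new_centroids: list, tol: float = 0.05):
    """After the front object is moved and the scene is re-segmented,
    pick the new centroid whose y coordinate is closest to that of the
    removed front object (the two are roughly collinear along y).
    Returns None if nothing lies within the tolerance (tol, meters)."""
    best, best_dy = None, tol
    for cen in new_centroids:
        dy = abs(cen[1] - removed_front_cen[1])
        if dy < best_dy:
            best, best_dy = cen, dy
    return best
```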
Figure 10: Objects were placed as follows: (a) left top: short objects, right: tall objects; (b) right top: short objects in the front (FSO); (c) left bottom: tall objects in the front (FTO); and (d) right bottom: random-size objects (RO).
6 EXPERIMENTAL SETUP
Our experimental setup is shown in Figure 9. The
robotic platform for the experiment is the Doro (do-
mestic robot) personal robot (Cavallo et al., 2014), a
service robot equipped with an omni-directional mo-
bile base and a Kinova Jaco robotic arm. The Kinova
Jaco arm, which has six degrees of freedom, is used
for manipulation tasks. The head of the Doro is a pan-
tilt platform equipped with two stereo cameras and an
Asus Xtion Pro depth camera; they are used for ob-
ject detection and segmentation. To implement ADL,
we set up the experimental environment with multi-
ple objects placed on the table, which can be adjusted
in height as shown in Figure 9. Three objects were
placed in the front, and the others were placed in the
back. We used rectangular objects such as plastic bot-
tles and juice boxes during the experiments.
For the experiment, several scenarios were organized. Before grasping an object in the back row, we tested a simple scenario of grasping an object in the front row. Then, the three manipulation strategies were tested for grasping an object in the back row. The known table height was set in steps of 10 cm: 70, 80, 90, and 100 cm. In addition, the three strategies were tested at an unknown table height to apply them in a real-life situation. The objects were placed in three configurations: short objects in the front (FSO), tall objects in the front (FTO), and random-size objects (RO) (see Figure 10). During the manipulation with all strategies at an unknown table height, we considered grasping one of the objects, which were placed randomly. The scenarios were evaluated 10 times for each strategy in terms of collisions and success rate, with both known and unknown table heights.
Figure 11: The success rate of the mobile manipulation for
grasping an object in the back row on known table heights:
(a) First strategy; (b) Second strategy; (c) Third strategy.
7 EXPERIMENTAL RESULTS
First, a quantitative analysis of the three mobile manipulation strategies was performed for known and unknown table heights.
7.1 Quantitative Analysis at Known
Table Heights
The quantitative results focused on two criteria: success rate and collisions. The success rate was measured as the rate at which the robot grasped the target object. We also considered collision cases, in which objects were struck by the robot hand.
7.1.1 Success Rates
The success rates were evaluated in each strategy with
three different object positions (FSO, FTO, and RO)
on known table heights.
As shown in Figure 11(a), the success rate of the first strategy at the 70 cm table height was higher than for any other table height or strategy (60% to 80% for all three scenarios). At this table height, the second and third strategies have low manipulation success rates because of a lack of workspace. Nevertheless, some trials of the first strategy also failed to grasp the target object, although we developed linear interpolation to compensate for insufficient object segmentation and to stabilize grasping. In addition, except for the 70 cm table height, the first strategy was not successful in grasping because the robot cannot reach the pre-grasp position above 77 cm.
The success rate of the second strategy improved for table heights of 80 cm or greater, where it is better than the first strategy, but it does not work well for the 70 cm table height (see Figure 11(b)). In particular, we found that the second strategy succeeded in an average of 30% of the 70 cm table height tests in the three different scenarios. Moreover, this strategy performed better at the 80, 90, and 100 cm table heights for FSO and RO (success rate from 60% to 80%). We also observed that the second strategy failed to grasp FTO objects at table heights of 90 and 100 cm, which occurred because taller objects blocked the target object (the human could not see or select the target object). In addition, when the robot grasped an object in the FTO configuration, the robot segmented only small parts of the back-row objects at the 80 cm table height. Therefore, the grasp point was not extracted accurately.
Finally, the third strategy (see Figure 11(c)) could be carried out at any table height. In this case, the success rate varies from 70% to 80% (higher than the second strategy) at the 80, 90, and 100 cm table heights. However, for the 70 cm table height, the performance is similar to that of the second strategy (20% to 30%). As the robot removed the front object, the multi-object segmentation was repeated automatically. As a result, the grasp point could be extracted more accurately than with the second strategy. Failures of this strategy occurred when the grasp force was insufficient to hold the target object, so the robot dropped the object during manipulation. As shown in Figure 11(c), the third strategy can be applied in any environment and shows better performance except for the 70 cm table height.
7.1.2 Collisions
During the evaluation, the number of collisions was measured for each table height using the three strategies, over a total of 10 trials for all scenarios in the experiments (see Figure 12).
The best results at the 70 cm table height were achieved using the first strategy, with a total average of seven collisions over all scenarios (see Figure 12(a)). The collisions in this strategy occurred while the robot arm returned to the home position.
Figure 12: Number of collisions with three strategies during the grasp of an object in the back row for known table heights:
(a) 70 cm table height; (b) 80 cm table height; (c) 90 cm table height; (d) 100 cm table height.
Figure 13: The success rate and number of collisions with
three strategies of the mobile manipulation for grasping an
object in the back row with unknown table heights.
Except for the 70 cm table height, the lowest numbers of collisions occurred at the 80, 90, and 100 cm table heights with the third strategy. The total number of collisions using this strategy, over all scenarios, averaged six, eight, and ten for the 80, 90, and 100 cm table heights, respectively (see Figure 12(b),(c),(d)), with a standard deviation of about 5% for each. The second and third strategies involve similar manipulations; therefore, Figure 12(b),(c),(d) show that the collisions of the two strategies are similar, except at the 90 cm and 100 cm table heights in the FTO scenario. The collisions with these two strategies occurred while the robot arm approached the object and while it returned to the home position with the target object.
With the first strategy, collisions could only be measured at the 70 cm table height because the robot arm could not reach objects at the other table heights (see Figure 12(a)). Moreover, we could not measure collisions with the second strategy in the FTO scenario at the 90 and 100 cm table heights, since the objects in the back were occluded by the taller front objects (see Figure 12(c),(d)).
7.2 Quantitative Results at Unknown Table Heights
The previous quantitative results were analyzed using known table heights. However, various types of tables exist in reality. Before setting up the table height, we defined a range between 70 and 100 cm within which strategies are selected automatically. Then, the table height was set randomly within the defined range. We tested the strategies only with objects in the RO configuration to reflect the actual environment.
To validate the three mobile manipulation strategies, three different table heights were measured and the results were evaluated in the same manner as in the previous cases (see Figure 13). The robot automatically selected one strategy for manipulation according to the table height. The experiment was run in ten trials; the average success rate of the manipulation at unknown table heights was greater than 75%. We also analyzed the number of collisions during the experiment. Collisions were evaluated with the same criteria, and an average of five collisions occurred across the scenarios.
8 CONCLUSIONS AND FUTURE
WORK
In this paper, we presented three mobile manipulation strategies in which the operator provides a simple command using visual and voice user interfaces. The three mobile manipulation strategies were developed to pick, place, and convey an object in the domestic environment effectively. Based on the results, the three strategies have their own advantages at different table heights. Therefore, the intelligent strategy selection system can be applied in domestic environments that have different table heights.
The current system can detect, cluster, and extract simple household objects such as bottles and boxes. However, objects of various shapes exist in the domestic environment, and the 3D centroid of such an object would not always yield a feasible grasp. For this reason, we will develop a grasp pose algorithm for a variety of household objects within our strategies to save time (Redmon and Angelova, 2015). In addition, a deep learning-based approach for extracting grasping points could be considered to obtain more accurate performance (Lenz et al., 2015; Levine et al., 2016).
ACKNOWLEDGEMENT
The work described was supported by the Robot-Era and ACCRA projects, funded respectively by the European Community's Seventh Framework Programme FP7-ICT-2011 under grant agreement No. 288899 and the Horizon 2020 Programme H2020-SCI-PM14-2016 under grant agreement No. 738251.
REFERENCES
Cavallo, F., Limosani, R., Manzi, A., Bonaccorsi, M., Es-
posito, R., Di Rocco, M., Pecora, F., Teti, G., Saffiotti,
A., and Dario, P. (2014). Development of a socially
believable multi-robot solution from town to home.
Cognitive Computation, 6(4):954–967.
Chitta, S., Sucan, I., and Cousins, S. (2012). MoveIt! [ROS topics]. IEEE Robotics & Automation Magazine, 19(1):18–19.
Ciocarlie, M., Hsiao, K., Jones, E. G., Chitta, S., Rusu, R. B., and Şucan, I. A. (2014). Towards reliable grasping and manipulation in household environments. In Experimental Robotics, pages 241–252. Springer.
Dogar, M. and Srinivasa, S. (2011). A framework for push-
grasping in clutter. Robotics: Science and systems VII,
1.
Fromm, T. and Birk, A. (2016). Physics-based damage-
aware manipulation strategy planning using scene dy-
namics anticipation. In Intelligent Robots and Systems
(IROS), 2016 IEEE/RSJ International Conference on,
pages 915–922. IEEE.
Hershberger, D., Gossow, D., and Faust, J. (2011). Rviz.
Kitaev, N., Mordatch, I., Patil, S., and Abbeel, P. (2015).
Physics-based trajectory optimization for grasping in
cluttered environments. In Robotics and Automa-
tion (ICRA), 2015 IEEE International Conference on,
pages 3102–3109. IEEE.
Leeper, A. E., Hsiao, K., Ciocarlie, M., Takayama, L., and
Gossow, D. (2012). Strategies for human-in-the-loop
robotic grasping. In Proceedings of the seventh an-
nual ACM/IEEE international conference on Human-
Robot Interaction, pages 1–8. ACM.
Lenz, I., Lee, H., and Saxena, A. (2015). Deep learning for
detecting robotic grasps. The International Journal of
Robotics Research, 34(4-5):705–724.
Levine, S., Finn, C., Darrell, T., and Abbeel, P. (2016). End-
to-end training of deep visuomotor policies. Journal
of Machine Learning Research, 17(39):1–40.
Özgür, A. and Akın, H. L. Planning for tabletop clutter using affordable 5-dof manipulator.
Parasuraman, R., Sheridan, T. B., and Wickens, C. D.
(2000). A model for types and levels of human in-
teraction with automation. IEEE Transactions on sys-
tems, man, and cybernetics-Part A: Systems and Hu-
mans, 30(3):286–297.
Pitzer, B., Styer, M., Bersch, C., DuHadway, C., and
Becker, J. (2011). Towards perceptual shared auton-
omy for robotic mobile manipulation. In Robotics and
Automation (ICRA), 2011 IEEE International Confer-
ence on, pages 6245–6251. IEEE.
Redmon, J. and Angelova, A. (2015). Real-time grasp
detection using convolutional neural networks. In
Robotics and Automation (ICRA), 2015 IEEE Inter-
national Conference on, pages 1316–1322. IEEE.
Rusu, R. B. and Cousins, S. (2011). 3d is here: Point cloud
library (pcl). In Robotics and automation (ICRA),
2011 IEEE International Conference on, pages 1–4.
IEEE.
Sak, H., Senior, A., Rao, K., Beaufays, F., and Schalkwyk,
J. (2015). Google voice search: faster and more accu-
rate. Google Research blog.
Schraft, R., Schaeffer, C., and May, T. (1998). Care-O-bot™: the concept of a system for assisting elderly or disabled persons in home environments. In Industrial Electronics Society, 1998. IECON'98. Proceedings of the 24th Annual Conference of the IEEE, volume 4, pages 2476–2481. IEEE.
Stilman, M., Schamburek, J.-U., Kuffner, J., and Asfour, T.
(2007). Manipulation planning among movable obsta-
cles. In Robotics and Automation, 2007 IEEE Inter-
national Conference on, pages 3327–3332. IEEE.
Torta, E., Oberzaucher, J., Werner, F., Cuijpers, R. H., and
Juola, J. F. (2012). Attitudes towards socially assis-
tive robots in intelligent homes: results from labora-
tory studies and field trials. Journal of Human-Robot
Interaction, 1(2):76–99.
Trevor, A. J., Gedikli, S., Rusu, R. B., and Christensen,
H. I. (2013). Efficient organized point cloud segmen-
tation with connected components. Semantic Percep-
tion Mapping and Exploration (SPME).