POSSESSED ROBOT: HOW TO FIND ORIGINAL
NONVERBAL COMMUNICATION STYLE
IN HUMAN-ROBOT INTERACTION
Hirotaka Osawa (1,2) and Michita Imai (1)
(1) Faculty of Science and Technology, Keio University, 3-14-1, Hiyoshi, Kohoku-ku, Yokohama, Japan
(2) PRESTO, Japan Science and Technology Agency, Chiyodaku, Tokyo, Japan
Keywords: Design methodology, Human Robot Interaction, Human Interface.
Abstract: We propose an alternative approach called the Possessed Robot method to find each robot's unique
communication strategies. In this approach, the human manipulator behaves as if she/he possesses the robot
and finds the optimal communication strategies based on each robot's shape and modalities. We implement
the Possessed Robot system (PoRoS) including a reconfigurable body robot, an easier manipulation system,
and a recording system to evaluate the validity of our method. We evaluate a block-assembling task with
PoRoS by turning the modality of the robot's head on and off. In the head-fixed design, the robot's motion during
the player's motion significantly increases, whereas the ratio of confirmatory behaviour significantly decreases.
Based on the results, we find an example of an optimal communication
strategy for the head-fixed design. In this case, the robot leads and the user follows the robot, in contrast to
the turn-taking communication style of the humanlike condition. This result shows the feasibility of the
Possessed Robot method for making appropriate strategy adjustments based on the robot design.
1 INTRODUCTION
Nowadays, robots having various kinds of shapes
and modalities can support our lives in many ways.
In this paper, we define shape as the appearance of
the robot and modality as the possible observation
and behaviour of the robot. There are still questions
about what kind of interaction is required for each
robot shape and modality (del Pobil et al., 2010)
(Blow, 2006).
Previous studies have designed and implemented
the shape and modalities of robots according to
human-human interaction. There are many studies
that referred to humanlike modalities in robots, such
as gesture (Kanda et al., 2007), manner (Lee et al.,
2010), timing (Shiwa et al., 2009), and bipedal
walking (Hirai et al., 1998). This process is
conducted as shown in the two figures on the left
side of Fig. 1. First, the researchers extract a
psychological finding from human-human
interaction and create an interaction model from it.
Second, they implement the model on a humanlike
robot. Third, they conduct an interaction between a
human and the humanlike robot and confirm that the
robot can interact as the model predicts. Such a
design method is widely used in human-robot
interaction (HRI) studies because of the following
reasons. First, the researchers can base the study on
psychological findings that have been already
investigated. Next, it is easy to compare the results
and the goals. The above-mentioned reasons and
method allow the researchers to incorporate the
contributions of previous studies.
Figure 1: Difference between the previous HRI approach
and our proposal.
Osawa H. and Imai M..
POSSESSED ROBOT: HOW TO FIND ORIGINAL NONVERBAL COMMUNICATION STYLE IN HUMAN-ROBOT INTERACTION.
DOI: 10.5220/0003882906320641
In Proceedings of the 4th International Conference on Agents and Artificial Intelligence (SSIR-2012), pages 632-641
ISBN: 978-989-8425-95-9
Copyright © 2012 SCITEPRESS (Science and Technology Publications, Lda.)
However, we cannot find the specific behaviors
of a robot that are not related to human shape and
modalities by referring to existing findings in
human-human interaction. With the above process,
we may miss the most appropriate communication
strategies for the robot if the robot and the human
modalities are not the same. We call this kind of
robot a "nonhumanlike" robot. In this paper, we use
the word "nonhumanlikeness" to describe the lack of
humanlike social appearance, such as humanlike
head and arms. Detailed examples are shown in Fig.
2 (Kanda et al., 2007) (Hirai et al., 1998) (Li et al.,
2004) (Matsukuma et al., 2004). Several HRI studies
about less humanlike robots suggest that imitating
humans is not the only approach to designing a robot.
Sometimes, we use different communication
strategies for nonhumanlike agents. One of the best
examples is the human-pet interaction. Our
interaction style with pets is different from our style
in human-human interaction and human-tool
interaction. Robots have aspects of both tools and
pets. They can generate different types of interaction
with users through shapes and modalities that differ from those
of humans, even if the shapes and modalities are
nonhumanlike. For example, Mu, eMuu, and Social
Trash Box extracted the essence of human
interaction and created an abstracted relationship to
humans different from human-human interaction
(Matsumoto et al., 2005) (Bartneck, 2002) (Yamaji
et al., 2010). Animal robots like Paro and AIBO
result in specific interaction experiences by merging
animal-like features with the original robot's
modalities (Shibata et al., 2002) (Fujita et al., 2010).
Training with additional humanlike features of an
object allows us to use a communication strategy by
merging the original features of the object and
humanlike features (Osawa et al., 2009) (Osawa et
al., 2010). However, there is no design method to
find original communication strategies for robots
except the analogical method (i.e., deriving
metaphors and abstractions from existing design).
This shortcoming of the previous approaches
prevents us from building a robot design on human-
nonhumanlike robot interaction (right bottom area
on Fig. 1) because we cannot directly apply human-
human interaction findings to nonhumanlike robots.
We propose the Possessed Robot method to find
a specific communication strategy for a robot that
can consider its own shape and modalities. In this
approach, one person "possesses a robot" and
behaves as if she/he is the robot while interacting
with another person. This trial-and-error interaction
process between two persons reveals original
communication strategies that are reasonable and
specific to each robot's shape and modalities. Our
approach is applicable to both humanlike and
nonhumanlike robots, as shown in Fig. 1. If the
Possessed Robot method is applied to a two-arm and
headless robot, the results are also applicable to
another robot that has the same design (shown in the
right side of Fig. 1).
Figure 2: Different styles of robots: their shapes and
modalities.
In this paper, we implemented the Possessed
Robot demonstrative System (PoRoS) to validate
our approach. PoRoS allows the user to possess the
robot by converting the user's behavior to the robot's
output and by converting the robot's input to the
user's input. We evaluated our proposal with a
demonstrative task in which a robot instructs a user how to
assemble a building from wooden blocks, while we
varied the robot's head modality. A
humanlike robot with head modality resembles
human modalities and allows us to use conventional
communication strategies, such as nodding and
shaking motions. However, a humanlike robot
without head modality requires different
communication strategies that cannot be achieved
with the existing human communication theory.
Headless or head-fixed robots, such as BIRON and
SmartPal, are also popular (Li et al., 2004)
(Matsukuma et al., 2004). The demonstrative task
also answers what kind of communication strategies
are more appropriate to the commonly used headless
robots.
The remainder of this paper is organized as follows.
Section 2 explains the differences between related
methods and studies (Wizard of Oz, teleoperation
robot, and marionette system) and the Possessed
Robot method. Section 3 explains the design process
of the Possessed Robot method, and section 4
explains in detail the implementation of PoRoS
(Possessed Robot System) for realizing the
Possessed Robot method. In section 5, we explain
the evaluation of PoRoS and the results are
presented in section 6. In sections 7 and 8, we
discuss the results and the conclusions, respectively.
2 RELATED STUDIES
In spite of differences in policy, there are several
similarities between previous approaches and ours.
In this section, we compare our work to related
studies and clarify our contribution.
2.1 Wizard of Oz
The Wizard of Oz (WoZ) method is widely used in
evaluating computer interfaces (Kelley et al., 1984).
This method uses a human manipulator as a sensor to
avoid nonessential errors in the evaluation. The
WoZ experiment method is also widely used in the
field of HRI. Steinfeld et al. derived several
evaluation methods (called Oz of
Wizard) from WoZ for evaluating robot behavior
(Steinfeld et al., 2009).
WoZ uses a human manipulator as part of the
experimental system in place of an autonomous component.
The manipulator behaves as the decision maker in
the system and selects the system behaviour from a
predetermined list. Consequently, the role of the human
manipulator in WoZ is restricted to replacing sensor
actions. The manipulator cannot select behaviours
that are not known in advance. In contrast, a human
manipulator plays a more important role in the
Possessed Robot method because its goal is to find
optimal communication strategies and robot designs.
The entire robot input and output are directly
connected to the manipulator, and the manipulator
behaves as an intelligent computer in finding the
optimal communication strategies for each task
using the specific robot shape and modalities.
2.2 Teleoperation Robot
Teleoperation robot studies also use manipulated
robots. The robot design is sometimes verified and
analyzed by recorded results. Kuzuoka et al.
discussed the optimal instructions in teleoperation
(Kuzuoka et al., 2000). However, teleoperation
studies themselves are not designed to find the
optimal communication strategies in autonomous
robotic systems. If the system behaves
autonomously, it is not teleoperation anymore.
Several research groups proposed to use
teleoperation to complement an autonomous robot.
Glas et al. proposed to use a human manipulator to
guide the robot (Glas et al., 2008). In their approach,
the robot behaviour is replaced by the human
manipulator if the task is hard for the robot to solve.
Thus, a human manipulator can temporarily possess
the robot. However, their study only focused on
improving the task performance in a real world
human-robot interaction. This approach did not
focus on feedback to optimize communication
strategies. They also hypothesized that the robot
might use humanlike modalities in the future. Other
robot possibilities are also not well discussed in their
paper.
2.3 Marionette and Digital Puppetry
Marionette puppetry is a well-known art of making puppets
behave in a lifelike manner (they are sometimes humanlike and
sometimes nonhumanlike). Currently, technology is expanding
the possibilities of interactive marionettes,
which are called digital puppetry. This
kind of system allows us to control humanlike and
nonhumanlike robots (Lee et al., 2009). Turtle Talk
with Crush is the most successful marionette in the
commercial field (Disney, 2004). It is a screen agent
that interactively changes its face and behaviour
according to people's responses.
However, these studies are specialized to each
robot's shape and modalities. Manipulation requires
a considerable amount of training time, even though the
interface is supported by today's technologies. Such a
marionette system is therefore not appropriate for the
trial-and-error approach that is required in our method.
3 DESIGNING THE POSSESSED
ROBOT METHOD
The Possessed Robot method is a design method
conducted by two participants. One participant
possesses a robot and behaves as if she/he is the
robot. The other participant interacts with the robot.
Based on the differences from previous studies
discussed in the section above, we estimate that the
following three sub goals are required to perform the
Possessed Robot method. First, the Possessed Robot
method requires a reconfigurable robot body to
examine all kinds of robot shapes and modalities.
Second, the manipulation method must be easy for
the human manipulator to allow frequent trial-and-
error efforts. Third, the system requires recording
the interaction between the robot and the human for
later analysis.
The entire design process is described below:
1. Select the robot input and output, and
configure the robot shape and modalities.
2. Assign two persons as the manipulator (who
possesses the robot) and the player (who
follows the robot).
ICAART 2012 - International Conference on Agents and Artificial Intelligence
3. Connect the robot input and output to the
manipulator. All connections must be
understandable and controllable by humans.
4. The two persons interact via the robot and
conduct a task cooperatively. They repeatedly
try to interact and gradually find the
optimal communication strategies for the task.
The system records the entire interaction.
5. The evaluators analyze the result of the
interaction and determine which modalities
are the most and least required. They also
compare the results with the human-human
interaction findings, which is the original
interaction setup for the robot.
This process refines the robot design. If we
require a more detailed analysis, we can also select
more suitable shapes and modalities based on the results
of step 5, and repeat the entire process.
4 DESIGNING POROS FOR
POSSESSED ROBOT METHOD
We implemented PoRoS (Possessed Robot
demonstrative System) to estimate the validity of our
process. We used a reconfigurable robot, a
monitoring device to capture movement, and a
recording system to solve the sub goals mentioned in
the previous section.
4.1 A Reconfigurable Robot that
Allows us to Use Variable Shape
and Modalities
In the Possessed Robot method, we can evaluate not
only the humanlike robot shape and modalities but
also any kind of shapes and modalities. For
evaluating the Possessed Robot method clearly and
rapidly, we created a robot kit that has separate body
parts and allows variable shapes and modalities. The
kit includes three-axis heads and two four-axis arms.
Each head has three motors. Each arm has two
motors at its root to achieve
movement in the pitch and yaw directions of
the arm. It also has two motors at the tip to achieve
movement in the pitch and roll directions of the
hand.
These devices are attachable and detachable by
Velcro tapes. Each head and arm are wired and
connected to a microcomputer, and can be separately
turned on and off. The total axes of the kit are
sufficient to reproduce normal humanlike robots. If
you want to turn off the modality of the head of the
robot, just turn off the switch and the robot stops
controlling the head. If you want a different
humanlike robot shape, you can detach each part and
attach it on a different position. In the experiment,
we assigned each part as in Fig. 3 left and compared
the communication strategies of the humanlike robot
by turning on and off the head of the robot.
Figure 3: Implemented reconfigurable robot on PoRoS
system and motion capture markers on a participant.
4.2 Monitoring Device using the
Motion Capture System
To use a human as the controller of the robot, we
need to monitor the behaviour of the human
manipulator and feed it to the robot. We used
a motion-capturing system to capture the movement of the
human manipulator because it makes it easy to understand
how to move the robot. In this system, we used seven
motion-capturing cameras (OptiTrack s250e
(Natural Point, 2010)) for tracing the human head
and hands. Each human body part is captured and
converted to robot body movement as described
below:
Head: The system extracts three angles (yaw,
pitch, and roll) of the head and assigns them to
the robot's head movement.
Arm: The system calculates the robot's arm
angles (yaw and pitch) from a vector from the
head position to the hand position.
Hand: The system calculates the robot's hand
angles (pitch and roll) from the directions of the
user's hand.
Each marker is attached to the human body as in
Fig. 3 right. Head markers are attached on the top of
the manipulator's head. Hand markers are attached
on the back of the manipulator's hands.
All origins are calculated as in Fig. 4. First, the
system calculates the centre of the human body
using the top of the head. The average position of
the centre of the body is 300 mm below the head top.
Second, the system calculates the origins of the right
and left arm from the centre of the body. Each origin
is on average 200 mm from the centre of the body.
We can assume that the origins of the arms are
stable because the manipulator stands in front of the
video screen and does not change her/his shoulder angle.
Third, we calculate the arms' vectors from each
angle and the arm's length (500 mm on average). Last, we
assign the hands' directions to the pitch and roll
axes of the robot's hands.
Figure 4: Calculation method for the position of each part.
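The origin and angle calculation described above can be sketched as follows. This is a minimal illustration, not the PoRoS implementation: the coordinate convention (x to the manipulator's right, y up, z forward, in millimetres), the purely lateral shoulder offset, and the function names `arm_origins` and `arm_angles` are assumptions introduced here. Only the 300 mm and 200 mm offsets come from the text, and the sketch measures the arm vector from the estimated arm origin rather than from the head.

```python
import math

BODY_CENTRE_DROP_MM = 300.0  # body centre below the head top (from the text)
SHOULDER_OFFSET_MM = 200.0   # arm origin distance from the body centre (from the text)


def arm_origins(head_top):
    """Estimate (left, right) arm origins from the head-top marker position.

    Assumes x points to the manipulator's right, y points up, z points forward,
    and that the arm origins are offset purely sideways from the body centre.
    """
    x, y, z = head_top
    centre_y = y - BODY_CENTRE_DROP_MM
    left = (x - SHOULDER_OFFSET_MM, centre_y, z)
    right = (x + SHOULDER_OFFSET_MM, centre_y, z)
    return left, right


def arm_angles(origin, hand):
    """Return (yaw, pitch) in degrees of the vector from origin to hand."""
    dx = hand[0] - origin[0]
    dy = hand[1] - origin[1]
    dz = hand[2] - origin[2]
    yaw = math.degrees(math.atan2(dx, dz))                    # left/right swing
    pitch = math.degrees(math.atan2(dy, math.hypot(dx, dz)))  # up/down swing
    return yaw, pitch
```

With this convention, a hand held straight in front of the arm origin gives a yaw and pitch of zero, and a hand raised directly above the origin gives a pitch of 90 degrees.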
4.3 System Connections
All modules are connected as in Fig. 5. In PoRoS,
the input data to the human manipulator is the video
image and the output data from the human
manipulator are the motion-capturing data and
angles of each motor. The latency from the robot to
the user is below 200 ms and this delay does not
cause any critical communication problems. All
input (video) and output (motor angles) data are
stored to the data server for later analysis.
Note that this PoRoS system is just one example
of realizing the Possessed Robot method and we can
select other inputs (motion-captured data by the
player) and output method (joystick) for other
implementations.
Figure 5: System implementation.
5 EXPERIMENT TO EVALUATE
THE POSSESSED ROBOT
METHOD USING POROS
To research how our method evaluates the design
and modalities of a nonhumanlike robot, we
compared human-humanlike robot and human-
nonhumanlike robot interaction using the PoRoS
robot. In nonhumanlike robot interaction, we fixed
the head of the robot to decrease the modalities for
confirmation. We also prohibited verbal
communication during interaction to emphasize the
role of the head.
As a demonstrative task, we set up the
assembly of wooden blocks to evaluate our method.
5.1 Pre-evaluation for Creating
Evaluation Method
Humans nod for confirmation. Nodding is conducted
by the human head. Head nodding has a regulatory
role in turn-taking in human-human communication
and human-computer interaction (Sacks et al., 1974)
(Cassell et al., 1999).
At first, we examined what kind of procedures
humans apply to make buildings by observing
human-human interaction. We gathered six
participants for this evaluation and assembled three
sets of pairs from them. One of the members of a
pair took the role of the manipulator. Another
member took the role of a player. The manipulator
instructed the player to build three kinds of buildings
as shown in Fig. 6. All examples in Fig. 6 consisted
of five kinds of blocks. First, the manipulator
looked at the buildings in Fig. 6. Second, she/he sat
down in front of the player. Last, she/he instructed
the player how to construct the buildings. All
manipulators were prohibited from directly touching the
blocks. The number of instructions during the
evaluation was between five and eight and the
construction time was between 30 s and 60 s.
Figure 6: Example buildings.
The result confirmed that human-human
interaction is based on turn-taking strategies. Each
pair's turn-taking happened according to each user's
nodding and shaking motion.
In detail, the processes are as follows. In the first
turn, the player pointed out one of the blocks. If the
block was the right one, the manipulator nodded and
communication continued to the next turn. If the
block was wrong, the manipulator shook her/his
head and the player repeated the first turn. In the
second turn, the player brought the block to the
manipulator and the manipulator directed the player
to rotate the block. Then, the player put the block on
the building. If the placed position and direction were
right, the manipulator nodded and communication
returned to the first turn until they completed the
building. If the position or direction was wrong, the
manipulator shook her/his head. Then, the player
placed the block on the desk and repeated the second
turn.
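The two-turn procedure observed in the pre-evaluation can be written as a small state machine. The sketch below is only an illustration of the described protocol: the state names `SELECT` and `PLACE`, the function name `run_turn_taking`, and the encoding of a nod as `True` and a head shake as `False` are assumptions introduced here.

```python
def run_turn_taking(block_checks, placement_checks):
    """Simulate the observed two-turn protocol and return the blocks placed.

    block_checks: booleans for the first turn, True when the manipulator nods
        at the block the player pointed out, False for a head shake.
    placement_checks: booleans for the second turn, True when the manipulator
        nods at the placed position and direction, False for a head shake.
    """
    block_iter = iter(block_checks)
    placement_iter = iter(placement_checks)
    placed = 0
    state = "SELECT"  # first turn: the player points out a block
    while True:
        if state == "SELECT":
            answer = next(block_iter, None)
            if answer is None:
                return placed
            if answer:
                state = "PLACE"   # nod: continue to the second turn
            # head shake: the player repeats the first turn
        else:
            answer = next(placement_iter, None)
            if answer is None:
                return placed
            if answer:
                placed += 1
                state = "SELECT"  # nod: return to the first turn
            # head shake: the player repeats the second turn
```

For example, `run_turn_taking([False, True, True], [False, True, True])` models one rejected block choice and one rejected placement before two blocks are placed successfully.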
5.2 Evaluation Method and Hypothesis
Based on the findings from the previous sections, we
compared the humanlike group and head-fixed
group for validating the proposed method. In the
humanlike group, the manipulator could handle the
PoRoS robot without any restrictions. However, in
the head-fixed group, the neck motor switches were
turned off by the system and the manipulator could
not control them.
This restriction forced the manipulator and the
player either to use other confirmatory behaviours for
turn-taking or to use different
communication strategies. We hypothesized that if they selected
communication strategies other than the turn-taking
method, the confirmatory behaviour would decrease in the
head-fixed group.
5.3 Environment for the Experiment
The experimental setup is shown in Fig. 7. The
manipulator and the player are in separate rooms.
The robot is fixed on a desk and placed in front of
the player. There are eight blocks on the desk
between the player and the robot. The viewpoints of
the camera and the robot are located in the same
direction. The manipulator can confirm the face of
the player. All input and output data are recorded
and stored in the data server for later analysis.
Figure 7: Experimental setup.
We show the scene of manipulation in Fig. 8.
The manipulator is standing on the left side of Fig.
8. Motion-capturing cameras surround him. The
video screen is in front of the manipulator and the
screen shows the robot, the blocks, and the player as
shown in the right top part of Fig. 8. An image of the
building is pasted on the right side of the screen, and
the manipulator instructs the player how to assemble
the blocks via the robot.
Figure 8: Experimental scene.
5.4 Participant and Experimental Flow
Thirty-six people participated in the
experiment: 34 males and 2 females. We
assigned 18 participants (including one female) to
the humanlike group and the remaining 18
participants to the head-fixed group. The eighteen
participants in each group were paired (a
manipulator and a player), so each group had nine
pairs.
The experiment was divided into the testing
phase and the recording phase. Before the
experiment, we instructed the participants as
follows: "In this experiment, you need to create
general communication strategies for the robot with
the assembling task. Do not use any kind of code
that is incomprehensible to other person." This
instruction was intended to keep the designed
communication strategies general.
At first, each manipulator calibrated the robot
parameters to the scale of his/her body. Then, the
pairs started the testing phase. During this phase,
each manipulator gave instructions for any kind of
buildings she/he could imagine. The members of
each pair made trial-and-error efforts and improved
their communication strategies.
When the pair determined that they could not
improve their manipulation time anymore, the
experiment moved to the recording phase. We
assigned the manipulator one of the three examples
in Fig. 6 and recorded the interaction. The pair
was required to assemble the building within 300 s.
When the recording finished, each participant
answered the questionnaire and the experiment was
terminated.
5.5 Prediction: Overlapped Time Ratio
and Confirmation Ratio
Pre-evaluation confirmed that turn-taking behaviour
was used in human-human interaction with
instructions on how to assemble the blocks. The
evaluation also revealed that head movement played
a key role in regulating turn taking. However, turn
taking itself is difficult to evaluate from video
recordings, especially when the evaluation lacks
verbal cues.
We used the overlap time ratio as an indicator of
turn-taking behaviour between each manipulator and
player. A previous HRI study using humanoids
showed that the increase in overlapped verbal cues
of both persons suggests failure of turn taking (Chao
et al., 2010). We extended this idea to nonverbal
situations. If turn taking took place without any
problems, the behaviour of the robot and the human
did not overlap. In contrast, if turn taking did not
succeed, the overlapped time ratio increased. In this
paper, we define the overlapped time ratio as the time during
which the robot moves while the user is lifting a block, divided by the user's total lifting
time. Note that the failure of turn taking does not
directly mean failure of communication. If the task
is successfully completed, this increased overlapped
time suggests different communication strategies
between the manipulator and the player.
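The overlapped time ratio defined above can be computed from two sets of time intervals. The following is a minimal sketch under assumptions of our own: the intervals are given as (start, end) pairs in seconds, and the function names `merge` and `overlapped_time_ratio` are illustrative rather than part of PoRoS.

```python
def merge(intervals):
    """Merge overlapping (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlapped_time_ratio(robot_moving, player_lifting):
    """Robot moving time during the player's lifting, per total lifting time."""
    robot = merge(robot_moving)
    lifting = merge(player_lifting)
    lifting_time = sum(end - start for start, end in lifting)
    if lifting_time == 0:
        raise ValueError("no lifting time recorded")
    overlap = 0.0
    for lift_start, lift_end in lifting:
        for move_start, move_end in robot:
            # Length of the intersection of the two intervals, if any.
            overlap += max(0.0, min(lift_end, move_end) - max(lift_start, move_start))
    return overlap / lifting_time
```

A ratio near 0 corresponds to clean turn taking, in which the robot is still while the player lifts, and a ratio near 1 means the robot moves for almost the whole lifting time.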
We used the time during which the player lifted a block to
monitor the player's behavioural time. We counted
this time from the recorded input video
data. We used the robot's moving time to monitor
the manipulator's behavioural time. When a motor
moved more than ten degrees in 1 s, we counted this
as behavioural time of the manipulator. The
behavioural time of the player did not include the
time during which a block was suspended in the air. If there was a
difference in the overlapped time between the
humanlike and the head-fixed groups, this difference
suggested that the two groups used different
turn-taking methods.
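The motion threshold for the manipulator's behavioural time can be sketched as a windowed check on the recorded motor angles. Several details here are assumptions rather than facts from the experiment: the threshold is read as 10 degrees, the windows are consecutive and non-overlapping, movement is measured as the summed absolute change of a single motor within the window, and the name `moving_seconds` and the sample layout are hypothetical.

```python
def moving_seconds(angle_samples, samples_per_second, threshold_deg=10.0):
    """Count 1 s windows in which any motor travels more than threshold_deg.

    angle_samples: list of tuples, one tuple per sample, holding one angle in
        degrees per motor. Samples are assumed to be evenly spaced in time.
    samples_per_second: number of sample steps that make up one second.
    """
    count = 0
    step = samples_per_second
    # Each window spans step + 1 samples, i.e. step angle changes (1 s).
    for start in range(0, len(angle_samples) - step, step):
        window = angle_samples[start:start + step + 1]
        for motor in range(len(window[0])):
            travel = sum(
                abs(window[i + 1][motor] - window[i][motor]) for i in range(step)
            )
            if travel > threshold_deg:
                count += 1  # this second counts as manipulator behaviour
                break
    return count
```

The returned count is the number of seconds attributed to manipulator behaviour, which could then be converted to intervals for the overlap calculation.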
Our predictions for the head-fixed group in
comparison with the humanlike group are the
following:
Prediction 1: The overlapped time ratio will
increase depending on the failure of the turn-
taking behaviour.
Prediction 2: The ratio of confirmatory
behaviours will decrease.
In the head-fixed group, we asked the
manipulator questions such as "Did you use
confirmatory behaviour? If so, what kind of
confirmation did you use?".
6 RESULTS
One male pair in the humanlike group and two male
pairs in the head-fixed group could not finish
assembling the blocks. Other pairs succeeded in this
task.
The average overlapped time ratio in the
humanlike group is .608 (SD = .062). The average
overlapped time ratio in the head-fixed group is .761
(SD = .125). We applied Welch's t-test to the two
groups and the p-value is .0043 < .05. This
result shows that the overlapped time ratio differs
significantly between the two groups, which supports
the first prediction. The overlapped time ratio is
shown in Fig. 9. When we removed the failed pairs,
the average overlapped time ratio in the humanlike
group is .792 (SD = .132) and the overlapped time
ratio in the head-fixed group is .132 (SD = .151).
The p-value from the Welch's t-test is .01 < .05,
which also suggests significant difference.
Figure 9: Overlapped time during humanoid and hand
robot.
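For readers who wish to see how Welch's t-test works on summary statistics, the sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom. It is an illustration only: the function name `welch_t` is our own, the group size of nine pairs per group is taken from the participant description, and because the reported means and standard deviations are rounded, the output is not expected to reproduce the reported p-values exactly.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean2 - mean1) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Overlapped time ratio with all pairs included; n = 9 per group is an assumption.
t, df = welch_t(0.608, 0.062, 9, 0.761, 0.125, 9)
```

The p-value would then be obtained from the t distribution with `df` degrees of freedom, for example with a statistics library.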
The questionnaires after the experiment showed
that all manipulators in the humanlike group used
head nodding and shaking for confirmation. In
contrast, nine manipulators in the head-fixed group
raised their hand for confirmation and shook their
hand for denying. Two manipulators in the head-
fixed group answered that they did not use
confirmation in their communication. Based on this
result, we counted the raising and shaking hands as
confirmation in the head-fixed group.
Two kinds of confirmation were used, before
and after the player lifted a block. Confirmation
before lifting (before-confirmation) was
used to indicate whether the selected block was right or wrong.
Confirmation after lifting
(after-confirmation) was used to indicate whether the location and
direction were right or wrong. We counted both
confirmations.
The average before-confirmation ratio is .63 (SD
= .22) in the humanlike group and .09 (SD = .19) in
the head-fixed group. We applied Welch's t-test to
both groups and the results showed p-values
of .00003, which is less than .0001. When we
removed the failed pairs, the average before-
confirmation ratio is .62 (SD = .22) in the humanlike
group and .11 (SD = .20) in the head-fixed group.
The p-value of the Welch's t-test is .0006, which is
less than .001 and suggests significant difference.
The average after-confirmation ratio is .78 (SD
= .21) in the humanlike group and .30 (SD = .24) in
the head-fixed group. We applied Welch's t-test to
both groups and the result showed p-values of .0005,
clearly smaller than .001. When we removed the
failed pairs, the average after-confirmation ratio
is .78 (SD = .23) in the humanlike group and .28
(SD =.25) in the head-fixed group. The p-value of
the Welch's t-test is .001 < .005, suggesting
significant difference.
We also counted the manipulation time including
the before- and after-confirmation of the robot and
the lifting time of the player. The average time is 7.7
s (SD = 2.4 s) in the humanlike group and 12.8 s
(SD = 5.0 s) in the head-fixed group. We applied
Welch's t-test and found significant difference (p
= .017 < .05). When we removed the failed pairs, the
average time is 7.1 s (SD = 1.8 s) in the humanlike
group and 13.3 s (SD = 5.4 s) in the head-fixed
group. The p-value of the Welch's t-test is .02 < .05,
suggesting significant difference.
In contrast, the average number of lifting actions is 10.9 (SD
= 6.0 s) in the humanlike group and 13.2 (SD = 10.8
s) in the head-fixed group. We applied Welch's t-test
and found no significant difference (p = .58 > .05).
When we removed the failed pairs, the average
lifting numbers were 9.1 (SD = 3.0 s) in the
humanlike group and 8.4 (SD = 2.9 s) in the head-
fixed group. The p-value of the Welch's t-test is .65
> .10, which suggests no significant difference.
7 DISCUSSION
7.1 Predictions
We found significant differences in the overlapped
time ratio and confirmation ratio with and without
the failed pairs. These results support our
predictions.
Pairs in the humanlike group followed a
player-first protocol. After the lifting motion, the player
sometimes skipped checking the movement of the
robot when rotating and placing a block.
In this case, the robot's confirmation was sent after the
placement. The manipulator usually confirmed
every movement of the player. In eight pairs of the
humanlike group, the manipulator first pointed to the
target, the player subsequently pointed to the same
target, and then the robot confirmed. The failed pair
skipped the first pointing, which caused more mistakes.
They spent their entire 300 s and the task failed. The
recorded video also shows that almost all players used
turn-taking style strategies because the players
watched the robot periodically.
In contrast, the pairs in the head-fixed group
followed a robot-first protocol. The manipulator in
the head-fixed group sometimes omitted the
before-confirmation. In this case, when the robot pointed to
a block and the player took it, the player moved the
block while observing and following the movement
of the robot's arms without any confirmation. The
manipulator also omitted the after-confirmation and
moved on to the next block. However, omission
happened more often in before-confirmation than in
after-confirmation. The recorded video also supports the view that
they used a follow-the-robot strategy because the
player carefully watched the robot during the lifting
time.
The manipulation time including lifting time
was significantly longer in the head-fixed group
than in the humanlike group. Based on the video
recording, this result suggests that each manipulation
took longer in the head-fixed group because the players
watched the robot motion and followed it. The
nonsignificant difference in the number of lifting actions suggests
that the order of assembly is not influenced
by the change of modalities. These two results
suggest that the change in the head modality did not
drastically change the entire communication strategy;
it changed only the manipulation strategy, from the turn-taking
style to the follow-the-robot style.
These findings support our hypothesis that the
turn-taking strategy changed in the head-fixed
group, where the pairs used robot-leading
strategies. We infer that the limited confirmation
modalities forced the pairs to use robot-leading
interaction.
7.2 Discussion about the Design
Process
The entire design process discussed in Section 3
suggests that we can obtain an alternative
communication strategy for nonhumanlike robots
using the Possessed Robot method.
The Possessed Robot method shows the potential
power of human computation in robot design.
The human brain is the most intelligent computer we
can access. It has the most flexible learning and
most sophisticated communication algorithms. It can
provide the most appropriate response to
unpredicted situations. For example, we expected
that the manipulator would need a lot of calibration
time even with the motion-capturing system. However,
the manipulator quickly adapted to the robot body
and could behave as if she/he were the robot.
The design process can also be varied by using
human resources differently. Participants'
free-writing in the questionnaire suggests that
swapping the manipulator and the player during the
design process would reduce the thinking time. The
questionnaire from the manipulator also suggests
that involving a third person who does not know the
purpose would increase the generality of the strategy.
7.3 Limitations and Future Work
The purpose of this paper is to evaluate the validity
of our method with a block-assembling task. Our
results show one example in which a head-fixed design
with no verbal cues led to robot-first
instructions. From the experimental conditions, we
infer that this change in communication
strategy is caused by the lack of confirmatory
modalities in the head-fixed robot. Our experiment
used only nonverbal communication. Our findings
may be useful in fields where verbal interaction
imposes a high cognitive load (such as rescue and
guiding robots). However, the results cannot be
directly applied to human-robot interaction studies in
which verbal cues are used.
Our findings from the experiment may need
further research to show their general applicability.
However, our results support the usefulness of the
Possessed Robot method in HRI studies because it
can find different communication strategies in
interaction between humans and nonhumanlike robots.
Such strategies are hard to find with previous
approaches, which designed and implemented robot
shapes and modalities according to human-human
interaction. Our results suggest that the
robot-leading design may be optimal for
headless or head-fixed robots, such as
SmartPal and BIRON (Li et al., 2004) (Matsukuma
et al., 2004). It is also possible to assemble
guidelines (what design is reasonable and what
design is unpredictable) using the Possessed Robot
method. Such guidelines could reduce wasted
investment in the development of robot interfaces.
In future work, we also need to discuss how to find
optimal ways to connect the robot I/O to human I/O.
In this experiment, we started with a simplified
demonstration based on a design with reduced
human modalities. Even if the human is a powerful
problem solver, we expect that it is still difficult to
handle additional inputs and outputs that do not come
to humans natively. We predict that studies on
prostheses and augmented human technologies will
expand what a human can handle.
8 CONCLUSIONS
We proposed an alternative approach called the
Possessed Robot method to find a robot's unique
communication strategy. Robot shapes and modalities
have previously been designed by imitating human-human
interaction. This approach has restricted robot design
and behaviour within the limitations of the possible
human modalities. In our approach, the human
manipulator behaves as if she/he possesses the robot
and finds the optimal communication strategies
based on the shape and modalities of the robot.
We implemented the Possessed Robot system
(PoRoS) including a reconfigurable body robot, an
easier manipulation system, and a recording system
to evaluate the validity of our method. We evaluated
a block-assembling task with PoRoS by turning the
modality of the robot's head on and off.
Synchronized motion significantly increased in
the head-fixed design, and the ratio of confirmatory
behaviour significantly decreased. Based on the
results, we found an example case of the optimal
communication strategy in the head-fixed design. In
this case, the robot leads and the user follows the
robot, in contrast to the turn-taking
communication style in the humanlike condition.
This result shows the feasibility of the Possessed
Robot method in finding the appropriate strategy
according to each robot design.
ACKNOWLEDGEMENTS
This work was supported by the JST PRESTO
program.
REFERENCES
del Pobil, A. P., and Sudar, S. 2010. Lecture Notes of the
Workshop on the Interaction Science Perspective on
HRI: Designing Robot Morphology, at ACM/IEEE
Human Robot Interaction http://www.robot.uji.es/
research/events/hri2010
Blow, M., Dautenhahn, K., Appleby, A., Nehaniv, C., and
Lee, D. 2006. Perception of Robot Smiles and
Dimensions for Human-Robot Interaction Design, In
Proceedings of the 15th IEEE Int Symposium on Robot
and Human Interactive Communication, IEEE,
469-474.
Kanda, T., Kamasima, M., Imai, M., Ono, T., Sakamoto,
D., Ishiguro, H., and Anzai, Y. 2007. A humanoid
robot that pretends to listen to route guidance from a
human. Autonomous Robots, 22, 1, 87–100.
Lee, M. K., Kiesler, S., Forlizzi, J., Srinivasa, S., and
Rybski, P. 2010. Gracefully mitigating breakdowns in
robotic services. In Proceedings of Human Robot
Interaction, 203-210.
Shiwa, T., Kanda, T., Imai, M., Ishiguro, H., and Hagita,
N., 2009. How Quickly Should a Communication
Robot Respond? Delaying Strategies and Habituation
Effects. International Journal of Social Robotics, 1, 2,
141-155.
Hirai, K., Hirose, M., Haikawa, Y., and Takenaka, T.,
1998. The development of Honda humanoid robot. In
Proceedings of the IEEE Intl. Conf. on Robotics and
Automation (ICRA), 1321–1326.
Li, S., Kleinehagenbrock, M., Fritsch, J., Wrede, B., and
Sagerer, G., 2004. "BIRON, let me show you
something": evaluating the interaction with a robot
companion. In Proceedings of the IEEE International
Conference on Systems, Man and Cybernetics, 3,
2827-2834.
Matsukuma, K., Handa, H., and Yokoyama, K., 2004.
Vision-based manipulation system for autonomous
mobile robot 'SmartPal'. In Proceedings of Japan
Robot Association Conference, 3D28.
Matsumoto, N., Fujii, H., Goan, M., and Okada, M., 2005.
Minimal communication design of embodied interface.
In Proceedings of the International Conference on
Active Media Technology (AMT), 225-230.
Bartneck, C. 2002, eMuu – An InterFace for the
HomeLab, In Poster at the Philips User Interface
Conference (UI2002)
Yamaji, Y., Miyake, T., Yoshiike, Y., De Silva, P. R. S.,
and Okada, M., 2010. STB: human-dependent sociable
trash box. In Proceedings of Human Robot
Interaction. 197-198.
Shibata, T., Mitsui, T., Wada, K., and Tanie, K., 2002.
Subjective Evaluation of Seal Robot: Paro
- Tabulation and Analysis of Questionnaire Results -,
Journal of Robotics and Mechatronics, 14, 1, 13-19.
Fujita, M. and Kitano, H. 1998. Development of an
Autonomous Quadruped Robot for Robot
Entertainment, Autonomous Robots, 5, 7-18.
Osawa, H., Ohmura, R., Imai, M., 2009. Using Attachable
Humanoid Parts for Realizing Imaginary Intention and
Body Image, International Journal of Social Robotics,
1, 1, 109-123.
Osawa, H., Orszulak, J., Godfrey, K. M., Coughlin, J.,
2010. Maintaining Learning Motivation of Older
People by Combining Household Appliance with a
Communication Robot, In Proceedings of the
IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS), 5310-5316.
Kelley, J. F., 1984. An iterative design methodology for
user-friendly natural language office information
applications. ACM Transactions on Office Information
Systems, 2, 1, 26–41.
Steinfeld, A., Jenkins, O. C., and Scassellati, B., 2009. The
oz of wizard: simulating the human for interaction
research. In Proceedings of the 4th ACM/IEEE
international conference on Human robot interaction
(HRI '09). ACM, 101-108.
Kuzuoka, H., Oyama, S., Yamazaki, K., Suzuki, K., and
Mitsuishi, M., 2000. GestureMan: A Mobile Robot
that Embodies a Remote Instructor's Actions, In
Proceedings of Computer Supported Cooperative
Work, 155-162.
Glas, D. F., Kanda, T., Ishiguro, H., and Hagita, N., 2008.
Simultaneous Teleoperation of Multiple Social
Robots, In Proceedings of Human-Robot Interaction,
311-318.
Lee, J. K., Stiehl, W. D., Toscano, R. L., and Breazeal, C.
2009. Semi-Autonomous Robot Avatar as a Medium
for Family Communication and Education. In
Proceedings of Advanced Robotics. 1925-1949.
Disney, 2004. Turtle Talk with Crush, http://disneyland.
disney.go.com/disneys-california-adventure/turtle-talk
-with-crush/?name=TurtleTalkEntertainmentPage
NaturalPoint, OptiTrack s250e, 2010, http://www.natural
point.com/optitrack/products/s250e
Sacks, H., Schegloff, E. A., and Jefferson, G. 1974. A
simplest systematics for the organization of turn-
taking for conversation. Language, 50, 696-735.
Cassell, J. and Thórisson, K. R. 1999. The Power of a Nod
and a Glance: Envelope vs. Emotional Feedback in
Animated Conversational Agents. Applied Artificial
Intelligence, 13, 519-538.
Chao, C., and Thomaz, A. L., 2010. Turn Taking for
Human-Robot Interaction, In Proceedings of AAAI
Fall Symposium (Applied Artificial Intelligence)