ble 4 indicates that the additional computational effort
to process the utterances with a large lexicon plays no
significant role.
Table 4: Response times with different lexicons.

              small lexicon    large lexicon
utt. i        0.07 sec         0.08 sec
utt. ii       0.10 sec         0.14 sec
utt. iii      0.68 sec         0.90 sec
utt. iv       0.76 sec         1.15 sec
utt. v        2.35 sec         2.51 sec
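
The overhead introduced by the large lexicon can be computed directly from the values in Table 4. The following is only a minimal sketch of that arithmetic; the names `TABLE_4` and `lexicon_overhead` are illustrative assumptions and not part of the presented system.

```python
# Response times in seconds, copied from Table 4 as
# (small lexicon, large lexicon) pairs.
TABLE_4 = {
    "utt. i": (0.07, 0.08),
    "utt. ii": (0.10, 0.14),
    "utt. iii": (0.68, 0.90),
    "utt. iv": (0.76, 1.15),
    "utt. v": (2.35, 2.51),
}


def lexicon_overhead(times):
    """Return {utterance: (absolute overhead in sec, relative overhead in %)}."""
    result = {}
    for utt, (small, large) in times.items():
        absolute = large - small
        relative = 100.0 * absolute / small
        result[utt] = (round(absolute, 2), round(relative, 1))
    return result


if __name__ == "__main__":
    for utt, (abs_sec, rel_pct) in lexicon_overhead(TABLE_4).items():
        print(f"{utt}: +{abs_sec:.2f} sec ({rel_pct:.1f} %)")
```

For the five utterances, the absolute overhead never exceeds 0.39 sec, although the relative overhead varies between roughly 7 % and 51 %.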
4.3 Discussion
A key factor for successful human-robot interaction,
as far as the user’s patience is concerned, is the
system’s reaction time. The average human attention
span (for focused attention, i.e. the short-term
response to a stimulus) is considered to be
approximately eight seconds (Cornish and Dukette, 2009).
Therefore, the time we need to process a user’s
utterance and react in some way must not exceed
eight seconds. Suitable reactions are executing the
request, rejecting it, or starting a clarification process.
Hence, the question of whether computation times
are reasonable is in fact the question of whether they
exceed eight seconds. The answer, however, is not as
simple as the question. The optimised system performs
well in a realistic test scenario, as shown by the last
row of Table 2. Complex test scenarios, in turn, can
lead to serious problems, as Table 3 indicates. However,
we saw that ambiguity is a smaller problem than the
length of an utterance¹. Skills that have more than
three parameters are rare in the field of mobile service
robots. In fact, the skills with four or five parameters
that we used in the tests of Table 3 had to be created
artificially for lack of realistic examples.
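
The eight-second criterion can be stated as a simple check on measured response times. The sketch below is an illustration under assumptions: the helper names `over_budget` and `headroom` are hypothetical, and only the large-lexicon times from Table 4 are used as sample data, since the values of Tables 2 and 3 are not reproduced in this section.

```python
# Budget derived from the attention span cited in the text
# (Cornish and Dukette, 2009).
ATTENTION_BUDGET_SEC = 8.0

# Large-lexicon response times in seconds, copied from Table 4.
LARGE_LEXICON_TIMES = {
    "utt. i": 0.08,
    "utt. ii": 0.14,
    "utt. iii": 0.90,
    "utt. iv": 1.15,
    "utt. v": 2.51,
}


def over_budget(times, budget=ATTENTION_BUDGET_SEC):
    """Return the utterances whose response time exceeds the budget."""
    return {utt: t for utt, t in times.items() if t > budget}


def headroom(times, budget=ATTENTION_BUDGET_SEC):
    """Return the time in seconds left between the slowest response and the budget."""
    return budget - max(times.values())


if __name__ == "__main__":
    print("over budget:", over_budget(LARGE_LEXICON_TIMES))
    print(f"headroom: {headroom(LARGE_LEXICON_TIMES):.2f} sec")
```

None of the Table 4 measurements exceeds the budget; the slowest one (2.51 sec) leaves about 5.5 sec of headroom.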
5 CONCLUSIONS & FUTURE
WORK
We presented a system for interpreting commands
issued to a domestic service robot using decision-
theoretic planning. The proposed system allows for
a flexible matching of utterances and robot capabili-
ties and is able to handle faulty or incomplete com-
mands by using clarification. It is also able to provide
explanations in case the user’s request cannot be
executed and is rejected. The system covers a broader set
of possible requests than existing systems with small
and fixed grammars. It also performs fast enough to
avoid annoying the user or losing his or her attention.

¹ By the length of an utterance, we mean the number of
spoken objects.

Our next step is to deploy the system in a
RoboCup@Home competition to test its applicability
in a real setup. A possible extension of the approach
could be to include a list of the n most probable
interpretations and to verify with the user which of
these should be executed. Moreover, properly
integrating the use of adverbials as qualifiers for nouns
both in the grammar and the interpretation process
would further improve the system’s capabilities.
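
The n-best extension mentioned above could look roughly as follows. This is a sketch under assumptions rather than a description of the presented system: the `Interpretation` type, its `probability` field, and the `ask` callback are hypothetical, and the text does not specify how candidate interpretations would be scored or how the user would be queried.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Interpretation:
    description: str    # paraphrase of the interpreted command (hypothetical)
    probability: float  # score assigned by the interpreter (hypothetical)


def n_best(candidates: List[Interpretation], n: int) -> List[Interpretation]:
    """Return the n most probable interpretations, most probable first."""
    return sorted(candidates, key=lambda c: c.probability, reverse=True)[:n]


def confirm_with_user(
    candidates: List[Interpretation],
    n: int,
    ask: Callable[[str], bool],
) -> Optional[Interpretation]:
    """Offer the n best interpretations in turn; return the first one confirmed.

    `ask` poses a yes/no question to the user and returns True on "yes".
    Returns None if the user rejects all n candidates.
    """
    for candidate in n_best(candidates, n):
        if ask(f"Do you want me to {candidate.description}?"):
            return candidate
    return None
```

Asking about the candidates in order of decreasing probability keeps the expected number of questions low; a return value of None would correspond to rejecting the request or starting a further clarification.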
REFERENCES
Austin, J. L. (1975). How to Do Things with Words. Harvard
University Press, 2nd edition.
Beetz, M., Arbuckle, T., Belker, T., Cremers, A. B., and
Schulz, D. (2001). Integrated plan-based control of
autonomous robots in human environments. IEEE In-
telligent Systems, 16(5):56–65.
Boutilier, C., Reiter, R., Soutchanski, M., and Thrun, S.
(2000). Decision-theoretic, high-level agent program-
ming in the situation calculus. In Proc. of the 17th
Nat’l Conf. on Artificial Intelligence (AAAI-00), pages
355–362. AAAI Press/The MIT Press.
Clodic, A., Alami, R., Montreuil, V., Li, S., Wrede, B.,
and Swadzba, A. (2007). A study of interaction be-
tween dialog and decision for human-robot collabora-
tive task achievement. In Proc. Int’l Symposium on
Robot and Human interactive Communication (RO-
MAN’07), pages 913–918. IEEE.
Cohen, P. R. and Levesque, H. J. (1985). Speech acts and
rationality. In Proc. of the 23rd Annual Meeting on
Association for Computational Linguistics, pages 49–
60.
Cornish, D. and Dukette, D. (2009). The Essential 20:
Twenty Components of an Excellent Health Care
Team. RoseDog Books.
Doostdar, M., Schiffer, S., and Lakemeyer, G. (2008). Ro-
bust speech recognition for service robotics applica-
tions. In Proc. of the Int’l RoboCup Symposium 2008
(RoboCup 2008), pages 1–12. Springer.
Ervin-Tripp, S. (1976). Is Sybil there? The structure of
some American English directives. Language in
Society, 5(1):25–66.
Ferrein, A. and Lakemeyer, G. (2008). Logic-based robot
control in highly dynamic domains. Robotics and
Autonomous Systems, 56(11):980–991. Special Issue on
“Semantic Knowledge in Robotics”.
Fong, T., Thorpe, C., and Baur, C. (2003). Collabora-
tion, dialogue, human-robot interaction. In Robotics
Research, volume 6 of Springer Tracts in Advanced
Robotics, pages 255–266. Springer.
Görz, G. and Ludwig, B. (2005). Speech Dialogue Systems
- A Pragmatics-Guided Approach to Rational
Interaction. KI – Künstliche Intelligenz, 10(3):5–10.
ICAART 2012 - International Conference on Agents and Artificial Intelligence