MODEL BASED CONTINUAL PLANNING AND CONTROL FOR ASSISTIVE ROBOTS

A. Anier and J. Vain

Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia

Keywords:

Model based control, Continual planning, Cognitive architecture, Online safety monitoring.

Abstract:

The paper presents a model-based robot planning and control framework for human assistive robots, namely Scrub Nurse Robots. We focus on endoscopic surgery as one of the most relevant types of surgery for applying robot assistants. We demonstrate that our framework provides the means for seamless integration of sensor data capture, cognitive functions for interpreting sensor data, model-based continual planning and actuation control. The novel component of the architecture is a distributed continual planning system built on the Uppaal timed automata model-based verification and control tool suite. The distributed and modular architecture of the framework enables flexible online reconfiguration and easy adaptation to various application contexts. Online learning and safety monitoring functions ensure timely and safe on-the-fly updates of software components.

1 INTRODUCTION

Assistive robotics sets high standards for the cognitive capabilities, autonomy and movement precision of robots. Functionally, this means understanding human intention and reacting to it adequately. Technically, it means human-in-the-loop collaborative action control, fusion of information from various sensors, high-accuracy actuation and reliable software implementation. The safety of action and trajectory planning becomes critical when the robot shares the user’s working envelope to achieve the required physical interaction.

This paper presents a software integration framework for the Scrub Nurse Robot (SNR) (Miyawaki et al., 2005), focusing on distributed model-based continual planning and control. The goal of an SNR is to learn the interactions between a surgeon and a scrub nurse during laparoscopic surgery and to replace the (human) nurse on demand. The key requirement for incorporating the SNR into the collaborative action (e.g. when the human scrub nurse has to deal with unexpected emergencies) is that the surgeon should not need to re-adapt to the changed partner: the “original feel” and the accustomed workflow must be preserved. A physical scene of an example SNR deployment is shown in Fig. 1.

A scrub nurse must hand a surgical instrument to a surgeon as soon as it is requested. If the scrub nurse has to spend time searching for the instrument after the request, the procedure is interrupted, valuable time is lost and an unnecessary burden is placed on the surgeon. This may reduce the quality and effectiveness of the operation. To avoid delays, the scrub nurse must be fully attentive to the activity in the operative field and anticipate accurately what the surgeon will need. For this to be possible, the scrub nurse not only needs to know the surgical procedure as well as the surgeon does, but must also be highly disciplined. The “ideal” scrub nurse (if one exists) is able to pass a surgeon whatever is needed, without any verbal order, at the moment the surgeon’s hand is extended to receive it.

Figure 1: SNR intraoperative scene (Miyawaki et al., 2005).

The goal of the SNR software project is to develop a human-adaptive SNR capable of adapting to surgeons with various levels of skill and experience, as well as to different personalities and moods. In other words, the SNR ought to function as an “ideal” scrub nurse. Highly developed cognitive faculties such as machine vision and speech recognition, as well as adaptive robotic arm path planning and targeting, are required to attain this ideal.

Anier A. and Vain J.
MODEL BASED CONTINUAL PLANNING AND CONTROL FOR ASSISTIVE ROBOTS.
DOI: 10.5220/0003783503820385
In Proceedings of the International Conference on Health Informatics (HEALTHINF-2012), pages 382-385
ISBN: 978-989-8425-88-1
© 2012 SCITEPRESS (Science and Technology Publications, Lda.)

In conventional surgical operations a scrub nurse frequently has to handle a wide array of different instruments, and it is difficult to make the SNR adaptive to such busy operations. Therefore, the SNR prototype has been designed for endoscopic surgery, which needs only a limited range of surgical instruments. The adaptivity of the SNR requires unsupervised learning by observing skilled nurses’ interactions and behavior during surgical operations.

Online recognition and anticipation of a surgeon’s motions during an operation is essential for classifying which motions are common to all surgeons and which are specific to individuals. This in turn aids in anticipating a surgeon’s needs and in adapting to changes of procedure. In addition, the results of investigating intraoperative behavior have to be abstracted and stored in the form of mathematical and/or formal models in order to reproduce the variety of motion trajectories that can be expected from various combinations of surgical procedures and varying external factors. The model of a nurse’s behavior as he or she reacts to other surgical staff (surgeon, assistant and others) serves as a high-level behavior specification for SNR action planning.

The SNR’s control architecture, depicted in Fig. 2, comprises the following components. A 3D position tracking system measures the coordinates of position-tracking markers with a precision better than 1 mm at a sampling rate of up to 200 fps. The sampled data on the surgeon’s hand movements is passed to a gesture recognition module that runs multiple recognition methods in parallel. These methods for detecting the operator’s current motion, combined with a voting mechanism (Vain et al., 2009), maximize the confidence of the recognition. The identified motion and its parameters are inputs to reactive motion planning, which compares the observed movement of the surgeon’s hand with the movement predicted by the surgeon’s behavior model and the surgery scenario model.
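To illustrate how the outputs of several recognizers running in parallel might be combined, the following is a minimal sketch of a confidence-weighted plurality vote. It is not the voting mechanism of (Vain et al., 2009), which is built on voting automata; the function name `vote`, the motion labels and the confidence values are assumptions made for illustration only.

```python
from collections import defaultdict


def vote(predictions):
    """Combine (label, confidence) pairs from parallel recognizers.

    Returns the label with the largest summed confidence together with
    that label's share of the total confidence mass (0.0 to 1.0).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    scores = defaultdict(float)
    for label, confidence in predictions:
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        scores[label] += confidence
    total = sum(scores.values())
    winner = max(scores, key=scores.get)
    share = scores[winner] / total if total > 0 else 0.0
    return winner, share


# Hypothetical outputs of three recognition methods for one motion sample.
sample = [("extend_hand", 0.9), ("extend_hand", 0.6), ("retract_hand", 0.7)]
winner, share = vote(sample)
```

The returned share gives a rough measure of agreement between the methods; a planner could, for example, ignore a recognition whose share falls below a chosen threshold.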

Such online conformance monitoring allows the current model state to be corrected with a precision limited only by the minimum sampling error. From the corrected state information and the surgery scenario model, the next SNR action is planned and the resulting control parameters are transferred to the actuation control unit of the SNR. The information about the surgeon’s possible reactions, as predicted by the surgeon’s model, is returned to the motion recognition module to narrow down the decision space when a new movement is being recognized.

Figure 2: Architecture model. (Diagram labels: Computer mouse, 3DMS cameras, TRC, Socket, EvaComm2 SDK, Mouse Tracker, MotionAnalysis EVaRT, Visualization, Configuration, jEvart, Object definition subsystem, Rapid Miner, Uppaal TRON (DTRON), Voting automata, TA automata, TA generator, Spread Proxy, Filtering, Motion rec., Spread op., EVaRT op., Spread, SNR.)
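The conformance-monitoring step described above can be sketched as follows. This is a simplified, untimed illustration under stated assumptions: the behavior model `MODEL`, its state names and its motion labels are hypothetical, and the state-correction rule (jump to the unique state that accepts the observation) is only one plausible choice, not the method used in the SNR.

```python
# Hypothetical behavior model: state -> {expected motion: next state}.
MODEL = {
    "idle":      {"extend_hand": "waiting"},
    "waiting":   {"grasp": "operating"},
    "operating": {"retract_hand": "idle", "extend_hand": "waiting"},
}


def predicted_motions(state):
    """Motions the model expects next; usable to narrow recognition."""
    return set(MODEL[state])


def step(state, observed):
    """Advance the model on an observed motion.

    Returns (new_state, conforms). On a mismatch the state is corrected
    to the unique model state that accepts the observation, if there is
    exactly one; otherwise the state is kept and the mismatch reported.
    """
    if observed in MODEL[state]:
        return MODEL[state][observed], True
    candidates = [s for s, moves in MODEL.items() if observed in moves]
    if len(candidates) == 1:
        return MODEL[candidates[0]][observed], False
    return state, False
```

Feeding `predicted_motions(state)` back to the recognizer corresponds to the feedback path in the text: only the motions the model considers possible need to be discriminated for the next sample.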

The control architecture described above is implemented on an open middleware platform, which is discussed in more detail in the following sections.

2 SOFTWARE ARCHITECTURE

2.1 Data Acquisition

The SNR does not have integrated vision. Instead, visual feedback control is implemented by means of an external MotionAnalysis Hawk near-infrared active 3D measurement system (3DMS). The 3DMS is not the only source of information: various sensors monitor the state of the robot, and peripheral interfaces contribute to overall situation and context awareness. For instance, the position data of surgical instruments is backed by RFID readings of the positions of ceramic RFID tags attached to the instruments. Abdominal video imaging from the laparoscopic camera provides more accurate information about the course of the surgery.

The jEvart middleware unifies 3DMS data with the other data acquisition sources and passes it to the data analysis and cognitive modules, which are implemented using the Rapid Miner tool (www.rapid-i.com).


2.2 Data Analysis and Cognitive Functions

The robot control framework and middleware provide a common platform for integrating data acquisition and cognitive functions.

Data analysis and cognitive functions are implemented by means of the data mining toolkit Rapid Miner. It includes hundreds of algorithms, ranging from filtering and clustering to machine learning, packaged into an integrated development environment. Rapid Miner is inspired by the WEKA machine learning toolkit (Hall et al., 2009), improved with extensive data visualization and analysis automation tools.

To make Rapid Miner fit the overall SNR control architecture, some custom plug-ins have been implemented. Specifically, these are the data acquisition components that capture the data available for analysis and visualization, and the DTRON plug-in that bridges cognitive functions to deliberative control level functions. The deliberative control is based on provably correct timed automata models executed symbolically by the DTRON tool.

3 DISTRIBUTED TRON

The timed automata based action planning and control of the SNR make use of the Uppaal tool suite (Behrmann et al., 2004). The Uppaal editor allows manual construction of timed automata in a visual programming style. Limited functionality of various elements of the automata can be encoded using C-like functions. Although those functions make it somewhat easier to specify state transitions, their use is prone to state space explosion. The Uppaal tool suite includes an extension for Testing Real-time systems Online (TRON) (Hessel et al., 2008). Although TRON was originally developed for conformance testing, it also supports the functionality relevant to model-based discrete control. Interfacing the TRON model-based control module with a controllable object requires “adapters” on the object side. Adapters mediate and interpret the signals passing between the Uppaal automata and the control object. TRON was originally designed for a single tester-testee pair and does not scale well to n > 1 testers and m > 1 testees, and therefore does not easily scale to distributed control applications. The main limitation of using TRON is that it requires extensive effort to code adapters between controllers and control objects. When the adapter-controller pairs are tightly coupled, every change in configuration requires re-wiring on both adapter ends.

Distributed TRON (DTRON), proposed in this paper, is a framework built around the TRON tool to support multicast messaging between TRON instances running in parallel. In the sense of the ISO OSI networking architecture, it implements the whiteboard pattern, where publishers publish data and subscribers are notified about it. In addition, it embraces the dependency injection programming paradigm to make the controller-controllable object pairs loosely coupled, which scales much better.

To multicast is to send a message not to one recipient but to n recipients. DTRON is able to intercept designated transitions within one control agent (model) and inform the other interested control agents about them. The designation is defined by a predicate on a synchronized transition of the controlling agent model. The synchronization and communication between agents is implemented by means of multicast message passing, which allows the agents to join and leave a multicast group dynamically, whenever they want, without the need to re-configure the existing infrastructure. It only requires an agreement, or protocol, on how messages are defined and what data they carry when they traverse the multicast.
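The loosely coupled publish/notify behavior described above can be illustrated with a small in-process sketch. It is not the DTRON or Spread API: the class `Whiteboard`, its methods and the message fields are hypothetical, and the interest predicate is placed on the subscriber side for brevity, whereas in DTRON the designation is a predicate on a synchronized transition of the controlling agent model.

```python
class Whiteboard:
    """In-process stand-in for a multicast group (hypothetical, not the DTRON API)."""

    def __init__(self):
        self._subscribers = {}  # agent name -> (predicate, callback)

    def join(self, name, predicate, callback):
        """Register an agent; no other member needs to be reconfigured."""
        self._subscribers[name] = (predicate, callback)

    def leave(self, name):
        """Remove an agent; unknown names are ignored."""
        self._subscribers.pop(name, None)

    def publish(self, sender, message):
        """Deliver message to every other member whose predicate accepts it."""
        delivered = []
        for name, (predicate, callback) in list(self._subscribers.items()):
            if name != sender and predicate(message):
                callback(message)
                delivered.append(name)
        return delivered


# Two agents join, one leaves again; the publisher code never changes.
board = Whiteboard()
planner_inbox, monitor_inbox = [], []
board.join("planner", lambda m: m["channel"] == "motion", planner_inbox.append)
board.join("monitor", lambda m: True, monitor_inbox.append)
first = board.publish("recognizer", {"channel": "motion", "value": "extend_hand"})
board.leave("planner")
second = board.publish("recognizer", {"channel": "motion", "value": "grasp"})
```

The only contract shared by the agents is the message layout (here a dictionary with `channel` and `value` keys), which mirrors the agreement on message definitions mentioned in the text.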

4 CONTINUAL PLANNING AND CONTROL

Continual planning (DesJardins et al., 1999) denotes a planning strategy for settings where the interactions between the controller and the controllable object cannot be planned deterministically up front. The control signals have to be chosen depending on the situation as it emerges. The controller “knows” the state of the control object that it tries to reach, but has limited control over the stimuli or limited power to observe the control object’s behavior. The continual planning controller stimulates the object with a limited set of stimuli, driving it step by step towards the control goal by adjusting the stimuli to the responses of the control object.
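As a toy illustration of this step-by-step strategy, the sketch below drives an object with an unreliable response towards a goal, re-choosing the stimulus after every observation. Everything in it is an assumption made for illustration: the integer state space, the stimuli +1 and -1, the class `NoisyObject` and its obey probability are hypothetical and unrelated to the SNR implementation.

```python
import random


def continual_control(observe, stimulate, goal, max_steps=100):
    """Drive an object towards `goal` one stimulus at a time.

    `observe()` returns the current integer state; `stimulate(s)` applies
    stimulus s (+1 or -1), whose effect the controller cannot predict
    exactly. Returns the list of observed states, ending at the goal if
    it was reached within `max_steps`.
    """
    trace = [observe()]
    for _ in range(max_steps):
        state = trace[-1]
        if state == goal:
            break
        stimulate(1 if state < goal else -1)  # re-plan from the latest observation
        trace.append(observe())
    return trace


class NoisyObject:
    """Hypothetical control object: obeys a stimulus only some of the time."""

    def __init__(self, state=0, obey_probability=0.7, seed=1):
        self.state = state
        self._p = obey_probability
        self._rng = random.Random(seed)

    def observe(self):
        return self.state

    def stimulate(self, stimulus):
        if self._rng.random() < self._p:
            self.state += stimulus


obj = NoisyObject()
trace = continual_control(obj.observe, obj.stimulate, goal=5)
```

No plan is fixed in advance: the sequence of stimuli exists only as the trace of decisions taken in response to what the object actually did.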

Timed automata based planning and control is well suited to continual control because of its non-deterministic nature. Observations are mapped to the automata structure and to transition guards, which encode the selection of stimuli that guide the (possibly) non-deterministic moves of the controllable object.

Uppaal comes with a formal verification engine that is used to establish whether a “plan” always drives the object to a desired state, provided the object responses are (at least partially) known. An extreme case would be a fully non-deterministic object, for which it cannot be guaranteed or estimated which conditions should hold in order to reach the target state. This sets practical limits to the controllability of the SNR. If major deviations from the pre-specified scenario model occur, the SNR safely disengages from human interaction in the working envelope and switches to manual override.
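The question of whether a plan always reaches the desired state can be illustrated on a small, untimed, finite model. The sketch below is not Uppaal’s algorithm, which works symbolically on timed automata; it is a plain backward fixed-point computation, and the function name, the state names and the example object are assumptions made for illustration.

```python
def guaranteed_states(transitions, goal):
    """States from which some choice of stimuli is certain to reach `goal`.

    `transitions[state][stimulus]` is the set of states the object may
    move to; the controller picks the stimulus, the object picks the
    outcome. Computed as a backward fixed point from the goal.
    """
    winning = {goal}
    changed = True
    while changed:
        changed = False
        for state, moves in transitions.items():
            if state in winning:
                continue
            # A state is winning if one stimulus leads only to winning states.
            if any(targets and targets <= winning for targets in moves.values()):
                winning.add(state)
                changed = True
    return winning


# Hypothetical object: from "c" a push may leave the object in "c" forever,
# so reaching the goal from "c" cannot be guaranteed.
OBJECT = {
    "a": {"push": {"b", "goal"}, "wait": {"a"}},
    "b": {"push": {"goal"}},
    "c": {"push": {"c", "goal"}},
    "goal": {},
}
reachable_for_sure = guaranteed_states(OBJECT, "goal")
```

States outside the returned set correspond to the situation described in the text, where the object is too non-deterministic for any plan to guarantee the target state.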

5 REACTIVE PLANNER

For continual planning and control, the SNR actions in nondeterministic situations are synthesized on-the-fly. The synthesis is based on the interaction model that the SNR has learned by observing and recording the interactive behavior of the scrub nurse and the surgeon. The timed automata model learning algorithm used for this was introduced in (Vain et al., 2009). The synthesis of the reactive planning controller (Vain et al., 2011), which guides the SNR’s actions while it is active, is based on the learned interaction model. The intended control goal of the SNR operation is encoded in the scenario automaton, which specifies the sub-goals of the control, their temporal order and their timing constraints. Whenever one of the sub-goals has been reached, it triggers resets of the guard conditions of the interaction model and activates the driving conditions for reaching the subsequent goal, or one of the alternatives if multiple equal goals are reachable. If timing constraints are violated or the system blocks, an exception handling procedure or a reset is activated and diagnostics are recorded. Special care has been taken to address safety precautions in SNR control. An independent safety monitoring process runs to check that all safety invariants are satisfied. Whenever a safety violation is detected, the procedure for disengaging from the continual planning unit is activated.
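The interplay of ordered sub-goals, timing constraints and safety invariants can be sketched as a check over a timestamped event log. This is a simplified stand-in, not the scenario automaton or safety monitor of the SNR: the names `SubGoal` and `run_scenario`, the event format, the deadlines and the distance invariant are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubGoal:
    name: str
    deadline: float  # latest allowed time since the previous sub-goal, in seconds


def run_scenario(sub_goals, events, invariants):
    """Check a timestamped event log against an ordered list of sub-goals.

    `events` is a list of (time, name, state) tuples; `invariants` maps a
    label to a predicate over `state`. Returns a (verdict, detail) pair,
    where verdict is "done", "timeout", "unsafe" or "incomplete".
    A timeout is only detected when the late sub-goal event arrives.
    """
    index, last_time = 0, 0.0
    for time, name, state in events:
        for label, holds in invariants.items():
            if not holds(state):
                return "unsafe", label      # a violated invariant ends the run
        if index < len(sub_goals) and name == sub_goals[index].name:
            if time - last_time > sub_goals[index].deadline:
                return "timeout", name      # timing constraint violated
            index, last_time = index + 1, time
            if index == len(sub_goals):
                return "done", name
    return "incomplete", sub_goals[index].name if index < len(sub_goals) else ""


# Hypothetical scenario: pick an instrument, then hand it over.
GOALS = [SubGoal("pick", 2.0), SubGoal("hand_over", 3.0)]
INVARIANTS = {"min_distance": lambda s: s["distance_mm"] >= 50}
on_time = [(1.0, "pick", {"distance_mm": 120}), (3.5, "hand_over", {"distance_mm": 80})]
too_late = [(1.0, "pick", {"distance_mm": 120}), (5.0, "hand_over", {"distance_mm": 80})]
too_close = [(1.0, "pick", {"distance_mm": 30})]
```

The invariants are checked on every event, independently of the progress through the sub-goals, which reflects the separation between the planner and the safety monitoring process described above.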

6 CONCLUSIONS

The cognitive robot architecture framework described in this paper supports several innovative aspects needed for implementing assistive robots in different applications. Our experience is based on developing the Scrub Nurse Robot control architecture and software platform. We demonstrated that the DTRON model-based distributed control framework provides a flexible infrastructure for interfacing data acquisition and cognitive functions with deliberative control level planning and decision making. The architecture also incorporates a module for learning human interactions and constructing models, together with a reactive planning controller generator and a run-time execution engine. Timed automata based interaction model learning, on-the-fly reactive planning, controller synthesis and online safety monitoring are steps towards the provably correct design of cognitive assistive robots.

ACKNOWLEDGMENTS

This work was partially supported by the Estonian Science Foundation under grant No. 7667 and by the Centre of Research Excellence in Dependable Embedded Systems (CREDES).

We want to thank Fuji Miyawaki for the productive discussions and suggestions on this subject.

REFERENCES

Behrmann, G., David, A., and Larsen, K. G. (2004). A tutorial on Uppaal. In Bernardo, M. and Corradini, F., editors, Formal Methods for the Design of Real-Time Systems: 4th International School on Formal Methods for the Design of Computer, Communication, and Software Systems, SFM-RT 2004, LNCS, pages 200–236. Springer-Verlag.

DesJardins, M. E., Durfee, E. H., Ortiz Jr, C. L., and Wolverton, M. J. (1999). A survey of research in distributed, continual planning. AI Magazine, 20(4):13.

Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I. H. (2009). The WEKA data mining software: an update. SIGKDD Explor. Newsl., 11(1):10–18.

Hessel, A., Larsen, K., Mikucionis, M., Nielsen, B., Pettersson, P., and Skou, A. (2008). Testing Real-Time systems using UPPAAL. In Formal Methods and Testing, pages 77–117.

Miyawaki, F., Masamune, K., Suzuki, S., Yoshimitsu, K., and Vain, J. (2005). Scrub nurse robot system-intraoperative motion analysis of a scrub nurse and timed-automata-based model for surgery. Industrial Electronics, IEEE Transactions on, 52(5):1227–1235.

Vain, J., Kull, A., Kääramees, M., Maili, M., and Raiend, K. (2011). Reactive testing of nondeterministic systems by test purpose directed tester. In Model-Based Testing for Embedded Systems, Computational Analysis, Synthesis, and Design of Dynamic Systems, pages 425–452. CRC Press - Taylor & Francis Group, Massachusetts, USA.

Vain, J., Miyawaki, F., Nomm, S., Totskaya, T., and Anier, A. (2009). Human-robot interaction learning using timed automata. In ICCAS-SICE, 2009, pages 2037–2042.
