robot side. The only requirement is the diagnostic aggregator node,
which is already present in most ROS-compatible robot platforms. On
the Nagios server side, a basic ROS installation is needed to use the
ROS protocols and the standard diagnostic message formats.
The ROS plugin is a Python script developed to access the robot’s ROS
core node via XML-RPC, to remotely subscribe to the diagnostic
aggregator topic of each monitored robot, and to convert the received
information into the Nagios output format. The proposed plugin is
general in the sense that it is independent of the monitored device:
it can parse information from any kind of robot, regardless of the
number and types of its devices.
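As a rough illustration of the XML-RPC access described above, the sketch below asks a ROS master whether the diagnostic aggregator topic is being published. It is not the plugin's code: the function names, the caller id `/nagios_probe`, and the use of Python 3's `xmlrpc.client` are assumptions made for this example. Only `getPublishedTopics` (ROS Master API), the default master port 11311, and the `/diagnostics_agg` topic name come from standard ROS 1 conventions.

```python
import xmlrpc.client

# Topic published by the ROS 1 diagnostic aggregator.
DIAG_TOPIC = "/diagnostics_agg"


def connect(host, port=11311):
    """Build an XML-RPC proxy to the ROS master (11311 is the default port)."""
    # No network traffic happens here; the proxy connects on the first call.
    return xmlrpc.client.ServerProxy("http://%s:%d" % (host, port))


def find_topic(master, topic=DIAG_TOPIC, caller_id="/nagios_probe"):
    """Return the message type advertised for `topic`, or None if absent.

    `caller_id` is a hypothetical name chosen for this sketch.
    """
    # ROS Master API: getPublishedTopics(caller_id, subgraph)
    # returns [code, statusMessage, [[topicName, topicType], ...]].
    code, message, topics = master.getPublishedTopics(caller_id, "")
    if code != 1:  # 1 means success in the ROS Master API
        raise RuntimeError("ROS master error: %s" % message)
    for name, msg_type in topics:
        if name == topic:
            return msg_type
    return None
```

With a reachable master, `find_topic(connect("<host>"))` would be expected to return `diagnostic_msgs/DiagnosticArray` when the aggregator is running. Subscribing to the topic itself requires the ROS client library and is outside this sketch.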
The following example shows the robot’s overall status, which is OK.
It lists the names of the topics that carry the status of different
parts of the robot. The robot’s overall status and each of its topics
can take one of three statuses: OK, WARNING or CRITICAL. The overall
status takes the most severe status among all monitored topics.
$ ./ros-diagnostics_agg.py -H <host>
OK - OK Sensor(s) list:
/Camera, /Camera/Cam1, /Laser,
/Laser/Laser1, /Laser/Laser2, /Motor,
/Motor/Motor1, /Motor/Motor2,
/Motor/Motor3, /Power,
/Power/Laptop Battery,
/Power/Robot Battery, /Temp,
/Temp/Sensor1, /Temp/Sensor2
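The rule that the overall status takes the most severe status of all topics can be sketched as follows. The exit codes 0, 1 and 2 are the standard Nagios plugin return codes. The mapping from `diagnostic_msgs/DiagnosticStatus` levels (0 = OK, 1 = WARN, 2 = ERROR) to Nagios states is an assumption of this example, as is the function name. The STALE level is deliberately left out.

```python
# Standard Nagios plugin exit codes.
NAGIOS_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2}

# Assumed mapping from ROS DiagnosticStatus levels to Nagios states
# (0 = OK, 1 = WARN, 2 = ERROR); the plugin's real mapping may differ.
ROS_TO_NAGIOS = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def overall_status(levels):
    """Return the most severe Nagios state among per-topic ROS levels.

    `levels` maps a topic name to its ROS diagnostic level and must be
    non-empty; an unmapped level (e.g. STALE) raises KeyError.
    """
    states = [ROS_TO_NAGIOS[level] for level in levels.values()]
    # Severity order follows the Nagios exit codes: OK < WARNING < CRITICAL.
    return max(states, key=NAGIOS_CODES.get)
```

A plugin built this way would print the state and exit with `NAGIOS_CODES[overall_status(levels)]`, which is how Nagios learns the result of a check.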
Figure 2 shows Nagios monitoring multiple robots with different
statuses. The main table shows the statuses of two robots (#1 and #2
in Figure 2). Robot #1 has 6 checks, and only the last one is in a
warning state (Figure 2(c), in yellow). Robot #2 has 9 checks: the 8th
is in a critical state (Figure 2(a), in red), and a few others are
still pending (Figure 2(b), in grey) because the robot was powered up
recently. Robot #1 has a single check for all sensors (Figure 2(d)),
while robot #2 has separate checks for different sensors (laser,
camera, battery). This scenario illustrates two types of monitoring
configuration: summarized and detailed.
The following example shows topics with different severity statuses.
CRITICAL - CRITICAL sensor(s) list:
/Camera, /Camera/Cam1
WARNING sensor(s) list:
/Power, /Power/Laptop
OK sensor(s) list:
/Laser, /Laser/Laser1, /Laser/Laser2,
/Motor, /Motor/Motor1, /Motor/Motor2,
/Motor/Motor3, /Power,
/Power/Robot Battery,
/Temp, /Temp/Sensor1,
/Temp/Sensor2
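An output of this shape can be produced by grouping the topics by state and printing the groups from most to least severe. The sketch below is one possible way to do so. The function name and the exact wording of the headers are illustrative and are not taken from the plugin's source.

```python
SEVERITY_ORDER = ["CRITICAL", "WARNING", "OK"]  # most severe first


def format_report(states):
    """Build a Nagios-style text report from a {topic: state} mapping."""
    groups = {state: [] for state in SEVERITY_ORDER}
    for topic in sorted(states):
        groups[states[topic]].append(topic)

    # Only states that actually occur get a section in the report.
    present = [s for s in SEVERITY_ORDER if groups[s]]
    lines = []
    for i, state in enumerate(present):
        header = "%s sensor(s) list:" % state
        if i == 0:
            # The first section is the most severe one, so its state
            # doubles as the overall status prefix.
            header = "%s - %s" % (state, header)
        lines.append(header)
        lines.append(", ".join(groups[state]))
    return "\n".join(lines)
```

Nagios treats the first line of the plugin output as the summary shown in its status table, so placing the overall status there keeps the table readable.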
The plugin can also monitor the status of specific sensors only,
instead of all of the robot’s statuses. For example, the plugin syntax
allows monitoring only the battery statuses:
$ ./ros-diagnostics_agg.py -H <host> -N battery
OK - OK Sensor(s) list:
/Power, /Power/Laptop Battery,
/Power/Robot Battery
In this case, all sensors whose names do not contain "battery" are
ignored by the plugin. The name parameter also allows more
configuration flexibility in Nagios. For example, the Motor status can
be checked every 5 minutes and the temperature sensor every 30
seconds.
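The name filter can be sketched as a case-insensitive substring match over the topic paths. In the battery example, /Power is listed even though its name does not contain "battery". The sketch therefore assumes that the parent entries of matching topics are kept as well. This behaviour is inferred from the example output only, and the function name is hypothetical.

```python
def filter_by_name(topics, name):
    """Keep topics whose path contains `name`, plus their parent paths.

    Matching is case-insensitive. Keeping parents is an assumption
    inferred from the example output, not a documented plugin feature.
    """
    needle = name.lower()
    matches = [t for t in topics if needle in t.lower()]

    keep = set(matches)
    for topic in matches:
        # Add every ancestor path, e.g. "/Power" for "/Power/Robot Battery".
        parts = topic.strip("/").split("/")
        for depth in range(1, len(parts)):
            keep.add("/" + "/".join(parts[:depth]))

    # Preserve the original order; ancestors absent from `topics` are dropped.
    return [t for t in topics if t in keep]
```

Each Nagios service check can then call the plugin with a different name, which is what allows one group of sensors to be polled more often than another.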
An important characteristic of this architecture is that the
monitoring system is completely independent of the robot application:
if the monitoring server stops, only the monitoring ceases. The
robotic system carries on working as if nothing had happened.
4 EXPERIMENTS
Two types of experiments are performed to evaluate
the proposed architecture: a scalability experiment
with up to 100 heterogeneous virtual robots, and an
experiment with one real robot.
4.1 Server Scalability Experiment
Up to 100 heterogeneous virtual robots were executed at the same time
to test the scalability of the monitoring server. The rest of this
section details the virtual robot setup and the monitoring server
setup.
Figure 3 illustrates the architecture of the scalability experiment.
The database server on the left side is not a requirement of the
solution; it was created only to collect performance data (e.g. CPU
load, memory usage and network bandwidth) from the virtual robots
during the experiment. The Nagios server is running with the proposed
ROS plugin installed on the server side, without any change to the
virtual robots’ software. All computers (servers and robots) are on
the same network, or on an equivalent one via VPN. Details of this
setup are presented in the next sections.
4.1.1 The Virtual Robot Setup
The virtual robot is a Python application, running on a Virtual
Machine (VM), developed to generate the diagnostic data typically
produced by robots compliant with ROS diagnostics. The Python
application reads
ICINCO 2018 - 15th International Conference on Informatics in Control, Automation and Robotics