Figure 1: Kinect sensor and coordinate system (taken from Garcia and Zalevsky, 2008).
Although there are some works that use Kinect as an
input sensor for reactive navigation (see, for example,
(Cunha et al., 2011) and (Biswas and Veloso, 2011)), to
the best of the authors' knowledge the aforementioned
drawbacks have not been explicitly addressed.
In order to test the suitability of the proposed
approach, a set of experiments has been conducted
within cluttered scenarios. The experiments have
validated the improvement in performance of a
particular reactive navigator (Blanco et al., 2008),
although the solution presented here can be applied to
any other reactive algorithm that utilizes a 2D
representation of the space, i.e. one relying only on
2D laser scanners.
The next section gives a description of the Kinect
device. Section 3 presents the proposed solution for
using Kinect as an additional input sensor for
any reactive navigation algorithm based on a 2D
obstacle representation. In Section 4, the results of
the conducted experiments are presented and
discussed. Finally, some conclusions are outlined.
2 THE KINECT SENSOR
2.1 Description
The Kinect device (see Figure 1) is equipped with an
RGB camera, a depth sensor, a microphone array, a
motorized base that endows the sensor with a tilt
movement of 27º, and a 3-axis accelerometer.
Focusing on the depth sensor, also called the range
camera, it is composed of an infrared light projector
combined with a monochrome CMOS sensor. It
provides VGA resolution (640x480 pixels) with 11-bit
depth resolution and a data refresh rate of 30 Hz.
Its nominal field of view is 58º in the horizontal
plane and 45º in the vertical one. The operational
range reported by the manufacturer is from 1.2 m to
3.5 m, although in our experiments we have verified
that the sensor is able to detect objects placed at
0.5 m (see Section 2.3).
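As a compact reference, the nominal figures above can be gathered in a small Python structure; the class and field names below are purely illustrative and are not part of any Kinect API:

from dataclasses import dataclass

@dataclass(frozen=True)
class KinectDepthSpecs:
    width_px: int = 640                 # VGA horizontal resolution
    height_px: int = 480                # VGA vertical resolution
    depth_bits: int = 11                # depth quantization of the raw measurement
    refresh_hz: float = 30.0            # data refresh rate
    hfov_deg: float = 58.0              # nominal horizontal field of view
    vfov_deg: float = 45.0              # nominal vertical field of view
    nominal_min_range_m: float = 1.2    # manufacturer-reported operational range
    nominal_max_range_m: float = 3.5
    observed_min_range_m: float = 0.5   # minimum distance observed in our tests (Section 2.3)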
2.2 Working Principle of the Range
Camera
The range camera of Kinect consists of two devices:
an infrared (IR) light projector, which casts a pattern
of points onto the scene, and a standard CMOS
monochrome sensor (IR camera) (Freedman et al.,
2010). Both are aligned along the X axis, with
parallel optical axes separated by a distance (baseline)
of 75 mm (see Figure 1). This configuration eases the
computation of the range image, which is performed
through triangulation between the IR rays and
their corresponding dot projections onto the image.
The method to compute the correspondence between
rays and pixels relies on an innovative technique called
Light Coding (Garcia and Zalevsky, 2008), patented
by PrimeSense (PrimeSense, www.primesense.com),
which entails a very particular factory calibration. In
this calibration, a set of images of the point pattern,
as projected on a planar surface at different known
distances, is stored in the sensor. These are the so-
called reference images.
Kinect works like a correlation-based stereo
system with an ideal configuration (i.e. identical
cameras with aligned optical axes separated along the
X axis), where the IR rays are "virtually" replaced by
the lines of sight of the points in the reference images.
As in stereo, the depth at each image point is derived
from its disparity along the X axis, which is computed
by correlating a small window against the reference
image. Further information about this calculation
can be found in (Khoshelham, 2011).
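To make this principle concrete, the following Python sketch (not the actual PrimeSense implementation) correlates a small window of the live IR image against a single reference image taken at a known distance and converts the resulting disparity into depth with the ideal-stereo relation discussed in (Khoshelham, 2011). The focal length, window size, search range, and sign convention are illustrative assumptions:

import numpy as np

FOCAL_PX = 580.0     # assumed IR-camera focal length in pixels (illustrative)
BASELINE_M = 0.075   # projector-camera baseline of 75 mm
WIN = 9              # correlation window size in pixels
MAX_DISP = 64        # maximum disparity searched along the X axis

def disparity_at(live, ref, row, col):
    """Horizontal shift that best correlates a window of the live IR image
    with the reference image (normalized cross-correlation)."""
    h = WIN // 2
    patch = live[row - h:row + h + 1, col - h:col + h + 1].astype(np.float64)
    patch = (patch - patch.mean()) / (patch.std() + 1e-9)
    best_d, best_score = 0, -np.inf
    for d in range(MAX_DISP):
        c = col - d                    # search direction is an assumed convention
        if c - h < 0:
            break
        cand = ref[row - h:row + h + 1, c - h:c + h + 1].astype(np.float64)
        cand = (cand - cand.mean()) / (cand.std() + 1e-9)
        score = float((patch * cand).sum())
        if score > best_score:
            best_score, best_d = score, d
    return best_d

def depth_from_disparity(d_px, z_ref_m):
    """Ideal-stereo triangulation relative to the reference plane:
    1/Z = 1/Z_ref + d / (f * b)."""
    return 1.0 / (1.0 / z_ref_m + d_px / (FOCAL_PX * BASELINE_M))

# Example use: z = depth_from_disparity(disparity_at(live_ir, ref_ir, 240, 320), 2.0)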
Regarding the accuracy of this method, the
distance errors are lower than 2 cm for close objects
(up to 2 m), growing linearly to an average error
of 10 cm at 4 m.
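As a rough rule of thumb, these figures can be interpolated linearly; the short sketch below encodes only the two values quoted above and is not a calibrated error model:

def expected_depth_error_m(distance_m):
    """Approximate depth error: about 2 cm up to 2 m, growing linearly to
    roughly 10 cm at 4 m (linear interpolation is an assumption)."""
    if distance_m <= 2.0:
        return 0.02
    return 0.02 + (distance_m - 2.0) * (0.10 - 0.02) / (4.0 - 2.0)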
2.3 Kinect Reliability for Detecting
Thin Obstacles
To assess how reliable Kinect is for feeding a
robot reactive navigation system, it is of interest to
analyze its capacity to detect surrounding obstacles,
particularly those that are hardly detectable by other
sensors, either because of their small size or because
of their position in the scene, e.g. the protruding edge
of a table top, the legs of a chair, coat stands, etc.
We have performed a number of experiments
where sticks of different thicknesses (1, 2, and 4 cm)
were horizontally placed in front of the sensor, at a