Multi-view ToF Fusion for Object Detection in Industrial Applications
Inge Coudron and Toon Goedemé
EAVISE Research Group, KU Leuven, Jan Pieter De Nayerlaan 5, Sint-Katelijne-Waver, Belgium
{inge.coudron, toon.goedeme}@kuleuven.be
Keywords:
Extrinsic Calibration, Multi-sensor, Object Detection.
Abstract:
The use of time-of-flight (ToF) cameras in industrial applications has become increasingly popular due to the
camera’s reduced cost and its ability to provide real-time depth information. Still, one of the main drawbacks
of these cameras has been their limited field of view. We therefore propose a technique to fuse the views
of multiple ToF cameras. By mounting two cameras side by side and pointing them away from each other,
the horizontal field of view can be artificially extended. The combined views can then be used for object
detection. The main advantages of our technique are that the calibration is fully automatic and that only a
single shot of the calibration target is needed. Furthermore, no overlap between the views is required.
1 INTRODUCTION
Object detection remains an important challenge in
industry. In many of these applications, a large scene
area needs to be covered. Hence, a camera with a wide
field of view is usually required. Since the field of
view of a single ToF camera is limited, multiple ca-
meras must be combined. This requires first cali-
brating the relative poses (i.e., the extrinsic parameters) of
the cameras.
Once the data from the different cameras is trans-
formed into a common reference frame, it can be fed
to the object detection framework. A popular appro-
ach to the 3D object detection problem is to exploit
range images (Bielicki and Sitnik, 2013). These ima-
ges make data processing significantly faster, as they
convert the most time-consuming tasks (e.g., nearest
neighbor search) from a 3D space into a 2D space. We
will therefore render the registered point clouds with
a virtual camera to simulate a depth sensor.
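To illustrate this rendering step, the following minimal Python/NumPy sketch projects a registered point cloud into a range image with a virtual pinhole camera. The intrinsics (fx, fy, cx, cy), image size, and the simple z-buffer are illustrative assumptions, not the exact rendering pipeline used in our system.

import numpy as np

def render_range_image(points, fx, fy, cx, cy, width, height):
    # Project a registered 3-D point cloud (Nx3, camera frame, z forward)
    # into a 2-D range image using a virtual pinhole camera.
    depth = np.full((height, width), np.inf, dtype=np.float32)

    # Keep only points in front of the virtual camera.
    pts = points[points[:, 2] > 0]

    # Pinhole projection onto the image plane.
    u = np.round(fx * pts[:, 0] / pts[:, 2] + cx).astype(int)
    v = np.round(fy * pts[:, 1] / pts[:, 2] + cy).astype(int)

    # Discard points that fall outside the image.
    valid = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[valid], v[valid], pts[valid, 2]

    # Simple z-buffer: sort far to near so nearer points overwrite farther ones.
    order = np.argsort(-z)
    depth[v[order], u[order]] = z[order]

    depth[np.isinf(depth)] = 0.0  # mark empty pixels
    return depth

Once the views of all cameras are registered into one frame, the same function can be applied to the merged cloud to obtain a single extended range image.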
In this paper, we present a convenient external ca-
libration method for a multi-ToF system. That is to
say, the human interaction and expert knowledge re-
quired for the calibration are kept to a minimum. The
views from the different ToF cameras can be merged
into an extended range image usable for 3D object
detection. The remainder of this paper is organized as
follows. First, the Related Work section provides an
overview of existing calibration techniques. Section 3
introduces our approach for multi-view ToF fusion.
Experiments in Section 4 show the accuracy of the
calibration. Finally, a short conclusion is given.
2 RELATED WORK
The calibration of multiple cameras is a well-studied
problem in computer vision. The most common met-
hod for calibrating conventional intensity cameras is
to use a checkerboard that is observed at different
positions and orientations within the cameras' shared
field of view (Zhang, 2000). Given the image coor-
dinates of the reference points (i.e., the checkerboard
corners) and the geometry of the checkerboard (i.e.,
the number of squares and the square dimension), the
camera parameters can be estimated using a closed-
form solution w.r.t. the pinhole camera model. An ite-
rative bundle adjustment algorithm can then be used
to refine the parameters. The same standard technique
could be used for ToF cameras as well, as they pro-
vide an amplitude image associated with each range
image. However, the low resolution of the amplitude
images makes it difficult to detect the checkerboard
corners reliably, resulting in inaccurate calibration.
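For reference, the sketch below shows how such a checkerboard-based calibration is typically set up with OpenCV on ToF amplitude images. The pattern size, square dimension, and the use of cv2.calibrateCamera are assumptions made for illustration, not the procedure followed in this paper.

import cv2
import numpy as np

# Checkerboard geometry (inner corner count and square size are assumptions).
PATTERN = (9, 6)   # inner corners per row and column
SQUARE = 0.025     # square size in metres

# 3-D reference points of the board in its own coordinate frame.
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

def detect_corners(amplitude_img):
    # Detect and refine checkerboard corners in an 8-bit ToF amplitude image.
    found, corners = cv2.findChessboardCorners(amplitude_img, PATTERN)
    if not found:
        return None
    return cv2.cornerSubPix(
        amplitude_img, corners, (5, 5), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))

# With object/image point pairs collected over several board poses:
# rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_pts, img_pts,
#                                                  image_size, None, None)

In practice, the low resolution and noise of the amplitude images often cause the corner detection above to fail or drift by several pixels, which is precisely the limitation the shape-based methods discussed next try to avoid.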
To overcome this limitation, other methods have
been proposed that work directly on 3D shapes. Au-
vinet et al. (Auvinet et al., 2012), for example, use
the intersection points of triplets of planes as refe-
rence points. The equation of each plane can be cal-
culated using a singular value decomposition of the
points lying on the plane. Given the sets of corre-
sponding reference points, the rigid body transforma-
tion between the pair of cameras is estimated in a
least-squares sense.
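The two building blocks of this approach, plane fitting by SVD and least-squares estimation of the rigid body transformation (the Kabsch method), can be sketched as follows. The function names and the NumPy implementation are illustrative assumptions, not the authors' original code.

import numpy as np

def fit_plane(points):
    # Least-squares plane through an Nx3 point set via SVD.
    # Returns (normal, d) for the plane n.x + d = 0.
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                  # direction of smallest variance
    return normal, -normal.dot(centroid)

def rigid_transform(src, dst):
    # Least-squares rigid body transform (R, t) mapping src points onto
    # corresponding dst points (both Nx3), via the Kabsch/SVD method.
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:         # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cd - R @ cs
    return R, t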
Another method, presented by Ruan
et al. (Ruan and Huber, 2014), uses the centers of a
spherical calibration target as reference points. The