Automatic Obstacle Classification using Laser and Camera Fusion

Aurelio Ponz

, C. H. Rodríguez-Garavito

, Fernando García

, Philip Lenz

Christoph Stiller

and J. M. Armingol

Intelligent Systems Lab, Universidad Carlos III De Madrid, Butarque, 15, Leganés, Madrid, Spain

Automation Engineering Department, Universidad de La Salle, Bogotá, Colombia

Institut für Mess- und Regelungstechnik, Karlsruher Institut für Technologie, Karlsruhe, Germany

Keywords: Computer Vision, Lidar, Advanced Driving Assistance System, Sensor Fusion.

Abstract: State of the art Driving Assistance Systems and Autonomous Driving applications are employing sensor

fusion in order to achieve trustable obstacle detection and classification under any meteorological and

illumination condition. Fusion between laser and camera is widely used in ADAS applications in order to

overcome the difficulties and limitations inherent to each of the sensors. In the system presented, some

novel techniques for automatic and unattended data alignment are used and laser point clouds are exploited

using Artificial Intelligence techniques to improve the reliability of the obstacle classification. New

approaches to the problem of clustering sparse point clouds have been adopted, maximizing the information

obtained from low resolution lasers. After improving cluster detection, AI techniques have been used to

classify the obstacle not only with vision, but also with laser information. The fusion of the information

acquired from both sensors, adding the classification capabilities of the laser, improves the reliability of the

system.

1 INTRODUCTION

About 1.2 million people die every year in the world

as a consequence of traffic accidents (WHO, 2009).

ADAS use widely lasers and cameras to detect and

classify obstacles on the road. These sensors are

complementary, as the laser’s ability to detect

obstacles regardless of the light quality and to select

Regions of Interest (ROI) for camera classification,

improves remarkably the speed and accuracy of the

CV classification in the images from the camera.

The present work has been developed using the

Intelligent Vehicle based on Visual Information 2.0

(IVVI 2.0) (Figure 1), the Intelligent Systems Lab’s

research platform (Martín et al., 2014).

The article is divided in the following sections:

Section 2 provides scientific context of the state of

the art in the related domain. Section 3 is a general

description of the system. Section 4 describes the

method for laser point cloud (PC) clustering, which

is the initial part of obstacle detection. Section 5

outlines the data alignment process, essential for a

correct data association between the camera and the

laser system. Section 6 depicts the strategy for

obstacle classification with a Support Vector

Machine (SVM). Finally, conclusions for the present

work are presented.

2 RELATED WORK

The work described in the present paper covers

several fields with interesting state of the art.

Regarding the automatic and unattended data

alignment phase in our system, (Li et al., 2011)

proposed a method for calibration using a

chessboard pattern, (Rodríguez-Garavito et al.,

2014) proposed a method for automatic camera and

laser calibration, based on PC reconstruction of the

road surface. Other approaches such as (Li et al.,

2011) and (Kwak et al., 2011) projects the features

into a 2D plane to minimize the distance among the

features in the different sensors. (Lisca et al., 2010)

presents a CAD model based calibration system for

inter-sensor matching. Similar approach based on

triangular model is presented in (Debattisti et al.,

2013) and based on circular models in (Fremont and

Bonnifait, 2008).

Data fusion detection approaches can be divided in

centralized and decentralized schemes. Some

examples of decentralized schemes can be found in

(Premebida et al., 2009) and (Premebida et al., 2010)

with different algorithms to combine the features

from computer vision and laser , such as Naïve

Bayes, GMMC, NN, FLDA. Decentralized schemes

Ponz A., H. Rodríguez-Garavito C., García F., Lenz P., Stiller C. and M. Armingol J..

Automatic Obstacle Classiﬁcation using Laser and Camera Fusion.

DOI: 10.5220/0005459600190024

In Proceedings of the 1st International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS-2015), pages 19-24

ISBN: 978-989-758-109-0

 2015 SCITEPRESS (Science and Technology Publications, Lda.)

Figure 1: IVVI 2.0 research platform.

implements detection and classification on each

sensor independently and a further fusion stage

combines the detection according to the certainty

provided by each sensor. (Spinello and Siegwart,

2008) provides high level fusion based on

multidimensional features for laser and Histogram of

Oriented Gradients (HOG) for computer vision.

(García et al., 2012) provides pedestrian detection

based on pedestrian's leg model for laser and HOG

features for computer vision for distributed

pedestrian detection and danger evaluation. (García

et al., 2014) takes advantage of advance fusion

techniques (i.e. Joint Probabilistic Data Association

Filter) to enhance decentralised pedestrian detection.

3 GENERAL DESCRIPTION

This work is included in the IVVI 2.0 project

(Figure 1). IVVI 2.0 is the second platform for

development and research of ADAS technologies of

the Intelligent Systems Lab, at Universidad Carlos

III de Madrid.

In the presented application, a Sick LDMRS 4-

layer Laser and a stereo camera are used. Laser for

primary obstacle detection and later for

classification and stereo capability from the camera

is used for PC ground representation and data

alignment parameters estimation; later one of the

cameras is used for image capturing.

The laser generates a PC from which the system

extracts the obstacles as clusters of points. These

clusters are used both for ROI generation in the

images and as information for obstacle

classification. The extracted ROIs in the image are

processed for obstacle classification using AI

methods applied to CV. The last step in the process

performs further information fusion between laser

and camera for a final obstacle classification based

on machine learning.

4 POINT CLOUD CLUSTERING

FOR LASER DETECION

The first step in our system is the obstacle detection

using laser generated PCs. The laser obtains a PC

representing some of the reality in front of the

vehicle. Obstacles are part of this reality and can be

located as local concentrations of points in the PC

that can be mathematically categorized as clusters.

Several clustering techniques have been studied in

order to obtain the highest and most reliable amount

of information from the PC. It is important to note

that obstacles to be detected will be represented by

very few points in the PC, typically from four points

to not much more than fifty depending on the

distance to the vehicle, due to laser limitations. Most

of the clustering strategies already available are

designed for highly populated PC, obtained from

high definition multilayer laser scanners or stereo

cameras, and do not adapt well to our outdoor,

sparse PCs offering limited information.

Figure 2: Angular resolutions by sector in the laser.

Maximum resolution front, minimum in the sides.

4.1 Adapted Euclidean Distance and

Geometrically Constrained

Clusters

In this approach, a classical Euclidean distance

clustering strategy has been adopted, modulated by

several parameters in order to modify the clustering

behaviour, such as distance from the sensor to the

obstacle, geometrical constraints, allowed number of

points in every cluster, etc.

In this approach, clusters are defined as the set of

points separated a distance variable as a function of

several parameters, plus some points meeting some

geometric constraints, such as belonging to the same

line in the space than some of the points in the

cluster.

The strategy is defined as an iterative addition of

points to the cluster with the following steps:

VEHITS2015-InternationalConferenceonVehicleTechnologyandIntelligentTransportSystems

Figure 3: Axis representation: X=laser-obstacle distance,

Z=detection height, Y=horizontal deviation.

First point in the PC is taken as the first point in

the cluster.

All the other points in the PC are checked to

have a distance smaller than the cluster threshold

ClusterTh

ℎ=ℎ+

(



)



(



)



(



)



(

(



)



if





2

360

then



=2π

0.125

360



if2π

360







2

360

then



=2π

0.25

360



if2π

360







2

360

then



=2π

0.5

360

(1)

where x,y,z are point’s coordinates. Due to laser

restrictions,





is always 0.8º, BaseTh is a parameter

experimentally determined as the base threshold.

DistCorr(x) is a function of the x coordinate which

ensures that no distance smaller than the minimum

physically possible distance will be required, as seen

in Equation (1), and depending on the different

angular resolutions seen in Figure 2. DistCorr(x) is

computed as the minimum distance possible

between two consecutive points in z and y

coordinates.







Represents the angle between two

consecutive laser reads in horizontal (y axis) and







is the angle between two consecutive laser reads

in vertical (z axis).

All the points in the PC are checked for cluster

inclusion. The same iteration is performed for every

point added to the cluster until all cross checks are

performed. Then, points close to the obstacle but not

belonging to the cluster are included into a

temporary new PC together with the obtained

cluster, and then lines are searched in the new

cluster using RANSAC. If lines are found containing

a determined minimum of points belonging to the

original cluster and points not belonging to it, then

these points are added to the cluster. This strategy

has proven to be effective for oblique obstacles.

Figure 4 shows the result of the algorithm. Red

dots are the cluster created by Euclidean Adapted

distance. Blue dots are the points close to the cluster

but not belonging to it. Yellow lines are 3D lines

found by RANSAC, including points from the

original cluster and points from the extended cluster.

Figure 4: Extended cluster. Blue points are added to the

cluster as they share a line with points in the cluster.

Upon completion of cluster extraction, it is

checked against the parameters ClusterTolerance for

maximum width of cluster in meters, and

minClusterSize and maxClusterSize for minimum

and maximum number of points, respectively. These

parameters are also a function of the distance to the

obstacle.

The strategy is addressed to obtain the most

populated clusters possible, taking into account that

we are using a low resolution multilayer laser. The

threshold distance must be adapted to the distance x

from the laser sensor to the obstacle, as the distance

between consecutive laser points grows with x. Due

to laser construction limitations, the minimum

distance detected in y and z in consecutive points

will be greater than the initial threshold if not

adapted following Equation (1).

4.2 Ground Detection and Removal

from Point Cloud

As outlined in Section 5, our system can compute

the plane corresponding to the road surface, so it is

possible to remove ground plane points from the list

of detected clusters.

AutomaticObstacleClassificationusingLaserandCameraFusion

5 DATA ALIGNMENT

Our system is based in data fusion between several

sensors, based on different physical phenomena.

Thus each of these sensors has its own system of

reference, and extrinsic parameters between sensors

system of reference must be estimated in order to

perform the data alignment.

To achieve the necessary alignment, rotation and

translation between sensors must be estimated. Some

methods have already been proposed by other

authors, involving chessboards or specific patterns

(Li et al., 2007) detectable by all of the sensors

involved in the fusion (Kwak et al., 2011). This is

cumbersome and requires driver implication or some

help from others, needs specific and stationary

environment and to be performed manually again in

case of change of orientation or translation between

sensors.

Figure 5: Obstacle detection based on cluster computation.

Dark blue is the cluster, cyan is the ROI for vision

classification.

Our approach estimates the extrinsic parameters

of all the sensors involved, and calibration between

them, assuming flat surface in front of the vehicle,

thus sensor’s height and two rotation angles can

already be determined. For the third angle

computation, any identifiable obstacle located in the

road surface can be used.

Applying the M-estimator-Sample-Consensus

(MSAC) algorithm for plane detection in the PCs

obtained from the stereo camera and from the laser,

the most populated planes are found from the clouds

in the form



(



)

:



+



+



+=0

(2)

which can be written in the Hessian form



(



)

:



.p



=ℎ

(3)

where 



is the vector normal to the road plane, and

the relation between this vector and the camera and

laser rotation angles can be computed as in

(Rodríguez-Garavito et al., 2014).

Once all the calibration parameters, i.e. roll,

pitch, yaw and x,y,z translations between sensors

have been computed, the system is able to translate

from laser coordinates into camera coordinates in the

image for obstacle classification using Computer

Vision.

The conversion between laser and image

coordinate systems can be performed as in equation

(4):









=(













+)

=



cos

(



)

0()

0 1 0

−() 0 ()

·





1 0 1



0()−()



0 () ()

·



cos

(



)

−sin

(



)

() () 0

0 0 1

 ,

=













,

(4)

where T represents the translation vector and R the

rotation matrix between sensors.

6 OBSTACLE CLASSIFICATION

USING LASER AND IMAGE

INFORMATION FUSION

Obstacle classification in this work can be

performed with single sensor information or using

sensor fusion information.

6.1 SVM Classification

Classification is performed using the SVM

implementation from the Computer Vision OpenCV

library. SVM algorithm was developed by Vapnik &

Cortes (Cortes and Vapnik, 1995) and is widely used

in machine learning. In the present work, a database

of manually labelled images and clusters is used to

execute a supervised learning process. After the

training process, the SVM structures are stored and

used for classification of images and clusters as seen

in Figure 6 and Figure 8.

VEHITS2015-InternationalConferenceonVehicleTechnologyandIntelligentTransportSystems

6.2 Laser Feature Vector

Clusters detected in laser generated PCs are used to

determine a ROI in the image, but can also be used

for obstacle classification without image support

(Premebida et al., 2009).

Clusters are converted into a mesh structure by

Delaunay triangulation in order to reconstruct the

shape of the obstacle, as seen in Figure 7.

Figure 6: SVM learning process for clusters: Training and

classification.

Previous works (Premebida et al., 2009) have

considered 2D PCs for classification, but the present

work is intended to extract features from a 3D PC.

The features considered are described in Table 2.

Figure 7: Mesh representation of a cluster. Triangles

represent the obstacle surface.

Table 1: Features considered for cluster classification.

Concentration: Normalized mean distance (NMD)

to the centroid 3D

Y-Z, X-Z, X-Y concentration: NMD to the

centroid excluding x,y,z

Planicity: NMD to the most populated plane

Sphericity: NMD to the most populated sphere

Cubicity: Measures how far are the planes

containing the mesh triangles from being the

same plane or from being perpendicular .

Triangularity: Measures the triangles uniformity

Average deviation from the median in x,y,z

6.3 Computer Vision Feature Vector

As pointed before, obstacles found in the Laser PC

as clusters are used to determine a ROI in the image

suitable for applying SVM for obstacle classification

in images.

These images have been manually labelled as

frontal view, back view, side view, frontal oblique

view and back oblique view. Later, Histogram of

Oriented Gradients (HOG) features are extracted

from every image, and SVM training is performed

following the process outlined in Figure 8, in order

to obtain the SVM classifier.

Figure 8: SVM learning process for images: Training and

classification.

6.4 Information Fusion

In poor illumination conditions, when the camera

offers no help, laser obstacle classification can still

be used, but the real advantage of the sensor fusion

resides in the combination of multisensor

information to obtain a result which is greater than

the mere sum of the individual contributions.

(García et al., 2014). Information fusion will be

performed with SVM and results will be compared

with individual sensor classifications.

7 PRELIMINARY RESULTS AND

DISCUSSION

The presented work is currently under development

but its results have been preliminarily tested,

showing better figures using sensor fusion for

classification than single sensor classification, as

presumed, and very promising expectations.

AutomaticObstacleClassificationusingLaserandCameraFusion

ACKNOWLEDGEMENTS

This work was supported by the Spanish

Government through the CICYT projects

(TRA2013-48314-C3-1-R) and (TRA2011-29454-

C03-02.

REFERENCES

Debattisti, S, Mazzei, L & Panciroli, M 2013. Automated

extrinsic laser and camera inter-calibration using

triangular targets. Intelligent Vehicles Symposium

(IV), 2013 IEEE, 2013, pp. 696–701.

Cortes, C & Vapnik, V 1995, Support vector network,

Machine Learning, vol. 20, pp. 1-25.

Fremont, V & Bonnifait, P 2008. Extrinsic calibration

between a multi-layer lidar and a camera. 2008 IEEE

Int. Conf. Multisens. Fusion Integr. Intell. Syst., 2008.

García, F, Jiménez, F, Naranjo, JE, Zato, JG, Aparicio, F,

Armingol, JM & de la Escalera, A. 2012. Environment

perception based on LIDAR sensors for real road

applications.

García, F, García, J, Ponz, A, de la Escalera, A &

Armingol, JM 2014. Context Aided Pedestrian

Detection for Danger Estimation Based on Laser and

Computer Vision. Expert Systems with Applications,

Vol: 41 (15), pp.6646-6661.

Kaempchen, N, Buehler, M & Dietmayer, K 2005.

Feature-level fusion for free-form object tracking

using laserscanner and video. IEEE Proceedings

Intelligent Vehicles Symposium 2005, pp. 453–458,

2005.

Kwak, K, Huber, DF, Badino, H & Kanade, T. 2011

.Extrinsic calibration of a single line scanning lidar

and a camera. IEEE/RSJ Int. Conf. Intell. Robot.

Syst., pp. 3283–3289, 2011.

Li, Y, Ruichek, Y & Cappelle, D 2011. 3D triangulation

based extrinsic calibration between a stereo vision

system and a LIDAR. 14th Int. IEEE Conf. Intell.

Transp. Syst., pp. 797–802, 2011.

Li, Y, Liu, Y, Dong, L, Cai, X 2007. An algorithm for

extrinsic parameters calibration of a camera and a

laser range finder using line features, IEEE/RSJ Int.

Conf. Intell. Robot. Syst.

Lisca, G, Jeong, PJP & Nedevschi, S 2010. Automatic one

step extrinsic calibration of a multi layer laser relative

to a stereo camera. Intell. Comput. Commun. Process.

(ICCP), 2010 IEEE Int. Conf., 2010.

Martín, D, García, F, Musleh, B, Olmeda, D, Marín, P,

Ponz, A, Rodríguez, CH, Al-Kaff, A, de la Escalera, A

& Armingol, JM 2014. IVVI 2.0: An intelligent

vehicle based on computational perception. Expert

Systems with Applications 41.

Premebida, C, Ludwig, O & Nunes, U 2009. LIDAR and

Vision-Based Pedestrian Detection System. Journal of

Field Robotics, vol. 26, no. Iv, pp. 696–711, 2009.

Premebida, C, Ludwig, O, Silva, M & Nunes, U 2010. A

Cascade Classifier applied in Pedestrian Detection

using Laser and Image-based Features.

Transportation, pp. 1153–1159, 2010.

Rodríguez-Garavito, CH, Ponz, A, García, F, Martín, D,

de la Escalera, A & Armingol, JM 2014. Automatic

Laser and Camera Extrinsic Calibration for Data

Fusion Using Road Plane.

Spinello, L & Siegwart, R, 2008. Human detection using

multimodal and multidimensional features. IEEE

International Conference on Robotics and Automation,

3264-3269. DOI: 10.1109/ROBOT.2009.4543708.

WHO, 2009. Global status report on road safety. Time for

action. WHO library cataloguing-in-publication data,

World Health Organization 2009, ISBN 978-9-

241563-84-0, Geneva, Switzerland.

VEHITS2015-InternationalConferenceonVehicleTechnologyandIntelligentTransportSystems