Exploring Machine Learning Techniques for Identification of Cues
for Robot Navigation with a LIDAR Scanner
AJ Bieszczad
Channel Islands, California State University, One University Drive, Camarillo, CA 93012 U.S.A.
Keywords: Mobile Robots, Navigation, Cue Identification, Machine Learning, Clustering, Classification, Neural
Networks, Support Vector Machines.
Abstract: In this paper, we report on our explorations of machine learning techniques based on backpropagation neural
networks and support vector machines for building a cue identifier for mobile robot navigation using a LIDAR
scanner. We use synthetic 2D laser data to identify the technique that is most promising for actual
implementation in a robot, and then validate the model using realistic data. While we explore data
preprocessing applicable to machine learning, we do not apply any specific extraction of features from the
raw data; instead, our feature vectors are the raw data. Each LIDAR scan represents a sequence of values for
measurements taken from progressive scans (with angles varying from 0° to 180°); i.e., a curve plotting distances
as a function of angle. Such curves are different for each cue, and so can be the basis for identification. We
apply varying levels of noise to the ideal scanner measurements to test the capability of the generated models
to accommodate both laser inaccuracy and robot motion. Our results indicate that good models can be
built both with a backpropagation neural network applying Broyden–Fletcher–Goldfarb–Shanno (BFGS)
optimization and with Support Vector Machines (SVM), assuming that the data have been shaped with a [-0.5,
0.5] normalization followed by a principal component analysis (PCA). Furthermore, we show that SVM
creates models much faster and makes them more resilient to noise, so that is what we will be using in our
further research and what we can recommend for similar applications.
1 INTRODUCTION
Automated Intelligent Delivery Robot (AIDer; shown
in Figure 1) is a mobile robot platform for exploring
autonomous intramural office delivery (Hilde, 2009;
Rodrigues, 2009). The research reported
in this paper was part of the overall effort to explore
ways to deliver such functionality. The robot was to
navigate in a known environment (a map of the
facility is one of the elements of AIDer’s
configuration) and carry out tasks that were requested
by the users through a Web-based application. Each
request included the location of a load that was to be
moved to another place that was also specified in the
request. The pairs of start and target locations were
entered into a queue that was managed by a path
planning module. When the next job from the queue
was selected, the robot was directed first to the start
location where it was to get loaded after announcing
itself, and then to the destination where it was to get
unloaded after announcing its arrival. That routine
was to be repeated indefinitely, as long as there were
requests waiting in the queue and as long as there
was power.
Figure 1: Robot with a laser scanner (between the front
wheels).
Figure 2: Robot with a rotating laser scanner (between the
front wheels) generates a sequence of distances for each
progressive angle.
One of the major objectives was to provide the
functionality at low cost. Therefore, AIDer has a very
limited set of sensors for navigation: right-side
detectors of the distance from the wall, and a frontal
2D (one-plane) LIDAR laser scanner for detecting
cues such as turns and intersections. The side sensors
are used to provide real-time feedback to a
controller that corrects the position of the robot so it
stays at a constant distance from the right wall (Hilde,
2009).
Higher-level navigation in AIDer is based on
following paths that consist of a series of intervals
between landmarks (Rodrigues, 2009). A map
of the facility is provided as an element of the
configuration (using a custom notation), so the robot
is not tasked with mapping the environment. The map
configuration file includes locations of landmarks
along with exact distances between the landmarks.
Upon receiving the next task to carry out, the robot
determines the path to travel in terms of the
landmarks. The path is divided into a sequence of
landmarks, and the robot is successively directed to
move to the next landmark. After the current target
landmark is identified, the robot receives the next
target landmark to go to. To accommodate errors
in mobility (like slippage of the wheels) that would
skew a position estimate based purely on traveling
exact distances, the robot relies on identification of
cues to verify reaching landmarks.
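To make this flow concrete, the following minimal Python sketch illustrates the kind of landmark map and request queue described above. The landmark names, cue labels, and helper functions (drive_toward, identify_cue, scan) are hypothetical placeholders; AIDer's actual map notation is a custom format not reproduced here.

```python
from collections import deque

# Landmarks paired with the cue expected at each, plus exact distances
# (e.g., in inches) between adjacent landmarks, mirroring the map
# configuration file described above. All values are illustrative.
LANDMARKS = {"lobby": "lt", "copy_room": "dr", "mail_room": "tr"}
DISTANCES = {("lobby", "copy_room"): 420.0,
             ("copy_room", "mail_room"): 310.0}

# Pairs of start and target locations queued by the Web-based application.
requests = deque([("lobby", "mail_room")])

def follow_path(path):
    """Direct the robot to each landmark in turn; cue identification
    verifies that a landmark has actually been reached."""
    for landmark in path:
        expected_cue = LANDMARKS[landmark]
        print("heading to %s, expecting cue '%s'" % (landmark, expected_cue))
        # drive_toward(landmark)                 # motion control, not shown
        # reached = identify_cue(scan()) == expected_cue
```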
In an environment lacking GPS, identification of
environmental cues is a critical low-level task
necessary for recognizing landmarks (Thrun, 1998),
since landmarks are defined in terms of cues. The
frontal laser-scanner in AIDer serves that purpose.
Each scan produces a sequence of measurements that
differ depending on the shape of the surrounding
walls. For example, Figure 2 shows a scan of a left
turn. The scan results, a sequence of numbers
representing the measured distances (e.g., in inches),
are graphed with angles on the x axis and
distances on the y axis. Due to the range limitations
of the laser scanner, certain measurements may be read
as zeros; that is visible as a sudden drop in the curve
shown in the figure.
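As a minimal sketch of this data representation, a scan can be held as an array of distances indexed by progressive angles, with out-of-range readings clipped to zero. The 240-inch range limit is an assumption for illustration, not the scanner's specification.

```python
import numpy as np

ANGLES = np.linspace(0.0, np.pi, 512)   # progressive scan angles over 180 degrees
MAX_RANGE = 240.0                       # hypothetical range limit, in inches

def clip_scan(distances):
    """Model the scanner's range limitation: readings beyond the maximum
    range come back as zeros, producing the sudden drop in the curve."""
    d = np.asarray(distances, dtype=float)
    return np.where(d > MAX_RANGE, 0.0, d)
```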
In (Hilde, 2009), an approach similar to
(Hinkel and Knieriemen, 1988) was taken, with a selective subset
of measurements used to define cues analytically,
with limited success.
In this paper, the complete raw set is used for this
purpose, as will be explained shortly. Our earlier
attempts to use raw data in such a way were not
completely successful (Henderson, 2012), and the
research reported here remedies that.
2 RELATED WORK
Mapping and localization services are the foundation
of autonomous navigation (Thrun, 1998). As we
already stated, mapping is not a functional objective
of the AIDer. The vast majority of current localization
work is based on the utilization of very sophisticated
equipment, as seen in cars participating in R&D
efforts in academia, the auto industry, and government-
sponsored contests (e.g., Leonard et al., 2008). Utilizing
simple sensors with very limited capabilities started
the field (Borenstein, 1997), but currently it is rare to
depend on just such limited functionality. Yet, the use
of inexpensive devices is important in environments
lacking access to powerful computers or abundant
power supplies (e.g., Roman et al., 2007), and when
cost is a concern (e.g., Tan et al., 2010).
LIDAR-based identification was successfully
solved by analytical methods in (Hinkel and Knieriemen, 1988),
in which histograms of laser measurements were used
as the input data. There have been numerous attempts
to use similar data using a variety of analytical
approaches (e.g., Zhang et al., 2000; Shu et al., 2013;
Kubota et al., 2007; Nunez et al., 2006).
(Vilasis-Cardona et al., 2002) used cellular neural
networks to classify cues, but the localization was
based on processing 2D images of vertical and
horizontal lines placed on the floor rather than 1D
LIDAR measurements. Just like in (Henderson,
2012), histogram data were used as inputs to a
backpropagation neural network in the research reported
in (Harb et al., 2010), but the authors did not specify
the details of the backpropagation algorithm that they
used. Here, we follow that sub-symbolic approach,
studying the capabilities of backpropagation models
and contrasting them with training based on support
vector machines (SVM).
ICINCO2015-12thInternationalConferenceonInformaticsinControl,AutomationandRobotics
646
Figure 3: AIDer laser scan data for 9 cues.
Progressive scan angles in radians are shown on the x axis,
while the y axis shows distance in inches.
3 DATA SETS
The laser mounted on AIDer is capable of scanning
180° with a granularity yielding 512 measurements
per scan. To explore machine learning techniques for
AIDer cue identification, we set aside the large data
set that would be inconvenient for explorations, and
instead we engineered a smaller data set for a
miniaturized virtual model that otherwise preserved
the geometry of the office environment. We return
to the larger data set at the end of the paper,
where we validate our best-performing technique on
that realistic set.
For the experiments, we created data by hand for
9 cues: lt (left turn), rt (right turn), ts (t-section/front),
xs (x-section), tl (left t-section), tr (right t-section), dr
(door on right), dl (door on left), and d2 (door on both
sides). In this miniaturized model, every scan is a
sequence of distance measurements made with a laser
angle progressing in 17 (rather than the original 512)
steps in the interval [0, π]. The curves for all cues are
shown in Figure 3.
A visual inspection of the graphs gives hope that
the curves can indeed be correctly classified within
some noise limits. These limits can be probed to
some degree by introducing elements of possible
noise, modeled by modifying each data
point with a value drawn from the normal distribution with a certain
standard deviation. That noise accounts for the
inaccuracy of the laser measurements. For example,
the type of material from which the walls are built
impacts the reading.
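A minimal sketch of this noise model follows; the function name is ours, and the sample count and σ are illustrative:

```python
import numpy as np

def noisy_curves(ideal_curve, sigma, n_samples=100):
    """Distort an ideal cue curve by adding, to every data point,
    noise drawn from a normal distribution with standard deviation sigma."""
    ideal = np.asarray(ideal_curve, dtype=float)
    return ideal + np.random.normal(0.0, sigma, size=(n_samples, ideal.size))

# e.g., a bundle like the one on the right of Figure 4:
# bundle = noisy_curves(lt_ideal, sigma=0.2, n_samples=100)
```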
Figure 4: Left turn curve adjusted for the distance from the cue,
and then with noise added. Progressive scan angles in
radians are shown on the x axis, while the y axis shows distance
in inches.
To illustrate the impact of distortion on the ideal
curves, the left graph in Figure 4 shows the curve for
the lt (left turn) cue first adjusted for the distance from
the cue, and then with noise added.
For actual training, we generated a large number of
noisy curves. To illustrate the complexity that the
training algorithm must overcome, the right side of
Figure 4 shows a bundle of 100 curves generated for
the left turn with a standard deviation of σ = 0.2. We used
numerous levels of noise in the experiments and
larger sample sets.
4 FEATURE VECTORS
Each of the generated noisy curves is used as a
training sample for building a clustering model. To
create a corresponding feature vector, each curve was
preprocessed. First, we normalized the data to the
[-0.5, 0.5] interval, and then we applied principal
component analysis (PCA) in the hope of eliminating
redundant data dimensions, but also to visualize data
clustering (in 3 rather than 17 dimensions, so not all
nuances in the data are captured in the plot). We also
tried linear discriminant analysis (LDA), but we got
better clustering with PCA. That was important in our
case, since a visual inspection shows that certain parts
of the original ideal curves are repeated in every
pattern. We could just cut those dimensions from
the data altogether, but instead we left it to PCA,
which formalizes such observations while also catching
similarities that are not easily visible to the naked
eye. Additionally, while the ideal data may be aligned
in some dimensions, noisy data coming from the
scanner may not be, so it is better to let
PCA make such decisions.
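A minimal sketch of this shaping pipeline with Scikit Learn follows (fit_preprocessing is our name; the toolkit classes are MinMaxScaler and PCA, used as described above):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA

def fit_preprocessing(train_curves, n_components=None):
    """Fit the two shaping steps on the training curves: scaling to
    [-0.5, 0.5], then PCA (None keeps all principal components)."""
    scaler = MinMaxScaler(feature_range=(-0.5, 0.5)).fit(train_curves)
    pca = PCA(n_components=n_components).fit(scaler.transform(train_curves))
    return lambda curves: pca.transform(scaler.transform(np.asarray(curves)))

# The transforms fitted on the training set are reused verbatim on test data:
# transform = fit_preprocessing(train_curves)
# X_train, X_test = transform(train_curves), transform(test_curves)
```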
Figure 5: Visible nine clusters of feature vectors
corresponding to the nine target cues. Standard deviations of
σ = 0.05, σ = 0.1, σ = 0.2, and σ = 0.5 were used to generate
the curves (starting with the upper left corner). Please note
that the visuals are rotated to show the best perspective of
the clusters.

Figure 5 shows how the data cluster using just
the first three principal components of the
preprocessed data. The 3D scatter graphs for curves
with low distortion clearly indicate that the data are
clustered in nine locations corresponding to the nine
target cues. The figure also illustrates the challenge in
data separation when the standard deviation is
increased. After numerous experiments, we actually
found that for best results (i.e., for the lowest error
rate) we needed to keep almost all principal
components, barring one or two least significant ones.
Since dropping so few had minimal impact on the
efficiency of training, we ended up using PCA
to improve the odds of clustering rather than to
reduce dimensionality.
5 EXPLORING NEURAL
NETWORK-BASED CUE
IDENTIFICATION
With one thousand feature vectors generated this way at
hand, we used a backpropagation neural network to
build a classifier. We also attempted to process larger
numbers (namely, ten thousand), but that was taking
too much time (in excess of 12 hours on a fast iMac
running Python 3.4 and Neurolab 3.5). We tried a
number of training strategies available as options in
Neurolab, but we were consistently successful only
with the one based on the Broyden–Fletcher–
Goldfarb–Shanno (BFGS) optimization. Other
optimization methods (such as gradient descent) took
much longer, often failed to converge, and
lacked consistency in repeated attempts (i.e., they
were very dependent on the starting conditions). A
backpropagation network with BFGS optimization-
based training converged successfully under a
variety of conditions and had a high rate of
identification accuracy (unless the data set was very
large, as will be explained later).

Figure 6: Convergence rate of a backpropagation network
with BFGS training on random curves
distorted with varied standard deviations.
Just for completeness and clarity of the experimental
setup, let us state that we used a 17-
unit input layer (since we have 17-dimensional feature
vectors) and a 9-unit binary output layer (as we have
9 cues, i.e., classification targets). We also tried a
network with a single multivariate output unit, but
that architecture did not yield a successful model.
After trying a number of network architectures,
we found that a 17-50-9 network (a single-hidden-
layer network with fifty hidden units) performed
similarly to a 17-20-20-20-9 network (a three-hidden-layer
network with twenty units in each hidden layer), as
shown in Figure 6. With higher levels of distortion
(standard deviation σ ≥ 0.4), the 17-20-20-20-9
network failed to train in a reasonable time. The
convergence speed was similar for both networks, as
shown in Figure 7 for standard deviation σ = 0.1. The
figure shows the convergence rate of the networks for
one thousand randomly distorted curves generated
with two different standard deviations (with SSE used
as the measure of error). Increasing the standard
deviation often led to increased training time (though not
always, owing to the dependence on the starting
conditions, which are chosen automatically by
Neurolab's training algorithm), to a higher error rate
on the test set, and to increasingly frequent failures to
converge below the target error rate. Neurolab's
training functions detect when there is no progress
(i.e., no change in the error rate) and terminate the
training session even before hitting the limit on the
number of epochs.
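For reference, a minimal Neurolab sketch of the 17-50-9 setup follows; the epoch limit and error goal are illustrative values, not the exact settings behind the plots:

```python
import numpy as np
import neurolab as nl

# 17 inputs (feature-vector dimensions) in [-0.5, 0.5], one hidden layer
# of 50 units, and 9 binary outputs (one per cue): the 17-50-9 network.
net = nl.net.newff([[-0.5, 0.5]] * 17, [50, 9])
net.trainf = nl.train.train_bfgs     # the only consistently successful trainer
net.errorf = nl.error.SSE()          # SSE, the error measure used here

# X: (n, 17) feature vectors; Y: (n, 9) one-hot cue targets.
# errors = net.train(X, Y, epochs=500, show=10, goal=0.02)
# predicted = np.argmax(net.sim(X_test), axis=1)   # winner takes all
```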
Figure 7: Comparison of convergence rates of backpropagation networks with BFGS training, 17-20-20-20-9 (left) and 17-50-9 (right),
on random curves distorted with a standard deviation of σ = 0.1.

Training the 17-50-9 network took
increasingly longer for larger σ, but
completed successfully. As shown in Figure 8, the
network was relatively effective in tests with the
training set, but it became increasingly less reliable
with higher levels of noise.
For testing the models, we generated another
thousand randomly distorted new curves. In the tests, we
used the preprocessing transform functions
(normalization and PCA) constructed with the
training set, since the application requires that the
model deal well with novelty. As could be
expected, if the standard deviation of the test set was
the same as the one used to generate the training set,
the accuracy of the model was better than with a test
set generated with a higher standard deviation than
the one used in training.
It is worth noting that we did not need to use
any cross-validation techniques, as we could
generate test data at will.

Figure 8: Performance of a 17-50-9 network expressed
through the misclassification error rate as a function of the
standard deviation used for generating curves.

Figure 9: Misclassification rate as a function of the
regularization coefficient with standard deviation σ = 0.3.
6 APPLYING REGULARIZATION
To address the higher misclassification rate for noisier
input data, we tested several values of the
regularization coefficient to relax the clustering and
avoid overfitting. As shown in Figure 9, it was an
effective tool for improving the accuracy of
classification for noisy data.
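Continuing the Neurolab sketch above, regularization can be requested through the rr (regularization ratio) parameter that Neurolab documents for its scipy-based trainers such as train_bfgs; the value 0.3 mirrors the coefficient behind Figure 10:

```python
# Retrain the same 17-50-9 network with a non-zero regularization ratio
# (assuming Neurolab's documented rr parameter for train_bfgs).
errors = net.train(X, Y, epochs=500, show=10, goal=0.02, rr=0.3)
```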
It is interesting to note that although networks
with a non-zero regularization factor may yield higher
error rates (SSE), and may even fail training in the
traditional sense of not getting under a certain error
threshold, they can still classify the data correctly,
and therefore show a lower classification error. To
illustrate the point, compare the rate of
misclassifications in Figure 9 with the number of
erroneous outputs made by the same network shown
in Figure 10.
This is an important distinction between applications
for classification, in which the winner-takes-all
strategy is applied, and for regression; in the latter, an
increased error rate would certainly be more
troublesome.
7 EXPLORING SUPPORT
VECTOR MACHINES (SVM)
FOR CUE IDENTIFICATION
We attempted to improve the performance of the
neural network models by increasing the cardinality
of the training set to ten thousand samples.
Unfortunately, as we stated earlier, the BFGS training
algorithm in Neurolab could not deal with that
number of data points and, as a consequence, failed to
converge in a reasonable time. As also previously
explained, using reduced dimensions proved to make
things worse in the experiments with a smaller
training set, so we decided to move on and try another
technique said to be very successful in classification
applications: Support Vector Machines (SVM).
We started immediately with a very large set of
training samples (ten thousand), since we were
interested in the performance of the training method
in the Scikit Learn toolkit that we used for our
explorations. Scikit Learn uses extremely efficient
scientific libraries collected under the common
umbrella of SciPy, some implemented even in Fortran
for maximal efficiency. The implementation of
SVM in Scikit Learn has a very convenient API
for multi-class classification.
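A minimal sketch of that setup follows; SVC is Scikit Learn's SVM classifier and, as noted later, we kept its default parameters (the variable names are placeholders):

```python
import numpy as np
from sklearn.svm import SVC

# Multi-class classification comes for free with SVC (one-vs-one under
# the hood); default parameters, as in our experiments.
clf = SVC()
# clf.fit(X_train, y_train)         # y_train: integer cue labels, 0..8
# error_rate = np.mean(clf.predict(X_test) != y_test)
```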
We preprocessed the data in a way identical to
the earlier experiments with neural networks:
MinMaxScaler and PCA. We tried to use data with
reduced dimensions, but, as earlier, we got better
results when keeping all dimensions.
For tests, we generated a random set of, likewise, ten
thousand data points, using the same level of
distortion (i.e., the same standard deviation σ) as in
training.
One immediate observation was that the SVM
training on a ten times larger data set was
dramatically faster than the neural network training on
many fewer samples. Figure 11 shows the results
from a number of experiments with a variety of
distortion levels. Comparing these results to the ones
shown in Figures 8 and 9, it is evident that in the presence
of noise, the performance of the model built with
SVM exceeds, roughly three-fold, the capability of the
backpropagation neural network with the best-
performing (on our specific data sets) BFGS
training. We used the default values of the SVM
parameters from Scikit Learn.

Figure 10: A histogram of errors made by a model with a
regularization coefficient of 0.3 for a data set generated
with σ = 0.3.

Figure 11: Performance of SVM models on sets with
increasing standard deviation.

Figure 12: Ten overlaid actual 512-dimensional curves for
cue identification in the actual AIDer environment. Distance is
measured in inches, and the x axis shows the index of each
measurement (from 0 to 511).
8 CONCLUSIONS: VALIDATING
THE MODEL IN A HIGHER
DIMENSION
As we stated at the beginning of this document, the
actual scanner data have a much higher dimension: 512,
rather than the 17 used in our explorations of the
machine learning techniques. Figure 12 shows the
ideal cues taken from the actual physical facility (the
measurements were done by hand rather than with a
scanner; hence the adjective “ideal”). There are also
ten rather than nine cues in this data set. With the very
optimistic results from the experiments with the
SVM, we used exactly the same strategy to process the
realistic data. Expanding the dimension of a data set thirty-
fold is often a computational challenge, and that was
evident in the increased processing times.
Still, the increase in the demand for time was of a more
linear than exponential nature, in spite of again
using ten thousand samples both for training and for
testing.
We need to emphasize that it was critical to
normalize the data, since some algorithms in the
Python toolkit could otherwise fail on NaNs.
Fortunately, the MinMaxScaler() function from Scikit
Learn worked well, preparing the data for a successful
PCA, as shown in Figure 13.
Subsequently, the SVM algorithms converged
nicely and performed similarly to the experiments
with the smaller data set. Figure 14 shows the results
for several levels of noise.
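The roughly linear growth in training time can be checked with a simple harness like the one below (the data-set variable names are hypothetical placeholders for the 17- and 512-dimensional sets):

```python
import time
from sklearn.svm import SVC

for X, y, label in [(X_17, y_17, "17-dim"), (X_512, y_512, "512-dim")]:
    start = time.perf_counter()
    SVC().fit(X, y)                  # same default-parameter SVM as before
    print("%s training took %.1f s" % (label, time.perf_counter() - start))
```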
9 FUTURE DIRECTIONS
One of the aspects of curve shape distortion for cues
based on object boundaries is the point of view from
which the snapshot is taken. If the identification is
made quickly, then it does not matter, as the model
may be picking up the cue at the ideal spot from
which the training samples were generated. Introducing
such an element of distortion with random methods is
difficult, since the shape of the cue may change more
dramatically than under an applied standard deviation,
so another approach is to use a number of points of view
(e.g., three) and to generate snapshots of a cue taken
from each of them. In this way, the training data would
include a number of views of each cue. We report on
this approach in another paper (Bieszczad, 2015).
Yet another problem omitted in this paper is the
fact that cues are often present at the same time, so
they make it into the same snapshot. We are planning
to use data sets that mix cues to some degree to test
the identification capabilities of models trained
under such circumstances. One idea for dealing with this
problem, if it arises, is to separate the cues from the curves.
Such attempts have been made by numerous
researchers, including in more complex approaches to the
localization problem (e.g., through image processing
and scene analysis).
A much harder problem to overcome is the issue of
the accuracy of laser scans when dealing with various
materials from which obstacles are made and with varied
light conditions. These issues are of paramount importance
in outdoor navigation in unknown terrain, as
described in (Roman et al., 2007) and elsewhere.

Figure 13: Normalized cue curves and their locations in the feature space with only the three most significant principal
components. The distance axis uses normalized values, and the measurement index is shown on the x axis. The 3D plot is
rotated for the best illustration of the centers of the clusters.
While we plan to continue experimenting with a
robot, using a physical machine for numerous tests is
inconvenient and inefficient, so we are planning to
build a simulator with which it will be easier to test
our models.
REFERENCES
Hilde, L., 2009. Control Software for the Autonomous
Interoffice Delivery Robot. Master Thesis, Channel
Islands, California State University, Camarillo, CA.
Rodrigues, D., 2009. Autonomous Interoffice Delivery
Robot (AIDeR) Software Development of the Control
Task. Master Thesis, Channel Islands, California State
University, Camarillo, CA.
Thrun, S., 1998. Learning metric-topological maps for
indoor mobile robot navigation. In Artificial
Intelligence, Vol. 99, No. 1, pp. 21-71.
Hinkel, R. and Knieriemen, T., 1988. Environment
Perception with a Laser Radar in a Fast Moving Robot.
In Robot Control 1988 (SYROCO'88): Selected Papers
from the 2nd IFAC Symposium, Karlsruhe, FRG.
Henderson, A. M., 2012. Autonomous Interoffice Delivery
Robot (AIDeR) Environmental Cue Detection. Master
Thesis, Channel Islands, California State University,
Camarillo, CA.
Leonard, J., et al., 2008. A Perception-Driven Autonomous
Urban Vehicle. In Journal of Field Robotics, 1–48.
Tan, F., Yang, J., Huang, J., Jia, T., Chen, W. and Wang, J.,
2010. A Navigation System for Family Indoor Monitor
Mobile Robot. In The 2010 IEEE/RSJ International
Conference on Intelligent Robots and Systems, October
18-22, 2010, Taipei, Taiwan.
Roman, M., Miller, D., and White, Z., 2007. Roving Faster
Farther Cheaper. In 6th International Conference on
Field and Service Robotics - FSR 2007, Jul 2007,
Chamonix, France. Springer, 42, Springer Tracts in
Advanced Robotics; Field and Service Robotics.
Vilasis-Cardona, X., Luengo, S., Solsona, J., Maraschini,
A., Apicella, G. and Balsi, M., 2002. Guiding a mobile
robot with cellular neural networks. In International
Journal of Circuit Theory and Applications; 30:611–
624.
Shu, L., Xu, H., and Huang, M., 2013. High-speed and
accurate laser scan matching using classified features.
In IEEE International Symposium on Robotic and
Sensors Environments (ROSE), Page(s): 61 - 66.
Nunez, P., Vazquez-Martin, R., del Toro, J. C., Bandera, A.,
and Sandoval, F., 2006. Feature extraction from laser
scan data based on curvature estimation for mobile
robotics. In IEEE International Conference Robotics
and Automation (ICRA), pp. 1167–1172.
Zhang, L. and Ghosh, B. K., 2000. Line segment based map
building and localization using 2d laser rangefinder. In
IEEE International Conference on Robotics and
Automation (ICRA), pp. 2538–2543.
Harb, M., Abielmona, R., Naji, K., and Petriu, E., 2010.
Neural Networks for Environmental Recognition and
Navigation of a Mobile Robot. In Proceedings of IEEE
International Instrumentation and Measurement
Technology Conference, Victoria, Vancouver Island,
Canada.
Kubota, S., Ando, Y., and Mizukawa, M., 2007. Navigation
of the Autonomous Mobile Robot Using Laser Range
Finder Based on the Non Quantity Map. In
International Conference on Control, Automation and
Systems, COEX, Seoul, Korea.
Bieszczad, A. and Pagurek, B., 1998. Neurosolver:
Neuromorphic General Problem Solver. In Information
Sciences: An International Journal, Vol. 105, pp.
239-277, Elsevier North-Holland, New York, NY.
Bieszczad, A., 2015. Identifying Landmark Cues with
LIDAR Laser Scanner Data Taken from Multiple
Viewpoints. In Proceedings of International
Conference on Informatics in Control, Automation and
Robotics (ICINCO), Scitepress Digital Library.
ICINCO2015-12thInternationalConferenceonInformaticsinControl,AutomationandRobotics
652