OBSTACLE DETECTION AND AVOIDANCE ON SIDEWALKS

D. Castells

Universidad Politécnica de Madrid, Spain

J. M. F. Rodrigues and J. M. H. du Buf

Vision Laboratory, Institute for Systems and Robotics (ISR), University of the Algarve (ISE and FCT)

8005-139 Faro, Portugal

Keywords:

Sidewalk border detection, Obstacle avoidance, Path tracking, Visually impaired.

Abstract:

We present part of a vision system for blind and visually impaired people. It detects obstacles on sidewalks and

provides guidance to avoid them. Obstacles are trees, light poles, trash cans, holes, branches, stones and other

objects at a distance of 3 to 5 meters from the camera position. The system ﬁrst detects the sidewalk borders,

using edge information in combination with a tracking mask, to obtain straight lines with their slopes and

the vanishing point. Once the borders are found, a rectangular window is deﬁned within which two obstacle

detection methods are applied. The ﬁrst determines the variation of the maxima and minima of the gray levels

of the pixels. The second uses the binary edge image and searches in the vertical and horizontal histograms for

discrepancies in the number of edge points. Together, these methods detect possible obstacles with
their position and size, so that the user can be alerted and informed about the best way to avoid them. The
system works in real time and complements normal navigation with the cane.

1 INTRODUCTION

Every car and bicycle can be equipped with a GPS/GIS-based navigation system that may cost a few hundred euros. By contrast, blind and visually impaired persons need to navigate using the cane or, at best, an ultrasonic obstacle detector. This asymmetry needs to be addressed, because there are an estimated 180 million persons with severe visual impairments, of whom 40-50 million are completely blind, and every year 2 million more become blind. The Portuguese project “SmartVision: active vision for the blind” aims at developing a portable GIS-based navigation aid for the blind, for both outdoor and indoor navigation, with obstacle avoidance and object recognition based on active vision modules.

A few recent systems assist visually impaired users in navigation, with and without obstacle detection and avoidance. For example, Lee and Kang (2008) developed a system which integrates outdoor navigation and obstacle detection. Kim et al. (2009) presented an electronic travel aid called iSONIC, which complements the conventional cane by detecting obstacles at head height.

The work presented here concerns one of the modules of the SmartVision project. This module detects sidewalk borders and assists the user in centering on the sidewalk and in avoiding obstacles. Typical obstacles are light poles, trash cans and tree branches, as well as imperfections such as holes and loose stones, at a distance of 3 to 5 meters from the user. The system automatically adapts to different types of sidewalks and paths, and it works in real time on a normal portable computer.

2 SYSTEM SETUP

In the SmartVision project, a stereo camera (Bumblebee 2 from Point Grey Research Inc.) is fixed to the chest of the user, at a height of about 1.5 m from the ground. Results presented here were obtained by using only the right-side camera, but the system performs equally well using a normal, inexpensive webcam with about the same resolution. The resolution must be sufficient to resolve textures of the pavements related to possible obstacles like holes and


Castells D., M. F. Rodrigues J. and M. H. du Buf J. (2010).

OBSTACLE DETECTION AND AVOIDANCE ON SIDEWALKS.

In Proceedings of the International Conference on Computer Vision Theory and Applications, pages 235-240

DOI: 10.5220/0002816002350240

 SciTePress

loose stones with a minimum size of about 10 cen-

timeters at a distance of 3 to 5 meters from the cam-

era (the ﬁrst meters are not covered because of the

height of the camera; this area is covered by the cane

swayed by the user). Figure 1 (top) shows a typical

frame; Fig. 4 shows one of our test sequences.

The system is composed of three processing steps:

(1) Sidewalk border detection and the deﬁnition of the

obstacle detection window (ODW). (2) The detection

of obstacles in the ODW using two complementary

processes for tracking irregularities: (i) the number

of local maxima and minima of pixel values, and (ii)

histograms of binary edge information. (3) Tracking

of obstacles in subsequent frames for alerting the user

and obstacle avoidance.

2.1 Sidewalk Border Detection

There are some methods to detect the borders of side-

walks (Kayama et al., 2007). Here we detect them

by using a simple edge detector in combination with

a tracking mask to obtain straight lines, from the bot-

tom of each frame to the top, characterized by slope,

length and proximity to the left and right boundaries

of the frame. The detected borders will deﬁne the hor-

izontal position and width of the ODW.

We apply the Canny edge algorithm (Heath et al., 1997) with three parameters: $\sigma$ defines the size of the Gaussian filter, and $T_H$ and $T_L$ are the high and low thresholds for hysteresis edge tracking. We always use $\sigma = 1.5$, $T_H = 0.95$ and $T_L = 0.5$. Figure 1 (2nd image from top) shows edges detected in the case of the frame shown above. Other edge detectors may perform better, see (Rodrigues and du Buf, 2009), but most require more CPU time, which is critical in this application.

In order to detect potential sidewalk borders, several horizontal test lines are defined in the binary edge image $I_i(x,y)$, with $i$ the frame number. Figure 1 (2nd

image) shows these on the left (in red). On the ﬁrst

test line “1TL” there are many edge pixels which may

be part of potential border lines. They are starting

points which are labelled differently for the testing

process. A line-tracking mask of size 5×3 (Fig. 1, bottom-right) is applied to track connected pixels upwards, vertically with an opening angle α ≈ 120°, from “1TL” to “2TL”, 20 pixels higher, and connected pixels are attributed the same label. Then,

starting at “2TL”, the process is repeated for ﬁnding

more potential border lines, complementing the ﬁrst

search. New labels are generated for pixels on “2TL”

which are not connected to pixels on the ﬁrst test line.

This second search continues until the last testing line

“last TL”, 200 pixels above “1TL”. Figure 1 (3rd row,

Figure 1: Sidewalk border detection. Top to bottom: an

input frame; edge detection with testing lines indicated (in

red) at the left; start of tracking edge pixels at the ﬁrst test

line (1 TL) with label 43 (at left) plus the ﬁlling of a small

gap (at right); the three steps a, b and c for coping with very

close lines; and the line tracking mask with angle α.

at left) shows the mask tracking edge pixels with label

43 and (at right) tracked edge pixels with a small gap

which will be ﬁlled.

Occasionally, there are two or more labeled edge

pixels which are very close. These could correspond

VISAPP 2010 - International Conference on Computer Vision Theory and Applications


to a single or to several border lines. In those cases,

when the mask is applied to a new edge pixel, the

already labelled pixels below are checked by using

the vertically mirrored mask. If there are many pixels

with a label not equal to the label of the mask’s central

pixel, the label of the central pixel changes to the one

of the majority of the pixels in the mask. The 4th and

5th rows in Fig. 1 show an example of this process

with steps a, b and c. Final images with connected and labelled edge pixels are denoted by $L_i(x,y)$, but these still contain only candidate border lines. To be considered sidewalk borders, detected lines must satisfy the following three requirements:

(1) Connected edge pixels must have a minimum

length (MinEL) covering at least 80 vertical positions

(MinEL 1TL or MinEL 2TL, depending on the line

starting on 1TL or on 2TL, see Fig. 1, 2nd image from

top). Shorter series are removed.

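(2) Connected edge pixels must be almost linear, i.e., with correlation $r > 0.9$, with a slope $|b|$ between 0.5 and 10. The slope also provides information about the sidewalk’s width: the higher the slope, the narrower the sidewalk. In order to speed up the correlation/slope process, only eight equidistant points of each potential line are processed. The correlation $r$ of edge pixels with the same label is given by $r = \sigma_{xy} / (\sigma_x \cdot \sigma_y)$, where $\sigma_{xy}$ is the covariance, $\sigma_{xy} = \sum xy/n - (\sum x \cdot \sum y)/n^2$, and $\sigma_x^2$ and $\sigma_y^2$ are the variances of $x$ and $y$:

$$\sigma_x^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2; \quad \sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2. \qquad (1)$$

The regression line is given by $L = a + bx$, with

$$b = \frac{\sum x \cdot \sum y - n \sum xy}{(\sum x)^2 - n \sum x^2}; \quad a = \frac{\sum x \cdot \sum xy - \sum y \cdot \sum x^2}{(\sum x)^2 - n \sum x^2}. \qquad (2)$$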

(3) Occasionally, more than two lines remain as potential sidewalk borders, or none at all. Depending on the number of lines $n_l$, the following is done: (i) If no lines are found, $n_l = 0$, the last two borders found in a previous frame will be used. (ii) If a single line is found, $n_l = 1$, the second line will be automatically generated, symmetrically with respect to the vertical line that passes through the intersection of the line found and the horizontal line through the vanishing point. The latter is updated dynamically for each new frame, as explained below. (iii) If two lines are found, $n_l = 2$, they are accepted as sidewalk borders, but with the following exception: if the slopes of the two lines have the same sign, the lines cannot be the two borders of one sidewalk. In that case the outermost one is ignored, the innermost one is used, and a new line is generated as in case $n_l = 1$. Here, innermost means closest to the center of the frame and outermost closest to the left or right frame borders. (iv) In the case of more lines, $n_l > 2$, the most symmetrical and inner pair of lines is selected.
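The linearity test of requirement (2) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the representation of a labelled line as an ordered list of (x, y) pixels, and the use of $|r|$ instead of $r$ (so that borders with negative slope also pass) are assumptions made here.

```python
import math

def sample_equidistant(points, k=8):
    """Pick k roughly equidistant points from an ordered list of (x, y) pixels."""
    if len(points) <= k:
        return list(points)
    step = (len(points) - 1) / (k - 1)
    return [points[round(i * step)] for i in range(k)]

def correlation_and_line(points):
    """Return (r, a, b) for the line y = a + b*x, following Eqs. (1) and (2)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    cov = sxy / n - (sx * sy) / n ** 2
    var_x = sxx / n - (sx / n) ** 2
    var_y = syy / n - (sy / n) ** 2
    if var_x <= 0 or var_y <= 0:
        return None  # purely vertical or horizontal run: r or b undefined
    r = cov / math.sqrt(var_x * var_y)
    denom = sx ** 2 - n * sxx
    b = (sx * sy - n * sxy) / denom
    a = (sx * sxy - sy * sxx) / denom
    return r, a, b

def is_border_candidate(points, r_min=0.9, b_min=0.5, b_max=10.0):
    """Assumed acceptance rule: |r| > 0.9 and 0.5 <= |b| <= 10."""
    result = correlation_and_line(sample_equidistant(points))
    if result is None:
        return False
    r, _, b = result
    return abs(r) > r_min and b_min <= abs(b) <= b_max
```

For example, pixels lying on y = 2x + 1 are accepted, whereas an almost horizontal run with slope 0.1 is rejected by the slope limits.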

Above, the vanishing point is used to generate

symmetrical line pairs. At the start of a sequence of

frames, or when no acceptable sidewalk borders can

be detected, the height of the vanishing point will be

initialized at 3/4 of the frame height. Then, when

two correct sidewalk borders are found, the vanish-

ing point is determined by the intersection of the two

borders, and the point is dynamically updated by av-

eraging the points of the previous frame and the new

frame.
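The vanishing-point update and the generation of a symmetric border (case of a single line) can be sketched as below. The representation of a border as an intercept/slope pair (a, b) with y = a + bx, and all function names, are assumptions for illustration only.

```python
def intersect(line1, line2):
    """Intersection of y = a1 + b1*x and y = a2 + b2*x; None if parallel."""
    (a1, b1), (a2, b2) = line1, line2
    if b1 == b2:
        return None
    x = (a2 - a1) / (b1 - b2)
    return x, a1 + b1 * x

def update_vanishing_point(previous, left, right):
    """Average the previous vanishing point with the new border intersection."""
    new = intersect(left, right)
    if new is None:
        return previous
    if previous is None:
        return new
    return ((previous[0] + new[0]) / 2, (previous[1] + new[1]) / 2)

def mirror_line(line, vp_y):
    """Generate the symmetric partner of a single detected border.

    The mirror axis is the vertical line through the point where the
    border crosses the horizontal line y = vp_y (vanishing-point height).
    """
    a, b = line
    x0 = (vp_y - a) / b           # b is nonzero because |b| >= 0.5
    return (vp_y + b * x0, -b)    # passes through (x0, vp_y) with slope -b
```

By construction, a border and its mirrored partner intersect on the horizontal line through the vanishing point.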

After obtaining two valid sidewalk borders, the obstacle detection window (ODW) is defined for detecting and locating possible obstacles. This window has predefined upper and lower limits with a height of $N_{v,ODW} = 100$ pixels ($v$ is vertical); see Section 2. The left and right limits are defined by taking 80% of the distance between the two borders found at the upper limit, which gives $N_{h,ODW}$ pixels ($h$ is horizontal).
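A possible way to compute the window limits is sketched below. It assumes that borders are given as (a, b) pairs with y = a + bx, that image rows increase downwards, and that the 80% window is centered between the two borders; the centering and the function name are assumptions, not details stated above.

```python
def odw_limits(left, right, y_top, height=100, fraction=0.8):
    """Return (x_left, x_right, y_top, y_bottom) of the detection window.

    left, right: borders as (a, b) with y = a + b*x.
    y_top: predefined upper limit of the window (image row).
    """
    xl = (y_top - left[0]) / left[1]     # border positions at the upper limit
    xr = (y_top - right[0]) / right[1]
    if xl > xr:
        xl, xr = xr, xl
    centre = (xl + xr) / 2
    half = fraction * (xr - xl) / 2      # 80% of the border distance
    return round(centre - half), round(centre + half), y_top, y_top + height
```

With borders that are 440 pixels apart at the upper limit, the window is 352 pixels wide and 100 pixels high.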

3 OBSTACLE DETECTION AND

AVOIDANCE

For obstacle detection, two different methods are ap-

plied to the ODW. The ﬁrst one counts variations of

gray values, i.e., local maxima and minima, on each

horizontal line inside the ODW. Then, outliers are re-

duced by averaging counted variations over groups of

lines, and variations over these groups are combined

into a single value which indicates whether the frame

has a possible obstacle or not. Final conﬁrmation is

obtained by combining the results of a few subsequent

frames. The second method is based on irregularities in vertical and horizontal histograms of the binary edge image $I_i$. An obstacle can lead to two different signatures: if the pavement is smooth, an obstacle may appear as a local excess of edge points, but if the pavement has a strong texture there will be a large number of edge points and an obstacle may appear as a local deficiency (lack or gap) of edge points. The second method is used to confirm the result of the first one, but it also serves to detect the size and position of an obstacle in order to guide the user away from it.

3.1 Local Maxima and Minima

(a) A small lowpass filter (averaging block filter of size 3×3; $LP(x,y)$) is applied twice to the graylevel ODW of frame $i$ ($F_{ODW,i}$), so high frequencies are suppressed and a less noisy window can be processed. If $*$ denotes convolution, then $\hat{F}_{ODW,i}(x,y) = F_{ODW,i}(x,y) * LP(x,y) * LP(x,y)$.

(b) Then, the variations of gray values on each horizontal line of the window are computed by applying the first derivative $F'_{ODW,i}(x) = \partial(\hat{F}_{ODW,i}(x))/\partial x$.

(c) A threshold $T = \pm 2$ is applied to the derivative. This suppresses small transitions, and maxima and minima can now easily be found where the derivative changes its sign (zero crossing or ZC): $+/-$ for a maximum and $-/+$ for a minimum.

(d) The next step consists of counting on each horizontal line $y$ the number of maxima and minima $MM(y)$ over the ODW window (100 horizontal lines), $MM(y) = \sum_x ZC[T[F'_{ODW,i}(x)]]$.

(e) The result is stabilized by removing outliers. This is done by taking the average of $MM$ over triplets of lines, i.e., over three consecutive horizontal lines, which results in only 33 values for each ODW. With $k = 0, 1, 2$ and line counting starting at $y = 0$, $\overline{MM}(y/3) = \sum_k MM(y+k)/3$.

(f) For calculating variations over the ODW’s lines, the first derivative is applied to $\overline{MM}(y/3)$: $\overline{MM}'(y/3) = \partial(\overline{MM}(y/3))/\partial y$.

(g) The last processing step of the ODW of frame $i$ consists of determining the maximum value (max) of the absolute value ($|\cdot|$) of the derivative from step (f), $max_i = \max |\overline{MM}'(y/3)|$. This value indicates a possible obstacle in the ODW.

(h) In order to detect and confirm obstacles, a dynamic threshold is used to alert the user. The dynamic threshold is initialized by computing the average of $max_i$ over the first five frames, $\overline{max} = \sum_{i=1}^{5} max_i / 5$, and the average of the deviation, i.e., the absolute difference between $\overline{max}$ and each $max_i$, $dev = \sum_{i=1}^{5} |max_i - \overline{max}| / 5$. The first threshold, $T_6 = \overline{max} + dev$, is tested against $max_6$ obtained from frame 6, after this frame has been processed from step (a) to (g). The same processing is done with $max_i$, $dev_i$ and $T_i$. Two conditions can occur for $i > 5$:

(h.1) If $max_i$ does not exceed threshold $T_i$, a new threshold for the next frame is calculated by $T_{i+1} = 4/5 \cdot T_i + 1/5 \cdot (max_i + dev_i)$.

(h.2) Otherwise, a warning-level counter is activated. If $max_i$ of the next two frames continues to exceed the threshold, an obstacle warning is issued. The same happens when, after the warning level has been activated, there is only one frame which does not exceed the threshold. If more than two consecutive frames do not exceed the threshold, the warning counter is reset and the threshold continues to be adapted dynamically.
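Steps (a) to (g) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: the window is a list of rows of gray values, border pixels are averaged over the available neighbours only, derivatives are simple forward differences, and derivative values below the threshold are dropped before counting sign changes. None of these details are specified above, and the function names are hypothetical.

```python
def box_filter_3x3(img):
    """3x3 averaging filter; border pixels average over available neighbours."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out

def count_extrema(row, t=2.0):
    """Steps (b)-(d): count sign changes of the thresholded derivative."""
    deriv = [row[x + 1] - row[x] for x in range(len(row) - 1)]
    signs = [1 if d >= t else -1 if d <= -t else 0 for d in deriv]
    signs = [s for s in signs if s != 0]       # drop suppressed transitions
    return sum(1 for s0, s1 in zip(signs, signs[1:]) if s0 != s1)

def frame_max(odw, t=2.0):
    """Steps (a)-(g): return max_i for one gray-level window."""
    smooth = box_filter_3x3(box_filter_3x3(odw))               # (a)
    mm = [count_extrema(row, t) for row in smooth]             # (b)-(d)
    groups = [sum(mm[y:y + 3]) / 3                             # (e)
              for y in range(0, len(mm) - 2, 3)]
    diffs = [abs(g1 - g0) for g0, g1 in zip(groups, groups[1:])]  # (f)
    return max(diffs) if diffs else 0.0                        # (g)
```

For a window of 100 lines the grouping in step (e) yields 33 averaged values, as stated in the text. A uniform window gives a value of zero, whereas a window whose lower part contains strong stripes gives a positive value.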

The processing described above detects big variations of the number of local maxima and minima, first in the horizontal lines and then over the lines in the ODW. This makes it possible to detect the appearance and the disappearance of an obstacle in the window, because these coincide with the first and last $max_i$ which exceed the dynamic threshold. For detecting the position and size of an obstacle, we apply an analysis of edge histograms.
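The initialization and update of the dynamic threshold in step (h) can be sketched as below. The deviation is taken here as the mean absolute deviation, since a signed average of deviations from the mean would always be zero; the function names are hypothetical, and the per-frame deviation passed to the update is left to the caller because its exact definition is not given above.

```python
def initial_threshold(first_maxima):
    """Return (mean, threshold) from the calibration frames.

    threshold = mean + mean absolute deviation of the given max_i values.
    """
    n = len(first_maxima)
    mean = sum(first_maxima) / n
    dev = sum(abs(m - mean) for m in first_maxima) / n
    return mean, mean + dev

def update_threshold(t_i, max_i, dev_i):
    """Step (h.1): T_{i+1} = 4/5 * T_i + 1/5 * (max_i + dev_i)."""
    return 0.8 * t_i + 0.2 * (max_i + dev_i)
```

The weights 4/5 and 1/5 make the threshold follow slow changes of the pavement texture while damping the influence of a single frame.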

3.2 Edge Histograms

This method exploits the already available edge maps $I_i(x,y)$, see Section 2.1, but only inside the ODW: $I_{ODW,i}(x,y)$. Depending on the smoothness (texture) of a sidewalk’s pavement, different characteristics are expected. If $r_w$ is the fraction of the number of “white” (edge) pixels $N_{w,ODW}$ in $I_{ODW,i}(x,y)$, with $N_{h,ODW}$ and $N_{v,ODW}$ the window’s dimensions, then $r_w = N_{w,ODW} / (N_{h,ODW} \times N_{v,ODW})$.

Extensive tests with real pavements, but without obstacles, revealed that $r_w > 0.1$ indicates rough surfaces, for example with cobblestones as in Portuguese-style “calçada,” whereas smaller values indicate smooth surfaces. In both cases, vertical and horizontal edge histograms are computed, i.e., for each line and for each column in the ODW the number of “white” pixels is summed. Then, the two histograms are smoothed by applying a simple 1D averaging filter.

Two thresholds, $T_h$ and $T_v$, are computed for the histograms of the columns and lines. For the column histogram, $T_h$ is the ratio between the number of white pixels in the ODW and the number of columns of the window, $T_h = (N_{w,ODW}/N_{h,ODW}) \times K$, and, similarly, $T_v$ is the ratio between the number of white pixels and the number of lines, $T_v = (N_{w,ODW}/N_{v,ODW}) \times K$, where $K$ is a normalization factor.

In the case of a smooth pavement ($r_w < 0.1$), an obstacle will appear as an excess of white points, see Fig. 2, and we apply $K = 0.8$. An obstacle will be detected if at least two neighboring values in both the line and column histograms exceed the thresholds, $T_v$ and $T_h$, respectively.

In the case of a textured pavement ($r_w \geq 0.1$), an obstacle will appear as a lack of white points, see Fig. 3, and we apply $K = 0.4$. Now at least two neighboring values in both histograms must be lower than the corresponding thresholds.
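The histogram test can be sketched as follows. The edge map is assumed to be a list of rows of 0/1 values, the 1D averaging filter is assumed to have length 3, and the function names are hypothetical; the constants 0.1, 0.8 and 0.4 are those given in the text.

```python
def smooth(hist, size=3):
    """Simple 1D moving average (window clipped at the ends)."""
    half = size // 2
    out = []
    for i in range(len(hist)):
        window = hist[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def has_run(flags, length=2):
    """True if at least `length` neighbouring entries are set."""
    run = 0
    for f in flags:
        run = run + 1 if f else 0
        if run >= length:
            return True
    return False

def histogram_obstacle(edges):
    """Apply the excess/deficiency test to a binary ODW edge map."""
    n_v, n_h = len(edges), len(edges[0])
    n_w = sum(map(sum, edges))                  # number of edge pixels
    r_w = n_w / (n_h * n_v)
    rows = smooth([sum(row) for row in edges])
    cols = smooth([sum(col) for col in zip(*edges)])
    if r_w < 0.1:                               # smooth pavement: excess
        k = 0.8
        test = lambda value, t: value > t
    else:                                       # textured pavement: gaps
        k = 0.4
        test = lambda value, t: value < t
    t_cols = n_w / n_h * k
    t_rows = n_w / n_v * k
    return (has_run([test(v, t_rows) for v in rows]) and
            has_run([test(v, t_cols) for v in cols]))
```

The positions at which the row and column tests succeed delimit the approximate bounding box of the obstacle; the sketch returns only the yes/no decision to keep it short.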

3.3 Obstacle Avoidance

In order to detect an obstacle, both methods described

above must detect something, but in order to save


Figure 2: Smooth pavement with a possible obstacle and

corresponding edge histograms.

CPU time the histogram method is only applied to

frames in which the ﬁrst method has detected some-

thing. Once the histogram method has conﬁrmed the

detection, a sound is generated in order to alert the

user. This sound is modulated such that it indicates

the approximate position and the best way to avoid

the obstacle.

The ﬁrst method indicates an obstacle somewhere

in the entire ODW. Often, but not always because

it depends on the textures of the pavement and the

object and the latter’s size, the histogram method

can narrow the approximate position to the object’s

bounding box. Blind people prefer to walk near walls
or the façades of buildings along streets, swaying the
cane in front and keeping contact with the wall or
façade. Since a wall- and façade-detection algorithm

has not yet been implemented, we illustrate walking

close to the centerline of paths and sidewalks in this

paper. This scenario is also quite realistic, and avail-

able information about the path’s borders and the van-

ishing point can be used to inform the user about the

centerline. In addition, having the position and di-

mensions (in pixels) of an obstacle’s bounding box,

and the dimensions (in pixels and in meters) of the

ODW, see Section 2, it is easy to convert the bound-

ing box to meters and provide information about the

approximate position and size of the obstacle. If the

obstacle is not centered on the path, the user can be

informed about the left or right side which has the

largest distance between the bounding box and the

path’s left and right borders. In any case, the user

will check the obstacle using the cane.
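The choice of the side to recommend can be sketched as below. The sketch assumes a single scale factor in meters per pixel obtained from calibration, which ignores perspective within the window; the function name and parameters are hypothetical.

```python
def avoidance_hint(box_left, box_right, path_left, path_right, metres_per_pixel):
    """Suggest the side with more free space next to the obstacle.

    All x positions are in pixels on the same image row.
    Returns (side, obstacle width in meters, free width in meters).
    """
    free_left = box_left - path_left
    free_right = path_right - box_right
    side = "left" if free_left > free_right else "right"
    width_m = (box_right - box_left) * metres_per_pixel
    free_m = max(free_left, free_right) * metres_per_pixel
    return side, width_m, free_m
```

For an obstacle between columns 300 and 400 on a path between columns 100 and 450, the free space is larger on the left, so the left side is suggested.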

It should be stressed that the camera is tightly fixed to the person’s chest such that it points straight forward, and that blind users have been trained not to sway the body much while swaying the cane. The obstacle-avoidance module requires initial calibration in order to obtain correct distances, because these depend on each user’s height and posture. In the future, a disparity module which uses both cameras of the stereo camera will be integrated. This module will complement the obstacle-detection methods and it will also provide calibrated distance information, such that the calibration mentioned above will no longer be required.

Figure 3: Textured pavement with a possible obstacle and corresponding edge histograms.

4 CONCLUSIONS AND RESULTS

Various test sequences have been captured on several

paths and sidewalks of Gambelas Campus at the Uni-

versity of the Algarve, with different pavements and

obstacles. Figure 4 shows one sequence in which each frame is annotated with, apart from the frame number in the upper-left corner, its type: Type 0 frames are the first five calibration frames, which contain no obstacles.

Type 1 frames are those in which the variation of max-

ima and minima in the ODW does not exceed the dy-

namic threshold. Type 2 frames do exceed the thresh-

old and the ﬁrst one activates an alert counter which

counts, from 1 to N, the number of subsequent Type

2 frames. The third alert (N = 3) activates “obstacle

warning” which remains until the ﬁrst new frame of

Type 1 is encountered.
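The frame annotation described above can be sketched as a small counter. This follows the simplified description in this section (the counter is reset at the first Type 1 frame) and does not model the tolerance for a single non-exceeding frame from step (h.2); the function name and the tuple layout are assumptions.

```python
def annotate_frames(maxima, thresholds, n_calibration=5, n_alert=3):
    """Label frames as Type 0/1/2 and flag obstacle warnings.

    maxima[i] and thresholds[i] hold max_i and T_i for frame i (0-based).
    Returns a list of (frame_type, alert_count, warning) tuples.
    """
    labels, count = [], 0
    for i, (m, t) in enumerate(zip(maxima, thresholds)):
        if i < n_calibration:                 # Type 0: calibration frames
            labels.append((0, 0, False))
            continue
        if m > t:                             # Type 2: threshold exceeded
            count += 1
            labels.append((2, count, count >= n_alert))
        else:                                 # Type 1: below threshold
            count = 0
            labels.append((1, 0, False))
    return labels
```

With this rule, the warning is raised at the third consecutive Type 2 frame and stays active until the next Type 1 frame.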

The ﬁrst results are very encouraging. The num-

bers of false positives and negatives of the sequences


tested were quite small, with more false positives than

false negatives. False positives were mainly caused

by tree leaves and litter, whereas false negatives were

mainly cobblestones pushed up a few centimeters by

tree roots, such that the irregularity of the calçada’s

texture is not detectable.

On a portable computer with an Intel Pentium clocked at 1.6 GHz, the elapsed time to process an individual frame is about 0.5 seconds. This is already fast enough for real-time application where the user walks at normal speed. By using a new portable with a multi-core processor, more than two frames per second can be processed, but the disparity module will also consume CPU time (although disparity processing might be limited to the ODW), as will other modules for using GPS in combination with a dedicated GIS for autonomous navigation.

In general, good results were obtained in the de-

tection of the borders of paths and sidewalks, and

many more paths and sidewalks are now being tested.

Difﬁculties mainly arise when the color difference or

contrast of a path’s curbs is small, when the curbs

are partly hidden by plants and long grass, or when

a path has no curbs but is delimited by grass or low

shrubs. Similar problems arise in the case of obsta-

cle detection, when the contrast and the texture of an

obstacle and those of the pavement are too similar.

However, most obstacles can be detected, including missing cobblestones in Portuguese-style “calçada,” the smallest but most frequent problem. Elevated cobblestones which are pushed up by long tree roots cannot yet be detected. Therefore, detection algorithms must be improved, even at the cost of more CPU time.

A speciﬁc problem is the detection of single and

multiple steps, for which no dedicated algorithm has

been included yet. Occasionally, a wrong obstacle de-

tection window is caused by a wrong detection of the

borders. However, normally this happens in a sin-

gle frame and the problem can be solved by keep-

ing the alert counter counting such that at the next

Type 2 frame an obstacle warning will be issued.

This solution, i.e., tracking information over multiple

frames, can be applied in many more cases. For exam-

ple, positions of borders detected in previous frames

can be extrapolated to new frames in order to narrow

the search area and to conﬁrm the new borders, al-

though sudden and unpredictable movements of the

user cannot be excluded unless they are detected by

big changes of the global optical ﬂow of entire frames.

Such aspects require more research because of the

CPU times which are involved. Also being developed is a dynamic adaptation of the parameters of the Canny edge detector as a function of the type of pavement, for resolving finer textures of obstacles like elevated cobblestones, but also for detecting the often minute differences between textures of horizontal and vertical surfaces of steps when the contrast between them is too low.

Figure 4: One test sequence with annotation.

ACKNOWLEDGEMENTS

This work was supported by the Portuguese Foundation for Science and Technology (FCT) through the pluri-annual funding of the Inst. for Systems and Robotics (ISR/IST), the POS Conhecimento Program with FEDER funds, and FCT project SmartVision (PTDC/EIA/73633/2006).

REFERENCES

Heath, M., Sarkar, S., Sanocki, T., and Bowyer, K. (1997).

A robust visual method for assessing the relative per-

formance of edge-detection algorithms. IEEE Trans.

PAMI, 19(12):1338–1359.

Kayama, K., Yairi, I., and Igi, S. (2007). Detection of sidewalk border using camera on low-speed buggy. Proc. IASTED Int. Conf. on Artificial Intelligence and Applications, 13-15 February, Innsbruck, Austria, pages 262–267.

Kim, L., Park, S., Lee, S., and Ha, S. (2009). An electronic

traveler aid for the blind using multiple range sensors.

IEICE Electronics Express, 11(6):794–799.

Lee, S. and Kang, S. (2008). A walking guidance system for the visually impaired. Int. J. Pattern Recogn. Artif. Intell., 22(6):1171–1186.

Rodrigues, J. and du Buf, J. (2009). Multi-scale lines and

edges in V1 and beyond: Brightness, object catego-

rization and recognition, and consciousness. BioSys-

tems, 95:206–226.
