• The examples are representative of the pedestrian
class in terms of variability, illumination condi-
tions, position and size in the image.
Example-based techniques have been previously
used in natural, cluttered environments for pedestrian
detection (Shashua et al., 2004) (Gavrila et al., 2004).
In general, these techniques are easy to use with ob-
jects composed of distinct identifiable parts arranged
in a well-defined configuration. A distributed learn-
ing approach based on components (Mohan et al.,
2001) is more efficient for object recognition in real
cluttered environments than holistic approaches (Pa-
pageorgiou and Poggio, 2000). Distributed learning
techniques can deal with partial occlusions and are
less sensitive to object rotations. However, despite
their ability to detect objects in real images, we
propose to reduce the pedestrian search space in
an intelligent manner, based on the road image, so
as to increase the performance of the detection module.
Accordingly, road lane markings are detected and
used as the guidelines that drive the pedestrian search
process. The area bounded by the lane limits
determines the zone of the real 3D scene where
pedestrians are searched for. The objects found in the
search area are passed on to the pedestrian recognition
module. This helps reduce the rate of false positive
detections. If no lane markings are detected,
a basic area of interest covering the area directly
ahead of the ego-vehicle is used instead. The description
of the lane marking detection system is provided
in (Sotelo et al., 2005). The rest of the paper is
organised as follows: Section 2 provides a description
of the candidate selection mechanism. Section 3
describes the pedestrian recognition system. The results
achieved to date are presented in Section 4. Finally,
Section 5 summarizes the conclusions and future
work.
2 CANDIDATE SELECTION
We have developed a calibrated stereo platform and
calculated the intrinsic parameters for each camera,
and the extrinsic parameters between them, in order
to obtain the fundamental matrix that defines the system's
epipolar geometry. This way, a perfect physical alignment
between the cameras, which the assumption of parallel
epipolar lines would require, is not necessary,
because the stereo calibration process mathematically
defines the geometric relationships between the cameras
(Xu and Zhang, 1996).
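As an illustration of how calibration data determine the epipolar geometry, the following minimal sketch builds a fundamental matrix from the intrinsic and extrinsic parameters. It is not the implementation used in this work: the function names, the NumPy dependency and the convention X_right = R X_left + t are assumptions made for the example.

```python
import numpy as np


def skew(t):
    """Return the 3x3 matrix [t]_x such that skew(t) @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


def fundamental_from_calibration(K_left, K_right, R, t):
    """Fundamental matrix from intrinsics (K_left, K_right) and extrinsics (R, t).

    Assumed convention (not stated in the paper): a 3D point expressed in
    left-camera coordinates maps to right-camera coordinates as
    X_right = R @ X_left + t.
    The result satisfies x_right^T F x_left = 0 for homogeneous pixel points.
    """
    E = skew(t) @ R  # essential matrix for the convention above
    return np.linalg.inv(K_right).T @ E @ np.linalg.inv(K_left)
```

With F computed this way, the product F @ x_left gives the coefficients (a, b, c) of the epipolar line in the right image, whatever the relative orientation of the two cameras.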
The first task is image preprocessing, which has two
steps: normalizing intensity values, to correct for
differences between the two images, and eliminating
radial and tangential distortion. We then apply the
Canny algorithm for feature extraction on the left image.
The Canny image provides a good representation
of the discriminating features of pedestrians, as
Figure 1 shows. Features such as heads, arms and legs
are visible and distinguishable and are not affected
by colour or intensity. This gives us some indication
of the discriminating zones for the pedestrian recognition
system.
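The intensity normalization step is not detailed here. One simple possibility, given only as a sketch, is a linear gain and offset correction that makes the mean and standard deviation of one image match those of the other. The function name `match_intensity`, the 8-bit grey-level range and the choice of a linear model are assumptions of this example.

```python
import numpy as np


def match_intensity(reference, target):
    """Linearly rescale `target` so its mean and std match those of `reference`.

    Both inputs are 2D grey-level arrays; the output is clipped to the
    8-bit range [0, 255] (an assumption of this sketch).
    """
    ref = np.asarray(reference, dtype=np.float64)
    tgt = np.asarray(target, dtype=np.float64)
    tgt_std = tgt.std()
    if tgt_std == 0.0:
        # A flat image has no contrast to rescale: only the mean can be matched.
        out = np.full_like(tgt, ref.mean())
    else:
        out = (tgt - tgt.mean()) * (ref.std() / tgt_std) + ref.mean()
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

For example, `right_corrected = match_intensity(left, right)` would bring the right image to the brightness and contrast of the left image before matching.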
Figure 1: Examples of Canny images. Upper row:
pedestrian examples. Bottom row: non-pedestrian
examples.
In order to extract 3D scene information, some authors
use disparity map techniques combined with
v-disparity segmentation (Grubb et al., 2004)
(Labayrade et al., 2003). This option was discarded
because of the disadvantages associated with disparity
computation algorithms: prior to disparity map generation,
the image pair has to be rectified to ensure good
correspondence matching. In addition, the information
used for generic obstacle detection is reduced to a
vertical line in the v-disparity image.
This implies managing very little information to detect
obstacles, which works for large objects such
as vehicles but may not be enough for smaller objects
such as pedestrians. Instead, our approach solves the
correspondence problem and then creates a map of 3D
points whose origin is placed at the left camera.
For each point detected by Canny, we use the fundamental
matrix to search for the corresponding point in the
other image along its epipolar line (fixing the maximum
distance between corresponding points in order
to reduce the cost of matching).
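The bounded search along the epipolar line can be sketched as follows. This is an illustrative example rather than the implementation used in this work: the function names, the window half-size `half`, the bound `max_dist`, and the use of zero-mean normalized cross-correlation (ZMNCC) as the window score are assumptions, and F is assumed to satisfy x_right^T F x_left = 0.

```python
import numpy as np


def zmncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-size windows."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # A textureless window has zero variance; treat it as uncorrelated.
    return 0.0 if denom == 0.0 else float((a * b).sum() / denom)


def match_on_epipolar_line(left, right, F, u, v, half=3, max_dist=40):
    """Best right-image match for the left-image pixel (u, v).

    F @ (u, v, 1) gives the epipolar line a*u' + b*v' + c = 0 in the right
    image. Candidate columns are limited to |u' - u| <= max_dist.
    Returns (u', v', score), or None if no candidate can be evaluated.
    """
    h, w = right.shape
    # The reference window must lie fully inside the left image.
    if not (half <= u < left.shape[1] - half and half <= v < left.shape[0] - half):
        return None
    a, b, c = F @ np.array([u, v, 1.0])
    if abs(b) < 1e-12:
        return None  # near-vertical epipolar line: not handled in this sketch
    ref = left[v - half:v + half + 1, u - half:u + half + 1]
    best = None
    for uc in range(max(half, u - max_dist), min(w - half, u + max_dist + 1)):
        vc = int(round(-(a * uc + c) / b))  # row of the line at column uc
        if vc < half or vc >= h - half:
            continue
        cand = right[vc - half:vc + half + 1, uc - half:uc + half + 1]
        score = zmncc(ref, cand)
        if best is None or score > best[2]:
            best = (uc, vc, score)
    return best
```

Restricting the candidate columns with `max_dist` plays the role of the maximum distance between corresponding points mentioned above: it bounds the number of windows that have to be correlated for each edge point.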
The correspondence problem can be solved using
a wide spectrum of matching techniques, but most
recent successes have come from area-based algorithms.
Specifically, the Zero Mean Normalized Cross Correlation
(ZMNCC) has performed most robustly (Boufama, 1994).
For a given point in the left image, this algorithm seeks
the point in the right image with the largest correlation
response, taking into account the relevance of
the window size. As the window size decreases, the
discriminatory power of the area-based criterion
decreases and several local maxima in ZMNCC may
appear in the search region. Moreover, continually
increasing the window size causes the perfor-
PEDESTRIAN RECOGNITION FOR INTELLIGENT TRANSPORTATION SYSTEMS