with the addition of new traffic signs in the training
data and using the existing trained network after
remodeling its output layer. The CNN takes the
preprocessed images as input and outputs numeric
values that represent the probability of each
class.
2 RELATED WORK
Within the research field of computer vision, various
traffic sign recognition systems have been analyzed.
Each is designed for a particular application, with a
suitable pipeline of algorithms. Several systems
use the color representation of the signs to
segment the Region of Interest.
For instance, Shadeed et al. segmented red colors
using the (positive) U and (negative) V channels of
the YUV system. The hue channel of the HSV system
was used together with them to segment red road
signs.
In Greenhalgh (2012), the authors used MSERs
(maximally stable extremal regions) at different
thresholds in the detection stage to find the
connected components that maintained their shape.
Grayscale images were used for signs with a white
background, whereas RGB images were transformed
into a “normalized red-blue” image for the other
signs. Since traffic signs are usually straight and
upright, the HOG algorithm was used to extract the
features, which were passed to a cascade of SVMs for
classification. The accuracy was 89.2% for white
road signs and 92.1% for color signs.
SVMs give better results when the distribution
of images across the classes is uneven, and they
avoid overfitting, according to Jo, K.H. (2014).
In Lai, Y. (2018), the authors proposed a combined
Convolutional Neural Network and Support Vector
Machine (CNN-SVM) method for recognition and
classification. The CNN takes images in the YCbCr
color space as input, which separates the color
channels, and extracts distinct characteristics.
Classification is then performed by the SVM. This
approach achieved an accuracy of 98.6%.
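
To make the color-space step concrete, the sketch below
converts an RGB image to YCbCr with NumPy. The summary
above does not say which YCbCr variant the cited work
used, so the full-range BT.601 (JPEG-style) coefficients
are an assumption, and the function name rgb_to_ycbcr and
the sample pixels are hypothetical.

```python
import numpy as np


def rgb_to_ycbcr(image):
    """Convert an H x W x 3 uint8 RGB image to full-range BT.601 YCbCr.

    Assumption: JPEG-style coefficients; the cited work may differ.
    """
    rgb = image.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

    # Luma (Y) carries brightness; Cb and Cr carry the color differences.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b

    ycbcr = np.stack([y, cb, cr], axis=-1)
    # Round and clip back into the 8-bit range.
    return np.clip(np.rint(ycbcr), 0, 255).astype(np.uint8)


# Hypothetical 1 x 3 image: pure red, mid-gray, and white pixels.
sample = np.array(
    [[[255, 0, 0], [128, 128, 128], [255, 255, 255]]], dtype=np.uint8
)
converted = rgb_to_ycbcr(sample)
print(converted)
```

Gray and white pixels map to neutral chroma (Cb = Cr = 128),
while the red pixel gets a high Cr value, which is why
this representation can help isolate red signs.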
In contrast, several methods use only shape
information from grayscale images and entirely
ignore color information. In Lorsakul, A. (2017),
the authors converted the RGB images to grayscale,
applied a Gaussian filter to smooth the image, and
applied Canny edge detection to enhance the edges
before converting the image to binary. The
processing time per image frame was found to be
satisfactory for use in a real application.
A survey of many proposed CNN-based methods was
carried out by the authors in Hatolkar, Y. (2018),
showing the challenges the CNN technique faces in
terms of time complexity and accuracy. Canny edge
detection was also proposed to solve these problems
and outline the edges of the traffic symbols, which
can then be classified using a CNN.
3 TRAINING SET
The training and validation of the CNN are carried
out with the German Traffic Sign Recognition
Benchmark (GTSRB). Fig. 1 shows a few examples
of the data set images.
The data set is partitioned into a training set, a
testing set, and a validation set. The total number
of images is 51,839. The first partition (67.12%)
consists of 34,799 images of size 32 x 32 in RGB
format. The data set has 43 different classes;
however, the distribution is not even. For instance,
the 0th class, the “20 km/h speed limit sign”, has
180 sample images, which is the lowest. By contrast,
the 2nd class, the “50 km/h speed limit sign”, has
the greatest number of samples, 2,010. On average,
the training set has 809 samples per class. The
second partition consists of 12,630 images, which
make up 24.4%. Here, too, the images are not evenly
distributed among the classes. The third partition
consists of 4,410 images (8.5%), again not evenly
distributed.
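
The figures above can be cross-checked with a few lines
of arithmetic. The sketch below is only a sanity check on
the reported numbers. The split names follow the order in
which the partitions are listed in the text, which is an
assumption, and the variable names are illustrative.

```python
# Split sizes as reported in the text; names assume the listed order.
split_sizes = {"training": 34799, "testing": 12630, "validation": 4410}
num_classes = 43

# Smallest and largest training classes reported in the text.
smallest_class, largest_class = 180, 2010

total = sum(split_sizes.values())

# Share of each split as a percentage of the whole data set.
shares = {name: 100.0 * n / total for name, n in split_sizes.items()}

# Mean number of training images per class.
mean_per_class = split_sizes["training"] / num_classes

# How many times larger the biggest class is than the smallest one.
imbalance_ratio = largest_class / smallest_class

for name, n in split_sizes.items():
    print(f"{name:<10} {n:>6} images  {shares[name]:4.1f}%")
print(f"{'total':<10} {total:>6} images")
print(f"mean training samples per class: {mean_per_class:.0f}")
print(f"largest / smallest class: {imbalance_ratio:.1f}x")
```

The ratio between the largest and smallest classes gives
a quick measure of how uneven the class distribution is.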
4 TRAFFIC SIGNS RECOGNITION SYSTEM MODEL
4.1 Traffic Images Preprocessing
This stage is essential to the accuracy and
performance of the model. Thus, prior to training
or testing, the images need to be preprocessed.
The preprocessing steps were chosen by trial and
error to find the best order that reduces
computation. The following methods were tried:
Shuffling the training data: This was an important
step to prevent the model from memorizing the
pattern of the images and getting stuck in local
minima, instead of deciding based on features.
Normalization of RGB image: This scales each
RGB pixel value from the 0-255 range to the
0.1-0.9 range.
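
The shuffling and normalization steps described above can
be sketched with NumPy as follows. This is a minimal
illustration rather than the implementation used here:
the function names, the fixed seed, and the random
stand-in batch are hypothetical, and a linear mapping of
0-255 onto 0.1-0.9 is assumed.

```python
import numpy as np


def shuffle_in_unison(images, labels, seed=0):
    """Reorder images and labels with one shared random permutation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    return images[order], labels[order]


def normalize_rgb(images, low=0.1, high=0.9):
    """Linearly map uint8 pixel values 0..255 onto the range low..high."""
    scaled = images.astype(np.float32) / 255.0
    return low + scaled * (high - low)


# Hypothetical stand-in batch: 8 random 32 x 32 RGB images.
rng = np.random.default_rng(42)
X_train = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
y_train = np.arange(8)  # label i belongs to image i

# Pin the extreme pixel values so both ends of the range are exercised.
X_train[0, 0, 0, 0] = 0
X_train[0, 0, 0, 1] = 255

X_shuffled, y_shuffled = shuffle_in_unison(X_train, y_train)
X_norm = normalize_rgb(X_shuffled)

print(y_shuffled)
print(X_norm.min(), X_norm.max())
```

Using a single permutation for both arrays keeps every
image paired with its own label after shuffling.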