DETECTING LICENSE PLATE USING CLUSTER RUN LENGTH
SMOOTHING ALGORITHM
Siti Norul Huda Sheikh Abdullah, Marzuki Khalid and Rubiyah Yusof
Centre for Artificial Intelligence and Robotics (CAIRO)
Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Jalan Semarak, 54100 Kuala Lumpur
Khairuddin Omar
Jabatan Sains dan Pengurusan Sistem
Fakulti Teknologi Sains Maklumat, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor.
Keywords:
License plate recognition, clustering, run length smoothing algorithm, thresholding.
Abstract:
Vehicle license plate recognition has been intensively studied in many countries. Because different types of license plates are in use, the requirements of an automatic license plate recognition system differ from country to country. In this paper, an automatic license plate recognition system is proposed for Malaysian vehicles with standard license plates, based on image processing, clustering, feature extraction and neural networks. The image processing library, developed in-house, is referred to as the Vision System Development Platform (VSDP). After image enhancement, the image is segmented using blob analysis, horizontal scan line profiles, clustering and a run length smoothing algorithm approach to identify the location of the license plate. Each image is transformed into blob objects, and their important information, such as the total number of blobs, location, height and width, is analyzed in order to form clusters and choose the best cluster with winner blobs. A new algorithm called the Cluster Run Length Smoothing Algorithm (CRLSA) is applied to locate the license plate at the right position. CRLSA consists of two newly proposed algorithms: a new edge detector using 3x3 kernel masks and a 128 grayscale offset, and a new way (3D method) of calculating the run length smoothing algorithm (RLSA), which can improve clustering techniques in the segmentation phase. Two separate experiments were performed: Cluster and Threshold value 130 (CT130), and CRLSA with Threshold value 1 (CCT1). The prototype system has an accuracy of more than 96%, and suggestions to further improve the system, based on an analysis of the errors, are discussed in this paper.
1 INTRODUCTION
Automatic license plate recognition (LPR) is an important area of research due to its many applications. Local authorities require license plate recognition for enforcement, border protection, vehicle theft, automatic toll collection and perhaps traffic control. Among the commercial license plate recognition systems available worldwide are Car Plate Reader (CPR) by Rafael et al. (Barroso et al., 1997) and Automatic Number Plate Recognition (ANPR) by Chang et al. (Chang et al., 2004). In Malaysia, vehicle license plates are in single-line or double-line form with normal fonts, which account for perhaps 95% of all vehicles. Most pictures were taken in various Malaysian states such as Sabah, Wilayah Persekutuan, Johor, Selangor, Perak, Negeri Sembilan, Pahang and Terengganu. There are also special fonts, as depicted in Figure 1.
This dedicated LPR software covers at least five major processes in sequence: Capturing, Pre-Processing, Segmentation, Feature Extraction and Classification. However, this paper concentrates only on license plate detection, which covers image enhancement and segmentation.
Figure 1: (a) Samples of common Malaysian license plates. (b) Samples of special Malaysian license plates.
This paper is organized as follows. Section 2 discusses image segmentation, while Section 3 discusses the clustering technique. This clustering technique is enhanced by applying RGB convolution with a new edge detector and a 128 grayscale offset, and the Run Length Smoothing Algorithm approach; both are explained in Section 4. The experiments are discussed and conclusions are drawn in Section 5.
Norul Huda Sheikh Abdullah S., Khalid M., Yusof R. and Omar K. (2006).
DETECTING LICENSE PLATE USING CLUSTER RUN LENGTH SMOOTHING ALGORITHM.
In Proceedings of the Third International Conference on Informatics in Control, Automation and Robotics, pages 175-178
DOI: 10.5220/0001198501750178
Copyright © SciTePress
2 IMAGE SEGMENTATION
Image segmentation is a process that separates words into single characters for easy identification (Al-Badr and Mahmoud, 1995). In this project, segmentation separates a collection of filtered characters into a sequence of characters that will be used in the feature extraction stage. This step is very significant because the characters that form the license plate may overlap. At present, LPSeeker applies a clustering technique to identify important blobs. Before that, the image is processed using simple image enhancement techniques such as the Fixed Filter, Minimum Filter, Median Filter and Homomorphic Filter, which are provided in the VSDP (Vision System Development Platform) library. VSDP is a library developed by researchers at CAIRO, UTMKL.
3 CLUSTERING TECHNIQUE
After the above image enhancement, the image is segmented using horizontal scan line profiles and a clustering technique. Each image is transformed into blob objects, and their important information, such as location, height and width, is analyzed by LPSeeker in order to form clusters and choose the best cluster with winner blobs. A blob is added to a cluster when both the difference between the blob and cluster heights and the difference between the maximum Y values of the cluster and the blob are less than a constant times the cluster's height, as stated in the clustering algorithm; the quantities involved are depicted in Figure 2. The clusters are then sorted by their number of members. Starting from the cluster with the most members, each cluster is checked using RGB convolution with a new edge detector and a 128 grayscale offset, the run length smoothing algorithm, and/or thresholding with value 1 and horizontal and vertical projection. These processes are described in the following section. If the check succeeds, the cluster is chosen as the first winner cluster. The search then looks for any other nearby cluster with the same height as the first winner cluster; if one exists, it is set as the second winner cluster. Lastly, features are extracted from each winner blob individually before it is passed to the recognition or classification phase.
[Figure 2 diagram: blobs numbered 11 to 16; cluster C5 and blob B11 are each annotated with width W, height H, and corner coordinates (minX, minY) and (maxX, maxY).]
Figure 2: Important information for clustering approach.
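The clustering rule described in this section can be sketched as follows. This is a minimal illustration, not the LPSeeker implementation: the constant `k`, the class and function names, the left-to-right processing order and the choice to take a cluster's height and maximum Y from the bounding box of its members are all assumptions, since the text does not specify them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Blob:
    min_x: int
    min_y: int
    max_x: int
    max_y: int

    @property
    def height(self) -> int:
        return self.max_y - self.min_y + 1


@dataclass
class Cluster:
    blobs: List[Blob] = field(default_factory=list)

    @property
    def max_y(self) -> int:
        return max(b.max_y for b in self.blobs)

    @property
    def height(self) -> int:
        # Assumption: cluster height is the vertical extent of its members.
        return self.max_y - min(b.min_y for b in self.blobs) + 1


def belongs(blob: Blob, cluster: Cluster, k: float = 0.5) -> bool:
    """Both differences must be below k times the cluster height (k is hypothetical)."""
    limit = k * cluster.height
    return (abs(blob.height - cluster.height) < limit
            and abs(cluster.max_y - blob.max_y) < limit)


def cluster_blobs(blobs: List[Blob], k: float = 0.5) -> List[Cluster]:
    clusters: List[Cluster] = []
    for blob in sorted(blobs, key=lambda b: b.min_x):  # assumed scan order
        for cluster in clusters:
            if belongs(blob, cluster, k):
                cluster.blobs.append(blob)
                break
        else:
            clusters.append(Cluster([blob]))
    # Clusters with the most members are examined first.
    clusters.sort(key=lambda c: len(c.blobs), reverse=True)
    return clusters


# Three character-sized blobs on one baseline and one large background blob.
chars = [Blob(10, 31, 20, 50), Blob(30, 30, 40, 50), Blob(50, 31, 60, 50)]
background = Blob(0, 0, 200, 99)
result = cluster_blobs(chars + [background])
print([len(c.blobs) for c in result])
```

In this toy input the three character blobs share a cluster because their heights and bottom edges nearly coincide, while the large blob stays alone.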
4 RGB CONVOLUTION AND RUN
LENGTH SMOOTHING
APPROACHES
This section consists of two subsections: RGB Convolution and Thresholding, and the Run Length Smoothing Approach (RLSA). The first covers the process of applying the new edge detector to an image and then thresholding the resulting image with value 1. The second explains RLSA, which involves interpreting both resulting images (the RGB-convolved and thresholded images).
4.1 RGB Convolution and
Thresholding Approach
RGB convolution is an approach that transforms an RGB pixel value into a grayscale value. We introduce a new edge detector with a 128 grayscale offset and thresholding. The 3x3 kernel mask is shown in equation (1). For every pixel (in RGB format) in the image, the pixel values under the mask are multiplied by the kernel of equation (1) and summed to give RGBsum. At the same time Ksum, the summation of the 3x3 kernel mask values kernelV(x,y), where x and y run from 0 to 2, is calculated. RGBsum is then divided by Ksum and added to a grayscale offset value, b. Finally, RGBsum is converted back into a grayscale value. This process is called RGB convolution. Together, the new grayscale values transform the original image in Figure 3(a) into a new black-gray-white image, as illustrated in Figure 3(b). The convolved image is placed in a buffer and undergoes thresholding with value 1. The final black-and-white image is shown in Figure 3(c).
\[
128 + \frac{1}{11}
\begin{bmatrix}
3 & 0 & 3 \\
5 & 0 & 5 \\
3 & 0 & 3
\end{bmatrix}
\tag{1}
\]
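The convolution and thresholding steps can be sketched as below. This is a minimal illustration under stated assumptions, not the VSDP routine: the kernel entries and the factor 1/11 are taken as printed in equation (1), each RGB pixel is reduced to one intensity by averaging its channels, border pixels are simply set to the offset, and a pixel is treated as white when its value is at least the threshold. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]

# Kernel entries exactly as printed in equation (1).
KERNEL = [[3, 0, 3],
          [5, 0, 5],
          [3, 0, 3]]


def rgb_convolve(image: Sequence[Sequence[RGB]],
                 kernel: Sequence[Sequence[int]] = KERNEL,
                 divisor: int = 11,
                 offset: int = 128) -> List[List[int]]:
    """Return a grayscale image: offset + (kernel response / divisor), clipped to 0..255."""
    height, width = len(image), len(image[0])
    # Assumption: intensity of an RGB pixel is the mean of its three channels.
    gray = [[sum(pixel) // 3 for pixel in row] for row in image]
    # Assumption: border pixels, where the 3x3 mask does not fit, keep the offset.
    out = [[offset] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * gray[y + ky - 1][x + kx - 1]
            value = offset + round(acc / divisor)
            out[y][x] = max(0, min(255, value))
    return out


def threshold(gray: Sequence[Sequence[int]], t: int = 1) -> List[List[int]]:
    """Binarize: values >= t become white (255), everything else black (0)."""
    return [[255 if v >= t else 0 for v in row] for row in gray]


flat = [[(10, 10, 10)] * 3 for _ in range(3)]
convolved = rgb_convolve(flat)
print(convolved[1][1])
print(threshold([[0, 1, 200]]))
```

The `divisor` is left as a parameter because the text describes dividing by the kernel sum Ksum while equation (1) prints the factor 1/11.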
4.2 Run Length Smoothing
Approach
The Run Length Smoothing Algorithm (RLSA) has been widely used in optical character recognition (OCR), especially in document analysis systems (Wong et al., 1982) (Fisher et al., 1990). The earlier method developed for the Document Analysis System (Nagy, 1968) consists of two steps. First, a segmentation procedure subdivides the area of a document into regions (blocks), each of which should contain only one type of data (text, graphic, halftone image, etc.); then some basic features of these blocks are calculated. A linear classifier, which adapts itself to varying character heights, discriminates between text and images. However, this technique is usually applied
only after horizontal or vertical projection, to recognize the block segmentation.
After the chosen cluster is processed using RGB convolution with the new edge detector and 128 grayscale offset value, RLSA and/or horizontal and vertical projection are applied to distinguish between non-character and character clusters. From the resulting images in Figure 3(a), (b) and (c), character and non-character images were analyzed, and three-dimensional horizontal projection rules were constructed to differentiate between non-character and character clusters.
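The text does not define horizontal and vertical projection. Under the usual definition, a projection profile counts the foreground pixels in each row or each column of a binary image. The sketch below assumes that definition and a binary image stored as rows of 0/1 values; the function names are illustrative, and naming the row profile "horizontal" is itself a convention that varies between authors.

```python
from typing import List, Sequence


def horizontal_projection(binary: Sequence[Sequence[int]]) -> List[int]:
    """Number of foreground (non-zero) pixels in each row."""
    return [sum(1 for v in row if v) for row in binary]


def vertical_projection(binary: Sequence[Sequence[int]]) -> List[int]:
    """Number of foreground (non-zero) pixels in each column."""
    return [sum(1 for v in column if v) for column in zip(*binary)]


sample = [[0, 1, 1],
          [0, 0, 1]]
print(horizontal_projection(sample))
print(vertical_projection(sample))
```

Gaps (zeros) in the column profile are commonly used to separate characters, and the row profile to bound the text line.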
A run length in Figure 3(b) is stored only if black, gray and white pixels occur consecutively. For example, the gray values of an image line are transformed into a series of pixel labels, where b = black, g = gray and w = white, as shown in Figure 4(a). The run lengths of each black, gray and white (b-g-w) run are then calculated (Figure 4(b)). Next, only the total run lengths of consecutive b-g-w pixels are stored for each line. Lastly, the average run length per line, aveRLwhole, the number of run lengths per line, aveCountRLwhole, and the ratio of the average run length per line to the number of run lengths per line, ratioRL, are computed (Figure 4(c)). This information was used at the outset to construct a set of decision threshold rules for recognizing character and non-character clusters.
Figure 3: (a) An image of a detected license plate. (b) Image analysis using RLSA with a new edge detector and 128 offset value. (c) Image after applying RLSA and thresholding with value 1.
(a)
wggwwwbbbggwwwbbbggwwwggggggggggggbbbbgwwwwggggbbbbbbggggwwwwwwggggggg
ggggggggggggggbbbgggggggggggggwwwwgggggbbbbbbgggwwwwwwgggggggbbbbbbbgggww
wwwwggggggggggggggbbbbbggwwb
(b)
1w2g3w3b2g3w3b2g3w12g4b1g4w4g6b4g6w21g3b13g4w5g6b3g6w7g7b3g6w14g5b2g2w1b
(c)
R:8 X:13 Y:0   R:8 X:21 Y:0   R:9 X:42 Y:0   R:16 X:62 Y:0   R:20 X:103 Y:0   R:15 X:123 Y:0   R:16 X:146 Y:0
AveRLperline: 13.14   NoOfRLperline: 7   Ratio: 1.88
Figure 4: Example of (a) a line of image pixels, (b) run lengths for each black, gray and white run, and (c) run lengths stored only for consecutive black-gray-white pixels.
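The per-line statistics can be sketched as follows. This is an illustrative reading of the description, not the authors' code: a stored run is taken to be a black run immediately followed by a gray run and a white run, its length R is the total of the three, and X is the index of its last pixel. The function names are hypothetical. The example uses only the first 22 pixels of the string in Figure 4(a), for which this reading reproduces the first two records of Figure 4(c); the source does not state every condition under which a run is stored, so no claim is made for the full line.

```python
from itertools import groupby
from typing import List, Tuple


def run_lengths(pixels: str) -> List[Tuple[str, int]]:
    """Run-length encode a string of 'b', 'g', 'w' labels."""
    return [(label, len(list(group))) for label, group in groupby(pixels)]


def bgw_runs(pixels: str) -> List[Tuple[int, int]]:
    """Return (R, X) for each black-gray-white sequence: total length and last index."""
    runs = run_lengths(pixels)
    found = []
    i, pos = 0, 0  # pos is the start index of runs[i] in the pixel string
    while i < len(runs):
        labels = "".join(label for label, _ in runs[i:i + 3])
        if labels == "bgw":
            total = sum(n for _, n in runs[i:i + 3])
            found.append((total, pos + total - 1))
            pos += total
            i += 3
        else:
            pos += runs[i][1]
            i += 1
    return found


def line_stats(pixels: str) -> Tuple[float, int, float]:
    """Average run length, number of runs, and their ratio for one line."""
    found = bgw_runs(pixels)
    if not found:
        return 0.0, 0, 0.0
    average = sum(r for r, _ in found) / len(found)
    return average, len(found), average / len(found)


prefix = "wggwwwbbbggwwwbbbggwww"  # first 22 pixels of Figure 4(a)
print(run_lengths(prefix)[:4])
print(bgw_runs(prefix))
print(line_stats(prefix))
```

For the complete line in Figure 4(c), the same ratio is the reported average 13.14 divided by the reported count 7, which gives 1.88.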
Figure 5: Example of correct license plate detection.
[Plate numbers shown in the panels: ACU3992, ADM9585, CBD2232, JCW4898, BBF1707.]
Figure 6: Types of detection errors: (a) Extra, (b) Miss1, (c) Miss2, (d) Miss>2, (e) Fail.
Table 1: Type of errors.
Type                  Description
Pass                  Found license plate
Miss1                 Miss 1 character
Miss2                 Miss 2 characters
Miss>2                Miss more than 2 characters
Extra                 More than actual characters
Fail                  Cannot locate license plate
Correct detection     License plate located correctly: the sum of Pass, Miss1, Miss2, Miss>2 and Extra
Incorrect detection   License plate located incorrectly: the number of Fail errors
5 DISCUSSION AND
CONCLUSION
Two different experiments were run: Clustering with Threshold value 130 (CT130), and Clustering with RGB convolution and Threshold value 1 (CCT1). Errors were analyzed by type: Pass, Miss1, Miss2, Miss>2, Extra or Fail. Each type is explained in Table 1. The correct detection rate, or total correct, is the sum of the numbers of Pass, Miss1, Miss2, Miss>2 and Extra results divided by the total number of images. Examples of error images are illustrated in Figure 6(a), (b), (c) and (d). The incorrect detection rate, on the other hand, is the number of Fail results divided by the total number of images.
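As a worked check of these two definitions, the short sketch below recomputes both rates from the counts reported in Table 2. The dictionary layout and function name are illustrative; the key "Miss>2" stands for missing more than two characters.

```python
from typing import Dict, Tuple

# Counts per result type, as reported in Table 2.
COUNTS: Dict[str, Dict[str, int]] = {
    "CT130": {"Pass": 947, "Miss1": 32, "Miss2": 47, "Miss>2": 55, "Extra": 82, "Fail": 72},
    "CCT1": {"Pass": 954, "Miss1": 193, "Miss2": 16, "Miss>2": 16, "Extra": 12, "Fail": 44},
}


def detection_rates(counts: Dict[str, int]) -> Tuple[float, float]:
    """Return (correct %, incorrect %): everything except Fail counts as correct."""
    total = sum(counts.values())
    incorrect = counts["Fail"]
    correct = total - incorrect
    return 100 * correct / total, 100 * incorrect / total


for name, counts in COUNTS.items():
    correct, incorrect = detection_rates(counts)
    print(f"{name}: total={sum(counts.values())} "
          f"correct={correct:.2f}% incorrect={incorrect:.2f}%")
```

Both experiments sum to 1235 images, and the computed rates agree with the 94.17% and 96.44% correct detection rates quoted for CT130 and CCT1.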
From Table 2, CCT1 obtained the higher pass rate (77.25%, compared with 76.68% for CT130). The numbers of Miss1, Miss2, Miss>2 and Extra errors were distributed more evenly in CT130 than in CCT1. CCT1 dramatically reduced the number of failed detections to 44, or 3.56%, whereas CT130 produced 72 failures, or 5.83%. CT130, which applies a threshold value of 130 before clustering, decreases the number of blobs to be clustered in the whole image. As a result, CT130 may discard important blobs between letters and cause a high number of missing errors: 32, 47 and 55 for Miss1, Miss2 and Miss>2 respectively, in addition to 82 Extra errors. Another point to highlight is that in CT130 missing blobs normally occur between characters, when two or more blobs are sometimes connected. In CCT1, by contrast, missing blobs occur only at the beginning or end of the license plate.
We can also highlight that the total of Pass, Miss1 and Miss2 results for CCT1 (954 + 193 + 16 = 1163) was higher than for CT130 (947 + 32 + 47 = 1026). This indicates that the RGB convolution and threshold method is significant in choosing the appropriate winner cluster. Additionally, CCT1 obtained the highest license plate detection rate, 96.44%, while CT130 came second with 94.17%. It can be concluded that combining RGB convolution with thresholding at value 1 can boost license plate detection up to 96.44%.
From the above results, several advantages and disadvantages of the proposed approach, RGB convolution with a new edge detector and 128 grayscale offset, were noted. The advantages are:
i. Even when the original image of the back or front of the car has fusion problems, CCT1 can still successfully detect the location of the license plate, as shown in Figure 7.
ii. CCT1 can increase the pass rate.
iii. CCT1 can reduce the number of Extra errors (12, compared with 82 for CT130 in Table 2).
iv. CCT1 can reduce the number of Miss2 and Miss>2 errors, that is, missing blobs in the winner cluster(s). Moreover, the blobs that are missed are normally at the beginning or end of the plate.
The disadvantages are:
i. The pass rate for CCT1 increased only slightly compared with CT130. This is because CCT1 considers all blobs in the image, whereas CT130 does not.
ii. RGB convolution with a new edge detector is very time consuming, because computing the new grayscale output requires every pixel of the original image to be analyzed.
iii. RGB convolution requires a large amount of storage, and memory leaks occurred quite often when using it.
In conclusion, the detection rate of CCT1 could be boosted further through the suggestions below:
i. Instead of the fixed thresholding used in CT130, adaptive zoning thresholding could also help CCT1 improve its detection rate.
ii. RGB convolution could apply other edge detectors with the 128 grayscale offset.
iii. Before applying RGB convolution and horizontal and vertical projections, checking the cluster's original image using binary projections might increase overall performance.
iv. Incorporating an uncertainty value when calculating the maximum number of blobs and the run length of each cluster may increase LPSeeker's performance.
This paper has discussed the concept of license plate recognition and segmentation techniques covering clustering, RGB convolution and the Run Length Smoothing Algorithm. We conclude that the combination of RGB convolution, a new edge detector with 128 grayscale offset and the Run Length Smoothing Algorithm has significantly raised the detection rate of the license plate's location in the segmentation phase.
Figure 7: Fusion images that have been successfully detected by LPSeeker.
Table 2: Detection rate for two experiments: Cluster with Threshold Value 130 (CT130) and Cluster with RGB Convolution and Threshold Value 1 (CCT1).
Error type        CT130                   CCT1
                  Total   Percentage      Total   Percentage
Pass              947     76.68%          954     77.25%
Miss1             32      2.59%           193     15.63%
Miss2             47      3.81%           16      1.30%
Miss>2            55      4.45%           16      1.30%
Extra             82      6.64%           12      0.97%
Fail              72      5.83%           44      3.56%
Total             1235    100%            1235    100%
Total correct     1163    94.17%          1191    96.44%
Total incorrect   72      5.83%           44      3.56%
Total             1235    100%            1235    100%
REFERENCES
Al-Badr, B. and Mahmoud, S. A. (1995). Survey and bibliography of Arabic optical text recognition. Signal Processing, 41:49–77.
Barroso, J., Dagless, E., Rafael, A., and Bulas-Cruz, J. (1997). Number plate reading using computer vision. In Proceedings of the IEEE International Symposium on Industrial Electronics, volume 3, pages 761–766.
Chang, S.-L., Chen, L.-S., Chung, Y.-C., and Chen, S.-W. (2004). Automatic license plate recognition. IEEE Transactions on Intelligent Transportation Systems, 5:42–53.
Fisher, J. L., Hinds, S. C., and D'Amato, D. P. (1990). A rule-based system for document image segmentation. In Proceedings of the 10th International Conference on Pattern Recognition, volume 1, pages 567–572.
Nagy, G. (1968). Preliminary investigation of techniques for automated reading of unformatted text. Communications of the ACM, 11:480–487.
Wong, K., Casey, R., and Wahl, F. (1982). Document analysis system. IBM Journal of Research and Development, 26(6):647–657.
ICINCO 2006 - INTELLIGENT CONTROL SYSTEMS AND OPTIMIZATION