DETECTING LICENSE PLATE USING CLUSTER RUN LENGTH SMOOTHING ALGORITHM

Siti Norul Huda Sheikh Abdullah, Marzuki Khalid and Rubiyah Yusof

Centre for Artificial Intelligence and Robotics (CAIRO)

Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Jalan Semarak, 54100 Kuala Lumpur

Khairuddin Omar

Jabatan Sains dan Pengurusan Sistem

Fakulti Teknologi Sains Maklumat, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor.

Keywords:

License plate recognition, clustering, run length smoothing algorithm, thresholding.

Abstract:

Vehicle license plate recognition has been intensively studied in many countries. Because different types of license plates are in use, the requirements of an automatic license plate recognition system differ from country to country. In this paper, an automatic license plate recognition system is proposed for Malaysian vehicles with standard license plates, based on image processing, clustering, feature extraction and neural networks. The image processing library was developed in-house and is referred to as the Vision System Development Platform (VSDP). After image enhancement, the image is segmented using blob analysis, horizontal scan line profiles, clustering and a run length smoothing algorithm approach to identify the location of the license plate. Each image is transformed into blob objects, and their important information, such as the total number of blobs and their location, height and width, is analyzed in order to form clusters and choose the best cluster with winner blobs. A new algorithm called Cluster Run Length Smoothing Algorithm (CLSA) is applied to locate the license plate at the right position. CLSA consists of two newly proposed algorithms: a new edge detector using a 3x3 kernel mask with a 128 grayscale offset, and a new way (3D method) of calculating the run length smoothing algorithm (RLSA), which can improve clustering techniques in the segmentation phase. Two separate experiments were performed: Cluster and Threshold value 130 (CT130), and CRLSA with Threshold value 1 (CCT1). The prototype system has an accuracy of more than 96%, and suggestions for further improving the system are discussed on the basis of an error analysis.

1 INTRODUCTION

Automatic license plate recognition (LPR) is an important area of research because of its many applications. Local authorities require license plate recognition for enforcement, border protection, vehicle theft detection, automatic toll collection and perhaps traffic control. Among the commercial license plate recognition systems available worldwide are Car Plate Reader (CPR) by Rafael et al. (J.Barosso et al., 1997) and Automatic Number Plate Recognition (ANPR) by Chang et al. (Chang et al., 2004). In Malaysia, vehicle license plates have a single or double line of characters in normal fonts, which account for perhaps 95% of all vehicles. Most of the pictures used were taken in various Malaysian states, including Sabah, Wilayah Persekutuan, Johor, Selangor, Perak, Negeri Sembilan, Pahang and Terengganu. There are also plates with special fonts, as depicted in Figure 1.

This dedicated LPR software covers at least five major processes in sequence: Capturing, Pre-Processing, Segmentation, Feature Extraction and Classification. However, this paper concentrates only on license plate detection, which covers image enhancement and segmentation.

Figure 1: (a) Samples of common Malaysian license plates (b) Samples of special Malaysian license plates.

The rest of this paper is divided into four sections. Section 2 discusses image segmentation and Section 3 discusses the clustering technique. The clustering technique is enhanced by applying RGB convolution with a new edge detector and a 128 grayscale offset, and by the Run Length Smoothing Algorithm approach; both are explained in Section 4. The experiments are discussed and conclusions are drawn in Section 5.


Norul Huda Sheikh Abdullah S., Khalid M., Yusof R. and Omar K. (2006).

DETECTING LICENSE PLATE USING CLUSTER RUN LENGTH SMOOTHING ALGORITHM.

In Proceedings of the Third International Conference on Informatics in Control, Automation and Robotics, pages 175-178

DOI: 10.5220/0001198501750178

 SciTePress

2 IMAGE SEGMENTATION

Image segmentation is a process that separates words into single characters for easy identification (Al-Badr and S.A.Mahmoud, 1995). In this project, segmentation separates a filtered collection of characters into a sequence of characters that will be used in the feature extraction stage. This step is very significant because the characters that form the license plate may overlap. At the moment, LPSeeker applies a clustering technique to identify important blobs. Beforehand, the image is processed with simple image enhancement techniques, such as the Fixed Filter, Minimum Filter, Median Filter and Homomorphic Filter, which are provided in the VSDP (Vision System Development Platform) library. VSDP is a library developed by CAIRO, UTMKL researchers.

3 CLUSTERING TECHNIQUE

After the above image enhancement is applied, the image is segmented using horizontal scan line profiles and a clustering technique. Each image is transformed into blob objects, and their important information, such as location, height and width, is analyzed by LPSeeker in order to form clusters and choose the best cluster with winner blobs. A blob is added to a cluster when both the difference between the blob and cluster heights and the difference between the maximum Y values of the cluster and the blob are less than a constant times the cluster's height, as stated in the clustering algorithm (see Figure 2). The clusters are then sorted in ascending order of their number of members. Starting from the cluster with the most members, each cluster is checked using RGB convolution with a new edge detector and a 128 grayscale offset, the run length smoothing algorithm and (or) thresholding with value 1, and horizontal and vertical projection. These processes are described in the following section. If the check succeeds, the cluster is chosen as the first winner cluster. The first winner cluster is then used to search for any other nearby cluster with the same height; if one exists, it is set as the second winner cluster. Lastly, features are extracted from each winner blob individually before it is passed to the recognition or classification phase.
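The clustering rule described above can be sketched as follows. This is only an illustrative sketch, not the LPSeeker implementation: the names `Blob`, `Cluster`, `belongs` and `cluster_blobs`, the greedy first-fit assignment and the constant `k = 0.3` are all assumptions, since the paper does not state the value of the constant or the order in which blobs are visited.

```python
from dataclasses import dataclass, field


@dataclass
class Blob:
    """Bounding box of one blob (hypothetical structure)."""
    min_x: int
    min_y: int
    max_x: int
    max_y: int

    @property
    def height(self):
        return self.max_y - self.min_y + 1


@dataclass
class Cluster:
    """A group of blobs with similar height and vertical position."""
    blobs: list = field(default_factory=list)

    @property
    def max_y(self):
        return max(b.max_y for b in self.blobs)

    @property
    def height(self):
        top = min(b.min_y for b in self.blobs)
        return self.max_y - top + 1


def belongs(blob, cluster, k=0.3):
    """Both differences must be below k times the cluster height.

    k = 0.3 is an assumed value; the paper only says "a constant".
    """
    limit = k * cluster.height
    return (abs(blob.height - cluster.height) < limit
            and abs(cluster.max_y - blob.max_y) < limit)


def cluster_blobs(blobs, k=0.3):
    """Greedy first-fit clustering (an assumption about visiting order)."""
    clusters = []
    for blob in blobs:
        for cluster in clusters:
            if belongs(blob, cluster, k):
                cluster.blobs.append(blob)
                break
        else:
            clusters.append(Cluster([blob]))
    # Largest cluster first, so candidates are checked from the
    # cluster with the most members downwards.
    clusters.sort(key=lambda c: len(c.blobs), reverse=True)
    return clusters


blobs = [
    Blob(10, 100, 20, 119),   # three character-like blobs on one line
    Blob(25, 101, 35, 120),
    Blob(40, 100, 50, 119),
    Blob(200, 10, 260, 70),   # one large unrelated blob
]
clusters = cluster_blobs(blobs)
print([len(c.blobs) for c in clusters])  # [3, 1]
```

Sorting in descending order here is equivalent to sorting in ascending order and starting from the end of the list; each candidate cluster would then be passed to the checks of Section 4.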

[Figure 2 labels: corner coordinates (minX, minY) and (maxX, maxY); for blob B11, (minX_B11, minY_B11) and (maxX_B11, maxY_B11).]

Figure 2: Important information for clustering approach.

4 RGB CONVOLUTION AND RUN LENGTH SMOOTHING APPROACHES

This section consists of two subsections: RGB Convolution and Thresholding, and the Run Length Smoothing Approach (RLSA). The first subsection, RGB Convolution, covers the process of applying a new edge detector to an image and then thresholding the resulting image with value 1. The second subsection explains RLSA, which involves interpreting both resulting images (the RGB convolved and the thresholded images).

4.1 RGB Convolution and Thresholding Approach

RGB Convolution is an approach for transforming an RGB pixel value into a grayscale value. We introduce a new edge detector with a 128 grayscale offset and thresholding. The 3x3 kernel mask is shown in equation (1). For every pixel in the image (in RGB format), the pixel values under the mask are multiplied by the kernel in equation (1) and summed to give RGBsum. At the same time, Ksum, the sum of the 3x3 kernel mask values kernelV(x,y), where x and y range from 0 to 2, is calculated. RGBsum is then divided by Ksum and added to a grayscale offset value, b. Finally, RGBsum is converted into a grayscale value. This process is called RGB Convolution. Together, the new grayscale values transform the original image in Figure 3(a) into a new black-gray-white image, as illustrated in Figure 3(b). The convolved image is placed in a buffer and undergoes a thresholding process with value one. The final black-white image is shown in Figure 3(c).

$$128 + \begin{bmatrix} 3 & 0 & -3 \\ 5 & 0 & -5 \\ 3 & 0 & -3 \end{bmatrix} \tag{1}$$
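A minimal sketch of this step is given below for illustration. It is not the VSDP code, and several details are assumptions: the kernel is applied without flipping, image borders are handled by repeating the edge pixels, each channel is clamped to 0..255 and the three channels are averaged to obtain the gray value, and thresholding with value 1 maps every gray value below 1 to black and all others to white. Since the kernel values in equation (1) sum to zero, the sketch also assumes a divisor of 1 whenever Ksum is 0. The function names are hypothetical.

```python
KERNEL = [[3, 0, -3],
          [5, 0, -5],
          [3, 0, -3]]   # 3x3 mask from equation (1)
OFFSET = 128            # grayscale offset b


def rgb_convolve(image, kernel=KERNEL, offset=OFFSET):
    """image: list of rows, each a list of (r, g, b) tuples.

    Returns rows of gray values in 0..255.
    """
    height, width = len(image), len(image[0])
    ksum = sum(sum(row) for row in kernel)
    divisor = ksum if ksum != 0 else 1  # assumption: avoid dividing by 0
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            acc = [0, 0, 0]  # weighted sum per channel (RGBsum)
            for ky in range(3):
                for kx in range(3):
                    # Assumption: repeat edge pixels at the borders.
                    sy = min(max(y + ky - 1, 0), height - 1)
                    sx = min(max(x + kx - 1, 0), width - 1)
                    weight = kernel[ky][kx]
                    for c in range(3):
                        acc[c] += weight * image[sy][sx][c]
            channels = [min(max(a // divisor + offset, 0), 255) for a in acc]
            out_row.append(sum(channels) // 3)  # assumption: mean as gray
        result.append(out_row)
    return result


def threshold(gray, value=1):
    """Assumed rule: gray >= value becomes white (255), else black (0)."""
    return [[255 if p >= value else 0 for p in row] for row in gray]


black, white = (0, 0, 0), (255, 255, 255)
edge = [[black, black, white, white] for _ in range(3)]
convolved = rgb_convolve(edge)
print(convolved[1])             # [128, 0, 0, 128]
print(threshold(convolved)[1])  # [255, 0, 0, 255]
```

Under these assumptions, flat regions map to the mid-gray value 128, while vertical edges are pushed towards black or white depending on their direction, which is consistent with the black-gray-white appearance described for Figure 3(b).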

4.2 Run Length Smoothing Approach

The Run Length Smoothing Algorithm (RLSA) has been widely used in optical character recognition (OCR), especially in document analysis systems (Wong et al., 1982)(Fisher et al., 1990). The earlier method developed for the Document Analysis System (Nagy, 1968) consists of two steps. First, a segmentation procedure subdivides the area of a document into regions (blocks), each of which should contain only one type of data (text, graphic, halftone image, etc.); then some basic features of these blocks are calculated. A linear classifier, which adapts itself to varying character heights, discriminates between text and images. However, this technique is usually applied only after horizontal or vertical projection to recognize the block segmentation.

ICINCO 2006 - INTELLIGENT CONTROL SYSTEMS AND OPTIMIZATION

After the chosen cluster is processed using RGB convolution with the new edge detector and a 128 grayscale offset value, RLSA and (or) horizontal and vertical projection are applied to distinguish between non-character and character clusters. From the resulting images in Figure 3(a), (b) and (c), character and non-character images were analyzed, and three-dimensional horizontal projection rules were constructed to differentiate between non-character and character clusters.

A run length from Figure 3(b) is stored only if black, gray and white pixels occur consecutively. For example, the gray values of the image are transformed into a series of pixels, where b = black, g = gray and w = white, as shown in Figure 4(a). Then, the run lengths of each black, gray and white (b-g-w) run are calculated (Figure 4(b)). Next, only the total run lengths of consecutive b-g-w pixels are stored for each line. Lastly, the average run length per line, aveRLwhole, the number of run lengths per line, aveCountRLwhole, and the ratio of the average run length per line to the number of run lengths per line, ratioRL, are computed (Figure 4(c)). This information was used at the outset to construct a set of decision threshold rules for recognizing character and non-character clusters.

Figure 3: (a) An image of a detected license plate (b) Image analysis using RLSA with a new edge detector and 128 offset value (c) Image after applying RLSA and Thresholding Value 1.

(a) wggwwwbbbggwwwbbbggwwwggggggggggggbbbbgwwwwggggbbbbbbggggwwwwwwgggggggggggggggggggggbbbgggggggggggggwwwwgggggbbbbbbgggwwwwwwgggggggbbbbbbbgggwwwwwwggggggggggggggbbbbbggwwb

(b) 1w2g3w3b2g3w3b2g3w12g4b1g4w4g6b4g6w21g3b13g4w5g6b3g6w7g7b3g6w14g5b2g2w1b

(c) R:8 X:13 Y:0   R:8 X:21 Y:0   R:9 X:42 Y:0   R:16 X:62 Y:0   R:20 X:103 Y:0   R:15 X:123 Y:0   R:16 X:146 Y:0
    AveRLperline: 13.14   NoOfRLperline: 7   Ratio: 1.88

Figure 4: Example of (a) a string of image pixels, (b) run lengths for each black, gray and white pixel run and (c) run lengths only for white pixels consecutively.
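The computation illustrated in Figure 4 can be sketched as below, using the first 22 pixels of the string in Figure 4(a). This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, a stored run is taken to be a black run immediately followed by a gray run and then a white run, and X is taken to be the index of the last pixel of that run, which agrees with the first two entries of Figure 4(c). The paper's full-line example lists seven runs, and the exact rule for keeping or discarding a run may differ from this sketch.

```python
from itertools import groupby


def run_lengths(pixels):
    """Run-length encode a string of 'b', 'g', 'w' pixels."""
    return [(len(list(group)), colour) for colour, group in groupby(pixels)]


def bgw_runs(pixels):
    """Return (total_length, end_index) for each black-gray-white run."""
    runs = run_lengths(pixels)
    found = []
    position = 0  # index of the first pixel of runs[i]
    for i, (length, colour) in enumerate(runs):
        if (colour == "b" and i + 2 < len(runs)
                and runs[i + 1][1] == "g" and runs[i + 2][1] == "w"):
            total = length + runs[i + 1][0] + runs[i + 2][0]
            found.append((total, position + total - 1))
        position += length
    return found


def line_statistics(pixels):
    """Average run length, number of runs, and their ratio for one line."""
    lengths = [total for total, _ in bgw_runs(pixels)]
    count = len(lengths)
    if count == 0:
        return 0.0, 0, 0.0
    average = sum(lengths) / count
    return average, count, average / count


line = "wggwwwbbbggwwwbbbggwww"   # first 22 pixels of Figure 4(a)
print(run_lengths(line)[:4])      # [(1, 'w'), (2, 'g'), (3, 'w'), (3, 'b')]
print(bgw_runs(line))             # [(8, 13), (8, 21)]
print(line_statistics(line))      # (8.0, 2, 4.0)
```

The ratio is computed as the average run length divided by the number of runs, which matches the values reported in Figure 4(c) (13.14 / 7 = 1.88).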

Figure 5: Example of correct license plate detection.

[Figure 6 panels (a)-(e); plate labels shown: ACU3992, ADM9585, CBD2232, JCW4898, BBF1707.]

Figure 6: Types of detection errors: (a) Extra (b) Miss1 (c) Miss2 (d) Miss>2 (e) Fail.

Table 1: Type of errors.

Type                  Description
Pass                  Found license plate
Miss1                 Miss 1 character
Miss2                 Miss 2 characters
Miss>2                Miss more than 2 characters
Extra                 More than actual characters
Fail                  Cannot locate license plate
Correct detection     Locate license plate correctly, or sum of the numbers of Pass, Miss1, Miss2, Miss>2 and Extra errors
Incorrect detection   Locate license plate incorrectly, or number of Fail errors

5 DISCUSSION AND CONCLUSION

Two different experiments were run: Clustering with Threshold value 130 (CT130) and Clustering with RGB convolution and Threshold value 1 (CCT1). The error analysis is based on several types: Pass, Miss1, Miss2, Miss>2, Extra and Fail. Each type is explained in Table 1. The correct detection rate, or total correct, is the sum of the numbers of Pass, Miss1, Miss2, Miss>2 and Extra divided by the total number of images. Examples of error images are illustrated in Figure 6(a), (b), (c) and (d). The incorrect detection rate, on the other hand, is the number of Fail divided by the total number of images.
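As a worked check of these two definitions, the short sketch below recomputes the rates from the counts reported in Table 2. The function name `detection_rates` and the dictionary layout are illustrative assumptions; only the counts come from the paper.

```python
CORRECT_TYPES = ("Pass", "Miss1", "Miss2", "Miss>2", "Extra")


def detection_rates(counts):
    """Return (correct_rate, incorrect_rate) as fractions of all images."""
    total = sum(counts.values())
    correct = sum(counts[name] for name in CORRECT_TYPES)
    return correct / total, counts["Fail"] / total


# Counts taken from Table 2.
CT130 = {"Pass": 947, "Miss1": 32, "Miss2": 47,
         "Miss>2": 55, "Extra": 82, "Fail": 72}
CCT1 = {"Pass": 954, "Miss1": 193, "Miss2": 16,
        "Miss>2": 16, "Extra": 12, "Fail": 44}

for name, counts in (("CT130", CT130), ("CCT1", CCT1)):
    correct, incorrect = detection_rates(counts)
    print(f"{name}: correct {correct:.2%}, incorrect {incorrect:.2%}")
# CT130: correct 94.17%, incorrect 5.83%
# CCT1: correct 96.44%, incorrect 3.56%
```

Both experiments use the same 1235 images, so the correct and incorrect rates of each experiment sum to 100%.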

From Table 2, CCT1 obtained the highest pass rate (77.25%), compared with 76.68% for CT130. The numbers of Miss1, Miss2, Miss>2 and Extra errors in CT130 were distributed more evenly than in CCT1. CCT1 dramatically reduced the number of failed detections to 44 (3.56%), whereas CT130 produced 72 failures (5.83%). CT130, which applies a threshold of 130 before clustering, decreases the number of blobs to be clustered in the whole image. As a result, CT130 may discard important blobs between letters and cause high numbers of errors: 32, 47, 55 and 82 for Miss1, Miss2, Miss>2 and Extra respectively. Another point to highlight for CT130 is that missing blobs normally occur between characters, where two or more blobs are sometimes connected. With CCT1, in contrast, missing blobs occur only at the beginning or end of the license plate.


We can also highlight that the total of Pass, Miss1 and Miss2 for CCT1 (954 + 193 + 16 = 1163) was higher than for CT130 (947 + 32 + 47 = 1026). This shows that the RGB convolution and threshold method is significant in choosing the appropriate winner cluster. Additionally, CCT1 obtained the highest license plate detection rate, 96.44%, while CT130 came second with 94.17%. It can be concluded that combining RGB convolution with a threshold value of one can raise the license plate detection rate to 96.44%.

From the above results, several advantages and disadvantages of the proposed approaches (RGB convolution and a new edge detector with a 128 grayscale offset) were noted. The advantages are:

i. Even when the original image of the rear or front of the car has fusion problems, CCT1 can still successfully detect the location of the license plate, as shown in Figure 7.

ii. CCT1 can increase the passing rate.

iii. CCT1 can reduce the number of Extra blob errors.

iv. CCT1 can reduce the number of missing blobs (Miss1, Miss2, Miss>2) in the winner cluster(s). Moreover, those missing blobs are normally at the beginning or end of the plate.

The disadvantages are:

i. The passing rate for CCT1 increased only slightly compared to CT130. This is because CCT1 considers all blobs in the image, whereas CT130 does not.

ii. RGB convolution with a new edge detector is very time consuming, because calculating the new grayscale output requires every pixel of the original image to be analyzed.

iii. Memory leaks occur quite often when using RGB convolution, because it requires a large amount of storage.

In conclusion, the detection rate of CCT1 could be raised further by the suggestions below:

i. Instead of the fixed thresholding used in CT130, adaptive zoning thresholding could also help CCT1 to improve its detection rate.

ii. RGB convolution could be applied with other edge detectors with a 128 grayscale offset.

iii. Before applying RGB convolution and horizontal and vertical projections, checking the cluster's original image using binary projections might increase overall performance.

iv. Incorporating an uncertainty value when calculating the maximum number of blobs and the run length of each cluster may increase LPSeeker's performance.

This paper has discussed the concept of license plate recognition and segmentation techniques covering clustering, RGB convolution and the Run Length Smoothing Algorithm. We conclude that the combination of RGB convolution, a new edge detector with a 128 grayscale offset and the Run Length Smoothing Algorithm has significantly raised the detection rate of the license plate's location in the segmentation phase.

Figure 7: Fusion images that have been successfully detected by LPSeeker.

Table 2: Detection rate for two experiments: Cluster with Threshold Value 130 (CT130) and Cluster with RGB Convolution and Threshold Value 1 (CCT1).

                   Cluster and Threshold 130    Cluster and RGB Convolve and
                   (CT130)                      Threshold 1 (CCT1)
Error type         Total    Percentage          Total    Percentage
Pass               947      76.68%              954      77.25%
Miss1              32       2.59%               193      15.63%
Miss2              47       3.81%               16       1.30%
Miss>2             55       4.45%               16       1.30%
Extra              82       6.64%               12       0.97%
Fail               72       5.83%               44       3.56%
Total              1235     100%                1235     100%

Total correct      1163     94.17%              1191     96.44%
Total incorrect    72       5.83%               44       3.56%
Total              1235     100%                1235     100%

REFERENCES

Al-Badr, B. and S.A.Mahmoud (1995). Survey and bibliography of Arabic optical text recognition. Signal Processing, 41:49–77.

Chang, S.-L., Chen, L.-S., Chung, Y.-C., and Chen, S.-W. (2004). Automatic license plate recognition. IEEE Transactions on Intelligent Transportation Systems, 5:42–53.

Fisher, J. L., Hinds, S. C., and D'Amato, D. P. (1990). A rule-based system for document image segmentation. In Proceedings of 10th International Conference on Pattern Recognition, volume 1, pages 567–572.

J.Barosso, Dagless, E., A.Rafel, and Bulas-Cruz, J. (1997). Number plate reading using computer vision. In Proceedings of IEEE International Symposium on Industrial Electronics, volume 3, pages 761–766.

Nagy, G. (1968). Preliminary investigation of techniques for automated reading of unformatted text. Communications of the ACM, 11:480–487.

Wong, K., Casey, R., and Wahl, F. (1982). Document analysis system. IBM Journal of Research and Development, 26(6):647–657.
