INVARIANT FEATURES FOR CHARACTER RECOGNITION

Ryszard S. Choraś

Institute of Telecommunications, University of Technology & Agriculture

Kaliskiego Street 7, 85-796 Bydgoszcz, Poland

Keywords:

Feature extraction, character recognition, moment invariants.

Abstract:

This paper presents a feature extraction method for the recognition of isolated characters. Feature extraction is the most important factor in achieving high recognition performance. We present moment invariants as features for pattern recognition and analyze the image feature extraction task on the basis of moment invariants for the image recognition problem.

1 INTRODUCTION

Handwriting recognition has been a major research subject in pattern recognition for over thirty years. Handwritten character recognition has broad application; typical uses include reading handwritten zip codes and personal bank checks. The recognition of handwritten characters, like other problems in pattern recognition, consists of two major problems: feature selection and pattern classification. Feature selection is problem-dependent and is considered the most significant factor in the final result of a recognition system. Since handwritten characters of the same class can vary greatly, it is desirable to generate a representation that is invariant to such variation. Feature-based recognition of objects that is independent of their position, size, orientation and other variations has been the goal of much recent research.

Several kinds of features have been used for recognition:

- visual features (contours, textures),
- transform coefficient features,
- statistical features (moment invariants).

The objective of this paper is to develop an algorithm for the automatic recognition of handwritten characters. In the algorithm, a binary image of the character is obtained and its skeleton is then produced by a standard thinning algorithm. The classification process uses features of the characters such as the number of intersections and the number of free ends, as well as criteria derived from normalized moments. The moment invariants derived by (Hu, 1999), (Flusser, 1993) are invariant only under translation, rotation and scaling of the object. In this paper we use features which are also invariant under general affine transformations. The block structure of the recognition system is given in Figure 1. The preprocessing stage includes noise filtering, thinning, and character categorization. Feature extraction and classification are described in the following sections.

Figure 1: Block diagram of the proposed system.

2 PREPROCESSING

A character pattern is captured by a camera system, and the camera signal is transformed into a binary image. The camera must be perpendicular to the character plane, and the contrast between characters and background must be sufficient. A binary image is regarded as a set of image points (pixels with value 1).

Choraś R. (2004). INVARIANT FEATURES FOR CHARACTER RECOGNITION. In Proceedings of the First International Conference on Informatics in Control, Automation and Robotics, pages 102-107. DOI: 10.5220/0001139201020107. SciTePress.

The pictorial information is represented as a function of two variables (i, j). The image in its digital form is usually stored as a two-dimensional array. If M = {1, 2, . . . , m} and N = {1, 2, . . . , n} are the spatial domains, then D = M × N is the set of resolution cells, and the digital image P is a function which assigns a binary value to each resolution cell, i.e. P : M × N −→ B, where B = {0, 1}.

Noise pixels add irregularities to the outer boundary of the characters and may have undesired effects on the recognition system. A smoothing algorithm eliminates small areas and fills little holes. The algorithm modifies each pixel according to its initial value and the values of its neighbours (Figure 2), using the following rule:

\[
p' =
\begin{cases}
0 & \text{if } \sum_{i=1}^{8} p_i \le T_1 \\
1 & \text{otherwise}
\end{cases}
\quad \text{if } p = 1,
\qquad
p' =
\begin{cases}
1 & \text{if } \sum_{i=1}^{8} p_i > T_2 \\
0 & \text{otherwise}
\end{cases}
\quad \text{if } p = 0
\tag{1}
\]

where $p$ is the current pixel value, $p'$ the new pixel value, $p_i$ ($i = 1, \ldots, 8$) are the values of its eight neighbours, and $T_1$ and $T_2$ are the threshold values.

Figure 2: Pixel notation
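The smoothing rule of Eq. (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold values `t1=1` and `t2=6` are hypothetical (the paper does not give them), the strict comparison used for hole filling is assumed, and pixels outside the frame are treated as background.

```python
import numpy as np

def smooth_binary(img, t1=1, t2=6):
    """One pass of the neighbourhood smoothing rule on a 0/1 image.

    t1 and t2 are hypothetical thresholds chosen for illustration.
    """
    img = np.asarray(img, dtype=np.uint8)
    h, w = img.shape
    padded = np.pad(img, 1, mode="constant")  # assume background outside the frame
    # Sum of the eight neighbours of every pixel (centre excluded).
    neigh = sum(
        padded[1 + di:1 + di + h, 1 + dj:1 + dj + w]
        for di in (-1, 0, 1)
        for dj in (-1, 0, 1)
        if (di, dj) != (0, 0)
    )
    out = img.copy()
    out[(img == 1) & (neigh <= t1)] = 0  # object pixel with too few neighbours: remove
    out[(img == 0) & (neigh > t2)] = 1   # background pixel surrounded by object: fill
    return out
```

With these assumed thresholds, an isolated object pixel is removed and a one-pixel hole inside a solid region is filled, while the rest of the image is left unchanged.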

2.1 Thinning of the characters

A pixel is considered deletable if it satisfies the following conditions:

1. $1 < B(p) \le 7$, where $B(p) = \sum_{i=1}^{8} p_i$ is the number of nonzero neighbours of pixel p.

2. $A(p) = 1$, where $A(p)$ is the number of 01 transitions in the ordered sequence of the eight neighbours of pixel p.

Additional conditions may permit the removal of an element if it is on the south or east edge or if it is on a corner:

3. (p

= 0) or (p

= 0 and p

= 0)

Table 1: Topological properties of p

Value of N             0                      1     2         3        4
Property of pixel p    Internal or isolated   End   Connect   Branch   Cross

The next conditions may permit the removal of an element if it is on the north or west edge or if it is on a corner:

4. (p

= 0) or (p

= 0 and p

= 0)

Figure 3: Original a) and thinned b) characters
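Conditions 1 and 2 of the deletability test can be checked with a short Python sketch. The neighbour ordering (clockwise, starting from the north neighbour) is an assumption made for illustration, since the pixel notation of Figure 2 is not reproduced here; the edge and corner conditions 3 and 4 are not implemented because they depend on that notation. The function names are hypothetical.

```python
def neighbours(img, i, j):
    """Eight neighbours of pixel (i, j), clockwise from north (assumed ordering)."""
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    return [int(img[i + di][j + dj]) for di, dj in offsets]

def b_count(nb):
    """B(p): number of nonzero neighbours."""
    return sum(nb)

def a_transitions(nb):
    """A(p): number of 0 -> 1 transitions around the closed neighbour ring."""
    ring = nb + nb[:1]
    return sum(1 for a, b in zip(ring, ring[1:]) if a == 0 and b == 1)

def satisfies_conditions_1_2(img, i, j):
    """True if the interior pixel (i, j) is an object pixel meeting conditions 1 and 2."""
    if img[i][j] != 1:
        return False
    nb = neighbours(img, i, j)
    return 1 < b_count(nb) <= 7 and a_transitions(nb) == 1
```

A corner pixel of a solid block passes both conditions, whereas the middle pixel of a one-pixel-wide line has two 01 transitions and is therefore kept, which is what preserves the connectivity of the skeleton.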

2.2 Character categorization

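A particular difficulty encountered in the classification of handwritten characters is the large variation in the overall properties of the patterns. For character categorization we use information derived from the connected number of the point p. When p = 1, the connected number of p is defined, for 4-connectivity and 8-connectivity respectively, by

\[
N_c^{(4)}(p) = \sum_{k \in S} \left( p_k - p_k\, p_{k+1}\, p_{k+2} \right)
\tag{2}
\]

\[
N_c^{(8)}(p) = \sum_{k \in S} \left( \bar{p}_k - \bar{p}_k\, \bar{p}_{k+1}\, \bar{p}_{k+2} \right)
\tag{3}
\]

where $S = (1, 3, 5, 7)$, $\bar{p}_k$ means $(1 - p_k)$, and $p_9 = p_1$.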

Topological properties of the pixel p are shown in

Table 1, and the distribution of characters into the

categories is shown in Table 2.
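The connected numbers of Eqs. (2) and (3) can be evaluated directly from the eight neighbour values. The sketch below assumes that the neighbours are listed in cyclic order with the odd-numbered ones ($p_1, p_3, p_5, p_7$) being the horizontal and vertical neighbours; the function name and the lookup table are hypothetical and only mirror Table 1.

```python
def connected_numbers(nb):
    """Return (N4, N8) for neighbour values nb = [p1, ..., p8] in cyclic order.

    Assumes p1, p3, p5, p7 are the 4-neighbours and that p9 wraps around to p1.
    """
    p = list(nb) + [nb[0]]          # append p9 = p1
    q = [1 - v for v in p]          # complemented values (1 - p_k)
    s = (0, 2, 4, 6)                # zero-based positions of k = 1, 3, 5, 7
    n4 = sum(p[k] - p[k] * p[k + 1] * p[k + 2] for k in s)
    n8 = sum(q[k] - q[k] * q[k + 1] * q[k + 2] for k in s)
    return n4, n8

# Interpretation of the connected number, following Table 1.
PROPERTY = {0: "internal or isolated", 1: "end", 2: "connect", 3: "branch", 4: "cross"}
```

For example, a pixel with a single neighbour gives the value 1 (an end point) under both definitions, and a pixel with two opposite neighbours gives 2 (a connecting point).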


Table 2. Distribution of characters into the categories.

3 MOMENT-BASED CHARACTER CLASSIFIERS

3.1 Geometric Moments

Any character can be represented by the spatial moments of its intensity function

\[
m_{pq} = \iint F_{pq}(i, j)\, p(i, j)\, di\, dj
\tag{4}
\]

where $p(i, j)$ is the intensity function representing the image, the integration is over the entire image, and $F_{pq}(i, j)$ is some function of i and j, for example $i^p j^q$, or $\sin(ip)$ and $\cos(jq)$. In the spatial case

\[
m_{pq} = \sum_{i=1}^{m} \sum_{j=1}^{n} i^p j^q\, p(i, j)
\tag{5}
\]

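The characteristic function of $f(x, y)$ is defined as its conjugate Fourier transform and may be expanded as a power series in u, v as follows:

\[
F(u, v) = \iint f(x, y)\, e^{-i2\pi(xu+yv)}\, dx\, dy
= \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} \frac{(-i2\pi)^{p+q}}{p!\, q!}\, u^p v^q\, m_{pq}
\tag{6}
\]

The infinite set of moments $m_{pq}$, p, q = 0, 1, 2, . . . uniquely determines $f(x, y)$, and vice versa:

\[
f(x, y) = \iint e^{i2\pi(xu+yv)} \left[ \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} \frac{(-i2\pi)^{p+q}}{p!\, q!}\, u^p v^q\, m_{pq} \right] du\, dv
\tag{7}
\]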

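The central moments are given by

\[
\mu_{pq} = \sum_{i=1}^{m} \sum_{j=1}^{n} (i - I)^p (j - J)^q\, p(i, j)
\tag{8}
\]

where (I, J) are

\[
I = \frac{m_{10}}{m_{00}} \quad \text{and} \quad J = \frac{m_{01}}{m_{00}}
\tag{9}
\]

The normalized central moments are

\[
\eta_{pq} = \frac{\mu_{pq}}{(\mu_{00})^{\alpha}}, \qquad \alpha = \frac{p + q}{2} + 1
\tag{10}
\]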
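The moment definitions of Eqs. (5) and (8)-(10) translate almost directly into NumPy. The following is a minimal sketch, assuming the image is a two-dimensional array indexed from i = 1 (rows) and j = 1 (columns); the function names are hypothetical.

```python
import numpy as np

def _grid(img):
    """Index grids i = 1..m (rows) and j = 1..n (columns), shaped for broadcasting."""
    m, n = img.shape
    i = np.arange(1, m + 1, dtype=float)[:, None]
    j = np.arange(1, n + 1, dtype=float)[None, :]
    return i, j

def raw_moment(img, p, q):
    """Spatial moment m_pq of Eq. (5)."""
    img = np.asarray(img, dtype=float)
    i, j = _grid(img)
    return float(np.sum(i ** p * j ** q * img))

def central_moment(img, p, q):
    """Central moment mu_pq of Eq. (8), taken about the centroid (I, J) of Eq. (9)."""
    img = np.asarray(img, dtype=float)
    i, j = _grid(img)
    m00 = raw_moment(img, 0, 0)
    ci = raw_moment(img, 1, 0) / m00
    cj = raw_moment(img, 0, 1) / m00
    return float(np.sum((i - ci) ** p * (j - cj) ** q * img))

def normalized_moment(img, p, q):
    """Normalized central moment of Eq. (10)."""
    alpha = (p + q) / 2 + 1
    return central_moment(img, p, q) / central_moment(img, 0, 0) ** alpha
```

Shifting a shape inside the frame leaves `central_moment` unchanged. On a pixel grid the normalized moments are only approximately constant when the shape is enlarged, because the sums of Eq. (8) approximate the continuous integrals.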

Using nonlinear combinations of the lower-order moments, a set of moment invariants (usually called geometric moment invariants) can be derived that is invariant under translation, scaling and rotation. Hu (Hu, 1999) employed seven such moment invariants to recognize characters independently of their position, size and orientation. Scale invariance requires that the moments entering (11) be the normalized central moments of (10).

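\[
\begin{aligned}
\phi_1 &= \mu_{20} + \mu_{02} \\
\phi_2 &= [\mu_{20} - \mu_{02}]^2 + 4\mu_{11}^2 \\
\phi_3 &= [\mu_{30} - 3\mu_{12}]^2 + [3\mu_{21} - \mu_{03}]^2 \\
\phi_4 &= [\mu_{30} + \mu_{12}]^2 + [\mu_{21} + \mu_{03}]^2 \\
\phi_5 &= [\mu_{30} - 3\mu_{12}][\mu_{30} + \mu_{12}] \times [(\mu_{30} + \mu_{12})^2 - 3(\mu_{21} + \mu_{03})^2] \\
       &\quad + [3\mu_{21} - \mu_{03}][\mu_{21} + \mu_{03}] \times [3(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2] \\
\phi_6 &= [\mu_{20} - \mu_{02}][(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2] + 4\mu_{11}[\mu_{30} + \mu_{12}][\mu_{21} + \mu_{03}] \\
\phi_7 &= [3\mu_{21} - \mu_{03}][\mu_{30} + \mu_{12}] \times [(\mu_{30} + \mu_{12})^2 - 3(\mu_{21} + \mu_{03})^2] \\
       &\quad - [\mu_{30} - 3\mu_{12}][\mu_{21} + \mu_{03}] \times [3(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2]
\end{aligned}
\tag{11}
\]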
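The seven invariants of Eq. (11) can be computed as in the following sketch. It assumes that the moments entering the formulas are the normalized central moments of Eq. (10), so that the result is also scale invariant; the function names are hypothetical and the code is an illustration rather than the implementation used in the experiments.

```python
import numpy as np

def normalized_central_moments(img, max_order=3):
    """Return {(p, q): normalized central moment} for p + q <= max_order."""
    img = np.asarray(img, dtype=float)
    rows, cols = img.shape
    i = np.arange(1, rows + 1, dtype=float)[:, None]
    j = np.arange(1, cols + 1, dtype=float)[None, :]
    m00 = img.sum()
    ci = (i * img).sum() / m00          # centroid row I
    cj = (j * img).sum() / m00          # centroid column J
    eta = {}
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            mu = ((i - ci) ** p * (j - cj) ** q * img).sum()
            eta[p, q] = mu / m00 ** ((p + q) / 2 + 1)
    return eta

def hu_invariants(img):
    """Seven invariants of Eq. (11), evaluated on normalized central moments."""
    e = normalized_central_moments(img)
    a = e[3, 0] + e[1, 2]
    b = e[2, 1] + e[0, 3]
    c = e[3, 0] - 3 * e[1, 2]
    d = 3 * e[2, 1] - e[0, 3]
    diff = e[2, 0] - e[0, 2]
    return np.array([
        e[2, 0] + e[0, 2],
        diff ** 2 + 4 * e[1, 1] ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        diff * (a ** 2 - b ** 2) + 4 * e[1, 1] * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])
```

A quick sanity check is to compare `hu_invariants(img)` with `hu_invariants(np.rot90(img))`: a rotation by 90 degrees maps the pixel grid onto itself, so the two vectors should agree up to floating-point error.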

Any function of moments which is invariant under the general affine transformation

\[
i' = a_{11} i + a_{12} j + a_{13}, \qquad j' = a_{21} i + a_{22} j + a_{23}
\tag{12}
\]

is also invariant under each of the following six simple transformations:

1) $i' = i + a$, $j' = j$
2) $i' = i$, $j' = j + b$
3) $i' = wi$, $j' = wj$
4) $i' = di$, $j' = j$
5) $i' = i + tj$, $j' = j$
6) $i' = i$, $j' = j + t'i$
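The effect of a skew transformation on moment-based features can be checked numerically. The sketch below works on a set of pixel coordinates rather than an image array, applies a skew of type 5, and compares Hu's first invariant with the quantity $(\mu_{20}\mu_{02} - \mu_{11}^2)/\mu_{00}^4$, which is the first affine moment invariant of (Flusser, 1993). That invariant is used here only as an illustrative assumption and is not claimed to be one of the compound moments used in this paper; all function names are hypothetical.

```python
import numpy as np

def central_moment(points, p, q):
    """Central moment mu_pq of a set of (i, j) pixel coordinates."""
    pts = np.asarray(points, dtype=float)
    d = pts - pts.mean(axis=0)
    return float(np.sum(d[:, 0] ** p * d[:, 1] ** q))

def hu_phi1(points):
    """First Hu invariant on normalized moments; mu_00 is the number of points."""
    mu00 = len(points)
    return (central_moment(points, 2, 0) + central_moment(points, 0, 2)) / mu00 ** 2

def affine_i1(points):
    """(mu20 * mu02 - mu11^2) / mu00^4, taken from the affine-invariant literature."""
    mu00 = len(points)
    num = (central_moment(points, 2, 0) * central_moment(points, 0, 2)
           - central_moment(points, 1, 1) ** 2)
    return num / mu00 ** 4

def skew(points, t):
    """Skew transformation 5: i' = i + t * j, j' = j."""
    pts = np.asarray(points, dtype=float)
    return np.column_stack([pts[:, 0] + t * pts[:, 1], pts[:, 1]])

rect = [(i, j) for i in range(4) for j in range(6)]   # a 4 x 6 block of pixels
skewed = skew(rect, 2.0)
```

For this example `affine_i1` takes the same value on `rect` and `skewed`, whereas `hu_phi1` changes, which illustrates why invariants beyond translation, rotation and scaling are needed for skewed characters. On a point set the one-axis scaling 4 cannot be tested this way, because the number of points does not scale with the area.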

Central moments are invariant under the translations 1 and 2. Normalized central moments are invariant under the uniform scaling 3. We are also interested in finding moment invariants under the other transformations, i.e. the one-axis scaling 4 and the skew transformations 5 and 6. The following combinations provide compound moments that support recognition of the characters:

ICINCO 2004 - SIGNAL PROCESSING, SYSTEMS MODELING AND CONTROL


= µ

− µ

= (µ

− µ

)

−

4(µ

− µ

)(µ

− µ

)

= µ

(µ

− µ

) − µ

(µ

− µ

(µ

− µ

)

= µ

− 6µ

+6µ

(µ

− µ

+µ

(6µ

− 8µ

)

+9µ

− 18µ

+6µ

(2µ

− µ

) + 9µ

−6µ

+ µ

= µ

+ µ

= (µ

− µ

)

+ 4µ

= (µ

− 3µ

)

+ (µ

− 3µ

)

= (µ

+ µ

)

+ (µ

− µ

)

(13)

= (µ

− 3µ

)(µ

+ µ

)×

×[(µ

+ µ

)

− 3(µ

+ µ

)

+3(µ

− µ

)(µ

+ µ

)×

×[3(µ

+ µ

)

− (µ

+ µ

)

]

= (µ

− µ

)[(µ

+ µ

)

− (µ

+ µ

)

+4µ

(µ

+ µ

)(µ

+ µ

)

= (3µ

− µ

)(µ

+ µ

)×

×[(µ

+ µ

)

− 3(µ

+ µ

)

+(3µ

− µ

)(µ

+ µ

)×

×[3(µ

+ µ

)

− (µ

+ µ

)

]

= µ

− 4µ

+ 3µ

= µ

− 2µ

−

− µ

The algebraic moment invariants are computed from the first t central moments and are given as the eigenvalues of predefined matrices, M[j, k], whose elements are scaled factors of the central moments. In contrast to Hu's geometric moment invariants, the algebraic moment invariants can be constructed up to arbitrary order and are invariant to affine transformations. The algebraic moment transform of Eq. (5) can be extended to a generalized form by replacing the conventional transform kernel $i^p j^q$ with a more general kernel $P_p(i) P_q(j)$, built for example from Legendre or Zernike polynomials. Since both Legendre and Zernike polynomials form complete orthogonal bases, Legendre and Zernike moments are called orthogonal moments. Orthogonal moments allow the described shape to be reconstructed accurately and make optimal use of the shape information.

3.2 Zernike moments

The Zernike moment of order n and repetition m is defined as (Khotanzad, 1990), (Teh, 1988)

\[
Z_{nm} = \frac{n + 1}{\pi} \iint_{x^2 + y^2 \le 1} V_{nm}^{*}(\rho, \theta)\, f(x, y)\, dx\, dy
\tag{14}
\]

where:

- $f(x, y)$ is the image intensity at (x, y) in Cartesian coordinates,
- $V_{nm}^{*}(\rho, \theta)$ is the complex conjugate of $V_{nm}(\rho, \theta) = R_{nm}(\rho)\, e^{jm\theta}$ in polar coordinates (ρ, θ),
- n ≥ 0, and n − |m| is an even non-negative integer.

The polar coordinates (ρ, θ) in the image domain are related to the Cartesian coordinates (x, y) by x = ρ cos(θ) and y = ρ sin(θ).

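$R_{nm}(\rho)$ is the radial polynomial, defined as follows (Khotanzad, 1990):

\[
R_{nm}(\rho) = \sum_{s=0}^{(n-|m|)/2} \frac{(-1)^s\, [(n - s)!]\, \rho^{\,n-2s}}{s!\, \left(\frac{n+|m|}{2} - s\right)!\, \left(\frac{n-|m|}{2} - s\right)!}
\tag{15}
\]

The first six orthogonal radial polynomials are:

\[
\begin{aligned}
R_{00}(\rho) &= 1 & R_{11}(\rho) &= \rho \\
R_{20}(\rho) &= 2\rho^2 - 1 & R_{22}(\rho) &= \rho^2 \\
R_{31}(\rho) &= 3\rho^3 - 2\rho & R_{33}(\rho) &= \rho^3
\end{aligned}
\tag{16}
\]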
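The radial polynomial of Eq. (15) is easy to evaluate directly from the factorial formula, and the closed forms of Eq. (16) give a convenient check. The following is a minimal sketch with a hypothetical function name.

```python
from math import factorial

def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho), evaluated from the factorial sum."""
    m = abs(m)
    if m > n or (n - m) % 2 != 0:
        raise ValueError("require |m| <= n and n - |m| even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = (factorial(s)
               * factorial((n + m) // 2 - s)
               * factorial((n - m) // 2 - s))
        total += num / den * rho ** (n - 2 * s)
    return total
```

For instance, `radial_poly(2, 0, rho)` reproduces $2\rho^2 - 1$ and `radial_poly(3, 1, rho)` reproduces $3\rho^3 - 2\rho$.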

Zernike moments, which are proven to have very

good image feature representation capabilities, are

based on the orthogonal Zernike radial polynomials.

Zernike moments are deﬁned as continuous integrals

over a domain of normalized coordinates. The

implementations of such moment functions therefore

involve the following sources of errors: (i) the dis-

crete approximation of the continuous integrals, and

(ii) the transformation of the image coordinate system

into the domain of the orthogonal polynomials.

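Moments of order n with repetition m of a discrete image function $f(k, l)$ with spatial dimension M × N are given by

\[
Z_{nm} = \frac{n + 1}{\pi} \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} f(k, l)\, R_{nm}(\rho_{k,l})\, e^{-jm\theta_{k,l}}
\tag{17}
\]

where the discrete polar coordinates

\[
\rho_{k,l} = \sqrt{x_l^2 + y_k^2}, \qquad \theta_{k,l} = \arctan\!\left(\frac{y_k}{x_l}\right)
\tag{18}
\]

are transformed by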


Table 3: Hu moment invariants for the characters from Figure 3.

Character   φ1         φ2         φ3         φ4         φ5          φ6          φ7
1           6.7E-004   3.7E-010   5.6E-015   2.1E-014    1.3E-028    2.5E-019   -1.8E-028
2           6.8E-004   4.9E-010   2.0E-015   7.8E-015    6.0E-030    1.0E-019    3.0E-029
2           6.9E-004   2.2E-010   2.3E-014   1.4E-014    2.4E-028    1.1E-019   -1.1E-028
4           6.8E-004   7.4E-011   8.9E-016   8.2E-015    4.8E-030   -5.7E-020    2.2E-029
4           7.0E-004   3.3E-010   7.8E-015   1.4E-014   -1.4E-028   -1.4E-019   -1.7E-029
8           6.9E-004   2.0E-010   2.2E-015   4.4E-015    6.7E-030    2.9E-020   -1.2E-029

\[
x_l = c + \frac{l(d - c)}{N - 1}, \qquad y_k = d - \frac{k(d - c)}{M - 1}
\tag{19}
\]

for k = 0, . . . , M − 1 and l = 0, . . . , N − 1. The real numbers c and d take the values shown in Figure 4.

To calculate the Zernike moments of an image

f(x, y) , the image is ﬁrst mapped to the unit disk

using polar coordinates, where the centre of the

image is the origin of the unit disk. Those pixels

falling outside the unit disk are not used in the

calculation.
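The discrete computation can be sketched as follows: pixel indices are mapped to normalized coordinates, pixels outside the unit disk are discarded, and the sum over the remaining pixels is evaluated with the radial polynomial and the complex exponential. The sketch assumes the mapping with c = −1 and d = 1 as defaults, uses `arctan2` instead of a plain arctangent so that all four quadrants are handled, and uses hypothetical function and parameter names.

```python
import numpy as np
from math import factorial

def zernike_moment(img, n, m, c=-1.0, d=1.0):
    """Discrete Zernike moment of order n and repetition m (illustrative sketch)."""
    img = np.asarray(img, dtype=float)
    M, N = img.shape
    k = np.arange(M, dtype=float)[:, None]     # row index k = 0..M-1
    l = np.arange(N, dtype=float)[None, :]     # column index l = 0..N-1
    x = np.broadcast_to(c + l * (d - c) / (N - 1), (M, N))
    y = np.broadcast_to(d - k * (d - c) / (M - 1), (M, N))
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)                   # quadrant-aware angle
    inside = rho <= 1.0                        # pixels outside the unit disk are ignored
    a = abs(m)
    radial = np.zeros_like(rho)
    for s in range((n - a) // 2 + 1):
        coef = ((-1) ** s * factorial(n - s)
                / (factorial(s)
                   * factorial((n + a) // 2 - s)
                   * factorial((n - a) // 2 - s)))
        radial += coef * rho ** (n - 2 * s)
    terms = img[inside] * radial[inside] * np.exp(-1j * m * theta[inside])
    return (n + 1) / np.pi * np.sum(terms)
```

For a square image, rotating the input with `np.rot90` changes only the phase of the result, so `abs(zernike_moment(img, n, m))` should be the same before and after rotation up to floating-point error.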

Figure 4: Mapping of a discrete image function: a) $c = -1$, $d = 1$ and b) $c = -1/\sqrt{2}$, $d = 1/\sqrt{2}$.

Normalize the Zernike moments

(20)

where $Z_{nm}$ is the Zernike moment. Because $Z_{nm}$ is complex, the modulus $|Z_{nm}|$ is often used as the shape feature in pattern recognition.

The magnitude of a Zernike moment is rotation invariant. In terms of mean-square error, an image can be better described by a small set of its Zernike moments than by other types of moments such as geometric moments, Legendre moments, rotational moments, and complex moments. Zernike moments are not invariant to translation or scaling. Such invariance is achieved by translating and normalizing the image before the Zernike moments are calculated.

4 SIMILARITY MEASURE

Recognition is performed by comparing the feature vector calculated for an unknown character with the set of feature vectors obtained for characters from a training set. The moment invariant approach to character identification represents the pattern by a set of K moment-invariant features, and thus as a point in a K-dimensional feature space. Points corresponding to patterns of the same class are assumed to be close together and far from those of different classes. The similarity distance d between two feature vectors $M_X$ and $M_Y$ for a pair of character images X and Y (where the category code of image X is identical to the category code of image Y) is computed as the Euclidean distance

\[
d(M_X, M_Y) = \sqrt{\sum_{k=1}^{K} \left( m_k^X - m_k^Y \right)^2}
\tag{21}
\]

The value of d is zero or small for identical or similar characters and high for different characters.
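Classification with the Euclidean distance of Eq. (21) amounts to a nearest-neighbour search over the training feature vectors. The sketch below assumes that the training set is a list of (label, feature vector) pairs; this data layout and the function names are hypothetical.

```python
import numpy as np

def euclidean_distance(mx, my):
    """Euclidean distance between two K-dimensional feature vectors."""
    mx = np.asarray(mx, dtype=float)
    my = np.asarray(my, dtype=float)
    return float(np.sqrt(np.sum((mx - my) ** 2)))

def classify(unknown, training):
    """Return the label of the training vector closest to the unknown vector.

    training is assumed to be a list of (label, feature_vector) pairs.
    """
    label, _ = min(training, key=lambda item: euclidean_distance(unknown, item[1]))
    return label
```

A practical consideration, not specified in the paper, is that moment invariants can differ by many orders of magnitude, so the distance may be dominated by the largest components unless the features are rescaled beforehand.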

5 CONCLUSION

The main contribution of this paper is the presentation of character recognition using a set of orthogonal moments. In particular, we have constructed a feature vector from various normalized moments. This vector, used in conjunction with a simple classification measure such as the Euclidean distance, is capable of achieving satisfactory performance levels. In our experiment we used our own database, which contains handwritten numerals from a hundred writers. Each numeral has 10 samples. Since the size of a sample image varies, we first normalized each image to a common size in pixels. Five samples of each character were used as training sets. When the Hu moment invariants were used, a recognition rate of over 93.2% was obtained. The recognition rate using all combinations of moments in Eq. 13 was found to be significantly better: 98.7%. For real-time applications, we prefer to use only the first four moment invariants, which give a recognition rate of 96.8%. When more training sets were used, the recognition rate was found to be higher.

REFERENCES

Hu, M. (1999). Pattern recognition by moment invariants. Proc. IRE, vol. 49, p. 1428.

Flusser, J., Suk, T. (1993). Pattern recognition by affine moment invariants. Pattern Recognition, vol. 26, pp. 167-174.

Haralick, R., Shanmugam, K., Dinstein, I. (1973). Textural features for image classification. IEEE Trans. on Systems, Man, and Cybernetics, SMC-3(6), pp. 610-621.

Iivarinen, J., Peura, M., Sarela, J., Visa, A. (1997). Comparison of combined shape descriptors for irregular objects. Proc. 8th British Machine Vision Conf., vol. 2, pp. 430-439.

Suen, C. Y., Nadal, C., Mai, T. A., Legault, R., Lam, L. (1992). Computer recognition of unconstrained handwritten numerals. Proc. IEEE, vol. 80(7), pp. 1162-1189.

Teh, C. H., Chin, R. T. (1988). On image analysis by the methods of moments. IEEE Trans. Pattern Anal. Machine Intell., 10(4), pp. 496-513.

Khotanzad, A., Hong, Y. H. (1990). Invariant image recognition by Zernike moments. IEEE Trans. Pattern Anal. Machine Intell., 12(5), pp. 489-498.
