INVARIANT CODES FOR SIMILAR TRANSFORMATION AND
ITS APPLICATION TO SHAPE MATCHING
Eiji Yoshida and Seiichi Mita
Toyota Technological Institute, 2-12-1 Hisakata, Tempaku, Nagoya 468-8511, Japan
Keywords: Shape matching, curvature, shape classification, shape similarity measure.
Abstract: In this paper, we propose a new method for the measurement of shape similarity. Our method encodes the contour of an object by using the curvature of the object. If one object is similar (under translation, rotation, and scaling) in shape to another, their codes themselves or their cyclic shifts have the same values. We compare our method with other methods such as CSS (curvature scale space) and shape context. We show that the recognition rates of our method are 100% and 90.40% for the rotation and scaling robustness tests using MPEG7_CE-Shape1, and 81.82% and 95.14% for the similarity-based retrieval test and the occlusion test using Kimia's silhouette. In particular, the recognition rate in the occlusion test is approximately 25% higher than those of CSS and SC. Moreover, we show that the computational cost of our method is not large in comparison with the above methods.
1 INTRODUCTION
A measurement of shape similarity for shape-based
retrieval in image databases should correspond with
our visual perception. This basic property leads to
the following requirements:
1. A shape similarity measure should permit the recognition of perceptually similar objects that are not mathematically identical.
2. It should not depend on scale, orientation,
and position of objects.
3. A measure must return high similarity when
we compare an object with those obtained by
varying its shape by moderate articulation
and occlusion. For instance, the similarity
measure of all hands in Figure 1 must be
large when they are compared with each
other and small when they are compared with
other objects such as heads, faces, and
aeroplanes.
4. It should be free from digitization noise and
segmentation errors.
We aim to apply a shape similarity measure to the
classification of image databases, where the object
classes are generally unknown. Therefore, a shape
similarity measure is required to be universal in the
sense that it allows us to identify and distinguish
between objects of arbitrary shapes without restrictive assumptions on their shapes. In practice, the computational complexity of measuring shape similarity should be small. In particular, we concentrate our effort on requirements 2 and 3 above.
Our method encodes the contour of an object by
using the curvature of the object. We use this code
as a shape similarity measure. If two objects are perceptually similar (under translation, rotation, and scaling) in shape, their codes themselves or their cyclic shifts have the same values, and vice versa.
Our method is compared with previous ones
such as the well-known Fourier descriptor, CSS
(curvature scale space), and shape context. These
previous methods have several drawbacks. For
example, in the case of Fourier descriptors (FDs),
the mapping from the original object to the
representation features (e.g. FD magnitudes or
phase) is not one-to-one, i.e. the original object
cannot be uniquely reconstructed from the
representation features. The computational cost of CSS is large. Our method overcomes these drawbacks.
This paper is organized as follows: Section 2
presents the outline of our method. Here, we define
our code and describe the process to construct it and
how it is used to compare the similarity of two
objects. In Section 3, we report the experimental
results obtained using the shape databases
MPEG7_CE-Shape1 (Mokhtarian and Bober, 2003)
and Kimia’s silhouette (Belongie, Malik, and
Puzicha, 2002). The experiment is composed of
three parts with the following main objectives:
A: robustness to scaling and rotation by using
MPEG7_CE-Shape1,
B: performance of the similarity-based retrieval by
using Kimia’s silhouette, and
C: robustness to occlusion by using Kimia’s
silhouette.
In Section 4, we summarize our conclusions.
Figure 1: Variations of a sample shape (original, rotated, articulated, occluded).
Figure 2: Curvature images of a leaf image and its rotated
image.
2 PROPOSED METHOD
First, we discuss the properties of curvature; then,
we define our code.
2.1 Curvature of a Planar Curve
The curvature of a planar curve is invariant under
rotation and translation. It is also inversely
proportional to the scale. Therefore, it is useful to
compare objects by using their curvatures. However,
direct use of the curvature is difficult due to digitization errors. Figure 2 shows the curvature images of a leaf and its rotated version. The values of their curvature functions are slightly different; on the other hand, the shapes of their curvature images are similar. In particular, the positions of their extreme points are almost identical. This property is an important aspect of our method, and the extreme points of the curvature are used in our method.
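For reference, the curvature of a planar curve $(x(t), y(t))$ can be evaluated as $\kappa = (\dot{x}\ddot{y} - \dot{y}\ddot{x})/(\dot{x}^2 + \dot{y}^2)^{3/2}$ after Gaussian smoothing of the coordinate functions, as in the CSS computation. The following minimal Python sketch illustrates this; the smoothing scale `sigma` and the function name are our own choices, not part of the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def contour_curvature(x, y, sigma=3.0):
    """Curvature k(t) of a closed contour (x(t), y(t)).

    The coordinate functions are smoothed with a periodic Gaussian
    filter, then k = (x'y'' - y'x'') / (x'^2 + y'^2)^(3/2) is
    evaluated from the smoothed derivatives.
    """
    # smooth the coordinate functions (mode="wrap" -> closed contour)
    xs = gaussian_filter1d(x, sigma, mode="wrap")
    ys = gaussian_filter1d(y, sigma, mode="wrap")
    # first and second derivatives along the contour parameter
    dx, dy = np.gradient(xs), np.gradient(ys)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return (dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5
```

The extreme points of this curvature function then serve as the interest points defined in the next subsection.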
2.2 Definition of our Code
Our method encodes the contour of an object as
follows:
1. Compute the curvature of the object. For this,
we use the same method as that used in CSS
(Mokhtarian and Mackworth, 1992, Costa
and Cesar, 2001).
2. Extract the extreme points of the curvature function and select the points of the object corresponding to them. Hereafter, we call these points "interest points." Intuitively, these points are like the corner points of the object.
3. Let $\{p_1, p_2, p_3, \ldots, p_n\}$ ($p_{n+1} = p_1$, $p_0 = p_n$) denote the set of interest points obtained by the above step. Then, we construct a code of the object defined as follows:

$$\begin{pmatrix} c_{11} & c_{21} & c_{31} & \cdots & c_{n1} \\ c_{12} & c_{22} & c_{32} & \cdots & c_{n2} \\ c_{13} & c_{23} & c_{33} & \cdots & c_{n3} \\ c_{14} & c_{24} & c_{34} & \cdots & c_{n4} \end{pmatrix} \qquad (1)$$

where

$$c_{i1} = \frac{\overline{p_{i-1} p_i}}{\overline{p_{i-1} p_i} + \overline{p_i p_{i+1}} + \overline{p_{i+1} p_{i-1}}} \qquad (2)$$

$$c_{i2} = \frac{\overline{p_i p_{i+1}}}{\overline{p_{i-1} p_i} + \overline{p_i p_{i+1}} + \overline{p_{i+1} p_{i-1}}} \qquad (3)$$

($\overline{p_i p_{i+1}}$ is the length of the segment $p_i p_{i+1}$, with $i = 1, 2, 3, \ldots, n$),

$$c_{i3} = \frac{l_i}{l} \qquad (4)$$

($l_i$ is the area of the triangle $\triangle p_{i-1} p_i p_{i+1}$, and $l = \sum_{i=1}^{n} l_i$), and

$$c_{i4} = \begin{cases} 1 & (k_i > 0) \\ 0 & (k_i < 0) \end{cases} \qquad (5)$$

($k_i$ is the value of the curvature of the object at the point corresponding to $p_i$).
Figure 3 represents an outline of the construction of
our code.
Figure 3: Outline of the construction of our code: extract the interest points, approximate the object by triangles, and set the triangles in turn.
Each column of our code in (1) represents the information of the triangle $\triangle p_{i-1} p_i p_{i+1}$ with $i = 1, 2, 3, \ldots, n$. The meanings of the components $c_{i1}$, $c_{i2}$, $c_{i3}$, and $c_{i4}$ of the $i$-th column of the above code are the following:

$c_{i1}$, $c_{i2}$: information about the sides of $\triangle p_{i-1} p_i p_{i+1}$,
$c_{i3}$: information about the area of $\triangle p_{i-1} p_i p_{i+1}$,
$c_{i4}$: information about convexity.
By using this code, we compare the similarity between two objects. The code is invariant under translation and scaling, and rotation causes only a cyclic shift of its columns. Moreover, the mapping from an object to this code is injective.
2.3 Application to Shape Matching
The process of comparing two objects by using our
code is as follows.
Let $\{p_1, p_2, \ldots, p_m\}$ and $\{q_1, q_2, \ldots, q_n\}$ denote the sets of interest points of two objects $A$ and $B$, respectively.
1. Compute the values of the curvature corresponding to $p_i$ and $q_j$ with $i = 1, 2, 3, \ldots, m$ and $j = 1, 2, 3, \ldots, n$ ($m \ge n$).
2. If $m > n$, remove the $m - n$ interest points of $A$ whose absolute values of curvature are the smallest (i.e., the top $m - n$ points when ranked in ascending order of the absolute value).
3. Construct both codes.
4. The codes of $A$ and $B$ are denoted by (6) and (7), respectively:

$$C_A = \begin{pmatrix} a_{11} & a_{21} & a_{31} & \cdots & a_{n1} \\ a_{12} & a_{22} & a_{32} & \cdots & a_{n2} \\ a_{13} & a_{23} & a_{33} & \cdots & a_{n3} \\ a_{14} & a_{24} & a_{34} & \cdots & a_{n4} \end{pmatrix} \qquad (6)$$

$$C_B = \begin{pmatrix} b_{11} & b_{21} & b_{31} & \cdots & b_{n1} \\ b_{12} & b_{22} & b_{32} & \cdots & b_{n2} \\ b_{13} & b_{23} & b_{33} & \cdots & b_{n3} \\ b_{14} & b_{24} & b_{34} & \cdots & b_{n4} \end{pmatrix} \qquad (7)$$
(The meaning of each component is the same as that in Section 2.2.) Then, we determine that the $i$-th column of $C_A$ and the $j$-th column of $C_B$ are common if the following conditions hold:

a. $|a_{i1} - b_{j1}| < t_1$,

b. $|a_{i2} - b_{j2}| < t_1$ ($t_1 \approx 0.1$),

c. $|a_{i3} - b_{j3}| < t_2\, b_{j3}$ if $b_{j3} \neq 0$, and $a_{i3} = b_{j3} = 0$ otherwise ($t_2 \approx 2.5$),

d. $a_{i4} - b_{j4} = 0$.
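As an illustration, a direct Python transcription of conditions a-d follows; it is a sketch under our reconstruction of condition c above, with the function name ours and the default thresholds set to the values indicated in the text:

```python
def columns_common(ai, bj, t1=0.1, t2=2.5):
    """Conditions a-d: is column i of CA 'common' with column j of CB?

    ai, bj : length-4 columns (a_{i1..i4}, b_{j1..j4}) of the codes.
    """
    cond_a = abs(ai[0] - bj[0]) < t1          # side component 1
    cond_b = abs(ai[1] - bj[1]) < t1          # side component 2
    if bj[2] != 0:                            # area component
        cond_c = abs(ai[2] - bj[2]) < t2 * bj[2]
    else:
        cond_c = ai[2] == 0
    cond_d = ai[3] == bj[3]                   # same convexity
    return cond_a and cond_b and cond_c and cond_d
```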
We construct the $n \times n$ matrix $COM = (com_{ij})$ such that $com_{ij} = 1$ if the $i$-th column of $C_A$ and the $j$-th column of $C_B$ are common, and $com_{ij} = 0$ otherwise.
5. Compute the following number:

$$s = s_1 + s_2 + s_3 + s_4 \qquad (8)$$

where
$$s_1 = 1 - \frac{1}{n}\,\#\{\, i = 1, 2, \ldots, n \mid com_{ij} = 1 \text{ for some } j \,\} \qquad (9)$$

$$s_2 = 1 - \frac{1}{n}\max\Big\{ \sum_{i=1}^{n} com_{i u_i} \;\Big|\; (u_1, u_2, \ldots, u_n) \in U \Big\} \qquad (10)$$

($U = \{(i, i+1, \ldots, n, 1, 2, \ldots, i-1) \mid i \in \{1, 2, 3, \ldots, n\}\}$ is the set of cyclic shifts of $(1, 2, \ldots, n)$),

$$s_3 = \sum_{i=1}^{n} s_3^{(i)} \qquad (11)$$

with

$$s_3^{(i)} = \begin{cases} \min\{\, |a_{i3} - b_{j3}| \mid com_{ij} = 1 \,\} & (\text{if } com_{ij} = 1 \text{ for some } j) \\ |a_{i3} - b_{u'_i 3}| & (\text{otherwise}) \end{cases}$$

($(u'_1, u'_2, \ldots, u'_n) \in U$ is the cyclic shift attaining the maximum in (10)), and

$$s_4 = \frac{m - n}{m + n} \qquad (12)$$
We use the above $s$ to measure the similarity between two objects, and we call $s$ the similarity number.
6. For all $i$ with $1 \le i \le t_3 - 1$ ($0.5n \le t_3 \le 0.8n$), remove the interest points of $A$ and $B$ whose absolute values of curvature, sorted in ascending order, are less than or equal to that of the $(m - n + i)$-th point and that of the $i$-th point, respectively. Thereafter, repeat steps 3 to 5 with $n$ replaced by $n - i$.
7. We denote the similarity number of the $i$-th trial by $s_i$ ($s_0 = s$), and compute the mean value of the similarity numbers, i.e.

$$S = \frac{1}{t_3} \sum_{i=0}^{t_3 - 1} s_i \qquad (13)$$

$S$ is the definition of the similarity between $A$ and $B$.
Here, we provide an additional explanation of $S$ and the parameters $s_1, s_2, s_3, s_4$. If $S$ is small, the similarity between $A$ and $B$ is high, and vice versa. The role of $s_1$ is to measure how many common parts of a shape exist between $A$ and $B$; it is used to measure the rough similarity. $s_2$ calculates how many common connected parts of the shape exist between $A$ and $B$; it is used to measure the close similarity. In fact, $s_2$ is not small unless the shapes of $A$ and $B$ are considerably close (e.g. $A$ and $B$ are similar in shape). $s_3$ is mainly used to compute the local difference between $A$ and $B$. $s_4$ is large if $m - n$ is large; this parameter plays a role in distinguishing dissimilar shapes.
Due to step 6, $S$ remains small if the curvatures of $A$ and $B$ are similar; otherwise, $S$ becomes large. Therefore, $S$ is small when the shapes of $A$ and $B$ are similar or almost similar (e.g. under moderate articulation and occlusion).
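Putting steps 1-5 together, the following Python sketch computes the similarity number $s$ for two codes. It is a minimal sketch under our reconstruction of Eqs. (8)-(12): the term $s_3$ is omitted for brevity, the deficit form of $s_1$ and $s_2$ follows our reading of Eqs. (9) and (10), and `columns_common` is the hypothetical helper from the sketch above:

```python
import numpy as np

def similarity_number(CA, CB, m, columns_common):
    """Similarity number s of Eq. (8) (the term s3 is omitted here).

    CA, CB : 4 x n codes of objects A and B (Eqs. (6), (7))
    m      : original number of interest points of A (m >= n)
    """
    n = CA.shape[1]
    # step 4: n x n matrix COM of 'common' column pairs
    COM = np.array([[columns_common(CA[:, i], CB[:, j])
                     for j in range(n)] for i in range(n)], dtype=float)
    # s1 (Eq. (9)): deficit of matched columns
    s1 = 1.0 - np.count_nonzero(COM.any(axis=1)) / n
    # s2 (Eq. (10)): deficit of the best cyclic alignment u in U
    best = max(sum(COM[i, (i + k) % n] for i in range(n))
               for k in range(n))
    s2 = 1.0 - best / n
    # s4 (Eq. (12)): penalty for differing numbers of interest points
    s4 = (m - n) / (m + n)
    return s1 + s2 + s4
```

Steps 6 and 7 would then repeat this computation after pruning low-curvature interest points and average the resulting similarity numbers into $S$ as in Eq. (13).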
3 EXPERIMENTS
The first experiment evaluates the robustness to
scaling and rotation by using MPEG7_CE-Shape1.
The second and third experiments evaluate the
performance of the similarity-based retrieval and
robustness to occlusion by using Kimia’s silhouette.
3.1 Robustness to Rotation and Scaling
3.1.1 Robustness to Rotation
The database MPEG7_CE-Shape1 includes 420 shapes: 70 basic shapes and 5 shapes derived from each basic shape by rotation through the angles 9°, 36°, 45°, 90°, and 150°. Each of these 420 images was used as a query image, and the number of correct matches was computed among the top 6 retrieved images. Thus, the best possible result is 2520 matches (420 queries × 6). Figure 4
shows some shape instances in MPEG7_CE-Shape1.
Figure 4: Shape instances in MPEG7_CE-Shape1.
3.1.2 Robustness to Scaling
The database includes 420 shapes: the 70 basic shapes are the same as in Section 3.1.1, and 5 shapes are derived from each basic shape by scaling the digital images with the factors 2, 0.3, 0.25, 0.2, and 0.1. Each of these 420 images was used as a query image, and the number of correct matches was computed among the top 6 retrieved images. Thus, the best possible result is again 2520 matches. In Table 1, the results of the rotation and scaling tests are presented. The presented results, except for those of our method, are taken from (Mokhtarian and Bober, 2003; Latecki, Lakamper and Eckhardt, 2000).
Table 1: Results of rotation and scaling tests.
Fourier CSS Proposed method
rotation 100% 100% 100%
scaling 86.35% 89.76% 90.40%
The parameters $t_1$, $t_2$, and $t_3$ in these experiments are $t_1 = 0.1$, $t_2 = 2.4$, and $t_3 = 0.7n$. The scaling robustness test is difficult because several objects are severely distorted under the reduction factors 0.2 and 0.1.
Figure 5 shows a severely distorted sample reduced
by a factor of 0.1.
Figure 5: Shape of a running person (original) and its scaled-down and re-sampled version (factor 0.1).
3.2 Performance of the
Similarity-based Retrieval
The database Kimia’s silhouette includes 99 shapes
and is divided into 9 classes of various shapes. Each
image was used as a query, and the number of
images belonging to the same class was counted in
the top 11 matches. Since the maximum number of
correct matches for a single query image is 11, the
total number of correct matches is 1089. Some of its
samples are shown in Figure 6, where the shapes
positioned in the same row belong to the same class.
Figure 6: Example shapes in Kimia’s silhouette.
Most images of the database consist of several basic shapes and their occluded and articulated variants. A few images include similar but different animals, as shown in the third row of Figure 6. In Table 2, the results of this experiment are presented. The parameters $t_1$, $t_2$, and $t_3$ in this experiment are $t_1 = 0.1$, $t_2 = 2.9$, and $t_3 = 0.8n$. "Proposed method + SC" in Table 2 denotes a combination of the proposed method and shape context. The recognition rate of the proposed method + SC is better than that of the proposed method alone, because the interior information of SC is added to the exterior (contour) information of our method.
Table 2: Results of the similarity-based retrieval.
                      Recognition rate
CSS                   73.19%
Shape context (SC)    76.86%
Proposed method       81.82%
Proposed method + SC  87.51%
3.3 Robustness to Occlusion
In this experiment, we took three image classes (fish, aeroplane, and art-object images), each consisting of eight images. Each image was occluded by 10-25% from four different directions (front, rear, right, and left). For each class, 96 images were tested, and the number of correct matches was counted.
original images and samples of occluded images are
shown in Figure 7. In Table 3, the results are
presented.
Figure 7: Example shapes of original and occluded images.
Table 3: Results of the occlusion tests.
10% 20% 25% total
CSS 91.67% 72.92% 47.92% 70.88%
SC 90.76% 66.67% 54.17% 70.49%
Proposed method 100% 96.88% 88.55% 95.14%
The parameters $t_1$, $t_2$, and $t_3$ in this experiment are $t_1 = 0.1$, $t_2 = 2.9$, and $t_3 = 0.7n$. Since our method uses the shape of the contour to measure the similarity, it is more suitable for the recognition of partially occluded objects than CSS and SC. A previous result (Krolupper and Flusser, 2007) also deals with the recognition of partially occluded objects and takes invariance to affine transformations into account. However, it cannot be applied to convex objects such as triangles and rectangles, since it employs the zero-crossing points of the curvature; our method can be applied to such objects.
3.4 Remark on Computational Cost
CSS involves large computational costs due to the
iterations of a Gaussian filter. The cost is at least 100
times larger than that of our method, where the
Gaussian filter is used only once. For example, the
calculation time by Matlab programming with
Pentium(R) D 3.2GHz processor to construct the
CSS image of a leaf in Figure 3 is about 150 s, while
the calculation time to construct our code is about
1.4 s. The computational cost to compare the
similarity of two objects by using our code (except
for the complexity of computing the curvature) is
low. It requires about 0.25 s. Figure 8 shows the
relationship between the length of the contour of a
leaf image given in Figure 2 and the calculation time
of three methods (CSS, shape context, our method).
It follows that the calculation time of our method is
about one-hundredth lower than that of CSS, but
about five times greater than that of shape context by
Figure 8. The vertical line of Figure 8 is the
logarithm of the calculation time and the horizontal
axis is the length of contour.
Figure 8: Calculation times (logarithmic scale) of CSS, shape context, and the proposed method versus the length of the contour of a leaf image.
4 CONCLUSIONS
We have proposed a new method of shape matching.
It is shown that the computational cost of our
method is lower than that of CSS. In our method, the
recognition rates of the rotation and scaling
experiments are 100% and 90.40%, respectively.
These results are slightly better than those of CSS. In the similarity-based retrieval and occlusion experiments, the recognition rates of our method are 81.82% and 95.14%, respectively; these results are greater than those of CSS and SC. Fourier descriptors and shape context have smaller computational complexity than our method, which requires a Gaussian filter. Nevertheless, the recognition performance of our method is better than those of the Fourier descriptor and shape context in the above experiments.
ACKNOWLEDGEMENTS
This work was supported by the Research Center for Integration of Advanced Intelligent Systems and Devices of Toyota Technological Institute.
REFERENCES
F. Mokhtarian and A. K. Mackworth, 1992. A theory of
multi-scale, curvature based shape representation for
planar curves, IEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 14, no. 8, pp. 789-805.
F. Mokhtarian and M. Bober, 2003. Curvature Scale Space Representation: Theory, Applications, and MPEG-7 Standardization, Computational Imaging and Vision, Vol. 25, Kluwer Academic Publishers.
L. J. Latecki and R. Lakamper, 2000. Shape similarity
measure based on correspondence of visual parts,
IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 22, no. 10, pp. 1185-1190.
L.J.Latecki, R.Lakamper and U.Eckhardt, 2000. Shape
Descriptors for Non-rigid Shapes with a Single Closed
Contour, CVPR.
L.J.Latecki, R.Lakamper and D. Wolter, 2005. Optimal
partial shape similarity, Image and Vision Computing
23, pp. 227-236.
S. Belongie, J. Malik, and J. Puzicha, 2002. Shape matching and object recognition using shape contexts, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, pp. 509-522.
A. Khotanzad and Y. H. Hong, 1990. Invariant image recognition by Zernike moments, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 5, pp. 489-497.
C. T. Zahn and R. Z. Roskies, 1972. Fourier descriptors
for plane closed curves, IEEE Trans. on Computers,
C-21, pp. 269-281.
L. da F. Costa and R. M. Cesar, 2001. Shape Analysis and Classification: Theory and Practice, CRC Press.
F. Krolupper and J. Flusser, 2007. Polygonal shape description for recognition of partially occluded objects, Pattern Recognition Letters, 28, pp. 1002-1011.