ACTIVE 3D RECOGNITION SYSTEM BASED ON FOURIER DESCRIPTORS

E. González, V. Feliú, A. Adán* and Luis Sánchez**

H.T.S. of Industrial Engineering, University of Castilla La Mancha, Ave. Camilo José Cela s/n, Ciudad Real, 13071, Spain
*Higher School of Informatics, University of Castilla La Mancha, Ronda de Calatrava 5, Ciudad Real, 13071, Spain
**U.E. of Technical Industrial Engineering, University of Castilla La Mancha, Ave. Carlos III, s/n, Toledo, 45071, Spain

Keywords: Active recognition system, next best view, silhouette shape, Fourier descriptors.

Abstract: This paper presents a new 3D object recognition/pose strategy based on reduced sets of Fourier descriptors of silhouettes. The method consists of two parts. First, an off-line process calculates and stores a clustered Fourier descriptor database corresponding to the silhouettes of the synthetic model of each object viewed from multiple viewpoints. Next, an on-line process solves the recognition/pose problem for an object that is sensed by a real camera placed at the end of a robotic arm. The method avoids ambiguity problems (object symmetries or similar projections belonging to different objects) and erroneous results by taking additional views, which are selected through an original next best view (NBV) algorithm. The method provides the object identification and pose in a very reduced computation time. A validation test of this method has been carried out in our lab, yielding excellent results.

1 INTRODUCTION

Most computer vision systems used in robotics

environments perform 3D object recognition tasks

using a single view of the scene (Bustos et al.,

2005). Commonly, a set of features is extracted and

matched with features belonging to an object

database. This is why so many researchers focus

their recognition strategy on finding features which

are capable of discriminating objects efficiently

(Helmer and Lowe, 2004). However, these

approaches may fail in many circumstances due to

the fact that a single 2D image may be insufficient.

For instance, this happens when there are objects

that are very similar from certain viewpoints in the

database (ambiguous objects); a difficulty that is

compounded when we have large object databases

(Deinzer et al., 2003).

A well known strategy that solves the ambiguity problem is based on using multiple views of the object. Active recognition systems provide the framework to efficiently collect views until a sufficient level of information for the identification and pose estimation tasks is obtained (Niku, 2001).

Previous works on active recognition differ in the way they represent objects, the way they combine information and the way they plan the next observation (Roy et al., 2004). These systems use 3D representation schemes based on either the object's geometric model or the object's appearance.

Although recognition based on geometric models is potentially more effective and allows for the identification of objects in any position, it raises important problems of practical applicability. For this reason, the methods based on appearance are currently the most successful approaches for dealing with 3D recognition of arbitrary objects.

Many strategies for solving the 3D object

recogn

ition problem using multiple views have been

proposed: an aspect graph is used in Hutchinson and

Kak (1992) to represent the objects. Their approach maintains a set of hypotheses about the object's identity and position. It characterizes the recognition

ambiguity by an entropy measure (Dempster-Shafer

theory) and evaluates the next best sensing operation

by minimizing this ambiguity.

González E., Feliú V., Adán A. and Sánchez L. (2007). ACTIVE 3D RECOGNITION SYSTEM BASED ON FOURIER DESCRIPTORS. In Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics, pages 318-325. DOI: 10.5220/0001627003180325. Copyright © SciTePress.

Borotschnig et al. (1999) represent the objects by some appearance-

based information, namely the parametric

eigenspace. This representation is augmented by

adding some probability distributions. These

probability distributions are then used to provide a

gauge for performing the view planning. Sipe and

Casasent (2002) use a probabilistic extension of the

feature space trajectory (FST) in a global eigenspace

to represent 3D views of an object. View planning is

accomplished by determining - for each pair of

objects – the most discriminating view point in an

off-line training stage. Their approach assumes that

the cost of making a mistake is higher than the cost

of moving the sensor.

In general, most of these approaches solve the

3D object recognition problem using stochastic or

probabilistic models and, consequently, they require

a large dataset for training (Deinzer et al., 2006).

Here we present a different way of approaching the problem.

The key to our active recognition system consists

of using a reduced set of Fourier descriptors to

connect and develop the recognition phases: object

representation, classification, identification, pose

estimation and next best view planning.

We focus the object representation on silhouettes

because: they can be robustly extracted from

images, they are insensitive to surface feature

variations - such as color and texture - and, finally,

they easily encode the shape information (Poppe and Poel, 2005). The most popular methods for 2D object

recognition from silhouettes are based on invariant

moments or Fourier descriptors. Invariant moments

exhibit the drawback that two completely different

silhouettes may have the same low order invariant

moments, which may lead to ambiguities in the

recognition process. Fourier descriptors yield much

more information about the silhouette, and only

similar silhouettes exhibit similar Fourier

descriptors. Since we consider the objects to be non-

occluded and the background to be uncluttered, we

use a representation scheme in which the silhouettes

from different viewpoints are represented by their

Fourier descriptors.

This paper is organized as follows. Section 2

presents an overview of the method. Section 3

describes our object identification/pose estimation

approach. Section 4 details the next best view

method. Section 5 shows the performance of our

method by carrying out experiments on a real

platform, and some conclusions are stated in Section

6.

2 OVERVIEW OF THE METHOD

In this method the scene silhouette (silhouette of the

3D object to be recognized) is recognized among a

set of silhouettes (in our case, 80 or 320 per object)

of a group of objects through an algorithm based on

Fourier descriptors. Therefore, recognition of the

silhouette of the scene involves both object

identification and pose. The method consists of off-

line and on-line parts.

The off-line process consists of building a

structured database of silhouettes belonging to a

generic set of objects. Firstly, a high precision three-

dimensional model of each object is obtained by

means of a laser scanner sensor. Next, this model is

viewed from a set of homogeneous viewpoints

obtaining the corresponding set of 2D silhouettes.

The viewpoints correspond to the vertices of a tessellated sphere with its origin at the centre of mass

of the object. Figure 1 shows an object model inside

the tessellated sphere, the projected image of the

model and its silhouette from a specific viewpoint.

Figure 1: a) Object model placed inside the tessellated sphere, b) view of the object model from a specific viewpoint, c) depth image, d) 2D silhouette.

The database is structured in clusters using only

three Fourier descriptors. To build the clustering we

used a k-means algorithm (Netanyahu et al., 2002).

This strategy allows us to split the silhouette search

space in zones where the silhouettes are roughly

similar. Consequently, the cost of the recognition

process is dramatically reduced. Figure 2a shows the most important Fourier descriptor moduli for a couple of silhouettes. In our case, we have taken the second, third and next-to-last values. Figure 2b

presents the reconstructed silhouette with the three

Fourier descriptors superimposed on the original

one. Note that by selecting the most meaningful

Fourier components it is possible to work with

approximate shapes. Figure 3 shows a spatial


representation of the clusters that have been

extracted in our database.
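The clustering step above can be sketched with a minimal Lloyd's-style k-means over the three-descriptor vectors. This is only an illustrative stand-in for the algorithm of Netanyahu et al. (2002); numpy and the deterministic farthest-point initialization are our own assumptions, not the paper's.

```python
import numpy as np

def init_centroids(points, k):
    """Greedy farthest-point initialization (deterministic; our choice)."""
    idx = [0]
    for _ in range(k - 1):
        # distance from every point to its nearest already-chosen centroid
        d = np.min(np.linalg.norm(points[:, None] - points[idx][None], axis=2), axis=1)
        idx.append(int(np.argmax(d)))
    return points[idx].astype(float)

def kmeans(points, k, iters=50):
    """Minimal Lloyd's k-means: returns (centroids, labels)."""
    centroids = init_centroids(points, k)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # assign each descriptor vector to its nearest centroid
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its cluster
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels
```

In the paper's setting, `points` would hold the three selected Fourier descriptor moduli of every stored silhouette, with k = 50 clusters.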

The on-line process is designed to solve the

recognition/pose problem of an object that is viewed

by a camera in a real environment. The essential

steps are: Fourier descriptor calculation,

classification (discrimination) process,

identification/pose calculation and next view

algorithm. Next, a brief explanation of these steps is

provided.

Figure 2: a) Fourier descriptor moduli, b) silhouette (red) and silhouette recovered with three Fourier descriptors (blue).

To calculate the Fourier descriptors a suitable

image preprocessing is carried out on the original

image. Specifically this process consists of filtering,

thresholding and contour extraction. Next the points

of the contour are taken as a sequence of complex

numbers and the Fourier descriptors are finally

computed.
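In code, this step reduces to treating the contour as a complex sequence and applying an FFT. A minimal numpy sketch follows; the filtering, thresholding and contour-extraction preprocessing is omitted, and the function name is ours:

```python
import numpy as np

def fourier_descriptors(contour_xy):
    """Contour points -> complex sequence z(n) = x(n) + j*y(n) -> DFT Z(k)."""
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]
    z = z - z.mean()        # place the origin at the centre of mass, so Z(0) = 0
    return np.fft.fft(z)    # Fourier descriptors Z(k), k = 0..N-1
```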

The discrimination phase classifies the silhouette of the scene into a single cluster or a set of clusters. The selected clusters constitute the work subspace used in the pose phase.

Formally, let O_1, O_2, ..., O_N be a database of N objects; C_k the k-th cluster, k ∈ [1..K]; S_nm the n-th silhouette of object m; p_k the k-th cluster prototype; D the Euclidean distance; R_k the k-th cluster radius, where R_k = max D(p_k, S_nm) over the silhouettes S_nm ∈ C_k; and z the silhouette of the scene to be matched. The subspace S_sub is formed by the clusters which verify one or both of the following conditions:

Criterion 1: If D(p_k, z) < R_k then C_k ∈ S_sub.

Criterion 2: If |D(p_i, z) − min_k D(p_k, z)| < ε then C_i ∈ S_sub, i ∈ [1..K].

Criterion 1 is satisfied in cases where z is inside a cluster, whereas criterion 2 corresponds to cases where the silhouette z lies in an area with very high cluster density or where the scene silhouette falls outside the clusters. Thus, the discrimination process sets a work subspace S_sub with a reduced database of silhouettes.

The identification phase, which is carried out in S_sub, yields, in general, a reduced set of candidate

silhouettes. The reason for taking only a few

candidates is as follows. Matching and alignment

techniques based on contour representations are

usually effective in 2D environments. Nevertheless,

in 3D environments these techniques have serious

limitations. The main problems with the contour

based techniques occur due to the fact that the

information on the silhouettes may be insufficient

and ambiguous. Thus, similar silhouettes might

correspond to different objects from different

viewpoints. Consequently, a representation based on

the object contour may be ambiguous, especially

when occlusion circumstances occur.

In essence, the identification phase compares the silhouette from the scene with the silhouettes in the subspace S_sub by means of a quadratic error minimization applied to the moduli of the Fourier descriptors. If the identification/pose process proposes more than one candidate silhouette, then the solution is ambiguous and it is necessary to apply the Next Best View (NBV) planning method. Figure 4 shows a scheme of the main process.

Figure 3: Spatial representation of the silhouette clusters.

In most cases, the recognition and pose estimation phase is finished after several views of the object are taken and only one candidate is found. In this process, the position of the next view is calculated through an algorithm based on the set of candidate silhouettes obtained in the previous view. This will be explained in Section 4.

Figure 4: Diagram of the active recognition system.

3 OBJECT RECOGNITION AND

POSE ESTIMATION

PROCEDURE

Fourier descriptors can be used to represent closed

lines (silhouettes). They can be made invariant to

translations and rotations, and allow easy filtering of

image noise (

Deinzer et al., 2003). Assume a contour

l(n) composed of N points on the XY plane:

[]

1..0,)(),()( −== Nnnynxnl

(1)

where the origin of index n is an arbitrary point of

the curve, and n and n+1 are consecutive points

according to a given direction (for example

clockwise direction) over the silhouette. Assume

also that points over the curve have been regularized

in the sense that two consecutive points are always

at the same Euclidean distance. Let us define the complex sequence z(n) and its discrete Fourier transform Z(k) = F(z(n)) as:

z(n) = x(n) + j·y(n)    (2)

Z(k) = Σ_{n=0}^{N−1} z(n) exp(−j2πkn/N),   0 ≤ k ≤ N−1    (3)

Assume also a database of R silhouettes s_r(n), with 0 ≤ n ≤ N_r − 1 and 1 ≤ r ≤ R, whose respective discrete Fourier transforms are S_r(k).

A critical aspect of our method is its computation speed, because we want to recognize objects in real time; the FFT algorithm is therefore used to obtain the Fourier descriptors. Then N must be a power of 2, and both the scene silhouette and the silhouettes of the database must be regularized to N = 2^{N'} points.

The basic problem to be solved in our method is to match the scene silhouette z(n) to some silhouette s_{r*}(n) of the database, under the assumptions that z(n) may be scaled (λ), translated (c = c_x + j·c_y) and rotated (φ) with respect to the best matching silhouette of the database, and that the reference point on z(n) (n = 0) may be different from the reference point of that database silhouette (we denote that displacement δ). The next section deals with selecting the silhouettes and obtaining c, δ, φ, λ.
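Regularizing a silhouette to N = 2^{N'} equally spaced points can be sketched by arc-length resampling; the paper does not specify the interpolation scheme, so the linear interpolation below is our own assumption:

```python
import numpy as np

def regularize(z, n_points):
    """Resample a closed complex contour to n_points equally spaced in arc length."""
    z = np.append(z, z[0])                       # close the curve
    seg = np.abs(np.diff(z))                     # segment lengths
    s = np.concatenate([[0.0], np.cumsum(seg)])  # cumulative arc length
    t = np.linspace(0.0, s[-1], n_points, endpoint=False)
    re = np.interp(t, s, z.real)
    im = np.interp(t, s, z.imag)
    return re + 1j * im
```

With `n_points` chosen as a power of two (e.g. 512, as in the experiments of Section 5), the resampled sequence can be fed directly to the FFT.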

3.1 Close Silhouettes Selection

Suppose that z is the silhouette of an object captured by the camera and that it corresponds to the silhouette s_{r*}(n) that belongs to the silhouette database. In general, z is matched to s_{r*}(n) after the displacement, rotation, scaling and centre translation parameters are found. In general, for s_r(n) in the space domain:

z(n) = λ exp(jφ) D_δ(s_r(n)) + c    (4)

where D_δ(s_r(n)) displaces the origin of the sequence s_r(n) by δ units. Taking the Fourier transform:

Z(k) = λ exp(jφ) exp(j2πδk/N) S_r(k) + cN    (5)

• Translation: Since all silhouettes have the coordinate origin at their centre of mass, S_r(0) = 0 and, from expression (5), c = Z(0)/N.

• Close silhouettes identification: Defining Ẑ(k) = Z(k) − cN (which only affects the k = 0 term), the modulus of expression (5) yields:

|Ẑ(k)| = λ |S_r(k)|    (6)

The matching procedure minimizes the mean squared error between the Fourier descriptors of the scene silhouette and the silhouettes of the database. Given a pair of silhouettes (z(n), s_r(n)), the similarity index J_r is defined as:

J_r(λ) = (Z̄ − λ S̄_r)^t (Z̄ − λ S̄_r)    (7)

where Z̄ = (|Ẑ(0)|, ..., |Ẑ(N−1)|)^t, S̄_r = (|S_r(0)|, ..., |S_r(N−1)|)^t and |·| denotes the absolute value.

Minimizing J_r(λ) with respect to the scaling factor λ, we obtain:

λ = (Z̄^t S̄_r) / (S̄_r^t S̄_r),   J_r = Z̄^t Z̄ − (Z̄^t S̄_r)² / (S̄_r^t S̄_r)    (8)
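The coarse match of equations (7)-(8) translates directly into code; the sketch below assumes the translation term has already been removed from the scene descriptors, and the function name is ours:

```python
import numpy as np

def coarse_match(Z, S_r):
    """Optimal scale and similarity index J_r (eq. 8) from descriptor moduli."""
    zb = np.abs(Z)                               # modulus vector of scene descriptors
    sb = np.abs(S_r)                             # modulus vector of a database silhouette
    lam = (zb @ sb) / (sb @ sb)                  # optimal scaling factor (eq. 8, left)
    J = zb @ zb - (zb @ sb) ** 2 / (sb @ sb)     # residual error (eq. 8, right)
    return lam, J
```

The silhouettes kept as candidates are those with J below the threshold U of the selection step described next.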

After calculating J_r for all silhouettes of the database, we select the silhouettes which verify J_r ≤ U, U being a specific threshold. In this case, we may have a set of ambiguous silhouettes {S_c1, S_c2, ..., S_cf}, and it is then necessary to select another viewpoint to solve the ambiguity problem (see the NBV section).

3.2 Pose Calculation

Let L = (r_1, r_2, ..., r_m) denote the set of indexes of the candidate silhouettes. In order to select the best candidate among the L candidates, a more accurate procedure is carried out. This procedure uses the complete complex Fourier descriptors (not only the moduli, as in the previous process). As a result of this process, a new similarity index f_r is obtained and the pose parameters λ, φ, δ are calculated. The cost function to be minimized is (see (4)):

f_r(λ, φ, δ) = (Ẑ − q P_r(δ))^{*t} (Ẑ − q P_r(δ))    (9)

where Ẑ = (Ẑ(0), ..., Ẑ(N−1))^t,

P_r(δ) = (S_r(0), S_r(1) exp(j2πδ/N), ..., S_r(N−1) exp(j2πδ(N−1)/N))^t,

* denotes the conjugate, *t denotes the conjugate transpose, and r is now restricted to the set L.

Let us denote q = λ exp(jφ). Optimizing (9) with respect to the complex number q:

q(δ) = (P_r^{*t}(δ) Ẑ) / (S_r^{*t} S_r),   f_r(δ) = Ẑ^{*t} Ẑ − |P_r^{*t}(δ) Ẑ|² / (S_r^{*t} S_r)    (10)

(notice that P_r^{*t}(δ) P_r(δ) = S_r^{*t} S_r). Taking into account that δ is an integer with 0 ≤ δ ≤ N − 1, the right expression of (10) is calculated for all possible values of δ and f_r = min_δ f_r(δ) is determined.

Then f_r is the similarity index of the silhouette r in the fine matching process, δ_r is the corresponding displacement, and q_r is obtained from the left expression of (10) particularized to δ_r. Rotation and scaling are estimated from q_r:

λ_r = |q_r|;   φ_r = ∠ q_r    (11)
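The δ-scan of equation (10) and the recovery of λ and φ from (11) can be sketched as follows, assuming numpy descriptor vectors (`fine_match` is our hypothetical name, not the authors'):

```python
import numpy as np

def fine_match(Zh, S_r):
    """Scan integer displacements delta and return (f_r, delta_r, q_r) per eq. (10)."""
    N = len(S_r)
    ss = np.vdot(S_r, S_r).real               # S_r^{*t} S_r, independent of delta
    zz = np.vdot(Zh, Zh).real                 # Zh^{*t} Zh
    best = None
    for delta in range(N):
        P = S_r * np.exp(2j * np.pi * delta * np.arange(N) / N)
        num = np.vdot(P, Zh)                  # P^{*t}(delta) Zh
        f = zz - abs(num) ** 2 / ss           # residual for this delta
        if best is None or f < best[0]:
            best = (f, delta, num / ss)       # q(delta) = P^{*t} Zh / (S^{*t} S)
    return best

# eq. (11): lambda_r = abs(q_r), phi_r = np.angle(q_r)
```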

4 NEXT BEST VIEW PLANNING

The goal of this phase is to provide a solution to the

ambiguity problem by taking a set of optimal

viewpoints. When an ambiguous case occurs, we

move the camera to another viewpoint from which

the silhouettes of the candidate objects are

theoretically very dissimilar.

As said before, in our representation scheme we associate each silhouette stored in the database with

a viewpoint of the tessellated sphere. Then, the first

step in the NBV consists of aligning the candidate

spheres (corresponding to the viewpoints in our

models) with the scene sphere.

Let T_R(S) be the tessellated sphere, N'_Rx the camera position and N_R1 the viewpoint that corresponds to the candidate silhouette S_ci. To align the two spheres, a rotation must be applied to make N'_Rx and N_R1 coincident (Adán et al., 2000). Formally:

Let u = (u_x, u_y, u_z) = (ON_R1 × ON'_Rx) / |ON_R1 × ON'_Rx| be the normal vector to the plane defined by ON_R1 and ON'_Rx, O being the centre of T_R(S). Let θ be the angle between the last two vectors. Then, a rotation θ around the u axis can first be applied to T_R(S). This spatial transformation is defined by the following rotation matrix R_u(θ):

R_u(θ) =

[ u_x²(1−cθ)+cθ          u_x·u_y(1−cθ)−u_z·sθ    u_x·u_z(1−cθ)+u_y·sθ ]
[ u_x·u_y(1−cθ)+u_z·sθ   u_y²(1−cθ)+cθ           u_y·u_z(1−cθ)−u_x·sθ ]
[ u_x·u_z(1−cθ)−u_y·sθ   u_y·u_z(1−cθ)+u_x·sθ    u_z²(1−cθ)+cθ        ]    (12)

where cθ = cos θ and sθ = sin θ.

A second rotation φ around the axis v = (v_x, v_y, v_z) = ON'_Rx / |ON'_Rx| is required to achieve the best fitting of T_R(S) to T_R(S') (see Figure 5). The swing angle φ is determined by (14). This last set of points can be obtained by applying a rotation matrix R_v(φ) that depends on a single parameter φ and can be formally expressed as:

R_v(φ) =

[ v_x²(1−cφ)+cφ          v_x·v_y(1−cφ)−v_z·sφ    v_x·v_z(1−cφ)+v_y·sφ ]
[ v_x·v_y(1−cφ)+v_z·sφ   v_y²(1−cφ)+cφ           v_y·v_z(1−cφ)−v_x·sφ ]
[ v_x·v_z(1−cφ)−v_y·sφ   v_y·v_z(1−cφ)+v_x·sφ    v_z²(1−cφ)+cφ        ]    (13)

where cφ = cos φ and sφ = sin φ.

R_1(φ, θ) = R_v(φ) · R_u(θ)    (14)

Finally, the alignment of the spheres is:

T_R(S') = R_1(φ, θ) · T_R(S)    (15)
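The axis-angle rotations of equations (12)-(15) can be sketched with a generic Rodrigues-style rotation matrix. The sketch assumes the two viewpoint vectors are not parallel (the cross product would vanish), and the function names are ours:

```python
import numpy as np

def axis_rotation(u, theta):
    """Rotation matrix about the unit axis u by angle theta (form of eqs. 12/13)."""
    u = np.asarray(u, float) / np.linalg.norm(u)
    ux, uy, uz = u
    c, s = np.cos(theta), np.sin(theta)
    C = 1.0 - c
    return np.array([
        [ux*ux*C + c,    ux*uy*C - uz*s, ux*uz*C + uy*s],
        [ux*uy*C + uz*s, uy*uy*C + c,    uy*uz*C - ux*s],
        [ux*uz*C - uy*s, uy*uz*C + ux*s, uz*uz*C + c   ],
    ])

def first_alignment(n_r1, n_rx):
    """R_u(theta) bringing the candidate viewpoint n_r1 onto the camera position n_rx."""
    n1 = np.asarray(n_r1, float) / np.linalg.norm(n_r1)
    nx = np.asarray(n_rx, float) / np.linalg.norm(n_rx)
    u = np.cross(n1, nx)                              # normal to the plane of the two vectors
    theta = np.arccos(np.clip(n1 @ nx, -1.0, 1.0))    # angle between them
    return axis_rotation(u, theta)
```

The full alignment of equation (15) would then compose `axis_rotation(v, phi) @ first_alignment(...)` and apply it to the sphere vertices.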

In the next step, the Fourier energy is calculated for each viewpoint. Define S_cp ∈ O_i and S_cq ∈ O_j, where S_cp, S_cq ∈ [S_c1, S_c2, ..., S_cf], f is the number of candidate silhouettes, and vp is a viewpoint from T_R(S). The energy is computed for all couples of silhouettes as follows:

E^vp_{Oi,Oj} = (1/N) Σ_{k=0}^{N−1} |Z^vp_{Oi}(k) − Z^vp_{Oj}(k)|²,   ∀ i ≠ j,  1 ≤ i, j ≤ f    (16)

E^vp = min(E^vp_{Oi,Oj})    (17)

The NBV ν is defined as the viewpoint that verifies:

E^ν = max(E^vp),   ∀ vp    (18)
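Equations (16)-(18) amount to a max-min selection over viewpoints. In the sketch below, `desc[v][i]` holds the Fourier descriptors of candidate i as seen from viewpoint v; this data layout is our own assumption for illustration:

```python
import numpy as np

def next_best_view(desc):
    """Pick the viewpoint whose least-separated candidate pair is most separated."""
    best_v, best_E = None, -np.inf
    for v, cands in enumerate(desc):
        f = len(cands)
        N = len(cands[0])
        # eq. (16): pairwise energies between candidate silhouettes at viewpoint v
        energies = [np.sum(np.abs(cands[i] - cands[j]) ** 2) / N
                    for i in range(f) for j in range(i + 1, f)]
        E = min(energies)                     # eq. (17): worst-case separation
        if E > best_E:
            best_v, best_E = v, E             # eq. (18): maximize over viewpoints
    return best_v
```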

Figure 5: Alignment process between candidate spheres

and the scene sphere.

Finally, the camera is moved to the best viewpoint, and a new image of the scene is captured and matched with the model silhouettes corresponding to the best viewpoint using equations (9) and (11).

Figure 6: Superimposed models after the alignment

process and energy values plotted on the nodes.

5 EXPERIMENTATION

A validation test of this method has been carried out

in our lab. The experimental setup is composed of a

Stäubli RX 90 robot with a Jai CVM1000 micro camera at its end. This system controls the position and viewing direction of the camera, the object always being centered in the scene. Figure 7

shows the experimental setup.

In the off-line process, the synthesized models (with 80000 polygons/object) are built through a VI-910 Konica Minolta 3D laser scanner. At the same time, the silhouette database with the respective Fourier descriptors is obtained and stored. Currently we have used databases of 80 and 320 silhouettes/model.

In order to reduce redundant information and to

optimize the recognition/pose time, a Fourier

descriptors reduction has been carried out on the

silhouette models. Figure 2a shows the Fourier descriptor moduli for an example. As can be seen, the first and the last descriptors are the most meaningful. The reduction procedure consists of considering the intervals [1,X(k)], [X(N-k),X(N)], k=1,…N, until the error/pixel between the original and reduced silhouettes is less than a threshold χ.

Figure 7: a) Experimental setup, b) examples of synthetic models.


The experimentation has been carried out with 19 objects that have been previously modelled with 80 and 320 views, considering descriptor reductions of χ = 0.05 and χ = 0.5.

In the clustering process we have used 50

clusters. During this phase we use images with

resolution 640x480. Each silhouette in the database

was stored with a resolution of 512 points and the

database size was 12 MB (80 views) and 42 MB

(320 views).

Table 1.

# sil.   χ      t_A     ρ_A     t_B     ρ_B
80       0.05   2.652   1.640   2.475   2.382
80       0.5    2.047   1.653   1.863   2.473
320      0.05   4.901   1.089   4.739   2.107
320      0.5    3.336   1.339   3.096   2.261

The active 3D recognition system worked in all tests, achieving 100% effectiveness. Table 1 shows the results obtained during the recognition process without (A) and with (B) the discrimination phase. The results are compared taking into account the number of silhouettes of the model, the threshold for Fourier descriptor reduction (χ) and the mean square error between the silhouette of the scene and the estimated silhouette (ρ). Variable t is the computation time (seconds) on a Pentium III 800 MHz processor.

Table 2 shows in detail the main process rates using a database with 80 silhouettes/model and a reduction factor χ = 0.5.

From Tables 1 and 2 the following comments can be made.

• In the whole process, most of the time is devoted to extracting the object's silhouette (88.6% and 94.7%). Note that this stage includes several image preprocessing tasks like filtering, thresholding, etc. In part, such a high percentage is also due to the fact that we have used a reduced object database in our experimentation. For large databases (>100-500 objects) this percentage will decrease, while the percentage corresponding to the candidate selection stage will increase.

• Using 320 silhouettes per model roughly doubles the execution times with respect to using 80 silhouettes per model, but ρ decreases by 0.3 percent.

Table 2.

                      Algorithm               time (%)
Without clustering    Silhouette extraction   88.6
                      Identification          7.9
                      Pose estimation         1.4
                      NBV                     2.1
With clustering       Silhouette extraction   94.7
                      Discrimination          0.8
                      Identification          2.2
                      Pose estimation         0.6
                      NBV                     1.7

Figure 8: Comparison of discrimination between a random

method and our proposed method.

Two experiments were carried out: one running our active recognition system with a random selection of the next view, and another computing the next best view from our D-Sphere structure. In Figure 8 we can see the number of candidates at each sensor position for a real case. The test average showed that our method considerably reduced the number of sensor movements: by about 62%. The time needed to calculate the next observation position is very short: approximately 1.7% of the total time needed to carry out a complete step of the recognition process. Calculations were performed on a Pentium III 800 MHz processor. Figure 9 illustrates a case of ambiguity between two objects and how the system solves the problem. Our NBV method shows much higher discriminative capability than the random method. Thus, the proposed strategy significantly improves the recognition efficiency.


Figure 9: Solving an ambiguous case.

Figure 10 presents some matching and pose

estimation results using the proposed algorithm.

Figure 10: Some examples of 3D recognition without

ambiguity.

6 CONCLUSION

This paper has presented a new active recognition

system. The system turns a 3D object recognition

problem into a multiple silhouette recognition

problem where images of the same object from

multiple viewpoints are considered. The properties of Fourier descriptors have been used to carry out the clustering, matching and pose estimation processes.

Our method implies the use of databases with a very large number of stored silhouettes, but an efficient version of the matching process with Fourier descriptors makes it possible to solve the object recognition and pose estimation problems in a greatly reduced computation time.

On the other hand, the next best view (NBV)

method efficiently solves the frequent ambiguity

problem in recognition systems. This method is very

robust and fast, and is able to discriminate among

very close silhouettes.

REFERENCES

Adán, A., Cerrada, C., Feliu, V., 2000. Modeling Wave

Set: Definition and Application of a New Topological

Organization For 3D Object Modeling. Computer

Vision and Image Understanding 79, pp 281-307.

Bustos, B., Keim, D.A., Saupe, D., Schreck, T., Vranic, D.,

2005. Feature-based Similarity Search in 3D Object

Databases. ACM Computing Surveys (CSUR)

37(4):345-387, Association For Computing

Machinery.

Borotschnig, H., Paletta, L., Prantl, M. and Pinz, A.,

1999. A comparison of probabilistic, possibilistic and

evidence theoretic fusion schemes for active object

recognition. Computing, 62:293–319.

Deinzer, F., Denzler, J. and Niemann, H., 2003. Viewpoint

Selection. Planning Optimal Sequences of Views for

Object Recognition. In Computer Analysis of Images

and Patterns, pages 65-73, Groningen, Netherlands,

Springer.

Deinzer, F., Denzler, J., Derichs, C., Niemann, H., 2006:

Integrated Viewpoint Fusion and Viewpoint Selection

for Optimal Object Recognition. In: Chantler, M.J.;

Trucco, E. ; Fisher, R.B. (Eds.) : British Machine

Vision Conference

Helmer S. and Lowe D. G, 2004. Object Class

Recognition with Many Local Features. In Workshop

on Generative Model Based Vision 2004 (GMBV),

July.

Hutchinson, S.A. and Kak, A.C., 1992. Multisensor

Strategies Using Dempster-Shafer Belief

Accumulation. In M.A. Abidi and R.C. Gonzalez,

editors, Data Fusion in Robotics and Machine

Intelligence, chapter 4, pages 165–209. Academic Press.

Netanyahu, N. S., Piatko, C., Silverman, R., Kanungo, T.,

Mount, D. M. and Wu, Y., 2002. An efficient k-means

clustering algorithm: Analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pages 881–892, July.

Niku S. B., 2001. Introduction to Robotics, Analysis,

Systems, Applications. Prentice Hall.

Poppe, R.W. and Poel, M., 2005. Example-based pose

estimation in monocular images using compact Fourier descriptors. CTIT Technical Report series TR-CTIT-

05-49 Centre for Telematics and Information

Technology, University of Twente, Enschede. ISSN

1381-3625

Roy, S.D., Chaudhury, S. and Banerjee. S., 2004. Active

recognition through next view planning: a survey.

Pattern Recognition 37(3):429–446, March.

Sipe M. and Casasent, D., 2002. Feature space trajectory

methods for active computer vision. IEEE Trans

PAMI, Vol. 24, pp. 1634-1643, December.
