IMPROVING 3D SHAPE RETRIEVAL WITH SVM
Elisabetta Delponte, Curzio Basso, Francesca Odone and Enrico Puppo
Department of Computer and Information Sciences - University of Genova, Via Dodecaneso 35, 16146 Genova, Italy
Keywords:
Shape retrieval, Statistical learning, Support Vector Machines.
Abstract:
In this paper we propose a technique that combines a classification method from the statistical learning literature with a conventional approach to shape retrieval. The idea that we pursue is to improve both results and performance by filtering the database of shapes before retrieval with a shape classifier, which allows us to keep only the shapes belonging to the classes most similar to the query shape. The experimental analysis that we report shows that our approach reduces the computational cost in the average case and leads to better results.
1 INTRODUCTION
Recent advances in 3D digitization techniques have led to an increase in the number of available 3D shape datasets: see, for instance, the Princeton dataset, the National Taiwan University database, or the data collected under the EU-funded project AIM@SHAPE (http://shapes.aim-at-shape.net). In order to be able to access such 3D shape repositories, it may be convenient to set up retrieval systems that support the user in extracting from a potentially large repository only the shapes that match a given specification.
In this paper we focus on the query-by-example (QBE) approach, whereby retrieval is based on applying appropriate similarity measures to the shape descriptors of the query example and of all or some shapes from the repository (also known as the gallery or dataset). Accordingly, most research in this direction has focused on finding robust and discriminative shape descriptors, on top of which rather conventional similarity measures are applied. Although efficient indexing techniques may be adopted, this approach clearly suffers as the number of shapes in the repository increases, and retrieval can easily fail.
We propose to adopt a combined strategy that couples conventional shape retrieval with classification methods from the statistical learning literature, in order to increase retrieval efficiency and effectiveness as the size of the shape repository grows. As for the classification algorithm to adopt,
we focus here on the well-known Support Vector Machines (SVMs) (Vapnik, 1998). Different regularized approaches, e.g., RLS (Rifkin et al., 2003) or iterative methods (Lo Gerfo et al., 2008), could be applied as well, at the price of some loss of performance.
Throughout the paper, we will point out how the method that we propose is independent of the specific choice of shape descriptor and distance measure. In this initial work, we start from a simple and well-known approach to shape retrieval (Osada et al., 2002) and show that filtering the repository based on classification can improve performance and results. Since our method is completely independent of the descriptor used, as long as it is a global vector of fixed length, it will be easy in the future to employ other shape descriptors with better retrieval performance.
More specifically, let us assume that we have a repository of labeled shapes divided into N classes, with each shape described by means of the D2 descriptor (Osada et al., 2002). The key idea is to exploit a classification tool to select a reduced number of classes most similar to a given query shape. After filtering out the less relevant classes, we perform a standard retrieval, based on a similarity measure (the L1 norm) computed only for the shapes belonging to the retained classes. We show that this initial classification step improves both the performance and the effectiveness of retrieval.
2 BACKGROUND ON STATISTICAL LEARNING

In this section we recall some basics of statistical learning that will be used throughout the paper. Specifically, we focus on penalized empirical risk minimization approaches to supervised learning and, implicitly, we refer to classification problems.
We first assume we are given two random variables $x \in X \subseteq \mathbb{R}^d$ (input) and $y \in Y \subseteq \mathbb{R}$ (output). In the binary classification case, the elements $x \in X$ are feature vectors (e.g., shape descriptors), while $y \in \{-1, 1\}$. We then consider a set of data $S = \{(x_i, y_i) \mid 1 \leq i \leq n\}$, which we call a training set, obtained by randomly sampling the set $X \times Y$.
Supervised learning approaches use the training set to learn a function $f : X \to Y$ that can be applied to previously unseen data. Indeed, we say that an algorithm is predictive, or that it has good generalization properties, if it applies successfully to data other than the training examples. A large class of algorithms is based on the minimization of the penalized empirical risk in a given space of functions $H$:

$$\min_{f \in H} \frac{1}{n} \sum_{i=1}^{n} V(y_i, f(x_i)) + \lambda J[f]$$

where $V$ is some loss function measuring the solution's goodness of fit, $J[f]$ is a functional of the function $f$ penalizing complexity, and $\lambda$ is a regularization parameter that trades off between the two terms.
Within this framework we will refer in particular to the so-called Tikhonov regularization, which uses the $L_2$-norm in $H$ as the penalization term:

$$\min_{f \in H} \frac{1}{n} \sum_{i=1}^{n} V(y_i, f(x_i)) + \lambda \|f\|_H^2 .$$
As for the choice of $H$, a very useful class of spaces that encode a notion of smoothness in their norm are the Reproducing Kernel Hilbert Spaces (RKHS). It can be shown that for every RKHS there exists a corresponding unique positive-definite function $K$ that we call a kernel function. Conversely, for each positive-definite function $K$ on $X \times X$ there is a unique RKHS $H$ that has $K$ as a reproducing kernel. A very important theorem in statistical learning is the representer theorem, which states that, under general conditions on the loss function $V$, the minimizer of a Tikhonov regularization problem in an RKHS associated to a kernel $K$ is of the form

$$f(x) = \sum_{i=1}^{n} c_i K(x_i, x) \qquad (1)$$

for some $(c_1, \ldots, c_n) \in \mathbb{R}^n$.
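For concreteness, here is a minimal Python sketch of evaluating a solution of the form (1), assuming the coefficients and training points are given; the toy data and the choice of a plain dot-product kernel are our own illustrative assumptions, not part of the original formulation.

```python
import numpy as np

def evaluate_f(x, train_X, c, kernel):
    """Evaluate f(x) = sum_i c_i K(x_i, x), the expansion given by the representer theorem."""
    return sum(ci * kernel(xi, x) for xi, ci in zip(train_X, c))

linear_kernel = np.dot  # plain dot product: the simplest positive-definite kernel on R^d

# Toy data, purely illustrative: three training points and their coefficients.
train_X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
c = np.array([0.5, -0.2, 0.7])
print(evaluate_f(np.array([0.5, 0.5]), train_X, c, linear_kernel))  # f at a new point
```

Note that the cost of evaluating $f$ grows with the number of non-zero coefficients, a point that becomes central in Section 4.2.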
Roughly speaking, the kernel function $K$ can be expressed as a dot product in a higher-dimensional space. Choosing a kernel or, equivalently, choosing a hypothesis space, allows us to formalize a non-linear problem by mapping the original observations (feature vectors in the input space) into a different space where a linear algorithm may be applied. Thanks to the kernel, the mapping may be performed implicitly with the so-called kernel trick:

$$K(x_1, x_2) = \phi(x_1) \cdot \phi(x_2).$$
This makes a linear classification in the new space equivalent to a non-linear classification in the original space. In this work we use Gaussian kernels, $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / 2\sigma^2)$. The parameter $\sigma$ is the width of the kernel and needs to be tuned appropriately. Its choice is somewhat complementary to the choice of $\lambda$: a small $\sigma$ may lead to overfitting, a large $\sigma$ to oversmoothing.
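The following minimal sketch shows this trade-off numerically; the widths are illustrative values of our own.

```python
import numpy as np

def gaussian_kernel(xi, xj, sigma):
    """K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    return np.exp(-np.linalg.norm(xi - xj) ** 2 / (2.0 * sigma ** 2))

xi, xj = np.array([0.0, 0.0]), np.array([1.0, 1.0])
for sigma in (0.1, 1.0, 10.0):
    # small sigma: K is nearly 0 for any two distinct points (risk of overfitting);
    # large sigma: K is nearly 1 everywhere (risk of oversmoothing)
    print(sigma, gaussian_kernel(xi, xj, sigma))
```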
The choice of the loss function leads to different learning algorithms. The $L_2$ loss leads to Regularized Least Squares (RLS) algorithms (Caponnetto and De Vito, 2006), while the so-called hinge loss, $(1 - y f(x))_+$, leads to Support Vector Machines (SVMs) (Vapnik, 1998). SVMs have been used with success in a number of different application domains. They are characterized by many nice properties, some of which are not apparent from the regularized formulation followed in this section. We mention here, as it will be useful later in the paper, the fact that they produce a solution that is sparse on the set of input data. This means that the solution $(c_1, \ldots, c_n)$ they produce will usually contain few non-zero entries. The training data associated with non-zero weights are referred to as support vectors. For a geometric intuition of this property the reader is referred to (Vapnik, 1998).
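To make the role of the loss concrete, the small sketch below (with illustrative margin values of our own) compares the two losses: the hinge loss is exactly zero for confidently correct predictions, which is the mechanism behind the sparsity just described.

```python
import numpy as np

def hinge_loss(y, fx):
    """Hinge loss (1 - y f(x))_+ : exactly zero when the margin y*f(x) >= 1."""
    return np.maximum(0.0, 1.0 - y * fx)

def squared_loss(y, fx):
    """Squared (L2) loss (y - f(x))^2 : non-zero for almost every prediction."""
    return (y - fx) ** 2

for fx in (-0.5, 0.9, 1.0, 2.0):  # predictions for a positive example (y = +1)
    print(f"f(x)={fx:+.1f}  hinge={hinge_loss(1, fx):.2f}  squared={squared_loss(1, fx):.2f}")
```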
3 RELATED WORK
The problem of retrieval and matching of shapes has
been extensively studied in numerous fields such as
computer vision, computer graphics and molecular
biology.
Most approaches to shape retrieval are based on shape descriptors and exhaustive search. A shape descriptor is computed for each object in a database, as well as for the query object. Then the descriptor of the query object is compared with all descriptors of objects in the database through some pattern matching measure, and a ranked list of the most similar objects is retrieved.
A large class of works deals with global descriptors in the form of feature vectors, i.e., arrays of scalar values computed by some analysis of the 3D shape. Some of these feature vectors are in fact histograms, either of scalar fields computed on the shape or of some form of shape distribution. The D2 shape distribution histogram (Osada et al., 2002) that we adopt here is one of the simplest descriptors to compute, although more recent descriptors exhibit better performance. As already remarked, the scope of this work is somewhat orthogonal to the descriptor adopted, as long as the latter falls in the class of feature vectors, so our approach could (and will) be adopted also with other descriptors. Other works use descriptors based on local features and more complex matching procedures.
A few approaches to 3D object retrieval based on statistical learning have been proposed in the literature. In their seminal work, Elad, Tal and Ar (Elad et al., 2001) proposed a semi-interactive retrieval system based on a feature vector descriptor and on the use of SVMs together with relevance feedback. In (Leifman et al., 2005) the relevance feedback mechanism is combined with discriminant analysis techniques to achieve a reduction of the dimensionality of the feature vectors. Different methods based on relevance feedback were compared by Novotni et al. (Novotni et al., 2005), and those based on SVMs were found to give better performance. Relevance feedback with SVMs was also adopted in a more recent work by Leng, Qin and Li (Leng et al., 2007), where the authors adopt an algorithm for SVM active learning previously proposed for image retrieval.
Hou, Lou and Ramani (Hou et al., 2005) use SVMs to organize a database of shapes through clustering and then to perform classification and retrieval. In (Xu and Li, 2007) the authors, assuming a training set structured in N classes, employ a similarity measure including a term that depends on the ranked output of N one-vs-all classifiers. Shape classification has also been addressed by Barutcuoglu and DeCoro (Barutcuoglu and DeCoro, 2006), who use a Bayesian approach to exploit the dependence between classes, assuming that they are organized in a hierarchical fashion. However, their work is focused only on the classification problem, and retrieval is not addressed.
A peculiarity of kernel methods is that they allow for the design of ad hoc kernels able to capture the expressiveness of feature vectors. A number of kernels for different application domains have been proposed in the literature (Shawe-Taylor and Cristianini, 2004). For instance, a possible way to deal with histogram-like descriptions is to treat them as probability distributions and to resort to kernels defined on probability measures (Jebara et al., 2004). Alternatively, kernel functions derived from signal or image processing may be adopted; see, for instance, (Odone et al., 2005).
All the methods above feed the classification algorithms with shape descriptors of a single type. In (Akgul et al., 2008) the authors proposed a fusion algorithm, based on so-called empirical ranking risk minimization, which combines different descriptors; the algorithm can also be implemented via SVMs. The learning algorithm returns the weights to associate to the various descriptors. After this, conventional retrieval may be performed, for instance by means of relevance feedback.
4 OUR APPROACH
We propose a technique that combines classical retrieval methods with a classification approach typical of the learning-from-examples framework. We assume that our repository of shapes is labeled and that, for simplicity, a unique label is assigned to each shape. We can thus organize our repository into classes, using the labels to build a dataset for each class. The idea that we pursue is to reduce the complexity of the search by filtering the available repository before retrieval, i.e., by using only shapes belonging to the k classes most relevant to the query object.
As a benchmark method, we consider the popular approach proposed by Osada et al., based on statistical shape descriptors (Osada et al., 2002): the shape distribution of each 3D model is represented by its D2 descriptor, which consists of a histogram of the distances between pairs of points randomly selected on the shape surface; two descriptors are compared with the L1 distance. For a given query object, its descriptor is compared with all the descriptors available in the database, and the output returned by the retrieval system is a ranked list of the n most similar shapes.
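For reference, a minimal sketch of the D2 descriptor and the L1 comparison, assuming the points have already been sampled on the shape surface and the shapes rescaled to a common size (the sampling and normalization details of the original method differ; all names here are our own):

```python
import numpy as np

def d2_descriptor(points, n_pairs=10000, n_bins=64, max_dist=1.0, rng=None):
    """D2 shape distribution: normalized histogram of distances between random point pairs.

    points: (M, 3) array of points sampled on the shape surface; max_dist assumes
    shapes were rescaled to a common size so that histograms are comparable.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    i = rng.integers(0, len(points), size=n_pairs)
    j = rng.integers(0, len(points), size=n_pairs)
    dists = np.linalg.norm(points[i] - points[j], axis=1)
    hist, _ = np.histogram(dists, bins=n_bins, range=(0.0, max_dist))
    return hist / n_pairs

def l1_distance(h1, h2):
    """L1 distance between two descriptors, as used for retrieval."""
    return np.abs(h1 - h2).sum()
```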
In this paper, we adopt the same shape descriptor and follow a similar pipeline. Our variation of the original pipeline consists of performing shape classification prior to shape retrieval, in order to reduce the size of the repository of shapes to be analyzed.
The remainder of the section describes the details of the classification (filtering) procedure and discusses the computational advantages of our choice. We then evaluate the appropriateness of our choice in terms of nearest-neighbor performance and retrieval indicators (the so-called first tier and second tier).
4.1 Classification and Retrieval
Let us assume we start from a shape repository containing M labeled shapes belonging to N different classes; the number of shapes in each class may vary arbitrarily.
We aim at designing a shape classifier that returns the shape classes $\hat{S} = \{S_1, \ldots, S_k\}$ most similar to a query object. This will allow us to restrict retrieval to the shapes belonging to the set $\hat{S}$.
The shape classification problem that we consider is a multi-class classification problem. We adopt a one-vs-all procedure, which requires training N binary classifiers of the type class $C_i$, with $i = 1, \ldots, N$, versus all the other classes; see, e.g., (Bishop, 2006). Thus, for each class we train a classifier: each training set is made of an equal number of positive and negative examples, where the negative examples are extracted from all the other classes of the repository.
In this work we adopt Support Vector Machines (SVMs) (Vapnik, 1998) with Gaussian kernels as the classification algorithm. Our choice is mainly motivated by performance reasons, in particular by the fact that the solution of an SVM classifier is sparse on the training data, but other binary classifiers could be used within the same pipeline. In practice, we use the SVMlight implementation (Joachims, 1999). The optimal values of the two free parameters, i.e., the width σ of the Gaussian kernel and the SVM regularization parameter, are set with a standard leave-one-out (LOO) procedure (see, for instance, (Bishop, 2006)).
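A minimal sketch of this one-vs-all training with LOO model selection follows; we use scikit-learn's SVC purely for illustration (the paper uses SVMlight), and the parameter grids, as well as the omitted balancing of positive and negative examples, are our own simplifications.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.svm import SVC

def train_one_vs_all(X, y, target_class):
    """Train one binary SVM (target class vs. the rest) with LOO selection of C and gamma."""
    labels = np.where(y == target_class, 1, -1)
    param_grid = {"C": [0.1, 1, 10, 100],       # regularization parameter (illustrative grid)
                  "gamma": [0.01, 0.1, 1, 10]}  # gamma = 1/(2 sigma^2) for the RBF kernel
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=LeaveOneOut())
    search.fit(X, labels)
    return search.best_estimator_

# One classifier per class; balancing of positives and negatives is omitted here.
# classifiers = {c: train_one_vs_all(X_train, y_train, c) for c in np.unique(y_train)}
```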
At run time, we test a query object against each available classifier and rank the classification outputs, thus obtaining a list of classes sorted from the most to the least similar to the given query. We exploit this result to filter the shapes available in the repository before applying a standard nearest neighbor (NN) retrieval. In other words, given a query example, we test it against the N classifiers, rank the results, and keep the shapes belonging to the first k classes for the subsequent retrieval process.
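Putting the two steps together, here is a sketch of the run-time filter-then-retrieve procedure under the same illustrative assumptions (classifiers exposing a real-valued decision_function, descriptors stored as rows of gallery_X; all names are our own):

```python
import numpy as np

def filter_and_retrieve(query, classifiers, gallery_X, gallery_y, k=4, n_results=10):
    """Rank classes by SVM score, keep the top k, then run L1 nearest-neighbor retrieval."""
    scores = {c: clf.decision_function(query[None, :])[0]    # one score per class
              for c, clf in classifiers.items()}
    kept = sorted(scores, key=scores.get, reverse=True)[:k]  # k most relevant classes
    mask = np.isin(gallery_y, kept)                          # filter the repository
    dists = np.abs(gallery_X[mask] - query).sum(axis=1)      # L1 distance to survivors
    order = np.argsort(dists)[:n_results]
    return np.flatnonzero(mask)[order]                       # gallery indices, most similar first
```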
Notice that the choice of an appropriate k is crucial both for computational and for performance reasons. A small k makes the retrieval very fast but may impoverish the results. A large k (close to N) would increase retrieval time without necessarily improving performance; see the discussion in Section 5.
4.2 Computational Advantages
The computational cost of shape filtering followed by a retrieval restricted to the filtered classes is dominated by the more expensive of the two operations.
Note that if the classification method is not sparse, there is no computational advantage. Indeed, if the filtering procedure has to evaluate the kernel $K(x_i, x)$ in Eq. (1) for all training shapes $x_i$, then the filtering phase is more costly than the actual retrieval, no matter the choice of k. For instance, with a Regularized Least Squares (RLS) approach (Caponnetto and De Vito, 2006), the query shape $x$ is compared with each training shape $x_i$ via the kernel function $K$, and for each class $C$ the summation in Eq. (1) runs over all the training data in that class.
In the case of SVMs, instead, the sparsity of the obtained solution on the training data means that filtering requires fewer comparisons, one per support vector, since the summation runs over the support vectors only. There is no way to evaluate a priori the number of support vectors, as it depends on the training set (both its cardinality and the data representation chosen) and on the choice of the kernel function. In the average case, and assuming an appropriate choice of the representation and of the kernel function, we observed a saving in terms of the number of comparisons between the query object and the training data. This effect becomes more relevant as the training set size grows (see Section 5 and Figure 1).
5 EXPERIMENTAL RESULTS
In this section we show the results of retrieval performed on a subset of the Princeton Shape Benchmark (PSB) repository (Shilane et al., 2004), and we compare our results with the ones reported in (Osada et al., 2002). The PSB is a publicly available database of 3D models, widely adopted by the shape retrieval community. The repository contains 1814 polygonal models, classified by humans with respect to function and form into 27 classes. For our experiments we chose a subset of 18 PSB classes that fulfill requirements of intra-class shape homogeneity and a cardinality threshold (we keep only classes with at least 10 elements in both the training and the test set).
To evaluate the performance of the retrieval we use the following evaluation measures (a minimal sketch of their computation follows the list):

Nearest Neighbor (NN): the percentage of closest matches that belong to the same class as the query.

First Tier (I-T): the percentage of models in the query's class that appear within the top k matches, where k is the size of the query's class.

Second Tier (II-T): the percentage of models in the query's class that appear within the top 2k matches, where k is the size of the query's class.
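As promised above, a minimal sketch of how these measures can be computed for a single query, assuming the retrieval step returns a ranked list of class labels (the interface names are our own):

```python
import numpy as np

def retrieval_metrics(ranked_labels, query_label, class_size):
    """NN, first-tier and second-tier scores for a single query's ranked result list."""
    ranked = np.asarray(ranked_labels)
    nn = float(ranked[0] == query_label)                     # is the closest match correct?
    k = class_size
    first_tier = np.sum(ranked[:k] == query_label) / k       # fraction of the class in top k
    second_tier = np.sum(ranked[:2 * k] == query_label) / k  # fraction of the class in top 2k
    return nn, first_tier, second_tier
```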
The means shown in Figure 2 and Table 1 are weighted with respect to the size of the test set.

Figure 1: Comparison between the original size of the training sets and the number of support vectors for the 18 classes considered; the sparsity of the solution is more noticeable for repositories with more than 20 entries.

We trained 18 SVM classifiers according to the previously described procedure. After a preliminary
analysis of the performance of different standard kernels, we adopted a Gaussian kernel. Because of the presence of very small classes, we adopted a LOO procedure to select the regularization parameter and the parameter σ of the Gaussian kernel. Figure 1 shows how the number of support vectors becomes significantly smaller than the training set size as the latter grows. This is an advantage, suggesting that if we adopt SVMs the sparsity of the solution with respect to the training data may translate into an overall computational saving. However, the dataset considered in our experiments is too small to allow for an exhaustive analysis. At run time, we test all the query shapes against the 18 classifiers.
As we pointed out previously, the choice of the number of classes to keep for shape retrieval is crucial, and there is no obvious common-sense rule to apply. Obviously the presence of the right class in the reduced set of shapes does not guarantee a successful retrieval, but its absence means that the retrieved elements will all be wrong. At the same time, we notice that, as the size of the repository grows, the average values of the retrieval indicators degrade. Figure 2 shows how they vary as the number of retained classes grows. From this analysis we conclude that a small number of classes (2 to 5) should be kept, both for efficiency and for performance reasons. The remaining experiments are performed with k = 4.
Figure 2: Dependence of retrieval performance on the number k of retained classes. The three lines show the metrics we used: first tier, second tier and nearest neighbor.

By analysing the performance for the different classes, it is possible to notice that the performance of direct retrieval is comparable to or above that of our filtering method for those classes with very small training sets (fewer than 10 elements), while SVM filtering is a clear advantage when the training set has more than 40 elements.
Table 1 reports the average retrieval results over all the classes, in the case of direct retrieval and of SVM filtering with k = 4. Notice that direct retrieval corresponds to the results obtained with the original method by Osada et al. (Osada et al., 2002) on our datasets. The advantage of our approach is evident. The results presented in the original work (Osada et al., 2002) refer to a different (and smaller) repository, and the performances are described with different indicators, therefore a direct comparison is harder. We conclude by reporting that (Gal et al., 2007) gives the following results, obtained with the Osada approach on a different subset of the PSB: I-T = 33%, II-T = 47%, NN = 59%. Their NN result appears superior to the one we obtain; the reasons may lie in the different characteristics of the selected classes and in the fact that, apparently, only stable subsets of shapes per class are kept.
Table 1: Results with our approach (retaining 4 classes) and our implementation of (Osada et al., 2002).

                         I-T    II-T   NN
  SVM filtering          33%    55%    47%
  (Osada et al., 2002)   24%    39%    44%

6 CONCLUSIONS

We have proposed a method based on SVMs for filtering the relevant classes of a 3D object database prior to shape retrieval. SVM classifiers are built for all classes of objects in the database, and only the k classes most relevant to a query object are searched to answer a query by similarity. We have shown that not only can our method improve performance by pruning the repository to be searched, but its results are also better than those obtained with exhaustive search using the same shape descriptor.
The shape descriptor used in this initial work is outperformed by other state-of-the-art descriptors. Therefore, we plan to test our approach also with other, better-performing descriptors. As already mentioned, our filtering is somewhat orthogonal with respect to the descriptor used. However, the quality of a descriptor may also influence the performance of classification through the SVM. If some other descriptor gave us better classification performance, we could restrict the search to an even smaller number k of classes, thus improving performance further.
ACKNOWLEDGEMENTS
The work is supported by the PRIN project 3SHIRT (3-dimensional Shape Indexing and Retrieval Techniques), funded by the Italian Ministry of Research and Education.
REFERENCES
Akgul, C. B., Sankur, B., Yemez, Y., and Schmitt, F. (2008).
Similarity score fusion by ranking risk minimization for 3D object retrieval. In Eurographics Workshop on 3D Object Retrieval.
Barutcuoglu, Z. and DeCoro, C. (2006). Hierarchical shape classification using Bayesian aggregation. In SMI '06:
Proceedings of the IEEE International Conference
on Shape Modeling and Applications 2006, page 44,
Washington, DC, USA. IEEE Computer Society.
Bishop, C. (2006). Pattern Recognition and Machine
Learning. Springer.
Caponnetto, A. and De Vito, E. (2006). Optimal rates for
regularized least-squares algorithm. Found. Comput.
Math. In Press, DOI 10.1007/s10208-006-0196-8,
Online August 2006.
Elad, M., Tal, A., and Ar, S. (2001). Content based retrieval of VRML objects: an iterative and interactive approach.
In Proceedings of the sixth Eurographics workshop
on Multimedia 2001, pages 107–118, New York, NY,
USA. Springer-Verlag New York, Inc.
Gal, R., Shamir, A., and Cohen-Or, D. (2007). Pose-
oblivious shape signature. IEEE Transactions on Vi-
sualization and Computer Graphics, 13(2):261–271.
Hou, S., Lou, K., and Ramani, K. (2005). SVM-based semantic clustering and retrieval of a 3D model database. Computer Aided Design and Applications,
2(1-4):155–164.
Jebara, T., Kondor, R. I., and Howard, A. (2004). Proba-
bility product kernels. Journal of Machine Learning
Research, 5:819–844.
Joachims, T. (1999). Making large-scale SVM learning practical. In Schölkopf, B., Burges, C., and Smola, A., editors, Advances in Kernel Methods - Support Vector Learning. MIT Press.
Leifman, G., Meir, R., and Tal, A. (2005). Semantic-oriented 3D shape retrieval using relevance feedback.
The Visual Computer.
Leng, B., Qin, Z., and Li, L. (2007). Support vector machine active learning for 3D model retrieval. Journal of
Zhejiang University SCIENCE A, 8(12):1953–1961.
Lo Gerfo, L., Rosasco, L., Odone, F., De Vito, E., and Verri,
A. (2008). Spectral algorithms for supervised learn-
ing. Neural Computation, 20:1873–1897.
Novotni, M., Park, G., Wessel, R., and Klein, R. (2005).
Evaluation of kernel based methods for relevance feedback in 3D shape retrieval. In Proceedings 4th
Int. Workshop on Content-based Multimedia Indexing,
Riga, Latvia.
Odone, F., Barla, A., and Verri, A. (2005). Building ker-
nels from binary strings for image matching. IEEE
Transactions on Image Processing, 14(2):169–180.
Osada, R., Funkhouser, T., Chazelle, B., and Dobkin, D.
(2002). Shape distributions. ACM Trans. Graph.,
21(4):807–832.
Rifkin, R., Yeo, G., and Poggio, T. (2003). Regularized
least-squares classification.
Shilane, P., Min, P., Kazhdan, M., and Funkhouser, T.
(2004). The Princeton Shape Benchmark. In SMI
’04: Proceedings of the Shape Modeling International
2004, pages 167–178, Washington, DC, USA. IEEE
Computer Society.
Shawe-Taylor, J. and Cristianini, N. (2004). Kernel Methods for
Pattern Analysis. Cambridge University Press.
Vapnik, V. (1998). Statistical Learning Theory. Adaptive and Learning Systems for Signal Processing, Communications, and Control. John Wiley and Sons Inc.
Xu, D. and Li, H. (2007). 3D shape retrieval integrated
with classification information. In Proceedings of
the Fourth International Conference on Image and
Graphics, pages 774–779, Washington, DC, USA.
IEEE Computer Society.