MULTIPLE KERNEL LEARNING FOR ONTOLOGY INSTANCE
MATCHING
Diego Ardila, José Abasolo and Fernando Lozano
Universidad de los Andes, Bogotá, Colombia
Keywords:
Ontology instance matching, Similarity measure combination, Multiple kernel learning, Indefinite kernels.
Abstract:
This paper proposes to apply Multiple Kernel Learning and Indefinite Kernels (IK) to combine and tune
Similarity Measures within the context of Ontology Instance Matching. We explain why MKL can be used in
parameter selection and similarity measure combination; argue that IK theory is required in order to use MKL
within this context; propose a configuration that makes use of both concepts; and present, using the IIMB
benchmark, results of a prototype to show the feasibility of this idea in comparison with other matching tools.
1 INTRODUCTION
Ontology matching is the problem of determining cor-
respondences between concepts, properties, and in-
dividuals of two or more different formal ontologies
(Euzenat and Shvaiko, 2007). The aforementioned
plays a key role in many different applications such
as data integration, data warehousing, data transfor-
mation, open government, peer-to-peer data manage-
ment, semantic web, and semantic query processing.
Currently, one of its main challenges is the selec-
tion and combination of similarity measures during
the matching process (Shvaiko and Euzenat, 2008).
Although it is broadly accepted that multiple similarity measures can help in finding better alignments, and the general opinion is that no single similarity measure can deal with all existing matching problems, we still need ways to orchestrate the available similarity measures in order to find the appropriate set for the matching task at hand.
Furthermore, even if it is possible to choose the
measures that are likely to work within a specific con-
text, the question of how to set the parameters of such functions remains open. Empirical results and the literature tell us that similarity measures work, but that they require prior tuning steps.
To overcome these issues, most of the current
proposals use probabilistic or machine learning tech-
niques in order to find the correct combination of
measures. This is a natural approach considering that
the rules that define how the measures should be com-
posed depend on the real application. Furthermore,
even for a domain expert, such rules are not necessar-
ily clear, because similarity is for humans a relative, and sometimes contradictory, concept (Laub et al.,
2007). Within this context, a learning algorithm is a
suitable option to find such rules.
In the present paper, our aim is to give new in-
sights to this problem. We propose a matching solu-
tion based on the recent research in Multiple Kernel
Learning (MKL) and Indefinite Kernels (IK). To our
knowledge, no current solution uses the algorithms and techniques employed in this article. Our main concern is to
explore other ways to find the weights that typically
need to be determined when an aggregation of similarity measures is carried out; therefore, we as-
sume the existence of an available library of similarity
measures and aggregation functions from which both
of these can be selected.
In a proof of concept prototype, the semi-
supervised learning paradigm is also integrated, as
we believe it to be suitable for this problem. First,
because of the volume of instances it is not feasible
to compare all the possible instances to find the cor-
rect correspondences; for this reason, it is necessary
to find rules that can be learned from a small subset of
instances. Second, as previously stated, the rules that
make two instances equivalent can be difficult to cap-
ture, thus, the use of a learning algorithm to find them
is better. Third, a huge amount of unlabeled data can
be easily obtained in many applications of this prob-
lem, so it is desirable to take advantage of such information.
This article is organized as follows: In the next
section some of the related work is described. Sec-
tion 3 discusses the suggested approach. Section
4 presents experimental results that validate our ap-
proach. Finally, Section 5 discusses the conclusions
and future work.
2 BACKGROUND AND RELATED
WORK
There is a lot of work related to ontology match-
ing. Some of the available reviews are (Kalfoglou
and Schorlemmer, 2005) and (Shvaiko and Euzenat,
2005). However, these reviews focus on the schema
level. This is probably because there are relatively
few works that prioritize the instance level. In fact, to
our knowledge, there is not a comprehensive review
involving ontology instance matching systems.
Specifically, concerning the challenge of tuning and selecting similarity measures, the proposals typically attempt to find a linear combination $\sum_i c_i S_i$, where each $S_i$ is a similarity measure, sometimes called a matcher, an agent, an expert, or a classification hypothesis. What tends to change across the different works is the manner in which the coefficients are found.
One of the first solutions proposed was using val-
ues obtained through empirical evaluation. For exam-
ple, this approach was used by (Castano et al., 2003)
where they set the weights using the data of several
real integration cases. Of course, some of the prob-
lems of this approach are that it can only be used in
very static contexts and that the process of tuning the
parameters can be very expensive or require the sam-
pling of many scenarios in order to have a reliable
estimation.
There are works that use different similarity mea-
sures as features of a sample so that they can em-
ploy off-the-shelf machine learning algorithms. For ex-
ample (Wang et al., 2006) uses Support Vector Ma-
chine (SVM) as the classification model. The train-
ing is achieved by creating a set of matched instance
pairs with positive labels and a set of non-matched in-
stance pairs with negative labels. A binary classifier
is trained by using different similarity measurements
as features from the two pair sets. The classifier then
acts as a pairing function taking a pair of instances
(a, b) as input and generating decision values as out-
put. Since from a Kernel Theory point of view this
is equivalent to modifying the spectrum of the Gram
matrix by replacing each of the eigenvalues with its
square, our approach captures this kind of proposal.
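To illustrate (our own toy check, not from (Wang et al., 2006)): if the feature vector of each example is taken to be its row of the Gram matrix (the empirical kernel map), then a linear kernel on these features squares the spectrum:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
K = X @ X.T  # a toy PSD Gram matrix

# A linear kernel on similarity-vector features (row i of K is the
# feature vector of example i) yields K @ K.T, whose eigenvalues are
# the squares of those of K:
lam, U = np.linalg.eigh(K)
assert np.allclose(K @ K.T, U @ np.diag(lam ** 2) @ U.T)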
(Ehrig et al., 2005) uses different machine learn-
ing techniques for classification (e.g. decision tree
learner, neural networks, support vector machines)
to assign an optimal internal weighting and thresh-
old scheme for each of the different feature/similarity
combinations of a given pair of ontologies. Machine learning methods like C4.5 capture rel-
evance values for feature/similarity combinations.
To obtain the training data, they employ an exist-
ing parametrization as input to the Parameterizable
Alignment Method to create the initial alignments for
the two ontologies. The user then validates the initial
alignments and thus generates correct training data.
Some systems define a hierarchy of similarity
measures that are combined through a preestablished
process. This approach allows the systems to define
different types of mapping in which the kind of fea-
tures that are analyzed changes. A system of this kind
is HMatch (Castano et al., 2005) that defines four
matching models. The idea is that each model re-
flects different levels of complexity within the match-
ing process. To combine the different similarities,
it defines weights according to the characteristics of
each feature. For example, each semantic relation has an associated weight $W_{sr}$ which shows the strength of the connection expressed by the relation on the involved concepts.
(Marie and Gal, 2008) proposes creating an ensemble matcher by treating each similarity matrix $M(S, S')$ as a weak classifier and finding a strong classifier using a modified version of AdaBoost. They use a compound measure formed by Precision and Recall as the error function for each iteration. The principles behind kernel theory and boosting are different, which makes it possible to complement this proposal with our ideas.
(Duchateau et al., 2008) introduces the notion of
planning to the problem of similarity measure aggre-
gation. Although this is a very interesting idea and
we believe it can be used to extend most of the current
approaches, the solution currently requires the user
to manually create or modify a decision tree. This
heavily depends on the user, who does not necessarily
know exactly how the similarity measures should be
parametrized and aggregated.
Some works propose different operators to com-
bine different similarity measures. For example,
(M. Nagy, 2010), based on the Dempster-Shafer Theory of Evidence (Diaconis, 1978), proposes using the Dempster combination rule $m_{ij}(A) = m_i \oplus m_j = \sum m_i(E_k)\, m_j(E_k)$, where $m_i$ and $m_j$ are similarity measures and $E_k$ is the similarity value for a candidate correspondence. Another similar approach is found in (Ji et al., 2008), where they define what is called the Ordered Weighted Average (OWA) operator and use the linguistic quantifiers developed by Yager (Yager, 1988).
Finally, it is worth mentioning works that attempt
to formalize the combination task. For example,
(Stahl, 2005) investigates aspects of these approaches
in order to support a more goal-directed selection as
well as to initiate the development of new techniques.
The investigation is based on a formal generalization
of the classic CBR cycle, which allows a more suit-
able analysis of the requirements, goals, assumptions,
and restrictions relevant in learning similarity mea-
sures. To simplify the selection of accurate techniques
within a particular application, as well as to create
foundations for future investigations, the work pro-
poses different categories for each of the following
three dimensions of the task of combining similarity
measures:
Semantics of Similarity Measures: Determining the Most Useful Case, Ranking the Most Useful Cases, Approximating the Utility of the Most Useful Cases, and Probabilistic Similarity Measures.
Training Data: Relative Case Utility Feedback, Absolute Utility Feedback, Absolute Case Utility Feedback, and Utility Feedback.
Learning Techniques: Probabilistic Similarity Models, Local Similarity Measures, and Feature Weights.
3 MKL FOR ONTOLOGY
MATCHING
We want to find an appropriate combination of similarity measures for an instance matching task. Specifically, our interest lies in learning a linear combination of N similarity measures with nonnegative coefficients $\beta_j$ that minimizes some error criterion $e$ within a given dataset $\Psi$. The elements of this dataset are equivalent and non-equivalent correspondences $C(I_1, I_2)$, where $I_1$ and $I_2$ are two homogeneous instance sets:

$$\min\; e\Big(\sum_j \beta_j S_j(c),\, \Psi\Big) \qquad (1)$$
$$\text{subject to } \beta \geq 0, \qquad S_i = (f_i, p_i, m_i),\quad i \in 1..N$$
We see each similarity measure as a 3-tuple $(f_i, p_i, m_i)$: $f_i$ is the actual similarity function, $p_i$ a specific set of parameter values for the function, and $m_i$ a possible mapping between the properties of the instances. Note that, according to this description, the same similarity function can be part of two different similarity measures.
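A minimal sketch of this 3-tuple view (the class, the property names, and the use of difflib as a stand-in string similarity function are ours):

from typing import Callable, Dict, NamedTuple
import difflib

class SimilarityMeasure(NamedTuple):
    f: Callable[[str, str], float]  # the similarity function f_i
    p: Dict[str, float]             # its parameter values p_i
    m: Dict[str, str]               # a property mapping m_i

def ratio(a: str, b: str) -> float:
    return difflib.SequenceMatcher(None, a, b).ratio()

# The same function f appears in two different measures, which differ
# only in the property mapping they evaluate it on:
s1 = SimilarityMeasure(ratio, {}, {"name": "label"})
s2 = SimilarityMeasure(ratio, {}, {"name": "title"})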
The advantage of incorporating the mapping be-
tween properties as an additional component of the
similarity measure is that this allows us to conduct
the instance matching process even though there is
neither a unique property mapping nor certainty con-
cerning the correct mapping at the end of the schema
matching problem. In this case, it is sufficient to view
the mappings as an additional variable during the se-
lection of the similarity measures and allow the algo-
rithm to select the mappings that provide better infor-
mation to accomplish the task.
Our interest extends only to finding equivalence
correspondences among homogeneous instance sets
in which their elements belong to equivalence classes.
For this reason, to find all the correspondences be-
tween all the instances of two ontologies, it would be
necessary to carry out a schema matching process and then to traverse the instance tree of the two ontologies in post-order.
This condition leads us to suppose that the under-
lying similarity rules are globally shared by the in-
dividuals of the set. To see this, consider a set in
which all its instances have a natural key but they can
be members of the concept PERSON or the concept
CAR. In this case, the natural key for each class will
be obviously different as is the correct combination of
similarity measures. While a person should be iden-
tified by its social security number, a car should be
identified by its license plate number.
We argue that the problem in equation (1) can be solved as an MKL problem. In the following sections we present the advantages of this algorithm and describe how it can be used in the instance matching context, assuming every similarity measure is also a kernel. Then, we explain how a kernel can be learned from a similarity measure so that the algorithm can be correctly employed.
3.1 MKL as Similarity Measure
Aggregator
If we limit ourselves to kernel functions (Scholkopf and Smola, 2001) as similarity measures, define the set of candidate correspondences as the input space, and label equivalent correspondences with $+1$ and non-equivalent correspondences with $-1$, the problem stated in equation (1) is equivalent to the MKL problem (Bach et al., 2004). Under this setting, (Bach et al., 2004) showed that this problem can be solved through the QCP of equation (2), whose basic idea is to train a classifier that minimizes the error on the dataset while also learning the optimal coefficients as part of the optimization problem.
$$\min_{\xi,\alpha}\; \xi - 2\,\mathbf{1}^T\alpha \qquad (2)$$
$$\text{subject to } 0 \leq \alpha \leq C,\quad \alpha^T y = 0,$$
$$\alpha^T D(y)\, S_j\, D(y)\,\alpha \leq \frac{\mathrm{tr}(S_j)}{c}\,\xi,\qquad \xi \in \mathbb{R},\ \alpha \in \mathbb{R}^n$$
where $D(y)$ is the diagonal matrix with diagonal $y$ (the labels), $\mathbf{1} \in \mathbb{R}^n$ is the unit vector, and $C$ a positive constant. The coefficients $\beta_j$ are recovered as the Lagrange multipliers of the constraints $\alpha^T D(y) S_j D(y)\,\alpha \leq \frac{\mathrm{tr}(S_j)}{c}\,\xi$.
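To make the formulation concrete, the following sketch states equation (2) with the cvxpy modeling package (our own illustration, assuming a recent cvxpy that exports psd_wrap; Bach et al. instead solve the problem with a dedicated SMO-style algorithm):

import numpy as np
import cvxpy as cp

def mkl_qcp(S_list, y, C=1.0, c=1.0):
    # S_list: PSD Gram matrices S_j of the candidate kernels;
    # y: labels in {-1, +1} as a numpy array.
    n = len(y)
    D = np.diag(y)
    alpha = cp.Variable(n)
    xi = cp.Variable()
    cons = [alpha >= 0, alpha <= C, y @ alpha == 0]
    quad_cons = []
    for S in S_list:
        M = (D @ S @ D + (D @ S @ D).T) / 2   # symmetrize for the PSD check
        q = cp.quad_form(alpha, cp.psd_wrap(M)) <= (np.trace(S) / c) * xi
        quad_cons.append(q)
    prob = cp.Problem(cp.Minimize(xi - 2 * cp.sum(alpha)), cons + quad_cons)
    prob.solve()
    # the beta_j are the Lagrange multipliers of the quadratic constraints
    betas = [float(q.dual_value) for q in quad_cons]
    return alpha.value, betas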
There are several advantages of using MKL within
the context of ontology matching:
First, MKL allows us to find both sparse and non-sparse combinations of similarity measures by using 1-norm and 2-norm regularization. 1-norm algorithms form a sparse linear combination, which can be useful in parameter selection, where a few kernels (the ones with the correct parametrization) encode most of the relevant information. On the other hand, 2-norm algorithms find a non-sparse combination that can be useful when features encode orthogonal characterizations of a problem (Kloft et al., 2008); in other words, this may be used to combine complementary similarity measures such as knowledge-, string-, and structure-based measures.
Second, there are very efficient methods to solve
large scale MKL problems with a large number of
kernels ((Rakotomamonjy et al., 2008), (Sonnenburg
et al., 2006)). In fact, experimental results show that
the available methods work for hundreds of thousands
of examples or hundreds of kernels to be combined
and that they have been applied in demanding applications
such as medical data fusion (Yu et al., ). This is ex-
tremely useful within the present context in which
large and complex ontologies have started to be a con-
cern.
Third, MKL directly addresses the problem of
combining similarity measures by using such com-
bination during the learning process. This is con-
trary to what happens when a neural network or any other classic machine learning algorithm uses similarity measures as features: under that condition, what is carried out is feature comparison rather than instance comparison, so the problem of instance matching is not directly addressed.
Fourth, a sparse linear combination of similarities such as the one produced by MKL is simple and easy to interpret. All a human has to do is analyze the largest non-zero terms to understand which similarities are important in classifying a pair of instances as different or equivalent.
Finally, it is worthwhile mentioning that MKL is
an extension of the SVM algorithm that is capable of
learning from small training sets of high-dimensional
data with satisfactory precision (Wang et al., 2006).
3.2 Indefinite Kernels
Even though the configuration needed to use MKL as a solver for (1) is simple and sets up a typical binary classification scenario, so far we have treated kernels as similarity measures. However, most of the current similarity measures for ontology matching are not explicitly presented as kernels. Furthermore, for most similarity functions the question of whether or not they are kernels has not even been raised.
On the one hand, kernels are very convenient func-
tions from an optimization point of view. The PSD
condition on the Gram matrix makes most of the related optimization problems convex, and as a result,
low cost computation algorithms for solving them -
such as interior-point methods (Boyd and Vanden-
berghe, 2004) - become available. The related prob-
lems would be nonlinear without this condition, and
under the current technology, intractable.
Moreover, kernels have generalization advantages over plain similarity functions, whose use implies a significant deterioration in the learning guarantee. (Srebro, 2008) found that if an input distribution can be separated, in the sense of a kernel, with a margin $\gamma$ and an error rate $\varepsilon_0$, then for any $\varepsilon_1 > 0$ it may also be separated by the kernel mapping viewed as a similarity measure, with a similarity-based margin $\varepsilon_0$ and an error rate $\varepsilon_0 + \varepsilon_1$. Because $\varepsilon_0$ and $\varepsilon_1$ do not take negative values, a kernel-based margin is never smaller than a similarity-based margin.
On the other hand, kernels also come with com-
promises and trade-offs. Their mathematical expres-
sions do not necessarily correspond to the intuition of
a good kernel as a good similarity measure and the
underlying margin in the implicit space is not usually
apparent in natural representations of the data (Balcan
and Blum, 2006). Therefore, it may be difficult for a
domain expert to use the theory to design an appro-
priate kernel for the learning task at hand. Further-
more, the requirement of positive semi-definiteness
may rule out most of the natural pairwise similarity
functions for the given problem domain.
To use MKL in the context of ontology match-
ing without losing the designability and interpretabil-
ity of similarity functions, we suggest following the
approaches that focus on finding a surrogate kernel
matrix K derived from the original similarity matrix S
(Wu et al., 2005).
In this regard, one of the first approaches was to consider all the negative eigenvalues as noise and to apply the linear transformation (3) to the similarity matrix, replacing all the negative eigenvalues with zero:
$$A_{\mathrm{clip}} = U^T a_{\mathrm{clip}}\, U \qquad (3)$$
where $a_{\mathrm{clip}} = \mathrm{diag}(I_{\lambda_1>0}, \ldots, I_{\lambda_N>0})$.
Another common approach consists in changing the signs of all negative eigenvalues, instead of making them zero, by using the following linear transformation:
$$A_{\mathrm{flip}} = U^T a_{\mathrm{flip}}\, U \qquad (4)$$
where $a_{\mathrm{flip}} = \mathrm{diag}(\mathrm{sign}(\lambda_1), \ldots, \mathrm{sign}(\lambda_N))$.
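For concreteness, a numpy sketch of these two spectrum modifications applied directly to a symmetric similarity matrix (our own illustration; it returns the modified matrix rather than the transformation A itself):

import numpy as np

def clip_spectrum(S: np.ndarray) -> np.ndarray:
    # Equation (3): treat negative eigenvalues as noise, set them to zero.
    S = (S + S.T) / 2                         # enforce symmetry
    lam, U = np.linalg.eigh(S)
    return U @ np.diag(np.maximum(lam, 0.0)) @ U.T

def flip_spectrum(S: np.ndarray) -> np.ndarray:
    # Equation (4): change the sign of the negative eigenvalues instead.
    S = (S + S.T) / 2
    lam, U = np.linalg.eigh(S)
    return U @ np.diag(np.abs(lam)) @ U.T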
What we propose is to use the optimization problem stated in (Chen et al., 2009) to alter the original similarity measure. This approach guarantees a consistent treatment of all the samples, because the same linear transformation that is applied to the original similarity matrix, i.e. the one that creates the original measure, can be applied to the new samples. Besides, a parameter $\gamma$ lets the user control how far to extend the search for the surrogate matrix.
The problem is presented in equation (5). Given a similarity matrix $S_m$ calculated from a similarity measure $S$ and whose eigendecomposition is $U \Lambda U^T$, the problem finds a linear transformation $A = U\,\mathrm{diag}(a)\,U^T$ that modifies the original similarity matrix by solving a small QCP that can be handled by standard optimization packages:
$$\min_{c,b,\xi,a}\; \frac{1}{n}\mathbf{1}^T\xi + \eta\, c^T K_a c + \gamma\, h(a) \qquad (5)$$
$$\text{subject to } \mathrm{diag}(y)(K_a c + b\mathbf{1}) \geq \mathbf{1} - \xi,\qquad \xi \geq 0,\quad \Lambda a \geq 0$$
where $h(a)$ is a convex function that regularizes the search for the modified similarity matrix toward $S_m$; for example, one can use $h(a) = \|a - a_{\mathrm{clip}}\|$ to focus the search on the vicinity of the $A_{\mathrm{clip}}$ transformation. Since there may be different regularizers, we suggest employing several of them to find surrogate matrices and allowing MKL to select the proper one. In other words, the regularization function and the parameter coefficient may be seen as further components of the similarity measure.
3.3 Putting the Ideas Together...
The following algorithm shows the suggested se-
quence to compose MKL and IK. There are three
steps in the process. The first one calculates differ-
ent transformations of similarity measures that use the
same similarity function. The second one uses MKL
with 1-norm to find a sparse combination of kernels
for each similarity function. The last one calls 2-norm
MKL to find a linear combination of similarity mea-
sures that analyzes different types of features.
Input: similarity measures, regularizers, learning parameters
Begin:
  // Step 1: learn an indefinite kernel for each combination of
  // similarity measure S, regularizer R and learning parameter LP
  for each S in similarityMeasures
    for each R in regularizers, LP in learningParameters
      learnedKernel = learn2IndefiniteK(S, R, LP);
      add(S, learnedKernel, ikList);
    end;
  end;
  // Step 2: call MKL with 1-norm to get a sparse combination
  // of kernels for each similarity function
  for each S in similarityMeasures
    kernels = getLearnedKernel(S);
    sparseKernel = mklCombination(kernels, N1);
    add(sparseKernel, sparseKList);
  end;
  // Step 3: call MKL with 2-norm over the per-function combinations
  combination = mklCombination(sparseKList, N2);
End
Output: combination of orthogonal kernels.
3.4 Labeling and Unbalanced Classes
Because MKL is a supervised learning algorithm, an
oracle that labels a small set of candidate correspon-
dences as positive is required. The negative labels
can be constructed by crossing an instance of a posi-
tive correspondence with a random instance that is not
within the set given by the oracle. Depending on the
real application, the oracle can be a human or some
other system that does not require labels to accom-
plish the alignment. In this case, what our approach can contribute is the generalization capability of the supervised learning paradigm.
On the other hand, the unbalanced nature of our input space needs to be considered. An element of our space is a candidate correspondence $(I_i, I_j)$, where $I_i$, $I_j$ are instances of the sets to be matched. Consequently, the cardinality of the input space is $N \times M$, where $N$ and $M$ are the sizes of the two sets to be mapped. Within this setting, there will be at most $\max(N, M)$ positive correspondences, making all the rest negative.
This is a typical scenario of unbalanced classes
that can be treated with Cost-Sensitive or Sampling
Techniques. For example, it is possible to choose
the undersampling method which changes the train-
ing sets by sampling a smaller majority training set
(Drummond and Holte, 2003). As the performance of
every imbalance-handling technique is highly dependent on the data set (McCarthy et al., 2005), we suggest selecting the technique by using cross-validation.
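A sketch of this labeling scheme (our own illustration; the pair representation is hypothetical). The neg_per_pos parameter doubles as a crude undersampling knob, since it caps the size of the majority (negative) class:

import random

def make_training_pairs(positives, instances_b, neg_per_pos=1, seed=0):
    # positives: matching pairs (a, b) labeled by the oracle;
    # instances_b: all instances of the second set (assumed to contain
    # at least some instances not matched by the oracle).
    rng = random.Random(seed)
    matched_b = {b for (_, b) in positives}
    data = [((a, b), +1) for (a, b) in positives]
    for (a, _) in positives:
        for _ in range(neg_per_pos):
            b = rng.choice(instances_b)
            while b in matched_b:        # cross with a non-oracle instance
                b = rng.choice(instances_b)
            data.append(((a, b), -1))
    return data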
4 EXPERIMENTS
A proof-of-concept prototype was implemented in Java. Its architecture is depicted in Figure 1.
We used the MKL implementation of Shogun
(Sonnenburg and Raetsch, 2010), and Mosek (Mos,
2010) to solve the QCP problem of equation (5).
The employed libraries of similarity measures were
(Chapman, 2009) for String measures and the Java
WordNet Similarity library for Knowledge based
measures. The Jena OWL API was used to read the OWL ontologies and the alignment files. We also incorporated a TSVM classifier as a final matcher (module 8 in Figure 1) whose training algorithm was implemented by (Joachims, 2002).
Figure 1: Architecture of the Prototype.
4.1 Kernels and Similarity Measures
A composite kernel was used to compare two correspondences $C_i$, $C_j$:
$$K(C_i, C_j) = K_{\mathrm{internal}}(C_i)\, K_{\mathrm{internal}}(C_j) \qquad (6)$$
where $K_{\mathrm{internal}}$ refers to a kernel that measures the similarity between the instances $I_1$, $I_2$ of each candidate correspondence. Considering that the product of two numbers is greater when they are close to each other, this kernel takes greater values when the two correspondences share a similar estimation of similarity.
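Equation (6) is direct to state in code; a minimal sketch (the function names are ours):

def composite_kernel(k_internal, corr_i, corr_j):
    # Equation (6): a correspondence is a pair of instances (I1, I2);
    # the composite value is the product of the two internal estimates.
    return k_internal(*corr_i) * k_internal(*corr_j)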
Two types of measures were employed as internal kernels $K_{\mathrm{internal}}$:
The first one was the function $K_{\mathrm{internal}}(I_1, I_2) = \sum P(m_{i,j}(I_1, I_2))$, where $m_{i,j}$ is a specific alignment between two property lists of the ontologies given by the mapping $m_i$, and $P$ is a local similarity function that compares how similar the values of the two properties of the two instances are. The idea behind this kernel is to follow a natural-key approach, where the identity of the instances is captured within the values of a few properties.
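A sketch of this natural-key kernel (our own rendering, assuming instances are property-to-value dictionaries, mapping is the list of aligned property pairs $m_{i,j}$, and local_sim is the local function $P$):

def natural_key_kernel(inst1, inst2, mapping, local_sim):
    # Sum the local similarity P over the property pairs aligned by m_i.
    return sum(local_sim(inst1[p1], inst2[p2]) for (p1, p2) in mapping)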
The second internal kernel was an adapted ver-
sion of the tree-like function described in (Xue et al.,
2009) that is not stated as a Kernel. This measure
aims to find structural and semantic similarity. Its ba-
sic principle is to find how far apart two instance trees
are by computing the operations needed to transform
one tree into another. Since both functions need to measure similarity between the values of the prop-
erties, the following list of local similarity functions
was used:
String based: BlockDistance, ChapmanLength-
Deviation, CosineSimilarity, DiceSimilarity, Eu-
clideanDistance, JaccardSimilarity, JaroWinkler,
Levenshtein, MatchingCoefficient, MongeElkan,
SmithWatermanGotoh.
Knowledge based: Lin, Resnik, Path, WuAnd-
Palmer.
Clearly, the fact that these similarity measures
were employed at the local level makes most of our
kernels indefinite.
4.2 Test Set and Results
We used the IIMB benchmark (Ferrara et al., 2008) as
basis for our preliminary experiments. The IIMB is
an evaluation dataset for the OAEI conference track
which consists of several transformations to a ref-
erence ontology. This ontology contains 5 named
classes, 4 object properties, 13 datatype properties
and 302 individuals. We make clear that we have not
participated in the official campaign.
There are a total of 37 matching tasks in the
benchmark. Each one introduces a class of modifi-
cations over the original value/s of a specific property
within the source ontology. For example, there are ty-
pographical error simulations, changes in the aggre-
gation level, and instantiation on different subclasses
of the same individual. Only the first 19 match-
ing tasks were tested because similarity measures de-
signed to capture logic heterogeneity were not em-
ployed.
The standard set of evaluation measures for ontology matching was used:
Precision: the number of correct retrieved mappings / the number of retrieved mappings.
Recall: the number of correct retrieved mappings / the number of expected mappings.
F-measure: 2 × (precision × recall) / (precision + recall).
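As a trivial sketch, with mappings represented as sets of instance pairs:

def evaluate(retrieved, expected):
    # retrieved, expected: non-empty sets of (instance1, instance2) mappings
    correct = len(retrieved & expected)
    precision = correct / len(retrieved)
    recall = correct / len(expected)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure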
Figure 2: Prototype Behavior vs Other Systems.
Figure 2 shows the results of the prototype (MKL-IK) and of the systems that participated in the OAEI 2009 (Euzenat et al., 2009).
Our prototype was comparable to the other tools on the selected matching tasks. Besides, although it was usually outperformed by some other system, it showed consistent behavior across every data set. Both facts lead us to believe that the suggested approach should be further explored.
5 CONCLUSIONS AND FUTURE
WORK
In the present paper, we proposed to use Multiple Ker-
nel Learning (MKL) to combine similarity measures
within the context of Ontology Instance Matching.
We described the advantages of MKL and explained
how it can be used to address the problem. The need to find surrogate similarity matrices in order to use this algorithm within this context has been explained, and a possible approach to accomplish the task has been submitted. This approach consists in computing a linear
transformation that searches for the surrogate matrix
within the vicinity of the original similarity matrix.
We also suggested an algorithm that makes use of
both concepts and pointed out how the unbalanced-class issues of the suggested configuration can be faced. In
addition, we implemented a proof of concept proto-
type and partially tested it using the IIMB benchmark.
The results suggest that our approach is feasible and
should be explored to extend current matching solutions or to create new ones.
Our current research follows different directions.
We are particularly studying the internal behavior of
the algorithm and conducting a profound assessment
of the prototype through the use of other benchmarks and test datasets. Furthermore, we are analyzing possi-
ble performance issues that may appear with this ap-
proach.
REFERENCES
MOSEK (2010). The MOSEK optimization software.
Bach, F. R., Lanckriet, G. R. G., and Jordan, M. I. (2004).
Multiple kernel learning, conic duality, and the smo
algorithm. In ICML ’04: Proceedings of the twenty-
first international conference on Machine learning,
page 6, New York, NY, USA. ACM.
Balcan, M.-F. and Blum, A. (2006). On a theory of learn-
ing with similarity functions. In ICML ’06: Proceed-
ings of the 23rd international conference on Machine
learning, pages 73–80, New York, NY, USA. ACM.
Boyd, S. and Vandenberghe, L. (2004). Convex Optimiza-
tion. Cambridge University Press, New York, NY,
USA.
Castano, S., Ferrara, A., and Montanelli, S. (2003). H-
match: an algorithm for dynamically matching on-
tologies in peer-based systems. In Proc. of the 1st
VLDB Int. Workshop on Semantic Web and Databases
(SWDB 2003), Berlin, Germany.
Castano, S., Ferrara, A., and Montanelli, S. (2005). Match-
ing ontologies in open networked systems: Tech-
niques and applications. Journal on Data Semantics,
V.
Chapman, S. (2009). Simmetrics.
Chen, Y., Gupta, M. R., and Recht, B. (2009). Learning
kernels from indefinite similarities. In ICML ’09: Pro-
ceedings of the 26th Annual International Conference
on Machine Learning, pages 145–152, New York, NY,
USA. ACM.
Diaconis, P. (1978). Review of A Mathematical Theory of Evidence (Glenn Shafer). Journal of the American Statistical Association, 73(363):677–678.
Drummond, C. and Holte, R. C. (2003). C4.5, class imbal-
ance, and cost sensitivity: Why under-sampling beats
over-sampling. pages 1–8.
Duchateau, F., Bellahsene, Z., and Coletta, R. (2008).
A flexible approach for planning schema match-
ing algorithms. In OTM ’08: Proceedings of the
OTM 2008 Confederated International Conferences,
CoopIS, DOA, GADA, IS, and ODBASE 2008., pages
249–264, Berlin, Heidelberg. Springer-Verlag.
Ehrig, M., Staab, S., and Sure, Y. (2005). Bootstrapping on-
tology alignment methods with apfel. In WWW ’05:
Special interest tracks and posters of the 14th inter-
national conference on World Wide Web, pages 1148–
1149, New York, NY, USA. ACM.
Euzenat, J., Ferrara, A., Hollink, L., Isaac, A., Joslyn, C., Malaisé, V., Meilicke, C., Nikolov, A., Pane, J., Sabou, M., Scharffe, F., Shvaiko, P., Spiliopoulos, V., Stuckenschmidt, H., Sváb-Zamazal, O., Svátek, V., dos Santos, C. T., Vouros, G. A., and Wang, S. (2009). Results of the ontology alignment evaluation initiative 2009. In OM.
Euzenat, J. and Shvaiko, P. (2007). Ontology matching.
Springer-Verlag, Heidelberg (DE).
Ferrara, A., Lorusso, D., Montanelli, S., and Varese, G.
(2008). Towards a benchmark for instance match-
ing. In Shvaiko, P., Euzenat, J., Giunchiglia, F., and
Stuckenschmidt, H., editors, Ontology Matching (OM
2008), volume 431 of CEUR Workshop Proceedings.
CEUR-WS.org.
Ji, Q., Haase, P., and Qi, G. (2008). G.: Combination of
similarity measures in ontology matching using the
owa operator. In In: Proceedings of the 12th Inter-
national Conference on Information Processing and
Management of Uncertainty in Knowledge-Base Sys-
tems.
Joachims, T. (2002). SVM light.
Kalfoglou, Y. and Schorlemmer, M. (2005). Ontology map-
ping: The state of the art. In Semantic Interoperability
and Integration, Dagstuhl Seminar Proceedings. Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI).
Laub, J., Macke, J., Muller, K.-R., and Wichmann, F. A.
(2007). Inducing metric violations in human similar-
ity judgements. In Advances in Neural Information
Processing Systems 19, pages 777–784. MIT Press,
Cambridge, MA.
M. Nagy, M. V.-V. (2010). Towards an automatic semantic data integration: Multi-agent framework approach.
Marie, A. and Gal, A. (2008). Boosting schema match-
ers. In OTM ’08: Proceedings of the OTM 2008 Con-
federated International Conferences, CoopIS, DOA,
GADA, IS, and ODBASE 2008., pages 283–300,
Berlin, Heidelberg. Springer-Verlag.
Kloft, M., Brefeld, U., Laskov, P., and Sonnenburg, S. (2008). Non-sparse multiple kernel learning.
McCarthy, K., Zabar, B., and Weiss, G. (2005). Does cost-
sensitive learning beat sampling for classifying rare
classes? In UBDM ’05: Proceedings of the 1st in-
ternational workshop on Utility-based data mining,
pages 69–77, New York, NY, USA. ACM.
Rakotomamonjy, A., Bach, F., Canu, S., and Grandvalet, Y.
(2008). SimpleMKL. Journal of Machine Learning
Research, 9.
Scholkopf, B. and Smola, A. J. (2001). Learning with Ker-
nels: Support Vector Machines, Regularization, Opti-
mization, and Beyond. MIT Press, Cambridge, MA,
USA.
Shvaiko, P. and Euzenat, J. (2008). Ten challenges for ontol-
ogy matching. In On the Move to Meaningful Internet
Systems: OTM 2008, volume 5332 of Lecture Notes
in Computer Science, chapter 18, pages 1164–1182.
Berlin, Heidelberg.
Shvaiko, P. and Euzenat, J. (2005). A survey of schema-based matching approaches. Journal on Data Semantics, 4:146–171.
Sonnenburg, S. and Raetsch, G. (2010). Shogun.
Sonnenburg, S., Rätsch, G., Schäfer, C., and Schölkopf, B. (2006). Large scale multiple kernel learning. J. Mach. Learn. Res., 7:1531–1565.
Srebro, N. (2008). How good is a kernel when used as a
similarity measure?
Stahl, A. (2005). Learning similarity measures: A formal view based on a generalized CBR model. In Case-Based Reasoning Research and Development (ICCBR 2005), pages 507–521. Springer.
Wang, C., Lu, J., and Zhang, G. (2006). Integration of on-
tology data through learning instance matching. In WI
’06: Proceedings of the 2006 IEEE/WIC/ACM Inter-
national Conference on Web Intelligence, pages 536–
539, Washington, DC, USA. IEEE Computer Society.
Wu, G., Chang, E. Y., and Zhang, Z. (2005). An analysis
of transformation on non-positive semidefinite simi-
larity matrix for kernel machines. In Proceedings of
the 22nd International Conference on Machine Learn-
ing.
Xue, Y., Wang, C., Ghenniwa, H., and Shen, W. (2009). A
tree similarity measuring method and its application
to ontology comparison. j-jucs, 15(9):1766–1781.
Yager, R. R. (1988). On ordered weighted averaging ag-
gregation operators in multicriteria decisionmaking.
IEEE Trans. Syst. Man Cybern., 18(1):183–190.
Yu, S., Falck, T., Daemen, A., Tranchevent, L.-C., Suykens,
J. A. K., De Moor, B., and Moreau, Y.