Description and Evaluation of Algorithms for Ontology Matching

Mario Blanco-Alonso, Francisco J. Rodríguez-Martínez and Lorena Otero-Cerdeira

LIA2 Group - Computer Science Department, University of Vigo, Ourense, Spain

Keywords:

Didactic Framework, MaasMatch, CODI, LogMap, Ontology, Matching.

Abstract:

In this paper, we present a state of the art about Ontology Matching Algorithms and we propose a general

classiﬁcation of them. A selection of three algorithms to work with a concrete platform is presented: CODI,

LogMap and MaasMatch.

In addition we propose a testbed divided in three groups of tests to evaluate the algorithms .

These algorithms were tested and evaluated to verify which was the most suitable for this problem.

1 INTRODUCTION

The existence of computer systems that handle in-

formation and entities related to the real world has

caused a growing interest in techniques and processes

that can build connections among them. This shows

the need to identify and link the heterogeneous infor-

mation present in multiple resources. Since this re-

quirement was detected, several proposals appeared

to solve and facilitate, especially, the treatment of the

ambiguity on these relationships. This paper focuses

on semantic techniques since, in Ontology Matching

they are the most prominent among the available ones.

An ontology deﬁnes a vocabulary that describes a

domain of interest and a speciﬁcation of the meaning

of terms used in the vocabulary (David et al., 2010).

Depending on both, the accuracy of the speciﬁcation

and, its purpose and scope, the notion of ontology

ranges from, groups of terms and classiﬁcations, to

database schemas (Euzenat and Shvaiko, 2007).

There are different environments where Ontolo-

gies and Ontology Matching(alignment of terms) can

be used. Based on these ontologies, various systems

arose to align in such way that it was possible to gen-

erate relations in comparable domains.

We propose a speciﬁc case to apply Ontology

Matching, both in a theoretical and practical way, re-

lated to a didactic platform. This platform has two

main types of users, "Teacher" and "Student", who

create and use contents like: templates, tables, mod-

els, exercises, etc.

This platform is used in several schools that gen-

erate sets of terms that could refer to the same mean-

ing but with different names. Using the right tools

its possible to create labels to tag and identify con-

tents, users and units. Thus, to assume that system

is mapped and to make it easier to create groups un-

der some rules and conditions, we need to manually

tag the speciﬁc domain. In such way the system can

model the knowledge with a set of primitives called

Ontologies.

In the context of this platform, the goal is to map

the semantic information to use it as a base to create

relations between the Ontologies that deﬁne the con-

cepts. Therefore its possible to get automatic inter-

connections that are discernible similarity relations.

For example, a teacher may create new content and,

to identify it, tag it using the label "subject" while

in another school, another teacher could deﬁne new

enhanced content about the same topic, but adding a

label "course" rather than "subject". Those two con-

cepts are related and, therefore, they should belong

to the same group. The alignment "subject-course"

does exactly that, it classiﬁes a concept that should be

grouped using a relationship in an alignment.

This is why it seems mandatory that once the plat-

form is created with its own ontological representa-

tion, it will be possible to interconnect the knowledge,

even when its allocated in different geographic loca-

tions and managed by different people. As a restric-

tion to the algorithm that would align the ontologies,

it must also align the knowledge consistently keeping

a great degree of coherence.

With regard to Ontology Matching, it has achieved

a satisfactory level of success but still has challenges

at the operational and computational level which lead

492

Blanco-Alonso M., Rodríguez-Martínez F. and Otero-Cerdeira L..

Description and Evaluation of Algorithms for Ontology Matching.

DOI: 10.5220/0004253304920495

In Proceedings of the 5th International Conference on Agents and Artiﬁcial Intelligence (ICAART-2013), pages 492-495

ISBN: 978-989-8565-39-6

 2013 SCITEPRESS (Science and Technology Publications, Lda.)

to a very open research line.

In recent decades diverse solutions have been

proposed for Ontology Matching (Batini and

M.Lenzerini, 1986; Spaccapietra et al., 1992), several

recent surveys (Bellahsene et al., 2011; Shvaiko

and Euzenat, 2008; Sakarkar and Upadhye, 2010)

and some books have also been written (Euzenat

and Shvaiko, 2007; Bellahsene et al., 2011) on the

subject.

The rest of the paper is organized as follows.In

Section 2 the basis of the platform are speciﬁed, in

section 3 a brief review of the best existing algorithms

based on Ontology Matching. Section 4 shows the

benchmarks and tests. And ﬁnally In section 5 there

is a conclusion about the general results and the be-

havior of each algorithm.

2 PLATFORM BASIS

The main focus of this didactic platform IntellTec5.1

is to create a virtual environment for interactive e-

learning, promoting the development of a creative, au-

tonomous and independent pedagogy.

We are not limited to upload traditional materials to

the cloud, in fact we provide:

• A domain to elaborate learning materials

• Support tools

• Data and knowledge sources

• Self-assessment activities

• Collaborative learning

To introduce and explain the platform mecha-

nisms, we have to distinguish between two potential

users: Teachers and Students. Teachers can manage

activities and the progress of every student. Therefore

with this specialized training and teaching it is easy

to empower the students’ motivation to become

specialized professionals taking advantage of the new

technologies and improving their own skills to draw

upon the e-learning capabilities and internet potential.

About the built-in management system that each

user has in IntellTec5.1:

• Teacher. This kind of user has a set of tools to

develop its course topics, units, agenda and the

on-line activities to improve its new acquired

skills. They can evaluate students on-line, check

their speciﬁc and over-all progress, take care of

individuals, etc.

Each teacher can create and upload its own con-

tent, activities, exercises using the integrated tool

to create the content using templates, completely

editable and conﬁgurable. For those teachers

who don’t want to create new content, we have a

database with a recompilation of every template

and activity skeleton created by any teacher in

the world. The focus of this approach is to share

and grow the knowledge and good ideas, so any

student in any school is able to take advantage of

the best activities and applications.

Then a teacher can manage students, create

digital contents and new interactive, traditional

and auto-evaluable tasks.

• Student. The whole system has the student as the

main target to improve the e-learning experience

and focusing on the motivation and inspiration to

create and use the digital advantages and the in-

ternet potential.

We encourage the exploration of the data kept in

our knowledge base with the interactive activi-

ties approach making a great enhancement to self-

taught learning and provoking a proﬁle of student

who is eager to explore by himself.

The platform lets the student enjoy a collabora-

tive domain, the use of an easy but complete set

of tools to develop the digital skills to create, use

and explore digital content and internet by him-

self.

The focus of this paper is to study a possible so-

lution to some side problems like knowledge repre-

sentation, knowledge interaction and grouping, sug-

gestion functionalities, intelligent sharing and an easy

way to show and use a vast source of specialized in-

formation.

3 ALGORITHMS THAT

GENERATE ALIGNMENTS

Regarding the measure effectiveness of an Ontology

Matching algorithm, there are several ways to select

which one is better, depending on the type of tests,

data and complexity used to compare them. It will

change which algorithm is the better in every case.

There are evaluators and generic tests to evaluate and

compare algorithms and determine which one is the

best in each ﬁeld or goal.

CODI(Huber et al., 2011), LogMap(Jiménez-Ruiz

et al., 2011) and MaasMatch(Schadd and Roos, 2011)

are algorithms that were designed to solve problems

similar to the didactic platform proposed in this re-

search paper. They are specialized to solve align-

ments prioritizing the logic and coherence in the re-

sulting ontologies.

DescriptionandEvaluationofAlgorithmsforOntologyMatching

493

3.1 CODI

Combinatorial Optimization for Data Integration

(CODI) leverages terminological structure for Ontol-

ogy matching. The current implementation produces

mappings between concepts, properties, and individu-

als. The system combines lexical similarity measures

with schema information to completely avoid inco-

herence and inconsistency during the alignment pro-

cess(Huber et al., 2011).

CODI is based on instance classiﬁcation, so it is

grouped as an instance-based technique with the ad-

dition of the lexical similarity measures.

3.2 LogMap

LogMap was developed to be capable of computing

scalable and logic-based Ontologies. It’s able to take

advantage of diagnosis techniques, make alignments

and output mappings that don’t lead to logical incon-

sistencies when integrated with the input ontologies.

Therefore, the resulting alignments don’t have inco-

herences. The second focus of this algorithm is to

achieve a low runtime.

3.3 MaasMatch

MaasMatch is an Ontology Matching tool that focuses

on resolving terminological heterogeneities, such that

entities with the same meaning but differing names

and entities with the same name but different mean-

ings are identiﬁed as such and matched accordingly.

4 BENCHMARK TEST

4.1 Benchmark Tracks

The goal of the benchmark data set is to provide a

stable and detailed picture of each algorithm. For that

purpose, algorithms are run on systematically gener-

ated test cases.

4.1.1 Bonus Tests

• Anatomy. This test confronts the existing match-

ing technology with a speciﬁc type of ontologies

from the biomedical domain. In this domain,

many ontologies have been build covering dif-

ferent aspects of medical research. We focus on

fragmentsof two biomedical ontologies which de-

scribe the human anatomy and the anatomy of the

mouse(Shvaiko et al., 2011).

Table 1: Anatomy results.

Anatomy CODI LogMap MaasMatch

Precision 0.965 0.948 0.995

Recall 0.825 0.846 0.287

score 0.879 0.894 0.445

• Conference. The main features are that the on-

tologies were developed independently and based

on different resources using different terminolo-

gies and points of view, and most ontologies

were equipped with OWL DL axioms of various

kinds opening a way to use semantic matchers.

In addition, we tested the runtime of the algo-

rithms(Shvaiko et al., 2011).

5 CONCLUSIONS AND FUTURE

WORK

In this paper, an evaluation, classiﬁcation and selec-

tion of Ontology Matching algorithms is proposed.

The main pre-condition to select the algorithms for

this study is that they must be able to generate coher-

ent alignments (without disjointness), which is only

achieved by three of the studied algorithms(over 40).

This work was made to distinguishing relevant algo-

rithms applicable to a real project with some restric-

tions and real life issues. The proposed algorithms

to face the problem were CODI, LogMap and Maas-

Match.

The experimental results of this paper show that

there are algorithms that yield great success with

the proposed restrictions and seem to be promising.

Based on the results, CODI seems like it is the best

one of them with a very good F-measure, coherent

alignments and capable of aligning large ontologies.

LogMap could be selected if it didn’t yield inco-

herent alignments since its runtime is the best. Test-

ing and using these algorithms gave a good expertise

of what is done and what should be enhanced.

MaasMatch is an average algorithm: it has bet-

ter runtime than CODI, but worse than LogMap. It

has a good F-measure, but worse than CODI and

LogMap. It can handle large ontologies as the other

two. MaasMatch has more incoherences than CODI

and LogMap, therefore this algorithm is excluded

since it was outperformed in every test.

Based on the research made and results evaluated,

we have concluded that we have to use LogMap ap-

proach and CODI approach to implement three ways

to handle the issues detected in the didactic platform.

First, we will use LogMap as a real-time suggestion

tool for the teacher users, then they can type down or

ICAART2013-InternationalConferenceonAgentsandArtificialIntelligence

494

Table 2: Conference results.

Conference Prec Rec F

0.5

-Meas F

-Meas

CODI 0.74 0.57 0.7 0.64 0.6

LogMap 0.85 0.5 0.75 0.63 0.54

MaasMatch 0.83 0.42 0.69 0.56 0.47

Table 3: Average size of alignments, number of incoherent alignments, and average degree of incoherence.

Coherence Size Inc. Alignments Degree of Inc. Reasoning problems

CODI 9.5 0/91 0% 0

LogMap 8 8/91 2% 0

MaasMatch 7.5 21/91 4% 0

select a set of proposed tags to use in the new or mod-

iﬁed content. Second, we will use CODI as a lazy

alignment tool server side, this way we make align-

ments and Ontology enrichment with only our data

base information, therefore we reach a coherent set of

knowledge. Third, with the database knowledgeprop-

erly tagged, grouped, and with its own representation

(as ontologies) we will assign or suggest tags, con-

tent and activities by LogMap in real time when it’s

necessary, therefore we ﬁnally set up a mixed strat-

egy with CODI and LogMap advantages. After this

test is done, the next step will be to mix our schema,

tags, ontologies and generated taxonomies with a set

of taxonomies made of every verb and name of inter-

est in our case, it’s like a WordNet but specialized and

limited within our target ﬁeld of interest.

About the algorithms tested, there is still a lot of work

to do. Firstly, in the sense of runtime-coherence we

need to look for a good method to reduce the runtime

like LogMap does, but preventing the coherence loss

like it’s done by CODI. This is a great challenge and

a key step to really using Ontology Matching in real

systems.

Secondly, almost every algorithm depends on mapped

knowledge, only a few does auto-mapping, and those

who do it they have a very low conﬁdence on the gen-

erated data. With machine learning technologies this

can be improved by creating more useful and reliable

ontologies. Thirdly, to improve precision-recall ratios

and to improve capabilities to face new problems and

scalability.

These three challenges are the main and permanent

lines of improvement in every algorithm that has been

made, so they can’t be removed from the scope.

ACKNOWLEDGEMENTS

This work is part of the project "TSI-090500-2011-36

- Ministerio de Industria, Turismo y Comercio", and

was also supported by Sandra Castro, Noelia Gil and

J.M. Castro from Intellectia Bank S.A.

REFERENCES

Batini, C. and M.Lenzerini (1986). A comparative analy-

sis of methodologies for database schema integration.

ACM Computing Surveys, 18(4).

Bellahsene, Z., Bonifati, A., Duchateau, F., and Velegrakis,

Y. (2011). On Evaluating Schema Matching and Map-

ping.

David, J., Euzenat, J., Scharffe, F., and dos Santos, C. T.

(2010). The Alignment API 4.0. IOS Press, 1:1 – 8.

Euzenat, J. and Shvaiko, P. (2007). Ontology Matching.

Springer.

Huber, J., Sztyler, T., Nößner, J., and Meilicke, C. (2011).

Codi: Combinatorial optimization for data integra-

tion: results for oaei 2011. In (Shvaiko et al., 2011).

Jiménez-Ruiz, E., Morant, A., and Cuenca Grau, B. (2011).

LogMap results for OAEI 2011. In Proc. of the 6th

International Workshop on Ontology Matching (OM),

volume 814. CEUR Workshop Proceedings (CEUR-

WS.org). http://ceur-ws.org/Vol-814/.

Sakarkar, G. and Upadhye, S. (2010). A survey of software

agent and ontology. International Journal of Com-

puter Applications, 1(7).

Schadd, F. C. and Roos, N. (2011). Maasmatch results for

oaei 2011. In (Shvaiko et al., 2011).

Shvaiko, P. and Euzenat, J. (2008). Ten challenges for on-

tology matching.

Shvaiko, P., Euzenat, J., Heath, T., Quix, C., Mao, M., and

Cruz, I. F., editors (2011). Proceedings of the 6th In-

ternational Workshop on Ontology Matching, Bonn,

Germany, October 24, 2011, volume 814 of CEUR

Workshop Proceedings. CEUR-WS.org.

Spaccapietra, S., Parent, C., and Dupont, Y. (1992). Model

Independent Assertions for Integration of Heteroge-

neous Schemas. VLDB Journal, 1:81 – 126.

DescriptionandEvaluationofAlgorithmsforOntologyMatching

495