A LRAAM-based Partial Order Function for Ontology Matching in the
Context of Service Discovery
Hendrik Ludolph¹, Peter Kropf¹ and Gilbert Babin²
¹Institute of Computer Science, University of Neuchâtel, 2015 Neuchâtel, Switzerland
²Information Technologies, HEC Montréal, 3000, ch. Côte-Ste-Catherine, Montréal (QC) H3T 2A7, Canada
Keywords: Ontology Matching, Neural Network, Service, Integration.
Abstract: The demand for Software as a Service is increasing rapidly in the era of the Cloud. With this demand comes a proliferation of third-party service offerings to fulfill it. It thus becomes crucial for organizations to find and select the right services to be integrated into their existing tool landscapes, ideally automatically and continuously, so as to always provide the best possible support to changing business needs. In this paper, we explore an artificial neural network implementation, an LRAAM, as the specific oracle to control the selection process. We implemented a proof of concept and conducted experiments to explore the validity of the approach. We show that our implementation of the LRAAM performs correctly under specific parameters. We also identify limitations in using the LRAAM in this context.
1 INTRODUCTION
Today, more than ever, Information Technology (IT),
and more specifically Information Systems (IS) are
necessary for an organization to succeed. An IS can
be defined as a specific assembly of applications to
support distinctive enterprise needs (Izza, 2009). As
an organization evolves, so does the IS supporting it.
Many factors put pressure on the organization, which
in reaction will evolve. The sources of this pressure
include, but are not limited to, competition, internal
and external politics, organizational evolution, tech-
nical progress, and cost containment. The changes
induced by these pressure sources lead to a reconfig-
uration of the organization, which in turns leads to a
reconfiguration of the IS supporting it. We anticipate
that these changes will tend to occur more frequently
and more rapidly. The high frequency of technolog-
ical innovations brought to the market illustrates this
tendency. This is further enhanced by the presence of
cloud-based solutions, i.e. SaaS. As a consequence of
this trend, we must find ways to increase our ability
to evolve the organization’s IS as often and as quickly
as it is required to maintain the stability of the organi-
zation. In a perfect world, the IS itself would “know”
when change is required and would “adapt” itself to
better fit the needs of the organization. This might
only occur if the IS has some understanding of what the organization's requirements are and how they evolve. To adapt adequately, the IS must be able to identify the alternatives from which to select the most appropriate response to the change. The
research presented in this paper is set in that vision
of autonomic IS adaptation in the context of service
oriented architecture (SOA).
The basic idea of SOA is to modularize and
wrap applications behind formally described access
points or interfaces, e.g., Application Programming
Interfaces (APIs) which follow more or less rigor-
ous protocols (e.g., SOAP, REST, JSON (Erl et al.,
2014)) and are accessible over the network. Through
the APIs, applications’ functionalities can be auto-
matically discovered (e.g., using UDDI registries;
see (Kale, 2014)) and consumed as a service. Ser-
vices can represent anything from simple service re-
quests to complicated business processes (Lehaney
et al., 2011). They can participate in many differ-
ent IS (Kale, 2014). With SOA, eventually, the ob-
jective is to lower integration hurdles and increase
reusability of applications. It empowers organizations
to assemble complex IS with unprecedented flexibil-
ity and sophistication as business requirements shift
over time (Erl, 2004).
The SOA principle is applicable beyond organi-
zational limits. Specialized service providers, such
as SalesForce, ServiceNow, Akamai, etc. emerged to
extend on-premise SOA to off-premise cloud-based
services. They commoditize and commercialize ser-
vices, such as Customer Relationship Management,
IT Service Management, Performance & Availability.
Other organizations request, contract, and integrate
these services into their internal IS instead of setting
up the functionality by themselves. This way, no ex-
pensive technical know-how for the service is needed.
If later the service is no longer useful, the contract is
cancelled. The service, technically and commercially,
disappears from the organizational scope. This flex-
ibility appeals to more and more organizations these
days to improve their IS (Cisco, 2014). Some authors
even claim that it becomes mandatory for keeping
a competitive advantage (Fensel and Bussler, 2002;
Hussain and Mastan, 2014). The commercial success
of some service providers, such as SalesForce, acts
as an incentive for new firms to enter the SaaS market
(see (Frank and Cartwright, 2013, Chapter 11) on the long-term zero-profit equilibrium) and offer similar (some-
times identical) services. This leads to a proliferat-
ing number of similar cloud-based services to choose
from (Bughin and Chui, 2010).
In this context, the organization’s IS is an assem-
bly of services (in-house or cloud-based). It provides
a more flexible and easier to adapt solution to support
all requirements of the organization. Hence, whenever a change in the organization's requirements occurs, the set of all available services is searched to select those which best support the changed requirements, obsolete or inappropriate services are removed from the IS scope, and the newly selected services are integrated.
Current industrial integration techniques, such
as traditional middleware (e.g., remote procedure
call mechanisms, data-oriented, component-oriented,
message-oriented, application servers), EAI tools
(e.g., MS Biztalk, Tibco), BPM (e.g., BPEL, BPMN),
or SOA (Oracle Service Bus) do support service in-
tegration. However, these approaches do not lend
themselves to autonomic IS adaptation. For many
authors (Izza, 2009; Fensel et al., 2011; Hoang
et al., 2014; Hoang and Le, 2009), these integra-
tion techniques are agnostic to the most crucial aspect
of autonomic IS adaptation, that is, understanding
the semantics of services. Indeed, these techniques
merely regulate information and focus on meta-data
exchange. They are syntactic in nature. The same au-
thors suggest using Semantic Web-based approaches,
which enable machines to “understand” the meaning,
i.e., the semantics of services. In our view, semantic
understanding is required both to determine changes
in requirements, comparing the old and the new re-
quirements, and how to resolve the differences, iden-
tifying and selecting appropriate services.
The work presented in this paper focuses on one
of the tasks that must be performed by the auto-
nomic IS system: service selection and composition
(SSC). In our view, SSC should (1) automatically se-
lect the “best” service available; (2) automatically in-
tegrate the selected service to the organization’s IS;
and (3) do this continuously as part of the autonomic
IS environment. All this is based on the premise that
we can automatically determine how well a service
supports requirements, and by extension, that we can
rank services by their level of support. It is clear that
the selection process goes beyond simple discovery
as it represents a degree of intelligence, namely iden-
tification and analysis towards synthesis of possible
actions (Zdravković et al., 2014). We further limit the
scope in this paper to the selection process itself. In-
deed, once selection is performed, existing SOA tech-
niques can be used to facilitate/automate the actual
integration process.
Specifically, the paper presents exploratory results
on a novel service selection approach which uses on-
tological description of services. That is, we assume
that both organization’s requirements and service of-
ferings are described and represented using an onto-
logical notation, such as OWL (Web Ontology Lan-
guage), in addition to the usual descriptions used for
SOA (e.g., UDDI, SOAP, etc.), which are by nature
syntactic, as they describe the APIs and the data struc-
ture, but do not provide any information about the
purpose of the service, at least not in a form that
can be processed by a computer. In (Ludolph et al.,
2011), the authors present a global service selection
algorithm. The algorithm assumes the existence of a
service matching function. In this paper, we focus on
the definition of the service matching function, which
lacked in (Ludolph et al., 2011), using an ontology
matching approach. The main issue addressed in the
paper, therefore, pertains to the specification and anal-
ysis of a partial order function used to rank services
from most appropriate to least appropriate based on
the similarity of their ontological description.
We first start with a brief analysis of the use of
ontology matching approaches in the context of ser-
vice composition (Sec. 2). This sets the context in
which the partial order function is required. We then
explore different alternative approaches to define such
a partial order function. Established symbol-driven
(Sec. 2.2) as opposed to neural network matching
techniques (Sec. 2.3) are discussed. From the latter
category, in Section 3, the LRAAM, a specific type
of artificial neural network, is further investigated. In
Sections 4 and 5, we describe experiments using an
LRAAM-based partial order function to perform on-
tology matching. The paper is concluded in Section 6
with a critical discussion of the results, the approach’s
limitations and an outlook on future research.
2 THE MATCHING PROBLEM
Following (Fensel et al., 2011) and (Born et al., 2007), semantic descriptions of services are necessary to establish and warrant interoperability that does not require a human to manually effect integrations which would rapidly become obsolete or non-reusable in a dynamically evolving environment. The
approach described in this paper thus focuses on a
semantics-based approach towards more intelligent
SSC. We assume that business activities (e.g., a task in BPMN terminology) and their relationships, as well as independent, competing services, are described using semantic descriptions. In our context, we fo-
cus on what a service provides as opposed to on how
to access it technically (e.g., its UDDI description).
A common approach to supply semantic descriptions is the use of ontologies. An ontology is a formal
representation of some knowledge domain. It con-
tains relevant entities and their relations. It is based
on formal semantics, i.e., logic, allowing for machine
reasoning (see (Antoniou and van Harmelen, 2008;
Antoniou and van Harmelen, 2009) for a detailed in-
troduction). Figure 1 illustrates such an ontological
description, where we can identify five distinct in-
stances of Activity. A similar ontological description
is assumed for services.
[Figure 1: Simple ontology for related business activities. A Process concept with five Activity instances: MeetWithCustomer, RequestCreditAmount, CapturePersonID, CreditChecking, and CreditApproval.]
Provided that the organizational requirements de-
scriptions are available in a standardized form (e.g.,
OWL-based ontologies), we assume that the descrip-
tion of a business activity will be more or less similar to a corresponding service. Therefore, "similarity" could be used as a selection criterion. It follows
that the most similar service for a certain business
activity will be the one selected. Eventually, once
all services required are selected, service composition
can be accomplished by adequate attribute mapping.
These mappings could be automatically constructed
or adapted through syntactic or semantic matching
techniques (Euzenat and Shvaiko, 2013).
2.1 Finding a Good Oracle for Ontology
Matching
Automatically establishing service–requirement map-
pings on a large scale remains a challenge (Diallo,
2014; Otero-Cerdeira et al., 2015). To this we add
other challenges: (1) the number of distinct ontolo-
gies to evaluate in a continuous manner, considering
the evolving organizational context described above,
(2) the efficiency in terms of search space and time
consumption, (3) the effectiveness in terms of correct
and complete identification of semantic correspon-
dences (Rahm, 2011), (4) the potential use of possibly
fragmented background knowledge, and (5) the user
involvement (Shvaiko and Euzenat, 2013).
Matching is the fundamental operation to identify similarities (Rahm and Bernstein, 2001) between two ontologies. It takes two ontologies as input and produces a mapping between semantically similar entities. Following the approach described in (Ludolph et al., 2011), we start from an ontological description of a business process (such as in Fig. 1) to identify all activities (using an "is-a" relationship). Then, we look at all valid pairs of consecutive activities {a, a′} in the business process (using a "preceded by" relationship). In this context, activity a is a predecessor of a′ in the process. We also consider their contextual use within the business process. We seek to identify a pair of services {s, s′} that most adequately supports the sequence of activities {a, a′}. Services are also identified using an "is-a" relationship in the service ontologies. A generic ontology fragment, called a sequence ontology, is used to represent the precedence relationship (Ludolph et al., 2011). The sequence ontology has two placeholders, one for the predecessor activity (service), and one for the successor activity (service). Using the sequence ontology, we construct the set R = {r_aa′} of reference ontologies by replacing both placeholders by a and a′, respectively. In the same way, we construct the set C = {c_ss′} of compound ontologies representing the composition of services s and s′, such that s ≠ s′.
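As a rough illustration of this construction step, the following Python sketch builds R from consecutive activity pairs and C from all ordered service pairs. The template function and dict representation are hypothetical stand-ins for the actual OWL sequence-ontology fragment:

from itertools import permutations

# Hypothetical sketch: the real sequence ontology is an OWL fragment;
# here a dict stands in for it.
def fill_sequence_template(predecessor, successor):
    """Replace the two placeholders of the generic sequence ontology."""
    return {"predecessor": predecessor, "successor": successor}

activities = ["CapturePersonID", "CreditChecking", "CreditApproving"]
services = ["CreditApproving", "CreditChecking"]

# R: one reference ontology r_aa' per consecutive activity pair (a, a')
R = [fill_sequence_template(a, a_next)
     for a, a_next in zip(activities, activities[1:])]

# C: one compound ontology c_ss' per ordered service pair (s, s'), s != s'
C = [fill_sequence_template(s, s_next)
     for s, s_next in permutations(services, 2)]

print(len(R), len(C))  # 2 reference and 2 compound ontologies here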
Reference and compound ontologies are compared against one another to evaluate a matrix D = [d_rc], where d_rc is the ontological distance between reference ontology r and compound ontology c. In this context, a distance d_rc = 0 indicates identical ontologies r and c, while the value of d_rc increases as the dissimilarity between r and c increases. Using the ontological distance, the general matching algorithm (Ludolph et al., 2011) then tries
to find an optimal solution which identifies the best
matches among R and C and which minimizes (1) in-
tegration costs, (2) costs of on-premise applications
packages/add-on’s providing services bundles, and
(3) costs of off-premise, cloud-based services. The
optimal solution must fulfill the following constraints:
(1) each r must be matched with exactly one c, (2) at
most one c is matched with an r, and (3) all service
sequences must be coherent with the business activity
sequences.
Under these conditions, a perfect set of services, that is, one for which d_rc = 0, ∀r ∈ R, c ∈ C, would return an optimal, integrated sequence of services to support a predefined business process. Integration costs would be negligible.
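Constraints (1) and (2) make this search resemble a classic assignment problem over D. The following hedged sketch minimizes the total distance under those two constraints only; it is not the authors' algorithm, which additionally enforces the sequence-coherence constraint (3):

import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative only: constraints (1) and (2) alone reduce the search to an
# assignment problem over D = [d_rc]; constraint (3), coherence of the
# service sequences, would require an additional feasibility check.
D = np.array([[0.9, 0.1, 0.5],   # |R| x |C| example distances
              [0.4, 0.8, 0.2]])

rows, cols = linear_sum_assignment(D)   # minimizes the total distance
for r, c in zip(rows, cols):
    print(f"reference {r} -> compound {c} (d_rc = {D[r, c]})")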
2.2 Symbol-based Methods to Match Ontologies

The real challenge not addressed in (Ludolph et al., 2011) is in explicitly defining a distance function d_rc. Indeed, the authors hypothesize that such a function exists. In general, distance can be defined and measured in different ways. An example is the Hamming distance. To obtain it, one counts the minimum number of letter substitutions required to transform one string into another string; the fewer the substitutions, the more similar the strings, and the smaller the distance. For example, for two strings a = 'ibm' and b = 'hal', d_Ham(a, b) = 3. Using a symbol-based approach, determining an ontological distance is somewhat equivalent to calculating the Hamming distance.
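For concreteness, a minimal Python rendering of the Hamming distance used in this example:

def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(x != y for x, y in zip(a, b))

print(hamming("ibm", "hal"))  # 3, as in the example above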
Ontology matching designates the process of finding semantic similarities. Following (Kotis et al., 2006), the matching of two ontologies o and o′ can be defined as a morphism from o to o′. One approach to determine a distance would therefore be to determine how many steps are optimally required to morph from ontology o to ontology o′.
The associated matching task is to find an alignment A between o and o′ (Euzenat and Shvaiko, 2007). An alignment is a set of correspondences, each of which is a 4-tuple ⟨id, e_o, e_o′, r⟩, with id the correspondence identifier, e_o and e_o′ entities (e.g., classes, properties) of the compared ontologies, and r ∈ {⊑, =, ⊒, ⊥} the identified relation, read as: is less general than, equal to, is more general than, disjoint from (Atencia et al., 2011). Various techniques are used to find correspondences. Table 1 presents a classification into syntactic and (formal) semantic matching techniques. These techniques focus on individual elements or whole structures. They may analyze frequency distribution or specific languages' morphologies. They
introduce external data repositories. Eventually, they
all work on discrete arbitrary objects, that is, symbols.
They rely on the exactness of the analyzed represen-
tations to draw appropriate conclusions.
This, however, is at odds with the fact that the rep-
resentation of requirements is by nature imprecise.
These approaches are therefore insufficient to prop-
erly determine the distance.
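To make the correspondence structure above concrete, here is a small, hypothetical Python rendering of the 4-tuple; the class and relation names are ours, not from the cited works:

from dataclasses import dataclass
from enum import Enum

class Relation(Enum):
    LESS_GENERAL = "subsumed by"   # is less general than
    EQUAL = "equal"
    MORE_GENERAL = "subsumes"      # is more general than
    DISJOINT = "disjoint"

@dataclass(frozen=True)
class Correspondence:
    """The 4-tuple <id, e_o, e_o', r> described above."""
    id: str
    entity_o: str        # entity (class/property) of ontology o
    entity_o_prime: str  # entity of ontology o'
    relation: Relation

# An alignment A is a set of such correspondences:
A = {Correspondence("c1", "BusinessActivity", "Service", Relation.MORE_GENERAL)}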
2.3 LRAAM as Matching Function
To overcome the shortcomings of symbol-based
matching techniques in the context of impre-
cise requirements, we combine them with a non-
deterministic approach, namely an artificial neural
network (ANN (Hinton et al., 1986)) implemented
as LRAAM (Labelled Recursive Auto-Associative
Memory (Sperduti, 1993)). Typical symbols in a sym-
bolic model are letter strings. They may be placed
into structured relationships with other symbols, e.g.,
subsumption. However, they do not possess inter-
nal structure on their own (Blank et al., 1992). In
contrast, ANNs incorporate distributed representa-
tions (Chan, 2003), also called subsymbolic repre-
sentations. These representations might evolve into
different patterns, but nevertheless, still behave in
a way related to the original pattern. Structure is
thus inherent to the representation, also called micro-
semantic (see Tab. 2). Our view is that in a highly dynamic environment the goal is not to find an exact service, but rather the most similar service to support an activity.
To support this view, a fine-grained (continuous) dis-
tributed pattern as opposed to a coarse-grained (dis-
crete) symbolic pattern is evaluated. ANN’s proper-
ties do not arise from the nodes’ individual function-
ality but from the collective effects resulting from the
nodes’ interconnections. It amounts to the develop-
ment of distributed internal representations of the in-
put information (Blank et al., 1992).
An ANN has the ability to derive patterns from
complex or changing input data. In this context, it
is used to evaluate similarity of ontologies, such as
r and c, which may be too difficult to be noticed
by either humans or symbol-based computing tech-
niques (Li et al., 2012). Specifically, the ANN is able
to learn all of r's, respectively c's, ontological entities and relations at the same time as distributed representations. It allows for changing the matching approach from a coarse-grained d_rc ∈ {0, ..., u} with u ∈ ℕ to a fine-grained d_rc ∈ [0, u] with u ∈ ℝ (see also Sect. 3).
The LRAAM is a particular implementation of an
ANN. Its most interesting feature is the potential to
Table 1: Classification of matching techniques (adapted from (Euzenat and Shvaiko, 2013)).

Element-level, syntactic:
  Informal resource-based: directories, annotated resources
  String-based: name similarity, description similarity, global namespace
  Language-based: tokenisation, lemmatisation, morphology, elimination, lexicons, thesauri
  Constraint-based: type similarity, key properties
Element-level, semantic:
  Formal resource-based: upper-level ontologies, domain-specific ontologies, linked data
Structure-level, syntactic:
  Taxonomy-based: taxonomy structure
  Graph-based: graph homomorphism, path, children, leaves
  Instance-based: data analysis, statistics
Structure-level, semantic:
  Model-based: SAT solvers, DL reasoners
Table 2: Symbolic vs. subsymbolic paradigm (Blank et al., 1992).

                 Subsymbolic        Symbolic
Representation   distributed        atomic
                 continuous         discrete
                 emergent           static
                 use affects form   arbitrary
Composition      superimposed       concatenated
                 context-sensitive  systematic
Functionality    micro-semantic     macro-semantic
                 holistic           atomic
encode (and decode for that matter) labeled directed
graphs of arbitrary size (de Gerlachey et al., 1994;
Sperduti, 1993). The resulting patterns are sensitive
to the graph they represent. Following (Ellingsen,
1997), these patterns can be exploited for similar-
ity analysis. They are thus used to calculate the distance matrix D = [d_rc]. The general architecture of an LRAAM is shown in Figure 2. It is a supervised 3-layer feedforward network trained by backpropagation. The dashed arrows indicate that this auto-associative architecture (the "auto-association" follows from the equality of the input and output layers) must be used recursively (Pollack, 1990). Certain node values from the hidden and output layers are fed back to the input data until the network has reached a steady state, that is, until the activation thresholds remain stable even when feeding new inputs to the network.
The training of the LRAAM is achieved through backpropagation so it learns an identity function F: x → x′, where x, x′ ∈ ℝⁿ. A node vector is compressed by using the function F_c: x → z. Then, the compressed representation is reconstructed using the function F_r: z → x. The node vector x′ is thus an approximated output equal to x. The network is trained by presenting the input vectors repeatedly, one vector at a time.
[Figure 2: LRAAM's network architecture. A 3-layer auto-associative network mapping input vector x (input layer) through hidden vector z (hidden layer) to output vector x′ (output layer); the input data are the vectors x_1, x_2, ..., x_g.]
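To ground the compression/reconstruction idea, the following self-contained NumPy sketch runs one auto-associative training loop. It omits the recursive pointer feedback that makes the LRAAM proper, and all sizes and rates are illustrative:

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(v, slope):
    """phi(v) = 1 / (1 + exp(-slope * v))"""
    return 1.0 / (1.0 + np.exp(-slope * v))

n, m, eta, slope = 8, 3, 0.2, 0.5      # illustrative sizes and rates
W1 = rng.uniform(-1, 1, (m, n))        # compression    F_c: x -> z
W2 = rng.uniform(-1, 1, (n, m))        # reconstruction F_r: z -> x'

x = rng.uniform(0, 1, n)               # one node vector (kept in (0, 1))
for _ in range(2000):
    z = sigmoid(W1 @ x, slope)         # compressed representation
    x_out = sigmoid(W2 @ z, slope)     # approximate reconstruction x'
    # backpropagate the squared reconstruction error ||x' - x||^2
    d_out = (x_out - x) * slope * x_out * (1 - x_out)
    d_hid = (W2.T @ d_out) * slope * z * (1 - z)
    W2 -= eta * np.outer(d_out, z)
    W1 -= eta * np.outer(d_hid, x)

print(np.abs(x_out - x).max())         # reconstruction error after training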
3 INTEGRATING AN LRAAM
MATCHING FUNCTION IN THE
ONTOLOGY MATCHING
ALGORITHM
For each r ∈ R and c ∈ C, we construct a directed acyclic graph G = (V, E), such that vertices (V) correspond to concepts from the ontology and edges (E) correspond to relationships between these concepts (Eder and Wiggisser, 2007). In other words, G is a conceptual graph. The vector z ∈ ℝ^m serves as comparison pattern and Z = [z_1, z_2, ..., z_ĝ, ..., z_g]^T as a collection of comparison patterns, with g the number of vertices in G. Vertices 1 through ĝ correspond to the nodes from the generic sequence ontology from which both r and c were constructed. We will note Z_ĝ the vector composed of the values [z_1, z_2, ..., z_ĝ]^T.
By construction of the LRAAM, we know that each z in Z_ĝ^r and Z_ĝ^c contains (micro-semantic) information about the complete collections Z^r and Z^c, notwithstanding the number of vertices in r or c. Therefore, d_rc can be calculated as the Euclidean distance between Z_ĝ^r and Z_ĝ^c. The ontological distance
between r and c is thus defined as:

d_{rc} = \sqrt{\sum_{i=1}^{\hat{g}} (z_i^r - z_i^c)^2}
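Read over the m values of each of the ĝ comparison patterns (our reading of the formula), the distance can be computed as in this short sketch with illustrative shapes:

import numpy as np

def ontological_distance(Z_r_hat, Z_c_hat):
    """Euclidean distance between the first g-hat comparison patterns of
    reference ontology r and compound ontology c (m hidden values each)."""
    diff = np.asarray(Z_r_hat) - np.asarray(Z_c_hat)
    return float(np.sqrt(np.sum(diff ** 2)))

# e.g., g-hat = 2 shared sequence-ontology vertices, m = 3 hidden nodes
Z_r = np.array([[0.1, 0.4, 0.9], [0.3, 0.2, 0.8]])
Z_c = np.array([[0.2, 0.4, 0.7], [0.3, 0.1, 0.8]])
print(ontological_distance(Z_r, Z_c))  # ~0.245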
To construct the LRAAM, we proceed as follows (Fig. 3). Each vertex within G serves as a single input vector x = (x_1, x_2, ..., x_n) ∈ ℝⁿ. By extension, X = [x_1, x_2, ..., x_g]^T, the collection of vertex vectors for a specific r, respectively c, comprises the complete input data to the network.
[Figure 3: Experimental LRAAM implementation. Each input vector holds a label part, pointer-condition flags, and q pointer slots carrying real values in the range [−1, 1].]
To obtain z of a vertex, part of the input (output) vector is allocated to represent the vertex label and the existence of pointers p to connected vertices, where a pointer p represents an edge of the graph G. There are q pointer slots reserved, where q = max{degree(v)}, with v ∈ V. The input vector is composed of n = t_h + q · m elements, where t_h is the number of elements used to represent vertex information (label plus pointer existence), and m is the number of elements used to represent pointer values. If a vertex has fewer than q pointers, a nil pointer is used, to which a random value is initially assigned (Ellingsen, 1997). The hidden representation z of a specific vertex is understood as the pointer for that vertex. As part of other input vectors, it will thus be used as a pointer and iteratively fed to the network. Eventually, we obtain the collection Z of fixed-size vectors which represents all concepts/relationships from the ontology. For each reference ontology r, we can thus determine the most similar compound ontology c ∈ C, such that the distance d(Z_ĝ^r, Z_ĝ^c), ∀r ∈ R, c ∈ C, is minimized.
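A possible encoding of one vertex input vector, under the layout we infer from Figure 3 (label part, pointer-condition flags, then q pointer slots), is sketched below; the helper and its layout are assumptions, not the prototype's actual code:

import numpy as np

rng = np.random.default_rng(0)

def vertex_input_vector(label_bits, pointers, q, m):
    """Assumed layout: t_h = len(label_bits) + q head values in {-1, 0, 1},
    followed by q pointer slots of m real values each. Unused slots
    become nil pointers initialized with random values."""
    head = list(label_bits) + [1 if i < len(pointers) else -1 for i in range(q)]
    slots = [np.asarray(pointers[i]) if i < len(pointers)
             else rng.uniform(-1, 1, m)  # nil pointer, random init
             for i in range(q)]
    return np.concatenate([np.asarray(head, dtype=float)] + slots)

# vertex with a 3-value label, one outgoing pointer, q = 2 slots, m = 3
x = vertex_input_vector([1, 0, 1], [[0.2, -0.4, 0.7]], q=2, m=3)
print(x.size)  # t_h + q*m = (3 + 2) + 2*3 = 11 elements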
In a final step, c is selected for effective integration. Again, the assumption is that the selected service sequence ss′ is semantically closer to the sequence of activities aa′ and thus better suited to support it.
4 EXPERIMENTAL
METHODOLOGY
We developed a prototype ontology matching tool in
order to assess the quality of the ontology distance
function. The next section describes the prototype.
Special emphasis is put on the list of parameters to the
LRAAM, as these will be investigated in the experi-
mental protocol. The experimental protocol is pre-
sented next. As this study is exploratory, the exam-
ples used are in their simplest form in order to control
all input parameters to better assess the quality of the
distance function, given known input ontologies.
4.1 Prototype Ontology Matching Tool
In order to test the LRAAM-based distance function, a tool was developed to receive, transform, and process the input, and determine the similarity score, that is, d_rc. The tool extends Neuroph (http://neuroph.sourceforge.net/) to realize the LRAAM topology. The tool receives the following input data:
- A process ontology (such as Fig. 1) and two or more service ontologies in order for the tool to construct r ∈ R and c ∈ C.
- LRAAM-specific input parameters (see also (Ellingsen, 1997)):
  - ε: total error to reach before training stops.
  - η: backpropagation learning rate.
  - σ_z: slope of the hidden layer activation function φ(υ) = 1/(1 + e^(−σ_z υ)), with υ the sum of input to the unit.
  - σ_h: slope of the activation function φ(υ) = 1/(1 + e^(−σ_h υ)) of the label part of the input/output vector x, being x_h.
  - σ_p: slope of the activation function φ(υ) = 1/(1 + e^(−σ_p υ)) of the pointer part of the input/output vector x, being x_p.
  - m: freely defined number of hidden layer nodes, with m < |x|. Note: it determines the size of the input vector x (see again Fig. 3).
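Collected in one place, these inputs might look as follows; the container is hypothetical, and the defaults are the values fixed later in Section 5:

from dataclasses import dataclass

@dataclass
class LraamParams:
    """Hypothetical container; defaults are the values fixed in Section 5."""
    epsilon: float = 0.15  # total error to reach before training stops
    eta: float = 0.20      # backpropagation learning rate
    sigma_z: float = 0.50  # hidden layer slope
    sigma_h: float = 6.0   # IO layer binary (label) slope
    sigma_p: float = 0.50  # IO layer real (pointer) slope
    m: int = 22            # number of hidden layer nodes, m < |x|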
Based on this input data, the tool initializes the fol-
lowing variables, which are described and discussed
in the subsequent sections:
- V_r = {v_r}, the set of vertices of a reference ontology, where v_r is a vertex in the graph representation of reference ontology r ∈ R.
- V_c = {v_c}, the set of vertices of a compound ontology, where v_c is a vertex of the graph representation of compound ontology c ∈ C.
- E_{v_r} = {v′_r}, the set of edges in a reference ontology, such that an edge starts from v_r and ends at v′_r.
- E_{v_c} = {v′_c}, the set of edges in a compound ontology, such that an edge starts from v_c and ends at v′_c.
- q ← maxOutDegree(V_r ∪ V_c), the maximum number of connection slots to include in the input and output vectors of the LRAAM.
- l_max ← maxLabelLength(V_r ∪ V_c), the maximum number of {−1, 0, 1} values required to encode the labels of the vertices.
- t_h = q + l_max, the number of {−1, 0, 1} values required in the input and output vectors to encode the different vertices.
- t_p ← m · q, the number of real values required in the input and output vectors to encode the connections between vertices (pointers).
- z^i_{v_r}, the reduced representation of vertex v_r ∈ V_r at the beginning of a training iteration. It is a vector of m values.
- z^o_{v_r}, the reduced representation of vertex v_r ∈ V_r at the end of a training iteration. It is a vector of m values.
- x^i_{v_r} = (x^i_{h,v_r}, x^i_{p,v_r}), the input vector of vertex v_r ∈ V_r at the beginning of a training iteration, where x^i_{h,v_r} is a {−1, 0, 1} value vector encoding the label and pointer conditions of vertex v_r and x^i_{p,v_r} is a real vector encoding the pointers outgoing from vertex v_r.
- x^o_{v_r} = (x^o_{h,v_r}, x^o_{p,v_r}), the output vector of vertex v_r ∈ V_r at the end of a training iteration, where x^o_{h,v_r} is a {−1, 0, 1} value vector encoding the label and pointer conditions of vertex v_r and x^o_{p,v_r} is a real vector encoding the pointers outgoing from vertex v_r.
- z^i_{v_c}, the reduced representation of vertex v_c ∈ V_c at the beginning of a training iteration. It is a vector of m values.
- z^o_{v_c}, the reduced representation of vertex v_c ∈ V_c at the end of a training iteration. It is a vector of m values.
- x^i_{v_c} = (x^i_{h,v_c}, x^i_{p,v_c}), the input vector of vertex v_c ∈ V_c at the beginning of a training iteration, where x^i_{h,v_c} is a {−1, 0, 1} value vector encoding the label and pointer conditions of vertex v_c and x^i_{p,v_c} is a real vector encoding the pointers outgoing from vertex v_c.
- x^o_{v_c} = (x^o_{h,v_c}, x^o_{p,v_c}), the output vector of vertex v_c ∈ V_c at the end of a training iteration, where x^o_{h,v_c} is a {−1, 0, 1} value vector encoding the label and pointer conditions of vertex v_c and x^o_{p,v_c} is a real vector encoding the pointers outgoing from vertex v_c.
4.2 Experimental Protocol
The experiment is divided into a preliminary phase and three main phases. In the preliminary phase, a simple
process ontology composed of the last two activities
from Figure 1 is constructed, namely activities Cred-
itChecking and CreditApproving. In addition, two
simple service ontologies are created. Each service
is described by exactly one label. The labels are also
CreditChecking and CreditApproving. We used Protégé (http://protege.stanford.edu/) to construct the different ontologies.
The ontologies are loaded into the tool, which then generates the new composed ontologies r: CreditChecking × CreditApproving, c_1: CreditApproving × CreditChecking, and c_2: CreditChecking × CreditApproving. As we explore the potential of the LRAAM as a classification tool at this time, the experiment is not extended to more than two related activities.
In a further step, the ontologies r and c_i are used to implement the LRAAM to calculate d_{rc_i}, with i ∈ {1, 2}.
For the analysis, the following variables are defined:
- w: the expected best (winning) service, with w ∈ {1, 2}. For the controlled experiment, we expect w = 2, as r ≡ c_2.
- d_i: distance of service combination c_i, with i ∈ {1, 2}, and r.
- d_w: distance of expected winning service combination c_w and r.
- r_i: ranking of service combination i out of all combinations.
- r_w: ranking of expected winning service combination out of all combinations.
- b: 1 if correctly matched, i.e., w = 2; 0 otherwise.
In Phase 1, the relationship between the LRAAM inputs ε, η, σ_z, σ_h, σ_p, m and the output d_i is analyzed. For each parameter, an interval and a tick size are fixed to explore reasonable, yet limited, parameter combinations within the LRAAM's large state space. The tick size is the increment of the parameter value from the lowest to the largest value in the interval. For example, an interval from 0 to 10 with a tick size of 5 would test the values in the set {0, 5, 10}.
All parameter values are thus recombined with each other. To control processing time, one LRAAM instance per combination is executed. A multiple linear regression is conducted to identify the parameters having a significant correlation with d_i. These parameters will be analyzed further in subsequent phases. The other parameters are then fixed heuristically at a value which maximizes b for d_w (i.e., the number of good classifications). At this point, classification performance is not evaluated.
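A parameter sweep of this kind can be sketched as follows; the intervals shown are illustrative, not the paper's exact Phase 1 settings:

import numpy as np
from itertools import product

def grid(lo, hi, tick):
    """Values from lo to hi inclusive, stepped by the tick size,
    e.g., grid(0, 10, 5) -> [0, 5, 10] as in the example above."""
    return np.arange(lo, hi + tick / 2, tick)

# Illustrative intervals, not the paper's exact Phase 1 settings:
etas = grid(0.05, 0.34, 0.01)
ms = range(10, 111, 2)

combinations = list(product(etas, ms))  # one LRAAM instance per combination
print(len(combinations))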
In Phase 2, we further explore the impact of the parameters with a significant correlation with d_i. Specifically, the interval is increased and the tick size is decreased. More of the LRAAM's state space is thus explored, this time to find optimal parameter values. All resulting parameter values are recombined with each other. Similar to Phase 1, one LRAAM instance per combination is executed. Each variable parameter is heuristically analysed for promising values which maximize b for d_w. Classification performance is still not evaluated.
In Phase 3, 2,000 LRAAM runs are executed for the promising values found in Phase 2, which are analysed for classification performance. Specifically, each list is transformed by assigning a ranking r_i to d_i. The smaller of the two d_i per run is given the ranking r_i = 1. The other ranking (r_i = 2) is transformed into a weighted ranking, normalized over the maximum distance spread over all runs, which gives a finer account of the classification performance.
The general success criterion is defined as µ_{r_w} < µ_{r_i}, ∀i ≠ w, which in the experiment is µ_{r_2} < µ_{r_1}. In other words, the mean ranking of the expected winning service combination c_w must be smaller than the respective means of all other service combinations c_i.
If a list respects the criterion, it is selected and tested for statistical significance. To this end, an independent samples t-test is used, as the sample groups i are independent of each other. Furthermore, as no experiment is conducted on the same subjects before and after some event, a paired t-test is not applicable. If the null hypothesis (α = 0.05) can be rejected, d_w is correctly classified. In that case, the best combination of services (c) to support related business activities (r) could be identified.
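The significance test itself can be sketched with SciPy on synthetic rankings; only the group means mirror the values reported in Section 5 for the combination (0.20, 22):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic rankings; only the group means mirror the reported values.
ranks_c1 = rng.normal(2.02, 0.9, 2000)  # losing combination
ranks_c2 = rng.normal(1.89, 0.9, 2000)  # expected winner c_w

t, p_two_tailed = stats.ttest_ind(ranks_c1, ranks_c2)
# One-sided question (mu_r_c2 < mu_r_c1?): halve the two-tailed p-value
# when t points in the hypothesized direction.
p_one_tailed = p_two_tailed / 2 if t > 0 else 1 - p_two_tailed / 2
print(t, p_one_tailed)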
5 EXPERIMENTAL RESULTS
In Phase 1, a list of 12,636 d_i values is generated, corresponding to 6,318 parameter value combinations. The model summary of the conducted multiple regression is shown in Table 3.
Table 3: Multiple regression model summary.

Predictors           R      R²     Adj. R²  Std. Err.
η                    0.857  0.735  0.735    0.58527
η & m                0.915  0.838  0.838    0.45722
η & m & σ_h          0.949  0.900  0.900    0.35947
η & m & σ_h & σ_p    0.949  0.900  0.900    0.35935
The predictive capacity of η (learning rate) and m (number of hidden layer nodes) is high, as these parameters explain a large part of d's variability (R² = 83.8%). Both parameters are promoted to Phase 2. As the other parameters (IO layer binary slope σ_h, IO layer real slope σ_p) do not have a significant impact on d, they are fixed based on suggestions made in (Ellingsen, 1997). We have ε = 0.15, σ_h = 6 (IO layer binary slope), σ_p = 0.5 (IO layer real slope), and σ_z = 0.5 (hidden layer slope).
For σ_z, σ_h, and σ_p, the values selected also yield the highest count of correctly matched c in the list (see the marked row in Table 4).
Table 4: Parameter combinations with the highest counts of correctly mapped c (best combination marked *).

Epsilon  IO layer      IO layer    Hidden layer  Correct
         binary slope  real slope  slope         matches
0.15     6             0.50        0.50          29 *
0.20     6             1.50        0.40          26
0.20     7             1.00        0.70          26
0.15     5             1.50        0.70          25
0.25     5             1.00        0.30          25
0.25     7             0.50        0.70          25
In Phase 2, the interval of m and η is increased to cover more of the LRAAM's state space. A list of 6,060 d_i values is generated, corresponding to 3,030 parameter value combinations. From this list, zones of promising matching performance are identified (Fig. 4 and 5). The values η ∈ {0.09, 0.20} and m ∈ {22, 42, 105} are fixed for further processing. Consequently, six combination trials of 2,000 runs are executed during Phase 3. The results are shown in Table 5. The classification passed the success criterion, namely µ_{r_w} < µ_{r_i}, in four of the six parameter combinations.
The significance of the results is verified by an in-
dependent samples t-test (see Tab. 6). As we can see,
we only have a significant classification for the pa-
rameter combination (0.20,22), where a very low p-
value of 0.002 is obtained.

[Figure 4: Phase 2: Sum of correct matches over η's tested interval (learning rate 0.05 to 0.34).]

[Figure 5: Phase 2: Correct matches over m's tested interval (10 to 110 hidden layer nodes).]

It suggests that the mean ranking of compound ontology c_2 (1.89) is significantly different from the mean ranking of c_1 (2.02). With α = 0.05, in this case, the null hypothesis is thus rejected. In other words, with 2,000 runs, the combination correctly classifies the input. However, SPSS only allows for 2-tailed t-tests, whereas we are interested only in the left tail of the t-distribution. Since the t-test supposes a symmetrical distribution, in this case, the "significant" tail will reflect significance at p/2 = 0.025 or below (see http://www-01.ibm.com/support/docview.wss?uid=swg21476176, visited May 19, 2015). As 0.002 < 0.025, the validity of rejection holds.
6 DISCUSSION AND
CONCLUSION
In this paper we introduced a service selection and
composition approach towards automatic, yet flexible
integration of applications wrapped into services. We
discussed its necessity for organizations to cope with
an increasing number of modularized and decentralized services from which to choose, especially in the context of the proliferation of cloud-based services.

Table 5: Classification results for fixed parameter value combinations.

Parameters     Matches
η      m       c_1     c_2 (c_w)
0.09   22      989     1,011
0.20   22      949     1,051
0.09   42      996     1,004
0.20   42      1,005   995
0.09   105     974     1,026
0.20   105     1,042   958

Table 6: t-test for significance of µ_{r_w} < µ_{r_i}.

Parameters   Levene's test     t-test for equality of µ
η      m     F       Sig.      t       df      Sig. (2-tailed)
0.09   22    0.792   0.374     1.122   3,998   0.262
0.20   22    2.232   0.135     3.025   3,998   0.002 *
0.09   42    0.368   0.544     0.587   3,998   0.557
0.09   105   0.000   0.991     1.148   3,998   0.251
* H_0 rejected.
Like other authors (e.g., (Born et al., 2007; Fensel et al., 2011)), we believe that ontologies lend themselves naturally as building blocks. The idea is that
business experts may use ontologies to describe cor-
porate business activities, on the one hand, and ser-
vice providers use ontologies to describe their service
offerings, on the other hand, so machines can com-
pare them automatically. It is a reasonable conjecture
that an ontology describing a credit checking activ-
ity uses “similar” entities and relations as the needed
credit checking service. In our example, both possibly
contain the entities Person and Customary credit reg-
ister, including a relation hasEntry between them. In
turn, an ontology describing a weather forecast ser-
vice most probably does not contain an entity Cus-
tomary credit register.
With the implemented LRAAM oracle, we con-
ducted experiments to verify similarity of services.
The results obtained are limited, as only one pair of
parameters in the space investigated yielded signif-
icant results. Consequently, no general conclusions
may be drawn from these experiments. This, however, does not imply that the LRAAM should be dismissed altogether for service selection. Indeed,
we feel that the implemented tool needs further devel-
opment and pruning to increase its performance. The
following points caught our attention:
- As part of the sigmoid activation function, a higher impact of σ on d was anticipated. This could, however, not be shown and warrants further investigation.
- The significant parameter combination (0.20, 22) leads to a ratio m/n = 22/151 ≈ 0.15. Based on (Ellingsen, 1997), good classification performance was expected at a higher ratio of 0.25.
- Ontologies c and r contain the same labels by de-
fault, namely those representing the sequence on-
tology, which we use to calculate the Euclidean
distance. In the experiment, only two labels are
added to it to construct c, respectively r. These
labels may not have sufficient impact to signifi-
cantly alter the LRAAM’s rather volatile steady
state. In other words, distinctiveness, which is
sought, gets lost (see next point).
- We do not think that the current LRAAM implementation exhibits a sufficiently trustworthy steady state. Instead of initializing the input vector x with values from the interval [−1, 1], similar to (Al-Said and Abdallah, 2009), it may be beneficial to use a smaller interval, e.g., [−0.5, 0.5], to decrease potential variability, such as described in (Sperduti, 1993). Nevertheless, in light of the similarity already given between the two particular input strings, the results seem promising.
Narrowing down the LRAAM’s state space is
time-consuming. However, further intervals or
parameters need to be explored, also combined
with the above mentioned adaptations. One fur-
ther parameter may be the number of consecutive
times the LRAAM’s establishes a steady state (ex-
pressed in Z) based on the error tolerance , before
it is chosen to calculate d. It would increase con-
fidence in the validity of the steady state.
The above list mainly points towards improve-
ments of the LRAAM’s existing architecture, i.e., tun-
ing the activation function, varying the ratio of in-
put and hidden layer, etc. However, a further route
to enhancing classification performance may reside
in changing the LRAAM’s architecture altogether.
Specifically, the idea would be to couple the LRAAM
with the Deep Learning (DL) approach. The latter
represents an ANN with up to 20 hidden layers (in-
stead of only one). In total, such an ANN may con-
tain tens or hundreds of thousands of units. Follow-
ing (LeCun et al., 2015), a DL system can implement
extremely intricate functions of its inputs that are simultaneously sensitive to the smallest variations. It ex-
ploits the property that many input signals are com-
positional hierarchies, in which higher-level features
are obtained by composing lower-level ones. The au-
thors continue to state that similar hierarchies exist in
speech and text from sounds to phones, phonemes,
syllables, words and sentences. Clearly, this may
prove beneficial to our approach. Finally, for DL,
poor local minima are rarely a problem. Notwith-
standing the initial conditions, i.e., the initialized ran-
dom values, the system nearly always reaches high-quality solutions.
In Figure 6, a possible realization is sketched. In-
stead of initializing the label part of LRAAM’s in-
put vector with binary information, one could use the
Deep Learning approach to preprocess those concept
labels.
[Figure 6: Deep Learning (DL) and LRAAM. Concept labels are preprocessed by DL before participating in the input vector x: a stack of label feature vectors (pixel intensity, curves, letters, syllables, label) feeds into the LRAAM's input vector x, hidden vector z, and output vector x′.]
Summarizing, in this paper, we explored ontology matching as a means towards dynamic service integration. Thereby, we limited our focus to ontological descriptions of what a service actually does. We followed the premise that the more richly a service is described, the better it can be evaluated and selected, provided a holistic (micro-semantic) matching method is available.
REFERENCES
Al-Said, G. and Abdallah, M. (2009). An Arabic text-
to-speech system based on artificial neural networks.
Journal of Computer Science, 5(3):207–213.
Antoniou, G. and van Harmelen, F. (2008). A Semantic Web
Primer. The MIT Press, Cambridge Massachusetts, 2
edition.
Antoniou, G. and van Harmelen, F. (2009). Ontology web
language: OWL. In Staab, S. and Studer, R., editors,
Handbook on Ontologies, pages 91–110. Springer.
Atencia, M., Euzenat, J., Pirrò, G., and Rousset, M.-C.
(2011). Alignment-based trust for resource finding in
semantic p2p networks. In The Semantic Web–ISWC
2011, pages 51–66. Springer.
Blank, D., Meeden, L. A., and Marshall, J. B. (1992). Ex-
ploring the symbolic/subsymbolic continuum: A case
study of RAAM. In The Symbolic and Connection-
ist Paradigms: Closing the Gap, pages 113–148. Erl-
baum.
Born, M., Drumm, C., Markovic, I., and Weber, I. (2007).
SUPER - raising business process management back
to the business level. ERCIM News, 2007(70).
Bughin, J. and Chui, M. (2010). The rise of the networked
enterprise: Web 2.0 finds its payday. McKinsey quar-
terly, 4:3–8.
Chan, S. W. K. (2003). Dynamic context generation
for natural language understanding: A multifaceted
knowledge approach. IEEE Transactions on systems,
man, and Cybernetics - Part A: Systems and Humans,
33(1):23–41.
Cisco (2014). Cisco global cloud index: Forecast and
methodology: 2013 - 2018. White paper, Cisco Sys-
tems Inc.
de Gerlache, M., Sperduti, A., and Starita, A. (1994). Using labeling RAAM to encode medical conceptual graphs. In NNESMED'94 Proceedings.
Diallo, G. (2014). An effective method of large scale on-
tology matching. Journal of Biomedical Semantics,
5:44.
Eder, J. and Wiggisser, K. (2007). Detecting changes in
ontologies via DAG comparison. In Lecture Notes in
Computer Science 4495, pages 21–35.
Ellingsen, B. K. (1997). Distributed representations of
object-oriented specifications for analogical mapping.
Technical report, Citeseer.
Erl, T. (2004). Service-oriented architecture: a field guide
to integrating XML and web services. Prentice Hall
PTR.
Erl, T., Chelliah, P., Gee, C., Kress, J., Maier, B., Normann,
H., Shuster, L., Trops, B., Utschig, C., Wik, P., and
Winterberg, T. (2014). Next Generation SOA: A Con-
cise Introduction to Service Technology & Service-
Orientation. The Prentice Hall Service Technology
Series from Thomas Erl. Pearson Education.
Euzenat, J. and Shvaiko, P. (2007). Ontology Matching.
Springer.
Euzenat, J. and Shvaiko, P. (2013). Ontology Matching.
Springer, 2 edition.
Fensel, D. and Bussler, C. (2002). The web service model-
ing framework wsmf. Electronic Commerce Research
and Applications, 1(2):113–137.
Fensel, D., Facca, F. M., Simperl, E., and Toma, I. (2011).
Semantic web services. Springer Science & Business
Media.
Frank, R. and Cartwright, E. (2013). Microeconomics and
Behaviour. McGraw Hill.
Hinton, G., McClelland, J., and Rumelhart, D. (1986).
Distributed representations, volume 1, pages 77–109.
MIT Press.
Hoang, H. H., Jung, J. J., and Tran, C. P. (2014). Ontology-
based approaches for cross-enterprise collaboration:
A literature review on semantic business process man-
agement. Enterprise Information Systems, 8(6):648–
664.
Hoang, H. H. and Le, M. T. (2009). Bizkb: A concep-
tual framework for dynamic cross-enterprise collabo-
ration. In Nguyen, N. T., Kowalczyk, R., and Chen,
S.-M., editors, ICCCI, volume 5796 of Lecture Notes
in Computer Science, pages 401–412. Springer.
Hussain, M. A. and Mastan, M. (2014). A study on se-
mantic web services and its significant trends. IJCER,
3(5):234–237.
Izza, S. (2009). Integration of industrial information
systems: from syntactic to semantic integration ap-
proaches. Enterprise Information Systems, 3(1):1–57.
Kale, V. (2014). Guide to Cloud Computing for Business
and Technology Managers: From Distributed Com-
puting to Cloudware Applications. Taylor & Francis.
Kotis, K., Vouros, G., and Stergiou, K. (2006). Towards
automatic merging of domain ontologies: The hcone-
merge approach. Journal of Web Semantics (JWS),
4:60–79.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learn-
ing. Nature, 521(7553):436–444.
Lehaney, B., Lovett, P., and Shah, M. (2011). Business in-
formation systems and technology: a primer. Rout-
ledge.
Li, W., Raskin, R., and Goodchild, M. F. (2012). Se-
mantic similarity measurement based on knowledge
mining: An artificial neural net approach. Interna-
tional Journal of Geographical Information Science,
26(8):1415–1435.
Ludolph, H., Kropf, P., and Babin, G. (2011). SoftwIre
integration - an onto-neural perspective. In Babin,
G., Stanoevska-Slabeva, K., and Kropf, P., editors, E-
Technologies: Transformation in a Connected World -
5th International Conference (MCETECH 2011). Les
Diablerets, Switzerland, January 23-26, 2011, Re-
vised Selected Papers, number 78 in Lecture Notes
in Business Information Processing, pages 116–130.
Springer.
Otero-Cerdeira, L., Rodríguez-Martínez, F. J., and Gómez-
Rodríguez, A. (2015). Ontology matching: A lit-
erature review. Expert Systems with Applications,
42(2):949–971.
Pollack, J. B. (1990). Recursive distributed representations.
Artificial Intelligence, 46:77–105.
Rahm, E. (2011). Towards large-scale schema and ontology
matching. In Schema matching and mapping, pages
3–27. Springer.
Rahm, E. and Bernstein, P. A. (2001). A survey of ap-
proaches to automatic schema matching. The VLDB
Journal, 10(4).
Shvaiko, P. and Euzenat, J. (2013). Ontology match-
ing: state of the art and future challenges. IEEE
Transactions on Knowledge and Data Engineering,
25(1):158–176.
Sperduti, A. (1993). On some stability properties of the
lraam model. Technical report, International Com-
puter Science Institute.
Zdravković, M., Trajanović, M., and Panetto, H. (2014).
Enabling interoperability as a property of ubiquitous
systems: towards the theory of interoperability-of-
everything. In 4th International Conference on In-
formation Society and Technology, ICIST 2014, vol-
ume 1, pages 240–247, Kopaonik, Serbia.