S-Discovery: A Behavioral and Quality-based Service Discovery on the Cloud

Ahmed Gater, Fernando Lemos, Daniela Grigori and Mokrane Bouzeghoub

Université Paris-Dauphine, Pl. Mal de Lattre de Tassigny, 75775 Paris, France

Université de Versailles Saint-Quentin en Yvelines, 45 Avenue des Etats-Unis, 78035 Versailles Cedex, France

Keywords:

Business Process as a Service, Service Oriented Architecture, Service Discovery, Business Process Matching.

Abstract:

Cloud computing recently emerged as a paradigm providing computing power and storage as a utility that is consumed on demand (following in the footsteps of other utilities, like electricity). Recently, a new service delivery mode emerged: Business Process as a Service (BPaaS). As a consequence, process model repositories will be developed, allowing this new type of service to be published by process providers and discovered by enterprises wanting to outsource some of their processes, or parts of them. In this paper we present the S-Discovery framework, which finds in such repositories the processes that could satisfy a user's functional and non-functional requirements.

1 INTRODUCTION

Cloud computing emerged recently as a paradigm providing computing power and storage as a utility that is consumed on demand (following in the footsteps of other utilities, like electricity). It typically offers three delivery modes: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).

The development of cloud computing offers new opportunities for enterprises to outsource their processes, and thus a new service delivery mode emerged: Business Process as a Service (BPaaS) (Anstett et al., 2009; Pathirage et al., 2011). As a consequence, process model repositories will be developed, allowing this new type of service to be published by process providers and discovered by enterprises wanting to outsource some of their processes, or parts of them. Similar to service registries, process repositories will contain process model descriptions, but also business specifications (non-functional descriptions like cost, quality of service, etc.). We argue that process model discovery techniques taking into account functional and non-functional requirements will be required.

Besides business process outsourcing, another application scenario for BPaaS can be found in the area of scientific workflows (Pathirage et al., 2011). Many large-scale collaborative science projects use workflows to automate computation steps. However, defining and running such workflow systems is often a challenge. Workflow engines in the cloud would facilitate scientists' work and reduce overhead on their projects. In the context of web-based scientific workflow repositories, scientists expressed the need for workflow similarity search capabilities (Goderis et al., 2006). Moreover, in the new context of BPaaS repositories, they may be interested in finding, among the retrieved workflows, the one satisfying some business criteria (e.g., cost) and some global or local quality requirements, e.g., the workflow that takes the shortest time or that guarantees a given correctness for the results of a specific activity.

These scenarios show the need for a discovery approach that takes into account both the process model and non-functional requirements. The contribution of this paper is a framework able to efficiently query a repository to find the processes that best fulfill the user's structural and non-functional requirements. We suppose that users express their query as a process model accompanied by non-functional requirements. Thus, the technical challenges in the discovery approach are at two levels. At the description level, (i) provide a formal model that allows one to specify, at different granularity levels, non-functional attributes as annotations of the functional specification; and (ii) allow the user to enrich the query with non-functional requirements. At the discovery level, (i) filter the repository to efficiently find matching candidates; (ii) combine structure-based matching algorithms with the matching of non-functional factors; and (iii) define a similarity measure that aggregates both functional and non-functional similarities.

Gater A., Lemos F., Grigori D. and Bouzeghoub M.

S-Discovery: A Behavioral and Quality-based Service Discovery on the Cloud.

DOI: 10.5220/0004378201040109

In Proceedings of the 3rd International Conference on Cloud Computing and Services Science (CLOSER-2013), pages 104-109

ISBN: 978-989-8565-52-5

© 2013 SCITEPRESS (Science and Technology Publications, Lda.)

The remainder of the paper is structured as follows. Section 2 presents our model to annotate service process models. Section 3 gives an overview of our approach, which is composed of (i) a filtering task, described in Section 4; (ii) a structural similarity evaluation task (Section 5); and (iii) preference satisfaction evaluation and ranking tasks (Section 6). Related work is discussed in Section 7. Finally, the conclusions are presented in Section 8.

2 BACKGROUND AND NOTATIONS

Our S-Discovery framework is based on matching techniques that operate on process models. A process model consists of a set of related activities that are organized using control flow structures to construct complex behavior models. To abstract as much as possible from any existing notation, we represent a process model as a directed graph p = (A, C, E, Q), called a process graph (p-graph for short), where A is a set of activity nodes, C is a set of connector nodes, E is a set of directed edges and Q is a set of quality annotations. An activity node represents an atomic task, which is described by its name (N), a set of inputs (In), a set of outputs (Out) and a set of quality annotations (Q). Activity inputs and outputs are annotated using a domain ontology. Connector nodes describe control flows between activities and represent Split and Join operators of types XOR or AND. Split connectors have multiple outgoing edges, while Join connectors have multiple incoming edges. Quality annotations are of the form (m, r), where r is a value for a QoS attribute m. They can characterize the process as a whole or specific activities.
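As an illustration of the p-graph definition, the following Python sketch shows one possible in-memory representation. The class and field names, the example edges and the quality values are assumptions made for this example; only the components (A, C, E, Q) and the activity attributes (name, inputs, outputs, quality annotations) come from the definition above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Activity:
    """Atomic task: name, ontology-annotated inputs/outputs, quality annotations."""
    name: str
    inputs: frozenset = frozenset()
    outputs: frozenset = frozenset()
    quality: tuple = ()  # (attribute m, value r) pairs


@dataclass(frozen=True)
class Connector:
    """Control-flow node: XOR or AND, acting as a split or a join."""
    ident: str
    kind: str  # "XOR" or "AND"
    role: str  # "split" or "join"


@dataclass
class PGraph:
    """Process graph p = (A, C, E, Q)."""
    activities: set = field(default_factory=set)
    connectors: set = field(default_factory=set)
    edges: set = field(default_factory=set)      # (source node, target node)
    quality: dict = field(default_factory=dict)  # process-level annotations

    def successors(self, node):
        return {target for (source, target) in self.edges if source == node}


# Hypothetical fragment using activity labels that appear in Figure 1(a);
# the edges and the QoS values are illustrative only.
preflight = Activity("preflight", frozenset({"file", "fileExt"}), frozenset({"status"}))
create_pdf = Activity("createPDF", frozenset({"file", "fileExt"}),
                      frozenset({"pdfFile"}), quality=(("responseTime", 10),))
return_error = Activity("returnError", outputs=frozenset({"errorMsg"}))
split = Connector("c1", "XOR", "split")

target = PGraph(
    activities={preflight, create_pdf, return_error},
    connectors={split},
    edges={(preflight, split), (split, create_pdf), (split, return_error)},
    quality={"cost": 15},
)
```

Frozen dataclasses make activities and connectors hashable, so they can be stored in sets and used directly as edge endpoints.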

A user query is a p-graph q = (A,C,E,P), where

A, C, E are as deﬁned before and P is a set of QoS

preferences, which are speciﬁed as expressions us-

ing the following constructors: around, between, max,

min, like and dislike.

A preferred order between preferences can be defined using the complex constructors pareto (⊗) and prioritized (&). The semantics of the terms of this vocabulary were taken from the PreferenceSQL approach (Kießling); however, the user may personalize these semantics by means of a membership function µ. The user can label a preference as hard or soft: a target p-graph must satisfy all hard preferences, while the satisfaction of soft preferences is optional.
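A minimal sketch of how such preferences could be represented is given below. The constructor names come from the text; the class names, field names and example attributes (`responseTime`, `cost`, `provider`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

ATOMIC_CONSTRUCTORS = {"around", "between", "max", "min", "like", "dislike"}


@dataclass(frozen=True)
class AtomicPreference:
    constructor: str          # one of ATOMIC_CONSTRUCTORS
    attribute: str            # QoS attribute the preference refers to
    args: Tuple = ()          # constructor arguments, e.g. interval bounds
    hard: bool = False        # hard preferences must be satisfied
    membership: Optional[Callable[[float], float]] = None  # optional user-defined mu

    def __post_init__(self):
        if self.constructor not in ATOMIC_CONSTRUCTORS:
            raise ValueError(f"unknown constructor: {self.constructor}")


@dataclass(frozen=True)
class Prioritized:
    """first & second: `first` is more important than `second`."""
    first: object
    second: object


@dataclass(frozen=True)
class Pareto:
    """left (x) right: both preferences are equally important."""
    left: object
    right: object


def hard_preferences(preferences):
    return [p for p in preferences if p.hard]


# Illustrative query preferences (values are assumptions).
p1 = AtomicPreference("between", "responseTime", (10, 60), hard=True)
p2 = AtomicPreference("min", "cost")
p3 = AtomicPreference("like", "provider", ("trustedProvider",))
order = Prioritized(p2, p3)
```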

Figures 1(a) and 1(b) show, respectively, a process

model and a sample user query annotated with hard

and soft preferences.

3 OVERVIEW OF THE S-DISCOVERY FRAMEWORK

Given an extensive repository of published p-graphs,

the goal of our framework is to retrieve a ranked list of

p-graphs that best fulﬁll a p-graph query. Our frame-

work is a multi-stepped approach, as illustrated by

Figure 2.

Given a query p-graph q, the Filter module selects the p-graphs that can most likely answer the query (p-graphs $T_1, \ldots, T_n$); it avoids scanning the whole repository to compare each target p-graph against the query. This module retrieves all the p-graphs sharing at least one activity with the query p-graph, relying on an index built in an offline step.

This set is subsequently passed to the Structural Similarity Evaluator, which measures the structural similarity $\lambda_{struc-i}$ between each selected p-graph (target) and the query. Furthermore, a set of activity mappings between the query and target p-graphs is established (mappings $M_i$, $1 \le i \le n$).

Next, the Preference Satisfaction Evaluator computes the degree of satisfaction $\lambda_{pref-i}$ of the QoS preferences on the basis of the mappings computed in the previous stage. Finally, the retrieved p-graphs are ranked according to their structural similarity and preference satisfaction degrees using a set of aggregation metrics. The following sections present our contributions to each stage of the S-Discovery framework.

4 REPOSITORY FILTERING

As mentioned previously, a target p-graph is a poten-

tial match of a query p-graph when they have simi-

lar activities. To avoid comparing each query activity

against all the activities stored in the repository, we

deﬁned an index structure over the activities of the

repository p-graphs.

This index is built by assuming that activities having similar inputs/outputs are similar (Gater et al., 2011a). The structure we defined indexes the activities stored in the repository by attaching to each concept C of the ontology two sets $C_{In}$ and $C_{Out}$ recording the identifiers of the activities where it appears,

S-Discovery:ABehavioralandQuality-basedServiceDiscoveryontheCloud

105

[Figure 1, panel (a): target p-graph with start and end nodes, an XOR split/join guarded by [status=ok] and [status=ko], and the activities preflight (in: file, fileExt; out: status), createPDF (in: file, fileExt; out: pdfFile), createLink (out: webLink) and returnError (out: errorMsg), annotated with QoS values. Panel (b): query p-graph with start and end nodes, an AND split/join, and the activities convertionCheck (in: file, fileExt; out: status), PDFConvertion (in: file, fileExt; out: pdfFile) and linkGeneration (out: link), annotated with hard and soft preferences.]

Figure 1: Mapping between (a) a sample target p-graph T and (b) a sample query p-graph Q.

[Figure 2: pipeline from the query p-graph through the Filter (which uses the index over the p-graph repository), the Structural Similarity Evaluator, the Preference Satisfaction Evaluator and the Ranker, which returns the query's matches.]

Figure 2: S-Discovery framework architecture.

respectively, as an input or an output. Furthermore, since we are interested not only in exact matches, the technique should retrieve not only the activities that exactly match the query activity but also those having similar inputs/outputs, i.e., the activities having inputs/outputs that are semantically related to those of the query activity. To avoid performing this computation at query time, the set $C_{In}$ (resp. $C_{Out}$) of a concept C also records the activities having as input (resp. output) a concept that is semantically related to C (its descendants, ascendants, ...). Thus, given a query activity A, the set of its potential activity matches is composed of all the activities belonging to the sets attached to the concepts annotating the inputs and outputs of A. Straightforwardly, the set of process-match candidates of a query is the set of target p-graphs with which it shares at least one activity.
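The index lookup described above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the class name, the method names and the `related` mapping (assumed to be precomputed from the ontology) are hypothetical.

```python
from collections import defaultdict


class ConceptIndex:
    """Per-concept In/Out sets of activity identifiers, built offline."""

    def __init__(self, related):
        # related: concept -> set of semantically related concepts
        # (descendants, ascendants, ...), assumed precomputed from the ontology.
        self.related = related
        self.c_in = defaultdict(set)
        self.c_out = defaultdict(set)
        self.owner = {}  # activity id -> id of the p-graph that contains it

    def _expand(self, concept):
        return {concept} | self.related.get(concept, set())

    def add_activity(self, graph_id, activity_id, inputs, outputs):
        self.owner[activity_id] = graph_id
        for concept in inputs:
            for c in self._expand(concept):
                self.c_in[c].add(activity_id)
        for concept in outputs:
            for c in self._expand(concept):
                self.c_out[c].add(activity_id)

    def candidate_activities(self, inputs, outputs):
        """Union of the sets attached to the query activity's inputs and outputs."""
        found = set()
        for concept in inputs:
            found |= self.c_in.get(concept, set())
        for concept in outputs:
            found |= self.c_out.get(concept, set())
        return found

    def candidate_graphs(self, query_activities):
        """Target p-graphs sharing at least one candidate activity with the query."""
        graphs = set()
        for inputs, outputs in query_activities:
            for activity_id in self.candidate_activities(inputs, outputs):
                graphs.add(self.owner[activity_id])
        return graphs


# Illustrative data: 'document' and 'pdfFile' are assumed to be related concepts.
related = {"pdfFile": {"document"}, "document": {"pdfFile"}}
index = ConceptIndex(related)
index.add_activity("T1", "T1.createPDF", {"file", "fileExt"}, {"pdfFile"})
index.add_activity("T2", "T2.sendMail", {"address"}, {"receipt"})
query = [({"file"}, {"document"})]
```

Because related concepts are expanded when an activity is indexed, a query needs only one set lookup per input or output concept.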

5 STRUCTURAL SIMILARITY EVALUATION

The problem of process matching is reduced to a graph matching problem, and an error-correcting subgraph isomorphism (ECSI for short) detection algorithm (Messmer, 1995) was adapted for this purpose. The principle behind the ECSI algorithm is to apply edit operations (node/edge insertions, deletions and substitutions) to the target graph until there exists a subgraph isomorphism to the query graph. Each edit operation is assigned a cost function, on the basis of which the quality of an ECSI is estimated. The goal is then to find the sequence of edit operations leading to the ECSI between the query and target graphs that has the minimal cost. To find this sequence, the ECSI detection algorithm relies on an exhaustive A*-based exploration of the search space.

The adaptation of this algorithm in order to handle

the p-graphs concerns the deﬁnition of: (i) measures

for evaluating the similarity of two activities that inte-

grate the similarity of their names, inputs and outputs;

(ii) measures for evaluating the behavioral/structural

similarities; (iii) heuristics for detecting the granular-

ity level differences.
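To make item (i) concrete, the sketch below combines name, input and output similarity into a single activity score. It is a stand-in under stated assumptions: plain string similarity (`difflib`) and Jaccard overlap replace the ontology-based measures of the cited work, and the weights are arbitrary.

```python
from difflib import SequenceMatcher


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def name_similarity(n1, n2):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, n1.lower(), n2.lower()).ratio()


def activity_similarity(q, t, weights=(0.4, 0.3, 0.3)):
    """Weighted combination of name, input and output similarity."""
    w_name, w_in, w_out = weights
    return (w_name * name_similarity(q["name"], t["name"])
            + w_in * jaccard(q["inputs"], t["inputs"])
            + w_out * jaccard(q["outputs"], t["outputs"]))


# Activity labels taken from Figure 1; the scores are only illustrative.
query_activity = {"name": "PDFConvertion", "inputs": {"file", "fileExt"},
                  "outputs": {"pdfFile"}}
create_pdf = {"name": "createPDF", "inputs": {"file", "fileExt"},
              "outputs": {"pdfFile"}}
return_error = {"name": "returnError", "inputs": set(), "outputs": {"errorMsg"}}
```

With identical inputs and outputs, `createPDF` scores at least 0.6 against the query activity, whereas `returnError` cannot exceed 0.4 because it shares no input or output concept.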

This algorithm allows evaluating the similarity of two p-graphs as well as finding a set of correspondences between their activities. Experimental results showed the effectiveness of this approach in terms of precision/recall of the found matches. However, the time complexity induced by the combinatorial search space limits its application in practice to p-graphs of relatively small sizes (55 activities). To make this algorithm tractable, we defined two heuristics that aim to find matches of acceptable quality in reasonable execution times. More details are given in (Gater et al., 2010; Gater et al., 2011b).

6 PREFERENCE SATISFACTION EVALUATION

After the calculation of the structural similarity between the query and each candidate p-graph, the most similar ones are subjected to the preference satisfaction evaluation, which calculates the degree to which the QoS annotations of the p-graphs satisfy the QoS preferences of the query. The procedure first calculates the satisfaction degrees of atomic preferences and then recalculates these degrees based on the complex preferences. At the end, a global satisfaction degree is obtained from the aggregation of the preference degrees. The mapping between query and target p-graphs is used in the evaluation procedure to recalculate the QoS attributes of target p-graphs and to evaluate the satisfaction degree of atomic preferences.

6.1 Atomic and Complex Preference Evaluation

For each activity correspondence (w,v), the degree

to which each atomic preference of v is satisﬁed by

its corresponding annotation in w is calculated. The

same is similarly done for the proﬁle preferences.

For a preference p of the type around, between, min or max, given its corresponding annotation a, the satisfaction degree δ(p, a) is given by the normalized satisfaction distance d(p, a), which measures how far the value r in annotation a is from the values favored by preference p. For a preference of the type like or dislike, the satisfaction degree is based on the semantic similarity between concepts given by the classic edge counting technique proposed in (Wu and Palmer, 1994). When a membership function µ defines the semantics of the preference, the satisfaction degree is obtained by applying the function to the corresponding quality attribute.
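The text does not spell out the normalized distance d(p, a). The Python sketch below shows one plausible reading for the numeric constructors, assuming that satisfaction decreases linearly from 1 (value inside the favored range) to 0 (value at the far bound of the attribute domain). The function names and the domain bounds `dom_min` and `dom_max` are assumptions made for this example.

```python
def _degree(distance, scale):
    """Map a distance to [0, 1]: 1 at distance 0, 0 at distance >= scale."""
    if scale <= 0:
        return 1.0 if distance <= 0 else 0.0
    return max(0.0, min(1.0, 1.0 - distance / scale))


def around(value, target, dom_min, dom_max):
    """Values close to `target` are favored."""
    scale = max(target - dom_min, dom_max - target)
    return _degree(abs(value - target), scale)


def between(value, low, high, dom_min, dom_max):
    """Values inside [low, high] are fully satisfying."""
    if value < low:
        return _degree(low - value, low - dom_min)
    if value > high:
        return _degree(value - high, dom_max - high)
    return 1.0


def minimize(value, dom_min, dom_max):
    """`min` constructor: the lowest value of the domain is favored."""
    return _degree(value - dom_min, dom_max - dom_min)


def maximize(value, dom_min, dom_max):
    """`max` constructor: the highest value of the domain is favored."""
    return _degree(dom_max - value, dom_max - dom_min)
```

For instance, under these assumptions a response time of 80 evaluated against `between(10, 60)` on a domain of [0, 100] lies 20 units outside the interval, out of a possible 40, which gives a satisfaction degree of 0.5.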

When a hard preference is not satisfied, the target p-graph is eliminated from the discovery result. As a consequence, the verification of the hard preferences at activity level may be interwoven with the structural matching when the latter checks whether two activities match. This eliminates the mappings that express a structural match but do not satisfy the preferences and thus prunes the search space.

The satisfaction degrees of the atomic preferences are reevaluated according to the order of importance defined by the complex preferences (pareto and prioritized). The goal is to assign weights to the satisfaction degrees of atomic preferences to capture this order of importance. This is done with the help of a preference graph, which is a rooted directed graph whose nodes represent atomic preferences and whose edges represent a prioritized preference from source to destination; each node of the graph has a weight $\omega = 1/i$, where i is the edge distance from the node to the graph root. Then, the reevaluation of a preference p is done by $\delta'(p, a) = \delta(p, a) \times \omega$.
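The reweighting step can be sketched as follows. Two assumptions are made loudly here because the extracted text is incomplete on both points: the weight of a node is taken to be the reciprocal of its edge distance i to the root, and the root is modeled as a virtual node so that the most important preferences sit at distance 1. The prioritized relation is also assumed to be acyclic. All names are hypothetical.

```python
from collections import deque

ROOT = "root"  # virtual root node (assumption)


def node_weights(preferences, prioritized_edges):
    """Weight 1/i per preference, i being the edge distance to the virtual root.

    prioritized_edges: (more_important, less_important) pairs, assumed acyclic.
    """
    children = {p: [] for p in preferences}
    children[ROOT] = []
    has_parent = set()
    for more, less in prioritized_edges:
        children[more].append(less)
        has_parent.add(less)
    # Preferences that no other preference dominates hang directly off the root.
    for p in preferences:
        if p not in has_parent:
            children[ROOT].append(p)

    # Breadth-first search yields the shortest edge distance to the root.
    distance = {ROOT: 0}
    queue = deque([ROOT])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in distance:
                distance[child] = distance[node] + 1
                queue.append(child)
    return {p: 1.0 / distance[p] for p in preferences}


def reweight(degrees, weights):
    """Apply delta'(p, a) = delta(p, a) * weight(p) to every preference."""
    return {p: degrees[p] * weights[p] for p in degrees}


# p1 is prioritized over p2; p3 is unrelated to both (illustrative data).
weights = node_weights(["p1", "p2", "p3"], [("p1", "p2")])
adjusted = reweight({"p1": 0.8, "p2": 0.6, "p3": 1.0}, weights)
```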

S-Discovery:ABehavioralandQuality-basedServiceDiscoveryontheCloud

107

6.2 Preference Satisfaction Degree Calculation

The global preference satisfaction degree $\lambda_{pref}$ indicates the degree to which the QoS annotations of a target satisfy the QoS preferences of a query. This degree is calculated with the help of a preference satisfaction metric, which receives as input the preference satisfaction degrees $\{\delta_1, \ldots, \delta_n\}$ of the target and aggregates them to provide the global preference satisfaction degree $\lambda_{pref}$. Our framework provides three different metrics.

The average-based metric calculates the preference satisfaction degree $\lambda_{pref}$ as the average of the preference satisfaction degrees $\{\delta_1, \ldots, \delta_n\}$.

The linguistic quantifier-based metric calculates the degree $\lambda_{pref}$ by measuring the truth degree of the sentence γ: “almost all preferences are satisfied”. This sentence is a fuzzy quantified proposition defined using a relative quantifier (e.g., almost all, at least, around half, etc.) (Glöckner, 2004).

The bipolar-based metric calculates the degree $\lambda_{pref}$ by evaluating the bipolar condition (Dubois and Prade, 2008) “all hard preferences are satisfied and if possible at least one soft preference is satisfied”. This method returns a bipolar degree $\lambda_{pref}$ in the form of a pair of degrees, meaning that “all hard preferences are satisfied to at least the first degree and at least one soft preference is satisfied to at least the second degree”. More details are given in (Lemos et al., 2012).
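The three metrics can be approximated by the following Python sketch. It is an illustration under assumptions rather than the formulation of the cited works: the quantifier-based metric uses a simple Zadeh-style evaluation (the quantifier's membership function applied to the average degree) with an assumed piecewise-linear definition of "almost all", and the bipolar metric uses min for "all" and max for "at least one".

```python
def average_metric(degrees):
    """Mean of the preference satisfaction degrees (0 for an empty list)."""
    return sum(degrees) / len(degrees) if degrees else 0.0


def almost_all(proportion, lower=0.5, upper=0.9):
    """Assumed membership function of the relative quantifier 'almost all'."""
    if proportion <= lower:
        return 0.0
    if proportion >= upper:
        return 1.0
    return (proportion - lower) / (upper - lower)


def quantifier_metric(degrees, quantifier=almost_all):
    """Truth degree of 'Q preferences are satisfied' (sigma-count reading)."""
    return quantifier(average_metric(degrees))


def bipolar_metric(hard_degrees, soft_degrees):
    """Pair (degree to which all hard hold, degree to which some soft holds)."""
    hard = min(hard_degrees, default=1.0)  # no hard preference: trivially met
    soft = max(soft_degrees, default=0.0)  # no soft preference: nothing gained
    return (hard, soft)
```

For example, `bipolar_metric([0.8, 1.0], [0.2, 0.6])` yields `(0.8, 0.6)`: every hard preference is satisfied to at least 0.8, and at least one soft preference is satisfied to 0.6.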

Once the structural similarity and quality satisfac-

tion degrees are computed, the retrieved p-graphs are

subsequently ranked according to the structural and

quality satisfaction degrees using aggregation tech-

niques. Our framework proposes a set of aggregation

techniques (lexicographic order, weighted average,

fuzzy-based techniques) that are detailed in (Lemos

et al., 2012).
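Two of the aggregation techniques named above, lexicographic order and weighted average, can be sketched as follows. The tuple layout, the choice of structural similarity as the first lexicographic criterion and the default weight are assumptions made for this example.

```python
def rank_lexicographic(results):
    """Sort by structural similarity first, preference satisfaction second.

    results: list of (graph_id, structural_similarity, preference_satisfaction).
    """
    return sorted(results, key=lambda r: (r[1], r[2]), reverse=True)


def rank_weighted(results, w_struct=0.5):
    """Sort by a weighted average of the two degrees."""
    def score(result):
        return w_struct * result[1] + (1.0 - w_struct) * result[2]
    return sorted(results, key=score, reverse=True)


# Illustrative degrees for three retrieved p-graphs.
results = [("T1", 0.9, 0.2), ("T2", 0.8, 0.9), ("T3", 0.9, 0.5)]
lexicographic = [r[0] for r in rank_lexicographic(results)]
weighted = [r[0] for r in rank_weighted(results)]
```

The two orderings can differ: the lexicographic order favors T3 and T1 for their higher structural similarity, while the weighted average ranks T2 first because of its high preference satisfaction.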

7 RELATED WORK

Our work addresses an important topic in the area of service-oriented architecture, namely the discovery of services based on their process models. Several works have proposed similarity measures (Wombacher et al.; de Medeiros et al., 2008; Dijkman et al.) for evaluating the similarity of two service process models. These measures consider either the structural or the behavioral perspective of the process models. While structure-based approaches consider the process topologies, behavior-based approaches consider the execution semantics of the process models. In this case, service process discovery is done by comparing the query against each target service and subsequently ranking target processes according to their closeness to the query. To avoid browsing the whole process repository, some approaches rely on indexing structures (Gater et al., 2011a; Awad and Sakr, 2010; Yan et al., 2010) to quickly retrieve the processes that are most likely similar to (a part of) a process query.

With regard to quality-based service discovery, current approaches (Mokhtar et al.; Agarwal et al., 2009) consider services as black boxes, so quality requirements are defined over the service profile. Generally, they specify quality preferences as relational expressions, fuzzy sets, linguistic variables, or utility functions. These approaches do not propose preference constructors to help the user better define and compose preferences, and they are not abstract enough to be adapted to different non-functional contexts.

While process similarity search is an active field of business process management research, little attention has been given until now to the discovery of services hosted in the cloud (Goscinski and Brock, 2010); the existing techniques are limited in what information can be used when publishing and discovering services (e.g., Microsoft Azure (Microsoft, Inc.)). To the best of our knowledge, there is no process discovery framework that combines functional and non-functional requirements.

8 CONCLUSIONS

In this paper we presented a framework for process discovery taking into account both functional and non-functional criteria. The user query is expressed as a process model adorned with quality annotations expressing user preferences and requirements. Our approach will allow searching process repositories offered by BPaaS providers.

In our past work, we implemented basic match-

ing operators and evaluated them in terms of efﬁ-

ciency and effectiveness. Our future work consists

in building a prototype implementing the framework

presented in this paper by reusing and adapting our

matching operators.

REFERENCES

Agarwal, S., Lamparter, S., and Studer, R. (2009). Making Web services tradable: A policy-based approach for specifying preferences on Web service properties. Journal of Web Semantics, 7(1):11–20.

Anstett, T., Leymann, F., Mietzner, R., and Strauch, S. (2009). Towards BPEL in the cloud: Exploiting different delivery models for the execution of business processes. In SERVICES’2009, pages 670–677.

Awad, A. and Sakr, S. (2010). Querying graph-based repositories of business process models. In DASFAA Workshops’10.

de Medeiros, A. K. A., van der Aalst, W. M. P., and Weijters, A. (2008). Quantifying process equivalence based on observed behavior. In Journal DKE, pages 55–74.

Dijkman, R., Dumas, M., and García-Bañuelos, L. Graph matching algorithms for business process model similarity search. In BPM’09.

Dubois, D. and Prade, H. (2008). Handling bipolar queries in fuzzy information processing. In Handbook of Research on Fuzzy Information Processing in Databases, pages 97–114. IGI Global.

Gater, A., Grigori, D., and Bouzeghoub, M. (2010). Complex mapping discovery for semantic process model alignment. In IIWAS’10, pages 317–324.

Gater, A., Grigori, D., and Bouzeghoub, M. (2011a). Indexing process model flow dependencies for similarity search. In COOPIS’2012.

Gater, A., Grigori, D., Haddad, M., Bouzeghoub, M., and Kheddouci, H. (2011b). A summary-based approach for enhancing process model matchmaking. In SOCA’11, pages 1–8.

Glöckner, I. (2004). Fuzzy Quantifiers in Natural Language: Semantics and Computational Models. Der Andere Verlag.

Goderis, A., Li, P., and Goble, C. A. (2006). Workflow discovery: the problem, a case study from e-Science and a graph-based solution. In ICWS’06, pages 312–319.

Goscinski, A. and Brock, M. (2010). Toward dynamic and attribute based publication, discovery and selection for cloud computing. FGCS’10.

Kießling, W. Foundations of preferences in database systems. In VLDB’02.

Lemos, F., Abbaci, K., Grigori, D., Hadjali, A., Bouzeghoub, M., Liétard, L., and Rocacher, D. (2012). Intégration de préférences dans la découverte et la sélection des services web: Une approche fondée sur les ensembles flous. Ingénierie des Systèmes d’Information.

Messmer, B. T. (1995). Graph Matching Algorithms and Applications. PhD thesis, University of Bern, Switzerland.

Microsoft, Inc. Microsoft Azure. http://www.windowsazure.com.

Mokhtar, S. B., Preuveneers, D., Georgantas, N., Issarny, V., and Berbers, Y. EASY: Efficient semAntic Service discoverY in pervasive computing environments with QoS and context support. JSS’08.

Pathirage, M., Perera, S., Kumara, I., and Weerawarana, S. (2011). A multi-tenant architecture for business process executions. In ICWS’11, pages 121–128.

Wombacher, A., Fankhauser, P., Mahleko, B., and Neuhold, E. Matchmaking for business processes based on choreographies. In EEE’04.

Wu, Z. and Palmer, M. S. (1994). Verb semantics and lexical selection. In 32nd Annual Meeting of the Association for Computational Linguistics (ACL 1994), pages 133–138.

Yan, Z., Dijkman, R. M., and Grefen, P. (2010). Fast business process similarity search with feature-based similarity estimation. In CoopIS’10.

S-Discovery:ABehavioralandQuality-basedServiceDiscoveryontheCloud

109