information retrieval techniques they used were cosine similarity, extended Jaccard, information loss, and Jensen-Shannon information divergence. Nevertheless, their algorithm does not consider preconditions and effects; it uses only the information present in the input and output categories. The work presented in this paper considers preconditions and effects, and does not rely on information retrieval techniques.
The approach of Wei et al. (Wei et al., 2008) is similar to that of Klusch et al. (Klusch et al., 2009), since it uses only I/O information and combines it with syntactic information. In this case, information extraction techniques are used to generate a constraint graph, over which similarity matchmaking is performed. However, this extraction operates on textual descriptions, and since users are not obliged to provide such descriptions, this can be a serious limitation of the approach.
Khdour and Fasli (Khdour and Fasli, 2010) propose a method for filtering the semantic web services relevant to a query, in order to reduce the time needed to compute the similarity between the relevant semantic web services and the query. However, their work only determines, a priori and in a binary way, whether a semantic web service is relevant to a query; a similarity algorithm must then be applied to rank the relevant semantic web services.
Kritikos and Plexousakis (Kritikos and Plexousakis, 2006) point out that syntactic-based discovery techniques present results with low precision and high recall. A richer language is necessary to tune the web service discovery process (Petrie, 2009), and that is the objective of Semantic Web Services (McIlraith et al., 2001).
This richer language must be both human- and machine-readable and have good expressivity, without that expressivity implying a loss of decidability, that is, any reasoning performed in this language must finish in feasible time. The idea of the Semantic Web (Shadbolt et al., 2006) is that software agents can automate most of the tasks done by human agents. Thus, the use of such semantic description languages would ease the process of web service discovery for these agents.
Liu et al. (Liu et al., 2009) present an ontology-based algorithm for measuring the similarity between semantic web services. It is based on the work of Li et al. (Li et al., 2003), which uses the information present in a hierarchical semantic knowledge base of words to calculate the similarity between different words. Liu and colleagues apply it to semantic web services by using a domain ontology taxonomy. Their algorithm uses the information present in a web service profile description, which contains the service's inputs, outputs, preconditions and effects, where each of these categories is treated as a set of concepts.
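For illustration only, such a profile can be sketched as four sets of ontology concepts, with a category-level similarity obtained from a pairwise concept similarity. This is our own minimal encoding, not Liu et al.'s implementation, and the aggregation shown (an average of best matches per category) is an assumption; the actual weighting in their algorithm may differ.

```python
# Illustrative sketch only: a service profile as four sets of ontology concepts.
from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceProfile:
    inputs: frozenset        # concept names taken from a domain ontology
    outputs: frozenset
    preconditions: frozenset
    effects: frozenset

def category_similarity(concepts_a, concepts_b, concept_sim):
    """Average, over the concepts of A, of the best-matching concept of B.

    `concept_sim` is any pairwise concept similarity (e.g. a taxonomy-based
    word similarity); the averaging scheme here is an assumption made for
    illustration, not necessarily the one used by Liu et al.
    """
    if not concepts_a or not concepts_b:
        return 0.0
    return sum(max(concept_sim(a, b) for b in concepts_b)
               for a in concepts_a) / len(concepts_a)
```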
Li et al. (Li et al., 2011) present a different kind of similarity measurement: behavioral web service similarity. They state that there are three kinds of similarity, namely syntactic, semantic and behavioral, and their work focuses on the latter. It consists in analyzing how the exchange of messages occurs, building a colored Petri net for each web service and measuring behavioral similarity based on these Petri nets.
It is extremely important that these similarity algorithms present a high precision ratio, given their increasing adoption; e.g., Maamar et al. (Maamar et al., 2011) use Liu's work to build an initial social network for each web service present in a given registry. Unfortunately, the experiments performed show that Liu's algorithm presents low precision and high recall, producing too many false positives and thus resembling syntactic-based discovery techniques (Kritikos and Plexousakis, 2006). Furthermore, Liu's approach has several parameters, but they did not perform any experiments to test which values would provide better precision, recall and F-measure. The current work improves Liu's algorithm by changing the way similarities are calculated, resulting in much better precision and recall.
3 SIMILARITY ALGORITHM
The algorithm proposed by Liu et al. (Liu et al., 2009) calculates the similarity between semantic web services by analyzing the relationships among concepts given by an ontology taxonomy. It is based on the work of Li et al. (Li et al., 2003) for calculating the similarity between words using a hierarchical semantic knowledge base of words, which also takes into account the structure (location, hierarchy) of these words in the taxonomy. An example of a hierarchical knowledge base of words is depicted in Figure 1.
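To make this concrete, the sketch below (our own encoding, not the authors' code) represents the fragment of the Figure 1 taxonomy that appears in the example discussed next, and computes the path length between two words with a breadth-first search over the undirected hierarchy.

```python
# Minimal sketch: a fragment of the Figure 1 taxonomy as child -> parent links.
from collections import deque

PARENT = {
    "boy": "male", "girl": "female",
    "male": "person", "female": "person",
    "adult": "person", "professional": "adult",
    "educator": "professional", "teacher": "educator",
}

def _adjacency():
    """Build an undirected adjacency list from the child -> parent map."""
    adj = {}
    for child, parent in PARENT.items():
        adj.setdefault(child, set()).add(parent)
        adj.setdefault(parent, set()).add(child)
    return adj

def path_length(w1, w2):
    """Number of edges on the shortest path between two words in the taxonomy."""
    adj = _adjacency()
    seen, queue = {w1}, deque([(w1, 0)])
    while queue:
        word, dist = queue.popleft()
        if word == w2:
            return dist
        for nxt in adj[word]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # disconnected words

print(path_length("boy", "girl"))     # 4: boy-male-person-female-girl
print(path_length("boy", "teacher"))  # 6: boy-male-person-adult-professional-educator-teacher
```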
An intuitive way of calculating the similarity between two words consists in evaluating the length of the path needed to reach one word from the other. For instance, considering Figure 1, the word boy is more similar to the word girl than to the word teacher, since the path from boy to girl is boy-male-person-female-girl, while the path from boy to teacher is boy-male-person-adult-professional-educator-teacher. Nevertheless, this way is not the