works. Section 4 describes our experimental settings
and analyses the results. Finally, conclusions and fu-
ture work are presented in Section 5.
2 PRELIMINARIES
We define information diffusion in terms of infor-
mation flow from output parameter(s) of a Web ser-
vice operation to input parameter(s) of other Web ser-
vice operations in a Web services network. To this
end, we first categorize and semantically annotate the
Web services under examination. Web service matchmaking
is the next step, which leads to the construction
of a Web service network. Finally, we apply our
information diffusion discovery model to estimate the
information flow in the network.
2.1 Web Service Categorization
In the Web service categorization step, we assign each
individual Web service to its corresponding categories.
A category describes the general kind of service
that is provided, for example “banking service” and
“weather service” (Heß and Kushmerick, 2003). In
the context of this paper we are only interested in cat-
egorizing Web services at higher category levels (e.g.
“E-Commerce”, “Weather”, etc.) rather than at lower
levels (e.g. “search for a flight”, “get temperature”).
For instance, the Logistics category in our categorization
scheme includes any Web service whose operations
are related in some way to transportation or postal
services, such as DHL Service and Fedex Notification
Service. In this regard, our categorization scheme is
similar to the approaches of Heß and Kushmerick (2003)
and Crasso et al. (2008). We assume that there exists
a set $D = \{d_1, d_2, \ldots, d_n\}$ of Web service categories
where no structural relationship (e.g. taxonomic) is
assumed among members of D. It should be noted that
a Web service can be associated with multiple cate-
gories.
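
The flat, many-to-many nature of this categorization can be illustrated with a minimal Python sketch. The names `CATEGORIES` and `assign_categories` are hypothetical, and the second category given to Fedex Notification Service is invented purely to show a multi-category assignment; it is not taken from our data set.

```python
from collections import defaultdict

# Flat category set D: no taxonomic or other structural relation is assumed.
# The three labels are examples mentioned in the text, not the full scheme.
CATEGORIES = {"Logistics", "E-Commerce", "Weather"}


def assign_categories(assignments):
    """Build a service -> set-of-categories map, rejecting unknown labels."""
    service_categories = defaultdict(set)
    for service, category in assignments:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        service_categories[service].add(category)
    return dict(service_categories)


# Illustrative input: one service is deliberately placed in two categories.
example = assign_categories([
    ("DHL Service", "Logistics"),
    ("Fedex Notification Service", "Logistics"),
    ("Fedex Notification Service", "E-Commerce"),  # hypothetical assignment
])
```

Storing the categories of a service as a set keeps the association unordered and free of duplicates, which matches the absence of any structure among members of D.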
2.2 Semantic Annotation and Web
Service Matching
In this work we only require annotation of the basic
elements of Web service operation input and output
parameters. These element names are either WSDL
message part names or XML schema leaf element
names. The reason is that the actual pieces of
information exchanged between services are encoded
in these basic elements. The extracted terms are
fed to our previously developed ontology learning
component (Mokarizadeh et al., 2010) to generate
a reference domain ontology. The reference ontology
is formally represented as $C = \{c_1, c_2, \ldots\}$, where $c_i$
represents an element of the reference ontology. In our
reference ontology, concepts are inter-related through
additional ontological relations (Mokarizadeh et al.,
2010).
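
As a rough illustration of the term extraction step, the following Python sketch collects message part names from a WSDL 1.1 document using only the standard library. The WSDL fragment and the function name `message_part_names` are hypothetical; XML schema leaf elements and the ontology learning component itself are not covered here.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical WSDL fragment, invented for illustration only.
SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="GetTemperatureRequest">
    <part name="cityName" type="xsd:string"/>
  </message>
  <message name="GetTemperatureResponse">
    <part name="temperature" type="xsd:float"/>
  </message>
</definitions>
"""


def message_part_names(wsdl_text):
    """Return {message name: [part names]} for a WSDL 1.1 document."""
    root = ET.fromstring(wsdl_text)
    parts = {}
    for message in root.iter(f"{{{WSDL_NS}}}message"):
        parts[message.get("name")] = [
            part.get("name")
            for part in message.findall(f"{{{WSDL_NS}}}part")
        ]
    return parts


part_names = message_part_names(SAMPLE_WSDL)
```

The resulting names (here `cityName` and `temperature`) are the kind of terms that would be passed on for annotation.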
Semantic annotations of Web services are exploited
in order to find semantic matches between
inputs and outputs of services. As the annotated
elements (i.e. terms) are in fact instances in the
generated reference ontology, the instance matching
process is used to find ontological relationships between
those instances. We employ a rule-based instance
matching method that has already been described and
evaluated in our previous work (Mokarizadeh et al.,
2011). The matching component takes as input a
pair of instances and produces a correspondence
element. Each correspondence element indicates whether
a semantic relation holds between the two given
instances, according to a particular matching rule. The
presence of such a semantic relation means that the
underlying output and input elements of Web service
operation parameters can be matched. The implicit
assumption here is that the matching process is only
performed between pairs of elements in which one
represents an output element of a Web service
operation and the other represents an input element of
another Web service operation. The results of the
matching process are exploited in Web service network
formation, which is discussed in the next sections.
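
The output-to-input restriction on matching can be sketched as follows. This is only an illustration under stated assumptions: the single "same-concept" rule is a stand-in for the rule set described in (Mokarizadeh et al., 2011), and the classes `Parameter` and `Correspondence`, the functions `match` and `match_all`, and the sample services are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Parameter:
    service: str
    operation: str
    name: str
    direction: str  # "input" or "output"
    concept: str    # label of the annotated ontology element


@dataclass(frozen=True)
class Correspondence:
    source: Parameter  # output element
    target: Parameter  # input element
    rule: str          # name of the rule that produced the match


def match(out_p: Parameter, in_p: Parameter) -> Optional[Correspondence]:
    """Stand-in rule (an assumption): match when both share one concept."""
    if out_p.direction != "output" or in_p.direction != "input":
        return None
    # Only pair an output with an input of another operation.
    if (out_p.service, out_p.operation) == (in_p.service, in_p.operation):
        return None
    if out_p.concept == in_p.concept:
        return Correspondence(out_p, in_p, "same-concept")
    return None


def match_all(parameters: List[Parameter]) -> List[Correspondence]:
    """Try every (output, input) pair and keep the successful matches."""
    found = []
    for out_p in (p for p in parameters if p.direction == "output"):
        for in_p in (p for p in parameters if p.direction == "input"):
            result = match(out_p, in_p)
            if result is not None:
                found.append(result)
    return found


# Hypothetical annotated parameters of two invented services.
correspondences = match_all([
    Parameter("WeatherService", "getTemperature", "cityName", "input", "City"),
    Parameter("WeatherService", "getTemperature", "temperature", "output", "Temperature"),
    Parameter("GeoService", "lookupCity", "zip", "input", "PostalCode"),
    Parameter("GeoService", "lookupCity", "city", "output", "City"),
])
```

In this toy input, the only correspondence links the `city` output of `lookupCity` to the `cityName` input of `getTemperature`; a real rule set would also consider ontological relations beyond concept equality.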
2.3 Web Services Network Models
We distinguish Annotated, Semantic and Category
representations of Web service networks derived from
semantically annotated Web services.
Annotated Web Service Model. This network captures
the main elements of WSDL descriptions as nodes
and edges of a directed graph. The graph is further
enriched with references to ontology elements and
category labels. A node $P_i$ in this model represents an input
or output parameter (i.e. a WSDL message part
name or an XSD schema leaf element name) of a Web
service operation. Every node is annotated with: 1)
a semantic label $C_i$ that points to an ontology element
in the reference ontology $C$, and 2) a category label $D_i$ that
refers to the affiliated category in the category list $D$.
Finally, nodes are connected by their respective Web service
operations, represented as directed edges from the nodes
representing input elements towards the nodes representing
the output elements. In fact, an instance of this
network model is nothing more than a collection of
discrete graphs constructed to facilitate understand-