Figure 3: Diagnosis Process.
• QoS4: (T4 - T1) - QoS1: Communication time.
The values of the parameters are stored in the PaRe
(Parameter Repository) database for further analysis.
An interaction scenario is shown in Figure 2.
If no response message arrives after a request is
sent, the Monitoring Module records an unsuccessful
transaction and marks the provider as unavailable. As
the recovery strategy, the former provider is set offline
and a replacement provider is required.
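The handling of an unsuccessful transaction can be sketched as follows. This is a minimal illustration, not the architecture's actual code: the `ProviderRegistry` class, its fields, and the `record_transaction` function are hypothetical names introduced here, and the way a missing response is detected (for example, a timeout) is left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class ProviderRegistry:
    """Hypothetical registry of the WS providers known to one consumer."""
    active: str
    standby: list[str] = field(default_factory=list)
    offline: set[str] = field(default_factory=set)

    def replace_active(self) -> str:
        """Set the current provider offline and promote the next standby one."""
        if not self.standby:
            raise RuntimeError("no replacement provider available")
        self.offline.add(self.active)
        self.active = self.standby.pop(0)
        return self.active


def record_transaction(registry: ProviderRegistry, response_received: bool) -> bool:
    """Return True for a successful transaction; otherwise replace the provider."""
    if response_received:
        return True
    # No response: the provider is treated as unavailable and replaced.
    registry.replace_active()
    return False
```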
After a certain number of transactions, the Diag-
nosis Module is executed.
2.4 Diagnosis Process
The diagnosis is based on the analysis of QoS param-
eters using a statistical (Moo-Mena et al., 2009) and
ontological model. The statistical model is based on
the box-plot method, a technique used to analyze
a data set (Mendenhall et al., 1998). The results
of the statistical model are the inputs to the
ontological model. The ontological model establishes a
knowledge base that represents the current QoS level
of communication with a WS provider.
For the QoS parameters analysis, the most recent
set of values for each QoS parameter is taken from
PaRe. The statistical model calculates how many
recorded values are outliers, according to the Right
Outer Fence (ROF) measures established by each WS
provider in the Service Level Parameter (SLP) table.
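The outlier count described above can be sketched as follows. The text does not state how the ROF is computed, so the sketch assumes the common box-plot definition of the outer fence, ROF = Q3 + 3 x IQR; the function names and the choice of the "inclusive" quartile method are likewise assumptions made for illustration.

```python
import statistics


def right_outer_fence(sample):
    """Assumed box-plot definition: ROF = Q3 + 3 * IQR."""
    q1, _median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return q3 + 3.0 * (q3 - q1)


def outlier_percentage(values, rof):
    """Percentage of recorded values that exceed the reference ROF."""
    if not values:
        raise ValueError("no recorded values to analyze")
    outliers = sum(1 for value in values if value > rof)
    return 100.0 * outliers / len(values)
```

In this sketch the reference ROF would be computed once from an early sample and stored, while `outlier_percentage` would be applied to the most recent values of each QoS parameter.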
When the diagnosis process runs for the
first time, the SLP table contains reference measures
(ROF) for all providers.
The results of the statistical model are passed to
the ontological model at runtime to determine whether any
recovery action is required (see Figure 3).
The ontological model defines an ontology where
relations are established between the main QoS prop-
erties (W3C, 2009) and the health of a component.
In this work, Integrity and Availability characteristics
are defined and related to the State of performance
entity to determine the Degradation level.
The reference measures were defined by applying
the box-plot method to a sample set of parameters
obtained during an early-stage execution of the
architecture. After testing, the levels were set as
follows: fewer than 5% of outliers indicates Normal
Performance (OK), between 5% and 10% indicates a
Moderate Degradation, and a higher percentage
indicates a Severe Degradation.
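These thresholds can be expressed as a small classification function. The sketch below is illustrative only: the function name and the returned labels are hypothetical, and treating the 5% and 10% boundaries as part of the Moderate range is an assumption, since the text does not say how boundary values are handled.

```python
def degradation_level(outlier_pct):
    """Map a percentage of outliers to a degradation level."""
    if outlier_pct < 5.0:
        return "OK"
    if outlier_pct <= 10.0:  # assumption: both boundaries count as Moderate
        return "MODERATE"
    return "SEVERE"
```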
Applying rules over the knowledge base makes it
possible to infer the performance level of a provider
by comparing the current QoS measures against the
reference measures.
For example, the rule that diagnoses a Severe
Degradation checks whether the percentage of outlier
values for the QoS parameters is greater than the limit
value defined to activate a substitution action. If
so, the ontological model determines a Severe
Degradation.
If the percentage of outlier values is between 3% and
5%, the time intervals of interaction are increasing.
In that case, the ROF is redefined using a recent data
set for each QoS parameter, and the new values are
stored in the SLP table for further use. After the
adaptation, the diagnosis process is executed again to
improve the diagnosis result.
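The adaptation step can be sketched as follows. The representation of the SLP table as a dictionary keyed by (provider, parameter), the function name, and the `compute_rof` argument are all hypothetical; the inclusive handling of the 3% and 5% boundaries is also an assumption.

```python
def adapt_reference_measures(slp_table, provider, recent_values, outlier_pct, compute_rof):
    """Redefine the ROF of each QoS parameter when outliers are between 3% and 5%.

    slp_table: dict mapping (provider, parameter) to the stored ROF.
    recent_values: dict mapping each QoS parameter name to its recent values.
    compute_rof: function that derives a ROF from a list of values.
    Returns True if the table was updated, so the caller can run the
    diagnosis again.
    """
    if not 3.0 <= outlier_pct <= 5.0:
        return False
    for parameter, values in recent_values.items():
        slp_table[(provider, parameter)] = compute_rof(values)
    return True
```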
If the result is Moderate Degradation, the Diagnosis
Module asks the Recovery Module to duplicate its WS
provider to improve the performance level.
If the result is Severe Degradation, the Diagnosis
Process asks the Recovery Module to substitute its WS
provider so that it interacts with another WS
provider.
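The mapping from a diagnosis result to the requested recovery action can be summarized in a short sketch. The enumeration and function names are hypothetical and only mirror the two cases described above; a normal result is assumed to require no action.

```python
from enum import Enum


class Diagnosis(Enum):
    OK = "ok"
    MODERATE = "moderate"
    SEVERE = "severe"


class RecoveryAction(Enum):
    NONE = "none"
    DUPLICATE = "duplicate"
    SUBSTITUTE = "substitute"


# Moderate degradation requests a duplicate, severe degradation a substitute.
ACTION_FOR = {
    Diagnosis.OK: RecoveryAction.NONE,
    Diagnosis.MODERATE: RecoveryAction.DUPLICATE,
    Diagnosis.SEVERE: RecoveryAction.SUBSTITUTE,
}


def requested_action(diagnosis):
    """Return the recovery action requested for a diagnosis result."""
    return ACTION_FOR[diagnosis]
```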
2.5 Recovery Strategies
The diagnosis techniques applied in the Self-healing
architecture are based on redundancy. Each WS
provider should have a duplicate which offers the
same functions. The WS are stateless.
Recovery actions are applied at the diagnosis
module’s request and include duplicating or replacing
the current WS provider.
Duplicating a WS provider divides the consumer
requests among more than one provider; a provider
is selected at random for each transaction.
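Random selection among duplicated providers can be sketched as follows. The function name and the optional `rng` parameter are hypothetical; a uniform choice is assumed, since the text does not specify the selection distribution.

```python
import random


def pick_provider(providers, rng=random):
    """Select one provider uniformly at random for the next transaction."""
    if not providers:
        raise ValueError("no provider available")
    return rng.choice(providers)
```

Because the WS are stateless, consecutive transactions can be served by different providers without transferring session data.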
Replacement, by contrast, redirects the requests
to a new provider that has not been diagnosed as
corrupted.
The changes applied by the recovery module are
A DISTRIBUTED SELF-HEALING ARCHITECTURE SUPPORTING WS-BASED APPLICATIONS