Each component and connector in the simulation is
implemented in Java and has its own thread of control. In
addition, Java RMI is used as the middleware for
message delivery. The simulation runs on a single
machine. Thus, components are concurrent but
distribution is simulated.
The adaptation and recovery scenarios consist of
simulating adaptation and service failure,
respectively, while three transactions are being
processed. During simulation, every application
message contains in its header (1) a transaction
identifier that uniquely identifies the transaction of
this message, (2) the identifier of the message
producer component, (3) the identifier of the message
consumer component, (4) the timestamp at which the
message producer sent the message, (5) a message
type identifying whether the message initiates a
transaction, completes a transaction, or is an
intermediate message of a transaction, and (6) a
sequence number for detecting duplicate messages.
In the remainder of this section, we use the
notation msg(tid, s, r, ts, p) to represent messages,
where msg is either request or response, tid is the
transaction identifier of the message, s is the identifier
of the message sender, r is the identifier of the message
receiver, ts is the timestamp of the message, and p
identifies the message type. The sequence number is
omitted from this notation.
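As an illustration of the header fields and the notation above, the following is a minimal Java sketch, not the simulation's actual code. The names MessageNotation, Header, Type, and notation are hypothetical, and the NONE type is included only because it appears in the execution trace for a single request/response transaction. The trace does not report sequence numbers, so the value used in main is a placeholder.

```java
import java.util.Locale;

final class MessageNotation {
    // Message types used in the text and trace (NONE appears in the trace only).
    enum Type { BEGIN, INTERMEDIATE, END, NONE }

    // Hypothetical header type holding the six fields listed in the text.
    record Header(String tid, String sender, String receiver,
                  long timestamp, Type type, long sequenceNumber) {

        // Renders the header as msg(tid, s, r, ts, p); the sequence number is omitted.
        String notation(String kind) {
            return kind + "(" + tid + ", " + sender + ", " + receiver + ", "
                    + timestamp + ", " + type.name().toLowerCase(Locale.ROOT) + ")";
        }
    }

    public static void main(String[] args) {
        // Sequence number 0 is an assumed placeholder value.
        Header h = new Header("c2_1", "client2", "service", 1, Type.BEGIN, 0);
        System.out.println(h.notation("request"));
        // prints: request(c2_1, client2, service, 1, begin)
    }
}
```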
5.1 Recovery Scenario
In the failure scenario, the connector analyzes the
failure, determines which transactions need to be
recovered, and resends their requests to the new
service once it has been instantiated on a different
node. The execution trace (fig. 5) shows that, at the
time of service failure, the service was engaged in
three transactions with three clients: two transactions
involving dialogs (transactions c1_1 and c2_1) and
one transaction involving a single request/response
exchange (transaction c3_1). The execution trace
also shows that, at the time of failure, the messages
queued at the connector were as follows:
• SPQ contains no requests that have been received
by the connector but not forwarded to the service.
• SAQ contains three requests (received by the
connector and forwarded to the service):
o request(c2_1, client2, service, 1, begin)
o request(c3_1, client3, service, 1, none)
o request(c1_1, client1, service, 6, end)
• SRQ contains one request (for which a service
response is received at the connector):
o request(c1_1, client1, service, 1, begin)
• RFQ contains one response (received by the
connector but not forwarded yet to the client):
o response(c2_1, service, client2, 6, intermediate)
• RRQ contains one response (received by the
connector and forwarded to the client):
o response(c1_1, service, client1, 3, intermediate)
The execution trace indicates that, during failure
analysis, the recovery connector identified transactions
c1_1, c2_1, and c3_1 as failed because none of them
has a response that completes the transaction in
either the RFQ or the RRQ.
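The failure analysis rule described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the connector's actual implementation: the class FailureAnalysisSketch, the Msg record, and the helper completes are hypothetical names, and we assume that a response typed "end" (or "none", for a single request/response exchange) is what completes a transaction.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class FailureAnalysisSketch {
    // Hypothetical message type following the msg(tid, s, r, ts, p) notation.
    record Msg(String kind, String tid, String sender, String receiver, long ts, String type) {}

    // Assumption: a response typed "end" or "none" completes its transaction.
    static boolean completes(Msg m) {
        return m.kind().equals("response")
                && (m.type().equals("end") || m.type().equals("none"));
    }

    // A transaction with a request in the SAQ or SRQ is failed if neither
    // the RFQ nor the RRQ holds a response that completes it.
    static Set<String> failedTransactions(List<Msg> saq, List<Msg> srq,
                                          List<Msg> rfq, List<Msg> rrq) {
        Set<String> completed = new HashSet<>();
        for (List<Msg> queue : List.of(rfq, rrq)) {
            for (Msg m : queue) {
                if (completes(m)) completed.add(m.tid());
            }
        }
        Set<String> failed = new TreeSet<>();
        for (List<Msg> queue : List.of(saq, srq)) {
            for (Msg m : queue) {
                if (!completed.contains(m.tid())) failed.add(m.tid());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Queue contents taken from the execution trace in the text.
        List<Msg> saq = List.of(
                new Msg("request", "c2_1", "client2", "service", 1, "begin"),
                new Msg("request", "c3_1", "client3", "service", 1, "none"),
                new Msg("request", "c1_1", "client1", "service", 6, "end"));
        List<Msg> srq = List.of(
                new Msg("request", "c1_1", "client1", "service", 1, "begin"));
        List<Msg> rfq = List.of(
                new Msg("response", "c2_1", "service", "client2", 6, "intermediate"));
        List<Msg> rrq = List.of(
                new Msg("response", "c1_1", "service", "client1", 3, "intermediate"));
        System.out.println(failedTransactions(saq, srq, rfq, rrq));
        // prints: [c1_1, c2_1, c3_1]
    }
}
```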
The recovery plan created while the connector is
in the Planning for Recovery state consists of a list
that identifies the messages that must be restored
from the SRQ and the SAQ to recover the failed
transactions. The list obtained from the execution
trace indicates that the first request to be recovered is
request(c1_1, client1, service, 1, begin), which is
queued in the SRQ, since this request was the first
request processed by the service before it failed. The
second request in the list is request(c2_1, client2,
service, 1, begin), queued in the SAQ, since this
request was also processed by the service and its
response is queued in the RFQ. The list also contains
actions to recover request(c3_1, client3, service, 1,
none) and request(c1_1, client1, service, 6, end), in
that order, both of which are queued in the SAQ.
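One simple ordering rule that reproduces the recovery plan reported in the trace is to take the failed transactions' requests from the SRQ first, then those from the SAQ, each in queue order. The sketch below illustrates that rule only; the connector may apply a different criterion, and the names RecoveryPlanSketch, Msg, and plan are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class RecoveryPlanSketch {
    // Hypothetical message type following the msg(tid, s, r, ts, p) notation.
    record Msg(String kind, String tid, String sender, String receiver, long ts, String type) {}

    // Assumed rule: SRQ requests (already answered by the failed service) come
    // first, followed by SAQ requests, each in its original queue order.
    static List<Msg> plan(List<Msg> srq, List<Msg> saq, Set<String> failedTids) {
        List<Msg> plan = new ArrayList<>();
        for (Msg m : srq) {
            if (failedTids.contains(m.tid())) plan.add(m);
        }
        for (Msg m : saq) {
            if (failedTids.contains(m.tid())) plan.add(m);
        }
        return plan;
    }

    public static void main(String[] args) {
        // Queue contents taken from the execution trace in the text.
        List<Msg> srq = List.of(
                new Msg("request", "c1_1", "client1", "service", 1, "begin"));
        List<Msg> saq = List.of(
                new Msg("request", "c2_1", "client2", "service", 1, "begin"),
                new Msg("request", "c3_1", "client3", "service", 1, "none"),
                new Msg("request", "c1_1", "client1", "service", 6, "end"));
        for (Msg m : plan(srq, saq, Set.of("c1_1", "c2_1", "c3_1"))) {
            System.out.println(m.tid() + " " + m.type());
        }
    }
}
```

With the queue contents from the trace, this yields the order c1_1 begin, c2_1 begin, c3_1 none, c1_1 end.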
While in the Executing Recovery Plan state, the
connector executed the recovery plan by restoring
messages from the SRQ and the SAQ to the SPQ.
After all messages are recovered, the execution trace
shows that the messages queued in the SPQ (starting
from the head of the SPQ) are as follows:
• request(c1_1, client1, service, 1, begin)
• request(c2_1, client2, service, 1, begin)
• request(c3_1, client3, service, 1, none)
• request(c1_1, client1, service, 6, end)
The execution trace also indicates that, while the
connector was in the Component Recovering state, it
received a new request, request(c4_1, client4,
service, 1, none). This request was queued at the tail
of the SPQ, so that it is sent last when the service is
recovered.
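The queueing behavior described above, with recovered requests restored to the SPQ in plan order and later arrivals appended at the tail, can be sketched with a FIFO deque. This is a minimal sketch assuming the SPQ is empty when restoration starts, as it is in the trace; SpqRestoreSketch and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class SpqRestoreSketch {
    // Hypothetical message type following the msg(tid, s, r, ts, p) notation.
    record Msg(String kind, String tid, String sender, String receiver, long ts, String type) {}

    // The SPQ modeled as a FIFO queue; the head is the next request to forward.
    private final Deque<Msg> spq = new ArrayDeque<>();

    // Restores recovered requests in recovery-plan order.
    // Assumption: the SPQ is empty at this point, as in the trace.
    void restore(List<Msg> recoveryPlan) {
        for (Msg m : recoveryPlan) {
            spq.addLast(m);
        }
    }

    // A request arriving while the service is recovering goes to the tail.
    void onNewRequest(Msg m) {
        spq.addLast(m);
    }

    // Returns the next request to forward, or null if the SPQ is empty.
    Msg nextToForward() {
        return spq.pollFirst();
    }

    public static void main(String[] args) {
        SpqRestoreSketch connector = new SpqRestoreSketch();
        connector.restore(List.of(
                new Msg("request", "c1_1", "client1", "service", 1, "begin"),
                new Msg("request", "c2_1", "client2", "service", 1, "begin"),
                new Msg("request", "c3_1", "client3", "service", 1, "none"),
                new Msg("request", "c1_1", "client1", "service", 6, "end")));
        connector.onNewRequest(new Msg("request", "c4_1", "client4", "service", 1, "none"));
        for (Msg m = connector.nextToForward(); m != null; m = connector.nextToForward()) {
            System.out.println(m.tid() + " " + m.type());
        }
    }
}
```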
After the service was recovered, the connector
resumed forwarding requests to the recovered service.
As shown in fig. 5, requests recovered from the SRQ
and SAQ are first resent sequentially, in the order
specified in the recovery plan. Note that
response(c1_1, service, client1, 3, intermediate) had
already been forwarded to the client before the
service failed, so when this response is produced
again during recovery, it is discarded as a duplicate.
Then, the new requests queued at the tail of the SPQ
are forwarded to the recovered service. These
requests need not be forwarded sequentially.
At this point, the connector resumes forwarding
requests and responses normally.
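The duplicate check applied to responses during recovery could look like the sketch below. It assumes that a response is identified by its transaction identifier together with the header's sequence number, which the text says is used for detecting duplicates. The exact key used by the connector is not stated, the trace does not report sequence numbers, and DuplicateFilterSketch and its members are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

final class DuplicateFilterSketch {
    // Hypothetical response type: the msg(tid, s, r, ts, p) fields plus the sequence number.
    record Response(String tid, String sender, String receiver,
                    long ts, String type, long seq) {}

    // Keys of responses already forwarded to clients.
    // Assumption: (tid, sequence number) identifies a response.
    private final Set<String> forwarded = new HashSet<>();

    // Returns true if the response should be forwarded to the client,
    // or false if it duplicates one forwarded earlier and must be discarded.
    boolean shouldForward(Response r) {
        return forwarded.add(r.tid() + "#" + r.seq());
    }

    public static void main(String[] args) {
        DuplicateFilterSketch filter = new DuplicateFilterSketch();
        // Sequence number 1 is an assumed value for illustration.
        Response before = new Response("c1_1", "service", "client1", 3, "intermediate", 1);
        // The same response regenerated after recovery, with a later (assumed) timestamp.
        Response after = new Response("c1_1", "service", "client1", 9, "intermediate", 1);
        System.out.println(filter.shouldForward(before)); // true: first delivery
        System.out.println(filter.shouldForward(after));  // false: duplicate, discarded
    }
}
```

Keying on the sequence number rather than the timestamp lets a regenerated response be recognized even though the recovered service produces it at a later time.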