introduction of clustered messaging brokers and the
fault-tolerant Mobile Connector, we can guarantee
the exactly-once consumption of messages by
agents. The Mobile Connector is a lightweight
platform-independent component which does not
restrict agent autonomy and mobility.
This paper is organized as follows. First, we
present related work on multi-agent system
reliability. We then explain the reliability model
used in our research and describe the architecture
of the External Fault-Tolerant Layer (EFTL), with a
focus on the Mobile Connector component. After
that, we present several scenarios of EFTL
operation and explain what is needed to develop an
application supported by EFTL. The final sections
present a performance analysis of EFTL, our
conclusions and directions for future work.
2 RELATED WORK
Several authors have proposed checkpointing, a
procedure that saves agent states to a persistent
storage medium at certain time intervals. If an
agent later fails, its state can be reconstructed
from the latest checkpoint (Dalmeijer et al, 1998).
This approach depends on the reliability of the
host and suffers from the so-called blocking
problem: agents saved at a particular host can be
recovered only after that host has recovered
(Mohindra et al, 2000). The second approach to
ensuring an agent’s reliability is replication.
Here, a group of agents exists as replicas of one
agent, and any of them can be chosen to act as the
main agent if it fails. The number of agents
increases and they have to cooperate, so the
complexity of the system also increases. In order
to give all the members of the replica group the
same view of the environment, (Fedoruk, Deters,
2002) have proposed the concept of a group proxy:
an agent acting as a proxy through which all the
interactions between the group and the environment
have to pass. When the proxy agent approach is
extended with the primary agent concept, as in
(Taesoon et al, 2002) and (Zhigang, Binxing, 2000),
the primary agent is the only one that performs
computations until it fails. The slaves then elect
a new primary agent from their group, so any slave
agent can become a primary.
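
The checkpointing approach described at the start of
this section can be illustrated with a minimal Python
sketch. It is not taken from any of the cited systems:
the names CheckpointStore and run_with_checkpoints, the
JSON file format and the step counter used as agent
state are all assumptions made for illustration.

```python
import json
import os
import tempfile


class CheckpointStore:
    """Hypothetical file-backed store holding the latest checkpoint per agent."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, agent_id):
        return os.path.join(self.directory, f"{agent_id}.json")

    def save(self, agent_id, state):
        # Write to a temporary file first and then rename it, so that a
        # crash during the write cannot leave a half-written checkpoint.
        tmp_path = self._path(agent_id) + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        os.replace(tmp_path, self._path(agent_id))

    def load_latest(self, agent_id):
        # Returns None when the agent has never been checkpointed.
        try:
            with open(self._path(agent_id), encoding="utf-8") as handle:
                return json.load(handle)
        except FileNotFoundError:
            return None


def run_with_checkpoints(store, agent_id, total_steps, interval, fail_at=None):
    """Count up to total_steps, saving the state every `interval` steps.

    The agent resumes from its latest checkpoint if one exists.
    `fail_at` is an illustrative hook that simulates an agent failure.
    """
    state = store.load_latest(agent_id) or {"step": 0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated agent failure")
        state["step"] += 1
        if state["step"] % interval == 0:
            store.save(agent_id, state)
    return state


def demo():
    """Fail once, then recover from the latest checkpoint and finish."""
    with tempfile.TemporaryDirectory() as directory:
        store = CheckpointStore(directory)
        try:
            run_with_checkpoints(store, "agent-1", total_steps=10,
                                 interval=3, fail_at=7)
        except RuntimeError:
            pass
        recovered = store.load_latest("agent-1")
        final = run_with_checkpoints(store, "agent-1", total_steps=10,
                                     interval=3)
        return recovered, final
```

In this sketch the work done between the last checkpoint
and the failure is lost and has to be repeated. Because
the store lives on the local disk of the host, it also
exhibits the blocking problem: the checkpoint cannot be
read until that host is available again.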
In order to monitor the execution of an agent from
an external entity, (Eustace et al, 1994), (Patel,
Garg, 2004) and (Lyu, Wong, 2004) have proposed the
use of supervisor and executor agents. The
supervisor agents watch the execution of the
problem-solving agents, detect conditions which are
failures or can lead to them, and react to the
detected conditions. Hosts can also be used as
components of a fault-tolerant system (Dake, 2002).
The basic services provided by the hosts can be
extended with services which help the agents
achieve a desired level of reliability. Depending
on its implementation, a fault-tolerant system may
not be able to cope with all kinds of failures. That
is why some systems do not even try to recover from
certain types of failures. In order to determine
the feasibility of recovery, (Grantner et al, 1997)
proposed the use of fuzzy logic.
Turning to the recovery of an agent host, if the
state of the host has not been saved to a
persistent storage medium, we can simply restart
the host. If a host is critical to the functioning
of the whole agent platform, we can replicate it
(Bellifemine et al, 2003). If the agents use a
transaction-based approach which relies on services
provided by the host, rather than by an underlying
application server or database, then the host has
to undo all the uncommitted actions after its
restart (Patel, Garg, 2004).
In order to deliver a message to an agent, we
have to track the agent’s location to determine
where to forward the message. Different solutions
have been proposed, such as registering the agent
locations at a central entity (Moreau, 2002) or
using forwarding pointers (Zhou et al, 2003). Once
we know the exact location of the agent, we have to
deliver the message. Two main delivery principles
have been specified in (Cao et al, 2002). In the
“push” principle, we have to interfere with the
agent’s autonomy and constrain its mobility until
the messages have been delivered to it. In the
“pull” principle, the agent decides when it wants
to receive messages and which messages it wants to
receive. (Cao et al, 2004) have proposed the
mailbox as a separate entity that is also mobile
and moves to be at the same host as its agent or
somewhere close to that host.
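
The forwarding pointers and the “pull” principle
described above can be sketched in Python as follows.
This is an illustrative sketch, not the algorithm of any
cited paper: Host, migrate, locate and Mailbox are
hypothetical names, and the hop limit is an assumption
added to keep the lookup bounded.

```python
class Host:
    """Hypothetical agent host that remembers where departed agents went."""

    def __init__(self, name):
        self.name = name
        self.residents = set()   # ids of agents currently on this host
        self.forwarding = {}     # agent id -> Host the agent moved to


def migrate(agent_id, source, destination):
    """Move an agent and leave a forwarding pointer behind on the source."""
    source.residents.discard(agent_id)
    source.forwarding[agent_id] = destination
    destination.forwarding.pop(agent_id, None)  # drop a stale pointer, if any
    destination.residents.add(agent_id)


def locate(agent_id, start, max_hops=16):
    """Follow forwarding pointers from `start`; return the host or None."""
    host = start
    for _ in range(max_hops):
        if agent_id in host.residents:
            return host
        host = host.forwarding.get(agent_id)
        if host is None:
            return None
    return None


class Mailbox:
    """Pull-style mailbox: messages wait until the agent asks for them."""

    def __init__(self):
        self._pending = []

    def deposit(self, message):
        self._pending.append(message)

    def pull(self, wanted=lambda message: True):
        """Return the pending messages the agent wants and keep the rest."""
        taken, kept = [], []
        for message in self._pending:
            (taken if wanted(message) else kept).append(message)
        self._pending = kept
        return taken
```

A sender that only knows the host where the agent was
created calls locate from that host and follows the
chain of pointers. With the mailbox, the agent is never
interrupted: it calls pull when it chooses and passes a
predicate to select the messages it wants to receive.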
The benefits of the publish/subscribe messaging
model in mobile computing have been presented in
(Padovitz et al, 2003). Their approach concentrates
on context-aware messaging, in which an agent
subscribes to receive only the messages that
satisfy its subscription filter. This leads to a
highly effective notification mechanism for mobile
agents.
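
Filtered subscription can be illustrated with a small
Python sketch. It is a simplified in-memory model, not
the mechanism of (Padovitz et al, 2003): the Broker
class, its method names and the use of a plain list as
the subscriber's inbox are assumptions.

```python
class Broker:
    """Hypothetical in-memory publish/subscribe broker with filters."""

    def __init__(self):
        # topic -> list of (filter predicate, subscriber inbox)
        self._subscriptions = {}

    def subscribe(self, topic, inbox, accepts=lambda message: True):
        """Register an inbox (a list) with an optional subscription filter."""
        self._subscriptions.setdefault(topic, []).append((accepts, inbox))

    def publish(self, topic, message):
        """Deliver the message to every subscriber whose filter accepts it.

        Returns the number of inboxes that received the message.
        """
        delivered = 0
        for accepts, inbox in self._subscriptions.get(topic, []):
            if accepts(message):
                inbox.append(message)
                delivered += 1
        return delivered
```

An agent interested only in messages about its current
location could subscribe with a filter such as
lambda m: m["location"] == "lab", so that messages for
other locations never reach its inbox.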
Another communication problem, inaccessibility
caused by, for example, network fragmentation, can
be solved using doubler agents, presented in
(Pechoucek et al, 2003).
ICEIS 2005 - SOFTWARE AGENTS AND INTERNET COMPUTING