Authors:
I. Arrieta-Salinas; J. R. Juárez-Rodríguez; J. E. Armendáriz-Iñigo and J. R. Gonzalez de Mendivil
Affiliation:
Universidad Pública de Navarra, Spain
Keyword(s):
Replication, Crash-Recovery Model, Group Communication Systems, Virtual Synchrony, Distributed File Server, Testing.
Related Ontology Subjects/Areas/Topics:
Distributed and Mobile Software Systems; Distributed Architectures; Process Coordination and Synchronization; Software Engineering
Abstract:
Data replication techniques are widely used to improve availability in software applications. Replicated systems have traditionally assumed the fail-stop model, which limits fault tolerance. For this reason, there is a strong motivation to adopt the crash-recovery model, in which replicas can dynamically leave and join the system. To point out some key issues that must be considered when dealing with replication and recovery, we have implemented a replicated file server that satisfies the crash-recovery model, making use of a Group Communication System. Our experiments yield two main results: the type of replication and the number of replicas must be carefully determined, especially in update-intensive scenarios; and the recovery protocol imposes a variable overhead on the system. Given the latter, it would be convenient to adjust the desired trade-off between recovery time and system throughput in terms of the service state size and the number of missed operations.