HOW TO DEAL WITH REPLICATION AND RECOVERY IN A DISTRIBUTED FILE SERVER

I. Arrieta-Salinas, J. R. Juárez-Rodríguez, J. E. Armendáriz-Iñigo, J. R. Gonzalez de Mendivil

Abstract

Data replication techniques are widely used to improve availability in software applications. Replicated systems have traditionally assumed the fail-stop model, which limits fault tolerance. This is a strong motivation for adopting the crash-recovery model, in which replicas can dynamically leave and join the system. To point out some key issues that must be considered when dealing with replication and recovery, we have implemented a replicated file server that satisfies the crash-recovery model, making use of a Group Communication System. Our experiments yield two main results: the type of replication and the number of replicas must be carefully chosen, especially in update-intensive scenarios; and the recovery protocol imposes a variable overhead on the system. Given the latter, it would be convenient to adjust the trade-off between recovery time and system throughput according to the service state size and the number of missed operations.



Paper Citation


in Harvard Style

Arrieta-Salinas I., Juárez-Rodríguez J. R., Armendáriz-Iñigo J. E. and Gonzalez de Mendivil J. R. (2009). HOW TO DEAL WITH REPLICATION AND RECOVERY IN A DISTRIBUTED FILE SERVER. In Proceedings of the 4th International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-989-674-010-8, pages 147-152. DOI: 10.5220/0002258001470152


in Bibtex Style

@conference{icsoft09,
author={I. Arrieta-Salinas and J. R. Juárez-Rodríguez and J. E. Armendáriz-Iñigo and J. R. Gonzalez de Mendivil},
title={HOW TO DEAL WITH REPLICATION AND RECOVERY IN A DISTRIBUTED FILE SERVER},
booktitle={Proceedings of the 4th International Conference on Software and Data Technologies - Volume 2: ICSOFT},
year={2009},
pages={147-152},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002258001470152},
isbn={978-989-674-010-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Conference on Software and Data Technologies - Volume 2: ICSOFT
TI - HOW TO DEAL WITH REPLICATION AND RECOVERY IN A DISTRIBUTED FILE SERVER
SN - 978-989-674-010-8
AU - Arrieta-Salinas, I.
AU - Juárez-Rodríguez, J. R.
AU - Armendáriz-Iñigo, J. E.
AU - Gonzalez de Mendivil, J. R.
PY - 2009
SP - 147
EP - 152
DO - 10.5220/0002258001470152