provides several advantages over a slightly more
performant fault-intolerant system. Moreover, the
performance degradation introduced by the remote
mechanism is nearly negligible.
In the future, we intend to improve both the local
and remote mechanisms. For the file system
mechanism, we intend to develop a more performant
algorithm that does not rely on copying the previous
log on each update. For the remote mechanism, we
intend to adapt the CN to other requirements in
order to improve performance. This can be done by
allowing priority nodes and removing the symmetry
factor, so that servers with better hardware or
conditions can preferentially become masters.
The CN can also be improved by changing the
underlying communication protocol, which is
currently assumed to be unreliable. We also intend
to develop a master look-up mechanism, such as DNS
registration. At the moment, there is no such
mechanism, and clients must locate masters
manually.
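As an illustration of the priority-node idea, the following minimal Python sketch shows one possible way an election could prefer better-equipped servers. It is not part of DFAF or of the CN as described here: the names Node, priority and elect_master are hypothetical, and the sketch assumes that every node already sees the same membership list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: int        # unique identifier, used only to break ties
    priority: int       # hypothetical weight: higher means better hardware or conditions
    alive: bool = True  # whether the node currently responds


def elect_master(nodes):
    """Return the live node with the highest priority, or None if none is alive.

    Ties are broken by the lowest node_id, so every node that evaluates the
    same membership list reaches the same decision.
    """
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        return None
    return min(candidates, key=lambda n: (-n.priority, n.node_id))


cluster = [
    Node(node_id=1, priority=1),
    Node(node_id=2, priority=5),
    Node(node_id=3, priority=5),
]
master = elect_master(cluster)
print(master.node_id)  # 2
```

Giving every node the same priority reduces this rule to a purely identifier-based choice, which is one way to read the symmetric case that the priority scheme would replace.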
In conclusion, we extended DFAF with a log-based
fault-tolerance model, thereby guaranteeing ACID
properties on the underlying DBMS transactions. We
described two ways of storing the log information,
which trade off performance and reliability, and
the model supports others as well. We also proposed
a master-slave fault-tolerant network that can be
used as a remote server to keep information
replicated and consistent. Both the logging model
and the CN can be used for other applications; for
example, we have adapted the CN to act as a
concurrency handler in another module of DFAF.
ACKNOWLEDGEMENTS
This work is funded by National Funds through FCT
- Fundação para a Ciência e a Tecnologia under the
project UID/EEA/50008/2013.