monly referred to as primary-backup, only the pri-
mary replica executes the transaction, propagating the
transaction’s write set to other replicas. The primary’s
native database engine concurrency control decides
which transactions to commit or abort, and in which
order. To ensure that replicas remain consistent, these
must know or decide on the same serialization order
as the primary. In a multi-primary setting, where, for
example, different replicas act as primary for different
parts of the data, each transaction still executes at a
single primary. Having several primaries, however,
means that they must agree on a total order for
transaction execution, as a transaction might update
data owned by multiple primaries.
If replicas apply updates according to that total or-
der, strong consistency is guaranteed. Group com-
munication protocols that guarantee message deliv-
ery with appropriate semantics, which are instances
of the abstract consensus problem (Guerraoui and
Schiper, 2001), can be used for that purpose. In
(Wiesmann et al., 2000), the authors compare different
approaches for replication as well as the primitives
needed in each case; a survey of atomic broadcast
algorithms can be found in (Défago et al., 2004).
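As a rough illustration of the passive (primary-backup) scheme described above, the following Python sketch has a primary execute the transaction logic and ship only the resulting write set to its backups, which install it in the primary's commit order. The names `Replica` and `Primary`, the dictionary-based store, and the synchronous loop standing in for the network are assumptions made for illustration; concurrency control and failure handling are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy key-value store standing in for one database replica (hypothetical)."""
    data: dict = field(default_factory=dict)
    applied: list = field(default_factory=list)  # transaction ids, in apply order

    def apply(self, txn_id, write_set):
        # Install the writes without re-executing the transaction logic.
        self.data.update(write_set)
        self.applied.append(txn_id)


class Primary(Replica):
    """The only replica that executes transaction logic."""

    def __init__(self, backups):
        super().__init__()
        self.backups = backups

    def execute(self, txn_id, logic):
        # `logic` reads a snapshot of the state and returns the write set.
        write_set = logic(dict(self.data))
        self.apply(txn_id, write_set)
        # Propagate in commit order so backups see the same serialization order.
        for backup in self.backups:
            backup.apply(txn_id, write_set)
        return write_set
```

Because the backups never run `logic`, a non-deterministic transaction cannot make them diverge from the primary; the price is that the whole write set crosses the network.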
Because the total order property guarantees that
all replicas receive the same set of messages and that
messages are delivered in the same order to all repli-
cas, the transaction order can be established simply
by sending the transaction identifier (along with other
relevant metadata) to the group; if transactions are
queued in the same order in which the respective mes-
sages are delivered, the queues at each replica will
be identical and can be considered as instances of a
replicated queue. Because active replication requires
every replica to execute every transaction, if non-
determinism in transactions is allowed and strongly
consistent replication is a requirement, performance
is limited by the slowest replica in the group. While
passive replication protocols do not suffer from this
limitation, transferring large write sets across the net-
work to several replicas can be costly. Protocols that
combine active and passive replication have been pro-
posed (Correia Jr et al., 2007). There have also
been proposals for mitigating the limitations of state-
machine replication, namely by implementing spec-
ulative execution and state-partitioning (akin to par-
tial replication) (Marandi et al., 2011) and eschew-
ing non-determinism by restricting valid execution to
a single predetermined serial execution (Thomson
and Abadi, 2010). With primary-backup (and multi-
primary) replication, ownership of data partitions must
be guaranteed to be exclusive. This means that when the
primary fails, the database must block until a new
primary is found, usually through a leader election
protocol. This is costly, particularly in churn-prone
environments.
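The replicated-queue argument made earlier in this paragraph can be sketched in a few lines of Python. The sketch assumes a fixed sequencer that stamps each transaction identifier with a global sequence number, which is only one simple way of obtaining a total order and is not claimed to be what the cited protocols use. `Sequencer`, `QueueReplica` and `broadcast` are hypothetical names, and the per-replica shuffle merely simulates messages arriving in different orders.

```python
import random


class Sequencer:
    """Hypothetical fixed sequencer: assigns a global sequence number."""

    def __init__(self):
        self.next_seq = 0

    def stamp(self, txn_id):
        seq = self.next_seq
        self.next_seq += 1
        return seq, txn_id


class QueueReplica:
    """Delivers messages strictly in sequence order, buffering any gaps."""

    def __init__(self):
        self.pending = {}          # seq -> txn_id, received but not yet delivered
        self.next_to_deliver = 0
        self.queue = []            # local instance of the replicated queue

    def receive(self, seq, txn_id):
        self.pending[seq] = txn_id
        # Deliver every message whose predecessors have all been delivered.
        while self.next_to_deliver in self.pending:
            self.queue.append(self.pending.pop(self.next_to_deliver))
            self.next_to_deliver += 1


def broadcast(messages, replicas, rng):
    """Hand every message to every replica, in a different arrival order each."""
    for replica in replicas:
        arrival = list(messages)
        rng.shuffle(arrival)       # simulated network reordering
        for seq, txn_id in arrival:
            replica.receive(seq, txn_id)
```

Whatever the arrival order, every replica ends up with the same queue contents in the same order, which is the property the total order guarantee provides. A real protocol would also have to tolerate message loss and sequencer failure, which this sketch ignores.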
Figure 1: Strategies for database replication.
Having stand-by failover replicas might avoid
most runs of the leader election protocol, but at the
cost of increasing the number of replicas that need
to be updated in each transaction, thereby increas-
ing network utilization and generally increasing the
number of nodes in the system without a correspond-
ing improvement in system throughput. Update-
everywhere protocols avoid this issue because all
replicas are equivalent. Again, replicas must apply
updates according to the defined total order to
guarantee correctness. Database replication protocols
can also be classified in terms of when the client is
notified that the transaction has been committed. In
eager (synchronous) replication protocols, the client
receives a reply only after all replicas have committed
the transaction (using, e.g., two-phase commit (2PC)),
which can be costlier in terms of latency but provides
stronger consistency. Lazy (asynchronous) replication
protocols reply to the client as soon as the transaction
has committed at some replica and propagate updates to
the other replicas later, providing weaker consistency
because replicas may temporarily diverge.
An alternative definition is to consider
whether updates are propagated to other replicas be-
fore the transaction is committed at the primary using
a primitive that guarantees delivery and the appropri-
ate message order properties needed by the protocol.
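The difference between the two notification points can be made concrete with a small Python sketch. It is a toy model under stated assumptions: `Cluster`, `commit_eager`, `commit_lazy` and `propagate` are hypothetical names, the eager path simply commits everywhere in a loop rather than running an actual two-phase commit, and replica 0 plays the role of the replica that commits first.

```python
class Replica:
    def __init__(self):
        self.committed = []

    def commit(self, txn_id):
        self.committed.append(txn_id)


class Cluster:
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.backlog = []  # lazily committed transactions not yet propagated

    def commit_eager(self, txn_id):
        # Reply only after every replica has committed (stand-in for 2PC).
        for replica in self.replicas:
            replica.commit(txn_id)
        return "committed"

    def commit_lazy(self, txn_id):
        # Reply as soon as one replica has committed; propagate later.
        self.replicas[0].commit(txn_id)
        self.backlog.append(txn_id)
        return "committed"

    def propagate(self):
        # Deferred update propagation to the remaining replicas.
        for txn_id in self.backlog:
            for replica in self.replicas[1:]:
                replica.commit(txn_id)
        self.backlog.clear()

    def diverged(self):
        first = self.replicas[0].committed
        return any(r.committed != first for r in self.replicas)
```

After `commit_lazy` returns, `diverged()` stays true until `propagate()` runs, which is the window of temporary divergence mentioned above; after `commit_eager` it is never true, at the cost of waiting for every replica before replying.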
Figure 1 depicts distinct strategies for a replicated
database system, distinguished by the layer at which
replication is implemented:
• SQL-based, at a middleware layer, above the
database engine layer;
• Log shipping, at the database engine layer; and
• Block Device, at the storage layer, below the
database engine layer.
Active replication can be implemented above the
database engine layer by reliably forwarding SQL
statements from clients to all replicas, handling syn-
chronization/recovery at this level, when needed.
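A minimal sketch of SQL-based active replication at a middleware layer might look as follows, using Python's standard `sqlite3` module with in-memory databases as stand-ins for the replicas. `SqlForwarder` is a hypothetical name, the plain loop stands in for reliable, totally ordered delivery, and recovery is not modelled. The sketch also assumes that forwarded statements are deterministic; a statement that calls, for instance, a random or current-time function could leave the replicas in different states.

```python
import sqlite3


class SqlForwarder:
    """Hypothetical middleware that forwards each statement to all replicas."""

    def __init__(self, n_replicas):
        # Each in-memory connection is an independent database engine instance.
        self.replicas = [sqlite3.connect(":memory:") for _ in range(n_replicas)]

    def execute(self, sql, params=()):
        # Every replica executes every statement, in the same order.
        for conn in self.replicas:
            conn.execute(sql, params)
            conn.commit()

    def query_all(self, sql, params=()):
        # Run a read on every replica, e.g. to check that they agree.
        return [conn.execute(sql, params).fetchall() for conn in self.replicas]
```

Since the middleware sees only SQL text, it needs no knowledge of the engine's internals, which is what places this strategy above the database engine layer.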
ADITCA 2019 - Special Session on Appliances for Data-Intensive and Time Critical Applications