epidemic protocols, this system is able to guarantee
data persistence even in the presence of high levels of
failures.
In DataFlasks, nodes are organized into groups.
Each group is responsible for a subset of the data
and groups do not overlap. A client application can
write key-value objects to DataFlasks by issuing a
put operation and later retrieve them via a get
operation. Objects carry a version and the triple
(key, version, value) is considered unique by the
storage system. However, DataFlasks does not enforce
any kind of data consistency. As a consequence, a
client application is responsible for explicitly
managing data versioning in order to provide consistency.
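As an illustration of what client-managed versioning can look like, the following is a minimal sketch and not the DataFlasks API. The class ToyFlasks, the helper latest_version and the function client_write are hypothetical names introduced here, and the sketch assumes that a get is addressed by both key and version.

```python
from typing import Dict, Optional, Tuple


class ToyFlasks:
    """In-memory stand-in for a versioned key-value store (hypothetical API)."""

    def __init__(self) -> None:
        # Each (key, version) pair maps to exactly one value.
        self._objects: Dict[Tuple[str, int], bytes] = {}

    def put(self, key: str, version: int, value: bytes) -> None:
        # The store enforces no ordering; it only keeps the triple.
        # A second put for an existing (key, version) is ignored.
        self._objects.setdefault((key, version), value)

    def get(self, key: str, version: int) -> Optional[bytes]:
        return self._objects.get((key, version))

    def latest_version(self, key: str) -> int:
        # Helper for this sketch: highest version stored for a key, 0 if none.
        return max((v for (k, v) in self._objects if k == key), default=0)


def client_write(store: ToyFlasks, key: str, value: bytes) -> int:
    """Client-side versioning: choose the next version, then put."""
    version = store.latest_version(key) + 1
    store.put(key, version, value)
    return version
```

With a single client this yields versions 1, 2, 3 and so on for a key. With concurrent clients two writers may choose the same version, which is the gap the ordering mechanisms discussed in the following sections aim to close.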
We leverage the work on DataFlasks in order to
take advantage of its resilience properties. Our pro-
posal is to use DataFlasks as a persistence layer.
3 EpTO - STRONG CONSISTENCY
WITH HIGH PROBABILITY
EpTO (Matos et al., 2015) is a scalable and robust total
order protocol. While the validity, integrity and total
order properties are deterministic, the agreement
property of classic total order is relaxed to be probabilistic
and implemented by means of epidemic dissemination
protocols, known precisely for their scalability
and robustness. This allows EpTO to scale to
thousands of nodes, at least an order of magnitude larger
than previous proposals, which enables building very
large systems with strong (consistency) semantics.
Combining DataFlasks with EPTO allows us to
offer total order on data writes to the store and, as a
consequence, a strong consistency model. The DataFlasks
group construction mechanism and the fact that each
group's dataset is disjoint (Guerraoui and Schiper,
1997) allow us to run the EPTO protocol only on
a restricted subset of the system nodes, which lets the
system scale.
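To illustrate why disjoint groups confine ordering to a subset of nodes, the following is a minimal sketch under stated assumptions. It assumes that keys are mapped to groups by hashing (the mapping is not described here), and it uses a simple per-group log as a stand-in for a total order service. It is not the EPTO protocol, and group_of and GroupOrderer are hypothetical names.

```python
import hashlib
from collections import defaultdict
from typing import DefaultDict, List, Tuple


def group_of(key: str, num_groups: int) -> int:
    """Map a key to exactly one group (hash partitioning is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_groups


class GroupOrderer:
    """Stand-in for a per-group total order service; NOT the EPTO protocol."""

    def __init__(self) -> None:
        # One independent log per group: no cross-group coordination.
        self._logs: DefaultDict[int, List[Tuple[str, bytes]]] = defaultdict(list)

    def order_write(self, group: int, key: str, value: bytes) -> int:
        # The position in the group's log is the write's sequence number.
        log = self._logs[group]
        log.append((key, value))
        return len(log)
```

Because every key belongs to exactly one group, two writes to the same key are always ordered by the same group, and writes handled by different groups never need to be ordered against each other.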
4 RELAXED CONSISTENCY
With DataFlasks and the EPTO protocol we are able
to provide a storage system with strong consistency
with high probability. Moreover, we are able to
achieve this even for a deployment of several
thousand nodes. Naturally, in order to achieve such
a level of consistency, a latency cost must be paid.
In DataFlasks, every node can receive requests.
When a write request is received, in order to guar-
antee strong consistency with high probability, nodes
must follow the EPTO protocol to ensure they assign
the correct version to that write operation. This may
result in increased request latency.
Our proposal is to offer a weaker consistency
model in which there is a small probability of
temporarily considering an incorrect version for write
operations. It works as follows. Let us consider a system
component that gives nodes an estimate of the time it
takes a message to reach all nodes in their DataFlasks
group. Recall that each group is responsible for a
certain subset of the data. This time estimate is
associated with a probability of being correct. When a node
receives a write request, it automatically becomes the
coordinator for that write. It looks at its current state
and assigns the write the version it believes is correct
based only on local knowledge. It then disseminates
the write operation and the version to all the other
nodes in the system. Next, it waits for an amount of
time equal to that given by the estimate. If no write
is received for that object in that time, it stores the
object with the assigned version. All the other nodes,
upon receiving such an object and version, go through
the same procedure. Each time a node receives a
conflicting request, the one that was proposed by the node
with the smaller identifier wins.
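The procedure above can be sketched as follows. This is a minimal single-process simulation with logical time and not an implementation of the system. Several points are assumptions made for illustration: a conflict is taken to mean two proposals for the same key and version, the waiting period restarts when a proposal from a smaller identifier replaces the pending one, and proposals arriving after a version has been stored are ignored. The names Proposal, Node, coordinate, receive and tick are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Proposal:
    key: str
    version: int
    value: bytes
    proposer: int  # identifier of the coordinating node


class Node:
    def __init__(self, node_id: int, wait: float) -> None:
        self.node_id = node_id
        self.wait = wait  # estimated time for a message to reach the group
        self.stored: Dict[Tuple[str, int], bytes] = {}
        # Pending proposal per (key, version), with its commit deadline.
        self.pending: Dict[Tuple[str, int], Tuple[Proposal, float]] = {}

    def next_version(self, key: str) -> int:
        # Local knowledge only: versions this node has stored or has pending.
        slots = list(self.stored) + list(self.pending)
        return max((v for (k, v) in slots if k == key), default=0) + 1

    def coordinate(self, key: str, value: bytes, now: float) -> Proposal:
        """Become coordinator: assign a version locally and start waiting."""
        proposal = Proposal(key, self.next_version(key), value, self.node_id)
        self.receive(proposal, now)
        return proposal  # the caller disseminates this to the other nodes

    def receive(self, proposal: Proposal, now: float) -> None:
        slot = (proposal.key, proposal.version)
        if slot in self.stored:
            return  # assumption: late proposals are outside this sketch
        current = self.pending.get(slot)
        # On conflict, the proposal from the smaller identifier wins.
        if current is None or proposal.proposer < current[0].proposer:
            self.pending[slot] = (proposal, now + self.wait)

    def tick(self, now: float) -> None:
        """Store every pending proposal whose waiting period has elapsed."""
        for slot, (proposal, deadline) in list(self.pending.items()):
            if now >= deadline:
                self.stored[slot] = proposal.value
                del self.pending[slot]
```

For example, if nodes 1 and 2 both coordinate a write for the same key before hearing from each other, both assign version 1. Once each receives the other's proposal, node 2 replaces its pending write with the one from node 1, and both nodes store the value proposed by node 1 after the wait elapses.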
This simple model allows the user to explicitly
tune the desired level of consistency by configuring
the time estimation component. When the time es-
timation component is configured with a probability
of 1 of being correct, the system automatically dis-
cards this algorithm and uses the EPTO protocol. For
every value smaller than 1, the system will relax con-
sistency guarantees and become faster. This way, the
same system architecture is able to provide a stronger
or a weaker consistency model according to the prior-
ity given to consistency and performance.
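One possible way to realize such a tunable time estimation component is sketched below. It assumes, purely for illustration, that the estimate is an empirical quantile over previously observed dissemination times; how the estimate is actually obtained is not specified here. The function wait_time and its return convention (None meaning that the EPTO protocol should be used) are hypothetical.

```python
import math
from typing import Optional, Sequence


def wait_time(samples: Sequence[float], p: float) -> Optional[float]:
    """Return a wait time that covered a fraction p of the observed samples.

    Returns None when p == 1, signalling that the total order protocol
    should be used instead of the timed wait.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not samples:
        raise ValueError("need at least one latency sample")
    if p == 1.0:
        return None
    ordered = sorted(samples)
    # Smallest sample such that at least a fraction p of samples are <= it.
    index = math.ceil(p * len(ordered)) - 1
    return ordered[index]
```

Under this assumption, lowering p shortens the wait and therefore the write latency, at the cost of a higher chance that a conflicting write arrives after the version has already been stored.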
5 CHALLENGES
The weaker consistency model we propose shares
similarities with the unconscious model presented in
(Baldoni et al., 2006). In it, processes are not aware,
i.e. are unconscious, of when consistency has been
reached. Our proposal allows for consciousness in the
sense that processes may know with probability 1 that
a consistent state has been reached, while also
allowing for unconscious operation. We believe exposing
and quantifying these notions to the application is an
interesting research path, in particular its
interplay with the reliability guarantees of the gossip
mutation and the freshness of the membership provided
by DataFlasks group construction protocols. Besides,
the consistency constraints imposed by operations af-