Consistency and Availability of Data in Replicated NoSQL Databases
Tadeusz Pankowski
Institute of Control and Information Engineering, Poznań University of Technology, Poznań, Poland
Keywords:
Replicated Data Repositories, Strong Consistency, Eventual Consistency, Consensus Quorum Algorithms.
Abstract:
Highly distributed NoSQL key-value databases have been designed to meet the needs of various applications
in the field of social networks, Web search and e-commerce. High availability, fault tolerance and scalability
(i.e. Quality of Service, QoS) of such systems is in conflict with strong consistency (Quality of Data,
QoD). Thus, a new functionality of services, balancing QoS and QoD, is needed. This functionality is referred
to as Service Level Agreements (SLA) (Terry et al., 2013). We present the Lorq system as a proposed solution
to this problem. Lorq is a consensus quorum based algorithm for NoSQL data replication. In this paper, we
discuss how different consistency levels are guaranteed by Lorq.
1 INTRODUCTION
Recently, we have observed a rapid development of modern storage systems based on Internet services and
cloud computing technologies (Terry et al., 2013). Examples of this kind of application range from
social networks (Lakshman and Malik, 2010) and Web search (Corbett et al., 2013; Cooper et al., 2008) to
e-commerce systems (DeCandia et al., 2007). Data management in such systems is expected to support
the needs of a broad class of applications with different performance, consistency, fault tolerance, and
scalability requirements (Abadi, 2012). Development of this new class of systems, referred to as NoSQL
systems, differs from conventional relational SQL systems in the following aspects: (a) the data model is
usually based on a NoSQL key-value model; (b) data management focuses on intensive, simple write/read
operations instead of ACID transaction processing; (c) the NoSQL system architecture is a multi-server data
replication architecture, which is necessary to meet requirements concerning performance, scalability and
partition tolerance; (d) various kinds of consistency are offered, from strong consistency to weak
consistency (eventual consistency).
In replicated systems, the following three features influence the design, deployment and usage of the
system: consistency (C), availability (A), and partition tolerance (P). A partition happens when, as a result
of a crash, a part of the network is separated from the rest. According to the CAP theorem (Gilbert and
Lynch, 2012), not all three can be achieved at the same time. Because partition tolerance is a necessary
condition expected by users, the crucial issue is then the trade-off between consistency and
availability/latency.
Applications in such systems are often interested in the possibility of declaring their consistency and
latency priorities (Terry et al., 2013). In general, apart from strong consistency, a user may accept
weaker kinds of consistency, which are referred to as eventual consistency. This is similar to declaring
isolation levels in conventional SQL databases. In some companies, the price that clients pay for accessing
data repositories depends both on the amount of data and on its freshness (consistency). For example, Amazon
charges twice as much for strongly consistent reads as for eventually consistent reads in DynamoDB (Amazon
DynamoDB Pricing, 2014). According to (Terry et al., 2013), applications "should request consistency
and latency that provides suitable service to their customers and for which they will pay". For example,
the Pileus system (Terry et al., 2013) allows applications to declare their consistency and latency priorities.
Novelties of this paper are as follows.
1. We propose and discuss a method and an algorithm for replicating NoSQL data. The algorithm
is called Lorq (LOg Replication based on consensus Quorum). The main features of Lorq
are the following: (a) data replication is realized by replicating logs storing update
operations (treated as single-operation transactions); (b) the processing and replication
strategies guarantee that eventually all operations in
each replica are executed in the same order and
no operation is lost.
2. Special attention is paid to the different kinds of consistency that can be guaranteed by the
system.
3. In comparison to Raft (Ongaro and Ousterhout, 2014), we propose a different way of choosing
a leader in Lorq. In particular, the procedure is based on timestamps and server priorities,
which simplifies the election procedure significantly.
In the design and implementation of the Lorq system, we have used modern software engineering methods and
tools oriented to asynchronous and parallel programming (Asynchronous Programming with Async and
Await, 2014).
The outline of this paper is as follows. The next section reviews problems of NoSQL data replication.
The Lorq algorithm is presented in Section 3. Some methods for achieving strong and eventual
consistency in Lorq are discussed in Section 4. Finally, Section 5 concludes the paper.
2 REPLICATION OF NoSQL DATA
In modern world-wide data stores, a number of goals
should be met:
Scalability. The system must be able to take ad-
vantage of newly added servers (nodes) for repli-
cation and redistribution of data. The aim of
adding new hardware is to support large databases
with very high request rates and very low latency.
Availability. It is guaranteed that each request eventually receives a response. The case when a
response arrives too late is often treated as no response at all, and the system is then
considered unavailable.
Partitioning Tolerance. Due to communication
failures, the servers in the system can be parti-
tioned into multiple groups that cannot commu-
nicate with one another.
Consistency. In conventional databases, a consistent database state is a state satisfying all
consistency constraints. In replicated databases, consistency means the equality between answers to
queries issued against different servers in the system. In the case of strong consistency, all answers
are identical and up-to-date (as with ACID transactions (Bernstein et al., 1987)). In the case of
weak consistency, some answers can be stale; however, the staleness of answers should be under
control, and in the absence of updates, all answers converge to become identical. We then speak of
eventual consistency (Vogels, 2009).
There is a fundamental trade-off between consistency
(Quality of Data), and latency/availability and parti-
tion tolerance (Quality of Service).
2.1 NoSQL Data Model
Modern NoSQL data stores manage various variants of key-value data models (Cattell, 2010), e.g.,
PNUTS (Cooper et al., 2008), Dynamo (DeCandia et al., 2007), Cassandra (Lakshman and Malik, 2010),
and BigTable (Chang et al., 2008).
The main NoSQL data structure is a key-value pair (x, v), where x is a unique identifier and v is a value
(Cattell, 2010). The value component can be either structureless (the value is a sequence of bytes and
an application is responsible for its interpretation) or structured (the value is a set of pairs of the
form A : v, where A is an attribute name and v is a value; in general, values can be nested structures).
Further on in this paper, we will assume that values in data objects are structured.
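As an illustration of this data model, a structured key-value object can be sketched as follows (the Python representation and names are our assumptions, not part of Lorq):

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class DataObject:
    """A structured key-value data object (x, v, t): identifier, value, timestamp."""
    data_id: str                                          # unique identifier x
    value: Dict[str, Any] = field(default_factory=dict)   # structured value {A: a, ...}
    timestamp: int = 0                                     # timestamp of the last update

# Example database DB = {(x, {A: a}, t)}
db = {"x": DataObject("x", {"A": "a"}, timestamp=1)}
```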
2.2 Strategies of Data Replication
Data replication can be realized using:
1. State Propagation. The leader (a master server)
propagates its latest state to workers (slaves, fol-
lowers), which replace their current states with the
propagated data.
2. Operation Propagation. The leader propagates
sequences of operations. It must be guaranteed
that these operations will be applied by all work-
ers to their states in the same order.
In both strategies, the states of all database replicas will be eventually consistent. However, the freshness
of these states differs and depends on the time needed for the propagation. Some serious problems
follow from possible crashes of servers or of the communication between them. For instance, it may happen
that the leader crashes before propagating the data or operations. In such cases, there must be means to
prevent losing this data. From the operational point of view, the replication must take advantage of
asynchronous and parallel processing in order to guarantee the required efficiency.
2.3 Consensus Quorum Algorithms
In a quorum-based data replication, it is required that
an execution of an operation (i.e., a propagation of an
update operation or a read operation) is committed if
ConsistencyandAvailabilityofDatainReplicatedNoSQLDatabases
103
and only if a sufficiently large number of servers ac-
knowledge a successful termination of this operation
(Gifford, 1979).
In a quorum consensus algorithm, it is assumed that: N is the number of servers storing copies of data
(replicas); R is an integer called the read quorum, meaning that at least R copies must be successfully read;
W is an integer called the write quorum, meaning that propagations to at least W servers must have been
successfully terminated. The following relationships must hold between N, R and W:
W > N/2, (1)
R + W > N. (2)
To commit a read operation, a server must collect the read quorum, and to commit a write operation, it must
collect the write quorum. Condition (1) guarantees that the majority of copies is updated, and (2) that
among the read copies at least one is up-to-date.
The aim of consensus algorithms is to allow a col-
lection of servers to process users’ commands (up-
dates and reads) as long as the number of active
servers is not less than max{W,R}. It means that
the system is able to survive failures of some of its
servers.
An algorithm based on quorum consensus works properly if a majority of servers is accessible, and the
most current version of data can always be provided to a client (i.e., it guarantees strong consistency).
Note that for N = 3, R = 2, W = 2 the system tolerates only one failure, and to tolerate at most two failures,
we can assume N = 5, R = 3, W = 3. Let p be the probability of a server failure; then the probability that
at least n servers are available is equal to

P(N, p, n) = \sum_{k=0}^{N-n} \binom{N}{k} p^k (1-p)^{N-k},

in particular, P(5, 0.02, 3) = 0.99992.
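The quorum conditions (1)–(2) and the above availability estimate can be checked with a short script (a sketch; the function names are ours):

```python
from math import comb

def valid_quorums(n: int, r: int, w: int) -> bool:
    """Check the consensus quorum conditions W > N/2 and R + W > N."""
    return w > n / 2 and r + w > n

def availability(n: int, p: float, quorum: int) -> float:
    """Probability that at least `quorum` of n servers are up,
    each server failing independently with probability p."""
    # Sum over the number k of failed servers, k = 0 .. n - quorum.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n - quorum + 1))

print(valid_quorums(5, 3, 3))              # True
print(round(availability(5, 0.02, 3), 5))  # 0.99992
```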
3 LOG REPLICATION BASED ON
CONSENSUS QUORUM
During the last decade, research on consensus algorithms has been dominated by Paxos algorithms (Gafni and
Lamport, 2000; Lamport, 2006). Lately, a variant of Paxos, named Raft (Ongaro and Ousterhout, 2014),
was presented as a consensus algorithm for managing a replicated log. Lorq is based on the ideas underlying
Paxos and Raft, and includes the following steps: (1) leader election; (2) log replication, execution and
commitment of update operations; (3) realization of read operations at different consistency levels
(Pankowski, 2015).
3.1 Architecture
The architecture of Lorq (Figure 1), like that of Raft (Ongaro and Ousterhout, 2014), is organized around
the operations, clients, and servers occurring in the system managing data replication.
Figure 1: Architecture of Lorq system. There are update
and read operations in queues. Update operations are deliv-
ered (1) by clients from queues to one leader (2). A replica-
tion module delivers them to leader’s log and to logs at all
workers (3). Sequences of operations in all logs tend to be
identical and are applied in the same order to databases (4).
States of all databases also tend to be identical (eventually
consistent). A client may read (5) data from any server.
Operations. We distinguish three update operations: set, insert, and delete, and one read operation. In
order to informally define the syntax and semantics of operations, let us assume that there is a NoSQL
database DB = {(x, {A : a}, t)} storing a data object (x, {A : a}, t), where t is the timestamp of the
operation that updated the data most recently. Operations are specified as follows:
set(dataId, dataVal, t) – the value of an existing data object with identifier dataId is set to dataVal,
e.g., set(x, {A : b}, t1) and set(x, {B : b}, t2) against DB change its state to {(x, {A : b}, t1)} and
{(x, {B : b}, t2)}, respectively.
ins(dataId, dataVal, t) – the data object (dataId, dataVal) is inserted, or the value of the existing data
object dataId and its timestamp are modified accordingly, e.g., successive execution of ins(x, {B : b}, t1)
and ins(y, {C : c}, t2) against DB changes its state to DB = {(x, {A : a, B : b}, t1), (y, {C : c}, t2)}.
del(dataId, dataVal, t) – the existing data object (dataId, dataVal) is deleted, or the value of the
existing data object dataId is modified, e.g., successive execution of del(x, {B : b}, t1) and
del(y, {C : c}, t2) against the DB state considered above gives DB = {(x, {A : a}, t1)}.
read(dataId, t) – reads the key-value data object with identifier dataId.
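As a minimal illustration of the update semantics above (a sketch in Python; the helper names and the representation of DB as a dictionary mapping identifiers to (value, timestamp) pairs are ours):

```python
def apply_set(db, x, val, t):
    """set: replace the value of an existing data object."""
    if x in db:
        db[x] = (dict(val), t)

def apply_ins(db, x, val, t):
    """ins: insert a new object, or merge attribute pairs into an existing one."""
    value, _ = db.get(x, ({}, 0))
    db[x] = ({**value, **val}, t)

def apply_del(db, x, val, t):
    """del: remove the given attribute pairs; drop the object if nothing remains."""
    if x not in db:
        return
    value, _ = db[x]
    value = {a: v for a, v in value.items() if a not in val}
    if value:
        db[x] = (value, t)
    else:
        del db[x]

db = {"x": ({"A": "a"}, 0)}
apply_ins(db, "x", {"B": "b"}, 1)
apply_ins(db, "y", {"C": "c"}, 2)  # db == {"x": ({"A": "a", "B": "b"}, 1), "y": ({"C": "c"}, 2)}
apply_del(db, "x", {"B": "b"}, 3)
apply_del(db, "y", {"C": "c"}, 4)  # db == {"x": ({"A": "a"}, 3)}
```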
ENASE2015-10thInternationalConferenceonEvaluationofNovelSoftwareApproachestoSoftwareEngineering
104
Queues. Operations from users are serialized and put in queues. An operation is represented as follows:
any update operation, opType(dataId, dataVal), opType ∈ {set, ins, del}, is represented as a tuple
(opId, clId, stTime, opType, dataId, dataVal);
any read operation, read(dataId), as a tuple (opId, clId, stTime, dataId),
where the operation identifier (opId), client identifier (clId), and start timestamp (stTime) are determined
when the operation is delivered by a client to the leader.
Clients. Each queue is served by one client.
Servers. A server maintains one replica of the database along with the log related to this replica, and runs
the software implementing the Lorq protocols. One server plays the role of the leader and the rest play the
roles of workers. The state of each server is characterized by the server's log and database. The following
variables describe the state of a server:
lastIdx – last log index (the highest position storing an operation),
lastOpTime – the stTime stored at position lastIdx,
lastCommit – the highest log index storing a committed operation,
currentLeader – identifier of the current leader (0 means that no leader is elected),
lastActivity – the latest observed activity time of the leader.
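For concreteness, the queued operations and the per-server state described above can be summarized as small records (a sketch in Python; the field names mirror the variables in the text, while the types are our assumptions):

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

@dataclass
class Operation:
    """An operation as it is placed in a queue and sent to the leader."""
    op_id: int                       # opId, set when the client delivers the operation
    cl_id: int                       # clId, client identifier
    st_time: int                     # stTime, start timestamp
    op_type: str                     # "set" | "ins" | "del" | "read"
    data_id: str                     # dataId
    data_val: Optional[Dict[str, Any]] = None   # dataVal; None for read operations

@dataclass
class ServerState:
    """Per-server state variables listed above."""
    last_idx: int = 0                # lastIdx: highest log position storing an operation
    last_op_time: int = 0            # lastOpTime: stTime stored at position lastIdx
    last_commit: int = 0             # lastCommit: highest log index of a committed operation
    current_leader: int = 0          # currentLeader: 0 means no leader is elected
    last_activity: float = 0.0       # lastActivity: latest observed activity time of the leader
```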
3.2 States and Roles
The Lorq system can be in one of the following three states (Figure 2):
Figure 2: System states, client and server roles in Lorq. A client can ask about the leader, wait for the
result of an election, or send queries (update or read). A server can play an undefined role (unqualified
server), take part in a leader election (as an elector), or act as the leader or a worker.
1. UnknownLeader – no leader is established in the system (when the system starts and for a short time
after a leader failure).
2. LeaderElection – the election of a leader is in progress.
3. KnownLeader – a leader is known in the system.
Actors of the system, i.e., clients and servers, can play specific roles in those states.
3.2.1 Client in Asking Role
Initially, and when the system is in UnknownLeader
state, a client asynchronously sends the request
GetLeader to all servers. In response, each available
server can return:
0 – if a leader cannot be elected because the required quorum cannot be achieved (the system is
unavailable);
currentLeader – the identifier of the leader that is either actually known to the server or has been chosen
in reaction to the GetLeader request.
3.2.2 Client in Waiting Role
After issuing the GetLeader request, the client plays the Waiting role. Next, depending on the reply, it
changes its role to Asking or Querying.
3.2.3 Client in Querying Role
A client reads the next operation from a queue, or reads a waiting one again if necessary, and proceeds
as follows:
determines values for opId, clId, and stTime;
an update operation is sent to the leader using an Update command
Update(opId, clId, stTime, opType, dataId, dataVal); (3)
and the operation is treated as a "waiting" one;
a read operation is handled according to the required consistency type (ConType) (see Section 4), and is
sent using a ReadConType command
ReadConType(opId, clId, stTime, dataId). (4)
If the Update command replies committed, the corresponding operation is removed from the queue. If the
timeout for responding elapses, the client changes its role to Asking (perhaps the leader has failed and the
operation will then be reissued to a new leader).
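A simplified, synchronous sketch of one step of the Querying role (the leader.update interface and its return values are our assumptions; in Lorq this exchange is asynchronous):

```python
def client_querying_step(op, leader, queue):
    """One step of the client's Querying role (simplified sketch).

    `leader.update(op)` stands for the Update command (3); it is assumed to return
    "committed", or None on timeout. Read operations are dispatched according to
    the requested consistency type elsewhere.
    """
    reply = leader.update(op)
    if reply == "committed":
        queue.remove(op)        # the operation is done
        return "Querying"       # keep serving the queue
    return "Asking"             # timeout: re-ask for the (possibly new) leader
```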
ConsistencyandAvailabilityofDatainReplicatedNoSQLDatabases
105
3.2.4 Server in Server Role
A server plays the Server role when the system starts, and when a worker detects that the election timeout
has elapsed without receiving any message from the leader (which means a leader failure). A server in the
Server role starts an election by issuing the command StartElection to all servers. Next, the system goes
to the LeaderElection state, and all servers change their roles to Elector.
3.2.5 Server in Elector Role
The community of servers attempts to choose a leader from among themselves. The leader should be the server
that has the highest value of lastCommit and, in the case of equal lastCommit values, the one with the
highest identifier (or satisfying another priority criterion). The election proceeds as follows:
1. A server collects lastCommit values from all
servers (including itself), and creates a decreas-
ingly ordered list of pairs (lastCommit, srvId).
2. If the list contains answers from a majority of servers, then the srvId from the first pair is chosen
as the leader and its value is assigned to currentLeader.
3. The currentLeader is announced as the leader.
The procedure guarantees that all servers will
choose the same leader.
Next, the system goes to the KnownLeader state, the server chosen as the leader changes its role to Leader,
and the remaining servers to Worker.
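Under these rules, choosing the leader amounts to ordering the collected (lastCommit, srvId) pairs; a sketch, assuming the answers are gathered into a dictionary:

```python
from typing import Dict, Optional

def choose_leader(last_commits: Dict[int, int], n_servers: int) -> Optional[int]:
    """Pick the leader from the answers collected during an election.

    `last_commits` maps srvId -> lastCommit for every server that answered.
    Returns the srvId of the chosen leader, or None if no majority answered.
    """
    if len(last_commits) <= n_servers // 2:   # no majority: election fails
        return None
    # Highest lastCommit wins; ties are broken by the highest server identifier.
    ranked = sorted(last_commits.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
    return ranked[0][0]

# Example: servers 1..5, server 5 did not answer.
print(choose_leader({1: 2, 2: 2, 3: 4, 4: 4}, n_servers=5))   # -> 4
```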
3.2.6 Server in Leader Role
The leader acts as follows:
1. After receiving an Update command (3), the leader asynchronously issues to each server (including
itself) the command appending the operation to the server's log:
Append(opId, clId, stTime, opType, dataId, dataVal, lastIdx, lastOpTime, lastCommit) (5)
The process can return with an exit code indicating:
success – the entry was successfully appended to the server's log,
inconsistency – the server's log is inconsistent with the leader's one. Then a procedure recovering
consistency is carried out. After this, the leader retries sending the Append command. It is guaranteed
that after a finite number of repetitions, the command returns success (unless the worker fails).
If the number of servers responding with success is at least equal to the write quorum (W), then the leader
asynchronously sends to all these servers (including itself) the request to execute the appended operation
and to commit it. Next, without waiting for the responses, it replies committed to the client. If a worker
does not respond to Append (because of a crash, a delay, or the loss of a network packet), the leader
retries Append indefinitely (even after it has responded to the client). Eventually, all workers store the
appended entry or a new election is started.
2. If the leader activity timeout elapses (given by ActivityTimeout, which must be less than
ElectionTimeout), the leader sends
Append(lastIdx, lastOpTime, lastCommit)
to each worker, i.e., an Append command with an empty data-part (a so-called heartbeat), to confirm its
role and to prevent starting a new election. The response to this message is the same as to a regular
Append. In particular, this is important for checking log consistency, especially for restarted workers.
3. When a new leader starts, then:
There can be some uncommitted entries at the top of the leader's log, i.e., after the lastCommit entry
(lastCommit < lastIdx). These entries must be propagated to the workers by the Append command in
increasing order. If a delivered log entry is already present in a worker's log, it is ignored.
Some "waiting" operations in a queue, i.e., operations denoted as already sent to a leader, may not occur
in the leader's log (the reason is that they were sent to a previous leader, and that leader crashed
before reaching the write quorum). These operations must then be sent again by the client from the queue
to the newly elected leader.
After the aforementioned two activities have been completed, the client starts delivering the next update
operation from the queue.
A server plays the Leader role until it fails. After recovery, it plays the role of a Worker.
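A much simplified sketch of the leader's commit decision after an Append round (retries, recovery of inconsistent logs and asynchrony are omitted; the reply encoding is our assumption):

```python
def leader_decision(replies, write_quorum):
    """Decide whether an appended operation can be committed.

    `replies` is the list of per-server outcomes of the Append command
    ("success", "inconsistency", or None for no response); the leader's own
    reply is included.
    """
    successes = sum(1 for r in replies if r == "success")
    if successes >= write_quorum:
        # Ask the acknowledging servers (and itself) to execute and commit,
        # then reply "committed" to the client without waiting for responses.
        return "committed"
    # Keep retrying Append; the client re-asks for the leader on timeout.
    return "pending"

print(leader_decision(["success", "success", "inconsistency", None, "success"], 3))  # committed
```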
3.2.7 Server in Worker Role
Let a worker receive an Append command (possibly with an empty data-part) of the form (5), where the
leader's parameters are denoted as lastIdxL, lastOpTimeL, lastCommitL. Then:
ENASE2015-10thInternationalConferenceonEvaluationofNovelSoftwareApproachestoSoftwareEngineering
106
1. If lastIdx = lastIdxL and lastOpTime =
lastOpTimeL, then: if the data-part is not empty,
then append the new entry; execute and commit
all uncommitted entries at positions less than or
equal to lastCommitL in increasing order; reply
success.
2. If lastIdx < lastIdxL, then reply inconsistency.
The worker expects that the leader will decide to
send all missing entries.
3. If lastIdx = lastIdxL and lastOpTime < lastOpTimeL, then delete the entry at position lastIdx and reply
inconsistency.
4. If lastIdx > lastIdxL then delete entries at posi-
tions greater than or equal to lastIdxL and reply
inconsistency.
A server plays the Worker role until its failure or the failure of the leader – then it returns to the
Server role.
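The four cases above can be summarized as a pure decision function (a sketch; appending, execution and committing themselves are omitted):

```python
def worker_append_check(last_idx, last_op_time, last_idx_l, last_op_time_l):
    """Worker's log-consistency check on receiving Append (cases 1-4 above).

    Returns the reply sent to the leader and the local action to perform.
    """
    if last_idx == last_idx_l and last_op_time == last_op_time_l:
        return "success", "append entry (if any), commit up to lastCommitL"
    if last_idx < last_idx_l:
        return "inconsistency", "wait for the leader to resend missing entries"
    if last_idx == last_idx_l and last_op_time < last_op_time_l:
        return "inconsistency", "delete the entry at lastIdx"
    # last_idx > last_idx_l
    return "inconsistency", "delete entries at positions >= lastIdxL"

print(worker_append_check(4, 10, 5, 12))   # ('inconsistency', 'wait for the leader ...')
```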
3.3 Example
Let us assume that there are five servers in the sys-
tem (N = 5) (Figure 3), and the write quorum is three
(W = 3). By "+" we denote committed entries, and by "?" the waiting ones (i.e., a client waits for
committing the sent operation). Let us consider three steps in
the process of replication presented in Figure 3.
Step 1. Operations a and b are already committed, but c and d are waiting, i.e., their execution in the
system is not completed. S1 is the leader that propagated c and d to S3, and after this failed. We assume
that S5 also failed. There are two uncommitted operations, c and d, at the top of S3's log; these operations
are also denoted as waiting in the queue.
Step 2. In the next election, S3 has been elected the leader. The client receives information about the
new leader and sends the waiting operations c and d to it. Since c and d are already in S3's log, the
sending is ignored. S3 propagates c and d to the available workers S2 and S4. The write quorum is reached,
so they are executed and committed. Next, S3 receives the operation e from a client, and fails.
Step 3. Now, all servers are active and S4 is elected the leader. The client receives information about the
new leader and sends the waiting operation e to S4. Next, the leader propagates e as follows:
Propagation to S1: e is appended at the fifth position. Since the leader's lastCommit is 4, c and d are
executed in S1's database and committed.
Propagation to S2: e is successfully appended at the fifth position.
The leader (S4) recognizes that the write quorum is already reached. Thus, it asynchronously sends to S1, S2
and to itself the command to execute and commit e; replies committed to the client; and continues the
propagation of e to S3 and S5.
Propagation to S3: e is already in S3's log, so the append is ignored. Along with e, the lastCommit equal to
5 is sent to S3. Because of this, e can be executed and committed.
Propagation to S5: there is an inconsistency between the logs in S4 and S5. The lastIdx used by S4 is
decreased (lastIdx := lastIdx − 1) until coherence between both logs is observed. This happens for
lastIdx = 2. Now, all entries at positions 3, 4 and 5 are propagated to S5. Moreover, operations c, d and e
are executed in S5's database and committed.
As a result, all logs will eventually be identical.
4 CONSISTENCY MODELS FOR
REPLICATED DATA
Some database systems provide strong consistency, while others have chosen eventual consistency in order
to obtain better performance and availability. In this section, we discuss how these two paradigms can be
achieved in a data replication system based on the Lorq model.
4.1 Strong Consistency
Strong consistency guarantees that a read operation returns the value that was last written for a given data
object. In other words, a read observes the effects of all previously completed updates (Terry, 2013; Terry
et al., 2013). Since the Lorq system is based on a consensus quorum algorithm, it is guaranteed that a
majority of servers has successfully executed all committed update operations in their databases. To
guarantee strong consistency, the number of servers we read from must be at least equal to the read quorum
(R). Next, from the set of read data values, we choose the value with the latest timestamp.
However, reading from many servers is costly; thus, to improve performance, we can restrict ourselves to
reading from one server only. So, if we need high availability, we can choose so-called eventual
consistency.
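A sketch of such a strongly consistent read: collect at least R answers and return the one with the latest timestamp (the (value, timestamp) reply format is our assumption):

```python
def strong_read(replies, read_quorum):
    """Strongly consistent read from a quorum of replicas.

    `replies` is a list of (value, timestamp) pairs returned by the replicas that
    answered. At least R answers are required; condition (2) then guarantees that
    at least one of them is up to date.
    """
    if len(replies) < read_quorum:
        raise RuntimeError("read quorum not reached; the system is unavailable")
    return max(replies, key=lambda vt: vt[1])   # pair with the latest timestamp

print(strong_read([({"A": "a"}, 3), ({"A": "old"}, 1), ({"A": "a"}, 3)], read_quorum=2))
```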
4.2 Eventual Consistency
Eventual consistency is a theoretical guarantee that,
”if no new updates are made to the object, eventually
ConsistencyandAvailabilityofDatainReplicatedNoSQLDatabases
107
all accesses will return the last updated value" (Vogels, 2009).
Figure 3: Illustration of a scenario of managing data replication in Lorq.
In this kind of consistency, reads return the values of data objects that were previously written, though
not necessarily the latest values. An eventual consistency guarantee can be requested by each read operation
(or by a session), thereby choosing between availability, cost and data freshness. Next, we discuss the
realization of some eventual consistency guarantees in the Lorq system. We consider four classes of eventual
consistency: consistent prefix, bounded staleness, monotonic reads, and read-my-writes (Terry, 2013).
4.2.1 Consistent Prefix
A reader is guaranteed to see a state of data that ex-
isted at some time in the past, but not necessarily the
latest state. This is similar to so-called snapshot isola-
tion offered by database management systems (Terry,
2013).
In Lorq, this consistency is realized by reading
from any server. It is guaranteed that databases in
all servers store past or current states, and that these
states were up-to-date sometime in the past. In gen-
eral, the level of staleness is not known, since the read
may be done from a server which is separated from
others by communication failure.
4.2.2 Bounded Staleness
A reader is guaranteed that read results are not too out-of-date. We distinguish version-based staleness
and time-based staleness (Golab et al., 2014).
In Lorq, the staleness of a read data object x can be determined as follows. We read x from an active
server's (worker's or leader's) database. Let read(x) = (x, v, t_db), where t_db is the timestamp of the
last operation that updated x. Next, we find in the log the set of operations which are intended to update
x and have timestamps greater than t_db. Let O be the set of such operations, and
t_lastop = max{t_o | o ∈ O}, i.e., t_lastop is the largest timestamp of operations in O; if O is empty, we
assume t_lastop = t_db;
k = count(O), i.e., k denotes the cardinality of O.
Then the operation read(x) has the version-based staleness equal to k, and the time-based staleness equal
to Δ = t_lastop − t_db.
In this way, we can assess whether the required freshness is fulfilled. If not, we apply to the auxiliary
database AuxDB = {(x, v, t_db)} as many operations as needed from O, in increasing order, to guarantee the
acceptable staleness. Finally, we perform read(x) against the final state of AuxDB.
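Following these definitions, both staleness measures can be computed from the read timestamp and the log (a sketch; representing log entries as (dataId, timestamp) pairs is our simplification):

```python
def staleness(read_timestamp, log, data_id):
    """Version- and time-based staleness of read(x) = (x, v, t_db).

    `log` is the server's log given as (dataId, timestamp) pairs; O is the set
    of logged operations on x with timestamps greater than t_db.
    """
    pending = [t for (d, t) in log if d == data_id and t > read_timestamp]
    k = len(pending)                                        # version-based staleness
    t_lastop = max(pending) if pending else read_timestamp
    delta = t_lastop - read_timestamp                       # time-based staleness
    return k, delta

# x was read with t_db = 5; the log holds later updates of x at times 7 and 9.
print(staleness(5, [("x", 3), ("x", 7), ("y", 8), ("x", 9)], "x"))   # (2, 4)
```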
4.2.3 Monotonic Read
A reader is guaranteed to see data states that are the same or increasingly up-to-date over time (Terry,
2013). Assume that a client reads a data object x, read(x) = (x, v, t). Then the client updates x with
(x, v', t'), where t' > t. If the user next issues another read of this data object, then the result will
be either (x, v, t) or (x, v', t'), but never a state (x, v'', t''), where t'' < t.
In Lorq, monotonic reads are guaranteed by a so-called session guarantee, meaning that all read operations
of a client are always addressed to the same worker. The session terminates when the worker
ENASE2015-10thInternationalConferenceonEvaluationofNovelSoftwareApproachestoSoftwareEngineering
108
fails. Note that if we read from different workers, we could be unlucky and read from a delayed server or
from a server belonging to a separated partition.
4.2.4 Read-my-writes
It is guaranteed that the effects of all updates performed by the client are visible to the client's
subsequent reads. If a client updates data and then reads this data, then the read will return the result
of this update (or of an update that was done later) (Terry, 2013).
In Lorq, the read-my-writes consistency is guaranteed in a way similar to that of bounded staleness:
let read(x) be a read operation issued by a client with identifier clId, and let read(x) = (x, v, t) in an
arbitrarily chosen server;
let O_clId be the set of operations stored in the server's log and issued by clId;
if O_clId is empty, then read(x) is the result;
otherwise, apply all operations from O_clId, in increasing order, to the auxiliary database
AuxDB = {(x, v, t)}; next, perform read(x) against the final state of AuxDB.
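This procedure can be sketched as follows (the log-entry layout and the apply_op callback are our assumptions):

```python
def read_my_writes(read_result, log, client_id, data_id, apply_op):
    """Read-my-writes consistency (sketch).

    `read_result` is (value, timestamp) for x from an arbitrarily chosen server;
    `log` holds entries (clId, dataId, opType, dataVal, timestamp); `apply_op`
    applies one update operation to an auxiliary single-object database.
    """
    own_ops = [e for e in log if e[0] == client_id and e[1] == data_id]
    if not own_ops:
        return read_result
    aux_db = {data_id: read_result}                    # auxiliary database AuxDB
    for _, x, op_type, data_val, t in sorted(own_ops, key=lambda e: e[4]):
        apply_op(aux_db, x, op_type, data_val, t)      # replay in increasing order
    return aux_db.get(data_id)
```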
5 CONCLUSIONS AND FUTURE
WORK
We proposed a new algorithm, called Lorq, for managing replicated data based on the consensus quorum
approach. Lorq, like other consensus quorum algorithms, is intended for data-centric applications, where
the trade-off between consistency, availability and partition tolerance must be taken into account. The
implementation of Lorq takes advantage of modern software engineering methods and tools oriented to
asynchronous and parallel programming. In future work, we plan to extend the Lorq algorithm to take
advantage of so-called replicated data types (Burckhardt et al., 2014; Shapiro et al., 2011). We also plan
to prepare and conduct some real-system experiments. This research has been supported by the Polish
Ministry of Science and Higher Education under grant 04/45/DSPB/0136.
REFERENCES
Abadi, D. (2012). Consistency tradeoffs in modern dis-
tributed database system design. IEEE Computer,
45(2):37–42.
Amazon DynamoDB Pricing (2014). http://aws.amazon.
com/dynamodb/pricing.
Asynchronous Programming with Async and Await
(2014). http://msdn.microsoft.com/en-us/library/
hh191443.aspx.
Bernstein, P. A., Hadzilacos, V., and Goodman, N. (1987).
Concurrency Control and Recovery in Database Sys-
tems. Addison Wesley Publishing Company.
Burckhardt, S., Gotsman, A., Yang, H., and Zawirski, M.
(2014). Replicated data types: specification, verifi-
cation, optimality. In ACM SIGPLAN-SIGACT Sym-
posium on Principles of Programming Languages,
POPL’14, pages 271–284.
Cattell, R. (2010). Scalable SQL and NoSQL data stores.
SIGMOD Record, 39(4):12–27.
Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., and et
al. (2008). Bigtable: A distributed storage system for
structured data. ACM Trans. Comput. Syst., 26(2).
Cooper, B. F., Ramakrishnan, R., Srivastava, U., Silber-
stein, A., and et al. (2008). PNUTS: Yahoo!’s hosted
data serving platform. PVLDB, 1(2):1277–1288.
Corbett, J. C., Dean, J., Epstein, M., and et al. (2013). Span-
ner: Google’s globally distributed database. ACM
Trans. Comput. Syst., 31(3):8.
DeCandia, G., Hastorun, D., Jampani, M., and et al. (2007).
Dynamo: Amazon’s highly available key-value store.
SIGOPS Oper. Syst. Rev., 41(6):205–220.
Gafni, E. and Lamport, L. (2000). Disk Paxos. In Dis-
tributed Computing, DISC 2000, LNCS 1914, pages
330–344. Springer.
Gifford, D. K. (1979). Weighted Voting for Replicated Data.
In ACM SIGOPS 7th Sym. on Op. Systems Principles,
SOSP’79, pages 150–162.
Gilbert, S. and Lynch, N. A. (2012). Perspectives on the
CAP Theorem. IEEE Computer, 45(2):30–36.
Golab, W., Rahman, M. R., AuYoung, A., Keeton, K., and
Li, X. S. (2014). Eventually consistent: Not what you
were expecting? Commun. ACM, 57(3):38–44.
Lakshman, A. and Malik, P. (2010). Cassandra: A de-
centralized structured storage system. SIGOPS Oper.
Syst. Rev., 44(2):35–40.
Lamport, L. (2006). Fast Paxos. Distributed Computing,
19(2):79–103.
Ongaro, D. and Ousterhout, J. (2014). In Search of an Un-
derstandable Consensus Algorithm. In USENIX An-
nual Technical Conference, pages 305–319.
Pankowski, T. (2015). A consensus quorum algorithm for
replicated NoSQL data. In BDAS 2015, Communi-
cations in Computer and Information Science, pages
1–10. Springer.
Shapiro, M., Preguiça, N. M., Baquero, C., and Zawirski, M. (2011). Conflict-free replicated data types.
Lecture Notes in Computer Science 6976, pages 386–400.
Terry, D. (2013). Replicated data consistency explained
through baseball. Commun. ACM, 56(12):82–89.
Terry, D. B., Prabhakaran, V., Kotla, R., Balakrishnan,
M., and et al. (2013). Consistency-based service
level agreements for cloud storage. In ACM SIGOPS,
SOSP’13, pages 309–324.
Vogels, W. (2009). Eventually consistent. Commun. ACM,
52(1):40–44.
ConsistencyandAvailabilityofDatainReplicatedNoSQLDatabases
109