2 MOTIVATION
Observing the growth of web services, we can notice
that some of the data managed by those systems can
be classified as truly critical information, such as money
management transactions or auction transactions that
must complete on time, while other data in the same
system can be classified as uncritical information, such
as the static information featured by a bank or auction
site. Our approach exploits this distinction to maximize
the scalability and availability of the whole system.
Web applications often choose to store their data in
databases providing Snapshot Isolation (SI) (Berenson
et al., 1995). This is due to the non-blocking nature
of read operations executed under that isolation
level, since each transaction reads from a snapshot
containing the updates of transactions committed before
it started. If we extend the notion of SI to a replicated
environment we obtain the Generalized Snapshot Isolation
(GSI) level (Elnikety et al., 2005): the snapshot taken by
a transaction may correspond to any prefix of the history
of committed transactions up to its start; thus, SI is a
particular case of GSI. This is beneficial in replicated
databases, as clients can access their closest replicas
and thereby reduce latency.
Replicated databases run a replication protocol to
manage transactions. It is well known that replication
protocols perform differently depending on the
workload characteristics. For instance, a read-intensive
partition may provide a higher throughput with
a primary-backup scheme (Wiesmann and Schiper,
2005). On the contrary, a partition whose items
are frequently updated might benefit from an
update-everywhere replication solution based on total
order broadcast, such as certification-based replication
(Wiesmann and Schiper, 2005). However, update-everywhere
protocols suffer from a serious scalability
limitation, as the cost of propagating updates in total
order grows rapidly with the number of involved
replicas.
We take the system presented in (Arrieta-Salinas
et al., 2012) as a basis to provide higher scalability
and availability. Hence, data is partitioned (Curino
et al., 2010) and each partition is placed in a set of
replicas, say M, where K of them run a given replication
protocol (either update everywhere or primary
backup) and the rest (M − K) are arranged in a replication
tree whose depth and composition depend on
the application. Several, or all, of the K replicas act as
primaries for other backup replicas (those at the first
level of the tree), which asynchronously receive
updates from their respective primaries. At the same
time, backup replicas may act as pseudo-primaries
for other replicas placed at lower levels of the hierarchy,
thus propagating changes along the tree in an
epidemic way, as sketched below. If we increase the
replication degree of a given partition, we can forward
transactions to the different replicas storing it; transactions
will then be more likely to obtain old, though consistent,
snapshots (GSI), alleviating the traditional scalability
problem at the core.
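To make this layout concrete, the following is a minimal
sketch of the replica hierarchy under the assumptions just
stated; the names (Replica, Partition, propagate) are ours
and purely illustrative, not part of the referenced system.
Each partition keeps its K core replicas, while the remaining
M − K replicas hang from the core as a tree whose nodes
apply updates and forward them to their children.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Replica:
    name: str
    children: List["Replica"] = field(default_factory=list)  # backups fed by this node
    applied_version: int = 0      # last update asynchronously applied here

@dataclass
class Partition:
    core: List[Replica]           # the K replicas running the replication protocol
    # the remaining M - K replicas are reachable through core[i].children

def propagate(node: Replica, version: int) -> None:
    """Epidemic propagation: each node applies an update and forwards it
    to its children (in practice this forwarding is asynchronous)."""
    node.applied_version = version
    for child in node.children:
        propagate(child, version)

# Example: two core replicas, each acting as primary of a small backup subtree.
p = Partition(core=[
    Replica("p1", children=[Replica("b1", children=[Replica("b3"), Replica("b4")])]),
    Replica("p2", children=[Replica("b2")]),
])
propagate(p.core[0], version=42)

In a real deployment the recursive forwarding would of course be
deferred and asynchronous rather than a synchronous call, which is
precisely what lets backups serve older, though consistent, snapshots.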
The novelty of our approach is to take advantage
of this replication hierarchy by placing non-critical
data along the hierarchy tree and handling it in
a similar way to normal data. We run a primary
copy protocol based on the principles of the
COLUP algorithm (Irún-Briz et al., 2003). The
resulting protocol increases performance and reduces
the abort rate of non-critical update transactions
by re-partitioning the data according to its critical
nature (in addition to its original partitioning, based
on graphs or any other approach). Hence, given
a traditional partitioning schema where critical and
non-critical data are placed in the same partition, we
establish that a certain replica handles the updates
of non-critical data while critical data is updated
at another replica. Under this assumption, critical
transactions can be executed faster and are not interleaved
with transactions accessing uncritical data. Those
uncritical transactions may access older data. Meanwhile,
other critical transactions may be scheduled,
increasing the age of the snapshots accessed by
uncritical transactions. However, every uncritical
transaction is characterized by a threshold on the age of the
data it needs to access. As a result, accessing old data
is tolerated by these transactions in the regular case,
and such a situation will not necessarily lead to their
abort.
Compared with traditional GSI, our model tries to
anticipate when a transaction is going to clash (i.e., to
present a write-write conflict) with other transactions;
when a non-critical transaction is involved in such a
conflict, an alternative mechanism is applied. In that
case, a transaction A is aborted in the validation phase
only when at least one of its conflicting transactions is
critical (or uncritical but with an allowance threshold
lower than that of A).
Uncritical transactions are characterized by an
allowance parameter k. The value of k indicates the
number of missed updates tolerated by the uncritical
transaction; it is 1 or greater for uncritical
transactions. Implicitly, critical transactions
are those that access at least one critical item and have
a zero value for k. When conflicts arise between
transactions that access critical data (i.e., critical
transactions) and transactions that only access uncritical data
(i.e., uncritical transactions), no critical transaction
is aborted.
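As an illustration of this validation rule, the following
is a minimal sketch of how the allowance parameter k
could drive the abort decision; the names (Txn, must_abort)
are hypothetical and of our own choosing. A transaction is
aborted only if it has a write-write conflict with a committed
transaction that is critical (k = 0) or that has a smaller
allowance than its own.

from dataclasses import dataclass
from typing import List, Set

@dataclass
class Txn:
    tid: str
    k: int                 # allowance: 0 for critical, >= 1 for uncritical
    write_set: Set[str]

    @property
    def critical(self) -> bool:
        return self.k == 0

def must_abort(a: Txn, committed: List[Txn]) -> bool:
    """Validation-phase check for transaction a against concurrently
    committed transactions, following the rule sketched in the text."""
    for t in committed:
        if not (a.write_set & t.write_set):
            continue                 # no write-write conflict with t
        if t.critical or t.k < a.k:
            return True              # the conflicting transaction takes priority
    return False

# Example: an uncritical transaction (k = 2) conflicting with a critical one.
a = Txn("A", k=2, write_set={"x"})
print(must_abort(a, [Txn("B", k=0, write_set={"x", "y"})]))  # prints True

Note that under this sketch a critical transaction is never aborted
because of a conflict with uncritical ones, matching the priority
given to critical data above.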