Linda tuple space model (Gelernter, 1985), which
enables participants to write data tuples into a space
and retrieve them using a query mechanism based on
template matching. Tuple spaces can be used to
synchronize independent processes via blocking
queries that return their result as soon as a matching
tuple is provided by another process. The XVSM
(eXtensible Virtual Shared Memory) middleware
model (Craß et al., 2009) follows this space-based
computing style through space containers that are
identified by a URI and support configurable
coordination laws for writing and selecting data
entries. Processes that access a container may write,
read, or take (i.e. read and delete) entries, which
generalize the tuple concept, using configurable
coordination mechanisms like key-based access,
FIFO queues, or template matching. Depending on
the coordination mechanism used, queries for read
and take operations include parameters such as the key
of the requested entry or the number of entries to
be returned in FIFO order. If no matching result
exists, the query blocks until it is fulfilled or a given
timeout is reached, which enables decoupled
communication.
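
The blocking query semantics described above can be illustrated with a minimal sketch. It is a toy in-memory space written in Python for brevity, not the XVSM or MozartSpaces API; the class name TupleSpace, the use of None as a template wildcard, and the sample entry are assumptions made for this example only.

```python
import threading
import time


class TupleSpace:
    """Toy in-memory tuple space; None in a template acts as a wildcard."""

    def __init__(self):
        self._entries = []                    # kept in write (FIFO) order
        self._cond = threading.Condition()

    def write(self, entry):
        with self._cond:
            self._entries.append(tuple(entry))
            self._cond.notify_all()           # wake up blocked queries

    @staticmethod
    def _matches(template, entry):
        return len(template) == len(entry) and all(
            t is None or t == e for t, e in zip(template, entry))

    def _query(self, template, timeout, remove):
        deadline = None if timeout is None else time.monotonic() + timeout
        with self._cond:
            while True:
                for i, entry in enumerate(self._entries):
                    if self._matches(template, entry):
                        return self._entries.pop(i) if remove else entry
                remaining = (None if deadline is None
                             else deadline - time.monotonic())
                if remaining is not None and remaining <= 0:
                    raise TimeoutError("no matching entry")
                self._cond.wait(remaining)    # block until a write or timeout

    def read(self, template, timeout=None):
        return self._query(template, timeout, remove=False)

    def take(self, template, timeout=None):
        return self._query(template, timeout, remove=True)


space = TupleSpace()
# The query below blocks; a timer thread supplies the entry about 50 ms later.
threading.Timer(0.05, space.write, args=[("speed", "node-7", 88)]).start()
result = space.take(("speed", None, None), timeout=2.0)
```

The consumer does not need to know which process writes the entry or when; it only states a template and a timeout, which is the decoupling referred to above.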
If many distributed processes interact, a single
space may form a performance bottleneck that
hinders scalability. Replicated spaces would
enable scalable P2P-based solutions, but currently
only a few space-based middleware systems provide
built-in replication mechanisms. Those that do
usually assume a fixed mechanism, although no
single replication mechanism serves all
applications equally well. The trade-off between
consistency, availability and partition tolerance must
be negotiated for each use case. A replication
mechanism should therefore offer different
replication strategies that the user can configure
and thus adapt to a given scenario.
In this paper, we investigate the Java-based
open-source implementation of XVSM, termed
MozartSpaces (available at www.mozartspaces.org),
for which we present a flexible replication
framework. We propose an asynchronous
mechanism that offers multiple replication
approaches and can be configured and adapted for
each scenario. Replication algorithms are realized as
plugins, so that new algorithms can easily be added
and existing ones exchanged.
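
To make the idea of exchangeable replication plugins more concrete, the following sketch shows one possible shape of such a design. It is purely illustrative and does not reproduce the framework presented in Section 3: the names ReplicationStrategy, FullReplication, KCopies and ReplicationManager, as well as the container and node names, are hypothetical.

```python
from abc import ABC, abstractmethod


class ReplicationStrategy(ABC):
    """Hypothetical plugin interface: decide which peers receive a replica."""

    @abstractmethod
    def select_targets(self, peers):
        ...


class FullReplication(ReplicationStrategy):
    def select_targets(self, peers):
        return list(peers)                 # every peer holds a copy


class KCopies(ReplicationStrategy):
    def __init__(self, k):
        self.k = k

    def select_targets(self, peers):
        return list(peers)[:self.k]        # only the first k peers


class ReplicationManager:
    """Keeps a registry of plugins and a per-container configuration."""

    def __init__(self):
        self._plugins = {}
        self._by_container = {}

    def register(self, name, strategy):
        self._plugins[name] = strategy

    def configure(self, container, plugin_name):
        self._by_container[container] = self._plugins[plugin_name]

    def targets(self, container, peers):
        return self._by_container[container].select_targets(peers)


manager = ReplicationManager()
manager.register("full", FullReplication())
manager.register("two-copies", KCopies(2))

peers = ["node-a", "node-b", "node-c"]
manager.configure("traffic-events", "two-copies")
chosen = manager.targets("traffic-events", peers)

# Exchanging the algorithm is a configuration change, not a code change.
manager.configure("traffic-events", "full")
chosen_after_swap = manager.targets("traffic-events", peers)
```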
A motivating use case can be found in the
domain of traffic management for road or rail
networks, where nodes are placed along the track to
collect data from passing vehicles and inform them
about relevant events (like congestion). As nodes
may fail, data must be replicated to prevent data
loss. For scalability reasons, a P2P-based approach
is more feasible than a centralized architecture.
The paper is structured as follows: Section 2 is
dedicated to related work. Section 3 describes the
proposed space-based replication framework. As a
proof of concept, two plugins are provided to control
the replication of containers: i) replication via the
Distributed Hash Table (DHT) implementation
Hazelcast (Hazelcast, 2012) and ii) a native
replication mechanism that is bootstrapped using the
space-based middleware itself. Both plugins perform
asynchronous multi-master replication. Section 4
evaluates the solution and analyzes benchmark
results, while Section 5 provides a conclusion.
2 RELATED WORK
Replication for databases and data-oriented
middleware like tuple spaces may be achieved via
synchronous or asynchronous replica updates.
Synchronous replication as defined by the ROWA
(Read-One-Write-All) approach (Bernstein,
Hadzilacos and Goodman, 1987) forces any update
operation to wait until the update has been
propagated to all replicas. This scales well in a
system that performs many read operations but few
updates. In general, however, asynchronous
replication mechanisms that use lazy update
propagation can increase scalability and performance
considerably (Jiménez-Peris et al., 2003), at the
cost of reduced consistency
guarantees and more complex error handling.
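
The difference between the two update styles can be sketched as follows. This is a simplified single-process illustration under stated assumptions, not an implementation from the cited works: the classes Replica, RowaGroup and LazyGroup are hypothetical, and network communication is replaced by direct method calls.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class RowaGroup:
    """Synchronous: write returns only after all replicas have applied it."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        for replica in self.replicas:      # write-all
            replica.apply(key, value)

    def read(self, key, index=0):
        return self.replicas[index].data.get(key)   # read-one, any replica


class LazyGroup:
    """Asynchronous: apply locally and queue the update for later."""

    def __init__(self, local, remotes):
        self.local = local
        self.remotes = remotes
        self.pending = deque()

    def write(self, key, value):
        self.local.apply(key, value)
        self.pending.append((key, value))  # returns without waiting

    def propagate(self):
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.remotes:
                replica.apply(key, value)


rowa = RowaGroup([Replica(), Replica(), Replica()])
rowa.write("sensor-1", "congestion")
rowa_values = [rowa.read("sensor-1", i) for i in range(3)]

local, remote = Replica(), Replica()
lazy = LazyGroup(local, [remote])
lazy.write("sensor-1", "congestion")
stale = remote.data.get("sensor-1")        # update not yet propagated
lazy.propagate()
fresh = remote.data.get("sensor-1")
```

Between the lazy write and the call to propagate, the remote replica still returns the old state. This window is the reduced consistency guarantee mentioned above, and a failure inside it is one source of the more complex error handling.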
Depending on the requirements of a distributed
application, either strict consistency models based on
ACID (Atomicity, Consistency, Isolation,
Durability) (Haerder and Reuter, 1983) or relaxed
models like BASE (Basically Available, Soft state,
Eventually consistent) (Pritchett, 2008) may be more
suitable for data replication. While ACID
transactions always guarantee consistent replica
states, BASE uses a more fault-tolerant model that
allows temporarily inconsistent states. In this paper,
we present a replication mechanism that supports
both types of consistency models.
Replication schemes define how operations are
performed on specific replicas. For a space-based
approach, the master-slave and multi-master
replication schemes are relevant. For master-slave
replication, several slave nodes are assigned to a
single master node. Read operations can be
performed on any node while updates are restricted
to the master node, which then propagates the
ICSOFT 2013 - 8th International Joint Conference on Software Technologies