topology is peer-to-peer with asynchronous
replication. Each node in a peer-to-peer topology
publishes and subscribes to the same data. All nodes
are equal peers with equal ownership of the data.
The procedures for asynchronous replication are
Dump and Reload, Snapshot Replication, and
Incremental Refresh, where either the changed
tuples are transmitted to the target sites or the
transactions that changed the tuples are
transferred and applied at the target sites
(transaction-based replication). Examples of
state-of-the-art replication facilities in
database systems can be found in (Garmany and
Freeman, 2003; Oracle 9i, 2002; Chigrik, 2000; IBM
DB2 Universal Database, 2002; IBM DB2 Data
Propagator, 2003).
2 CONFLICT CLASSIFICATION
We make the following assumptions: asynchronous
replication in relational databases in a
multi-master (peer-to-peer) scenario, supporting
transaction-based replication with incremental
refresh mode. Among the SQL-operations, we consider
only write operations (INSERT, UPDATE, DELETE),
which change tuples, because they have
a crucial impact on a successful synchronization. We
consider each SQL-operation of a transaction as
being decomposed into a series of so-called
single-operations. A single-operation is an operation
that involves one tuple only. Two operations of
transactions that are executed on different sites are
called conflicting operations if they cause a
replication conflict. This might be the case if they
operate on the same tuple or if they transform
tuples that were previously different into the same
tuple. The conflict types are:
Insert/Insert, Insert/Update, Insert/Delete,
Update/Update, Update/Delete, Delete/Delete. More
detailed information about conflict classification can
be found in (Kühn et al., 2007; Ruhdorfer, 2005).
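The pairing of single-operations into conflict types
can be illustrated with a minimal Python sketch. The
names SingleOp and conflict_type are hypothetical and
do not come from the cited systems. The sketch also
assumes a simplification: two single-operations are
treated as conflict candidates only when they come
from different sites and touch the same primary key,
so updates that turn different tuples into the same
tuple are not modelled.

```python
from dataclasses import dataclass
from typing import Optional

# Fixed order used to normalise labels, e.g. Delete + Update -> "Update/Delete".
KINDS = ("Insert", "Update", "Delete")


@dataclass(frozen=True)
class SingleOp:
    """Hypothetical record for one single-operation (touches one tuple)."""
    site: str  # peer site where the operation was committed
    kind: str  # "Insert", "Update", or "Delete"
    key: int   # primary key of the affected tuple (simplifying assumption)


def conflict_type(a: SingleOp, b: SingleOp) -> Optional[str]:
    """Return a label such as 'Insert/Update', or None if no conflict is possible.

    Assumption: only operations from different sites on the same
    primary key are considered conflict candidates.
    """
    if a.site == b.site or a.key != b.key:
        return None
    first, second = sorted((a.kind, b.kind), key=KINDS.index)
    return f"{first}/{second}"


# All labels that can arise from two write operations on the same key.
ALL_LABELS = {
    conflict_type(SingleOp("A", x, 1), SingleOp("B", y, 1))
    for x in KINDS
    for y in KINDS
}
```

Under these assumptions ALL_LABELS contains exactly
the six conflict types listed above, because the
label does not depend on which site issued which
operation.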
Example: Let us assume that there are two sites,
A and B, and two operations, op1 and op2. Further,
we assume that op1 and op2 are both Insert
operations. Then op1/op2 means that op1
was executed and committed at site A and op2 was
concurrently executed and committed at site B.
Afterwards op1 is replicated to site B and op2 to site
A, and each is executed there, causing a
conflict that can be described as: the tuples are
entirely the same (PRIMARY KEY and all the columns
are the same).
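The Insert/Insert case from this example can be
replayed in a small Python sketch. It uses two
in-memory SQLite databases as stand-ins for sites A
and B. The table name item, its columns, and the
helper names make_site and apply_insert are
assumptions made for illustration; the sketch is not
the replication mechanism of any of the cited
systems.

```python
import sqlite3


def make_site() -> sqlite3.Connection:
    """Create one peer site with an (assumed) replicated table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    return db


def apply_insert(db: sqlite3.Connection, row: tuple) -> bool:
    """Apply one INSERT single-operation; return False if it conflicts."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute("INSERT INTO item (id, name) VALUES (?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        # The primary key already exists at this site.
        return False


site_a, site_b = make_site(), make_site()
op1 = (1, "widget")  # executed and committed at site A
op2 = (1, "widget")  # concurrently executed and committed at site B

local_a = apply_insert(site_a, op1)
local_b = apply_insert(site_b, op2)

# Replication: op1 is replayed at site B and op2 at site A.
replay_on_b = apply_insert(site_b, op1)
replay_on_a = apply_insert(site_a, op2)
```

Both local inserts succeed, while both replayed
inserts are rejected by the primary key constraint,
which is how the conflict surfaces in this sketch.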
3 CONFLICT PREVENTION
Before we discuss conflict detection
and conflict resolution, it is important to know which
techniques can be used to avoid
replication conflicts, since avoiding conflicts is
preferable wherever this is possible. For example,
the database schema can be modified so that
unique numbers for each peer site are added to the
tables to be replicated. However, the tradeoff is
a change to existing systems, which may not be
acceptable. Another option is to forgo full
peer-to-peer replication and to use only one-way
read-only replication. Replication conflicts can
also be prevented by assigning the right to update
the data to a single site in one of the following
ownership types: static site ownership model,
dynamic site ownership model (workflow, token
passing), shared ownership model with some
strategies for avoiding specific types of conflicts
(avoiding uniqueness conflicts, avoiding delete
conflicts, avoiding ordering conflicts) (Oracle 9i,
2002; IBM DB2 Redbook, 2002).
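The schema modification mentioned above, adding a
unique number for each peer site to the replicated
tables, can be sketched as follows. This is one
possible reading, assuming the site number becomes
part of a composite primary key; the class
SiteKeyGenerator and the key layout (site_id,
local sequence number) are hypothetical.

```python
from itertools import count
from typing import Tuple


class SiteKeyGenerator:
    """Hypothetical generator of site-qualified primary keys."""

    def __init__(self, site_id: int) -> None:
        self.site_id = site_id   # unique number assigned to this peer site
        self._seq = count(1)     # local sequence, never shared between sites

    def next_key(self) -> Tuple[int, int]:
        """Return a composite key (site_id, local sequence number)."""
        return (self.site_id, next(self._seq))


gen_a = SiteKeyGenerator(site_id=1)  # site A
gen_b = SiteKeyGenerator(site_id=2)  # site B

keys_a = [gen_a.next_key() for _ in range(3)]
keys_b = [gen_b.next_key() for _ in range(3)]
```

Because every key carries the number of the site
that generated it, concurrent inserts at different
sites cannot collide on the primary key as long as
the site numbers themselves are unique. Other
conflict types, such as Update/Update, are not
affected by this measure.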
4 CONFLICT DETECTION
Conflict detection is the process of detecting
constraint errors and of detecting whether the same
tuple was modified concurrently by application
programs at more than one peer site during the same
replication cycle in a peer-to-peer replication
configuration. Commercial databases
address this issue differently (Garmany and
Freeman, 2003; IBM DB2 Universal Database,
2002). We assume that in the full master replication
scenario, the peers communicate directly using a
shared coordination space (Kühn, 2001), which
provides reliable, asynchronous,
publish/subscribe-based, flexible collaboration on
shared and distributed data structures. Its features
include communication with near-time event
notification and the possibility of reading the same
data multiple times and according to different
coordination criteria.
Therefore we decided to use a space instead of
distributed hash tables, publish/subscribe systems, or
message queues.
Each database (DB) site is called a
peer site. With each DB a gateway process is
associated that interfaces both the DB and the space
and can be located on a different site from the DB. For
each table to be replicated, triggers are installed that
track every write SQL-operation and store its single
operations together with the meta-information
ICSOFT 2007 - International Conference on Software and Data Technologies