2 RELATED WORK
In principle, our research is based on the DOM API
(Application Programming Interface), which is
supported by most native XML databases to access and
manipulate XML documents. Although the DOM API
provides several interfaces for manipulating XML
documents, we focus only on elements, attributes,
and texts, since they matter most for data sharing.
Like XML document trees, the Document Object
Model (DOM) represents XML documents as trees,
as shown in Figure 1.
Figure 1: DOM tree representation.
To date, previous research on concurrency control
for DOM API operations falls into two groups. One
is the XML Transaction Coordinator (Haustein, 2004),
which provides a taDOM tree for storing XML
documents and proposes the taDOM protocol to
ensure serializability. The other is Natix, which
proposes the Doc2PL, Node2PL, NO2PL, and OO2PL
protocols (Helmer, 2004) to ensure serializability.
Since OO2PL has been shown to have the best
performance, our research extends OO2PL.
Although OO2PL acquires locks on the pointers of
nodes (i.e., the first child, the last child, the previous
sibling, and the next sibling), it classifies operations
only as observers or mutators. To increase the
degree of concurrency, our protocol distinguishes
operations at a finer granularity.
3 DLP
3.1 Operation Conflicts
The operations in our protocol consist of eight types:
R, N, IB, AP, UP, RN, RM, and RP, standing for
Read, Navigate, Insert-Before, Append, Update,
Rename, Remove, and Replace, respectively. R and N
are both read operations, but R accesses nodes,
whereas N navigates paths.
We define a transaction T as a sequence of
DOM API operations. Operation conflicts may occur
when the operations from different transactions are
interleaved with each other, thereby producing
incorrect results. The criterion of correctness is
based on the serializability of concurrent
transactions. To analyze the conflicts between
operations, we classify operations into content
operations and structural operations. Content
operations (R, UP, and RN) manipulate data values
at nodes. Structural operations (N, IB, AP, RM, and
RP) navigate or modify the structure of a DOM tree,
and therefore involve the pointers within a node.
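The classification above can be written down as a small sketch. The names `Op`, `CONTENT_OPS`, `STRUCTURAL_OPS`, and `kind` are hypothetical and chosen only for illustration; they are not part of the protocol's definition.

```python
from enum import Enum


class Op(Enum):
    """The eight operation types named in the text."""
    R = "Read"
    N = "Navigate"
    IB = "Insert-Before"
    AP = "Append"
    UP = "Update"
    RN = "Rename"
    RM = "Remove"
    RP = "Replace"


# Content operations manipulate data values at nodes.
CONTENT_OPS = frozenset({Op.R, Op.UP, Op.RN})
# Structural operations navigate or modify the tree structure (node pointers).
STRUCTURAL_OPS = frozenset({Op.N, Op.IB, Op.AP, Op.RM, Op.RP})


def kind(op):
    """Return 'content' or 'structural' for an operation (hypothetical helper)."""
    return "content" if op in CONTENT_OPS else "structural"
```

Under this sketch the two sets are disjoint and together cover all eight operations, which matches the classification in the text.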
Whereas R reads the content of a target node (i.e.,
its node name or text value), N reads a pointer of
each node (e.g., the first child or the next sibling)
along the path specified by a transaction. RM (or RP)
is similar to N, but it sets the pointer to the target
node to nil (or to the replacing node). Finally, IB
(or AP) sets the previous sibling pointer (or the last
child pointer) of the target node to the new node.
However, the pointers of related nodes must be
modified together with the relevant pointer of the
target node.
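To make the last point concrete, the following is a minimal sketch of IB and AP on a pointer-based tree. It assumes a hypothetical `Node` class holding the four pointers named in the text plus a `parent` pointer, which is an assumption added here so that IB can reach the parent's first child pointer. It illustrates only which pointers change and includes no locking.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None        # assumption: not among the four pointers in the text
        self.first_child = None
        self.last_child = None
        self.prev_sibling = None
        self.next_sibling = None


def append(target, new):
    """AP: make `new` the last child of `target`."""
    old_last = target.last_child
    new.parent = target
    new.prev_sibling = old_last
    new.next_sibling = None
    if old_last is None:
        target.first_child = new       # related pointer: target had no children
    else:
        old_last.next_sibling = new    # related pointer: former last child
    target.last_child = new            # the pointer AP is defined on


def insert_before(target, new):
    """IB: make `new` the previous sibling of `target`."""
    prev = target.prev_sibling
    new.parent = target.parent
    new.prev_sibling = prev
    new.next_sibling = target
    target.prev_sibling = new          # the pointer IB is defined on
    if prev is not None:
        prev.next_sibling = new        # related pointer: former previous sibling
    elif target.parent is not None:
        target.parent.first_child = new  # related pointer: parent's first child


def children(node):
    """Follow first_child / next_sibling pointers and collect child names."""
    names = []
    child = node.first_child
    while child is not None:
        names.append(child.name)
        child = child.next_sibling
    return names
```

In this sketch each operation touches at least one pointer outside the target node, which is why a protocol that locks pointers has to consider the related nodes as well.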
In principle, the two kinds of operations would not
conflict with each other, since content operations
only manipulate node values, whereas structural
operations only deal with the DOM structure.
However, a transaction executing a content
operation always has to use the structural operation
N to reach the target node. Thus, when these two
kinds of operations work on the same target node,
the structural operations involved would conflict
with each other.
We summarize the operation conflicts in Tables 1
and 2. Within each matrix, the symbols “○” and “×”
denote that the concurrent operations are compatible
and in conflict, respectively. Besides “○” and “×”,
we also use the symbol “△” to denote that the
concurrent operations are in conflict in some
situations.
Table 1: Conflict matrix of content operations.
        R     UP    RN
R       ○     ○     ×
UP      ○     ×     ○
RN      ×     ○     ×
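As an illustration, Table 1 can be encoded as a lookup structure that a lock manager might consult. This is a sketch only: the names `OPS`, `TABLE_1`, and `compatible` are hypothetical, and the entries are transcribed from the table as printed.

```python
# Rows and columns follow Table 1 as printed:
# True = "○" (compatible), False = "×" (conflict).
OPS = ("R", "UP", "RN")
TABLE_1 = (
    (True,  True,  False),  # R
    (True,  False, True),   # UP
    (False, True,  False),  # RN
)


def compatible(held, requested):
    """Return True if the two content operations may run concurrently."""
    return TABLE_1[OPS.index(held)][OPS.index(requested)]
```

For example, `compatible("R", "UP")` looks up the first row and second column of the matrix. The matrix as printed is symmetric, so the order of the two arguments does not matter.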
ICEIS 2008 - International Conference on Enterprise Information Systems