the different facets from traditional database transaction
processing. Before discussing the integrity evaluation,
we first describe the characteristics of the cloud
regarding data integrity.
2.1 Data Integrity in the Cloud
Data integrity is one of the most important issues in
traditional database transaction processing (transaction
processing for short). The ACID principle ensures
isolated access to databases for each transaction,
preventing unintended concurrent updates to the same
database entry that might break data integrity.
This rigorous control of data integrity often
reduces the performance and throughput of transaction
processing because of locking and serialization
overhead. Cloud computing, on the other hand, aims
for high availability and scalability, and the ACID
principle is therefore regarded as conflicting with
its goals.
To adapt transaction processing to cloud computing,
a more relaxed principle for data integrity has been
proposed, often referred to as “BASE”. Following the
BASE principle, transactions in cloud environments
behave differently from traditional ones because of
its distinct integrity preservation mechanism.
Therefore, if we plan to run transactions in cloud
environments, we need to understand their behavioral
characteristics from the data integrity viewpoint. To
understand them rigorously, we should formalize the
concept of data integrity along with its preservation
mechanism. Data integrity is mainly an
application-oriented matter and may be defined
differently across application domains, so there seems
to be no way to give a common definition of it.
Instead, it seems more practical to define a
standardized notation for data integrity rules or
constraints.
One rigorous way to express these constraints is
to use predicate logic formulae
(Shinkawa and Matsumoto, 2001)(Shinkawa, 2012).
Since the logic formulae for data integrity define the
constraints on database values, the domain of
discourse is composed of

$$D = \Bigl(\bigcup_{i} D_i\Bigr) \cup \Bigl(\bigcup_{i,j} r_{ij}\Bigr) \cup \Bigl(\bigcup_{i,j,k} a^{(k)}_{ij}\Bigr)$$

where $D_i$ is the i-th database, $r_{ij}$ is the j-th entry or
record in the database $D_i$, and $a^{(k)}_{ij}$ is the k-th
attribute of $r_{ij}$.
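As a rough illustration of the domain of discourse defined above, the following Python sketch enumerates the three groups of elements for a toy data set. The database names, record layout, and the function `domain_of_discourse` are assumptions made for this example only; they are not taken from the paper.

```python
# Hypothetical toy data: each database D_i is a list of records r_ij,
# and each record is a dict whose keys name the attributes a_ij^(k).
databases = {
    "accounts": [{"id": 1, "balance": 100}, {"id": 2, "balance": 50}],
    "orders": [{"id": 10, "account_id": 1, "amount": 30}],
}


def domain_of_discourse(dbs):
    """Return D as the union of database, record, and attribute identifiers."""
    db_ids = set(dbs)  # D_i, identified by database name
    record_ids = {
        (i, j) for i, records in dbs.items() for j in range(len(records))
    }  # r_ij, identified by (database, record index)
    attribute_ids = {
        (i, j, k)
        for i, records in dbs.items()
        for j, record in enumerate(records)
        for k in record
    }  # a_ij^(k), identified by (database, record index, attribute name)
    return db_ids | record_ids | attribute_ids


D = domain_of_discourse(databases)
```

Here the elements are represented by identifiers rather than by the stored values, which is one possible reading of the definition; quantified variables in a constraint would then range over this set.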
In addition to this domain of discourse, we have
to define the functions, predicates, variables, and con-
stants rigorously. Assuming we have defined all these
elements of the logic, any constraints on the databases
can be represented by a prenex conjunctive normal
form (PCNF) as
$$Q_1 \cdots Q_n \bigvee_{j} \bigwedge_{i} P_{ij}\bigl(t^{(ij)}_1 \cdots t^{(ij)}_{m_{ij}}\bigr)$$

where $Q_i$ is a variable with the quantifier “∀” or “∃”,
e.g. $\forall x_i$ or $\exists x_i$, $P_{ij}$ is a predicate, and
$t^{(ij)}_k$ is a term composed of variables, constants, and
functions (Schoening, 2008).
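To make the notation more concrete, the sketch below checks two quantified constraints over a small in-memory data set. The constraints (non-negative balances, and every order referring to an existing account), the record layout, and the function names are hypothetical examples chosen for illustration; the paper does not prescribe them. A universal quantifier maps to Python's `all` and an existential quantifier to `any`.

```python
# Hypothetical records; attribute names are assumptions for this example.
accounts = [{"id": 1, "balance": 100}, {"id": 2, "balance": 50}]
orders = [{"id": 10, "account_id": 1, "amount": 30}]


def balances_non_negative(accounts):
    """Check: for all a in accounts, balance(a) >= 0."""
    return all(a["balance"] >= 0 for a in accounts)


def orders_reference_accounts(orders, accounts):
    """Check: for all o in orders, there exists a with account_id(o) = id(a)."""
    return all(
        any(o["account_id"] == a["id"] for a in accounts) for o in orders
    )


ok_balances = balances_non_negative(accounts)
ok_references = orders_reference_accounts(orders, accounts)

# An order pointing at a missing account violates the second constraint.
bad_orders = orders + [{"id": 11, "account_id": 99, "amount": 5}]
violated = not orders_reference_accounts(bad_orders, accounts)
```

Both functions are already in prenex form: all quantifiers come first, followed by a quantifier-free condition over terms built from attribute values and constants.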
On the other hand, the mechanism for preserving
data integrity in the cloud is an implementation of the
BASE principle and is equivalent to optimistic
locking (Kung and Robinson, 1981). This mechanism
allows arbitrary concurrent access to any entry in the
databases, and integrity preservation is attempted
only at the commit point, by examining whether the
referenced entries have been modified during the
transaction execution.
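The commit-time check described above can be sketched in a few lines of Python. This is a minimal single-process illustration that assumes a version counter per entry; the classes `VersionedStore` and `Transaction` are hypothetical and do not correspond to any particular cloud platform or to the model built later in the paper.

```python
class VersionedStore:
    """Toy in-memory store; every entry carries a version counter."""

    def __init__(self, data):
        self._data = {key: (value, 0) for key, value in data.items()}

    def read(self, key):
        return self._data[key]  # (value, version)

    def commit(self, read_versions, writes):
        # Validate only at the commit point: fail if any entry the
        # transaction referenced has been modified since it was read.
        for key, seen_version in read_versions.items():
            if self._data[key][1] != seen_version:
                return False
        for key, value in writes.items():
            current = self._data[key][1] if key in self._data else -1
            self._data[key] = (value, current + 1)
        return True


class Transaction:
    """Buffers writes locally and records the version of each entry read."""

    def __init__(self, store):
        self.store = store
        self.read_versions = {}
        self.writes = {}

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        value, version = self.store.read(key)
        self.read_versions.setdefault(key, version)
        return value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        return self.store.commit(self.read_versions, self.writes)


store = VersionedStore({"balance": 100})
t1, t2 = Transaction(store), Transaction(store)
t1.write("balance", t1.read("balance") - 30)  # both read version 0
t2.write("balance", t2.read("balance") - 20)
t1_ok = t1.commit()  # True: nothing changed since t1 read the entry
t2_ok = t2.commit()  # False: t1 modified the entry t2 had read
```

No lock is taken while the transactions run; the conflict is detected only when the second transaction tries to commit, at which point it would typically be retried.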
Concurrent database access makes the behavior of
transactions more complicated than under ACID,
since all the database references and updates may
be interleaved among the transactions. To evaluate
data integrity for such complicated transaction
behavior, a simulation approach is more suitable
than logical analysis.
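A small simulation illustrates why interleaving matters. The sketch below is not the CPN model of this paper; it is a plain Python enumeration, under the assumption of two hypothetical transactions A and B that each read a balance and then write back the balance minus a withdrawal, with no validation at commit. It runs every order-preserving interleaving and counts those in which the final balance violates the expected value.

```python
from itertools import combinations


def interleavings(n_a, n_b):
    """Yield every order-preserving merge of n_a steps of A and n_b of B."""
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        yield ["A" if i in a_slots else "B" for i in range(total)]


def run(schedule, withdrawals, initial=100):
    """Execute one schedule; each transaction reads, then writes."""
    balance = initial
    local = {}                      # value each transaction has read
    step = {"A": 0, "B": 0}         # next step index per transaction
    for t in schedule:
        if step[t] == 0:
            local[t] = balance                      # read step
        else:
            balance = local[t] - withdrawals[t]     # write step
        step[t] += 1
    return balance


withdrawals = {"A": 30, "B": 20}
expected = 100 - withdrawals["A"] - withdrawals["B"]
results = [run(s, withdrawals) for s in interleavings(2, 2)]
violations = sum(1 for r in results if r != expected)
```

Of the six possible schedules, only the two serial ones leave the expected balance of 50; the other four lose one of the updates. With more transactions and more steps the number of schedules grows combinatorially, which is the situation where an executable model becomes useful.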
2.2 CPN Based Integrity Evaluation Model
For effective and efficient simulation, we need to
build an executable model reflecting the behavior
and functionality of all the related transactions, along
with the cloud platform structure including databases.
Therefore, we have to select a modeling tool capable
of expressing three orthogonal aspects of a system
simultaneously, namely the functional, behavioral,
and structural aspects. In addition, the created models
must be executable for simulation. For these
requirements, Colored Petri Net (CPN) is one of the
most suitable modeling tools, since it extends Petri
Net, which can express the behavior and structure of
systems precisely, with a functional viewpoint
(Jensen and Kristensen, 2009) (Jensen et al., 2007).
CPN is formally defined as a nine-tuple
CPN = (P, T, A, Σ, V, C, G, E, I), where
P : a finite set of places.
T : a finite set of transitions.
(a transition represents an event)
A : a finite set of arcs such that P ∩ T = P ∩ A = T ∩ A = ∅.
Σ : a finite set of non-empty color sets.
(a color represents a data type)
ICSOFT-EA 2014 - 9th International Conference on Software Engineering and Applications