compliant with the read and write rules of data items updated by $T_i$. If a transaction
$T_i$ is validated, all corresponding revisions of data items updated by it are also
validated. Also, we cannot validate a particular data revision unless the previous
version of that data item has been validated. The transaction validation procedure runs
in the background while the DBMS is running; it does not have to run in real time.
Although its main purpose is to validate transactions submitted to the DBMS, it is a
low-priority user process and should not significantly affect the performance of the DBMS.
Algorithm
1. Initialize the validated transaction list $L_V = \emptyset$
2. Initialize the malicious transaction list $L_M = \emptyset$
3. For each committed transaction $T_i$
     For each data item $x$ updated in $T_i$
       If $T_i$ is not compliant with any of these read rules
         Add $T_i$ to the malicious transaction list $L_M$, i.e., $L_M = L_M \cup \{T_i\}$
       Else if $T_i$ is not compliant with any of these write rules
         Add $T_i$ to the malicious transaction list $L_M$, i.e., $L_M = L_M \cup \{T_i\}$
       Else
         Mark revision $x_{i,j}$ of data item $x$ as validated
         Add transaction $T_i$ to $L_V$, i.e., $L_V = L_V \cup \{T_i\}$
         Delete the previous revision of data item $x$ from the data revision log
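The validation steps listed above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `Transaction` and `Revision` types, the `read_ok` and `write_ok` predicates (stand-ins for the read and write rules, which are not restated here), and the shape of the revision log are all hypothetical. The sketch further assumes that a transaction is validated only if it violates no rule on any item it updated, and that every revision older than a newly validated one can be pruned from the log.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Revision:
    tid: int                 # id of the transaction that wrote this revision
    validated: bool = False


@dataclass
class Transaction:
    tid: int
    updated_items: List[str]  # names of the data items this transaction updated


# Hypothetical rule check: True when transaction t is compliant for item x.
RuleCheck = Callable[[Transaction, str], bool]


def validate_transactions(
    committed: List[Transaction],
    read_ok: RuleCheck,
    write_ok: RuleCheck,
    revision_log: Dict[str, List[Revision]],
) -> Tuple[Set[int], Set[int]]:
    """Return (L_V, L_M) as sets of transaction ids; prunes revision_log in place."""
    validated: Set[int] = set()  # L_V
    malicious: Set[int] = set()  # L_M
    for t in committed:          # assumed to be given in commit order
        # Assumption: one rule violation on any updated item makes t malicious.
        if any(not read_ok(t, x) or not write_ok(t, x) for x in t.updated_items):
            malicious.add(t.tid)
            continue
        for x in t.updated_items:
            revisions = revision_log[x]  # oldest revision first
            idx = next(i for i, r in enumerate(revisions) if r.tid == t.tid)
            revisions[idx].validated = True
            # Drop the revisions that precede the newly validated one.
            del revisions[:idx]
        validated.add(t.tid)
    return validated, malicious
```

Calling `validate_transactions` with the committed transactions in commit order returns the two lists and leaves, for each data item, the most recently validated revision at the head of its log, followed by any later revisions that have not been validated.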
4.3 Clean Data Identification Procedure
To reduce the denial-of-service impact of malicious transactions, the database damage
assessment procedure should make the clean (i.e., unaffected) data available to
legitimate users as soon as possible. Our proposed data versioning procedure helps
identify the correct version of data to serve future data access requests of
transactions. The process for identifying the correct version of data proceeds as follows.
First, the data items that are updated by unimportant transactions are made available
to users. This is because these unimportant transactions are not affected by the
previous values of the data items. For example, an unimportant transaction may
simply refresh the old value of a data item, irrespective of whether the old value
is correct. Second, the data items that are updated by transactions that tolerate a
margin of error are made available to users. Which value of such a data item is used
to serve a transaction’s request? The lower-risk value of each of these data items is
made available to transactions. Depending on the data semantics, the lower-risk value
of a data item could be either the lower bound or the upper bound of its margin of
error. The rationale is that even when some previous transactions are malicious or
affected and have not yet been validated, using the lower-risk value would have little
or no harmful impact on users. Rather, using the lower-risk value can help constrain
the spread of damaged data. Third, the DBMS serves transactions the latest revision of
data items that are updated by sensitive transactions. The idea is that, instead of
blocking user access to these data items until the clean data identification process
completes, the DBMS simply serves users the version that is guaranteed to have been
correct at some past time, although the latest image of the data item might be
affected. Below we present the formal algorithm for finding