A Symbolic Model Checker for Database Programs
Angshuman Jana, Md. Imran Alam and Raju Halder
Indian Institute of Technology Patna, India
Keywords:
Model Checking, Database Program, Boolean Program, Verification, Refinement.
Abstract:
Most existing model checking approaches target mainstream languages without considering database
statements. As a result, they are not directly applicable to database applications for verifying their
correctness. On the other hand, a few works in the literature address the verification of database
applications, focusing on atomicity constraints, transaction properties, etc. In this paper, as an
alternative, we propose the design of a symbolic model checker for database programs to verify integrity
properties defined over database attributes. The proposed model checker is built on three key modules:
(i) Abstraction, (ii) Verification, and (iii) Refinement.
1 INTRODUCTION
Model checking is an algorithmic method for proving that a system satisfies its specification
(Clarke and Emerson, 1981; Wang et al., 2006). As stated in (Jhala and Majumdar, 2009), the goal
of model checking research is to expand the scope of automated techniques for program reasoning,
both in the scale of programs handled and in the richness of properties that can be checked.
A model checker receives application source code and exhaustively explores its execution states,
searching for possible violations of the properties of interest. For example, properties may
include simple assertions, which state that a predicate on program variables holds whenever the
computation reaches a particular location; global invariants, which state that certain predicates
hold on every reachable state (e.g. each array access is within bounds); or termination properties.
Based on the state representation, model checkers are of two types: (i) enumerative (in which
individual states are represented) and (ii) symbolic (in which sets of states are represented
using constraints). State space explosion is the critical limitation of the enumerative
techniques, which led researchers to explore symbolic algorithmic approaches. The symbolic model
checking approach manipulates representations of sets of states, rather than individual states,
and performs state exploration through the symbolic transformation of these representations
(Queille and Sifakis, 1982). For example, the constraint $0 \le a \le 9 \wedge 6 \le b \le 10$
represents the set of all states over the integer variables a, b satisfying the constraint, and
thus implicitly represents a list of 50 possible states. Therefore, symbolic model checking
algorithms are more efficient than the enumerative approaches.
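To make the count of 50 states concrete, the following minimal Python sketch contrasts the two representations for the constraint on a and b. The function names and the bounded search window are assumptions made purely for illustration; they are not part of any tool discussed here.

```python
from itertools import product
from typing import List, Tuple


def constraint(a: int, b: int) -> bool:
    """Symbolic view: a single predicate stands for a whole set of states."""
    return 0 <= a <= 9 and 6 <= b <= 10


def enumerate_states(lo: int = -20, hi: int = 20) -> List[Tuple[int, int]]:
    """Enumerative view: list every (a, b) state explicitly.

    The window [lo, hi] is an assumption of this sketch; it only needs
    to be wide enough to contain all states satisfying the constraint.
    """
    return [(a, b)
            for a, b in product(range(lo, hi + 1), repeat=2)
            if constraint(a, b)]


states = enumerate_states()
print(len(states))  # 50 (10 values of a times 5 values of b)
```

The symbolic form stays the same size however large the ranges become, whereas the explicit list grows with the product of the ranges.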
Over the years, several enumerative model checkers, e.g. Verisoft (Chandra et al., 2002), SPIN
(Holzmann, 1997), Cmc (Yang et al., 2006), etc. have been developed. On the other hand, many
tools such as SLAM (Ball and Rajamani, 2002), CBMC (Clarke et al., 2003), F-Soft (Ivancic et al.,
2005), JavaPathFinder (Anand et al., 2007), etc. have been developed based on symbolic
algorithms. Notably, the above tools are built for the verification of various mainstream
languages without considering any database statements. Although intensive research has already
been done, researchers have not paid much attention to database programs embedding queries and
data-manipulation commands. The few works in the literature include symbolic model checking of
stored procedures and SQL queries for automatic validation against their specification (Diana et
al., 2012), the explicit-state model checker DPF for the verification of database atomicity
constraints in web applications (Gligoric and Majumdar, 2013), etc.
In this paper, as an alternative, we extend the existing symbolic model checking algorithm for
imperative languages to the case of database programs. In particular, we aim at verifying the
correctness of database programs with respect to integrity properties defined on the underlying
databases.
To summarize, our contributions in this paper are:
1. Abstraction of database programs into boolean programs.
2. Generation of verification conditions (VCs) from the Control Flow Graph (CFG) of boolean
database programs and their verification using an SMT solver.
3. Counter Example Guided Abstract Refinement (CEGAR) of boolean database programs.
Jana, A., Alam, M. and Halder, R.
A Symbolic Model Checker for Database Programs.
DOI: 10.5220/0006913003470354
In Proceedings of the 13th International Conference on Software Technologies (ICSOFT 2018), pages 347-354
ISBN: 978-989-758-320-9
Copyright © 2018 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
Roadmap. In Section 2, we discuss the current state of the art in the literature. In Section 3,
we describe a motivating example. In Section 4, we recall some preliminaries. We introduce our
approach in Section 5. Section 6 provides the overall tool architecture. Finally, Section 7
concludes the work.
2 RELATED WORKS
Chandra et al. (Chandra et al., 2002) proposed the tool Verisoft, which pioneered the idea of
execution-based stateless model checking of software. It uses a scheduler which is able to
exhaustively explore all possible interleavings of process executions. In (Holzmann, 1997), the
author proposed SPIN, a tool which supports model-based verification of distributed software
systems. It has been used to detect design errors in applications ranging from high-level
descriptions of distributed algorithms to detailed code for controlling telephone exchanges.
Another execution-based model checker, Cmc (Yang et al., 2006), was proposed for C programs. The
Cmc tool explores different executions by controlling schedules at the level of the OS scheduler.
Ball and Rajamani (Ball and Rajamani, 2002) introduced the first Counter Example Guided Abstract
Refinement (CEGAR)-based symbolic model checker, SLAM, for C programs. The SLAM tool works in the
following three steps: (i) abstracting programs into the form of boolean programs, (ii) verifying
properties using a model checking algorithm on the boolean programs, and (iii) refining the
boolean programs based on CEGAR. The model checker Magic (Chaki et al., 2004) was proposed to
enable modular verification of concurrent, message-passing C programs. Ivancic et al. (Ivancic et
al., 2005) proposed the model checker F-Soft, which uses predicate abstraction along with other
abstract domains that efficiently yield the kinds of invariants needed to check standard runtime
errors in C programs. The JavaPathFinder tool (Anand et al., 2007) is a model checker for Java
programs that modifies the Java Virtual Machine to implement a systematic search over different
thread schedules. Several other model checkers, such as CBMC (Clarke et al., 2003), Chess
(Musuvathi and Qadeer, 2007), etc. have also been developed in the literature.
In (Paleari et al., 2008), the authors proposed a dynamic approach to detect race condition
vulnerabilities in web-based applications. The approach analyzes a log file of a single run and
identifies dependencies among SQL queries based on the set of relations and attributes that are
read/written. QED (Martin and Lam, 2008) is a model checker for Java web applications that
systematically explores sequences of requests to a web application and looks for taint-based
vulnerabilities. Artzi et al. (Artzi et al., 2010) proposed an explicit-state model checker for
PHP web applications. The model checker generates test inputs for web applications, monitors the
applications for crashes, and validates that the output conforms to the HTML specification.
Petrov et al. (Petrov et al., 2012) developed a dynamic race detector tool for web applications.
It is mainly based on the formulation of a happens-before relation that captures the asynchronous
behavior of the most commonly used web platform constructs. In (Gligoric and Majumdar, 2013), the
authors proposed an explicit-state model checker, DPF, for verifying atomicity violations in web
applications. The model checker interposes between the program and the database layer and
precisely tracks the effects of queries made to the database. Another symbolic model checker is
proposed in (Diana et al., 2012) for the verification of stored procedures and SQL queries w.r.t.
their specification. The specification is expressed in CTL temporal logic. In (Scully and
Chlipala, 2017), the authors introduced Sqlcache, an automatic compiler optimization for database
result caching. The main aim of Sqlcache is to analyze web applications for compatibility
checking between queries and updates, by instrumenting updates with cache invalidations.
3 MOTIVATING EXAMPLE
Consider the database program depicted in Code Snippet 1. The code implements a module which
performs withdrawal of cash from an ATM system. The function withdraw() updates the balance of
authorized customers after checking two conditions: (i) the withdrawal amount (amt) requested by
the customer should be less than or equal to 10,000 USD for each transaction, and (ii) the
balance should remain greater than or equal to the minimum balance (minbal) of 1000 USD after
each successful transaction. Now consider the following integrity constraint on the database
attribute balance:
The minimum account balance of all customers should be 1000 USD.
Code Snippet 1: A Database Application PR.
0.  int withdraw() {
1.    int acc_no, amt, bal, minbal=1000;
2.    acc_no = read();
3.    amt = read();
4.    Statement con = DriverManager.getConnection("jdbc mysql :.......", "scott", "tiger").createStatement();
5.    bal = con.executeQuery("select balance from Account where Accno=acc_no");
6.    if (amt <= 10000) { // Maximum withdraw amount
7.      if (balance - amt >= minbal) {
8.        con.executeQuery("update Account set balance = balance - amt where Accno = acc_no");
          // ... do some work ...
        }
        else
11.       ERROR_msg;
      }
12.   bal = con.executeQuery("select balance from Account where Accno=acc_no");
13.   display(bal);
      // ... do some work ...
16.   return 0;
    }
Our proposed framework will allow us to verify whet-
her the database program satisfies or violates such in-
tegrity constraints.
4 PRELIMINARIES
In this section, we recall the notions of predicate abstraction, weakest preconditions and
boolean programs.
Predicate Abstraction (Ball et al., 2001). The predicate abstraction algorithm generates a finite
state abstraction from a large or infinite state system. It is an automated abstraction technique
in which the abstract domain is constructed from a given set of predicates over program
variables. It is used to reduce the state space of a program. The basic idea in predicate
abstraction is to remove some variables from the program while keeping only information about a
set of predicates over them. Let $\rho$ be a region. Consider the predicate abstraction domain
which is parameterized by a fixed finite set $\Pi$ of first order formulas. The predicate
abstraction of $\rho$ with respect to $\Pi$ is the smallest region $A(\rho, \Pi)$ which contains
$\rho$ and is representable as a boolean combination of predicates from $\Pi$:
$$A(\rho, \Pi) = \bigwedge \{\psi \mid \psi \text{ is a boolean formula over } \Pi \;\wedge\; \rho \Rightarrow \psi\}$$
The region $A(\rho, \Pi)$ can be computed by recursively splitting as follows:
$$A(\rho, \Pi) = \begin{cases}
true, & \text{if } \Pi = \emptyset \text{ and } \rho \text{ satisfiable} \\
false, & \text{if } \Pi = \emptyset \text{ and } \rho \text{ unsatisfiable} \\
(p \wedge A(\rho \wedge p, \Pi')) \vee (\neg p \wedge A(\rho \wedge \neg p, \Pi')), & \text{if } \Pi = \{p\} \cup \Pi'
\end{cases}$$
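The recursive splitting definition of the predicate abstraction can be sketched in a few lines of Python. This is only an illustration under strong assumptions: regions and predicates are modelled as Python callables over states, and satisfiability is decided by enumerating a small finite domain, which stands in for the decision procedure a real tool would use. All names (`abstract`, `domain`, `covered`) are hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Pred = Callable[[State], bool]


def abstract(region: Pred, preds: List[Pred], domain: List[State]) -> Pred:
    """Predicate abstraction of `region` w.r.t. `preds` by recursive splitting."""
    if not preds:
        # Base case: the result is true iff the region is satisfiable.
        # Enumeration over a finite domain replaces a theorem prover here.
        satisfiable = any(region(s) for s in domain)
        return lambda s: satisfiable
    p, rest = preds[0], preds[1:]
    # Split on p: abstract (region and p) and (region and not p) separately.
    pos = abstract(lambda s: region(s) and p(s), rest, domain)
    neg = abstract(lambda s: region(s) and not p(s), rest, domain)
    return lambda s: (p(s) and pos(s)) or (not p(s) and neg(s))


domain = [{"x": x} for x in range(10)]
region = lambda s: s["x"] == 3                       # concrete region {x = 3}
preds = [lambda s: s["x"] < 5, lambda s: s["x"] == 2]
abstraction = abstract(region, preds, domain)

covered = [s["x"] for s in domain if abstraction(s)]
print(covered)  # [0, 1, 3, 4], i.e. (x < 5) and not (x == 2)
```

The abstraction contains the original region but also the states 0, 1 and 4, which the two predicates cannot tell apart from x = 3.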
Weakest Preconditions (Ball et al., 2001). Given a program statement stmt and a predicate
$\psi$, the weakest precondition of stmt with respect to $\psi$, denoted $WP(stmt, \psi)$, is the
weakest predicate such that executing stmt from any initial state satisfying it terminates in a
final state where $\psi$ is true. For example, given an assignment statement x = e,
$$WP(x = e, \psi) = \psi'$$
where $\psi'$ is obtained from $\psi$ by replacing all occurrences of x with e, i.e.
$\psi' \triangleq \psi[e/x]$.
Example 1. Consider the statement $stmt \triangleq y = y + 1$ and the predicate
$\psi \triangleq y \le 30$. The weakest precondition of stmt with respect to $\psi$ is computed as
$$WP(y = y + 1,\; y \le 30) = ((y + 1) \le 30) = (y \le 29)$$
Observe that, given a statement stmt and a predicate $\psi \in \Pi$, it may be the case that
$WP(stmt, \psi)$ is not in $\Pi$. For instance, consider $\Pi = \{(x < 5), (x == 2)\}$ and
$WP(x = x + 1, x < 5) = (x < 4)$, where the predicate $(x < 4)$ is not in $\Pi$. Therefore, a
decision procedure (i.e. a theorem prover) can be used to strengthen or weaken the resulting
precondition in order to express it over the predicates in $\Pi$ (Ball et al., 2001).
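The substitution rule for assignments can be checked mechanically. The sketch below is a semantic reading of the rule rather than a syntactic rewriter: predicates and expressions are modelled as Python callables over a state dictionary, and `wp_assign` is a hypothetical helper name introduced only for this illustration.

```python
from typing import Callable, Dict

State = Dict[str, int]
Expr = Callable[[State], int]
Pred = Callable[[State], bool]


def wp_assign(var: str, expr: Expr, post: Pred) -> Pred:
    """WP(var = expr, post): post must hold once var carries the value of expr."""
    def pre(state: State) -> bool:
        updated = dict(state)
        updated[var] = expr(state)   # semantic counterpart of post[expr/var]
        return post(updated)
    return pre


# Example 1: WP(y = y + 1, y <= 30) should coincide with y <= 29.
pre_y = wp_assign("y", lambda s: s["y"] + 1, lambda s: s["y"] <= 30)
print(pre_y({"y": 29}), pre_y({"y": 30}))  # True False

# WP(x = x + 1, x < 5) should coincide with x < 4.
pre_x = wp_assign("x", lambda s: s["x"] + 1, lambda s: s["x"] < 5)
```

Comparing `pre_y` with the predicate y <= 29 on a range of integers confirms that the two agree on every tested state.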
Boolean Programs (Ball and Rajamani, 2000). Boolean programs can be thought of as an abstract
representation of a program that explicitly captures correlations between data and control, in
which boolean variables can represent arbitrary predicates over the unbounded state of a program.
The syntax is depicted in Table 1. Boolean variables are either local or global and are
statically scoped, as in C programs; variable declarations need not specify a type. A variable
identifier is either a C-style identifier or an arbitrary string between the characters “{” and
“}”. Two constants ‘0’ (false) and ‘1’ (true) are in the language. Expressions are built from
these constants, variables and logical connectives. A parallel assignment statement allows the
simultaneous assignment of a set of values to a set of variables. The statements ‘if’, ‘while’
and ‘assert’ can affect the control flow of the program. Observe that the predicate of a control
statement is a decider, which can be used to model non-determinism. A decider is either an
expression, which evaluates to ‘0’ or ‘1’ deterministically, or ‘?’, which evaluates to ‘0’ or
‘1’ non-deterministically.
Table 1: Syntax of Boolean Programs (Ball and Rajamani, 2000).
Commands:
B      ::=  V* P*                               A list of global variable declarations followed by a list of procedure definitions
V      ::=  id+ ;                               Declaration of variables
P      ::=  id (id*) begin V* S end             Procedure definition
S      ::=  lstmt+                              Sequence of statements
lstmt  ::=  stmt | id : stmt
stmt   ::=  skip; | id+ = exp+; | if(con) then S else S | while(con) do S
         |  assert(con); | id(exp*); | print(exp+); | return;
con    ::=  ? | exp                             Non-deterministic choice
exp    ::=  exp op exp | !exp | (exp) | id | const
op     ::=  '&' | '|' | '^' | '=' | '!=' | '⇒'  Logical connectives
id     ::=  [a-zA-Z][a-zA-Z0-9]* | {string}
const  ::=  0 | 1                               False/True
5 PROPOSED APPROACH
In this section, we illustrate our proposed model checker, which consists of three key modules:
(i) abstraction of database programs into their equivalent boolean programs, (ii) automatic VC
generation from the Control Flow Graph (CFG) of the boolean programs and its verification using
SMT, and (iii) abstraction refinement by introducing additional predicates, identified from
spurious execution paths in the original database program. Let us explain each of the modules in
the following sub-sections:
5.1 Program Abstraction
This section describes the abstraction of database programs into boolean programs using
predicates.
Given a database program P and a set of predicates $E = \{\psi_1, \psi_2, \psi_3, \ldots, \psi_n\}$,
let BP(P, E) denote the boolean program generated from P, which consists of boolean variables and
has a control flow similar to that of P. In particular, BP(P, E) contains n boolean variables
$B = \{b_1, b_2, b_3, \ldots, b_n\}$ where each boolean variable $b_i$ represents the predicate
$\psi_i$ ($1 \le i \le n$). Note that BP(P, E) is guaranteed to be an abstraction of P in the
sense that the set of execution traces of BP(P, E) is a superset of the set of execution traces
of P. We now describe the abstraction of various programming constructs of P with respect to E
below:
Assignments (Ball et al., 2001)
Consider an assignment statement x = e at program point l in P where e denotes an arithmetic
expression. Given a predicate $\psi_i \in E$, the boolean variable $b_i$ in BP(P, E) can have the
value true after l if $WP(x = e, \psi_i)$ holds before l. Similarly, $b_i$ can have the value
false after l if $WP(x = e, \neg\psi_i)$ holds before l. If neither of these predicates holds
before l, the value of $b_i$ after l is unknown. Accordingly, BP(P, E) contains the following
parallel assignment at l:
$$b_1, \ldots, b_n = T_{ch}(WP(x = e, \psi_1), WP(x = e, \neg\psi_1)), \;\ldots,\; T_{ch}(WP(x = e, \psi_n), WP(x = e, \neg\psi_n))$$
where the function $T_{ch}$ is defined as below:
bool T_ch(bool pos, bool neg) {
  if (pos) { return true; }
  if (neg) { return false; }
  return unknown();
}
The unknown function is defined as follows:
bool unknown() {
  if (*) { return true; }
  else { return false; }
}
Observe that the unknown function uses the control expression “*”, which non-deterministically
evaluates to either true or false.
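A small Python sketch may help to see how the choice function behaves. Since Python has no non-deterministic "*" construct, this sketch models the outcome as the set of values the boolean variable may take; that modelling choice, together with the names `t_ch`, `unknown` and `parallel_assign`, is an assumption of the illustration and not the authors' implementation.

```python
from typing import FrozenSet, List, Tuple

BoolSet = FrozenSet[bool]


def unknown() -> BoolSet:
    """Models the non-deterministic choice: both outcomes are possible."""
    return frozenset({True, False})


def t_ch(pos: bool, neg: bool) -> BoolSet:
    """Possible values of b_i, given whether WP(x=e, psi_i) and WP(x=e, not psi_i) hold."""
    if pos:
        return frozenset({True})
    if neg:
        return frozenset({False})
    return unknown()


def parallel_assign(wps: List[Tuple[bool, bool]]) -> List[BoolSet]:
    """Abstract counterpart of the parallel assignment b_1, ..., b_n = ..."""
    return [t_ch(pos, neg) for pos, neg in wps]


# Three predicates: the first is established, the second is refuted,
# and nothing is known about the third.
result = parallel_assign([(True, False), (False, True), (False, False)])
print(result)
```

Representing the unknown case as a two-element set makes the over-approximation explicit: a later exploration has to follow both values.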
Conditional (Ball et al., 2001)
Consider a conditional statement if (ψ) {stmt} else {stmt} in P. If the predicate ψ evaluates to
true in P, then ψ in the corresponding BP(P, E) should also evaluate to true. Similarly, if ψ
evaluates to false in P, then ψ in the corresponding BP(P, E) should also evaluate to false. In
BP(P, E), the condition is encoded as:
if (*) {
  assume(ψ);
  ...
} else {
  assume(¬ψ);
  ...
}
Observe that, as “*” in the ‘if’ condition non-deterministically evaluates to either true or
false, both execution paths will be explored. The functions assume(ψ) under ‘if’ and assume(¬ψ)
under ‘else’ preserve the semantics of the conditional statement in P.
Database Statements
We now define the abstraction of database statements (SELECT, UPDATE, DELETE and INSERT) into
their boolean form. To do so, let us first recall from (Halder and Cortesi, 2012) the formal
syntax of database programs as below:
$$\begin{array}{rcl}
Q & ::= & \langle A, \phi \rangle \\
A & ::= & select(v_a, f(\vec{e}), r(\vec{h}(\vec{x})), \phi', g(\vec{e})) \mid update(\vec{v}_d, \vec{e}) \mid insert(\vec{v}_d, \vec{e}) \mid delete(\vec{v}_d) \\
\tau & ::= & n \mid v_a \mid v_d \mid f_n(\tau_1, \tau_2, \ldots, \tau_n), \text{ where } f_n \text{ is an } n\text{-ary function} \\
a_f & ::= & R_n(\tau_1, \tau_2, \ldots, \tau_n) \mid \tau_1 = \tau_2, \text{ where } R_n(\tau_1, \tau_2, \ldots, \tau_n) \in \{true, false\} \\
\phi & ::= & a_f \mid \neg\phi_1 \mid \phi_1 \vee \phi_2 \mid \phi_1 \wedge \phi_2 \mid \forall x_i\, \phi \mid \exists x_i\, \phi \\
c & ::= & Q \mid v = e \mid \text{if } cond \text{ then } c_1 \text{ else } c_2 \mid \text{while } cond \text{ do } c
\end{array}$$
where $\phi$ denotes a well-formed first order formula defined over constants ($n$), application
variables ($v_a$) and database attributes ($v_d$). The SQL clauses GROUP BY, ORDER BY,
DISTINCT/ALL and the aggregate functions (SUM, COUNT, MAX, MIN, AVG) are represented in the form
of functions g(), f(), r(), h() respectively, parameterized with either none or one arithmetic
expression e or an ordered sequence of arithmetic expressions $\vec{e}$. The abstract syntax of a
database statement is denoted by $\langle A, \phi \rangle$ where A represents the Action-part and
$\phi$ represents the Condition-part. The Action-part includes SELECT, UPDATE, DELETE and INSERT.
For example, consider the query Q = “update t set sal:=sal+100 where age >40”. According to the
abstract syntax, Q is denoted by $\langle A, \phi \rangle = \langle update(\vec{v}_d, \vec{e}), \phi \rangle$,
where $\vec{v}_d = \langle sal \rangle$, $\vec{e} = \langle sal+100 \rangle$ and $\phi$ = age > 40.
Given a set of predicates E consisting of the integrity constraints under consideration, the
abstraction of database statements into their equivalent boolean form involves the following two
major tasks:
(1) Computation of weakest precondition of database statements w.r.t. $\psi \in E$: We define
below the computation of the weakest precondition for various database statements for a given
postcondition.
$$\begin{array}{rcl}
WP(\langle select(v_a, f(\vec{e}), r(\vec{h}(\vec{x})), \phi', g(\vec{e})), \phi \rangle, \psi) & = & \psi \\
WP(\langle update(\vec{v}_d, \vec{e}), \phi \rangle, \psi) & = & ((\psi \wedge \neg\phi) \vee (\psi[\vec{e}/\vec{v}_d] \wedge \phi)) \\
WP(\langle insert(\vec{v}_d, \vec{e}), false \rangle, \psi) & = & \psi[\vec{e}/\vec{v}_d] \wedge \psi \\
WP(\langle delete(\vec{v}_d), \phi \rangle, \psi) & = & \psi
\end{array}$$
(2) Conversion of database statements into boolean form: Given a database statement Q and a
postcondition $\psi$, suppose $WP(Q, \psi) = \psi'$. Let us now define BP for database statements
to convert them into their equivalent boolean form w.r.t. $\psi'$ below:
$$\begin{array}{rcl}
BP(\langle select(v_a, f(\vec{e}), r(\vec{h}(\vec{x})), \phi', g(\vec{e})), \phi \rangle, \psi') & = & skip \\
BP(\langle update(\vec{v}_d, \vec{e}), \phi \rangle, \psi') & = & \begin{cases} b = \phi \;?\; * : true, & \text{if } \psi' \in E \\ skip, & \text{otherwise} \end{cases} \\
BP(\langle insert(\vec{v}_d, \vec{e}), false \rangle, \psi') & = & \begin{cases} b = *, & \text{if } \psi' \in E \\ skip, & \text{otherwise} \end{cases} \\
BP(\langle delete(\vec{v}_d), \phi \rangle, \psi') & = & skip
\end{array}$$
Since SELECT and DELETE operations do not violate any integrity constraints defined over database
states at row level, the corresponding statements are replaced by ‘skip’. On the other hand, in
case of UPDATE and INSERT, the following two cases may arise: (i) if $\psi'$ already exists in E
and $\phi$ evaluates to true, the update operation takes place and the updated values may violate
the integrity properties; therefore, the corresponding boolean variable b in the boolean program
is assigned ‘*’. Otherwise, the evaluation of $\phi$ to false indicates no update operation
(hence, no property violation) and therefore b is assigned true. (ii) If $\psi'$ does not exist in
E (with or without applying the strengthen or weaken operation), then we replace the given
database statement by “skip”.
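The case analysis for UPDATE and INSERT can be sketched as follows. As Python has no non-deterministic "*", the sketch returns the set of values that b may hold after the abstract statement. The flag `wp_in_E` (whether the computed precondition is expressible over E, possibly after strengthening or weakening) and the function names are hypothetical and serve only this illustration.

```python
from typing import FrozenSet

BoolSet = FrozenSet[bool]
STAR: BoolSet = frozenset({True, False})   # models the "*" assignment


def bp_update(phi_holds: bool, wp_in_E: bool, b_before: bool) -> BoolSet:
    """Abstract effect of an UPDATE: b = phi ? * : true, or skip."""
    if not wp_in_E:
        return frozenset({b_before})       # skip: b keeps its value
    if phi_holds:
        return STAR                        # rows are updated, b may be violated
    return frozenset({True})               # no row updated, property preserved


def bp_insert(wp_in_E: bool, b_before: bool) -> BoolSet:
    """Abstract effect of an INSERT: b = *, or skip."""
    return STAR if wp_in_E else frozenset({b_before})


def bp_select_or_delete(b_before: bool) -> BoolSet:
    """SELECT and DELETE are abstracted to skip."""
    return frozenset({b_before})


print(bp_update(phi_holds=True, wp_in_E=True, b_before=True))
print(bp_update(phi_holds=False, wp_in_E=True, b_before=True))
```

The first call yields both truth values, the second only true, mirroring the two branches of the conditional assignment.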
Let us illustrate this using an example as follows.
Example 2. Let $E = \{sal < 8000\}$ and its corresponding boolean variable be b. We assume that
the initial database is in a consistent state, and therefore, the boolean variable b is set to
true.
Consider the database table Emp which stores employees' information and the following update
statement:
Q = update Emp set sal=sal+100 where age>30
The abstract syntax of Q is represented as
$$Q = \langle update(\langle sal \rangle, \langle sal+100 \rangle), age > 30 \rangle$$
The weakest precondition of Q w.r.t. “sal < 8000” is
$$WP(\langle update(\langle sal \rangle, \langle sal+100 \rangle), age > 30 \rangle, sal < 8000) = ((sal < 8000 \wedge \neg(age > 30)) \vee (sal + 100 < 8000 \wedge age > 30))$$
Assuming age > 30 evaluates to true, we get sal + 100 < 8000, which results in sal < 7900.
Observe that the resulting precondition sal < 7900 does not exist in E. Therefore, after applying
the strengthen operation, b is set to “*”.
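The weakest precondition of the update in Example 2 can be evaluated row by row. The sketch below assumes the integrity predicate reads sal < 8000, as the derived condition sal + 100 < 8000 suggests; rows are modelled as dictionaries and `wp_update` is a hypothetical helper written for this illustration only.

```python
from typing import Callable, Dict

Row = Dict[str, int]
Pred = Callable[[Row], bool]
Expr = Callable[[Row], int]


def wp_update(attr: str, expr: Expr, phi: Pred, psi: Pred) -> Pred:
    """(psi and not phi) or (psi[expr/attr] and phi), evaluated on one row."""
    def pre(row: Row) -> bool:
        if phi(row):
            updated = dict(row)
            updated[attr] = expr(row)      # the row is touched by the update
            return psi(updated)
        return psi(row)                    # the row is left unchanged
    return pre


# Q = update Emp set sal = sal + 100 where age > 30, with psi: sal < 8000
pre = wp_update("sal",
                lambda r: r["sal"] + 100,
                lambda r: r["age"] > 30,
                lambda r: r["sal"] < 8000)

print(pre({"sal": 7899, "age": 40}))  # True:  7999 < 8000 after the update
print(pre({"sal": 7900, "age": 40}))  # False: 8000 < 8000 fails
print(pre({"sal": 7950, "age": 25}))  # True:  row not updated, 7950 < 8000
```

For rows with age > 30 the precondition reduces to sal < 7900, which is the strengthened condition discussed in the example.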
Illustration on Motivational Example. Let us consider the motivating example in Code Snippet 1.
Consider $E = \{bal \ge minbal\}$, taking the integrity constraint under consideration, and its
corresponding boolean variable set B = {b}. The boolean program of PR with respect to E is shown
in Code Snippet 2. Observe that the program statements 1, 2, 3, 4, 5, 11, 12, 13 and 16 in the
boolean program are replaced by ‘skip’ because they do not affect the predicate set E. On the
other hand, the conditions in the statements at program points 6 and 7 are set to “*”. Note that,
due to the absence of the predicates “$amt \le 10000$” and “$balance - amt \ge minbal$” in E,
there is no assume statement included in the boolean code. Program statement 8 denotes the
boolean abstraction of the update statement.
Code Snippet 2: Boolean Program of PR.
0.  int withdraw() {
      bool {bal >= minbal} // b := bal >= minbal
1.    skip;
2.    skip;
3.    skip;
4.    skip;
5.    skip;
6.    if (*) {
7.      if (*) {
8.        b = φ ? * : true // φ = Accno=acc_no
        }
        else {
11.       skip;
        }
      }
12.   skip;
13.   skip;
16.   skip; }
5.2 Verification
In this section, we describe the generation of verification conditions and their satisfiability
checking. After converting the program into a boolean form, we verify whether all possible
execution paths respect the defined properties. To this end, we perform the following steps:
1. Control Flow Graph (CFG) construction of boolean programs.
2. VC generation from CFGs of boolean programs.
3. Satisfiability checking of the VCs using SMT.
The Control Flow Graph $G_{BL}$ of a boolean program BL is constructed by following an approach
similar to the one proposed in (Ball and Rajamani, 2000). After constructing $G_{BL}$, we list
all possible execution paths. We convert each path $p_i$ into a verification formula $f_i$ by
conjoining all boolean statements encountered along that path. The generated $f_i$ is supplied to
an SMT solver for satisfiability checking. If the formula $f_i$ is satisfied, then the
corresponding path $p_i$ satisfies the property and the algorithm terminates. Otherwise, we check
whether the corresponding path $p_i$ is a feasible path in the original program. If yes, we
conclude a violation of the property and an error trace is generated. Otherwise, a refinement of
the boolean program is performed, which we describe in the next section.
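The path-listing step can be sketched with a short depth-first traversal. The adjacency list below is an assumption: it is read off the boolean abstraction of the withdraw() function (program points 0 to 16, with the two non-deterministic branches at points 6 and 7), and the traversal only handles acyclic graphs, so loops would need unrolling or a different treatment.

```python
from typing import Dict, List

# Hypothetical CFG of the abstracted withdraw() program:
# point 6 branches to 7 (true) or 12 (false); point 7 to 8 (true) or 11 (false).
CFG: Dict[int, List[int]] = {
    0: [1], 1: [2], 2: [3], 3: [4], 4: [5], 5: [6],
    6: [7, 12], 7: [8, 11], 8: [12], 11: [12],
    12: [13], 13: [16], 16: [],
}


def all_paths(cfg: Dict[int, List[int]], node: int, exit_node: int) -> List[List[int]]:
    """Enumerate every path from `node` to `exit_node` in an acyclic CFG."""
    if node == exit_node:
        return [[node]]
    return [[node] + rest
            for succ in cfg[node]
            for rest in all_paths(cfg, succ, exit_node)]


paths = all_paths(CFG, 0, 16)
for p in paths:
    print(" -> ".join(map(str, p)))
print(len(paths))  # 3
```

Each listed path would then be turned into one verification formula by conjoining the boolean statements met along it.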
Illustration on Motivational Example. Consider the boolean program in Code Snippet 2. Its CFG is
depicted in Figure 1. Consider the CFG path
$p \triangleq 1 \to 2 \to 3 \to 4 \to 5 \to 6 \to 7 \to 8 \to 12 \to 13 \to 16$ (denoted by blue
color). The generated VC along this path is b = φ ? * : true. This is encoded as
$f \triangleq ((b \wedge \phi) \vee (\neg b \wedge \phi) \vee (b \wedge \neg\phi)) \Rightarrow (b == true)$.
We pass the negation of this f (i.e. $\neg f$) to the SMT solver, which reports “satisfied”. This
shows that there exists some solution for which the predicate b evaluates to false. Therefore,
the model checker indicates that the program PR may violate the integrity constraint. However,
observe that this path is not a feasible path in PR, because b, which corresponds to
$bal \ge minbal$, is false. This indicates that the abstraction is not precise enough and there
is a need for refinement.
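The satisfiability check on the negated formula can be reproduced by brute force over the two boolean unknowns. This sketch makes two assumptions: the verification condition is read as an implication from the relation induced by b = φ ? * : true to the property b == true, and exhaustive enumeration stands in for the SMT solver. The function names are illustrative only.

```python
from itertools import product
from typing import List, Tuple


def transition(b: bool, phi: bool) -> bool:
    """Relation induced by b = phi ? * : true (b is free if phi, else true)."""
    return (b and phi) or ((not b) and phi) or (b and not phi)


def f(b: bool, phi: bool) -> bool:
    """Assumed encoding of the VC: transition(b, phi) implies b == true."""
    return (not transition(b, phi)) or b


def models_of_negation() -> List[Tuple[bool, bool]]:
    """All assignments (b, phi) that satisfy the negation of f."""
    return [(b, phi)
            for b, phi in product([False, True], repeat=2)
            if not f(b, phi)]


print(models_of_negation())  # [(False, True)]
```

The single model, with φ true and b false, corresponds to the case where the update takes place and the abstraction cannot rule out a violation.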
5.3 Counter Example Guided Abstract Refinement
The abstraction is refined using a method called counterexample-guided abstraction refinement
(CEGAR) (Wang et al., 2006). If the boolean program contains an error path $p_i$ and this path is
not feasible in the original program, then the refinement of the boolean program is initiated to
eliminate such false error paths. More specifically, we determine suitable predicates by
analyzing the infeasibility, in the original database program, of paths that appear feasible in
the corresponding boolean program. We then refine the abstraction by adding these new predicates
to E. Let us illustrate below the application of CEGAR on the motivating example:
Illustration on Motivational Example. Consider the boolean program in Code Snippet 2. As we
already observed in Section 5.2, this abstraction is not precise enough. In order to refine it,
we discover the predicate ($amt \le 10000$) from PR and add it to E. This results in
$E = \{bal \ge minbal, amt \le 10000\}$ and $B = \{b, b_1\}$, where $b$ and $b_1$ correspond to
$bal \ge minbal$ and $amt \le 10000$ respectively. Given the refined boolean program w.r.t. these
new E and B, observe that after verifying, by following the same procedure as before, we get the
error path $p_1 \triangleq 1 \to 2 \to 3 \to 4 \to 5 \to 6 \to 7 \to 8 \to 12 \to 13 \to 16$,
which is also not a feasible path in PR because the predicate ($amt \le 10000$) is always true
along the path. By initiating the refinement process once again, we discover a new predicate
($balance - amt \ge minbal$). Now considering the new
$E = \{bal \ge minbal, amt \le 10000, balance - amt \ge minbal\}$ and $B = \{b, b_1, b_2\}$, we
get the refined boolean program depicted in Code Snippet 3. We now generate the VC from the CFG
of Code Snippet 3, which is encoded as
$$f_2 \triangleq \big((b_1 == true) \wedge (b_1 == true \Rightarrow b == true) \wedge (b_2 == true \wedge b_1 == true \Rightarrow b == true) \wedge ((b \wedge \phi) \vee (\neg b \wedge \phi) \vee (b \wedge \neg\phi) \Rightarrow b == true)\big) \Rightarrow b == true$$
The SMT solver reports unsatisfied when $\neg f_2$ is passed as input. Therefore, the model
checker indicates that PR is safe w.r.t. the integrity constraint $bal \ge minbal$.
[Figure 1: CFG of the Boolean Program in Code Snippet 2. Nodes 0 to 8, 11, 12, 13 and 16, with edges labelled T/F.]
Code Snippet 3: Refined Boolean Program of PR.
0.  int withdraw() {
      bool {bal >= minbal, amt <= 10000, balance - amt >= minbal}
      // b := bal >= minbal, b1 := amt <= 10000, b2 := balance - amt >= minbal
1.    skip;
2.    skip;
3.    skip;
4.    skip;
5.    skip;
6.    if (*) {
        assume(amt <= 10000) // b1 = true
7.      if (*) {
          assume(balance - amt >= minbal) // b2 = true
8.        b = φ ? * : true; // φ = Accno = acc_no
        }
        else {
          assume(¬(balance - amt >= minbal))
11.       skip;
        } }
12.   skip;
13.   skip;
16.   skip; }
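The overall refinement loop described in this section can be summarized by a small driver. This is a generic sketch of a CEGAR loop under stated assumptions, not the authors' implementation: the callables `verify`, `is_feasible` and `discover` are hypothetical placeholders for the Verification and Refinement modules, and the toy stubs below merely replay the two refinement rounds of the motivating example.

```python
from typing import Callable, List, Optional, Tuple

Path = List[int]


def cegar(initial: List[str],
          verify: Callable[[List[str]], Optional[Path]],
          is_feasible: Callable[[Path], bool],
          discover: Callable[[Path, List[str]], Optional[str]],
          max_rounds: int = 10) -> Tuple[str, List[str]]:
    """Abstract, verify, and refine until a verdict is reached."""
    preds = list(initial)
    for _ in range(max_rounds):
        cex = verify(preds)                 # None means no abstract error path
        if cex is None:
            return "safe", preds
        if is_feasible(cex):                # real violation in the original program
            return "unsafe", preds
        new_pred = discover(cex, preds)     # spurious path: refine the abstraction
        if new_pred is None or new_pred in preds:
            break                           # no progress possible
        preds.append(new_pred)
    return "unknown", preds


# Toy stubs replaying the motivating example (assumed behaviour).
ORDER = ["bal >= minbal", "amt <= 10000", "balance - amt >= minbal"]
ERROR_PATH = [1, 2, 3, 4, 5, 6, 7, 8, 12, 13, 16]


def toy_verify(preds: List[str]) -> Optional[Path]:
    return None if all(p in preds for p in ORDER) else ERROR_PATH


def toy_feasible(path: Path) -> bool:
    return False                            # every reported path is spurious here


def toy_discover(path: Path, preds: List[str]) -> Optional[str]:
    return next((p for p in ORDER if p not in preds), None)


verdict, final_preds = cegar(ORDER[:1], toy_verify, toy_feasible, toy_discover)
print(verdict, final_preds)
```

Bounding the number of rounds is a design choice of this sketch; it guards against refinement sequences that never converge.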
6 TOWARDS IMPLEMENTATION
In general, the proposed framework accepts as inputs a database program and the properties of
interest, and verifies whether the input program respects the properties, generating a
safe/unsafe message as output. The overall architecture of our proposed framework is shown in
Figure 2. We have identified the following key modules which play important roles in implementing
the proposed framework:
1. Proformat: This module preprocesses input database programs and annotates them by adding line
numbers to all statements.
2. Abstraction: This module abstracts the database program into a boolean program using a set of
predicates.
3. Verification: This module first constructs the CFG of the boolean program. After that, it
generates VCs from the CFG. Finally, it tests the satisfiability of each VC using an SMT solver.
4. Refinement: The primary task of this module is to discover additional predicates which refine
the abstraction, avoiding the existence of spurious paths in the corresponding boolean database
programs.
[Figure 2: Overall Architecture of Database Model Checker. Labels in the diagram: Database application, Properties, Proformat, Abstraction, Verification, Refinement, Error, "Application is safe", "unsafe".]
7 CONCLUSIONS AND FUTURE WORKS
In this paper, we proposed a symbolic model checking algorithm for verifying the correctness of
database applications with respect to integrity properties. We designed our model checker based
on the following key modules: (i) Abstraction, (ii) Verification and (iii) Refinement. We are
currently implementing the proposed model checker, following the tool architecture described
above, in a modular way to support scalability.
ACKNOWLEDGEMENT
This work is partially supported by the research grant
(SB/FTP/ETA-315/2013) from the Science and En-
gineering Research Board (SERB), Department of
Science and Technology, Government of India.
REFERENCES
Anand, S., Păsăreanu, C. S., and Visser, W. (2007). JPF-SE:
A symbolic execution extension to Java PathFinder. In
TACAS, pages 134–138. Springer.
Artzi, S., Kiezun, A., Dolby, J., Tip, F., Dig, D., Paradkar,
A., and Ernst, M. D. (2010). Finding bugs in web ap-
plications using dynamic test generation and explicit-
state model checking. IEEE TSE, 36(4):474–494.
Ball, T., Majumdar, R., Millstein, T., and Rajamani, S. K.
(2001). Automatic predicate abstraction of c pro-
grams. In ACM SIGPLAN Notices, volume 36, pages
203–213. ACM.
Ball, T. and Rajamani, S. K. (2000). Bebop: A symbolic
model checker for boolean programs. In Internatio-
nal SPIN Workshop on Model Checking of Software,
pages 113–130. Springer.
Ball, T. and Rajamani, S. K. (2002). The SLAM project: debugging
system software via static analysis. In ACM
SIGPLAN Notices, volume 37, pages 1–3. ACM.
Chaki, S., Clarke, E., Groce, A., Ouaknine, J., Strichman,
O., and Yorav, K. (2004). Efficient verification of se-
quential and concurrent c programs. Formal Methods
in System Design, 25(2-3):129–166.
Chandra, S., Godefroid, P., and Palm, C. (2002). Software
model checking in practice: an industrial case study.
In Software Engineering, 2002. ICSE 2002. Proceedings
of the 24th IC on, pages 431–441. IEEE.
Clarke, E., Kroening, D., and Yorav, K. (2003). Behavioral
consistency of c and verilog programs using bounded
model checking. In Proc. of the 40th annual Design
Automation Conference, pages 368–371. ACM.
Clarke, E. M. and Emerson, E. A. (1981). Design and synt-
hesis of synchronization skeletons using branching
time temporal logic. In Workshop on Logic of Pro-
grams, pages 52–71. Springer.
Diana, R., Marques-Neto, H., Zarate, L., and Song, M.
(2012). A symbolic model checking approach to verifying
Transact-SQL. In Systems, Man, and Cybernetics
(SMC), 2012 IEEE International Conference on,
pages 1735–1741. IEEE.
Gligoric, M. and Majumdar, R. (2013). Model checking
database applications. In IC on Tools and Algorithms
for the Construction and Analysis of Systems, pages
549–564. Springer.
Halder, R. and Cortesi, A. (2012). Abstract interpretation
of database query languages. Computer Languages,
Systems & Structures, 38:123–157.
Holzmann, G. J. (1997). The model checker spin. IEEE
TSE, 23(5):279–295.
Ivancic, F., Yang, Z., Ganai, M. K., Gupta, A., Shlyakhter,
I., and Ashar, P. (2005). F-soft: Software verification
platform. In IC on Computer Aided Verification, pages
301–306. Springer.
Jhala, R. and Majumdar, R. (2009). Software mo-
del checking. ACM Computing Surveys (CSUR),
41(4):21.
Martin, M. and Lam, M. S. (2008). Automatic generation of
xss and sql injection attacks with goal-directed model
checking. In Proc. of the 17th conference on Security
symposium, pages 31–43. USENIX Association.
Musuvathi, M. and Qadeer, S. (2007). Iterative context
bounding for systematic testing of multithreaded pro-
grams. In ACM Sigplan Notices, volume 42, pages
446–455. ACM.
Paleari, R., Marrone, D., Bruschi, D., and Monga, M.
(2008). On race vulnerabilities in web applications.
In IC on Detection of Intrusions and Malware, and
Vulnerability Assessment, pages 126–142. Springer.
Petrov, B., Vechev, M., Sridharan, M., and Dolby, J. (2012).
Race detection for web applications. In ACM SIG-
PLAN Notices, volume 47, pages 251–262. ACM.
Queille, J.-P. and Sifakis, J. (1982). Specification and verifi-
cation of concurrent systems in cesar. In International
Symposium on programming, pages 337–351.
Scully, Z. and Chlipala, A. (2017). A program optimization
for automatic database result caching. ACM SIGPLAN
Notices, 52(1):271–284.
Wang, C., Hachtel, G. D., and Somenzi, F. (2006). Ab-
straction refinement for large scale model checking.
Springer Science & Business Media.
Yang, J., Twohey, P., Engler, D., and Musuvathi, M. (2006).
Using model checking to find serious file system er-
rors. ACM Trans. CS (TOCS), 24(4):393–423.