A Symbolic Model Checker for Database Programs

Angshuman Jana, Md. Imran Alam and Raju Halder

Indian Institute of Technology Patna, India

Keywords:

Model Checking, Database Program, Boolean Program, Veriﬁcation, Reﬁnement.

Abstract:

Most existing model checking approaches target mainstream languages without considering any database statements. As a result, they are not directly applicable to database applications for verifying their correctness. On the other hand, the few works in the literature that address the verification of database applications focus on atomicity constraints, transaction properties, etc. In this paper, as an alternative, we propose the design of a symbolic model checker for database programs to verify integrity properties defined over database attributes. The proposed model checker is designed based on the following key modules: (i) Abstraction, (ii) Verification, and (iii) Refinement.

1 INTRODUCTION

Model checking is an algorithmic method for proving that a system satisfies its specification (Clarke and Emerson, 1981; Wang et al., 2006). As stated in (Jhala and Majumdar, 2009), the goal of model checking research is to expand the scope of automated techniques for program reasoning, both in the scale of programs handled and in the richness of properties that can be checked. A model checker receives application source code and exhaustively explores its execution states, searching for possible violations of the properties of interest. For example, properties may include simple assertions, which state that a predicate on program variables holds whenever the computation reaches a particular location; global invariants, which state that certain predicates hold in every reachable state (e.g. each array access is within bounds); or termination properties.

Based on the state representation, model checkers are of two types: (i) enumerative (in which individual states are represented) and (ii) symbolic (in which sets of states are represented using constraints). State space explosion is the critical limitation of the enumerative techniques, which led researchers to explore symbolic algorithmic approaches. The symbolic model checking approach manipulates representations of sets of states, rather than individual states, and performs state exploration through the symbolic transformation of these representations (Queille and Sifakis, 1982). For example, the constraint 0 ≤ a ≤ 9 ∧ 6 ≤ b ≤ 10 represents the set of all states over a, b satisfying the constraint, which for integer-valued a and b implicitly represents a list of 50 possible states. Therefore, symbolic model checking algorithms are more efficient than enumerative approaches.
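The contrast can be made concrete with a minimal Python sketch, assuming integer-valued a and b. The names `SymbolicRegion` and `enumerate_states` are hypothetical and introduced only for illustration; they do not come from any of the tools discussed here. The symbolic form stores four bounds and operates on them directly, while the enumerative form materializes every state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolicRegion:
    """Set of states over (a, b) described by a_lo <= a <= a_hi and b_lo <= b <= b_hi."""
    a_lo: int
    a_hi: int
    b_lo: int
    b_hi: int

    def contains(self, a: int, b: int) -> bool:
        return self.a_lo <= a <= self.a_hi and self.b_lo <= b <= self.b_hi

    def intersect(self, other: "SymbolicRegion") -> "SymbolicRegion":
        # Symbolic operation: only the bounds are manipulated, never single states.
        return SymbolicRegion(
            max(self.a_lo, other.a_lo), min(self.a_hi, other.a_hi),
            max(self.b_lo, other.b_lo), min(self.b_hi, other.b_hi),
        )


def enumerate_states(region: SymbolicRegion) -> list:
    """Enumerative view: list every individual (a, b) state in the region."""
    return [(a, b)
            for a in range(region.a_lo, region.a_hi + 1)
            for b in range(region.b_lo, region.b_hi + 1)]


region = SymbolicRegion(0, 9, 6, 10)      # 0 <= a <= 9 and 6 <= b <= 10
states = enumerate_states(region)          # explicit list of all states
narrowed = region.intersect(SymbolicRegion(5, 20, 0, 8))
```

The intersection is computed with four comparisons regardless of how many states the regions contain, which is the kind of saving the symbolic representation is meant to provide.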

In the past, several enumerative model checkers, e.g. Verisoft (Chandra et al., 2002), SPIN (Holzmann, 1997), Cmc (Yang et al., 2006), etc., have been developed. On the other hand, many tools like SLAM (Ball and Rajamani, 2002), CBMC (Clarke et al., 2003), F-Soft (Ivancic et al., 2005), JavaPathFinder (Anand et al., 2007), etc. have been developed based on symbolic algorithms. Notably, the above tools are built for the verification of various mainstream languages without considering any database statements. Even though intensive research has already been done, researchers have not paid much attention to database programs embedding queries and data-manipulation commands. The few works in the literature include symbolic model checking of stored procedures and SQL queries for automatic validation against their specification (Diana et al., 2012), the explicit-state model checker DPF for the verification of database atomicity constraints in web applications (Gligoric and Majumdar, 2013), etc.

In this paper, as an alternative, we extend the existing symbolic model checking algorithm for imperative languages to the case of database programs. In particular, we aim at verifying the correctness of database programs with respect to integrity properties defined on the underlying databases.

To summarize, our contributions in this paper are:

1. Abstraction of database programs into boolean programs.

2. Generation of verification conditions (VCs) from the Control Flow Graph (CFG) of boolean database programs and their verification using an SMT solver.

3. Counter Example Guided Abstract Refinement (CEGAR) of boolean database programs.

Jana, A., Alam, M. and Halder, R.

A Symbolic Model Checker for Database Programs.

DOI: 10.5220/0006913003470354

In Proceedings of the 13th International Conference on Software Technologies (ICSOFT 2018), pages 347-354

ISBN: 978-989-758-320-9

Roadmap. In section 2, we discuss the current state of the art in the literature. In section 3, we describe a motivating example. In section 4, we recall some preliminaries. We introduce our approach in section 5. Section 6 provides the overall tool architecture. Finally, section 7 concludes the work.

2 RELATED WORKS

Chandra et al. (Chandra et al., 2002) proposed the tool Verisoft, which pioneered the idea of execution-based stateless model checking of software. It uses a scheduler which is able to exhaustively explore all possible interleavings of the process executions. In (Holzmann, 1997), the author proposed SPIN, a tool which supports model-based verification of distributed software systems. It has been used to detect design errors in applications ranging from high-level descriptions of distributed algorithms to detailed code for controlling telephone exchanges. Another execution-based model checker, Cmc (Yang et al., 2006), was proposed for C programs. The Cmc tool explores different executions by controlling schedules at the level of the OS scheduler. Ball and Rajamani (Ball and Rajamani, 2002) introduced the first Counter Example Guided Abstract Refinement (CEGAR)-based symbolic model checker, SLAM, for C programs. The SLAM tool works in the following three steps: (i) abstracting programs into the form of boolean programs, (ii) verifying properties using a model checking algorithm on the boolean programs, and (iii) refining the boolean programs based on CEGAR. The model checker Magic (Chaki et al., 2004) was proposed to enable modular verification of concurrent, message-passing C programs. Ivancic et al. (Ivancic et al., 2005) proposed the model checker F-Soft, which uses predicate abstraction along with other abstract domains that efficiently yield the kinds of invariants needed to check standard runtime errors in C programs. The JavaPathFinder tool (Anand et al., 2007) is a model checker for Java programs that modifies the Java Virtual Machine to implement systematic search over different thread schedules. Several other model checkers, e.g. CBMC (Clarke et al., 2003), Chess (Musuvathi and Qadeer, 2007), etc., have also been developed in the literature.

In (Paleari et al., 2008), the authors proposed a dynamic approach to detect race condition vulnerabilities in web-based applications. The approach analyzes a log file of a single run and identifies dependencies among SQL queries based on the set of relations and attributes that are read/written. QED (Martin and Lam, 2008) is the first model checker for Java web applications that systematically explores sequences of requests to a web application and looks for taint-based vulnerabilities. Artzi et al. (Artzi et al., 2010) proposed an explicit-state model checker for PHP web applications. The model checker generates test inputs for web applications, monitors the applications for crashes, and validates that the output conforms to the HTML specification. Petrov et al. (Petrov et al., 2012) developed a dynamic race detector tool for web applications. It is mainly based on the formulation of a happens-before relation that captures the asynchronous behavior of the most commonly used web platform constructs. In (Gligoric and Majumdar, 2013), the authors proposed an explicit-state model checker, DPF, for verifying atomicity violations in web applications. The model checker interposes between the program and the database layer and precisely tracks the effects of queries made to the database. Another symbolic model checker is proposed in (Diana et al., 2012) for the verification of stored procedures and SQL queries w.r.t. their specification. The specification is expressed in the temporal logic CTL. In (Scully and Chlipala, 2017), the authors introduced Sqlcache, an automatic compiler optimization for database result caching. The main aim of Sqlcache is to analyze web applications for compatibility checking between queries and updates, by instrumenting updates with cache invalidations.

3 MOTIVATING EXAMPLE

Consider the database program depicted in Code Snippet 1. The code implements a module which performs withdrawal of cash from an ATM system.

The function withdraw() updates the balance of authorized customers after satisfying two conditions: (i) the withdrawal amount (amt) requested by the customer must be less than or equal to 10,000 USD for each transaction, and (ii) the balance must remain greater than or equal to the minimum balance (minbal) of 1000 USD after each successful transaction. Now consider the following integrity constraint on the database attribute balance:

The minimum account balance of all customers should be 1000 USD.


Code Snippet 1: A Database Application PR.

0.  int withdraw() {
1.    int acc_no, amt, bal, minbal = 1000;
2.    acc_no = read();
3.    amt = read();
4.    Statement con = DriverManager.getConnection("jdbc mysql :.......", "scott", "tiger").createStatement();
5.    bal = con.executeQuery("select balance from Account where Accno=acc_no");
6.    if (amt ≤ 10000) {   // Maximum withdraw amount
7.      if (balance - amt > minbal) {
8.        con.executeQuery("update Account set balance = balance - amt where Accno = acc_no");
          // ... do some work ...
        }
        else
11.       ERROR_msg;
      }
12.   bal = con.executeQuery("select balance from Account where Accno=acc_no");
13.   display(bal);
      // ... do some work ...
16.   return 0;
    }

Our proposed framework allows us to verify whether the database program satisfies or violates such integrity constraints.

4 PRELIMINARIES

In this section, we recall the notions of predicate

abstraction, weakest preconditions and boolean

program.

Predicate Abstraction (Ball et al., 2001). The

predicate abstraction algorithm generates a ﬁnite

state abstraction from a large or inﬁnite state system.

It is an automated abstraction technique in which

the abstract domain is constructed from a given set

of predicates over program variables. It is used to

reduce the state space of a program. The basic idea

in predicate abstraction is to remove some variables

from the program by just keeping information about

a set of predicates about them. Let ℜ be a region.

Consider the predicate abstraction domain which is

parameterized by a ﬁxed ﬁnite set Π of ﬁrst order

formulas. The predicate abstraction of ℜ with respect

to Π is the smallest region A(ℜ, Π) which contains

ℜ and it is representable as a boolean combination of

predicates from Π:

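\[
\mathcal{A}(\Re, \Pi) = \bigwedge \{\, \psi \mid \psi \text{ is a boolean formula over } \Pi \;\wedge\; \Re \Rightarrow \psi \,\}
\]

The region A(ℜ, Π) can be computed by recursively splitting as follows:

\[
\mathcal{A}(\Re, \Pi) =
\begin{cases}
true, & \text{if } \Pi = \emptyset \text{ and } \Re \text{ satisfiable} \\
false, & \text{if } \Pi = \emptyset \text{ and } \Re \text{ unsatisfiable} \\
(p \wedge \mathcal{A}(\Re \wedge p, \Pi')) \vee (\neg p \wedge \mathcal{A}(\Re \wedge \neg p, \Pi')), & \text{if } \Pi = \{p\} \cup \Pi'
\end{cases}
\]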
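The recursive splitting can be sketched in Python under a strong simplifying assumption: satisfiability of a region is decided by brute force over a small finite domain, where a real implementation would call a decision procedure. The function names `satisfiable` and `abstract`, and the representation of the result as a list of cubes (each cube being a tuple of (predicate name, polarity) pairs), are illustrative choices and not part of the original formulation.

```python
def satisfiable(region, domain):
    """Brute-force satisfiability: does any state of the finite domain lie in the region?"""
    return any(region(state) for state in domain)


def abstract(region, preds, domain):
    """Compute A(region, preds) by recursive splitting.

    The result is a disjunction of cubes. An empty list stands for `false`;
    a list containing only the empty cube () stands for `true`.
    """
    if not preds:
        return [()] if satisfiable(region, domain) else []
    (name, p), rest = preds[0], preds[1:]
    # Split on p: abstract (region AND p) and (region AND NOT p) separately.
    pos = abstract(lambda s: region(s) and p(s), rest, domain)
    neg = abstract(lambda s: region(s) and not p(s), rest, domain)
    return ([((name, True),) + cube for cube in pos] +
            [((name, False),) + cube for cube in neg])


# Region x == 3 abstracted w.r.t. the predicates {x < 5, x == 2} over x in 0..9.
domain = range(10)
preds = [("x<5", lambda x: x < 5), ("x==2", lambda x: x == 2)]
result = abstract(lambda x: x == 3, preds, domain)
```

Here `result` holds the single cube (x < 5) ∧ ¬(x == 2), which is the smallest region expressible over the two predicates that still contains x == 3.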

Weakest Preconditions (Ball et al., 2001). Given a program statement stmt and a predicate ψ, the weakest precondition of stmt with respect to ψ, denoted WP(stmt, ψ), is the weakest predicate ψ′ such that executing stmt from any initial state satisfying ψ′ terminates in a final state where ψ is true. For example, given an assignment statement x = e,

\[
WP(x = e, \psi) = \psi'
\]

where ψ′ is obtained from ψ by replacing all occurrences of x in it with e, i.e. ψ′ ≜ ψ[e/x].

Example 1. Consider the statement stmt ≜ y = y + 1 and the predicate ψ ≜ y ≤ 30. The weakest precondition of stmt with respect to ψ is computed as

\[
WP(y = y + 1,\; y \le 30) = ((y + 1) \le 30) = (y \le 29)
\]

Observe that, given a statement stmt and a predicate ψ ∈ Π, it may be the case that WP(stmt, ψ) is not in Π. For instance, consider Π = {(x < 5), (x == 2)} and WP(x = x + 1, x < 5) = (x < 4), where the predicate (x < 4) is not in Π. Therefore, decision procedures (i.e. a theorem prover) can be used to strengthen or weaken the resulting precondition in order to express it over the predicates in Π (Ball et al., 2001).
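A small Python sketch of the substitution rule, assuming that states are dictionaries from variable names to values and that predicates and expressions are plain functions over states (both are modelling choices made here for illustration; `wp_assign` is a hypothetical helper name). Evaluating the postcondition in the state produced by the assignment is the semantic counterpart of the syntactic substitution ψ[e/x].

```python
def wp_assign(var, expr, post):
    """Return WP(var = expr, post) as a predicate over the state before the assignment."""
    def pre(state):
        after = dict(state)          # copy, so the caller's state is left untouched
        after[var] = expr(state)     # effect of the assignment var = expr
        return post(after)           # post evaluated after the assignment
    return pre


# Example 1: WP(y = y + 1, y <= 30)
post = lambda s: s["y"] <= 30
pre = wp_assign("y", lambda s: s["y"] + 1, post)
```

On every state, `pre` agrees with the predicate y ≤ 29 derived in Example 1.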

Boolean Programs (Ball and Rajamani, 2000). Boolean programs can be thought of as an abstract representation of a program that explicitly captures correlations between data and control, in which boolean variables can represent arbitrary predicates over the unbounded state of a program. The syntax is depicted in Table 1. Boolean variables are either local or global, statically scoped as in C programs, and variable declarations need not specify a type. A variable identifier is either a C-style identifier or an arbitrary string between the characters “{” and “}”. Two constants ‘0’ (false) and ‘1’ (true) are in the language. Expressions are built from these constants, variables and logical connectives. A parallel assignment statement allows the simultaneous assignment of a set of values to a set of variables. The statements ‘if’, ‘while’ and ‘assert’ can affect the control flow of the language. Observe that the predicate of a control statement is a decider, which can be used to model non-determinism. A decider is either an expression, which evaluates to ‘0’ or ‘1’ deterministically, or ‘?’, which evaluates to ‘0’ or ‘1’ non-deterministically.

Table 1: Syntax of Boolean Programs (Ball and Rajamani, 2000).

Commands:

B     ::= V* P*                        A list of global variable declarations followed by a list of procedure definitions
V     ::= id* ;                        Declaration of variables
P     ::= id (id*) begin V* S end      Procedure definition
S     ::= lstmt+                       Sequence of statements
lstmt ::= stmt | id : stmt
stmt  ::= skip; | id+ = exp+; | if(con) then S else S | while(con) do S
          | assert(con); | id(exp*); | print(exp+); | return;
con   ::= ? | exp                      Non-deterministic choice
exp   ::= exp op exp | !exp | (exp) | id | const
op    ::= & | | | ∧ | = | != | ⇒       Logical connectives
id    ::= [a-zA-Z] [a-zA-Z0-9]* | {string}
const ::= 0 | 1                        False/True

5 PROPOSED APPROACH

In this section, we illustrate our proposed model checker, which consists of three key modules: (i) abstraction of database programs into their equivalent boolean programs, (ii) automatic VC generation from the Control Flow Graph (CFG) of the boolean programs and its verification using SMT, and (iii) abstraction refinement, which introduces additional predicates by identifying spurious execution paths in the original database program. Let us explain each of the modules in the following sub-sections:

5.1 Program Abstraction

This section describes the abstraction of database programs into boolean programs using predicates.

Given a database program P and a set of predicates $E = \{\psi_1, \psi_2, \ldots, \psi_n\}$, BP(P, E) generates a boolean program which consists of boolean variables and has the same control flow as P. In particular, BP(P, E) contains n boolean variables $B = \{b_1, \ldots, b_n\}$, where each boolean variable $b_i$ represents the predicate $\psi_i$ (1 ≤ i ≤ n). Note that BP(P, E) is guaranteed to be an abstraction of P in the sense that the set of execution traces of BP(P, E) is a superset of the set of execution traces of P. We now describe the abstraction of the various programming constructs of P with respect to E:

Assignments (Ball et al., 2001)

Consider an assignment statement x = e at program point l in P, where e denotes an arithmetic expression. Given a predicate $\psi_i \in E$, the boolean variable $b_i$ in BP(P, E) can have the value true after l if $WP(x = e, \psi_i)$ holds before l. Similarly, $b_i$ can have the value false after l if $WP(x = e, \neg\psi_i)$ holds before l. If neither of these preconditions holds before l, the value of $b_i$ after l is unknown. Accordingly, BP(P, E) contains the following parallel assignment at l:

\[
b_1, \ldots, b_n = T(WP(x = e, \psi_1), WP(x = e, \neg\psi_1)),\; \ldots,\; T(WP(x = e, \psi_n), WP(x = e, \neg\psi_n))
\]

where the function T is defined as below:

bool T(bool pos, bool neg) {
  if (pos) { return true; }
  if (neg) { return false; }
  return unknown();
}

The unknown function is defined as follows:

bool unknown() {
  if (∗) { return true; }
  else { return false; }
}

Observe that the unknown function uses the control expression “∗”, which non-deterministically evaluates to either true or false.
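The behaviour of T can be mimicked in Python with one modelling assumption that is not in the original definition: a non-deterministic result is represented as the set of values it may take, so that the outcome is reproducible. The function names mirror the pseudocode above.

```python
def unknown():
    """Non-deterministic choice '*', modelled as the set of both possible outcomes."""
    return {True, False}


def T(pos, neg):
    """Possible values of b_i after the assignment.

    pos: WP(x = e, psi_i) holds before the statement.
    neg: WP(x = e, NOT psi_i) holds before the statement.
    """
    if pos:
        return {True}
    if neg:
        return {False}
    return unknown()
```

A model checker exploring the boolean program would branch on both members of the returned set whenever it contains more than one value.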

Conditional (Ball et al., 2001)

Consider a conditional statement if (ψ) {stmt} else {stmt} in P. If the predicate ψ evaluates to true in P, then ψ in the corresponding BP(P, E) should also evaluate to true. Similarly, if ψ evaluates to false in P, then ψ in the corresponding BP(P, E) should also evaluate to false. In BP(P, E), the condition is encoded as:

if (∗) {

assume(ψ);

...

} else {

assume(¬ψ);

...

}

Observe that, as “∗” in the ‘if’ condition non-deterministically evaluates to either true or false, both execution paths will be explored. The functions assume(ψ) under ‘if’ and assume(¬ψ) under ‘else’ preserve the semantics of the conditional statement in P.

Database Statements

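We now define the abstraction of database statements (SELECT, UPDATE, DELETE and INSERT) into their boolean form. To do so, let us first recall from (Halder and Cortesi, 2012) the formal syntax of database programs:

\[
\begin{array}{rcl}
Q & ::= & \langle A, \phi \rangle \\
A & ::= & select(v_a, f(\vec{e}), r(h(\vec{x})), \phi', g(\vec{e})) \mid update(\vec{v}_d, \vec{e}) \mid insert(\vec{v}_d, \vec{e}) \mid delete(\vec{v}_d) \\
\tau & ::= & n \mid v_a \mid v_d \mid f_n(\tau_1, \tau_2, \ldots, \tau_n), \quad \text{where } f_n \text{ is an } n\text{-ary function} \\
a_f & ::= & R_n(\tau_1, \tau_2, \ldots, \tau_n) \mid \tau_1 = \tau_2, \quad \text{where } R_n(\tau_1, \tau_2, \ldots, \tau_n) \in \{true, false\} \\
\phi & ::= & a_f \mid \neg\phi_1 \mid \phi_1 \vee \phi_2 \mid \phi_1 \wedge \phi_2 \mid \forall x_i\, \phi_1 \mid \exists x_i\, \phi_1 \\
c & ::= & Q \mid v_a = e \mid \text{if } cond \text{ then } c_1 \text{ else } c_2 \mid \text{while } cond \text{ do } c
\end{array}
\]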

where φ denotes a well-formed first order formula defined over constants (n), application variables ($v_a$) and database attributes ($v_d$). The SQL clauses GROUP BY, ORDER BY, DISTINCT/ALL and the aggregate functions (SUM, COUNT, MAX, MIN, AVG) are represented by the functions g(), f(), r(), h() respectively, parameterized with either none or one arithmetic expression e, or an ordered sequence of arithmetic expressions $\vec{e}$. The abstract syntax of a database statement is denoted by ⟨A, φ⟩, where A represents the Action-part and φ represents the Condition-part. The Action-part includes SELECT, UPDATE, DELETE and INSERT. For example, consider the query Q = “update t set sal:=sal+100 where age > 40”. According to the abstract syntax, Q is denoted by ⟨A, φ⟩ = ⟨update($\vec{v}_d$, $\vec{e}$), φ⟩, where $\vec{v}_d$ = ⟨sal⟩, $\vec{e}$ = ⟨sal+100⟩ and φ = age > 40.

Given a set of predicates E consisting of the integrity constraints under consideration, the abstraction of database statements into their equivalent boolean form involves the following two major tasks:

(1) Computation of the weakest precondition of database statements w.r.t. ψ ∈ E: We define below the computation of the weakest precondition of the various database statements for a given postcondition ψ.

\[
\begin{array}{l}
WP(\langle select(v_a, f(\vec{e}), r(h(\vec{x})), \phi', g(\vec{e})), \phi \rangle, \psi) = \psi \\
WP(\langle update(\vec{v}_d, \vec{e}), \phi \rangle, \psi) = ((\psi \wedge \neg\phi) \vee (\psi[\vec{e}/\vec{v}_d] \wedge \phi)) \\
WP(\langle insert(\vec{v}_d, \vec{e}), false \rangle, \psi) = \psi[\vec{e}/\vec{v}_d] \vee \psi \\
WP(\langle delete(\vec{v}_d), \phi \rangle, \psi) = \psi
\end{array}
\]

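(2) Conversion of database statements into boolean form: Given a database statement Q and a postcondition ψ, suppose WP(Q, ψ) = ψ′. Let us now define BP for database statements, which converts them into their equivalent boolean form w.r.t. ψ′:

\[
\begin{array}{l}
BP(\langle select(v_a, f(\vec{e}), r(h(\vec{x})), \phi', g(\vec{e})), \phi \rangle, \psi') = skip \\[4pt]
BP(\langle update(\vec{v}_d, \vec{e}), \phi \rangle, \psi') =
\begin{cases}
b = \phi \;?\; * : true, & \text{if } \psi' \in E \\
skip, & \text{otherwise}
\end{cases} \\[4pt]
BP(\langle insert(\vec{v}_d, \vec{e}), false \rangle, \psi') =
\begin{cases}
b = *, & \text{if } \psi' \in E \\
skip, & \text{otherwise}
\end{cases} \\[4pt]
BP(\langle delete(\vec{v}_d), \phi \rangle, \psi') = skip
\end{array}
\]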

Since SELECT and DELETE operations do not violate any integrity constraints defined over database states at row level, the corresponding statements are replaced by ‘skip’. On the other hand, in the case of UPDATE and INSERT, the following two cases may arise: (i) if ψ′ already exists in E and φ evaluates to true, the update operation takes place and the updated values may violate the integrity properties. Therefore, the corresponding boolean variable b in the boolean program is assigned ‘∗’. Otherwise, the evaluation of φ to false indicates no update operation (hence, no property violation) and therefore b is assigned true; and (ii) if ψ′ does not exist in E (with or without applying the strengthen or weaken operation), we replace the given database statement by ‘skip’.

Let us illustrate this using an example.

Example 2. Let E = {sal ≤ 8000} and let its corresponding boolean variable be b. We assume that the initial database is in a consistent state, and therefore the boolean variable b is set to true.

Consider the database table Emp, which stores employee information, and the following update statement:

Q = update Emp set sal=sal+100 where age>30

The abstract syntax of Q is represented as

Q = ⟨update(⟨sal⟩, ⟨sal+100⟩), age > 30⟩

The weakest precondition of Q w.r.t. “sal ≤ 8000” is WP(⟨update(⟨sal⟩, ⟨sal+100⟩), age > 30⟩, sal ≤ 8000) = ((sal ≤ 8000 ∧ ¬(age > 30)) ∨ (sal + 100 ≤ 8000 ∧ age > 30)). Assuming age > 30 evaluates to true, we get sal + 100 ≤ 8000, which results in sal ≤ 7900. Observe that the resulting precondition sal ≤ 7900 does not exist in E. Therefore, after applying the strengthen operation, b is set to “∗”.
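The update rule WP(⟨update(v, e), φ⟩, ψ) = (ψ ∧ ¬φ) ∨ (ψ[e/v] ∧ φ) can be sketched at row level in Python. This is a minimal illustration that assumes a row is a dictionary from attribute names to values and that a single attribute is updated; `wp_update` is a hypothetical helper name, and the sample rows are made up for the check.

```python
def wp_update(attr, expr, cond, post):
    """Row-level WP of 'update ... set attr = expr where cond' w.r.t. the predicate post."""
    def pre(row):
        if cond(row):
            # Row is selected by the WHERE clause: post must hold on the updated row,
            # which corresponds to (post[expr/attr] AND cond).
            updated = dict(row)
            updated[attr] = expr(row)
            return post(updated)
        # Row is not selected: post must already hold, i.e. (post AND NOT cond).
        return post(row)
    return pre


# Example 2: Q = update Emp set sal = sal + 100 where age > 30, with psi = (sal <= 8000)
psi = lambda r: r["sal"] <= 8000
pre = wp_update("sal", lambda r: r["sal"] + 100, lambda r: r["age"] > 30, psi)
```

Rows selected by the condition satisfy `pre` only if the salary still respects the bound after the raise; rows that are not selected are judged on their current salary.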

Illustration on Motivational Example. Let us

consider the motivating example in code snippet 1.


Code Snippet 2: Boolean Program of PR.

0.  int withdraw() {
      bool {bal > minbal}   // b := bal > minbal
1.    skip;
2.    skip;
3.    skip;
4.    skip;
5.    skip;
6.    if (∗) {
7.      if (∗) {
8.        b = φ ? ∗ : true;   // φ = (Accno = acc_no)
        }
        else {
11.       skip;
        }
      }
12.   skip;
13.   skip;
16.   skip;
    }

Consider E = {bal > minbal}, taking the integrity constraint under consideration, and its corresponding set of boolean variables B = {b}. The boolean program of PR with respect to E is shown in code snippet 2. Observe that the program statements 1, 2, 3, 4, 5, 11, 12, 13 and 16 in the boolean program are replaced by ‘skip’ because they do not affect the predicate set E. On the other hand, the conditions in the statements at program points 6 and 7 are set to “∗”. Note that, due to the absence of the predicates “amt ≤ 10000” and “balance − amt > minbal” in E, there is no assume statement included in the boolean code. Program statement 8 denotes the boolean abstraction of the update statement.

5.2 Veriﬁcation

In this section, we describe the generation of verification conditions and their satisfiability checking. After converting the program into a boolean form, we verify whether all possible execution paths respect the defined properties. To this end, we perform the following steps:

• Control Flow Graph (CFG) construction of boolean programs.

• VC generation from CFGs of boolean programs.

• Satisfiability checking of the VCs using SMT.

The Control Flow Graph $G_{BL}$ of a boolean program BL is constructed by following an approach similar to the one proposed in (Ball and Rajamani, 2000). After constructing $G_{BL}$, we list all possible execution paths. We convert each path $p_i$ into a verification formula $f_i$ by conjoining (and-ing) all boolean statements encountered along that path. The generated $f_i$ is supplied to an SMT solver for satisfiability checking. If the formula $f_i$ holds (equivalently, the solver reports its negation $\neg f_i$ as unsatisfiable), then the corresponding path $p_i$ satisfies the property and the algorithm terminates. Otherwise, we check whether the corresponding path $p_i$ is a feasible path in the original program. If yes, we conclude a violation of the property and an error trace is generated. Otherwise, a refinement of the boolean program is performed, which we describe in the next section.
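The path-listing step can be sketched in Python as a depth-first traversal. The CFG below is an assumption: its edges are read off the statement labels of the boolean program of PR (the two branching points are statements 6 and 7), and the graph is taken to be acyclic. The name `all_paths` is a hypothetical helper introduced here.

```python
def all_paths(cfg, node, exit_node, prefix=()):
    """List every path from `node` to `exit_node` in an acyclic CFG."""
    prefix = prefix + (node,)
    if node == exit_node:
        return [prefix]
    paths = []
    for successor in cfg.get(node, []):
        paths.extend(all_paths(cfg, successor, exit_node, prefix))
    return paths


# Assumed CFG of the boolean program: node -> list of successor nodes.
CFG = {
    1: [2], 2: [3], 3: [4], 4: [5], 5: [6],
    6: [7, 12],      # outer 'if (*)': taken branch, or fall through to 12
    7: [8, 11],      # inner 'if (*)': update branch, or else branch
    8: [12], 11: [12],
    12: [13], 13: [16],
}

paths = all_paths(CFG, 1, 16)
```

Each returned tuple is one execution path for which a verification formula would then be generated. Programs with loops would need a bound or a fixpoint computation instead of plain enumeration.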

Illustration on Motivational Example. Consider the boolean program in code snippet 2. Its CFG is depicted in Figure 1. Consider the CFG path p ≜ 1 → 2 → 3 → 4 → 5 → 6 → 7 → 8 → 12 → 13 → 16 (denoted by blue color). The VC generated along this path is b = φ ? ∗ : true. This is encoded as f ≜ (b ∧ φ) ∨ (¬b ∧ φ) ∨ (b ∧ ¬φ) ⇒ b == true. We pass the negation of f (i.e. ¬f) to the SMT solver, which reports “satisfied”. This shows that there exists some solution for which the predicate b evaluates to false. Therefore, the model checker indicates that the program PR may violate the integrity constraint. However, observe that this path is not a feasible path in PR, because b, which corresponds to bal > minbal, is false. This indicates that the abstraction is not precise enough and that refinement is needed.
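Because f ranges over only two boolean unknowns, the solver query can be reproduced by exhaustive enumeration. The Python sketch below stands in for the SMT call and is not how an SMT solver works internally; `implies` and `models_of_not_f` are names chosen here for illustration.

```python
from itertools import product


def implies(p, q):
    """Material implication p => q."""
    return (not p) or q


def f(b, phi):
    """f = (b AND phi) OR (NOT b AND phi) OR (b AND NOT phi)  =>  b == true"""
    antecedent = (b and phi) or ((not b) and phi) or (b and (not phi))
    return implies(antecedent, b)


# Satisfying assignments of NOT f, found by trying all four (b, phi) combinations.
models_of_not_f = [(b, phi)
                   for b, phi in product([False, True], repeat=2)
                   if not f(b, phi)]
```

The only assignment satisfying ¬f is b = false with φ = true, which matches the reported outcome that b may evaluate to false along this path.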

5.3 Counter Example Guided Abstract Refinement

The process of refining the abstraction is done using a method called counterexample-guided abstraction refinement (CEGAR) (Wang et al., 2006). If the boolean program contains an error path p and this path is not feasible in the original program, then a refinement of the boolean program is initiated to eliminate this false error path. More specifically, we determine suitable predicates by analyzing the paths that are infeasible in the original database program but appear as feasible paths in the corresponding boolean program. We then refine the abstraction by adding these new predicates to E. Let us illustrate below the application of CEGAR on the motivating example:

Illustration on Motivational Example. Consider the boolean program in code snippet 2. As we already observed in section 5.2, this abstraction is not precise enough. In order to refine it, we discover the predicate (amt ≤ 10000) from PR and add it to E. This results in E = {bal > minbal, amt ≤ 10000} and $B = \{b, b_1\}$, where b and $b_1$ correspond to bal > minbal and amt ≤ 10000 respectively. Given the refined boolean program w.r.t. these new E and B, observe that after verifying, by following the same procedure as before, we get the error path p ≜ 1 → 2 → 3 → 4 → 5 → 6 → 7 → 8 → 12 → 13 → 16, which is also not a feasible path in PR because the predicate (amt ≤ 10000) is always true along the path. By initiating the refinement process once again, we discover a new predicate (balance − amt > minbal). Now considering the new E = {bal > minbal, amt ≤ 10000, balance − amt > minbal} and $B = \{b, b_1, b_2\}$, we get the refined boolean program depicted in code snippet 3. We now generate the VC from the CFG of code snippet 3, which is encoded as

\[
f \triangleq \big( (b_1 == true) \wedge (b_2 == true \wedge b == true) \wedge (b_1 == true \wedge b_2 == true \wedge b == true) \wedge ((b \wedge \phi) \vee (\neg b \wedge \phi) \vee (b \wedge \neg\phi) \Rightarrow b == true) \big) \Rightarrow b == true
\]

The SMT solver reports unsatisfied when ¬f is passed as input. Therefore, the model checker indicates that PR is safe w.r.t. the integrity constraint bal > minbal.

Figure 1: CFG of the Boolean Program in Code Snippet 2.

Code Snippet 3: Refined Boolean Program of PR.

0.  int withdraw() {
      bool {bal > minbal, amt ≤ 10000, balance - amt > minbal}
      // b := bal > minbal, b1 := amt ≤ 10000, b2 := balance - amt > minbal
1.    skip;
2.    skip;
3.    skip;
4.    skip;
5.    skip;
6.    if (∗) {
        assume(amt ≤ 10000)              // b1 = true
7.      if (∗) {
          assume(balance - amt > minbal)   // b2 = true
8.        b = φ ? ∗ : true;              // φ = (Accno = acc_no)
        }
        else {
          assume(¬(balance - amt > minbal))
11.       skip;
        }
      }
12.   skip;
13.   skip;
16.   skip;
    }
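The abstract, verify, refine cycle described in this section can be summarized as a generic loop. The Python sketch below is only a skeleton under stated assumptions: the four callbacks (`abstract`, `verify`, `is_feasible`, `refine`), the `max_rounds` bound and the "unknown" verdict are hypothetical design choices made here, and the toy callbacks at the bottom merely replay the two refinement rounds of the motivating example without analyzing any program.

```python
def cegar(program, predicates, abstract, verify, is_feasible, refine, max_rounds=10):
    """Generic CEGAR loop returning (verdict, final predicate list)."""
    predicates = list(predicates)
    for _ in range(max_rounds):
        boolean_program = abstract(program, predicates)
        error_path = verify(boolean_program)      # None means no error path was found
        if error_path is None:
            return "safe", predicates
        if is_feasible(program, error_path):      # genuine counterexample
            return "unsafe", predicates
        new_predicates = refine(program, error_path, predicates)
        if not new_predicates:                    # nothing left to refine with
            return "unknown", predicates
        predicates = predicates + new_predicates
    return "unknown", predicates


# Toy stand-ins replaying the running example (assumptions, not a real analysis).
CANDIDATES = ["amt <= 10000", "balance - amt > minbal"]
ERROR_PATH = (1, 2, 3, 4, 5, 6, 7, 8, 12, 13, 16)

def toy_abstract(program, predicates):
    return tuple(predicates)

def toy_verify(boolean_program):
    # Pretend the error path disappears once all three predicates are tracked.
    return None if len(boolean_program) == 3 else ERROR_PATH

def toy_is_feasible(program, path):
    return False  # the error path is spurious in both rounds

def toy_refine(program, path, predicates):
    remaining = [c for c in CANDIDATES if c not in predicates]
    return remaining[:1]

verdict, final_predicates = cegar("PR", ["bal > minbal"], toy_abstract,
                                  toy_verify, toy_is_feasible, toy_refine)
```

Bounding the number of rounds is one way to keep the loop terminating when refinement fails to converge; returning "unknown" in that case avoids reporting an unjustified verdict.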

6 TOWARDS IMPLEMENTATION

In general, the proposed framework accepts as inputs a database program and the properties of interest, and it verifies whether the input program respects the properties, generating a safe/unsafe message as output. The overall architecture of our proposed framework is shown in Figure 2. We have identified the following key modules, which play important roles in implementing the proposed framework:

Figure 2: Overall Architecture of Database Model Checker. (Diagram labels: Database application, Properties, Proformat, Abstraction, Verification, Refinement; outcomes: No error, Application is safe; Error, unsafe.)

1. Proformat: This module preprocesses input database programs and annotates them by adding line numbers to all statements.

2. Abstraction: This module abstracts the database program into a boolean program using a set of predicates.

3. Verification: This module first constructs the CFG of the boolean program. After that, it generates VCs from the CFG. Finally, it tests the satisfiability of each VC using an SMT solver.

4. Refinement: The primary task of this module is to discover additional predicates which refine the abstraction, eliminating spurious paths in the corresponding boolean database programs.

7 CONCLUSIONS AND FUTURE WORKS

In this paper, we proposed a symbolic model checking algorithm for verifying the correctness of database applications with respect to integrity properties. We designed our model checker based on the following key modules: (i) Abstraction, (ii) Verification and (iii) Refinement. We are currently implementing the proposed model checker, as per the description provided in the tool architecture, in a modular way to support scalability.

ACKNOWLEDGEMENT

This work is partially supported by the research grant (SB/FTP/ETA-315/2013) from the Science and Engineering Research Board (SERB), Department of Science and Technology, Government of India.

REFERENCES

Anand, S., Păsăreanu, C. S., and Visser, W. (2007). JPF-SE: A symbolic execution extension to Java PathFinder. In TACAS, pages 134–138. Springer.

Artzi, S., Kiezun, A., Dolby, J., Tip, F., Dig, D., Paradkar, A., and Ernst, M. D. (2010). Finding bugs in web applications using dynamic test generation and explicit-state model checking. IEEE TSE, 36(4):474–494.

Ball, T., Majumdar, R., Millstein, T., and Rajamani, S. K. (2001). Automatic predicate abstraction of C programs. In ACM SIGPLAN Notices, volume 36, pages 203–213. ACM.

Ball, T. and Rajamani, S. K. (2000). Bebop: A symbolic model checker for boolean programs. In International SPIN Workshop on Model Checking of Software, pages 113–130. Springer.

Ball, T. and Rajamani, S. K. (2002). The SLAM project: debugging system software via static analysis. In ACM SIGPLAN Notices, volume 37, pages 1–3. ACM.

Chaki, S., Clarke, E., Groce, A., Ouaknine, J., Strichman, O., and Yorav, K. (2004). Efficient verification of sequential and concurrent C programs. Formal Methods in System Design, 25(2-3):129–166.

Chandra, S., Godefroid, P., and Palm, C. (2002). Software model checking in practice: an industrial case study. In Software Engineering, 2002. ICSE 2002. Proceedings of the 24th IC on, pages 431–441. IEEE.

Clarke, E., Kroening, D., and Yorav, K. (2003). Behavioral consistency of C and Verilog programs using bounded model checking. In Proc. of the 40th annual Design Automation Conference, pages 368–371. ACM.

Clarke, E. M. and Emerson, E. A. (1981). Design and synthesis of synchronization skeletons using branching time temporal logic. In Workshop on Logic of Programs, pages 52–71. Springer.

Diana, R., Marques-Neto, H., Zarate, L., and Song, M. (2012). A symbolic model checking approach to verifying Transact-SQL. In Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on, pages 1735–1741. IEEE.

Gligoric, M. and Majumdar, R. (2013). Model checking database applications. In IC on Tools and Algorithms for the Construction and Analysis of Systems, pages 549–564. Springer.

Halder, R. and Cortesi, A. (2012). Abstract interpretation of database query languages. Computer Languages, Systems & Structures, 38:123–157.

Holzmann, G. J. (1997). The model checker SPIN. IEEE TSE, 23(5):279–295.

Ivancic, F., Yang, Z., Ganai, M. K., Gupta, A., Shlyakhter, I., and Ashar, P. (2005). F-Soft: Software verification platform. In IC on Computer Aided Verification, pages 301–306. Springer.

Jhala, R. and Majumdar, R. (2009). Software model checking. ACM Computing Surveys (CSUR), 41(4):21.

Martin, M. and Lam, M. S. (2008). Automatic generation of XSS and SQL injection attacks with goal-directed model checking. In Proc. of the 17th conference on Security symposium, pages 31–43. USENIX Association.

Musuvathi, M. and Qadeer, S. (2007). Iterative context bounding for systematic testing of multithreaded programs. In ACM SIGPLAN Notices, volume 42, pages 446–455. ACM.

Paleari, R., Marrone, D., Bruschi, D., and Monga, M. (2008). On race vulnerabilities in web applications. In IC on Detection of Intrusions and Malware, and Vulnerability Assessment, pages 126–142. Springer.

Petrov, B., Vechev, M., Sridharan, M., and Dolby, J. (2012). Race detection for web applications. In ACM SIGPLAN Notices, volume 47, pages 251–262. ACM.

Queille, J.-P. and Sifakis, J. (1982). Specification and verification of concurrent systems in CESAR. In International Symposium on Programming, pages 337–351.

Scully, Z. and Chlipala, A. (2017). A program optimization for automatic database result caching. ACM SIGPLAN Notices, 52(1):271–284.

Wang, C., Hachtel, G. D., and Somenzi, F. (2006). Abstraction refinement for large scale model checking. Springer Science & Business Media.

Yang, J., Twohey, P., Engler, D., and Musuvathi, M. (2006). Using model checking to find serious file system errors. ACM Trans. CS (TOCS), 24(4):393–423.
