LOCALIZING BUGS IN PROGRAMS
Or How to Use a Program’s Constraint Representation for Software Debugging?
Franz Wotawa
Institute for Software Technology, Graz University of Technology, Inffeldgasse 16b/2, 8010 Graz, Austria
Keywords:
Fault localization, constraint-based reasoning.
Abstract:
The use of a program’s constraint representation for various purposes like testing and verification is not new.
In this paper, we focus on the applicability of constraint representations to fault localization and discuss the
underlying ideas. Given the source code of a program and a test case, which specifies the input parameters and
the expected output, we are interested in localizing the root cause of the revealed misbehavior. We first show
how programs can be compiled into their corresponding constraint representations. Based on the constraint
representation, we show how to compute root causes using a constraint solver. Moreover, we discuss how the
approach can be integrated with program assertions and unit tests.
1 INTRODUCTION
Localizing faults in software is generally considered
a difficult and time consuming task. This holds espe-
cially in the case of software maintenance where the
basic structure of the program and the underlying as-
sumptions are not well understood or even unknown.
Although there is growing interest in fault localization
within the research community, the overall problem
is far from being solved. One promising
approach is Zeller’s delta debugging (Zeller, 1999)
applied to the isolation of cause-effect chains in pro-
grams (Zeller and Hildebrandt, 2002). But (Gupta
et al., 2005) pointed out that from the cause-effect
chain it is not always easy to really identify a fault
and thus its applicability seems to be limited. Other
approaches use slicing techniques like (DeMillo et al.,
1996), (Kamkar, 1998), (Zhang et al., 2005) and oth-
ers. Unfortunately, these approaches do not guarantee
to remove all unnecessary parts of a program during a
debugging session. For an overview on debugging we
refer the reader to (Ducassé, 1993), (Shahmehri et al.,
1995), and (Stumptner and Wotawa, 1998).
In this paper, we present an approach that is based
on the syntax and semantics of a program. Thus the
approach guarantees to focus on those parts of the
program that are relevant in a certain debugging session.
(The work described in this paper has been supported by
the FIT-IT research project Self Properties in Autonomous
Systems (SEPIAS), which is funded by the Austrian Federal
Ministry of Transport, Innovation and Technology and the
FFG.) The
approach requires that we are given a failure-reveal-
ing test case, which specifies the input and the ex-
pected output of the program, and the source code
of the program. For the sake of simplicity we as-
sume that programs are written in a Java-like sequen-
tial programming language ignoring object-oriented
features, multi-threading, and exceptions. The pre-
sented approach is based on model-based diagnosis
(Reiter, 1987) and is most closely related to (Ceballos et al.,
2003; Ceballos et al., 2006).
The basic idea behind our approach is to compile
the source code into a behavior equivalent constraint
representation and to use this constraint representa-
tion directly to identify possible bug locations. An-
other underlying assumption is that the corrected pro-
gram is a close variant of the given program. Hence,
the approach is more suited for experienced program-
mers and less for programmers learning a program-
ming language where a specialized tutoring system
seems to be more suitable. Finally, we assume that
the used test case is as small as possible and, there-
fore, does not require too many loop iterations or re-
cursive procedure calls. This assumption is called the
small scope hypothesis, which is used in other appli-
cations like verification (see (Jackson, 2006)).
So, why does a constraint representation of a pro-
gram help to identify possible bug locations? Let us
start with an analysis of the following statement i:
i.    x = a + 10;

In i the value of variable x becomes the sum of the
value of a and 10 during execution. The direction of
computation is always from the inputs to the outputs.
Hence, from a known value of x after the execution
of i we cannot derive a value for a using the
programming language's semantics alone, although there
is a relation between those variables. The constraint
representation considers this relationship: line i is
simply treated as a relation between x and a. In our
case this relation is a mathematical equation. Whenever
x is known, a value for variable a can be derived, and
vice versa. But how does this help to focus on relevant
parts of the program during debugging? To answer this
question, we have a look at the following program:
1.    v = !in2;
2.    out1 = in1 && v;
3.    out2 = in3 || v;
Now consider the following test case: in1 = true,
in2 = false, in3 = false for the inputs, and out1 = true,
out2 = false for the expected output. The program
behaves incorrectly with respect to the given test case.
Instead of the expected value false, the value true is
computed for variable out2. When considering the data
dependencies for variable out2, statements 1 and 3 might
cause the misbehavior. Such a result would be returned
when using slicing-based debugging approaches.
Consider now the relation-based representation of
lines 1 to 3. We first assume that Line 1 is incorrect.
Therefore, we are not able to compute a value for the
target variable v. Because of Line 2 and the test case,
we know that out1 has to be true. This can only be
the case when both in1 and v are true. Hence, we
are able to determine the value of v. From this value
we obtain out2 to be true, which contradicts the test
case. The same happens when assuming Line 2 to be
faulty. The only remaining candidate is Line 3, which
improves the result. One reason for the improvement
is that relations allow for reasoning in all possible di-
rections and not only from the inputs to the outputs.
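This relational reasoning can be made concrete with a small brute-force sketch (Python; an illustration only, not the tool described later in this paper): for each line we assume that exactly this line is faulty, drop its relation, and check whether the remaining relations are still satisfiable together with the test case.

# Relations of lines 1-3; a faulty line is modeled by dropping its relation,
# i.e., its target variable is left unconstrained.
relations = {
    1: lambda e: e["v"] == (not e["in2"]),
    2: lambda e: e["out1"] == (e["in1"] and e["v"]),
    3: lambda e: e["out2"] == (e["in3"] or e["v"]),
}
test_case = {"in1": True, "in2": False, "in3": False,   # inputs
             "out1": True, "out2": False}               # expected outputs

def consistent(assumed_faulty):
    # Is there a value for the internal variable v such that the relations of
    # all lines assumed to be correct hold together with the test case?
    for v in (False, True):
        env = dict(test_case, v=v)
        if all(rel(env) for line, rel in relations.items() if line != assumed_faulty):
            return True
    return False

for line in relations:
    if consistent(line):
        print("Line", line, "is a possible fault location")
# Only line 3 is reported: assuming line 1 or line 2 faulty still forces out2 = true.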
In the rest of the paper we introduce the com-
pilation of programs into their constraint representa-
tion. For this purpose we first convert programs into
programs where all loops and recursive procedure or
method calls are unrolled. From this representation
we extract its static single assignment (SSA) form.
The SSA form can be easily mapped to a set of con-
straints. We further present an algorithm, which al-
lows for computing all statements that might cause
the misbehavior, and show how the approach can be
easily combined with program assertions and testing.
2 CONVERSION – PART 1
Within the paper, we assume that programs are writ-
ten in a Java-like programming language, ignoring
object-oriented features, multi-threading, and excep-
tions. Moreover, we ignore all type specific infor-
mation and assume that the use of functions, vari-
ables, and other language constructs, which require
type conformity, is done in a type-safe fashion. The
first part of the conversion comprises two steps. In
the first step, we convert all (not necessarily recursive)
procedures and while statements of the program by
unrolling sub-blocks and procedure bodies. In
the second step, we convert the resulting program into
its SSA form. In the SSA form every variable is only
used once as a target. Finally, the second part of con-
version uses the SSA form to obtain a constraint rep-
resentation.
We do not formally specify the conversion pro-
cess but explain the necessary steps and discuss im-
portant issues. The overall idea of using constraints
in software engineering is not new. However, most
of the research activities focus on verification except
(Ceballos et al., 2003; Ceballos et al., 2006), where
the constraint representation is used for fault localiza-
tion. (Gotlieb et al., 1998) used constraints for test
case generation. More recently, (Collavizza and Rue-
her, 2006) introduced the conversion of programs into
constraints and used them for verification purposes.
2.1 Recursion and Iteration
Under some restrictions, every program comprising
recursion and/or loop statements can be converted into
a loop-free but behavior-equivalent form. The restric-
tion applies to the program’s input requiring that the
input is chosen in a way where the maximum number
of recursive calls or iterations is known in advance.
This is of course not possible in the general case and
seems to be somehow awkward or very restrictive.
However, when considering a program that runs in its
environment similar restrictions apply. For example,
because of memory constraints the number of recur-
sions is limited. Moreover, like in (Jackson, 2006) we
argue that the test cases used to validate a program
are usually small and do not require too many loop
iterations or recursive calls. Since unrolling loops
and recursive calls increases the size of the resulting
program, a restriction on the maximum number of
iterations or recursions is necessary.
The unrolling step is necessary for debugging in order
to make all iterations explicit, which allows for a di-
rect integration of loop and recursion invariants. The
invariants only need to be copied in every unrolled
block. Moreover, in principle the number of iterations
and recursive calls can be determined directly from the
failure-revealing test case.
We use the following rules in order to compute the
recursion-free and loop-free variants of a given program.
We handle recursive procedures as well as loops by
unrolling the involved statements. This unrolling of
statements is done up to a pre-specified bound. To allow
for detecting that a test case reaches this boundary, we
use a new variable fail, which is initialized with the
value false. If the number of required iterations exceeds
the given bound, the variable fail is set to true.
Recursion. Given a procedure call y = p(a_1, ..., a_n)
at line i of the program, and the declaration of the
procedure, which comprises the formal parameters
x_1, ..., x_n and the set of statements S. For simplicity,
we assume that the body of p comprises only one return
statement at the end of the procedure. We construct a
new body S', which is equivalent to S but where the
return statement of the form return e; is replaced
with return_p = e;.
The conversion is done by replacing the procedure call
with the following statements in the case where the
maximum number of considered recursive calls is not
reached:

x_1 = a_1; ... x_n = a_n;
S'
y = return_p;

Note that if there is no return value, the last statement
can be ignored and S' is equivalent to S. If the
pre-defined maximum number of recursive calls is
reached, the call is replaced by the single assignment
fail = true; where the variable fail is used only for
stating that something unexpected happened, and shall
not be used in the original program. Note that the
maximum number of recursive calls limits the recursion
depth. For example, if there are two calls of the same
recursive procedure in a block, then the maximum number
of recursions for both is the same.
While/Loops. A while statement of the form while
C { S } can be easily converted into a nested if-structure.
Every time the condition C evaluates to true, the
statements in S are executed and testing C is done
again, until C evaluates to false. Hence, in general we
can replace a while statement by the following infinite
structure of nested if-statements:

if C { S
  if C { S
    ...
  }
}

In practice we have to set the nesting depth similar
to the maximum number of recursions when converting
recursive procedures. If the maximum nesting depth is
reached, we add the statement if C { fail = true; }.
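As a small illustration (a hypothetical loop, not taken from the paper), consider while (i < n) { s = s + i; i = i + 1; } unrolled with a maximum nesting depth of two; the Python-style sketch below only mirrors the nested if-structure described above.

i, n, s = 0, 2, 0          # some concrete input; two iterations suffice here
fail = False               # signals that the chosen bound was too small

if i < n:                  # first unrolled iteration
    s = s + i
    i = i + 1
    if i < n:              # second unrolled iteration
        s = s + i
        i = i + 1
        if i < n:          # bound reached: the unrolled program is no longer valid
            fail = True

print(s, fail)             # prints 1 False for this input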
With these two conversion rules we replace all
recursive and non-recursive functions and while
statements with their equivalent structures. The obtained
program comprises more statements than the original
program but does not contain loops anymore.
The unrolling of recursive functions and loops is
different from the original conversion of programs
into their equivalent SSA form, which does not re-
quire an unrolling. However, in our case the unrolling
allows for explicitly considering every single iteration
step during debugging. The SSA form, which we use
for debugging, is an SSA form of the converted program
without loops and recursions and not an SSA form of
the original program. Since the unrolling does not
change the semantics of the program if the variable
fail is not set to true during program execution using
the failure-revealing test case, the SSA form of the
converted program is also semantically equivalent to
the original program under the same conditions. This
ensures the correct computation of diagnoses.
2.2 Static Single Assignment Form
The SSA form of a program (Cytron et al., 1991) is a
representation with the property that no two left-side
variables share the same name. Hence, every vari-
able that is defined in a statement has a unique name.
The SSA form of a program is of importance in our
case because it can be directly mapped to constraints.
We discuss this issue later and briefly introduce the
conversion into an SSA form. For more information
regarding the SSA form and its computation we refer
to (Cytron et al., 1991; Aycock and Horspool, 2000;
Mayer, 2003) where also the conversion of arrays and
other data structures is explained.
The conversion of programs into their SSA form
can be done by adding an index to every variable. A
variable that is used obtains the index from the last
definition of the same variable. Every time a variable
is defined, a new index is generated. If a program
comprises only assignment statements, the conversion is
straightforward. In case of conditional statements or
loops the conversion becomes more complicated. In
our case, we only have to consider conditional state-
ments because the loop statements and the recursive
procedure calls are eliminated in the previous conver-
sion step.
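For straight-line code the indexing scheme can be sketched as follows (an illustrative Python function; statements are given as a target variable plus the variables used on the right-hand side, and conditional statements are ignored here):

def to_ssa(statements):
    # Rename the variables of a straight-line program into SSA form.
    # Each statement is a pair (target, used_variables).
    index = {}                          # current index of every variable
    ssa = []
    for target, used in statements:
        # a used variable obtains the index of its last definition (0 = input)
        renamed_used = ["%s_%d" % (u, index.get(u, 0)) for u in used]
        # defining a variable generates a fresh index
        index[target] = index.get(target, 0) + 1
        ssa.append(("%s_%d" % (target, index[target]), renamed_used))
    return ssa

# Lines 1-3 of the introductory example:
print(to_ssa([("v", ["in2"]), ("out1", ["in1", "v"]), ("out2", ["in3", "v"])]))
# [('v_1', ['in2_0']), ('out1_1', ['in1_0', 'v_1']), ('out2_1', ['in3_0', 'v_1'])]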
The idea behind the conversion of conditional
statements is as follows: The value of the condition is
stored in a new unique variable. The if- and the else-
branches are converted separately. In both cases the
conversion starts using the indices of the variables al-
ready computed. Both conversions deliver back new
indices of variables. In order to get a value for a vari-
able we have to select the last definition of a variable
1.  if (state == 1) {
2.    if (on) {
3.      state = 2;
4.  }
5.  else if (state == 2) {
6.    if (off) {
7.      state = 1;
8.  }
9.  }
10. if (state == 1) {
11.   v = !in2;
12.   out1 = in1 && v;
13.   out2 = in3 || v;
14. }
15. if (state == 2) {
16.   out1 = false;
17.   out2 = false;
18. }

Figure 1: A small example program fsm.
from the if- and else-branch depending on how the
if condition evaluates. This selection is done using a
function Φ, which is defined as follows:

Φ(x, y, C) = x if C = true, and y otherwise.
Hence, for every variable which is defined in the
if- or the else-branch we have to introduce a selecting
assignment statement, which calls the Φ function.
Let if C { .. x = .. } else { .. x = .. } be a
conditional statement at line n of the program. The
SSA form is given as follows:

var_n = C;
... x_i = ...
... x_j = ...
x_k = Φ(x_i, x_j, var_n);

The indices i, j, and k of x are assumed to be the
indices assigned to x in order to meet the properties
of the SSA form.
We illustrate the conversion using the program fsm of
Figure 1, which implements a small finite state machine.
Such programs often occur in the embedded systems
domain, which is one of the target domains of our
approach. The program comprises one state variable
state, five input variables on, off, in1, in2, in3,
and two output variables out1, out2. Lines 1-9
implement the state transitions as a function of on and
off, and lines 10-18 the output function, which
specifies values for out1 and out2 as a function of
in1, in2, in3 and the internal state variable state.
The SSA form of fsm is depicted in Figure 2. It
comprises only assignment statements. All conditional
statements are replaced by assignment statements where
the right-hand side calls the Φ function for each
variable used as target in either the then-branch or
the else-branch.

1.  var_1 = (state_0 == 1);
2.  var_2 = (on_0);
3.  state_1 = 2;
4.  state_2 = Φ(state_1, state_0, var_2);
5.  var_5 = (state_0 == 2);
6.  var_6 = (off_0);
7.  state_3 = 1;
8.  state_4 = Φ(state_3, state_0, var_6);
9.  state_5 = Φ(state_4, state_0, var_5);
10. state_6 = Φ(state_2, state_5, var_1);
11. var_11 = (state_6 == 1);
12. v_1 = !in2_0;
13. out1_1 = in1_0 && v_1;
14. out2_1 = in3_0 || v_1;
15. out1_2 = Φ(out1_1, out1_0, var_11);
16. out2_2 = Φ(out2_1, out2_0, var_11);
17. var_15 = (state_6 == 2);
18. out1_3 = false;
19. out2_3 = false;
20. out1_4 = Φ(out1_3, out1_2, var_15);
21. out2_4 = Φ(out2_3, out2_2, var_15);

Figure 2: The SSA form of program fsm.
Before discussing the conversion of programs in
SSA form to constraints, we introduce the basic con-
cepts of constraint systems including constraint solv-
ing.
3 CONSTRAINTS
In order to be self-contained, we briefly discuss the
basic definitions of constraint systems including the
computation of solutions. For a more in-depth pre-
sentation of constraint systems and their algorithms
we refer to (Dechter, 1992), (Mackworth, 1987),
and (Dechter, 2003). A constraint system CS is
characterized by a set of variables V = {V_1, ..., V_n},
each of them associated with a (not necessarily finite)
domain D_i, 1 ≤ i ≤ n, and a set of constraints
C = {C_1, ..., C_k}. Each of the constraints C_j has a
corresponding pair (X_j, R_j), where X_j ⊆ V is a set of
variables, and R_j is a relation over X_j. X_j is called
the scope of constraint C_j. For convenience we assume
a function dom : V → DOM that maps a variable V_i to its
domain D_i, a function scope : C → 2^V that maps a
constraint to its corresponding scope, and a function
rel : C → RELATIONS that maps constraints to their
relations.
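Such a constraint system can be written down directly in code. The following sketch (Python; the names are illustrative) stores the domains in a dictionary and each constraint as a pair of a scope and a relation, the latter given as the set of allowed tuples:

from itertools import product

# A small constraint system: two variables with finite domains and one
# constraint X < Y, represented in tabular form as a set of allowed tuples.
domains = {"X": [1, 2, 3], "Y": [1, 2, 3]}
c1 = (("X", "Y"),
      {(x, y) for x, y in product(domains["X"], domains["Y"]) if x < y})

def scope(c):      # the variables a constraint ranges over
    return c[0]

def rel(c):        # the relation (set of allowed tuples) of a constraint
    return c[1]

print(scope(c1), sorted(rel(c1)))   # ('X', 'Y') [(1, 2), (1, 3), (2, 3)]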
An assignment of values to all variables is called
an instantiation. An instantiation is said to be ful-
filled if it does not contradict any constraint. A con-
straint is said to be in contradiction with an instan-
tiation iff the variable values are not represented in
the constraint relations. Usually one is interested in
finding a non-contradictory, i.e., fulfilling,
instantiation, which we also call a solution. An effec-
tive way in practice for computing solutions is to use
backtrack search. For backtracking we assume an or-
dered variable collection VO. We start with the first
variable and assign a provisional value. We further
assign provisional values to the successive variables
as long as the constraints are fulfilled. For this pur-
pose we only have to consider constraints where all
variables have an assigned value. If one constraint
is violated we backtrack to the variable that has been
assigned a value in the last step and choose another
value. If there is no such value, we again have to
backtrack to the previous variable, and so on. If there is
no further value to assign for the first variable, there
is no solution. Otherwise, the procedure stops when
all variables have been assigned values that fulfill all
constraints.
The following algorithm implements backtracking and has
to be accessed via the call findSolution(CS, VO, ∅). It
returns a solution if one exists. Otherwise, the
algorithm returns the empty set. The algorithm only
guarantees to terminate on constraint systems where all
domains are finite. This is the case in debugging given
a failure-revealing test case, since the test case can
be used to restrict the domains.
findSolution(CS, VO, I)
1. If VO is empty, then return the current variable
   assignment I as result.
2. Otherwise, let v be the first element of VO.
3. For all values x ∈ dom(v) of the currently selected
   variable v do:
   (a) Add the assignment v = x to the set of current
       assignments I.
   (b) Check all constraints whose variables all have a
       value assignment in I. If at least one constraint
       is violated, remove v = x from I. Otherwise, do
       the following:
       i. Call findSolution(CS, VO \ {v}, I) recursively
          and store the result in r.
       ii. If r = ∅, then remove v = x from I. Otherwise,
          return r.
4. Return ∅.
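A minimal Python version of this procedure could look as follows (a sketch under the assumption that constraints are (scope, relation) pairs with the relation given as a set of allowed tuples, as in the sketch above; it returns one fulfilling instantiation or None):

def find_solution(domains, constraints, ordering, assignment=None):
    # Backtracking search for one instantiation that fulfills all constraints.
    if assignment is None:
        assignment = {}
    if not ordering:                       # all variables assigned
        return dict(assignment)
    var, rest = ordering[0], ordering[1:]
    for value in domains[var]:
        assignment[var] = value
        # check only those constraints whose variables all have a value
        ok = all(tuple(assignment[v] for v in scope) in relation
                 for scope, relation in constraints
                 if all(v in assignment for v in scope))
        if ok:
            result = find_solution(domains, constraints, rest, assignment)
            if result is not None:
                return result
        del assignment[var]                # undo and try the next value
    return None

On the toy system from the previous sketch, find_solution(domains, [c1], ["X", "Y"]) returns {'X': 1, 'Y': 2}.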
Besides optimizations regarding the used data
structures, there are three ways for improving the run-
ning time of the backtracking algorithm, i.e., vari-
able ordering (see (Freuder, 1982; Dechter and Pearl,
1989)), restricting domains, and detecting dead ends
during search as fast as possible (see (Dechter and
Pearl, 1988)).
4 CONVERSION – PART 2
Programs in SSA form have a simple structure, com-
prise only assignment statements, and every variable
is defined only once. Hence, the conversion is easy
and requires only a conversion of each statement sep-
arately. All variables are mapped to corresponding
variables of the constraint systems. The data types of
the variables are mapped to the domains of the con-
straint variables. Each statement is mapped to a con-
straint where the corresponding constraint variables
of variables used in the statement form the scope of
the constraint. The constraint’s relation is given by
the statement itself.
For example, Line 14 of the SSA form of program fsm
(Figure 2), out2_1 = in3_0 || v_1;, is converted into a
constraint C_14 with scope(C_14) = {out2_1, in3_0, v_1}
and relation rel(C_14) = {out2_1 = in3_0 ∨ v_1}. The
relation of C_14 has to be interpreted as a logical rule
where = is the bi-implication and ∨ a logical or. For
finite domains the relation can also be represented in
tabular form.

out2_1   in3_0   v_1
false    false   false
true     true    false
true     false   true
true     true    true
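In code, such a tabular relation can be generated by enumerating all assignments over the (boolean) domains and keeping the tuples that satisfy the statement, for example (an illustrative Python sketch):

from itertools import product

# Tabular relation of C_14 for out2_1 = in3_0 || v_1 over boolean domains.
scope_c14 = ("out2_1", "in3_0", "v_1")
rel_c14 = {(o, i, v) for o, i, v in product([False, True], repeat=3)
           if o == (i or v)}
print(sorted(rel_c14))   # the four tuples listed in the table above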
According to our definition of consistency of
constraints, a given instantiation, i.e., an assignment
of values to constraint variables, is consistent with a
constraint if the corresponding tuple is an element of
the relation. Otherwise, the instantiation
contradicts/does not fulfill the constraint. For C_14,
out2_1 = false, in3_0 = true, v_1 = false is an
inconsistent instantiation. Changing the value of in3_0
to false, we obtain a consistent instantiation.
The conversion of statements to the correspond-
ing relation is rather straightforward and has to be de-
fined for the functions and predicates of the data types
used in the program. Because of space limitations we
will not consider all of these conversions. Instead we
discuss the conversion of assignments comprising the
Φ function. Assume a statement x_i = Φ(x_j, x_k, c)
in line n of the program. We map this statement to a
corresponding constraint C_n with scope(C_n) =
{x_i, x_j, x_k, c}. The relation of C_n is specified
using three rules: c = true → x_i = x_j,
c = false → x_i = x_k, and x_j = x_k → x_i = x_j. The
latter rule states that if the value of the variable x
is the same in both branches of a conditional statement,
then it can be used after the execution of the branches.
The second part of the compilation process, i.e.,
the conversion of the SSA form into a set of
constraints, obviously does not change the behavior.
Every result of a program run of the SSA form with
respect to the given input values is a (partial)
solution of the corresponding constraint system using
the same input values, and vice versa. Note that in
cases where the number of chosen iterations is not
enough, the used fail variables are set to true. In
this case, the SSA form and its corresponding constraint
representation do not reflect the behavior of the
original program anymore. A solution is to perform the
compilation process again with an increased number of
allowed iterations.
Until now, we are not able to use the constraint
representation together with the backtracking algo-
rithm to compute possible fault candidates, i.e., state-
ments that cause misbehavior. The reason is that we
are not able to distinguish the correct behavior of a
statement from its incorrect one. A solution is to ex-
plicitly state correct and incorrect behavior. To do so,
we extend the constraints by introducing a new status
variable S_n for each constraint C_n. The domain of S_n
is {ok, ab}, where ok represents the correct and ab the
incorrect, i.e., abnormal, behavior. The relations
obtained from the statements are mapped to the correct
behavior. This can be done either by replacing a rule r
with S_n = ok → (r) in the constraint's relation or by
adding a new column to the tabular form and setting the
value of S_n to ok in each row.
In addition, we have to specify the faulty behavior.
Since the real faulty behavior is not known, we assume
the following fault model: A faulty assignment statement
does not allow us to determine a value for the
target/defined variable. All other variables are not
influenced. In order to implement this fault model, we
introduce a new value ?, which represents 'don't know'.
The ? is assumed to be an element of all domains that
correspond to program variables. For the rule
representation of a constraint C_n with target variable
x, we add the rule S_n = ab → x = ?. In tabular form we
add a new row where ? is assigned to all variables
except S_n, which has to be equal to ab. Moreover, a
further improvement would be to add new rules even for
the correct behavior. For example, a logical or is true
whenever at least one of its arguments is true.
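A straightforward implementation of this extension (a sketch; the function name and the status-variable encoding are illustrative) takes the tabular relation of a constraint and adds the status column together with the abnormal row:

UNKNOWN = "?"   # the 'don't know' value of the fault model

def extend_with_status(scope, relation):
    # Mark every original tuple as ok and add one all-? tuple for the
    # abnormal (ab) behavior of the statement.
    ext_scope = scope + ("S",)
    ext_relation = {tup + ("ok",) for tup in relation}
    ext_relation.add((UNKNOWN,) * len(scope) + ("ab",))
    return ext_scope, ext_relation

Applied to the relation of C_14 from above, this yields the original tuples marked ok plus the row (?, ?, ?, ab); the wildcard ok rows shown in the table below correspond to the additional improvement mentioned in the text.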
For constraint C_14 we state the extended behavior in
tabular form as follows:

out2_1   in3_0   v_1     S_14
false    false   false   ok
true     ?       true    ok
true     true    ?       ok
?        ?       ?       ab
In addition to these changes, we slightly adapt the
term consistency of constraints with respect to vari-
able instantiations. We say that a constraint is consis-
tent with a given instantiation, iff there exists a tuple
in the constraint relation that can be mapped to the in-
stantiation. A tuple can be mapped to an instantiation
iff there exists a replacement of each ? value for a
variable with an element of the domain that makes
the tuple equivalent to the instantiation. Note that not
all ? have to be replaced with the same value. Every
variable instantiation can be mapped to the faulty be-
havior. This ensures that at least one explanation can
always be found. From here on, we always assume
that a program to be debugged is compiled into a con-
straint representation that allows for debugging.
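The adapted consistency check, in which ? acts as a wildcard that may stand for any domain value, can be sketched as follows (Python; illustrative names):

UNKNOWN = "?"

def tuple_matches(tup, values):
    # A relation tuple matches concrete values if every position is either
    # equal to the value or the 'don't know' symbol ?.
    return all(t == UNKNOWN or t == v for t, v in zip(tup, values))

def consistent(constraint, instantiation):
    # A constraint is consistent with an instantiation iff some tuple of its
    # relation can be mapped to the instantiation values of its scope.
    scope, relation = constraint
    values = tuple(instantiation[v] for v in scope)
    return any(tuple_matches(tup, values) for tup in relation)

For the extended C_14, the instantiation out2_1 = false, in3_0 = true, v_1 = false, S_14 = ab is consistent, but only via the abnormal tuple.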
5 FAULT LOCALIZATION
Given the constraint representation CS_Π = (V, D, C)
of a program Π and a test case T_Π revealing a faulty
behavior, the backtracking algorithm findSolution
can be used in order to determine the cause of the
misbehavior. The following steps are necessary for
this purpose:
1. Let VO be the variable ordering where the first
   elements are the status variables S_1, ..., S_|C| of
   the constraints C_i ∈ C, followed by the variables
   that are specified in the test case T_Π, and the
   remaining variables from V.
2. For each input of the form x = v in T_Π, add a new
   constraint with scope {x_0} and relation x_0 = v
   to C.
3. For each expected output in T_Π of the form y = v,
   add a new constraint with scope {y_i} and relation
   y_i = v to C, where i is the greatest index of
   variable y.
4. Call findSolution((V, D, C), VO, ∅) and return the
   result.
Note that this algorithm returns only one solution.
It can be adapted in order to find all or a pre-defined
number of solutions by adapting findSolution accord-
ingly. In a practical setting the algorithm has to be
adapted in order to first search for single fault can-
didates and afterwards for multiple fault candidates.
But it is important to consider that the approach is not
limited to single faults.
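To make the procedure concrete, the following self-contained sketch (Python; it builds on the representations sketched earlier, with illustrative names and a wildcard-aware solver) encodes the three-line example from the introduction, puts the status variables first in the ordering, adds the test-case constraints, and reads the fault candidate off the returned solution:

from itertools import product

UNKNOWN = "?"                             # 'don't know' value of the fault model

def matches(tup, values):
    return all(t == UNKNOWN or t == v for t, v in zip(tup, values))

def find_solution(domains, constraints, ordering, assignment=None):
    # Backtracking over the variable ordering; a constraint is checked as soon
    # as all variables of its scope are assigned (wildcard-aware check).
    if assignment is None:
        assignment = {}
    if not ordering:
        return dict(assignment)
    var, rest = ordering[0], ordering[1:]
    for value in domains[var]:
        assignment[var] = value
        ok = all(any(matches(tup, tuple(assignment[v] for v in scope)) for tup in rel)
                 for scope, rel in constraints
                 if all(v in assignment for v in scope))
        if ok:
            result = find_solution(domains, constraints, rest, assignment)
            if result is not None:
                return result
        del assignment[var]
    return None

def stmt_constraint(status, target, used, fun):
    # Extended constraint of an assignment statement: target = fun(used).
    scope = (status, target) + tuple(used)
    rel = {("ok", fun(*vals)) + vals
           for vals in product([False, True], repeat=len(used))}
    rel.add(("ab",) + (UNKNOWN,) * (len(scope) - 1))    # fault model: all unknown
    return scope, rel

BOOL = [False, True]
domains = {"S1": ["ok", "ab"], "S2": ["ok", "ab"], "S3": ["ok", "ab"],
           "in1_0": BOOL, "in2_0": BOOL, "in3_0": BOOL,
           "v_1": BOOL, "out1_1": BOOL, "out2_1": BOOL}

constraints = [
    stmt_constraint("S1", "v_1", ("in2_0",), lambda a: not a),                # v = !in2
    stmt_constraint("S2", "out1_1", ("in1_0", "v_1"), lambda a, b: a and b),  # out1 = in1 && v
    stmt_constraint("S3", "out2_1", ("in3_0", "v_1"), lambda a, b: a or b),   # out2 = in3 || v
    # test case: inputs and expected outputs as unary constraints
    (("in1_0",), {(True,)}), (("in2_0",), {(False,)}), (("in3_0",), {(False,)}),
    (("out1_1",), {(True,)}), (("out2_1",), {(False,)}),
]

# Status variables first; ok precedes ab in their domains, so the all-ok
# assignment is tried before any fault assumption.
ordering = ["S1", "S2", "S3",
            "in1_0", "in2_0", "in3_0", "out1_1", "out2_1", "v_1"]
solution = find_solution(domains, constraints, ordering)
print([s for s in ("S1", "S2", "S3") if solution[s] == "ab"])    # ['S3']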
When applying the approach to the small example
program given in the introduction of this paper, we
receive as solution that Statement 3 is faulty. The
corresponding instantiation is in1_0 = true,
in2_0 = false, in3_0 = false, v_1 = true,
out1_1 = true, out2_1 = false, S_1 = ok, S_2 = ok,
S_3 = ab.
Besides handling multiple faults the proposed ap-
proach allows for an easy integration of assertions and
unit tests. Assertions are basically nothing else than
conditions that are evaluated during runtime. If the
condition evaluates to false the assertion is said to fail.
Otherwise, the assertion is said to be fulfilled. For ex-
ample, assertions for the fsm program in Figure 1
would specify the state transitions:
...
5.  else if (state == 2) {
6.    if (off) {
7.      state = 1;
8.  }
9.  }
@ASSERT[on = true → state = 2]
@ASSERT[off = true → state = 1]
10. if (state == 1) {
11.   v = !in2;
...
The first assertion specifies that whenever on is set
to true, the state variable should be 2. The second
assertion specifies that off changes state to 1. An
implicit assumption behind these assertions is that
either on or off is true. The integration of the as-
sertions is easy. They only have to be converted into
constraints using the variable indices at the given lo-
cation. Note that assertions are not allowed to change
variables. Hence, we do not need to take care of new
variable indices. For our example program, we would
add the following constraints to fsm's constraint
representation:

on_0    state_6
true    2

off_0   state_6
true    1
For both constraints no status variables are added,
because we assume that assertions cannot fail.
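In the constraint representation this amounts to appending one more (scope, relation) pair per assertion, without a status variable. One possible full tabular encoding of the first assertion, assuming a boolean domain for on_0 and {1, 2} for state_6 (an illustrative sketch; the tabular form shown above lists only the row where the premise holds):

from itertools import product

# Assertion on = true -> state = 2 as a constraint over on_0 and state_6.
scope = ("on_0", "state_6")
relation = {(on, st) for on, st in product([False, True], [1, 2])
            if (not on) or st == 2}
print(sorted(relation))   # [(False, 1), (False, 2), (True, 2)]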
The integration of unit tests can be done in a similar
way. Unit tests usually have the following structure:
At the beginning the variables are set to their initial
values. Then the program is called. Finally, assertions
are used to specify the expected values. In case one
assertion is contradicted, an exception is raised and
the unit test is assumed to fail. We illustrate the
integration of unit tests using the fsm program again.

@INIT[state = 2 ∧ on = false ∧ off = true]
@INIT[in1 = true ∧ in2 = false ∧ in3 = false]
1.  if (state == 1) {
2.    if (on) {
...
17.   out2 = false;
18. }
@ASSERT[out1 = true ∧ out2 = false]
For integration purposes we again convert the
initialization and the assertion to constraints, which
are assumed to be correct.

state_0   on_0    off_0
2         false   true

in1_0     in2_0   in3_0
true      false   false

out1_4    out2_4
true      false
The integration of assertions and unit tests as
shown is smooth when using a constraint represen-
tation. Moreover, the approach puts the computation
of fault candidates down to the computation of solu-
tions for the corresponding constraint representation.
For the latter there are efficient algorithms available.
Hence, for smaller programs up to 500 statements lo-
cating bugs should be possible. We implemented the
compilation of programs into constraints and tested
the approach on very small programs using a self-
implemented constraint solver. The results in terms
of the number of diagnosis candidates are promising
but require improvements on the side of the constraint
solver as well as some optimizations during the con-
version. In particular, time for computing fault candi-
dates has to be improved substantially in order to be
of use in practical applications. At the moment diag-
nosis time is between a fraction of a second and about
1 minute for small programs comprising 10 to 20 lines
of code.
6 CONCLUSIONS
There are many different approaches for fault local-
ization but most of them are based on data flow and
control flow. In this paper, we presented an approach
that is based on the constraint representation of a pro-
gram and a failure-revealing test case for computing
fault candidates. The advantage of constraints is the
availability of constraint solvers, which can be
directly used. The research described in this paper is
most closely related to the application of model-based
diagnosis (see (Reiter, 1987)) to software debugging as
described in (Köb and Wotawa, 2006) and (Ceballos
et al., 2003; Ceballos et al., 2006).
In contrast to previous research the presented ap-
proach offers: (1) An almost standardized way of rep-
resenting programs as constraints; (2) Debugging is
put down to constraint solving where a lot of research
is devoted to constraint solving algorithms; (3) The
integration of assertions and unit tests can be easily
done in our case. There is no need for special
treatment: assertions can be added to the compiled
program on the fly during a debugging session; and (4)
The debugging results depend on the syntax and the
semantics of the programming language.
Of course, the complexity of debugging is still high
and improvements of both the solving algorithms and
the conversion process are necessary. The handling of
object-oriented features, multi-threaded programs, and
exceptions are still open issues. However, in
specialized areas like the embedded systems domain, the
application of the presented approach is within reach.
REFERENCES
Aycock, J. and Horspool, N. (2000). Simple generation of
static-single assignment form. In Proceedings of the
9th International Conference on Compiler Construc-
tion (CC), pages 110–124.
Ceballos, R., Gasca, R. M., Valle, C. D., and Borrego, D.
(2006). Diagnosing errors in dbc programs using con-
straint programming. In Selected Papers from the 11th
Conference of the Spanish Association for Artificial
Intelligence (CAEPIA 2005), volume 4177 of Lecture
Notes in Computer Science.
Ceballos, R., Gasca, R., Valle, C. D., and Rosa, F. D. L.
(2003). A constraint programming approach for soft-
ware diagnosis. In Ronsse, M. and Bosschere, K. D.,
editors, Proceedings of the Fifth International Work-
shop on Automated Debugging, Ghent, Belgium.
Collavizza, H. and Rueher, M. (2006). Exploration of
the capabilities of constraint programming for soft-
ware verification. In Proceedings of Tools and Algo-
rithms for the Construction and Analysis of Systems
(TACAS), pages 182–196. Springer, Vienna, Austria.
Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., and
Zadeck, F. K. (1991). Efficiently computing static
single assignment form and the control dependence
graph. ACM TOPLAS, 13(4):451–490.
Dechter, R. (1992). Constraint networks. In Encyclopedia
of Artificial Intelligence, pages 276–285. Wiley and
Sons.
Dechter, R. (2003). Constraint Processing. Morgan Kauf-
mann.
Dechter, R. and Pearl, J. (1988). Network-based heuristics
for constraint-satisfaction problems. Artificial Intelli-
gence, 34:1–38.
Dechter, R. and Pearl, J. (1989). Tree clustering for con-
straint networks. Artificial Intelligence, 38:353–366.
DeMillo, R. A., Pan, H., and Spafford, E. H. (1996). Critical
slicing for software fault localization. In International
Symposium on Software Testing and Analysis (ISSTA),
pages 121–134.
Ducassé, M. (1993). A pragmatic survey of automatic
debugging. In Proceedings of the 1st International
Workshop on Automated and Algorithmic Debugging,
AADEBUG ’93, Springer LNCS 749, pages 1–15.
Freuder, E. C. (1982). A sufficient condition for backtrack-
free search. Journal of the ACM, 29(1):24–32.
Gotlieb, A., Botella, B., and Rueher, M. (1998). Au-
tomatic test data generation using constraint solving
techniques. In Proc. ACM ISSTA, pages 53–62.
Gupta, N., He, H., Zhang, X., and Gupta, R. (2005). Locat-
ing faulty code using failure-inducing chops. In Auto-
mated Software Engineering (ASE), pages 263–272.
Jackson, D. (2006). Software abstractions: logic, language,
and analysis. MIT Press.
Kamkar, M. (1998). Application of program slicing in al-
gorithmic debugging. Information and Software Tech-
nology, 40:637–645.
Köb, D. and Wotawa, F. (2006). Fundamentals of debug-
ging using a resolution calculus. In Baresi, L. and
Heckel, R., editors, Fundamental Approaches to Soft-
ware Engineering (FASE’06), volume 3922 of Lecture
Notes in Computer Science, pages 278–292, Vienna,
Austria. Springer.
Mackworth, A. (1987). Constraint satisfaction. In Shapiro,
S. C., editor, Encyclopedia of Artificial Intelligence,
pages 205–211. John Wiley & Sons.
Mayer, S. (2003). Static single-assignment form
and two algorithms for its generation. Semi-
nar Work, Winter Term 2002/03, University of
Konstanz, http://www.inf.uni-konstanz.de/dbis/
teaching/ws0203/pathfinder/download/ mayers-
ausarbeitung.pdf.
Reiter, R. (1987). A theory of diagnosis from first princi-
ples. Artificial Intelligence, 32(1):57–95.
Shahmehri, N., Kamkar, M., and Fritzson, P. (1995). Us-
ability criteria for automated debugging systems. J.
Systems Software, 31:55–70.
Stumptner, M. and Wotawa, F. (1998). A Survey of Intelli-
gent Debugging. AI Communications, 11(1).
Zeller, A. (1999). Yesterday, my program worked. Today,
it doesn't. Why? In Proceedings of the Seventh Euro-
pean Software Engineering Conference/Seventh ACM
SIGSOFT Symposium on Foundations of Software En-
gineering (ESEC/FSE), pages 253–267.
Zeller, A. and Hildebrandt, R. (2002). Simplifying and iso-
lating failure-inducing input. IEEE Transactions on
Software Engineering, 28(2).
Zhang, X., He, H., Gupta, N., and Gupta, R. (2005). Exper-
imental evaluation of using dynamic slices for fault
localization. In Sixth International Symposium on Au-
tomated & Analysis-Driven Debugging (AADEBUG),
pages 33–42.