Review of Jackson and Vaziri Finding Bugs With a Constraint Solver

Jackson and Vaziri 2000:

1      Problem

The work presented in this paper provides an approach to the problem of detecting bugs in software specifications, and makes a contribution toward the higher-level goal of determining whether a concrete program meets its specifications (captured by data abstractions). To provide a means of comparison among their contribution and similar approaches, the authors call out software engineering concepts soundness, completeness, and scope. They explain that the traditional approach of shape analysis is sound (i.e. provides no false negatives), is not complete (i.e. may produce ‘spurious’ false positives), but does so within boundless scope (i.e. for any number of heap cells or loop unrollings). The problem the authors present with this approach is that it reports errors on a lot of execution traces that may never happen, and is too conservative in that it may report that a property for an execution trace does not hold when it really does hold. Furthermore they claim that such conservative approaches are less accurate in bug reports (another way of saying that they are not complete), and break down for complex data abstractions and unbounded data structures because representation invariants are hard to find. Overall, the authors present that in order to be useful, a bug detection scheme doesn’t have to offer any unbounded-scope guarantees about properties of the model, and that it’s more important that the approach finds the most true bugs without false positives.

2      Solution

To respond to the stated needs, the authors present an approach that is both sound and complete, but within a bounded scope. First, program state is modeled relationally with a collection of set and relation variables that define the abstract representation of the underlying concrete program being modeled. Next, a formula is constructed that relates the state of the program before and after a statement of execution. This conjunction of constraints characterizes the sequences of states that may hold along an execution path, which is represented as a universal quantifier over a set of constraints composed with logical conjunction operators. The interesting part is that this statement is next negated to produce a formula that represents the set of conditions that satisfy permutations of execution paths that violate the specification. Since the specification is captured as a universal quantifier statement composed with conjunction, the violating traces are captured as an existential quantifier statement composed with disjunction operators. Finally, a SAT solver is used to find solutions to this formula produce counterexamples: execution traces that violate the specification. The use of a SAT solver to find counterexamples means that the search for bugs is exhaustive, rendering this approach both sound and complete, but only within the given scope. Admittedly, beyond the prescribed scope, the authors state that the solution “offers no guarantees” about bug detection.

As one can imagine, the number of execution paths can become quite large, but the authors improve on the naive approach of generating a variable for each state component at each node, by enabling preservation of node label assignment across non-modifying edges (within paths) and 2) by reusing labels across paths.

Using the control flow graph, they construct the formula that represents paths through the code of a procedure for a finite number of loop iterations, for example (a finite scope). Then they encode each statement in the procedure as a constraint that relates the initial states to the final states.

3      Claims

The overarching claimed contribution of this work is the “small scope hypothesis”. The researchers claim that bounded model checking can be used as an effective means of discovering bugs through an exhaustive search of a program’s state space within predefined boundaries of scope. Scope bounds define the size of the state space by limiting, for example, the size of data structures that are evaluated, or the number of loop unrollings for a procedure, that are evaluated. The authors claim that, “in practice, many bugs” can be discovered even when the bounds of the model restrict it to a small size.

4      Evidence

To back up their claim that a small scope can be sufficient for uncovering ‘many’ bugs, the researchers present two obstacles that must be overcome: 1. What happens at the edge of resource boundaries, e.g. when a program runs out of memory? The answer is to simulate the program at the edge of the resource, i.e to back off the model from the edge by a few instances, and to simulate it as it runs up against the resource bound. 2. In some cases, a larger scope might indeed become necessary to set up a complex set of circumstances that need to be evaluated. The authors suggest slicing up the problem into smaller chunks to make this possible.

It seems in some sense that the comparison to shape analysis sets up a straw man argument, because shape analysis is compared in the context of informing compiler optimization, while the proposed approach is used in the context of discovering bugs or checking properties. Ultimately, the closest competition of bounded model checking is traditional software testing, which

Ultimately, it’s hard to compare this approach to the ones they call out for comparison, because they compare them only conceptually. There is no quantitative presentation of interesting metrics (e.g. number of bugs found).

Posted in Paper Reviews

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>