next up previous contents
Next: Reference - Results: Experiments Up: SBSAT User Manual and Previous: Lemma database   Contents

Reference - Results: making BDDs from bmc

Among the experiments we have run, those inputs relating specifically to bounded model checking benchmarks have been obtained from the output of the bmc program obtainable from Carnegie Mellon University. That program inputs a model checking problem and a number of time steps and outputs a propositional logic formula representing the BMC problem in three formats: a large propositional logic formula, three-address code representing the parse tree for that formula, and a CNF translation of the formula. Program bmc internally represents all formulas recursively as

<function> = <variable>;
<function> = $\neg$<variable>;
<function> = <function> op <function>;
where op is one of $\vee$, $\wedge$, $\to$, $\equiv$. The binary tree associated with such a recursion is stored as a tree of pointers. Each node of the tree is represented as a triple of pointers: to the left descendent, the right descendent, and the parent. A pointer to the root of such a tree represents the output formula in three-address code. Further processing inside bmc converts this to a CNF expression which is also available as output. As an example, we use bmc to generate the three-address code problems for queue benchmarks (see next section) as follows:
genqueue # > queue#
bmc -k # queue# -prove
where genqueue is part of the bmc suite and # is replaced by a number representing problem complexity. The CNF versions are created by replacing the last line above with this:
bmc -k # queue# -dimacs

We use bmc to generate three-address and CNF inputs directly, instead of taking already generated CNF formulas ``off the shelf'' so we have equivalent three-address and CNF data. Thus, times we report for zChaff, Berkmin, and Siege may differ from published times.

The largest propositional logic formula output by bmc is a conjunction of smaller formulas, so the obvious course for SBSAT is to read in each of those smaller formulas as a BDD. Nevertheless, for some of the bmc outputs, those propositional logic formulas were much too large even to store as BDDs. Of course, we also did not want to use the three-address code or the CNF representation directly, since that would negate the benefits of SMURFs which are to retain potentially exploitable domain-specific relationships. Our current approach is successful in spite of being amazingly simplistic.

  1. We read in the three-address code and recreate the large propositional formula so as not to lose domain-specific information. Starting at the bottom of this formula we start building a BDD. We use a greedy algorithm: when the BDD gets too large (10-18 variables) we insert a new variable to represent the BDD so far, include a BDD asserting that is what the new variable represents, replace the part we have translated with the new variable, and continue the process. This particular translation goes against our intention of staying in the original domain, however, this simple process still proves useful. In future research we hope to find a better algorithm.
  2. To break each resultant BDD $f$ down to a 10-variable maximum (so that the SMURFs remain suitably small), we do the following (see also Section [*]):

    (a) Compute all projections $f_i$ of the BDD onto 10-variable subsets of its variable set (see Section [*] for the meaning of projection).

    (b) Simplify the $f_i$'s against each other and delete resultant $f_i$'s which become True. Below we call the final simplified $f_i$'s $f_1,\ldots,f_k$.

    Note that $f$ logically implies each $f_i$; we can think of them as ``approximations'' to $f$, in the sense that each is false on some, but probably not all, of the truth assignments on which $f$ is false.

    (c) Recall that the goal is to replace $f$ with a set of smaller BDD's. Now $f$ is logically equivalent to the conjunction of the set $\{f_1,f_2,\ldots,f_k,f^\star\}$ where

    \begin{displaymath}f^\star=(f_1\wedge f_2\wedge \cdots \wedge f_k)\rightarrow f\end{displaymath}

    ($f^\star$ just excludes the truth assignments where all the $f_i$'s are true but $f$ is is false).

    If $f^\star$ has $\le 10$ variables, we replace $f$ with $\{f_1,f_2,\ldots,f_k,f^\star\}$. If $f^\star$ has $>10$ variables, we replace $f$ with $\{f_1,f_2,\ldots,f_k\}$ plus the translation of $f^\star$ to CNF. (Typically, $f^\star$ is satisfied in most truth assignments, so its CNF translation should be fairly short.)

    Again, this procedure is simplistic. We hope in the future to find a better algorithm.

next up previous contents
Next: Reference - Results: Experiments Up: SBSAT User Manual and Previous: Lemma database   Contents
John Franco 2011-09-15