
A Boolean variable can be assigned one of two values taken from {1,0}
or, alternatively, from {true,false}. The unary operator of negation
(¬) may be placed to the left of a variable, for example
like this: ¬a. The pair ¬a is called a
negative literal; a negative literal has the value opposite to that of
the variable it names. A variable alone is called a positive
literal. There are twelve "useful" binary operators
(∨,∧,≡,⊕,→,← plus their
negations). Truth tables for these operators are shown on slide 1 of
the lecture notes "Introduction to Satisfiability". Some of the
operators have the associative property:
(a∨(b∨c)) = ((a∨b)∨c) = (a∨b∨c)
Some of the operators have the distributive property, for example:
(a∧b)∨(c∧d) = (a∨c)∧(a∨d)∧(b∨c)∧(b∨d)
(a∨b)∧(c∨d) = (a∧c)∨(a∧d)∨(b∧c)∨(b∧d)
A disjunction of literals is a collection of literals separated
by ∨, such as each factor (a∨c), (a∨d), ... on the right
of the first identity above. Such a disjunction is called a clause.
According to De Morgan's laws:
¬(a∧b)=(¬a∨¬b)
¬(a∨b)=(¬a∧¬b)
 A propositional formula is constructed from positive literals,
negative literals, the 12 "useful" binary operators, the unary
"negation" operator ¬, and left and right parens.
For example:
((a∨¬b)→(a∧(c⊕¬(¬b≡c)))∧b)
A propositional formula has a value which is induced by an assignment
of values to its variables: the value is obtained by repeatedly
replacing a binary or unary operation with the value indicated by the right
column in a row of a truth table corresponding to that operation on
already evaluated operands, until there are no more operations. Such
a sequence of substitutions for the formula above under assignment
a=1,b=0,c=1 is as follows:
(((1∨¬0)→(1∧(1⊕¬(¬0≡1))))∧0)
(((1∨1)→(1∧(1⊕¬(¬0≡1))))∧0)
(((1∨1)→(1∧(1⊕¬(1≡1))))∧0)
((1→(1∧(1⊕¬(1≡1))))∧0)
((1→(1∧(1⊕¬1)))∧0)
((1→(1∧(1⊕0)))∧0)
((1→(1∧1))∧0)
((1→1)∧0)
(1∧0)
0
so that formula evaluates to 0 under the assignment.
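The substitution sequence above can be reproduced mechanically. Here is a small Python sketch (the helper names NOT, OR, AND, XOR, EQV, IMP are illustrative only, not part of the formalism) that evaluates the example formula over 0/1 values:

```python
# Helpers for the unary and binary operators over 0/1 values.
def NOT(x): return 1 - x
def OR(x, y): return max(x, y)
def AND(x, y): return min(x, y)
def XOR(x, y): return x ^ y
def EQV(x, y): return 1 - (x ^ y)      # the "equivalent" operator is the negation of XOR
def IMP(x, y): return max(1 - x, y)    # x implies y is (not x) or y

# ((a or not b) implies (a and (c xor not(not b equiv c)))) and b
def formula(a, b, c):
    return AND(IMP(OR(a, NOT(b)),
                   AND(a, XOR(c, NOT(EQV(NOT(b), c))))), b)

print(formula(1, 0, 1))   # 0, matching the hand evaluation above
```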
 If there is an assignment of values to the variables of formula
φ that causes φ to evaluate to 1, then we say φ is
satisfiable. If not, φ is said to be unsatisfiable.
 A Conjunctive Normal Form (CNF) formula is a conjunction of
clauses. The Satisfiability problem is to determine whether a given
CNF formula evaluates to 1 for some assignment of values to its
variables.
 The Satisfiability problem is NP-complete (that is, in general it
is really hard).
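Since the definition only asks whether some assignment works, a brute-force checker makes it concrete; it is exponential in the number of variables, which is exactly why NP-completeness matters:

```python
from itertools import product

# Brute-force satisfiability check for a CNF formula.  Clauses are sets
# of nonzero integers: literal v means variable v, literal -v means not-v.
def satisfiable(clauses):
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, bits))
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

# (v1 or v2) and (not v1 or v2) and (not v2) is unsatisfiable;
# dropping the last clause makes it satisfiable.
print(satisfiable([{1, 2}, {-1, 2}, {-2}]))   # False
print(satisfiable([{1, 2}, {-1, 2}]))         # True
```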
 Any propositional formula can be transformed efficiently to an
equisatisfiable CNF formula by repeatedly introducing a new variable
and substituting a collection of up to four clauses that correspond to
an equivalence between that new variable and a non-CNF binary
operation. For an example see slides 3 and 4 of the "Introduction to
Satisfiability" lecture slides. If the variables of formula
φ_{1} are a superset of the variables of formula
φ_{2} then we say that φ_{1} and
φ_{2} are equisatisfiable if for any assignment to the
variables of φ_{2} that satisfies φ_{2}, there
is an assignment of values to variables of φ_{1} that
satisfies φ_{1}, and if φ_{2} is unsatisfiable
then so is φ_{1}.
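A minimal sketch of one such substitution, assuming the usual three-clause set for the equivalence t ≡ (a ∧ b); the brute-force check below only confirms equisatisfiability on this tiny example:

```python
from itertools import product

def satisfiable(clauses, n):
    # Clauses are sets of nonzero ints over variables 1..n; -v means not-v.
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

# New variable t replaces the subformula (a and b); the clauses encoding
# t == (a and b) are (-t or a), (-t or b), (-a or -b or t).
# Variables: a = 1, b = 2, t = 3.
defn = [{-3, 1}, {-3, 2}, {-1, -2, 3}]

print(satisfiable([{1}, {2}], 2))           # True:  a and b is satisfiable
print(satisfiable([{3}] + defn, 3))         # True:  so is t plus the definition
print(satisfiable([{1}, {2}, {-1}], 2))     # False: (a and b) and (not a)
print(satisfiable([{3}, {-1}] + defn, 3))   # False: the CNF version agrees
```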
 In addition, many important non-logical problems can be
transformed to CNF formulas whose solutions can be used to construct
solutions to the original problems.
 Most CNF SAT solvers use resolution as the main search operation.
There are two common forms of resolution:
 general resolution: Given two clauses C_{1} and
C_{2} such that there is exactly one
literal x in C_{1} whose complement
¬x is in C_{2},
the resolvent of C_{1} and C_{2}
is the clause containing all literals of C_{1} and
C_{2} except for x
and ¬x. For example, the resolvent of
(v_{1} ∨ v_{2} ∨ v_{3})
and
(¬v_{1} ∨ v_{2} ∨ ¬v_{4})
is
(v_{2} ∨ v_{3} ∨
¬v_{4}). The variable x is called the
pivot. Resolvents are added to the clause
database. If a generated resolvent is ∅ then the formula
has been shown to be unsatisfiable. If no new resolvents can be
generated and ∅ has never been produced, then the formula is satisfiable.
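The resolvent computation itself is one line; here is a sketch using DIMACS-style integer literals (v_i written as i, its complement as -i):

```python
def resolve(c1, c2, pivot):
    # General resolution: pivot appears in c1 and its complement in c2;
    # the resolvent keeps everything else from both clauses.
    assert pivot in c1 and -pivot in c2
    return (c1 - {pivot}) | (c2 - {-pivot})

# The example from the text, with pivot v1:
print(sorted(resolve({1, 2, 3}, {-1, 2, -4}, 1)))   # [-4, 2, 3]
```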
 regular resolution: Choose a variable x and generate
all possible resolvents with x as pivot. Then, without affecting
satisfiability, all clauses containing x or ¬x can
be removed from the clause database. For example, suppose formula
φ contains clauses
(v_{2} ∨ v_{3} ∨ ¬v_{4}) ∧ (¬v_{2} ∨ ¬v_{4} ∨ ¬v_{5}) ∧ (v_{2} ∨ ¬v_{3} ∨ ¬v_{5})
and no other clauses in φ contain v_{2} or
¬v_{2}. Then the resolvents
(v_{3} ∨ ¬v_{4} ∨ ¬v_{5})
and
(¬v_{3} ∨ ¬v_{4} ∨
¬v_{5}) may completely replace the three
clauses above that contain either v_{2} or
¬v_{2}.
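The elimination step just described can be sketched as follows (tautological resolvents are dropped, a standard simplification):

```python
def eliminate(clauses, v):
    # One regular-resolution step: form every resolvent with pivot v,
    # then drop all clauses that mention v or its complement.
    pos = [c for c in clauses if v in c]
    neg = [c for c in clauses if -v in c]
    rest = [c for c in clauses if v not in c and -v not in c]
    resolvents = []
    for p in pos:
        for n in neg:
            r = (p - {v}) | (n - {-v})
            if not any(-l in r for l in r):   # skip tautologies
                resolvents.append(r)
    return rest + resolvents

# The example from the text: eliminate v2 from
# (v2 or v3 or -v4) and (-v2 or -v4 or -v5) and (v2 or -v3 or -v5).
phi = [{2, 3, -4}, {-2, -4, -5}, {2, -3, -5}]
for clause in eliminate(phi, 2):
    print(sorted(clause))   # [-5, -4, 3] then [-5, -4, -3]
```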
In practice, this is implemented as the
Davis-Putnam-Logemann-Loveland (DPLL) procedure using the following rules:
given formula
φ, using the set of sets representation for CNF formulas, and a
variable v in a clause of φ, form
φ_{1} = {c \ {v} : c ∈ φ and ¬v ∉ c}
φ_{2} = {c \ {¬v} : c ∈ φ and v ∉ c}
Then DPLL(φ) = DPLL(φ_{1}) ∨
DPLL(φ_{2})
and DPLL(φ) = true if φ=∅ and
DPLL(φ) = false if ∅∈φ.
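A direct transcription of these rules into Python (a bare sketch; production DPLL solvers add unit propagation, learning, and branching heuristics, as noted below):

```python
def dpll(phi):
    # phi: a CNF as a set of frozensets of integer literals.
    if not phi:
        return True                    # DPLL(phi) = true if phi is empty
    if frozenset() in phi:
        return False                   # DPLL(phi) = false if the empty clause is present
    v = abs(next(iter(next(iter(phi)))))   # pick a variable from some clause
    phi1 = {c - {v} for c in phi if -v not in c}    # branch v = 0
    phi2 = {c - {-v} for c in phi if v not in c}    # branch v = 1
    return dpll(phi1) or dpll(phi2)

cnf = {frozenset(c) for c in ({1, 2}, {-1, 2}, {1, -2}, {-1, -2})}
print(dpll(cnf))                                   # False
print(dpll({frozenset({1, 2}), frozenset({-1})}))  # True
```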
 The most effective SAT solvers today use a combination of the
following:
 Non-stochastic search: based on the operation of slide 7 of
the "Introduction to Satisfiability" lecture (also, regular resolution
shown above). Modern solvers use non-chronological backtracking.
 Learning: new clauses may be added to a clause database
at points where it is fruitless to search deeper.
 Restarts: the search is restarted, keeping what has been
learned from previous starts.
 Efficient data structures: for example, watched literals.
 Some classes of CNF Satisfiability are easy.
 Horn Formulas: every clause has at most one positive
literal. Solved using Boolean constraint propagation (see slide 23 of
the "Introduction to Satisfiability" lecture).
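A sketch of that propagation, assuming the integer-literal clause representation used above (set every variable false, then set a variable true only when some clause forces it; the presentation on slide 23 may differ in details):

```python
def horn_sat(clauses):
    # Propagation for Horn clauses (at most one positive literal each).
    true = set()
    changed = True
    while changed:
        changed = False
        for c in clauses:
            pos = [l for l in c if l > 0]
            # The clause forces something when all its negative literals
            # are falsified and its positive literal (if any) is not yet true.
            if all(-l in true for l in c if l < 0) and \
               not any(l in true for l in pos):
                if not pos:
                    return None        # every literal false: unsatisfiable
                true.add(pos[0])       # the unique positive literal fires
                changed = True
    return true                        # a minimal model: the true variables

print(horn_sat([{1}, {-1, 2}, {-2, 3}]))    # {1, 2, 3}
print(horn_sat([{1}, {-1, 2}, {-1, -2}]))   # None
```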
 Renamable Horn: the polarity of a subset of variables can
be reversed to produce a Horn formula. Such a subset can be found
efficiently.
 2SAT: every clause has no more than 2 literals. Solved by
traversing an implication graph (see pages 24-25 of the "Introduction to
Satisfiability" lecture).
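A simple (if inefficient) version of the implication-graph idea: clause (a ∨ b) contributes the edges ¬a → b and ¬b → a, and the formula is unsatisfiable exactly when some literal and its complement reach each other. Practical solvers do this in linear time with strongly connected components; this sketch uses plain reachability:

```python
from collections import defaultdict

def two_sat(clauses):
    graph = defaultdict(set)
    for c in clauses:
        lits = list(c)
        a, b = (lits[0], lits[0]) if len(lits) == 1 else (lits[0], lits[1])
        graph[-a].add(b)   # falsifying a forces b
        graph[-b].add(a)   # falsifying b forces a

    def reaches(src, dst):
        seen, stack = set(), [src]
        while stack:
            u = stack.pop()
            if u == dst:
                return True
            if u in seen:
                continue
            seen.add(u)
            stack.extend(graph[u])
        return False

    variables = {abs(l) for c in clauses for l in c}
    return not any(reaches(v, -v) and reaches(-v, v) for v in variables)

print(two_sat([{1, 2}, {-1, 2}, {-2, 1}]))               # True
print(two_sat([{1, 2}, {-1, 2}, {1, -2}, {-1, -2}]))     # False
```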
 q-Horn: columns and rows of a 0,+1,-1 matrix representation
of a given CNF formula may be permuted and columns may be multiplied
by -1 to partition the matrix into Horn, 0, non-negative, and 2SAT
quadrants. The Horn quadrant is solved, rows in the non-negative
quadrant that represent clauses satisfied by the Horn solution are
deleted, and the remaining 2SAT quadrant is solved (see page 26 of
the "Introduction to Satisfiability" lecture).
 Non-stochastic search algorithms are based on the operation called
resolution (slide 29 of the "Introduction to Satisfiability" lecture).
Resolution is notoriously ugly on some very commonly occurring formula
types, such as those derived from cardinality constraints (5 < x <
10, max(x,y,z), ...), those derived from linear equations
(x⊕y⊕z=1), and those associated with certain
functions such as multiplication.
 A circuit or software snippet may be faithfully represented by
a propositional formula. A temporal property may also be represented
by a propositional formula. Bounded model checking employs a SAT
solver to determine whether that property holds and under what
conditions. For a summary of Linear Temporal Logic see slides 12 and
13 of the "Introduction to Satisfiability" lecture. The following
summarizes the approach of Bounded Model Checking.
Let Q be a proposition that somehow expresses the operation of a
circuit or some software. This proposition may be a very long CNF formula.
Let P be a proposition that expresses a property we would like to
check that a circuit or software has.
We want to show that if Q is true (circuit behaves correctly)
then P is true (the property holds). Note that the property may hold
even if the circuit is misbehaving. The above can be written as the
logical expression:
Q → P (that is, Q implies P)
which we want to prove is true. This is logically equivalent to
¬Q ∨ P (that is, not Q or P)
This can be proved by contradiction; that is, we show that the complement
of this is false. By De Morgan's law, negation gives
Q ∧ ¬P (that is, Q and not P)
Thus, proposition Q is conjoined with the
complemented P and the resulting formula is passed to a SAT
solver. If the solver returns "unsatisfiable", the property holds.
Otherwise the returned solution probably hints at the reason the
property does not hold. An example of this is two bullets below.
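On a toy instance (a hypothetical one-gate "circuit" Q and property P, made up here for illustration), the Q ∧ ¬P recipe looks like this with a brute-force stand-in for the SAT solver:

```python
from itertools import product

# Q says a one-bit buffer's output equals its input;
# P is the property "if the input is high, the output is high".
def Q(inp, out):
    return out == inp            # the circuit behaves correctly

def P(inp, out):
    return (not inp) or out      # inp implies out

# Search for a model of Q and (not P), i.e. a counterexample.
counterexamples = [(i, o) for i, o in product([False, True], repeat=2)
                   if Q(i, o) and not P(i, o)]
print(counterexamples)   # []: Q and not-P is unsatisfiable, so Q implies P
```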
 Small Bounded Model Checking examples for SAT solvers can be
obtained from links in the "Satisfiability" section of the lecture
notes. Unfortunately, writing correct formulas takes some practice.
Here is a very simple example.
Consider a clocked device with one input and one output. The input and
output are low initially, and the output stays low until the input is
found to be high at the beginning of a clock cycle. After that, the
output stays high.
First define variables. Call the input a and the
output b. Imagine that a changes its value only
midway through a clock cycle and b changes only at the
beginning of a clock cycle. To model time, define variables
a_{1},a_{2},a_{3},...
where a_{i} is the value of input a at
the beginning of clock cycle i. Similarly for b.
Output b goes high if b had been high on the
previous clock cycle or a is high on this clock cycle.
The following constraints express this fact for 5 cycles:
p bdd 20 20
imp(or(b0,a1),b1)
imp(or(b1,a2),b2)
imp(or(b2,a3),b3)
imp(or(b3,a4),b4)
imp(or(b4,a5),b5)
Now add the following constraints which cause input a to be
high at the beginning of cycles 2 and 4:
equ(a2,T)
equ(a3,F)
equ(a4,T)
equ(a5,F)
and initialize the values of the input and output
equ(b0,F)
equ(a1,F)
Now run sbsat on this (say it is in a file called test.bdd):
prompt> sbsat -R r --max-solutions 0 test.bdd
...
b0 a1 *b1 a2 b2 a3 b3 a4 b4 a5 b5
Satisfiable
...
Observe that b1 should be 0 but there is a satisfying assignment where
it is 1. The problem is that we have not forced b to be 0 when its
value on the previous cycle was 0 and a on this cycle is 0.
This can be done by replacing imp with equ.
p bdd 20 20
;  operation
equ(or(b0,a1),b1)
equ(or(b1,a2),b2)
equ(or(b2,a3),b3)
equ(or(b3,a4),b4)
equ(or(b4,a5),b5)
;  init
equ(b0,F)
equ(a1,F)
;  test  set values
equ(a2,T)
equ(a3,F)
equ(a4,T)
equ(a5,F)
Now prove that b is low at cycle 5 if a never
goes high. Replace the test section above with the following:
not(imp(and(equ(a2,F),equ(a3,F),equ(a4,F),equ(a5,F)),equ(b5,F)))
Observe that the proposition expressing the desired property has been
complemented. Running sbsat produces "Unsatisfiable" which proves the
theorem.
 Now check this testable tidbit out. Replace the first equ
in test.bdd with imp and run sbsat. The result is:
b0 a1 b1 a2 b2 a3 b3 a4 b4 a5 b5
Satisfiable
which shows the error in the logic. Next remove the not
and run sbsat to find many satisfying assignments:
b0 a1 b1 a2 b2 a3 b3 a4 b4 a5 b5
b0 a1 b1 a2 b2 a3 b3 a4 b4 a5 b5
b0 a1 b1 a2 b2 a3 b3 a4 b4 *a5 b5
b0 a1 b1 a2 b2 a3 b3 *a4 b4 *a5 b5
b0 a1 b1 a2 b2 a3 b3 a4 b4 a5 b5
b0 a1 b1 a2 b2 a3 b3 a4 b4 *a5 b5
b0 a1 b1 a2 b2 a3 b3 *a4 b4 *a5 b5
b0 a1 *b1 a2 b2 *a3 b3 *a4 b4 *a5 b5
Satisfiable
Now replace the imp with equ (to get the original
working file but without the not) and run sbsat:
b0 a1 b1 a2 b2 a3 b3 a4 b4 *a5 b5
b0 a1 b1 a2 b2 a3 b3 a4 b4 a5 b5
b0 a1 b1 a2 b2 a3 b3 a4 b4 a5 b5
b0 a1 b1 a2 b2 a3 b3 *a4 b4 *a5 b5
b0 a1 b1 a2 b2 *a3 b3 *a4 b4 *a5 b5
Satisfiable
Huh? Lesson: without negating the property to be proved, it becomes hard
to tell whether we have a bug or not! Just to be sure, put
the not back in and run sbsat:
Unsatisfiable
 In the elevator homework there are constraints that force at most
one outer door to be open at a time:
or(not(f1),not(f2))
or(not(f1n),not(f2n))
or(not(f1),not(f3))
...
This pattern is common: anytime there are many possible values for
some problem variable but one value is forced, the pattern above shows
up. This shows that many clauses of a formula modeling some
phenomenon will have 2 literals. Formulas with only 2 literals can be
solved efficiently; this can be used to help solve the given formula.
 Also in the elevator homework there are constraints that force
the outer door on floor X to be open if the elevator door is
open and the elevator is on floor X. These look like this:
or(not(e1),not(ed),f1)
or(not(e2),not(ed),f2)
or(not(e3),not(ed),f3)
...
This pattern is also common because, when modeling, the occurrence of
several events often implies that another event must occur. The
above can be written:
imp(and(e1,ed),f1)
imp(and(e2,ed),f2)
imp(and(e3,ed),f3)
...
which should be clearer. By the way, you may wish to run sbsat on this file:
p bdd 10 10
equ(or(not(a),not(b),c),not(imp(and(a,b),c)))
to show that the or and the imp expressions are
equivalent.
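The same equivalence can also be confirmed with a three-variable truth table in Python:

```python
from itertools import product

# Check that or(not(e1), not(ed), f1) and imp(and(e1, ed), f1) denote
# the same Boolean function, which is what the sbsat file above tests.
def clause_form(e1, ed, f1):
    return (not e1) or (not ed) or f1

def imp_form(e1, ed, f1):
    return (not (e1 and ed)) or f1

assert all(clause_form(*v) == imp_form(*v)
           for v in product([False, True], repeat=3))
print("equivalent on all 8 rows")
```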
 Satisfiability Modulo Theory (SMT) solvers have been developed
because (resolution-based) SAT solvers are notoriously inefficient
on many important classes of problems. Additionally, SMT solvers
support the much more expressive first order logic.
 A first order logic formula has quantifiers (∀,∃),
predicates, functions, and domains. For example:
∀_{S⊂{1…n}}(∃_{S'⊂S} P(S',l) ∨ ∃_{S"⊂~S} P(S",l))
 The semantics of a first order formula are provided by a
structure. A structure lists the symbols associated with predicates
and functions and associates an actual predicate or function with each
symbol. It also defines the domain on which those predicates and
functions apply. Finally, all constants and variables are specified
by the structure. Functions of no arguments are constants.
For the following:
∀_{x}(∃_{y} P(g(y,x),l) ∨ ∃_{z} P(f(z,x),l))
We might have the variable symbols be
V={x,y,z}, the predicate symbols be
P={P}, the function symbols be
F={g,f,l}, and the constant symbols be
C={l,0}. The domain
might be: A is all n-bit natural numbers except
0. The semantics may be given by the following:
g(y,x) = y if x∧y == y, 0 otherwise
f(z,x) = z if ¬x∧z == z, 0 otherwise
P(q,l) = true iff the number of 1 bits in q is l and
those bit positions form an arithmetic progression.
l = 6.
The above semantics cause the formula to represent the question
whether n is a Van der Waerden number for two classes and arithmetic
progression length 6 (this is just an example).
 The important concept in first order logic is validity. A formula
is valid in a structure if it evaluates to true regardless of values
given to (quantified) variables. So, the formula expresses something
that, in a particular world, we can count on as being fact. The above
formula is valid if n is at least 1132. If n is below 1132 there are
some assignments to x,y,z that cause the formula to evaluate
to false.
 A theory is a set of axioms and any formula that is valid due to
those axioms. We start with just some axioms, then try to add as many
formulas to the theory as we can, as long as the new formulas (which we
hope will become facts) are valid.
 But showing that a new formula is valid is the same as showing
that the theory plus the complement of the formula is unsatisfiable
among the quantified variables. This is what is attempted in SMT.
 Some examples of theories that we are interested in are:
 the theory of bit vectors: includes all the usual axioms that
apply to modulo arithmetic.
 the theory of lists: some of the axioms may be as follows:
car(cons(x, y)) = x
cdr(cons(x, y)) = y
¬atom(x) → cons(car(x), cdr(x)) = x
¬atom(cons(x, y))
 the theory of uninterpreted functions. The main axiom here is
if x == y then f(x) == f(y).
 the theory of linear arithmetic: includes the usual axioms of
addition, and of multiplication by constants, over the rationals.
 Assume that there is an effective solver for any theory that we
are interested in. That is, assume there is some procedure which can
determine whether or not a given formula over the symbols of that
theory is valid in that theory.
 An SMT solver aims to prove validity for formulas that contain
mixed theory expressions. There are two things the SMT solver does in
this regard.
 The first thing an SMT solver does with a mixed theory expression
is to add constants, make substitutions of equals for equals, and
add equality constraints that do not change the satisfiability of the
original expression but result in an expression that can be
partitioned into single-theory segments. A simple example is shown
in slides 12 and 13 of the "Nelson-Oppen" lecture. From the theory
that mixes uninterpreted functions, lists, and linear arithmetic, and
φ = {x ≤ y, y ≤ x + car(cons(0, x)), P(h(x) − h(y))}
we want to show that P(0) follows; so we
add ¬P(0). We finish with
φ_{1} = {x ≤ y, y ≤ x + g_{5}, g_{3} = g_{1}  g_{2}, g_{4} = 0}
φ_{2} = {g_{1} = h(x), g_{2} = h(y), P(g_{4}) = false, P(g_{3}) = true}
φ_{3} = {g_{5} = car(cons(g_{4},x))}
Set φ_{1} is in a pure linear arithmetic theory, set
φ_{2} is in a pure uninterpreted function theory, and set
φ_{3} is in a pure list theory. Typically, the resulting
expression is going to have a form like this:
(φ_{11}∧ … ∧φ_{1k}) ∨ … ∨ (φ_{s1}∧ … ∧φ_{sk})
where each φ_{ij} is pure in theory j and is part
of the i^{th} conjunction of such pure terms. The
above example is simple, having only one such conjunction.
 The second thing that the SMT solver does is apply the theory
solvers to the pure formula segments. The theory solvers will make
inferences that may be propagated to other theory solvers. This may
result in new inferences of equality to propagate. The process
repeats until unsatisfiability can be proved or until new inferences
cannot be generated. The procedure for doing all this is illustrated in
slide 14 of the "Nelson-Oppen" lecture. For this simple example,
unsatisfiability results, showing that P(0) follows from
φ.
 The following C code snippet:
u_int_4 a, b;
bool c;
u_int_4 t = a + b;
u_int_4 s = a * b;
if (s == t+1) c = true; else c = false;
looks like this in the Yices language:
(define a::(bitvector 4))
(define b::(bitvector 4))
(define t::(bitvector 4))
(define s::(bitvector 4))
(define c::(bitvector 1))
(assert (= t (bvadd a b)))
(assert (= s (bvmul a b)))
(assert (implies (= (bvadd t 0b0001) s) (= c 0b1)))
(assert (implies (not (= (bvadd t 0b0001) s)) (= c 0b0)))
and the property that c is true given a is
no greater than 7 and no less than 2 and b is no greater
than 3 adds these assertions:
(assert (= c 0b1))
(assert (bvlt a 0b1000))
(assert (bvgt a 0b0001))
(assert (bvlt b 0b0100))
 A Binary Decision Diagram (BDD) is a rooted binary directed
acyclic graph with two leaves labeled T and F. Other nodes are
labeled with variable names, and edges are labeled 0 or 1.
Out of each nonleaf node, one edge is labeled 0 and one edge is
labeled 1. There is an order on the variables of the BDD and
variables encountered along any path from root to leaf obey it
(although not all variables may be encountered).
A single BDD compactly represents a Boolean function.
Every path to F represents a clause (the polarity of each literal is
positive (negative) if 0 (1) is encountered on the edge leading out
of the corresponding variable node). Anding the clauses gives the
function represented by the BDD.
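The clause-extraction rule can be illustrated on a tiny hand-built BDD for a ∧ b (the tuple node layout and names here are illustrative only, not a real BDD package):

```python
from itertools import product

T, F = 'T', 'F'
# A node is (var, low, high): low is the 0-edge child, high the 1-edge
# child.  BDD for f(a, b) = a and b with variable order a < b:
bdd = ('a', F, ('b', F, T))

def clauses_to_F(node, path=()):
    # Each root-to-F path yields a clause: a positive literal for a
    # 0-labeled edge out of a node, a negative literal for a 1-labeled edge.
    if node == F:
        return [frozenset(path)]
    if node == T:
        return []
    v, lo, hi = node
    return (clauses_to_F(lo, path + (('+', v),)) +
            clauses_to_F(hi, path + (('-', v),)))

def eval_cnf(clauses, assign):
    # Literal ('+', v) is true when v is true; ('-', v) when v is false.
    return all(any(assign[v] == (s == '+') for s, v in c) for c in clauses)

cnf = clauses_to_F(bdd)
# Conjoining the path clauses recovers f on every assignment.
for a, b in product([False, True], repeat=2):
    assert eval_cnf(cnf, {'a': a, 'b': b}) == (a and b)
print(len(cnf), "clauses recovered")   # two paths to F, hence two clauses
```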
 A linear equation
v_{1} ⊕ v_{2} ⊕ ... ⊕ v_{n} = 1
is compactly represented as a BDD (slide 4 of the "Slides on Above" in
the BDD lecture notes section). Corresponding CNF formulas are
exponential in size.
 The order of variables in a BDD affects its size (possibly drastically).
 Collections of BDDs may share nodes: each node is placed in a
hash table. The hash table address of a node is determined by the
variable name and the addresses of each of its descendants.
 When conjoining BDDs, the intermediate BDD may become
exponentially large before it is reduced in size to its final form
(see slide 14 of the "Slides on Above" notes).
 A BDD naturally expresses a function
as f(v,x) = (v
∧ h_{1}(x)) ∨ (¬v ∧
h_{2}(x)). But this is satisfiable if and
only if f(x) = h_{1}(x)
∨ h_{2}(x) is satisfiable. Removing a
variable from f in this way is called existentially
quantifying v away from f.
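A sketch of this in Python, with hypothetical cofactors h1 and h2 (chosen only to illustrate the operation) folded into one function f:

```python
from itertools import product

# Existential quantification: (exists v) f  =  f with v=1  or  f with v=0.
# Here h1(x) = x1 or x2 is the v=1 branch and h2(x) = x1 and not x2
# the v=0 branch; both are made-up examples.
def f(v, x1, x2):
    return (v and (x1 or x2)) or ((not v) and (x1 and not x2))

def exists_v(x1, x2):
    return f(True, x1, x2) or f(False, x1, x2)

# f is satisfiable iff the quantified function is satisfiable.
sat_f = any(f(*p) for p in product([False, True], repeat=3))
sat_q = any(exists_v(*p) for p in product([False, True], repeat=2))
print(sat_f, sat_q)   # True True
```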
 In a collection of BDDs, it is easy to existentially quantify a
variable away from the function represented by a BDD if the variable
only appears in that function.
 Existential quantification can reveal inferences (see slide 17 of
"Slides on Above") but it can blur functional relationships (see slides
18 and 19 of "Slides on Above").
 Suppose f and c are two BDDs in a
collection of BDDs (they are to be or can be conjoined). Expand the
truth tables associated with each to accommodate all the variables that
are in both f and c (double every row
of a BDD's truth table for each variable not in that BDD but in the
other BDD and let the entry in the column of the variable be 0 in one
row and 1 in the other row). For every row of c's
truth table that has value 0, we are free to change the value of the
corresponding row in f's truth table to anything we
want. In zero-restrict we set all values to 0. This may or may not
reduce the size of the former f's BDD. It may or may
not reveal inferences.
 More successful is the restrict operation that existentially
quantifies away all variables in c that are not
in f and then does zero-restrict on f.
 Another successful restrict operation is called generalized cofactor.
In this case, the value of a row of f that may be
freely changed becomes the value of the "nearest" row where nearest
is determined by a weighted Hamming distance over a permutation of
variables (see slide 31 of "Slides on Above").
 Choose a BDD c. Replace all other BDDs by their
generalized common cofactor with respect to c. This
allows c to be removed completely from the collection.
 Algebraic methods complement SAT and BDD methods. Operations are
on equations of the form
∑_{0≤i<2^n} c_{i}·v_{1}^{b_{0,i}}⋯v_{n}^{b_{n-1,i}} = 0
where c = (c_{0},…,c_{2^n−1}) is a 2^{n} dimensional
0-1 vector, b_{j,i} is the j^{th}
bit in the binary representation of the number i, and the v_{i}
are 0-1 variables. An example is
v_{1}v_{2}v_{3} + v_{2}v_{3} + v_{1}v_{3} + v_{1} + v_{2} + 1 = 0
 Examples of equations representing functions are given on slide 6
of "Slides" in the Algebraic Methods section of the lecture notes.
 Equations are manipulated according to the following rules:
 Any even sum of like terms in an equation may be replaced by 0.
 A factor v^{2} may be replaced by v.
 An equation may be multiplied by a term; the resulting equation
may be reduced by the rules above.
 Two equations may be added, using mod 2 arithmetic. The new
equation is said to be a new, derived fact and is added to the fact
database.
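These rules can be sketched with polynomials represented as sets of monomials over GF(2): a monomial is a frozenset of variable names, so v·v = v falls out of set union, and an even sum of like terms cancels under symmetric difference:

```python
# A multilinear polynomial over GF(2): a set of monomials, each monomial
# a frozenset of variable names (frozenset() is the constant term 1).
def add(p, q):
    # Rule: mod-2 addition; an even sum of like terms cancels to 0.
    return p ^ q

def mul_term(p, t):
    # Rule: multiply by a monomial t; set union collapses v*v to v, and
    # symmetric-difference accumulation cancels duplicate products.
    out = set()
    for m in p:
        out ^= {m | t}
    return out

one = frozenset()
v1, v2 = frozenset({'v1'}), frozenset({'v2'})

p = {v1 | v2, v1, one}      # the equation v1*v2 + v1 + 1 = 0
print(add(p, p))            # set(): p + p cancels term by term
print(mul_term(p, v2))      # {frozenset({'v2'})}: multiplying by v2 leaves
                            # v2 = 0, i.e. the inference that v2 must be 0
```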
 An example of a derivation using these rules is on slide 7 of the
"Slides".
 Theorem: Using rules stated above, the equation 1=0 is always
derivable from an inconsistent set of multilinear equations
and never derivable from a consistent set.
 A BDD's function may be represented as an equation.
 Theorem: The number of derivations used by the algebraic
solver is within a polynomial factor of the minimum number possible.
 Theorem: the minimum number of derivations used by the
algebraic solver cannot be much greater than, and may sometimes be far
less than, the minimum number needed by resolution.
 By multiplying an equation by a term, inferences may be revealed
(just in the same way that inferences are revealed with restrict in
BDDs). See slide 11 of "Slides" for examples.
 The counterpart to existential quantification in algebra is to
choose a variable in an equation, create one copy of the left side in
which the variable takes value 0 and one copy in which it takes value 1,
and multiply the two expressions together (see slide 12 of "Slides" for
examples).