Download your copy of 3SAT, the 3SAT problem generator and solver 3sat-0.8.1.tar.gz, by, John Haskins, Jr. today!


What is 3SAT and where did it come from?

3SAT is a program to decide the satisfiability of circuits expressed in 3-variable conjunctive normal form, and an accompanying arbitrary precision integer library. I coded it purely as an exercise, to keep my coding, analytical, and problem solving faculties sharp.

What is 3sat-0.8.1.tar.gz and how do I use it?

3sat-0.8.1.tar.gz is a tar-ed, gzip-ed archive, containing the source files for building the 3SAT problem generator and solver, including a simpler arbitrary-precision integer arithmetic library (to facilitate problems with more than 64 input variables). The source files should easily compile and run on many different flavors of UNIX and UNIX-like systems (e.g., Linux, FreeBSD, NetBSD, MacOS X, OpenBSD, BSDI, NeXT, BeOS, IRIX, OSF/1, Mach, Solaris, SunOS, AIX, HP/UX). The compilation procedure described in the Makefile has been tested with GNU's versions of the C compiler on Fedora, Ubuntu, and MacOS X.

To unpack the archive, gunzip and un-tar as in the example below:

All files should be deployed into a directory named 3sat-0.8.1. To build the executables,

Voila! 3SAT is ready to go; type:

at the command-line, where "-i" instructs 3sat to pull the 3SAT problem instance from the file expr1 (which is included in the 3sat-0.8.1.tar.gz archive). Once the problem is loaded, 3sat will immediately begin searching for a solution to determine the problem's satisfiability.

Alternatively, type:

at the command-line, where 96 is the number of variables, and 1024 is the number of clauses in the problem instance, and 3sat will generate a random 3SAT problem with 1024 three-variable clauses chosen from among 96 variables. If keeping the randomly-generated problem is desired, simply enter "-o filename" at the end of the aforementioned command-line; this file can later be used in conjunction with the "-i" option described above.

As the solver executes, it will silently test input variable configurations. To obtain the status of the solver's progress, you can send a SIGUSR1 signal to the solver process. From a separate window, get the process ID of the solver:

Then, find out the signal number for SIGUSR1, and send the signal:

In the window where the solver is running, you should see somethng like the following:

The string "000000000003990b2996000000000000" is a hexidecimal-formatted bitmap of the variable inputs presently being tested by the solver. Since there are 96 bits' worth of variables to test, the solver must test bit patterns from 00000000000000000000000000000000 through 00000000ffffffffffffffffffffffff; hence, the solver has so far tested less than 1% of the possible variable configurations.

Naturally, the probability of finding a satisfying bit pattern increases as the number of clauses is reduced, e.g.:

In this sample run, the program begins by telling the maximum number of non-repeating 3-variable clauses that can be formed from 32 variables: 23 · binom(32,3)= 39,680. Then, it invokes the getrusage system call to obtain the amount of time (in microseconds) spent generating the circuit; the first number is the amount of time spent by the CPU on the process's behalf in user space, and the second number is the amount spent in the kernel. Finally, the solver does its work, reporting satisfying bit patterns as it encounters them, and reporting the amount of time spent processing the 232 bit patterns.

Tell me about 3SAT's algorithm.

The algorithm is actually not very sophisticated at all. Starting with the all-zeros bit pattern, the solver tests each pattern against the circuit. If the circuit output evaluates to 0, the solver moves on to the next bit pattern; if the circuit output evaluates to 1, the solver reports the bit pattern, then moves on to test the next bit pattern. Since there are exponentially many bit patterns, the solver tries where it can to skip those configurations that would obviously cause the circuit to evaluate to 0.

The question of obvious 0 evaluation is a matter of ongoing research, but the approach used in this solver is to decide which of the circuit clauses caused the 0 evaluation, and to increment the bit pattern counter by some amount greater than one. Consider, if you will, that each circuit in a 3SAT problem consists of a set of clauses AND'd together:

C1 · C2 · C3 · ... · Cm

where each clause consists of a set of 3 variables OR'd together:

Cj = (vc + vb + va)

Each variable may be optionally inverted, changing the sense of its value during the clause's evaluation.

The solver evaluates the circuit by traversing the list of clauses in order, from C1 to Cm. If a clause evaluates to 1, the solver proceeds to the next clause, since there is still a chance that the entire circuit will evaluate to 1; if a clause evaluates to 0, the solver does not proceed to the next clause, since there is no chance that the entire circuit will evaluation to 1. In the latter case, let the 0-evaluating clause be (v83 + v47 + v29), and recall that there are 96 variables in the circuit: v1 through v96. Further, let the present evaluation bit pattern be represented by binary integer B. If the present evaluation bit pattern yields 0, then so will all 228 other possible configurations of v1 through v28, featuring variables v83, v47, and v29 set as they are in B. And since the solver processes the bit patterns in ascending order, it is safe to "skip" the evaluation of these bit patterns.

In other words, think of B as composed of bit patterns B96..29 and B28..1. "Skipping" unnecessary evaluations is achieved by incrementing B96..29 by one, setting all the bits of B28..1 to zero, and concatentating them to yield the new B. The circuit evaluation loop may now be rejoined, starting from clause C1, having saved ourselves as many as 228 needless evaluations.

An additional savings is realized by sorting the clauses in descending order, after the random circuit generation, and prior to the ascending order bit pattern evaluation loop. Also, considerable savings were also realized by short-circuiting clause evaluations. Consider again, the clause template:

Cj = (vc + vb + va)

If the variable vc evaluates to 1, then---because the variables' outcomes are logically OR'd together---the entire clause Cj will evaluate to 1; hence it is unnecessary to spend any further time evaluating vb and va. Similarly, if vb evaluates to 1, then evaluation of va can be safely skipped. This short-circuit evaluation was added to version 0.7.4, and cut running time relative to version 0.7.3 by about 45%.

Tell me about the arbitraty precision integer library.

I wanted the arbitrary integer precision library to be very light-weight, and built it from scratch as an exercise. (Gotta stay sharp!) The library defines a type integer_t, which is a simple structure that gives the length (in bytes and bits) of the integer, and supplies a pointer to an array of unsigned char bytes that contain the number itself. To manipulate these, the library supplies a series of functions to add, subtract, AND, OR, XOR, invert, compare (i.e., greater than [or equal], less than [or equal], or equal), and shift integer_t objects. Internally, the library keeps a stack of integer_t objects as a rapid-access cache of temporary storage for use by the arithmetic routines.

And, as you've probably already guessed, I named the library, and indeed all the routines with the jhjr_ prefix strictly for my own ego's sake: jhjr → JHJr → John Haskins Jr.

What are the known glitches in 3SAT?

The source code default caps the maximum number of variables to 512 in 3sat.c file:

This can be increased to whatever value desired.

What are the author's plans for the future of 3SAT?

I have a few half-baked ideas to improve the solver algorithm by juggling the variables, according to their relative proportionalities in the circuit... more on this later as it becomes clearer to myself.

I would also like to make the algorithm more intelligent with respect to caching. If the entire circuit fits inside in the level-1 data cache, the code runs amazingly fast, even for a very large number of input variables. If not, then even for a smaller number of input variables, this program will take a long time to complete. Recall that the algorithm works by processing each bit pattern on each clause in sequential order; if the clause evalutes to 1, the algorithm proceeds to the next. But if the next clause is not already in the L1 data cache, it will miss, and commence probing the L2 cache for the next clause. This quickly becomes very expensive as the number of clauses in the circuit increases.

Consider, for instance, a CPU with a 32kB L1 data cache, and a circuit whose representation in memory occupies 64kB. If the algorithm exhausts all clauses extant in the L1 data cache, requests for latter clauses will cause capacity misses: the new clauses will be brought in, displacing others. Later, when the algorithm repeats the process with the next bit pattern, it will begin from the first, probably displaced clause, triggering an L1 miss... Because the circuit's representation takes twice as much space as is available in the L1 data cache, it can easily set up a flip-flopping, back-and-forth oscillation of the first and latter halves of the circuit between the L1 and L2 caches. This problem becomes increasingly likely as the number of clauses increases, and wastes CPU cycles terribly. I have an idea how to address this... more later.

How do I contact the author?

I can be contacted via e-mail at predator@cs.virginia.edu, but I cannot guarantee a timely response.

I would be very interested in feedback from users of the solver, and/or the integer library. Wha'd'ya think?

Notes and final comments...

In addition to the sample problem file, expr1, the archive also contains a file with all the satisfying solutions to that problem: expr1.solutions. The latter was collected with the following command-line:

I have found it useful to compare the results from sundry tweaks to the solver engine by hashing the results output by 3sat, and comparing that to the hash of expr1.solutions, e.g.:

Since I trust the results in expr1.solutions, and the hash produced by 3sat matches its hash, I know that the solutions generated by 3sat are correct.

Also: I know that there are dozens more sophisticated methods for solving 3SAT problem instances. This whole project was mostly an exercise for myself to keep my programming, analytical, and problem solving skills sharp. Put another way: There are countless ways to improve this program. Why not give it a try... and send me a pointer to the result.

Legalese...

Neither I, the author nor the University of Virginia guarantee the suitability of 3SAT for any purpose, that the executables or any other part of the distribution be stable, correct or bug-free. 3SAT is free software, offered in the hope that others will find it useful and elucidating. Use at your own risk.

Latest version: 6 October 2009


3SAT Problem Generator and Solver. Copyright (C) 2008,2009 John Haskins, Jr.