20ECES228001  Data Structures  Fall 2007 


Due: September 28, 2007
Submit source code for backtracker solver and logic modeller to xxxxx@email.uc.edu
Rationale:
This is a warmup for work we will be doing this quarter. There are numerous ways to solve a Sudoku puzzle and most find solutions quickly. In this homework we will attempt to solve a puzzle by means of backtracking and by transforming to logic constraints and solving with a SAT solver. 
Purpose:
Use classes in a meaningful way, to represent objects. Read input data from a file. Recall backtracking and recursion. Study transformation to logic constraints. 
Background:
The game of Sudoku is played on a 9x9 grid. The grid is divided into 9 3x3
subgrids, three across at the top, three across in the middle and 3 across
at the bottom. At the start of a game some elements (about 30) have
numbers and some (about 51) do not (that is, they are unassigned). A player
must place numbers in unassigned elements. A game is completed and won by a
player if the following constraints are satisfied:
The numbers in dark squares are the initial element assignments. Click on Solve to attempt to complete a solution to the current assignment of numbers. Choose another initial set of numbers by clicking on the File button and selecting a file. Click on Clear to clear only the numbers in the lightly colored squares. Click on Clear All to clear the entire board. To set fixed numbers (dark squares), click on a square, enter a number, and then hit the "Enter" key. Erase by hitting the backspace key. If there is no solution to the problem at hand, numbers will not be shown in the lightly colored squares. Backtracking and Recursion: The following is a general template for solving a problem by means of backtracking. The type of problem solved requires filling an array with values that do not violate constraints. Values are tried for each array elements consecutively. If constraints are not violated, a value is found for another element. If some constraint is violated another value is tried for the current array element. If all values have been tried for the current array element and none satisfy constraints then change the value of a previously assigned array element.
Solve(d) /* Assume I is a global structure containing initial values. */ /* V is a global array that will be filled with values by this procedure. */ /* Assume V is initialized, if necessary. */ /* d is the number of array elements that have not been set. */ If d == 0 then output V; /* All elements have values, constraints satisfied */ For each possible value t for array element V[d]: V[d] = t; If constraints are satisfied by assigned elements of V then Solve(d1); Unassign V[d]; Return; Logic modelling: problems may be modelled with 01 valued variables and binary logic operators including 'or' (∨), 'and' (∧), 'exclusiveor' (⊕), 'not' (¬), 'equivalent' (≡), among others. Logic expressions may consist of subexpressions which are delimited with parentheses. Evaluation of a logic expression is from innermost subexpression out. For example, consider the following logic expression: 
Analysis:
We have to decide on a file format, an internal data representation, and
finally how the constraints are to be implemented.
File Format: The simplest format is to have the file store 0s for unassigned elements, and numbers from 19 for the fixed elements. Since we will read these numbers consecutively, it is best to store them in row major order. Since we will want to edit input files without the use of special editing software, the file should be a straight text file. The following illustrates a perfectly fine format for our purposes: 0 6 0 1 0 4 0 5 0 0 0 8 3 0 5 6 0 0 2 0 0 0 0 0 0 0 1 8 0 0 4 0 7 0 0 6 0 0 6 0 0 0 3 0 0 7 0 0 9 0 1 0 0 4 5 0 0 0 0 0 0 0 2 0 0 7 2 0 6 9 0 0 0 4 0 5 0 8 0 7 0where we have put in some double spacing for readability. Data Representation, Backtracking: The backtrack template above aims to fill array elements in order. Therefore, it is reasonable to represent all 81 board positions in an array of 81 elements. We interpret each array as follows: 0 means unassigned, 19 is a value that is for the corresponding board position. We can use the following C++ declaration to hold this array: int *board = new int[81];Thus, board[10] is the second column (that is, column 0) in the second row (that is, row 1), board[24] is the seventh column (column 6) of the third row (row 2), and so on. Given an index into the array board we can find the corresponding row and column in C++ as follows: row = index/9; column = index%9;Given a row and column the corresponding index into array board is index = 9*row + column;We aim to use both conversions because it is easier for the solver to have the data stored as an array and easier to implement constraints in terms of rows and columns. Data Representation, Logic: Many more variables are needed here than for backtracking since each variable can only take values 0 and 1. For the ith row and jth column we can use another 9 variables. The total number of variables will then be 9*81 = 729. Number the elements of each of the 3x3 subgrids as follows: 0 1 2 3 4 5 6 7 8Give each subgrid an identity as follows: 0 1 2 3 4 5 6 7 8For purposes of discussion let's identify each variable with three parameters and intended meaning as follows: w[i,j,k] = 1 if the value of the jth element of the ith subgrid has value k w[i,j,k] = 0 otherwiseAs for the case of the backtracking representation we will use an array board to store all these variables. The conversion from i,j,k to array index is given by index = 81*i + 9*j + kThe conversion from index to i,j,k is as follows i = ((index1)/81)%9; j = ((index1)/9)%9; k = (index1)%9;We will also have to worry about the output format because the result of the logic program will be to generate a file that will serve as input to a Satisfiability solver. The format we need is exemplified by the following: p cnf 729 10317 1 2 3 4 5 6 7 8 9 0 10 11 12 13 14 15 16 17 18 0 19 20 21 22 23 24 25 26 27 0 28 29 30 31 32 33 34 35 36 0 37 38 39 40 41 42 43 44 45 0 46 47 48 49 50 51 52 53 54 0 55 56 57 58 59 60 61 62 63 0 64 65 66 67 68 69 70 71 72 0 ... 3 4 0 3 5 0 3 6 0 3 7 0 3 8 0 3 9 0 4 5 0 4 6 0 4 7 0 4 8 0 4 9 0 ...where the first line always begins with "p cnf" and has parameters which are the number of variables (729 in this case) and the number of constraints (10317 in this case). The constraints all follow the first line and are expressed one constraint per line of file. All but about 30 of the constraints are the same for all problems as will be discussed later. The ones that are fixed will be given to you. Hence all you have to do is append the 30 or so constraints that represent the particular puzzle you are trying to solve and change the 10317 accordingly. Each constraint ends in 0  that tells the solver that there are no more variables in the constraint. The numbers in the constraints are the indices into the board array, plus 1 (0 is not allowed because it is a terminating character). If a '' preceeds a number that means the variable is negated. Constraints, Backtracking: Each of 9 subgrids has to be checked to make sure all assigned values are distinct. For each subgrid there should be two nested for loops to traverse all rows and columns like this: for (int r=imin ; r < imin+3 ; r++) { for (int c=jmin ; c < jmin+3 ; c++) {where imin and jmin are the starting row and column numbers for the subgrid. For example, the upper left subgrid starts at imin=0 and jmin=0. Assign a counter to each number from 1 to 9. Initialize each counter to 0. As each of 9 elements is considered increment the counter associated with the value of the element. If any count goes past 1, there is a violation and backtrack must commence. Each of 9 rows must be checked similarly. This time there is just one for loop, spanning all the rows of a particular column. Assign counters as before to the numbers from 19 and increment as those number are encountered in the loop. Repeat the above for columns. The code is nearly the same. Constraints, Logic: The 10287 fixed constraints, which comply with the conversions between i,j,k and w array index as stated above, are given in the file sudoku.cnf. These constraints are all logical 'or' subexpressions. Since they are given in the file you do not have to construct them yourself. But the interested reader may look at this page to see how they are constructed. You need to append to that file one constraint for every fixed number that is expressed in the input file. For example, if the 0th element of the 2th subgrid has fixed value 4 we need to fix the value of board[2*81+0*9+4] to 1. The constraint added to the output file is: 167 0Remember to also calculate the number of constraints and prepend the header to the output file as well. 
Homework Poblem:
Write a Sudoku solver that is based on backtracking. The solver takes a file name as input. The file contains all the fixed numbers for a given puzzle. File details are given below. The solver outputs to console the solution to the given puzzle. Output details are given below. Write a logic modeller that takes the same file as input and outputs a file that may be solved by a SAT solver. Format of the output file is given above. You are welcome to try your output files on a SAT solver such as this one. 
Requirements:
#include <time.h> // Be sure to include this header #include <iostream> using namespace std; int main () { clock_t t1 = clock(); // absolute time in microseconds for (int i=0 ; i < 1000000; i++); clock_t t2 = clock(); // absolute time in microseconds for (int i=0 ; i < 10000000; i++); clock_t t3 = clock(); // absolute time in microseconds cout << "Time: 1st loop=" << (t2t1) << ", 2nd loop=" << (t3t2) << "\n"; }Compile using g++ tst.cc o tst then run using ./tst. The output might look something like this
Time: 1st loop=10000, 2nd loop=80000which means 0.01 seconds for the first loop and 0.08 seconds for the second loop.
#include <iostream> using namespace std; int main () { getchar(); int *a = new int[250000]; getchar(); }Compile this using g++ tst.cc o tst. Then run it from the command line using ./tst. Open another shell and execute ps v  grep tst. You will see something like this:
4865 pts/4 S+ 0:00 0 2 7085 848 0.1 tst 4872 pts/6 S+ 0:00 0 80 56695 620 0.0 grep tstThe first line shows the resources used by tst. The column showing 848 displays memory in residence used (in KB) and the column showing 7085 displays virtual memory used (in KB). The total used for our purposes will be the sum of the two. In the first shell hit return (advancing to the next getchar()) and in the second shell type ps v  grep tst again. You might see something like this:
4865 pts/4 S+ 0:00 0 2 8065 896 0.1 tst 4877 pts/6 S+ 0:00 0 80 56695 620 0.0 grep tstObserve that (8065+896)(7085+848) = 1028 KB = 1003906 Bytes. Which is approximately the space added to the program by int *a = new int[250000]; since each int takes 4 bytes. The RAM usage we are looking for is the difference of the RAM used just before temination and the RAM used at entry. Hence a getchar() at the beginning and end of the program is enough to determine RAM.
class Sudoku { int *board; // Board values public: Sudoku () { // Constructor board = new int[81]; for (int i=0 ; i < 81 ; i++) board[i] = 0; } void readFile (char *filename) { ... } // Pretty print number in row i, column j of the board void printCell(int i, int j) { ... } // Pretty print the entire board void printBoard () { ... } // A number has just been placed in row i, column j. If all columns // rows and squares do not violate contraints then return true, otherwise // return false. bool check (int i, int j) { int count[9]; // To determine whether there is a violation for (int i=0 ; i < 9 ; i++) count[i] = 0; // Check small square ... // Check row ... // Check column ... } void solve (int d) { ... } }
The above can be used in main as follows:
int main (int argc, char **argv) { if (argc != 2) { cerr << "Usage: " << argv[0] << "\n"; exit(0); } Sudoku *game = new Sudoku(); game>readFile(argv[1]); game>solve(81); game>printBoard(); }