Thanks to Scott Y’s comments in my article regarding a Sudoku Solver I wrote as part of an AI project, I decided to write a real solver that not only can solve any Sudoku puzzle, but can do it in just a few milliseconds.
Long story short: I represented Sudoku as an exact cover problem, then used Donald Knuth’s Algorithm X and Dancing Links implementation to solve that exact cover problem. Still here? Good, it’s time for me to explain a few things.
(Update October 2013: This project is available on GitHub!)
There exists a type of problem called “exact cover”, and while Wikipedia explains things ever so well, I’ll throw in my own explanation that should serve well at least in the context of Sudoku.
Imagine you are challenged to rearrange where you place your cell phone, keys, and wallet. One item must go in a pants pocket, one item must go in a coat pocket, and one item must go in a backpack—and no two items can be in the same area. Have a solution yet? I hope you said yes. You could solve this a variety of ways: keys in pants, phone in coat, wallet in backpack, etc.
Now imagine you have a hundred pockets, and three hundred different things that must be placed in them in a certain configuration. Now things get a little more difficult. This is because the number of constraints has grown much larger, as has the number of items that could potentially satisfy those constraints. How would you go about solving this issue?
Well, an exact cover approach would do the following: create a list of constraints that must be satisfied, then create a list of potential ways to satisfy each constraint. The list of constraints become the column headers of a matrix, and the list of ways to satisfy each constraint become rows in that matrix. To simplify things, let’s consider the original pocket rearrangement problem:
Our list of constraints: pants pocket filled, coat pocket filled, backpack filled
Our list of possible placements of items: keys in pants pocket, keys in coat pocket, keys in backpack, wallet in pants pocket, wallet in coat pocket, wallet in backpack, phone in pants pocket, phone in coat pocket, phone in backpack.
Let’s look at the matrix we made:
You’ll notice I labeled the columns (constraints) as questions. The question is whether a row satisfies the constraint. Having Keys in our pants pocket satisfies the constraint that our pants should have an item in them, but not the other two constraints. Let’s look at the matrix with the answers filled in:
In reality, ‘Yes’ and ‘No’ are filled with Boolean True (1) or False (0) values:
Remember, this matrix represents our constraints, and possibilities that satisfy those constraints; we haven’t solved the problem yet. To solve the problem we have to “exactly cover” the constraint columns. That is, choose a set of rows so that a ‘1’ appears exactly once in each column.
Exact cover problems are typically represented like this: given a matrix of 1’s and 0’s, choose a set of rows so that every column has a ‘1’ in it exactly once.
BUT! We forgot a few constraints on this item rearrangement problem; we can’t exactly place our keys in all three places at once and call it a day, and we definitely need to place each of the three items somewhere. Let’s add a few constraints to reflect these requirements:
The initial solution I wrote down (pants: keys, coat: phone, backpack: wallet) does exactly cover our matrix, but we did this without even thinking, so how can we tell a computer to do it?
There are a few ways to go about solving this exact cover problem; the simplest (naive) way would be to just randomly pick a set of 3 rows over and over again until a solution is found. We can be a little smarter than that though.
The famous Donald Knuth is credited with formulating an algorithm to solve this exact cover problem, an algorithm he dubbed “Algorithm X”. Again, while the Wikipedia page is incredibly helpful, I’ll offer my own interpretation using the above example matrix.
Returning to the items-in-pockets problem, when we selected keys to go in the pants pocket, we also disregarded the wallet or phone being in that pocket because of the rule that no two items could be in the same area. That is, the rows compete against each other to be part of the solution, and when one row is chosen, its competition is removed. We also disregarded the keys being in any of the other pockets, so other rows where the key gets placed get removed too.
Algorithm X functions on this exact premise: select a constraint to fulfill, try to add an entry into the solution, remove competing entries, then repeat on the remaining matrix until a solution is found or a solution is impossible to find.
Let’s run Algorithm X on our simple problem:
Step 0: The matrix is not empty, so continue
Step 1: Choose a constraint to fulfill (arbitrarily choose Pants Pocket)
Step 2: Choose a row that fulfills that constraint (Keys in pants) and add to solution
Step 3: For each constraint that row satisfies, remove any other rows that also satisfy that constraint (Wallet in Pants, Phone in Pants, Keys in Coat, Keys in Backpack), and remove the fulfilled constraints’ columns (Pants Pocket and Keys Placed)
Step 4: The matrix is not empty, so continue
Step 5: Choose a remaining constraint to fulfill (Coat Pocket)
Step 6: Choose a row that fulfills that constraint (Wallet in coat) and add to solution
Step 7: For each constraint that row satisfies, remove any other rows that also satisfy that constraint (Wallet in Backpack, Phone in coat), and remove the fulfilled constraints’ columns (Coat Pocket and Wallet Placed)
Step 8: The matrix is not empty, so continue
Step 9: Choose a remaining constraint to fulfill (Backpack)
Step 10: Choose a row that fulfills that constraint (Phone in backpack) and add to solution
Step 11: For each constraint that row satisfies, remove any other rows that also satisfy that constraint (None), and remove the fulfilled constraints’ columns (Backpack and Phone Placed)
Step 12: The matrix is empty! We’ve found a solution!
Now that you have a basic understanding of Algorithm X, think about the same problem except you are given that your phone is already in your pants pocket. You perform the same competition-removal, and now your matrix is smaller and easier to manage. This is what Sudoku is like.
Making Sudoku an Exact Cover Problem
Sudoku, like organizing your pockets, can be represented as an exact cover problem. Our matrix will just be a lot bigger.
Sudoku has a few explicit rules: Values 1-9 must be in every Row, Column, and Box; and the numbers may not repeat in any Row, Column, or Box. An implicit rule is that every cell must contain a value.
Using the almighty power of algebra, we can discern that there are 324 constraints. Every row must have 9 values in it, and there are 9 rows (9 * 9 = 81), every column must have 9 values in it, and there are 9 columns (9 *9 = 81), every box must have 9 values in it, and there are 9 boxes (9 * 9=81), and every cell must have a value in it, and there are 81 cells. 81 + 81 + 81 + 81 = 324.
Again, we can use algebra to discover that there will be 729 rows in our matrix, because there are 729 row, column, value triplets (boxes are ignored because box value is a function of row and column values). Row, column, and value triplets represent a single entry into the empty Sudoku puzzle: (row 1, column 1, value 1), ( row 1, column 1, value 2), … , (row 9, column 9, value 9). There are 9 rows, 9 columns, and 9 values in a Sudoku puzzle: 9 * 9 *9 = 729 rows in our exact cover matrix.
Each of these rows also satisfies four constraints: a row-value pair, a column-value pair, a row-column pair, and a box-value pair. That is, (row 1, column 1, value 1) satisfies the constraint that row 1 must have the value of 1 in it somewhere, column 1 must have the value of 1 in it somewhere, row 1 column 1 must have a value in it, and box 1 must have the value of 1 in it somewhere.
A sample Sudoku exact cover matrix can be seen here. Although, not an entirely accurate matrix, it should give you an example of the representation. My own complete matrix is included near the end of this article with the source code of my program.
Now that we’ve turned Sudoku into an exact cover problem, we’re ready to run our Algorithm X on it.
Dancing Links (DLX)
The Sudoku’s exact cover matrix is a little larger than our clothes-items one. Also, it has way more 0’s than 1’s in it. This type of matrix is called a sparse matrix, and if we simply stored the whole thing in a 324 x 729 array, we’d spend an inordinate amount of time searching for 1’s.
So our old friend Donald Knuth created Dancing Links, or DLX. The gist of DLX is that we represent our matrix as a quadruply-linked circular list that only stores the 1’s from our matrix.
Each node in our list has links to the nodes to its right, left, top, and bottom areas in the matrix. It also points to a special column header node.
The graphic below (found at here) depicts what our data structure looks like, though it doesn’t show that there is a link between every node and its column header.
This circular linked list not only removes the task of searching through 0s to find 1s, but is also very efficient for performing removal and re-instatement of nodes because of the simple logic that removing a node from a row simply requires:
Node->left->right = Node->right;
Node->right->left = Node->left;
And reinstating the node requires
Node->left->right = Node;
Node->right->left = Node;
Using DLX with Algorithm X is a very fast way to solve any Sudoku puzzle.
More about Cover and Uncover
The above explanation may not be clear enough as to how we remove constraints and solutions in DLX. Let’s begin with a little review.
A Circular Linked List
A circular linked list is a simple way to represent a list of items, and the very end of the list wraps around to the front. Each node in the list is a data structure that knows which node is to its left, and which is to its right:
How do we remove a node from a linked list? Let’s go through an example. Say we want to remove (cover) node ‘B’ from the above list:
- B tells its left neighbor (A) to connect to B’s right neighbor (C) instead of connecting to B:
- B tells its right neighbor (C) to connect to B’s left neighbor (A) instead of connecting to B:
- Now that B’s left and right neighbors can find each other, we can remove B from the list!
- But if we want to put B back in later (uncover), then we won’t delete it. We’ll just hold onto it somewhere else:
- Notice that B still knows its original neighbors. If we wanted to put it back in, then we’d simply reverse steps 1 and 2. This is how a cover and an uncover would work in a circular list.
Let’s turn our circular linked list on its side:
Now it’s a column instead of a row. Why have I colored the top node green? Because in our DLX implementation, we use special column nodes to represent a constraint. Let’s add a little more detail to that figure:
So this is how we represent a constraint in DLX — a boring old linked list BUT with the additional property that every node has a link to the special column header. In the later figures I’ll omit the link to the column header for presentation reasons, but it’s always there.
Now, in our Sudoku puzzle, we have more than one constraint, so let’s add a few other constraint columns to our figure for realism:
Now we have a series of linked lists representing our constraints. Now we need a way for these lists to reference each other. So what do we do? How about another linked list? A linked list of linked lists:
Each column in DLX is a linked list, as is each row. Here’s what a single node’s data structure looks like:
Nodes in DLX don’t just point left and right, they also point up and down (and to their respective column headers).
When Algorithm X needs to perform covering, it uses the notion of circular linked list covering at its heart. After we choose a row to satisfy a constraint:
- For each 1 in the row,
- Cover the column header node containing that 1, AND
- Cover any other row with a 1 in the same column
If we consider the fact that each column of the matrix itself is its own list, then covering a row is simply removing a node from each column list:
In the figure above, we’re only removing a row. Algorithm X may remove multiple rows per constraint satisfied (plus the column header of the satisfied constraints!).
My implementation is written in C++ on Visual Studio 2008 running on Windows XP. If you ever use this source code, all I ask is that you credit me for it (you don’t even need a link to this article). I used .doc files because they were the only real file upload option WordPress would give me outside of .docx, and I figure more people will find compatibility with .doc files. This project is now hosted on GitHub (https://github.com/gieseanw/SudokuSolver) If you can’t manage to download and view these files then e-mail me at my email address found on my About Me page.
Alternatively, I will keep the old .doc files up for you to download. I really recommend getting the files through Git, as you don’t have to go about opening them in Word or OpenOffice and then copying the code into a .cpp or .h file. Making them super small text, though!
These files are the heart of the program – they implement the dancing links and algorithm x functionality. Puzzles are fed to the program via simple text files at the moment, but feel free to build a GUI around them!
This file controls sizing information about the matrix. My solver works in the general case, not just standard 9×9 Sudoku Puzzles. Commented out in the file is the ability to expand the solver to 10×10, 12×12, or 16×16, though you may need to allow your program to use more memory!
This file simply instantiates and uses a SudokuMatrix, looping to allow the user to solve multiple puzzles per session.
Exact Cover Matrix
This file is simply the binary output of my 324 x 729 exact cover matrix for a 9×9 Sudoku. There’s no row or column headers, though I very heavily document the assumptions I make about it in SudokuMatrix.h
Algorithm X’s little secret is that it’s actually a brute force solver—it doesn’t always choose the next best candidate to be part of the solution. In that regard, every iteration of the algorithm is a guess! It’s just a very, very fast brute-force solver: I was able to solve expert level 16×16 Sudokus in under a second!
Algorithm X is a great candidate to implement recursively: the same manipulations are performed on an object until one of two base cases are met, and if the failure base case is encountered, the algorithm can easily backtrack by simply returning one level up from the recursive stack.
The hardest thing for me to get my mind around was the cover() and uncover() functions that hide and reinstate values into the matrix. At first this was because I misinterpreted how it should’ve worked and accidentally covered the columns of ALL values that were covered. (For example, when we chose keys in the pants pocket, we removed the phone in pants pocket option. Because the phone in pants pocket also covers Phone Placed? I was initially also covering this constraint, when in fact placing keys in pants pocket does not cover it). Then I didn’t realize that the uncover function must work in the opposite order that the cover function did.
I had a lot of help using this site as a reference aside from the Wikipedia articles.
I’m not too sure how long this program took me to write… maybe 20 hours or so. Most of that was spent debugging the cover() and uncover() functions, though! I’m sure you could do it in half the time or less.
I’m all ears for questions or comments; I’ve done the best I could to thoroughly explain the concepts at work here. The source code is available above this section, in case you skipped here to the end.