User:Hakerh400/How to solve Slitherlink using SAT solver
Slitherlink (also known as Fences, Takegaki, Loop the Loop, Loopy, Ouroboros, Suriza and Dotty Dilemma) is a logic puzzle which is played on a rectangular grid. See details on wikipedia. It can be played online. There are many solution methods (the wikipedia article covers only a few of them), but since the game is NP-complete, no polynomial-time method can solve all puzzles (assuming P ≠ NP).
The easiest way to solve a Slitherlink grid is to use a SAT solver. Modern SAT solvers read the set of constraints as a CNF formula in DIMACS format. CNF stands for Conjunctive normal form. Constraints are represented as a conjunction of clauses where each clause represents a disjunction of one or more variables. Some SAT solvers require that each clause has at most three variables (also called 3-SAT). In that case, Tseytin transformation can be used to convert from arbitrary CNF to 3-SAT CNF.
Assume we have two constants and , representing the width and the height of the grid, respectively. The number of lines is . We have variables (one boolean value for each line). We may also have additional variables, but we do not care about them at the end. We want to construct a CNF, so that when we feed it to a SAT solver, the solver produces an assignment of variables that represents the solution of the puzzle.
Terminology: we say that each line is black (marked as a part of the loop), or white (otherwise).
Slitherlink has the following constraints:
- At least one black line exists
- No dangling lines
- No branching lines
- No crossing lines
- Around each numbered square, there are exactly that many lines
- All black lines are connected in a single loop
At least one black line exists
This rule is trivial. We simply assert that the disjunction of all lines is true.
No dangling, branching, or crossing lines
This can be achieved by iterating over all vertices and for each vertex adding an assertion that the number of lines around it is either 0, or 2.
Around each numbered square, there are exactly that many lines
Iterate over all numbered squares and for each of them add the appropriate assertion. It can be achieved by either adding a disjunction of all possible combinations (1 combination for 0, 4 comb. for 1, 6 comb. for 2, 4 comb. for 3 and 1 comb. for 4), or by allocating a new counter variable and conditionally incrementing it for each line and then adding an asserton that it is equal to the given number (in that case you would also need to implement increment on constant-sized bit-vectors).
All black lines are connected in a single loop
All constrants so far were easy. But how to achieve this one? This constraint is the key part of the game and we cannot avoid it. But how would we represent it as a CNF formula? There were some attempts in the past to do so, but many of them were failuers. For example, the author of this Slitherlink solver says:
- The above doesn't rule out multiple loops as a solution. I can't think of a practical way to specify that constraint. — Douglas W. Brown
Their solution calls SAT solver multiple times and if the solver outputs multiple loops, it adds extra assertions that forbid all of them and then call solver again. Most of the times it works well, but deliberately crafted puzzles can trigger exponential number of solver invocations.
Also, the solver by Simon Tatham uses very complicated logic and it is much slower than modern SAT solvers.
So, how to actually represent the constraint of a single loop as a CNF? There are actually multiple methods, and we show the two most performant ones.
Method 1: Cellular automaton
See what a Cellular automaton is. Basically, the idea is to locate any black line and put a cellular automaton in it. Then let it spread to adjacent lines, generation by generation. After enough number of generations, we assert that the automaton cells consumed all black lines.
First, we need to calculate the maximal number of generations. It cannot be more than (see above the definition of ). So just to be safe, we can define boolean matrices, each of width and height . Then we can add assertions regarding successive generations. Finally, we assert that the last generation has cellular automaton cells exactly where the black lines are in the grid.
This approach turns out to be pretty fast in practice. However, the number of variables and clauses grows very fast when you increase the size of the grid. For larger grids, it renders intractable.
Method 2: Path distance
The second method that we experimented with is the following. Instead of allocating matrices, we allocate just a single matrix of width and height . However, each cell of the matrix has bits. So, each cell is -sized bit-vector and we interpret them as unsigned integers.
Each number in the matrix corresponds to a line in the grid. Assert that there is eactly one black line whose corresponding number is 0. That line is called "the starting line". All other lines have a non-zero distance from it, by following the loop path. For each black line whose number is non-zero, we assert that there exists an adjacent black line whose number is smaller by 1 (for this you need to implement bit-vector decrementing). This way we make sure that from each line it is possible to reach the starting line in a finite number of steps.
This turns out to be faster for large grids, but slower for small grids (compared to the Method 1). However, it produces significantly less auxiliary variables and therefore can even be applied to large grids.
It is definitely possible to convert Slitherlink rules to a pure CNF formula and it is at least as performant as other Slitherlink solvers. This method can also be extended to generate a grid that satisfies the rules and has exactly one solution. We managed to generate extremely hard puzzles using this method (see for example the image on the right).
Both methods are implemented and published at our GitLab. However, it is spread across multiple repositories and it is not well-documented.