Solving the “Miracle Sudoku” in Prolog (benjamincongdon.me)
276 points by andrelaszlo on May 25, 2020 | 34 comments

Very nicely done, thank you for sharing this!

A CLP(FD/ℤ) solution such as this one has two parts: First, the relevant constraints are posted. Second, a search tries to find concrete solutions. In general, a search is necessary because the constraints by themselves are not sufficient to deduce the unique solution as that would be computationally prohibitive.

You can therefore influence the speed of the logic program in two categorically distinct ways: First, you can post different constraints or try a different formulation altogether. Second, you can try different search heuristics that try to find concrete solutions more efficiently.
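As a toy illustration of those two parts (a plain-Python sketch of the general idea, not how a real CLP(FD) engine is implemented): propagation shrinks the variables' domains, and a search branches over the remaining candidates. The first-fail heuristic is simply "branch on the variable with the smallest remaining domain":

```python
def search(domains, propagate):
    """Constrain-and-search skeleton: propagate, then branch (first-fail)."""
    domains = propagate({v: set(d) for v, d in domains.items()})
    if any(not d for d in domains.values()):
        return None                        # dead end: some domain emptied
    if all(len(d) == 1 for d in domains.values()):
        return {v: min(d) for v, d in domains.items()}
    # First-fail heuristic: branch on the smallest unassigned domain.
    var = min((v for v in domains if len(domains[v]) > 1),
              key=lambda v: len(domains[v]))
    for val in sorted(domains[var]):
        trial = {v: set(d) for v, d in domains.items()}
        trial[var] = {val}
        sol = search(trial, propagate)
        if sol is not None:
            return sol
    return None

def all_different_fc(domains):
    """Weak (forward-checking) propagation for one all_different constraint:
    remove each assigned value from all the other domains."""
    changed = True
    while changed:
        changed = False
        for v, d in domains.items():
            if len(d) == 1:
                (val,) = d
                for u in domains:
                    if u != v and val in domains[u]:
                        domains[u].discard(val)
                        changed = True
    return domains

doms = {'x': {1, 2}, 'y': {1, 2}, 'z': {1, 2, 3}}
print(search(doms, all_different_fc))  # {'x': 1, 'y': 2, 'z': 3}
```

Swapping in a different `propagate` function or a different branching rule changes the speed without touching the problem model, which is the point made above.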

Regarding constraint propagation, there is a clear trade-off between strength and efficiency of propagation. For instance, the all_distinct/1 constraint that is used in the sample solution propagates very strongly, and is therefore much slower than the weaker all_different/1 constraint, even though they are completely equivalent from a semantic perspective: Both are true if and only if the given integers are pairwise distinct. It depends on the problem at hand whether the stronger propagation pays off, or whether the weaker propagation is in fact sufficient to quickly find solutions.

Especially for very simple tasks such as Sudoku, quick and weak propagation may well outperform strong and slow propagation.

In this concrete example, I easily achieved a 20-fold speedup by replacing the all_distinct/1 constraint with all_different/1, and using the ff labeling strategy to search for concrete solutions. ff means first-fail, and is often a very good strategy. To get this, I simply textually replaced all occurrences of all_distinct with all_different in the source code, and posted:

    ?- problem(1, Rows),
       append(Rows, Vs),
       labeling([ff], Vs),
       maplist(portray_clause, Rows).
Also, in less than 5 minutes, I just produced all 88 solutions for problem 3, which is shown at the end of the article. This is due only to using less powerful propagation and a better search heuristic.

And this is one of the key attractions of constraint logic programming (CLP): You can relatively easily try different search strategies while keeping the model the same.

In Hakan's very nice Picat solution, you see that a specific search heuristic is used, specified as:

    solve([ffd,down], Vars).
In addition, Hakan's program uses all_different/1. For a fair comparison between the solutions, constraints with the same propagation strength and the same search heuristic must be used in both approaches.

This comment was a joy to read right after finishing "Out of the Tar Pit" [1]. This bit is one of the critical design principles that article was advocating for:

> You can relatively easily try different search strategies while keeping the model the same.

[1] https://github.com/papers-we-love/papers-we-love/blob/master...

Thank you for your explanation. However, as someone who has merely dabbled in Prolog (I've written less than 2000 lines of Prolog in total), I still don't understand the distinction between all_different/1 and all_distinct/1.

Even the documentation for all_different/1 https://www.swi-prolog.org/pldoc/man?predicate=all_different... asks the user to consider using all_distinct instead. I'm afraid I just don't have enough of a mental model of how CLP(FD) works to understand what is meant by "stronger" or "weaker" propagation.

Some terminology: a propagator is an implementation of a constraint, and is used to remove values from variable domains, by deducing that they are in no solution. In general, propagation strength is about how much a propagator can deduce, that is, how many values it can remove while still not removing a value that is in some solution to the constraint.

As an example, consider the variables

    x=1, y in {2, 3}, z in {2, 3}, w in {1, 2, 3, 4}
with the constraint all_different(x, y, z, w) added (note here that I will use that name for the constraint, and refer to the propagation strength of a propagator).

With value-propagation (sometimes called forward-checking), a propagator would deduce the new domains

    x=1, y in {2, 3}, z in {2, 3}, w in {2, 3, 4}
by simply removing assigned values from other variables' domains. However, there are stronger propagators for the constraint. In particular, there is a well-known and reasonably efficient (O(n^2.5)) propagator that is domain-consistent (also known as GAC or generalized arc consistent). That means that it can always remove _all_ values that are in no solution from the variables' domains. The above example with that propagator would deduce

    x=1, y in {2, 3}, z in {2, 3}, w=4
The reasoning is based on something called Hall sets. In the above example, y and z form a Hall set of size 2, since they must take the values 2 and 3 in any solution. There are also other propagators with different propagation strengths. For all_different, there are also bounds-consistent propagators that can do more than value propagation, but only do reasoning based on the bounds of the variables' domains.
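To make the difference concrete, here is a small Python sketch (not Prolog, and only an illustrative simulation: the Hall-set check below is a naive exponential scan over subsets, not the O(n^2.5) matching-based algorithm) of the two propagation levels on the example above:

```python
from itertools import combinations

def value_propagate(domains):
    """Forward checking: remove each assigned value from the other domains."""
    changed = True
    while changed:
        changed = False
        for v, d in domains.items():
            if len(d) == 1:
                (val,) = d
                for u in domains:
                    if u != v and val in domains[u]:
                        domains[u].discard(val)
                        changed = True
    return domains

def domain_propagate(domains):
    """Naive Hall-set reasoning: if k variables together span exactly k
    values, those values can be removed from every other domain."""
    changed = True
    while changed:
        changed = False
        names = list(domains)
        for k in range(1, len(names)):
            for subset in combinations(names, k):
                union = set().union(*(domains[v] for v in subset))
                if len(union) == k:
                    for u in names:
                        if u not in subset and domains[u] & union:
                            domains[u] -= union
                            changed = True
    return domains

doms = {'x': {1}, 'y': {2, 3}, 'z': {2, 3}, 'w': {1, 2, 3, 4}}
value_propagate(doms)
print(doms['w'])   # {2, 3, 4}: only the assigned value 1 was removed
domain_propagate(doms)
print(doms['w'])   # {4}: the Hall set {y, z} over {2, 3} prunes further
```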

I think that SWI Prolog uses the all_different name for value propagation, and all_distinct for domain propagation. In the CP system Gecode that I'm most familiar with, we use an argument to the distinct-constraint (same constraint, different name) to indicate the desired level of propagation (https://www.gecode.org/doc-latest/reference/group__TaskModel...).

Thanks! This comment was super helpful. :) I made a couple updates to the post with these findings.

I suspected something like the `ff` labeling approach would improve the solution, but I had no idea how much of a difference `all_different` made vs. `all_distinct`!

I recently went through a whole bunch of your Power of Prolog videos [0]; solving and generating such sudokus was the first thing I thought about when watching the original video of the guy solving the Miracle Sudoku.

Thanks for those videos!

[0] https://www.youtube.com/channel/UCFFeNyzCEQDS4KCecugmotg/fea...

The OP says there are 6x10^21 Sudoku puzzles.

If we remove rotations and reflections, there may be (far) fewer unique Sudoku puzzles than we think. In fact, there are many sub-reflections that are essentially the 'same' puzzle:

Take the first three rows as a block and 'rotate' it to the bottom of the puzzle - it's still 'solved'. In fact, permute the blocks of three rows in any order; it's still the 'same' puzzle in a sense.

Same with the first three columns.

Same with the three rows or columns in a block of three - permute them, the solution is 'unchanged'.

Relabel the digits from 1-9 as 9-1 - another 'reflection' which is the same puzzle pattern.

In fact, permute the digits in all different ways - it's still solved.
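That relabelling observation is easy to check mechanically. A quick Python sketch (the shifted-pattern grid and the checker are just illustrative helpers, not from the article): since the Sudoku rules only say "all different", any bijection on the digits 1..9 maps a solved grid to another solved grid.

```python
import random

def classic_valid(g):
    """Check the classic rules: rows, columns, and 3x3 boxes contain 1..9."""
    rows = g
    cols = [[g[r][c] for r in range(9)] for c in range(9)]
    boxes = [[g[br + i][bc + j] for i in range(3) for j in range(3)]
             for br in (0, 3, 6) for bc in (0, 3, 6)]
    return all(sorted(u) == list(range(1, 10)) for u in rows + cols + boxes)

# A solved grid built from a simple shifting pattern (valid for classic rules).
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
          for r in range(9)]
assert classic_valid(solved)

# Relabel the digits with a random permutation: still a solved grid.
perm = list(range(1, 10))
random.shuffle(perm)
relabelled = [[perm[v - 1] for v in row] for row in solved]
print(classic_valid(relabelled))  # True
```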

I'm not at all sure there isn't actually just one unique pattern to sudoku. But I don't know how to prove that one way or another.

In 2005, Ed Russell and Frazer Jarvis calculated 5,472,730,538 "essentially different" 9x9 grids [1][2], after excluding valid grids that can be derived from another valid grid by relabelling, various permutations, reflection, rotation, etc.

[1] http://www.afjarvis.staff.shef.ac.uk/sudoku/sudgroup.html [2] (PDF) http://www.afjarvis.staff.shef.ac.uk/sudoku/russell_jarvis_s...

OK, I read that - very interesting! And now I'm positive I don't know how to prove it myself. Luckily there are smarter people who can address problems like this. Thanks for the link!

https://en.wikipedia.org/wiki/Mathematics_of_Sudoku#Enumerat... I believe these ways of relabeling you mention are all taken into account when numbering the "essentially different" puzzle configurations.

With the extra constraints in this Sudoku variant, there can't be many unique solutions -- otherwise two numbers wouldn't be sufficient to uniquely specify a solution. In fact, there's only a single pattern. Note how along one axis every digit is 4 higher (mod 9) than the previous digit. One starting digit specifies the starting point (alignment of the pattern), the other starting digit the rotation + mirroring of the pattern. 9 possible alignments times 4 rotations times 2 (mirrored or not) = 9 * 4 * 2 = the 72 different solutions mentioned in the article.
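The claimed structure is easy to verify mechanically. Here's a quick Python sketch (the specific row offsets used are one plausible alignment, chosen for illustration) that builds a grid from the +4 (mod 9) row pattern and checks every Miracle constraint:

```python
# Each row is the base pattern 1,5,9,4,8,3,7,2,6 (successive cells step
# +4 mod 9), rotated by a per-row offset stepping +3, +3, +7 down the grid.
offsets = [3, 6, 0, 7, 1, 4, 2, 5, 8]
grid = [[(4 * c + a) % 9 + 1 for c in range(9)] for a in offsets]

def valid_miracle(g):
    # Classic Sudoku: rows, columns, and 3x3 boxes each contain 1..9.
    units = [g[r] for r in range(9)]
    units += [[g[r][c] for r in range(9)] for c in range(9)]
    units += [[g[br + i][bc + j] for i in range(3) for j in range(3)]
              for br in (0, 3, 6) for bc in (0, 3, 6)]
    if any(sorted(u) != list(range(1, 10)) for u in units):
        return False
    for r in range(9):
        for c in range(9):
            # 'eq' pairs (king diagonals, knight moves): digits must differ.
            # 'consec' pairs (orthogonal neighbours): never consecutive.
            # Each unordered pair is checked once, from the smaller row.
            for dr, dc, kind in [(1, 1, 'eq'), (1, -1, 'eq'),
                                 (1, 2, 'eq'), (2, 1, 'eq'),
                                 (1, -2, 'eq'), (2, -1, 'eq'),
                                 (0, 1, 'consec'), (1, 0, 'consec')]:
                rr, cc = r + dr, c + dc
                if 0 <= rr < 9 and 0 <= cc < 9:
                    if kind == 'eq' and g[r][c] == g[rr][cc]:
                        return False
                    if kind == 'consec' and abs(g[r][c] - g[rr][cc]) == 1:
                        return False
    return True

print(grid[2])              # [1, 5, 9, 4, 8, 3, 7, 2, 6]
print(valid_miracle(grid))  # True
```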

But this is caused by the Knight's+King's move constraints (with the "consecutive numbers" constraint prohibiting relabeling of the numbers). There's way more solutions without those extra constraints.

Yes, the number is greatly reduced, but you're leaving out a source of more variation. A sudoku puzzle is not the solution; it's the constrained choice of what is shown vs. hidden in the initial state. One solution could be the parent of a number of solvable puzzles of varying difficulty.

This is a really nice post and I especially liked the vulnerability in the conclusion about not being able to generate puzzles quickly because of how you ordered your constraints - I get stuck there almost every time I write Prolog.

One thing to nitpick is that the number operators (#>) require an import of clpfd, so some readers may not be able to get the code examples to work.

> require an import of clpfd

    :- use_module(library(clpfd)).

Yeah, always show your imports with your code snippets! :-)

Absolutely wonderful puzzle. Back when I taught Prolog in college, I created a number of Sudokus for my students to solve; that and a few other Prolog puzzles are here for those enjoying a lazy afternoon of logic programming: https://github.com/maebert/prolog_puzzles#sudoku

Thank you for sharing those puzzles; I rarely get a chance to use Prolog, and the 99 Problems are pretty boring, so it's nice to have a fresh challenge! Just a note: it's 'exercise', not 'excercise'.

I said just the other day, "Adding constraints that encode the additional rules is also a little puzzle, eh?" I think I see a kind of hierarchy that reminds me of the Futamura Projections:

1. Solving Sudoku puzzles.

2. Designing new Sudoku puzzles (with additional constraints even.)

3. Designing (e.g. CLP(FD/ℤ)) programs to solve all sudoku puzzles everywhere.

- - - -

How do you pronounce "ℤ" in this context? "see el pee eff dee zed"? Seems too plain. "see el pee eff dee Zahlen" seems too fancy. :-)

ℤ is just the integers, so say "the integers", or "the ring of integers" to emphasise the mathematical structure.

As a mathematician, I'd say "Cee Ell Pee Eff Dee over Zee". American mathematicians never say Zahlen and rarely say zed.

https://benjamincongdon.me/blog/2020/05/23/Solving-the-Mirac... has the list of all the boards that are 'miracle' sudokus. Only 72.

Is that accounting for symmetry?

From what I can see, of these 72 solutions, there are at most 9 distinct solutions when symmetries are removed (rotation, transposition, flipping). I might have missed some solutions.

Though, among these 9 solutions there are more similarities, i.e. some of them have the same rows except that the positions are switched.

Another thing: if the lines are sorted, there are only 4 distinct variants.

Also, the solution of the original Miracle Sudoku instance has some invariants; for example, the rows are rotations of the numbers 1,5,9,4,8,3,7,2,6, and the columns (when the rows are sorted) are rotations of the numbers 1..9.

I've added comments about this in my Picat model: http://hakank.org/picat/miracle_sudoku.pi

It's only a single pattern. Every solution has the 1,5,9,4,8,3,7,2,6 pattern (every step is +4 mod 9) in either the rows or the columns.

9 ways to transpose; 4 rotations, 2 for flipped or not. Multiply those, you get the full 72.

Which are the "9 ways to transpose"? In general there are 8 symmetries of a matrix, but you might also count reordering of rows/columns or blocks?

I agree that the 1,5,9,4,8,3,7,2,6 pattern of rotations is there somewhere, either in the columns or the rows of a solution, but that doesn't make all solutions the same.

Different people could reasonably disagree. The solutions with top rows:



Look pretty equivalent to me, which is how you get the 9 transpositions.

Yes, but the solutions are different if the order (permutation) of the rows with the pattern rotations are different. There is a huge number of ways the rows can be ordered (9!).

Very cool to see, as I'm on and off working on a Kakuro solver/generator using (almost) only SQL... when I get time to finish, I'd like to put a similar article together.

Interesting, are there any specific features of SQL that make it a good tool for this? My knowledge of SQL is limited to relatively simple selects and joins, and unfortunately my take-away is that I find it clumsy and verbose. Are there advanced, expressive features that are a good fit for constraint solvers?

I can say your takeaway seems to be the correct one, as my mental model was far more compact than the actual SQL it has turned out to require. (Edit: Hope remains that proper structuring of the data will obviate the need for much of this SQL.)

I am using joins over constraints written and read in MEMORY tables (including CROSS JOIN, generating an application of which was one of my motivations to do the project). Anything advanced is pushed to the query execution plan.

This is on MySQL; I also hope to explore a SQLite browser version. I aim to generate some performance profiling as a side product.

In college I took a "Survey of Programming Languages" course, and we did a Prolog unit which culminated in solving circular sudoku. The language blew my mind. I loved it.

z3 solver for comparison, from the original thread: https://gist.github.com/sielicki/fd86d68733133f654128519b3c4...

I find the z3 Python interface much simpler, more intuitive, and more powerful than Prolog for this kind of thing.

One strong point of Prolog is that all outputs are Prolog terms themselves. So in terms of ease of use and reuse of results, Prolog is much better.

It all depends on what you are looking for.

Have you gotten a chance to look at miniKanren/cKanren? Any opinions?

The king's move constraint confuses me. Where does N1 get checked against N5, its diagonal?
