
Knuth – Estimating the Efficiency of Backtrack Programs (1975) [pdf] - lamchob
https://www.ams.org/journals/mcom/1975-29-129/S0025-5718-1975-0373371-6/S0025-5718-1975-0373371-6.pdf
======
svat
Note that this paper (Knuth's “P69”) was reprinted as Chapter 6 of _Selected
Papers on Analysis of Algorithms_, and this material is covered (in
compressed form) in Pre-Fascicle 5B of _The Art of Computer Programming_.[1]
This will be part of Fascicle 5, due to be published later this year, and
eventually the first one-third of Volume 4B (the middle third, namely Fascicle
6, has already been published).

Pre-Fascicle 5B is[2] entirely about (what Knuth considers an “introduction
to”) backtrack programming, and Pre-Fascicle 5C follows it up with a special
low-level technique for making backtracking more efficient, namely Dancing
Links. And Fascicle 6 is about Satisfiability (aka SAT), the premier instance
of a problem solved by “backtracking”. All of these are already available!

If you're interested in algorithms for combinatorial problems, you should check
out whatever Knuth has published since he started working on Volume 4; there's
a lot of fresh material, and he has a unique perspective.

Edit: What I mean by unique perspective: For example, unlike mainstream
Theory-CS, Knuth is not primarily interested in the asymptotic complexity of
problems (despite being the “father” of analysis of algorithms and the
popularizer of Big-O notation in computer science), but is interested rather
in real programs that can be executed on real computers, and in the
mathematical analysis thereof. Given a problem, what I can imagine a
mainstream theoretical computer scientist doing is coming up with a suitable
abstraction that is mathematically elegant, reducing the problem to/from other
known problems, getting asymptotic upper and lower bounds on complexity, not
minding if one ends up with a “galactic algorithm”, etc. What I can imagine
Knuth doing is writing an assembly program (perhaps in MIX/MMIX, his best
approximation to the wide variety of actual machine languages in existence),
then doing a no-holds-barred mathematical analysis of its running time (not
just Big-O but including the constants for the leading terms: here Big-O is
used only to suppress less-useful information), then comparing this running
time against empirical estimates derived from writing and running a C (or
CWEB) program. See some talks on Sedgewick's site for examples of the
difference (sorry can't find the exact reference right now).

[1]:
[https://cs.stanford.edu/~knuth/news19.html#targetText=5b](https://cs.stanford.edu/~knuth/news19.html#targetText=5b)
(search for “5b” on the page)

[2]:
[https://en.wikipedia.org/w/index.php?title=The_Art_of_Comput...](https://en.wikipedia.org/w/index.php?title=The_Art_of_Computer_Programming&oldid=919657342#Volume_4B,_4C,_4D_–_Combinatorial_Algorithms)

~~~
svat
To put it differently, Knuth's perspective is that of someone who's
fundamentally a programmer at heart, but also happens to be a supremely
competent/skilled mathematician. There are not many people like that (Bill
Gosper comes to mind).

In this paper (which is worth reading even if you've never heard of
backtracking before: Knuth introduces backtracking with a very simple example
that he then analyzes in depth, to great effect), you can get a sense of this
from the very first page:

> _Sometimes a backtrack program will run to completion in less than a second,
> while other applications seem to go on forever. The author once waited all
> night for the output from such a program, only to discover that the answers
> would not be forthcoming for about 10^6 centuries. A "slight increase" in
> one of the parameters of a backtrack routine might slow down the total
> running time by a factor of a thousand; conversely, a "minor improvement" to
> the algorithm might cause a hundredfold improvement in speed; and a
> sophisticated "major improvement" might actually make the program ten times
> slower. These great discrepancies in execution time are characteristic of
> backtrack programs, yet it is usually not obvious what will happen until the
> algorithm has been coded and run on a machine._

> _Faced with these uncertainties, the author worked out a simple estimation
> procedure in 1962, designed to predict backtrack behavior in any given
> situation. […] during subsequent years, extensive computer experimentation
> has confirmed its utility._

He spends 4 pages (one-fourth of the paper) gently introducing backtracking,
then a page building a mathematical formalism out of it, and then the method
falls out almost naturally. The method is elegant and almost too good to be
true: instead of doing your backtracking (systematically trying every possible
way to extend a partial solution), at each step just pick a _random_ way to
extend the partial solution. Continue until you get stuck or find a solution.
(You're basically going down just one random path of the search tree.) The
cost of this path, weighted by appropriate probabilities, is an estimate of
the cost of the backtracking program.
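A minimal sketch of that idea, using the n-queens search tree as the example
problem (the function names and the n-queens choice are mine, not from the
paper): a single random root-to-leaf probe yields the estimate 1 + d1 + d1*d2
+ ..., where d_k is the branching factor met at depth k, and the expected
value of that estimate is exactly the number of nodes in the tree.

```python
import random

def extensions(partial, n):
    """Columns where a queen may go in the next row of an n x n board."""
    row = len(partial)
    return [c for c in range(n)
            if all(c != pc and abs(c - pc) != row - pr
                   for pr, pc in enumerate(partial))]

def true_tree_size(n, partial=()):
    """Full backtracking: visit and count every node of the search tree."""
    return 1 + sum(true_tree_size(n, partial + (c,))
                   for c in extensions(partial, n))

def estimate_tree_size(n):
    """One random probe down the tree.  If d1, d2, ... are the branching
    factors met along the path, return 1 + d1 + d1*d2 + ... ; this is an
    unbiased estimate of the total node count."""
    partial, weight, est = (), 1, 1
    while True:
        ext = extensions(partial, n)
        if not ext:                       # stuck, or found a full solution
            return est
        weight *= len(ext)                # this path's probability shrinks,
        est += weight                     # so its contribution grows
        partial += (random.choice(ext),)

random.seed(1)
probes = [estimate_tree_size(4) for _ in range(10000)]
print(true_tree_size(4))            # exact node count for n=4: 17
print(sum(probes) / len(probes))    # Monte Carlo estimate, close to 17
```

A single probe can be wildly off (its variance grows with how unbalanced the
tree is), which is why one averages many probes, and why the paper's later
refinements bias the random choices toward the expensive parts of the tree.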

Two corollaries show the worth of the formalism; then he refines the method to
work better in practice, and so on. The motivation is always better tools for
the working programmer, but he's happy to, e.g., introduce generating
functions and differentiate them when needed.

> _Furthermore, the game [hand calculation] is worthwhile, because it gives
> insight into the behavior of the algorithm, and such insight is of great use
> later when the algorithm is eventually programmed; good ideas about data
> structures, and about various improvements in the backtracking strategy,
> usually suggest themselves._

~~~
todd8
Another notable mathematician and exemplary programmer is D. J. Bernstein.

------
nickdrozd
Knuth mentions this paper in _Things a Computer Scientist Rarely Talks About_.
He says that the success of randomization for quantitative problems led him to
use similar techniques for qualitative problems -- like studying the Bible by
studying only a "random" sample of verses. He ended up choosing to focus on
verse 3:16 of each book (actually, the 16th verse after the start of the 3rd
chapter, provided such a verse exists). Questions about whether this was an
appropriate sample and whether this was a good idea at all are discussed
extensively in TACSRTA.

------
cellularmitosis
Perfect timing, I just realized this evening that I might need backtracking
[https://gist.github.com/cellularmitosis/aa46e1e88c3bcd2db83c...](https://gist.github.com/cellularmitosis/aa46e1e88c3bcd2db83c01f9af076270)

