
Boolean Satisfiability: Theory and Engineering - ot
http://cacm.acm.org/magazines/2014/3/172516-boolean-satisfiability/fulltext
======
ced
This should have a wider audience than just people familiar with Boolean
Satisfiability (SAT). SAT is the archetypal NP-complete problem (kinda like
travelling salesman), but the details don't matter much for the article's key
point:

 _The most fundamental challenge to emerge goes to the very heart of
computational complexity theory. This theory is based typically on worst-case
complexity analysis, which focuses on instances that are the most difficult to
solve. Worst-case complexity analysis has proven to be quite tractable
mathematically, much more so than, say, average-case complexity analysis. It also
seems intuitive from a practical point of view; for example, a worst-case
upper bound for an algorithm offers an absolute upper bound on its running
time in practice. Thus, worst-case analysis is the standard approach in
complexity theory. What has become clear, however, and also became painfully
evident at the SAT workshop, is that worst-case analysis actually sheds very
little light on the behavior of algorithms on real-life instances. For
example, theorists have demonstrated that current SAT-solving algorithms must
take exponential time to solve certain families of SAT instances.
Practitioners simply shrug at such bounds, while they continue to apply their
solvers to very large but practically solvable SAT instances. One role of
theory is to provide guidance to engineering, but worst-case (and average-
case) complexity seems to offer little guidance for problems that are
difficult in theory but feasible in practice. What is needed is a new
computational complexity model, which will capture better the concept of
"complexity in practice._

Great insights.

In other words, we rarely care about an algorithm's worst-case performance. We
care about average performance over the kind of problem it has to solve. This
is terribly difficult to analyse from theory (partly because "the kind of
problem it has to solve" is hard to define), so empiricism prevails - at least
for SAT (see also: Analytic Combinatorics).
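
To make that concrete, here is a minimal, purely illustrative sketch of what a SAT instance looks like and what the exponential worst case means. The formula and brute-force loop below are not how real solvers work; MiniSat, Glucose and friends use CDCL (unit propagation plus clause learning) rather than enumerating assignments.

    from itertools import product

    # Illustrative CNF formula: (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3).
    # Each clause is a list of literals; positive ints are variables, negative ints negations.
    clauses = [[1, -2], [2, 3], [-1, -3]]
    num_vars = 3

    def satisfies(assignment, clauses):
        # assignment maps variable number -> True/False
        return all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        )

    # Brute force over all 2^n assignments -- this is the exponential worst case the
    # theoretical bounds talk about. CDCL solvers prune this space aggressively, which
    # is why huge "natural" instances are often solvable in practice.
    for bits in product([False, True], repeat=num_vars):
        assignment = dict(enumerate(bits, start=1))
        if satisfies(assignment, clauses):
            print("SAT:", assignment)
            break
    else:
        print("UNSAT")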

~~~
mjn
I agree to some extent, and even wrote an op-ed along those lines a few years
ago [1]. The issue with asymptotic algorithmic analysis isn't only the
constant-factors-matter one (which most people are aware of), but also that
some heuristic algorithms are so good that they change the shape of the curve
in typical usage, not only the constant offset. Some problems that are supposed to
scale as n^2 or n^3 or 2^n actually seem to scale linearly or sublinearly in
practice. So asymptotic analysis can be a poor guide to algorithm selection in many
places where it's used.
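
One hedged way to see the "shape of the curve in typical usage" is simply to measure it: time the algorithm at a few input sizes and fit the slope of log(time) against log(n). The `solve` function below is a stand-in placeholder, not anyone's actual solver.

    import math
    import time

    def solve(n):
        # Placeholder workload standing in for the algorithm under study.
        return sorted(range(n, 0, -1))

    sizes = [10_000, 20_000, 40_000, 80_000]
    times = []
    for n in sizes:
        t0 = time.perf_counter()
        solve(n)
        times.append(time.perf_counter() - t0)

    # Least-squares slope of log(time) vs log(n) gives an empirical exponent:
    # ~1 means roughly linear scaling on these inputs, whatever the worst case says.
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    print(f"empirical exponent ~= {slope:.2f}")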

On the other hand, I think recent trends are making worst-case analysis more
relevant than it used to be, giving it something of a new lease on life. Many
algorithms are now network-exposed in some form, in which a user can
ultimately dictate or at least influence aspects of the input. That means a
lot of systems need to worry about maliciously chosen input, or at least it's
safer to build them under the assumption that they do. That's why Google wrote
RE2 rather than using PCRE, for example. PCRE generally has good performance
on real-world regexes, but it can be fairly easily forced into exponential
performance if an adversary gets to pick the regex. That could conceivably be
an issue for SAT-solving systems as well, if the problem comes directly or
indirectly from a user. They typically run quickly on "natural" SAT problems
derived from real-world situations, but it's also known how to purposely
generate likely-to-be-hard problems [2].
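
The PCRE-vs-RE2 point is easy to reproduce with any backtracking engine; Python's `re` behaves like PCRE here. This is a hedged demo of the classic blow-up pattern, not a statement about any particular deployed system:

    import re
    import time

    # (a+)+b against a string of only 'a's forces a backtracking engine to try
    # exponentially many ways of splitting the 'a's before concluding "no match".
    # RE2 avoids this by construction (no backtracking), at the cost of features
    # such as backreferences.
    pattern = re.compile(r"(a+)+b")

    for n in range(18, 25):
        s = "a" * n
        t0 = time.perf_counter()
        pattern.match(s)
        elapsed = time.perf_counter() - t0
        print(f"n={n:2d}  {elapsed:.3f}s")   # roughly doubles with each extra 'a'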

[1] http://www.kmjn.org/notes/nphard_not_always_hard.html

[2] https://fenix.tecnico.ulisboa.pt/downloadFile/3779577376868/Selman95.pdf
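
Reference [2] above is about deliberately generating hard instances; the well-known recipe is uniform random 3-SAT near the satisfiability phase transition (clause-to-variable ratio around 4.26), where solvers empirically struggle most. A rough sketch of such a generator, using signed integers for literals as in DIMACS-style tooling:

    import random

    def random_3sat(num_vars, ratio=4.26, seed=0):
        # Each clause picks 3 distinct variables and negates each with probability 1/2.
        rng = random.Random(seed)
        clauses = []
        for _ in range(int(ratio * num_vars)):
            chosen = rng.sample(range(1, num_vars + 1), 3)
            clauses.append([v if rng.random() < 0.5 else -v for v in chosen])
        return clauses

    instance = random_3sat(num_vars=150)
    print(len(instance), "random 3-SAT clauses over 150 variables")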

~~~
ced
Nice article. I'm 100% with you on the irrelevance of NP-Hardness to AI.

