
P vs. NP for Dummies - bdr
http://scottaaronson.com/blog/?p=459
======
JangoSteve
I really like the idea of introducing difficult problems like these to
outsiders (I'm an outsider in this instance) for the sole reason that it
greatly increases the odds that someone may come out of left-field and either
solve it or at least provide an insight that helps solve it.

Speaking of breaking down complex ideas with analogies, I'm reminded of this
Wired article [1] which examines the inhibition of breakthroughs caused by
experts' cognitive biases:

 _Modern science is populated by expert insiders, schooled in narrow
disciplines. Researchers have all studied the same thick textbooks, which make
the world of fact seem settled. This led Kuhn, the philosopher of science, to
argue that the only scientists capable of acknowledging the anomalies — and
thus shifting paradigms and starting revolutions — are “either very young or
very new to the field.” In other words, they are classic outsiders, naive and
untenured. They aren’t inhibited from noticing the failures that point toward
new possibilities._

Or a more specific observation from one of the cited studies in the article:

 _[A diverse lab] mulled the problem [of unexpected protein synthesis results]
at a group meeting. None of the scientists were protein experts, so they began
a wide-ranging discussion of possible solutions. At first, the conversation
seemed rather useless. But then, as the chemists traded ideas with the
biologists and the biologists bounced ideas off the med students, potential
answers began to emerge. “After another 10 minutes of talking, the protein
problem was solved,” Dunbar says. “They made it look easy.”_

[1]
[http://www.wired.com/magazine/2009/12/fail_accept_defeat/all...](http://www.wired.com/magazine/2009/12/fail_accept_defeat/all/1)

~~~
edanm
"I really like the idea of introducing difficult problems like these to
outsiders (I'm an outsider in this instance) for the sole reason that it
greatly increases the odds that someone may come out of left-field and either
solve it or at least provide an insight that helps solve it."

Really? I kinda doubt this happens often (or even at all) with hard
Mathematical problems like P vs NP. We're talking about problems that tend to
have so much history and theorizing about them that it would take enormous
effort just to understand what is known about _what's hard_.

Does anyone know of any statistics on Mathematical problems being solved by
"outsiders"?

~~~
JangoSteve
I guess I could have emphasized the "problems like this" and the "at least
provide insight that helps solve" parts, as I wasn't suggesting that some
amatuer will necessarily come along and completely solve this problem. And
sure, I agree that in specific instances in which the problem is purely
mathematical, outsiders probably rarely have much impact.

That being said, it was my understanding that even the most recent paper
posted here causing all this stir over the P/NP problem, was using statistical
physics, which does indeed make it a somewhat "out of left-field" approach not
used by the domain experts that have been traditionally studying this problem.

------
icarus_drowning
Thank you for this. I've been trying to figure out the "big picture" of this,
and it is helpful to have someone explain it who understands the minute
details.

~~~
Retric
P vs NP is basically talking about problems where you can checking each
possibility in a reasonable amount of time, but the number of possibility's
keeps growing faster as the numbers get larger.

The traveling salesman is the easiest to understand, let's say you want to
visit 11 city's which route is the fastest? It's easy to find out how long a
route takes just add 10 numbers (the travel times between each city). However,
there are basically 11! (11 * 10 * 9 * 8 ...) paths to look at. Add a new
city, and there are suddenly 11 times as many options and each of them take
10% longer to compute (you add 11 numbers for each path vs 10). Going from
1001 to 1002 city's gives 1001 times as many routs, but calculating each route
only takes 0.1% longer to compute (add 1000 vs 1001 numbers).

The only real hope if you want the perfect solution is for there be some
secret that let's you eliminate many options without checking them. The simple
truth is you can skip _some_ steps aka, if your looking for the fastest trip
though every major city in the US, you can ignore any path that starts with a
longer trip than a complete path. AKA if you now of a route that visits every
city in 21,357 miles you and you take longer than that to visit 80% of the
city's you can ignore all paths that start that way. But, what are the limits
on those improvements? What is the fastest possible approach?

PS: There are many options that can give a reasonably good solution, EX: if a
computer picks the best random path from 1 million possible paths then it's
probably better than 99.99% of the options. You can then take the best random
paths and try to improve them by trying simple manipulations (swapping city's
etc).

~~~
YuriNiyazov
In retrospect, the traveling salesman problem was actually an impediment for
me to understand the P vs. NP question. Reasons:

1) Formally, P vs. NP is presented as a binary YES/NO question rather than a
"what is the smallest number such that <something>"? Only after you grasp the
Y/N format of the question are you shown the trick where you can convert any
problem in FNP into NP by converting it into a binary Y/N question with the
help of a binary search. If you are presented initially with the TSP as the
canonical P/NP problem, it is unclear for a while how it fits into the formal
presentation

2) (If my understanding on this point is incorrect, I will be glad if someone
disabuses me of my misconception): Sometimes, P vs. NP is presented as
"problems to which it is harder to find the solution than to verify a solution
once it's given". TSP is not actually one of those problems: if someone hands
you a possible solution, the process for verifying that it is indeed correct
is equivalent to finding it in the first place.

~~~
edanm
1) Agreed.

2) You're right; TSP is NP Hard, _not_ NP Complete.

But if you're talking about the YES/NO version, then it is defined (ala
Wikipedia) like this: "given the [graph and] costs and a number x, decide
whether there is a round-trip route cheaper than x". This problem _is_ NP
Complete; it's easy to see that verifying in this case just means summing the
weights and seeing if they are below `x`.

~~~
Retric
A version I was told to think about was competing travel agents. At each step
they need to find a better solution or know when to stop looking. It's still
not quite there _"given the [graph and] costs and a number x, decide whether
there is a round-trip route cheaper than x"_ but if you are handed a graph you
can say it's shorter and then give it as the answer. The difference is there
may be approaches that prove there is a shorter path, without finding it.

However, the mental model is vary close to the generic solution because
someone can hand you the 2nd best solution which looks nothing like the best
solution and you still need to find it. Where if you are handed the best
solution they may be a shortcut to proving it's the best.

PS: The Clique problem always seemed like the version where shortcuts seemed
most obvious. But, understanding why they fail is a larger mental jump IMO
than NP Hard TSP to NP Complete TSP.

------
justlearning
fwiw, I found this <http://claymath.msri.org/pversusnp.mov> (warning:
video/.mov) very lucid. She (Dr Ramachandran) starts from some basic problems
and leads us to the problem P vs NP.

As a self professed 'dummy', I would recommend it anyone who wants to
understand the gist of P vs NP

------
dzuc
<http://simple.wikipedia.org/wiki/P_versus_NP>

------
InclinedPlane
An even simpler explanation:

There are many problems of the form of several sets of options, where a
possible solution to the problem represents a particular choice for each
possible option. An example might be a diet where there is a list of
acceptable foods (each with a particular calorie value) and a maximum calorie
recommendation for each day, the problem being choosing which foods to eat in
a particular day. A characteristic of these types of problems is that they are
relatively "easy" to check the answers. It's easy to determine if a particular
set of food choices adds up to under or over the maximum recommendation for
the diet. If you were to tag each food option with a "deliciousness" value
then you could also total up the deliciousness index of a daily set of food
choices and judge them relative to each other.

Some of these problems are such that the total number of possible solutions
grows extraordinarily fast relative to the "size" of the problem (e.g. number
of possible food choices). So fast that for relatively "small" problem sizes
(perhaps only 100 or so food choices) it becomes fundamentally impossible to
compute every single possible solution to the problem (even if you had a
computer that ran as fast as the laws of physics allowed and was the size of
the entire known universe). More so, some of these problems are such that
there is no way to come up with an _exact_ solution that is faster than being
forced to do a brute force check of all or many of the possible solutions.

Now, this doesn't mean that (depending on the problem) you can't generate
quite good approximations to the most optimal solution to the problem by using
shortcuts, but exact solutions are impractical.

As it turns out, many of these types of super-hard problems are closely
related to each other. Such that you can translate one variety of problem
(such as how to travel through several cities in the least amount of total
distance or how to schedule a large number of meetings between a large number
of people using a fixed number of meeting rooms) into another variety. Which
would mean that if you had a shortcut to "easily" (i.e. practically) compute a
solution to one you could come up with solutions to all of the other classes
of problems as well, merely by couching those other problems within the frame
of the problem you had "solved".

This is the essence of P vs. NP. P is the class of problems where the cost of
computing an exact solution doesn't necessarily grow too fast relative to
input sizes to be impractical with real-world computers. NP is the class of
problems that are equivalent to P problems if you happen to have a magical
computer which could evaluate and compare any number of possible solutions
simultaneously. Naturally, NP will include all of the P problems, so NP-
complete problems are taken to be the set of problem for which only the
magical computer would be suitable.

The question is whether or not there is actually a difference between P and NP
problems. As mentioned, all NP-complete problems are essentially equivalent,
if you come up with a way to turn one of them into a P-problem then you can
turn all NP-complete problems into P-problems. However, these problems are so
difficult to grapple with that we don't know whether this is possible or
impossible. Currently we believe that it is impossible, that there are no ways
to come up with exact solutions to NP-complete problems except through trying
many or all of the possible solutions (making exact solutions impractical or
impossible, depending on the problem size).

If P was equal to NP then it's possible that there would be some consequences
(some good and some bad). It might be possible to come up with exactly optimal
network routing, for example. But it might also be possible to crack public
key cryptography.

Proving that P was not equal to NP would have some consequences as well. It
would mean that we couldn't ever hope for anything better than approximate
solutions for some problems (e.g. routing and scheduling) and public key
cryptography might be generally reliable (with a large enough key size).

------
DarrenMills
Thanks to everyone who helped break it down. I had no idea my comment would
actually get this level of a response.

------
amichail
Maybe these "dummies" would benefit more by learning about social media or
python programming.

Why fixate on this particular problem as opposed to learning all sorts of
other things that are more useful?

~~~
vecter
"Useful" is a particularly subjective word, and shouldn't necessarily be the
metric by which we measure the opportunity cost of our time.

Many advances in the world that you and I take for granted are the direct
result of pursuit of knowledge for its own sake. When Einstein developed
relativity, he wasn't trying to figure out how to sync satellites to make GPS
more accurate (and obviously satellites didn't exist at the time). But if that
theory was never developed, we wouldn't have all the luxuries that GPS
provides for us. That is just a trifling example of how the quest for deeper
understanding of fundamental knowledge has made the world a better place.

Lots of people on hacker news love writing code. That's great, seriously.
Looking down on others because their interests aren't as purportedly "useful"
as yours is pretty arrogant and shortsighted.

~~~
amichail
_When Einstein developed relativity, he wasn't trying to figure out how to
sync satellites to make GPS more accurate (and obviously satellites didn't
exist at the time)._

These "dummies" have essentially zero chance of making any meaningful
contribution to the P vs NP problem.

~~~
vecter
So they shouldn't be interested in learning about it?

~~~
amichail
But why this as opposed to a whole host of other things that they might learn
instead?

I think people should think more carefully about how they spend their time.

~~~
mquander
If you're being sincere, I suggest that this is the generally-held position on
this thread of discussion:

A) A lot of people are immensely fascinated by questions in theoretical
science and mathematics, including P=NP.

B) A lot of people enjoy learning about things regardless of obvious utility.
Personally, my choice of topics to spend time studying has nothing to do with
whether it will pay off for me at some point. It's a pleasant luxury of modern
life that many people can afford to do that.

C) If you are in camps A and B, it seems extremely arrogant and insulting when
you suggest "Why this?" and "I think people should think more carefully about
how they spend their time." Since you seem to have no perception of why anyone
is fascinated by this work, it appears completely nonsensical to suggest these
things to communities populated with a bunch of people who are clearly
interested in it, or to readers of a big blog post discussing it.

If you really have something to say about points A or B, or something else, I
wish you would say it instead of wrapping it in these bizarre, judgemental
insults.

~~~
amichail
Basically this interest in P!=NP is like yielding to authority.

I would like to decide for myself what's worth studying -- not have it
dictated to me.

Having said this, of course anyone who has an interest in computing should
know what the P vs NP problem is about, but learning enough to be able to
follow proof strategy for P vs NP is another matter.

~~~
mquander
I think the interest in P=NP specifically is for two reasons besides
authority:

\- Anyone with a little bit of CS or programming background can understand the
formulation of the problem, similar to how anyone can understand Fermat's Last
Theorem.

\- There are only so many open problems that anyone can understand. Open
problems are interesting to casual observers since they represent a boundary
between the known and unknown, and since they invite the imagination to think
up ways to attack them.

Once you've invested yourself a little bit in the problem for these reasons, I
think it's natural curiosity to want to spend twenty or thirty minutes, or
more, learning about the outline of a novel proof strategy.

That said, this article was a general introduction to what the problem is
about, not a description of the proof.

