
P vs NP on simple english wikipedia - feedback, please - MarkPNeyer
http://simple.wikipedia.org/wiki/P_versus_NP
======
peteretep
Your example at the end - you should go in to less detail about the number of
combinations, and more in to how it's easy for a computer to check the right
answer has been reached.

~~~
generelativity
This. It's a really great example, but you never make the full connection by
explaining that it's easy to check that 2 towers have the same mass.

------
orenmazor
I see posting this here as a mistake. You should post it somewhere where
people dont know anything about it. Posting it here will result in us adding
more and more detail, possibly making it more complicated.

I used to use my grandmother as a subject for testing out paper
abstracts/intros for this reason. She could barely figure out scrolling.

------
buff-a
My 2c: Don't write "Problems in P", write "P Problems".

Also you definition of "why its interesting" conflicts with the other
Wikipedia page which says:

 _The P versus NP problem is a major unsolved problem in computer science.
Informally, it asks whether every problem whose solution can be efficiently
checked by a computer can also be efficiently solved by a computer._

~~~
MarkPNeyer
i'm not sure i see the conflict... care to elaborate?

~~~
buff-a
He's changed it =)

This is what it said originally: _That means they would like to know if there
are any problems where the answer cannot easily be found by a computer, but if
someone says he has the answer, it is easy to use a computer to check if that
answer is correct._

------
DanWaterworth
I think you might be missing the point of the simple English wikipedia. It's
the vocabulary that you use that is simple, not the way in which you express
the concepts.

~~~
Singletoned
I somewhat agree. For example why use a term like 'mass', when 'size' or
'weight' would be much simpler.

~~~
lamby
and wrong

~~~
Singletoned
In what way is the word 'size' wrong?

~~~
bnegreve
I guess that Lamby means that one should not use "weight" instead of "mass",
the two being two very different concepts.

~~~
corin_
They're very different concepts, but for the sake of the example it doesn't
matter which you are measuring, you could equally be measuring the number of
atoms in each one if you happen to know it.

------
Triumvark

      P problems are considered 'easy' for computers to solve. NP problems are considered easy for computers to check the answers to them.
    

To improve clarity:

1\. Avoid putting pronouns far from the main noun.

2\. Use simple sentences. Reduce the number of verbs and prepositional phrases
per sentence.

3\. Avoid scare quotes. If you need to qualify a term, do so explicitly.

Consider, for the above:

NP problems have answers which can be easily checked. P problems can be easily
solved.

(Maybe precede or follow with a quick definition of what you mean by 'easy.'
But I'm not sure it's necessary here.)

------
andrewcooke
i guess the article is currently being edited, but at the moment it isn't so
clear what is common and what different between the two groups. i think it
would be better if the motivation came earlier:

    
    
      solving a problem is sometimes harder than checking you have the right answer.  
      for example, it is easy to check that you have the secret key to a code.  you 
      can decode a message and see the original text.  but it is hard to find the key 
      if you do not know it.
    
      so the problem of finding the key to a code is much harder than checking whether a 
      key is correct.  other problems can be easy to solve and to check.  for example,
      adding 2 and 2 is both easy to solve and to check.
    
      this difference can be used to divide problems into two groups.  these groups are 
      called P and NP.  it is easy to check the answers for problems in both groups, 
      but only problems in group P are easy to solve.
    
      the problem of adding two numbers is in P, because it is easy to solve and to 
      check. the problem finding a secret key to a code is in NP because is is easy to 
      check but hard to solve.
    

i know my writing isn't simple enough here, but an approach like the above
makes it easier to understand the relationship between them. in particular,
the commonality between the two groups is emphasised.

~~~
Wilduck
I had a similar response to the article, but I can see at least one reason why
this explanation doesn't fit. Despite any inclinations we may have P vs. NP is
still unsolved. So, claiming that a problem is in NP is "easy to check but
hard to solve" is not necessarily true (if P == NP, it is both easy to check
and to solve).

The original article does a good job of not making this assumption, and
because of this gains the ability to pose the question:

> Many people want to know if there are any NP problems that are not P
> Problems

~~~
andrewcooke
hmmm. it's true that my text is wrong in that way, but couldn't you fix it
with a "maybe" or "appear" and then explain the open question afterwards? or
maybe the conditional nature of the text then becomes too complex?

------
nsimplex
You might want to mention P VS NP is similar to the following scenario: people
feel they can identify a piece of art (say a painting) as being of merit very
quickly just by seeing it; but won't be able to come up with any piece of
equal merit even if they had all the time in the world.

------
Udo

      Many people want to know if there are any NP problems that are not P Problems. 
      That means they would like to know if there are any problems where the answer 
      is hard to find, but if someone says he has the answer, it is easy to check 
      if that answer is correct.
    

This is probably the core of the article, but I find it misleading. The truth
is that we need to either prove that the answer to a certain problem will
always be hard to find (hence proving that P =/= NP) or we need to prove that
all NP problems have in theory a P algorithm (P = NP). We currently have many
NP (="easily" verifiable) problems that have no known solving algorithm in
polynomial time. The evidence would so far suggest that P =/= NP, but this is
empirical not mathematical. It's possible that all those non-P problems have
some way to resolve them in P, but it doesn't look like it right now.

I would thus rephrase the core explanation as something like this:

" _For problems where the correct answer can be checked easily, there are two
possibilities: If (P = NP) is correct, that means there is always an easy way
to come up with a solution no matter how complicated the problem is. If the
opposite is correct (P =/= NP), there are some problems where the answer will
always be difficult to find no matter how clever the solution is. We don't
know yet which one of these possibilities is true._ "

~~~
MarkPNeyer
the lack of a known polynomial time algorithm could be interpreted as evidence
that P =/= NP, but it could just as easily be interpreted as evidence of our
primitive state of knowledge with regard to algorithms. Suppose that P = NP,
but the minimum complexity of any NP-Complete problem is O(n^(# of atoms in
the universe)); we might never discover such a complicated algorithm, but that
in no way implies it doesn't exist.

still, i see your point, but i think your phrasing is too complicated; i've
changed it to this:

    
    
      It is known that all P problems are NP problems; the proof
      requires math which is too complicated for this article but 
      may be explained in the future. Many people want to know if
      there are any NP problems that are not P Problems. That 
      means they would like to know if there are any problems
      where the answer cannot easily be found by a computer, but
      if someone says he has the answer, it is easy to use
      a computer to check if that answer is correct.

~~~
Udo
I like mine better ;-)

Your idea about the the primitive state of our algorithm knowledge is
interesting. This has nothing to do with the article, but I do believe there
are deep philosophical implications here. If we knew (and could make use of
the fact) that everything is P, it would fundamentally shift our relationship
with reality. I would go so far as to say that P =/= NP is an instinctive
model we have formed about the universe as it presents itself today. If this
were to change, it would affect pretty much everything we're doing - I can't
even begin to imagine what would happen on the day abstract problem solving
becomes a P algorithm...

~~~
MarkPNeyer
what happens in the example i gave above? if the minimum complexity of solving
any np-complete problem is O(n^2e80) then we'd need such massively parallel
computing that you'd have to turn pretty much every atom in the universe into
a computing device.

i agree that p != np is the instinctive model most theoreticians have about
the universe, but if science has taught us anything in the last 100 years,
it's that our instincts are often wrong, and the story of science will
probably never be finished.

~~~
Udo
Absolutely, but I'd say that the idea of brute-forcing the entire universe -
while theoretically and mathematically sound - goes against the spirit of what
we're trying to express with Big O. Solutions like these work in a theoretical
universe with infinite matter and possibly over infinite time, but if the
world is indeed finite (which it may very well be) any Big O expression
containing the sum of all states of all atoms in the universe does contain a
practical infinity.

~~~
MarkPNeyer
i'm not sure what you mean by the 'spirit' of Big-O; i find the question
fascinating because it is relatively easy to understand the concepts involved,
yet it's been 40+ years since we understood enough to be able to ask the
question, and we still don't know the answer. when you couple that with the
fact that the prevailing wisdom that our inability to find an algorithm
implies that it can't exist, the contrarian in me goes nuts: the academic
attitude is always "we're so smart, if we can't find it, it must not exist,"
and i get a kick out of arguing for more humility on their part.

even though a polynomial-time algorithm with a massive polynomial would still
be computationally intractable, even a proof that such an algorithm exists,
but is too complicated to express using all the atoms in the universe as bits
would probably open the doors to all kinds of proofs and techniques we cant'
even imagine.

------
iand
The page says people "would like to know if there are any problems that are
easy for computers to see if a one person's answer is correct, but hard for
computers to find a correct answer." However the example given seems to answer
this question in the affirmative: it's a problem that's easy to check using a
set of scales, but very hard to find a correct answer.

~~~
MarkPNeyer
maybe the example needs to be more clear - it's hard for the computer to check
all possible answers, but maybe it's easy for the computer to find the answer
another way.

------
prtk
Rating 3/10. I don't think this article is written by someone who knows the
topic well.

e.g. you don't too complicated math as claimed in this line:

>> It is known that all P problems are NP problems; the proof requires math
which is too complicated for this article but may be explained in the future.

For P problems can be solved easily and then the solution can be easily
verified. This means all P problems are NP problems too.

Good news!! Better Article Exists. Read:
[http://cstheory.stackexchange.com/questions/5188/explain-
p-n...](http://cstheory.stackexchange.com/questions/5188/explain-p-np-problem-
to-10-year-old)

~~~
MarkPNeyer
I would agree that the math required is not 'complicated' from the point of
view of anyone with basic understanding of discrete logic, but it's certainly
too complicated to express in simple english terms that are understandable at
the level i was shooting for.

if you want to do the proof that all problems in P are also in NP, you have to
explain the notion of a decision problem and show that any np-complete
numerical optimization problem can be converted to a decision problem without
increasing its computational complexity by more than a function bounded by a
polynomial.

someone named 'balabiot' changed the article by adding an incorrect 'proof'
that P is a subset of NP, at the same time you made this post. the incorrect
proof reads:

    
    
      All P problems are NP problems: if a problem is easy to 
      solve, to check an answer you just solve it and check that 
      the results match.
    

this invalid proof makes the assumption that an np-complete problem has a
single solution. many np-complete problems have multiple solutions; there many
be many ways to pack the knapsack, color the graph, or satisfy the boolean
circuit. if you are presented with a valid solution, but the algorithm you use
to solve the problem instance produces a different, but also valid solution,
your comparison would fail.

note that simply solving the problem (because it is in p) is NOT the same
thing as verifying a solution correct.

------
kamaal
I came from an electronics background when I first started reading algorithms
my head was spinning. And I am sure a lot of other novices/laymen from the
same problem.

The reason I have seen over the years is simple. Expressing the problems in
terms of mathematical equations. Mathematical equations some how denote a way
of modifying elements through equations. They tend to build on existing
established principles of maths. There fore in order to understand what the
other person is saying a person must get into a high learning curve.

When I first read about P Vs NP problem, I really wasn't able to co relate it
to anything in the real world. Later on as I read more on puzzles like The Bin
packing problem, or the traveling salesman problems. Things began to make
sense.

The way to explain people about such things is to give them problems defined
in simple English. And then allow them to build their own way of representing
objects and elements, and the way they choose to modify them. Like this
article does, However while explaining to people start from very small input
sizes. Help them work out outputs for large input sizes.

Given the problem at hand, explain them why such large computation time
doesn't make sense in the physical world.

Representing the problem in pure math will never help the guy next to your
home. Giving him a day to day use case, will force him to think.

So here we go, what all common every day use cases fit into the P Vs NP
problem? If we can create such a list, a lot more laymen can think and build
on those problems easily!

Is there such a list lurking somewhere?

~~~
MarkPNeyer
doubtful. a programmer could spend an entire career in industry and never run
into an np complete problem he needed to solve.

i added more examples to the article, and made them as simple and practical
sounding as possible, but to be honest my primary interest in the problem is
mostly from a theoretical perspective. to me, the idea that P != NP implies a
host of bizarre conclusions that seem incredibly unlikely to me. i am so
convinced of this that i had it emblazoned my license plate:

<http://www.markpneyer.com/wp/about-me/>

~~~
kamaal
Thanks Mark,

Good work Indeed.

Can you please explain why you think P=NP. I am deeply interested in knowing
about them.

~~~
MarkPNeyer
to be honest, mostly it's because i enjoy disagreeing with people who are
convinced of things they don't have a full understanding of. if you are
decently educated, you realize how much there is we don't know, and approach
every unsolved puzzle with a combination of curiosity, confidence and
humility. philosophy aside, here's my thinking:

LONG VERSION: Suppose _P_ is an NP complete decision problem; for each
instance, the answer is either 'yes' or 'no'. Let _P_ have verifying function
_V_. Let the language _L_ be the set of all binary strings _s_ that represent
instances of _P_ for which _V(s) = 1_ , i.e. the set of all strings _s_ that
represent instances of _P_ for which the answer is 'yes'. If _L_ (the set of
all instances of P for which the answer is 'yes') is finite, then you can
easily answer all instances of the problem P in constant time by using a
lookup table that stores all members of L (i.e. all finite languages are
regular.)

So, consider the case when L is infinite. We know that V must be expressible
in terms of a finite sequence of symbols. Suppose we have manged to express V
using a total of C bits. The entropy of all C-bit binary sequences is bounded
by a function of C alone; i.e. it is independent of the specific content of V,
all that matters is how many bits are in some (it doesn't matter which)
implementation of V. My primitive understanding of information theory tells me
that if V has a finite amount of information, and L is uniquely determined by
V, then L cannot possibly have MORE information than V; if you have V and
enough time, you can compute all of L. This limits the possibilities of L, and
suggest to me that there must be a way to find a (possibly complex and
incomprehensible algorithm) for solving all instances of P, by 'compressing' L
in just the right way. I admit that this is very hand-wavey; i just don't know
enough information theory to pin everything down.

SHORT VERSION: if you can verify all solutions to a problem using a function
expressible using a finite sequence of symbols, then the total information
content of a sequence representing all solutions to the problem can't exceed
the information content of the verifying algorithm.

if P != NP, this (to me) implies that a list of answers to a problem somehow
has more information than an algorithm that can be used to generate that list
of all answers. it seems to me that the algorithm could easily contain more
information than the list, but not the other way around.

to put it another way, if you see a verifying function as a compression
algorithm on the list of all solutions (because they contain the same amount
of information), then P != NP implies that some infinite sequences (which
represent np complete problems) are simultaneously so redundant that they can
be compressed to a finite length, but at the same time are so complicated that
you can't figure out an arbitrary element without doing a massive amount of
computation. as i see it, some patterns can appear random unless you look at
them from just the right perspective, and then the pattern emerges. because
np-complete problems can be expressed with a finite amount of information,
there has to be a pattern that we're not seeing yet.

------
jarin
I think this generally does a good job of simplifying the terminology without
falling into "durr, people who don't speak English natively are dumb"
territory.

~~~
DiabloD3
I agree. Simple errs towards 5 year olds, not people whos native language is
not English, as they have their own wikipedias.

------
prof_hobart
As someone who has an interest in this kind of thing, but very little in the
way of formal education on it (my last formal maths education was my 'A'
levels around 25 years ago), I'm possibly the exact target audience for this
article.

Given that, I thought it was very helpful. I've read a whole host of attempts
to explain it before, and they rapidly descend into 1000 other mathematical
concepts that I don't fully understand and end up reading about instead of the
actual P v NP issue.

The one bit that I think is missing is a better explanation of what's meant by
"easy" in this context. Clearly a problem that takes more than the life of the
universe to solve is likely to classify as hard. But what about if it was
solvable in a year, or a minute, or a second? My vague understanding is that
there's not really a "less than 5 seconds to solve"-type cutoff, and that it's
actually to do with "nondeterministic polynomial time". But at that point I
disappear into one of the mathematical things that I don't really grasp.

~~~
MarkPNeyer
i agree, it could use more math. i plan to add that in; my concern is that i
don't want to scare people away.

~~~
prof_hobart
I would offer a suggestion, but I just had another go on the Wikipedia version
and I honestly don't have a clue what it's telling me. I think I get what non-
deterministic means, but then I'm totally flummoxed as to what that has to do
with problems whose solution can be quickly verified.

------
evandijk70
I liked the article!

I would mention that it is part of the millenium problems and the one million
dollar prize that comes with it.

------
DanielRibeiro
It seems to be missing the philosophical introduction to what P = NP means
[1]:

 _If P=NP, then the world would be a profoundly different place... Everyone
who could appreciate a symphony would be Mozart_

[1] <http://www.scottaaronson.com/blog/?p=122>

------
rnadna
Many people find it difficult to see which of two large numbers is greater,
unless they are reading on paper and take up a pencil and count the groups of
three. This issue must come up a lot in simple-English. The solution is to
introduce scientific notation, and to have a link to that. (As has been
pointed out, _mass_ is a tricky word, being used here.) Perhaps another method
would be to invent something quirky, saying perhaps "4+(20 digits)" or
something.

In this particular case, there is another solution: do not quote the
individual numbers, but only their ratio. Anyone can tell that "1000" is big
and "1/1000" is small, because all that's needed is to see if there are more
than two or so digits.

------
robrenaud
I dislike the common implication that NP complete means "the stupidest brute
force thing is the best way we know how to solve the problem." For example,
meet in the middle will turn subset sum from 2^n to 2^(n/2).

~~~
MarkPNeyer
i agree that there are a lot of problems, but the goal of this article was to
make the topic as simple as possible. any simplification of a complex,
mathematical construct will invariably result in hand-waving and undefined
terms; i tried my best to keep those to a minimum. i plan to expand the
article in the future, to clarify some mathematical details. thanks for the
feedback, though.

------
hypersoar
I suggest emphasizing that problems that can be solved easily can be checked
easily. It's a simple idea, but it wouldn't necessarily occur to the casual
reader and is quite important.

~~~
ilcavero
agreed that the example is not the best, a better example would be a picture
puzzle that can be verified in just a glance but to build it you need to test
all pieces against each other to see if they match.

~~~
mitko
Picture puzzle is in P because you can build the puzzle incrementally in
quadratic time

------
mseebach
I wonder if there is a good way to explain that the nature of these problems
is to find any and all _exact_ solutions to the puzzle and that the 95%
solution which is vastly easier and entirely sufficient for any practical
application (in terms of the given examples) is NOT interesting.

E.g.: Travelling sales-person: In terms of P/NP, she doesn't care if she _can_
as much as if there is any possible route she might take that is even an inch
shorter than the one she can fairly trivially plan out.

~~~
MarkPNeyer
your claim that a 95% solution is entirely sufficient only applies to
numerical optimization problems. every numerical optimization problem has a
corresponding decision problem, and an algorithm that could solve these
decision problems would be tremendously useful. think about circuit
satisfiablility; if you're building an airplane, you want to know if the
circuits for the airplane could ever cause the engines to shut off mid-flight.
the concept of an approximation algorithm is meaningless here; either you know
the answer, or you don't.

~~~
kamaal
Although I understand what your are trying to say here.

But the problems you tend to describe never exist in the real world either.
What I mean to say is, you can never be Boolean sure about anything in the
real world as you can never control all the parameters yourself.

Which in case you are always going to measure things in percentages like x%
safer or y% safer.

I guess that's what the parent poster tends to imply.

~~~
MarkPNeyer
you can never control all of the parameters yourself, true, but think of it
this way:

the cockpit of an airplane has hundreds of switches and nobs. suppose we want
to ask the question "is there some way the pilot could toggle switches or flip
knobs that would open the doors in flight?" you're never going to be able to
guarantee that the doors won't open, but you can guarantee that the circuit
alone won't cause them to open.

or, consider that many games are np-complete. if you are playing such a game
against another player, whoever gets the better solution will win.

in addition, the original author's claim that the 95% solution is "vastly
easier and entirely sufficient" _may_be true for the numerical optimization
problems given in the example, but there are most certainly np-complete
problems which cannot be approximated to any degree of accuracy.

for example: given a graph, the max-clique problem asks 'what is the largest
fully-connected subgraph?' this problem arises very naturally in social
networks; it's equivalent to asking 'what is the largest group of people who
are all friends with each other?' This problem cannot be approximated within
ANY bound of the optimal solution unless P = NP. The same problem arises in
computational biology and chemistry as well.

~~~
mseebach
That's a very good point, but my concern is that the provided examples dumb
down issues to a level where this is not easily understood. I mean, who gets
closer to grasping this nuance bu supposing a woman building towers of rocks
on the beach?

Actually, I think the cockpit-switches-example would be a valuable example.

------
ableal
A picture like this <http://en.wikipedia.org/wiki/File:Exponential.svg> would
help.

To rub it in, the x axis should go to 20 or so, e.g.
[http://www.wolframalpha.com/input/?i=2**x+vs.+x**2+%28x+from...](http://www.wolframalpha.com/input/?i=2**x+vs.+x**2+%28x+from+0+to+20%29)

------
camtarn
This sentence doesn't read quite right to me: "Most mathematicians also
believe that are any NP problems that are not P problems."

It feels like there are a couple of words missing, but I'm reluctant to guess
at them and potentially change the meaning of the sentence.

------
brent
I think your subset sum problem (the rock piles) is decent. However, for it to
be useful I think you have to make it clear which class it is in (NP) and why
(it is easy to check a solution and 2^n is not polynomial in n).

------
aperiodic
The real difficult part of explaining this concept to the layperson is getting
them to understand what asymptotic complexity is. The example I always use
when I explain this to people with little or no math or computer science
background is sorting algorithms, explained using the following example of a
deck of cards. Everybody's familiar with a deck of cards, and it doesn't take
much more math than the most intuitive bits statistics.

Imagine I give you a deck of perfectly shuffled (i.e. uniformly distributed)
cards, and decree that you need to find the ace of spades. What you have to do
is go through each card one-by-one, and stop when you've found it. You'll have
to look at, on average, 26 cards, since there are 52 cards in the deck, and
each card has an equal chance of being the ace of spades. Now, let's say I add
another 52 cards, this time with a defaced ace of spades, and wanted you to
find the original (non-defaced) ace of spades, which is now just one card out
of 104. On average, this'll take 52 cards.

Notice that when I double the amount of cards I gave you, the amount of work
(that is, looking at cards) you had to do also doubles, meaning that for N
cards, the amount of work is N/2. When the amount of work is directly related
to the number of cards, excluding dividing or multiplying by a number that
does not depend on the number of cards (in this example, dividing by 2), then
we simplify things and say that this process for finding the Ace of Spades
takes BigO(N) work to finish.

Now, imagine I give you that same deck, and instead of demanding that you find
the ace of spades, I require you to sort it out into suits, with the cards in
each suit in increasing order. One way to do this is to keep a separate pile
for each suit, with that pile in increasing order. Deciding where to put the
first card of each suit is easy, you just make a new pile for it. The later
cards are not as easy, since you have to go through the pile and find the
right place to put the new card.

Right away you can tell that this sorting takes more work from you than
finding the ace of spades does. Instead of just looking at each card to see if
it's the ace of spades, you first have to check the suit of the card to
determine which pile it should go in. Then, you have to look through that pile
for the proper place to insert the card. I won't go into the math, but it
turns out the best way to sort a deck of cards into piles of suits[1] like
this takes N * lgN work, where lgN is a fancy number that isn't constant, but
gets bigger when N gets bigger, unlike the division by 2 in the ace of spades
example. That means that sorting cards is harder than finding the ace of
spades.

This business of determining how hard tasks like sorting cards are is called
"analyzing the complexity" of a task. Except, instead of saying task, computer
scientists like to say "function." The question of whether or not P = NP is a
question about whether a certain kind of tasks called "deterministic
Polynomial" tasks (which are referred to by the P) are as hard as another kind
of tasks called "Non-deterministic Polynomial" tasks (referred to by the NP).

I have a second part that tries to explain the difference between determinism
and non-determinism using a magic deck of cards where we can always draw the
card that we want (the "magic" deck is, of course, merely a "non-
deterministic" deck), but I've already spend an hour or so writing this first
part up. I can post the second part of the explanation tomorrow, if folks are
interested.

[1]: This is meant to be a memory constraint. Obviously, radix sort takes
linear time too if the key sizes are constant, but has a memory overhead of N.
As far as I'm aware–and please correct me if I'm wrong–the fastest sorting
algorithms with constant memory overhead are comparison-based sorting
algorithms, which take N lgN time. But, even if I am wrong, it's incidental to
the main purpose of the explanation.

