I've implemented a SAT solver and read the literature extensively. This paper is not up to the standards of clarity set by that literature; see, e.g., "Efficient Conflict Driven Learning in a Boolean Satisfiability Solver" (Zhang, available for free online). There is a world of difference in the clarity of presentation between these two papers. There might be an English language barrier problem operating, I don't know.
If the author did some work to polish his presentation and state definitions more clearly, as well as submit his SAT solver to well-known competitions (http://www.satcompetition.org/2011/), I'm sure he could get some attention from the relevant people. Given how unclear it is right now, I'm too busy with my own research to try and pull out the hidden conflict-driven algorithm that I suspect exists in this paper, as it would be very time-consuming for little expected gain on my end.
If his algorithm beats the pants off all the others in SAT 2011, well, then I'd get right down to work.
Homework for someone who has some time: download his code and run it on the SAT 2010 benchmarks. Compare to other algorithms from that competition. Not, of course, a definitive test, but if it performs even in the middle of the pack, then you'll know it is worth a closer look.
I agree in principle, but the example is a bad one. Here, a deterministic polynomial time algorithm is stacked against probabilistic polynomial time algorithms.
In SAT, we'd have the (hypothetically) polynomial algorithm against exponential time algorithms. Theoretically, the constants in the polynomial might of course be so large as to erase the advantage on non-galactic problem sizes but this happens rarely.
0. This guy looks orders of magnitude less loony than the usual P=NP prover. I hope someone who knows this material well steps in soon.
1. This guy has implemented his algorithm. This is a very good sign -- most garbage "algorithms" are exposed to be broken when people try to implement them.
2. Most 3SAT problems are "easy". Being able to solve a particular problem doesn't mean that the algorithm works in general. He would have done better to demonstrate his algorithm on "known hard" problems.
3. He states a running time of O(n^4 m), but his experimental results don't scale quite like this; perhaps his analysis is wrong, or perhaps there's just a monster hiding behind the big-O.
4. If he is correct, and his algorithm is optimal, we probably don't need to worry very much about cryptography yet: It looks like this algorithm is far too slow to be a practical attack on existing cryptosystems.
(EDIT: Oops, in the time it took me to write that, 18 other people posted comments. Well, so much for getting the discussion started...)
Random without qualification is used to mean "uniformly". For this specific case (3-SAT), there are only four reasonable readings: both the variables within a clause and the clauses within the set of all possible clauses can be chosen with or without replacement. Neither choice makes a difference in the large-n limit.
is from a reasonable book.
It's true that this phenomenon hasn't been completely rigorously proven, but it's been given "physicist proofs". There's enough evidence that it's unreasonable to not consider it true.
However, where you say this:
.... it's been given "physicist proofs". There's enough
evidence that it's unreasonable to not consider it true.
But I suspect we're in agreement, possibly "violent agreement."
ADDED IN EDIT: Forgot to say, thanks for the reference.
That it's easy to generate hard random instances does not at all mean that highly structured "non-random" instances can't be hard. The "phase transition" only means that the fraction that are hard at a given non-critical ratio decreases with n, but there could very well be a large absolute number of hard instances, one that increases with n. The space of problems of size n grows exponentially, after all.
I suppose a proof that P=NP would still technically threaten cryptography, because if you assume that computer hardware performance increases polynomially with time you must concede that any given crypto technique will become practical to break after some amount of time.
Yeah, that's how crypto works as a field. As soon as there's a shadow of a doubt that something is secure in even the least likely edge case, it gets tossed out. MD5 was basically considered dead once someone showed how to create a collision with (correct me if I'm wrong) two very long, very unlikely, very similar (?) plaintexts. It didn't threaten anyone's actual use of MD5 in an immediate sense, but MD5 was shown not to be reliable, and in the crypto world an algorithm's reliability is sort of treated as binary.
(I apologize to the true security folk in the crowd if I've oversimplified something here.)
The issue of whether it's a feasible amount of work is exactly what I was addressing.
One of the reasons this would be a huge result is that even a very expensive polynomial-time solution for any NP-complete problem (say, one which is O(n^2000) and would take several billion years to run against a 256-bit key) is only going to get faster over time. Also, the experience of cryptography is that once cracks start appearing in an algorithm the mathematicians start trying to wedge their chisels in and hammer those cracks into fractures, into holes, and eventually into whatever you call something which no longer resembles a crack so much it resembles absence-of-wall.
There are specific mechanisms by which this would happen, too. A near miss on P=NP would mean funding for professors looking into refining it, and professional interest generally, would suddenly explode. Nobody wants to fund or work on duds, but big contributions to this problem would make careers.
Not incidentally, while the government funds quite a bit of academic research, this particular branch of mathematics is of special interest to one type of government customer which has ways of getting funding for projects expedited if they are in the national interest... and it is virtually inconceivable that they would not find very, very strong national interest in the vicinity of this result.
Edit : For some reason I can't reply to posts so I'll do it here.
RiderOfGiraffes> It would leave in question those problems in NP, but not NPC. Currently it is generally believed, but not known, that factoring is such a problem.
This is exactly my point. P = NPC doesn't imply anything about factorisation. Since factorisation is in NP we haven't shown that P = NP.
kd0amg> NP-complete problems ... are, in essence, the hardest problems in class NP.
From what I understand they are 'thought to be' the toughest, but this isn't proven. If this article turned out to be true then they are reduced to being as hard as P. So who is still standing as a contender for the hardest problem in NP? - factorization of course, as no NPC problem has ever been demonstrated to be polynomially reducible to it.
Proving that any single NPC problem is in P will be enough to prove that every NP problem is in P, and not just the NPC ones. Suppose A is in NP, B is in NPC, and further suppose that solving B is polynomial. Reduce A to B (a polynomial operation because B is in NPC), solve B (a polynomial operation by assumption), and convert the solution back to a solution of A (a polynomial operation because B is in NPC), and that gives a polynomial solution of A.
Proving that any single NPC problem is not in P is enough to prove that all NPC problems are not in P. It would leave in question those problems in NP, but not NPC. Currently it is generally believed, but not known, that factoring is such a problem.
Added in edit, to reply to your edit:
RoG> It would leave in question those problems
RoG> in NP, but not NPC. Currently it is generally
RoG> believed, but not known, that factoring is such
RoG> a problem.
dejb> This is exactly my point. P = NPC doesn't imply
dejb> anything about factorisation. Since factorisation
dejb> is in NP we haven't shown that P = NP
Proving that any single NPC problem is in P will be
enough to prove that every NP problem is in P.
Let me add a little more.
1. Suppose 3-SAT is in P.
2. Let A be any problem in NP.
3. 3-SAT is in NPC.
4. Therefore A can be converted to A', an instance of 3-SAT.
5. By (1), A', as an instance of 3-SAT, can be solved in polynomial time.
6. Hence A has been solved in polynomial time.
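To make that concrete, here's a minimal sketch in Python (the names solve_3sat, solve_A, and reduce_to_3sat are mine, purely for illustration). The brute-force solve_3sat below is a stand-in so the sketch actually runs; the whole argument is that if a genuinely polynomial solve_3sat existed (step 1), then solve_A would be polynomial too, because the reduction in step 4 only costs polynomial time.

    from itertools import product

    def solve_3sat(clauses, n_vars):
        # Stand-in decider: exponential brute force over all assignments.
        # The hypothetical step-(1) version would do this in poly time.
        return any(
            all(any((lit > 0) == bits[abs(lit) - 1] for lit in clause)
                for clause in clauses)
            for bits in product([False, True], repeat=n_vars))

    def solve_A(instance, reduce_to_3sat):
        # Decide any NP problem A, given its polynomial-time reduction
        # to 3-SAT (steps 3-4), by solving the reduced instance (step 5).
        clauses, n_vars = reduce_to_3sat(instance)
        return solve_3sat(clauses, n_vars)

    # Toy usage: a "reduction" that returns a fixed satisfiable CNF.
    print(solve_A(None, lambda _: ([[1, -2], [2]], 2)))  # True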
This was proved for SAT first: http://en.wikipedia.org/wiki/Cook%E2%80%93Levin_theorem
A problem P is in NPC (is NP-complete) precisely when:
a. P is in NP, and
b. every problem in NP can be converted to an instance of P.
Consider any two problems in NPC, P and Q. Then each is in NP, and each can be converted to the other. They are therefore equivalent, up to polynomial-time conversions.
Every NP problem is polynomial-time reducible to SAT. I do not recall the title of the paper off the top of my head, but it has been proven that the definition of an NP problem (a question that can be answered given some additional piece of information, a "certificate") is polynomial-time reducible to SAT.
The methodology was basically an algorithm taking as input a description of the verifier for such a problem, and producing a boolean expression on the certificate and some other values needed to maintain integrity. The expression is satisfiable if and only if the answer is affirmative.
The reason that factorization is not known to be NP-complete is that the reverse is not known. That is, no NP-complete problem is known to be polynomial-time reducible to factorization.
Yes, it is. Any NP problem can be many-one reduced to SAT, which means that there's a function that transforms an instance of the NP problem to an instance of SAT, and the answer to whether that formula is satisfiable is the same as whether that instance is a member of the NP problem (in its decision version).
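For anyone who hasn't seen a many-one reduction written out, here's a small sketch: graph 3-colorability (an NP problem) transformed into a CNF formula that is satisfiable iff the graph is 3-colorable. The variable numbering is my own choice, and the output is general CNF (2- and 3-literal clauses); the standard padding trick turns it into exact 3-SAT.

    from itertools import combinations

    def three_coloring_to_cnf(n_vertices, edges):
        # Variable 3*v + c + 1 means "vertex v gets color c" (DIMACS-style
        # signed integers: a negative literal is a negated variable).
        var = lambda v, c: 3 * v + c + 1
        clauses = []
        for v in range(n_vertices):
            clauses.append([var(v, c) for c in range(3)])      # >= 1 color
            for c1, c2 in combinations(range(3), 2):
                clauses.append([-var(v, c1), -var(v, c2)])     # <= 1 color
        for u, w in edges:
            for c in range(3):
                clauses.append([-var(u, c), -var(w, c)])  # endpoints differ
        return clauses

    # A triangle is 3-colorable, so this CNF is satisfiable.
    print(three_coloring_to_cnf(3, [(0, 1), (1, 2), (0, 2)]))

The answer to "is this formula satisfiable?" is, by construction, the same as "is this graph 3-colorable?", which is exactly the many-one property.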
So who is still standing as a contender for the hardest problem in NP?
If P = NP, then all problems in NP are as easy as each other (up to polynomial factors), so this question is meaningless.
The provable lower bound for the worst-case performance of comparison-based sorting is Ω(n log n). Finding the median takes linear time, i.e. O(n).
Sorting and finding a median reduce to each other polynomially, of course.
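A sketch of the linear-time median claim: quickselect runs in expected linear time (the guaranteed worst-case-linear algorithm is the more involved median-of-medians; this is just the simple randomized variant).

    import random

    def quickselect(xs, k):
        # Return the k-th smallest element of xs (0-indexed).
        pivot = random.choice(xs)
        lo = [x for x in xs if x < pivot]
        eq = [x for x in xs if x == pivot]
        hi = [x for x in xs if x > pivot]
        if k < len(lo):
            return quickselect(lo, k)
        if k < len(lo) + len(eq):
            return pivot
        return quickselect(hi, k - len(lo) - len(eq))

    def median(xs):
        return quickselect(list(xs), (len(xs) - 1) // 2)  # lower median

    print(median([7, 1, 5, 3, 9]))  # 5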
Interesting subsets of polynomial problems are e.g. linear problems, or highly parallelizable problems whose runtime tends to O(log n) as the number of processors tends to infinity. Adding a list of numbers is not only linear but also highly parallelizable. Proving a problem to be not parallelizable like that is also interesting.
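To illustrate the parallel-summation point, here's the O(log n)-depth reduction pattern, written sequentially: each round halves the list, and every addition within a round is independent, so with enough processors the whole sum takes O(log n) rounds even though total work stays O(n).

    def tree_sum(xs):
        xs = list(xs)
        while len(xs) > 1:                # O(log n) rounds
            if len(xs) % 2:
                xs.append(0)              # pad to an even length
            # all the pairs in this round could be added in parallel:
            xs = [xs[i] + xs[i + 1] for i in range(0, len(xs), 2)]
        return xs[0] if xs else 0

    print(tree_sum(range(10)))  # 45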
There's also the question whether randomness helps. It does in practice, but we don't know whether access to random bits helps in theory.
'It is suspected to be outside of all three of the complexity classes P, NP-complete, and co-NP-complete'
So even if this showed 'P = NP-complete' it may not imply factorization is in P. I'm at the limits of my rusty complexity knowledge here, so if anyone knows better please correct me.
Also, there are problems in NP that are neither in P, nor are they NP- or co-NP-complete.
Err, it seems to me you've got a very serious statement there. I think you mean “that are not known to be …”.
Additionally, P=NP implies that NP is in RP. If an inefficient deterministic algorithm in P is found, I'm sure a much faster randomized algorithm would soon follow.
This hasn't been the case in linear programming, at least: 30 years after it was proven to be in P, the old exponential algorithm (the simplex method) is still competitive with the best known polynomial algorithms (interior point methods). From a practical perspective, no real efficiency breakthrough. The fact that there are now two families of algorithms with good performance for the problem is an improvement, but the fact that one of them is theoretically polynomial doesn't seem to have led to dramatic speed gains.
Edit: If the above comment doesn't make sense, it's because the comment I was replying to was "you won the Neumann" and has since been edited.
If I write a paper formatted in LaTeX saying that I cloned a T-Rex in my backyard, should it be upvoted so that an expert in cloning can take a look at it?
Second, having an implementation doesn't mean much. It's almost as likely there's a bug in the code as in a proof. The only advantage is you can run some tests on it, but if it's wrong the tests may actually give you a false sense of security.
Lastly, if you prove P=NP you submit to FOCS or STOC. Let them review it.
Otherwise it's just another Archimedes Plutonium.
My point is that a constructive proof that P=NP along with working code (again I'm assuming it's implemented correctly, as it appears vetted by Romanov) is easier to prove incorrect, as it's easier for someone who perhaps doesn't have the theoretical background to find pathological examples where it breaks down (and a much wider audience fits in this category, including most of HN).
I agree with your point about the false sense of security however--an inability to find such pathological cases is not sufficient to prove P=NP. In order to truly verify the claim a rigorous analysis of the proof will be necessary--but in this situation it's much easier to show what this guy is saying is false than in the Deolalikar case.
It sounds like what they are saying is equivalent to the following:
If S1 intersect S2 has a solution, and S1 intersect S3 has a solution, then the system S1 intersect S2 intersect S3 has a solution.
But this is evidently false. Consider the case where the CTS included each of the following rows, and were empty everywhere else (after re-ordering the columns so the same-name columns were in the same final column):
(3) 0 00
I suspect the problem they set up the induction for might not perfectly align with the theorem, but it needs more careful examination.
The algorithm uses an algorithm similar to the well-known '3-consistency', which has been shown to solve quite a lot of classes of problems, in particular some that are solved very poorly by the normal learning-based systems used in most SAT solvers.
The paper aims to strengthen 3-consistency slightly, using permutations. However, while that is a reasonable strategy, it is entirely unclear to me, and unclear in the paper, why that gives poly-time solving.
A list of articles published on the P=NP debate is here http://www.win.tue.nl/~gwoegi/P-versus-NP.htm
Looks like someone thinks they've solved the problem every month or so :)
For example, "Quantum encryption" allows you to send data and know whether you're being eavesdropped. Then, once you manage to send it without being eavesdropped, you can use what you sent as a one-time-pad safely.
You could of course regress to distributing one time pads to everyone :-)
Also, if P = NP, there are still NP-hard and EXP problems that remain outside the realm of P.
BTW, quantum encryption is not encryption as we know it (a purely functional deterministic bijective transformation of the message), as it requires special hardware, optical fibers, etc...
As for EXP etc., I am not aware of any (theoretical) encryption schemes that use problems harder than NP (actually most use circuit complexity, so NP/poly, but it is related). In designing encryption you often want polynomial verifiability (for decryption), and that implies NP membership.
1) One time pads are provably secure albeit impractical
2) Integer factorization is in BQP, but not necessarily NP-complete.
2) Integer factorization is trivially in NP (the decision problem "n has a factor < x" has a certificate: the factor)
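And checking that certificate is a single division, i.e. clearly polynomial:

    def has_factor_below(n, x, f):
        # Certificate check for "n has a (nontrivial) factor < x":
        # given the claimed factor f, verify it with one division.
        return 1 < f < x and n % f == 0

    print(has_factor_below(91, 10, 7))  # True: 7 divides 91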
Recently someone claimed a polynomial-time graph coloring algorithm. I generated hard instances, their "solver" blew up. Claim debunked. It should be simple enough to do the same for this (for some definition of simple).
The key lies in generating hard instances. As cperciva has said in http://news.ycombinator.com/item?id=2121837 - most instances of most NPC problems are "easy."
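For anyone who wants to try it, here's a sketch of the usual recipe: uniform random 3-clauses at a clause-to-variable ratio near the observed phase transition (around 4.27 for random 3-SAT), which is empirically where random instances are hardest.

    import random

    def random_3sat(n_vars, ratio=4.27, seed=None):
        rng = random.Random(seed)
        clauses = []
        for _ in range(int(round(ratio * n_vars))):
            vs = rng.sample(range(1, n_vars + 1), 3)  # 3 distinct variables
            clauses.append([v if rng.random() < 0.5 else -v for v in vs])
        return clauses

    def to_dimacs(n_vars, clauses):
        # Standard DIMACS CNF, accepted by essentially every SAT solver.
        lines = ["p cnf %d %d" % (n_vars, len(clauses))]
        lines += [" ".join(map(str, c)) + " 0" for c in clauses]
        return "\n".join(lines)

    print(to_dimacs(200, random_3sat(200, seed=1)))

A claimed poly-time solver choking at this threshold on modest n would be telling.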
1. For a fixed permutation, construct a Viterbi-like search on the triplet assignments - if it fails, the formula is not satisfiable. However, if it doesn't fail right away, there is still no guarantee there is an assignment. Call this structure the compact triplet form (CTF) or whatever.
2. Construct a small set of permutations (at most m) such that if any clause in the original CNF fails to be satisfied, at least one of these permutations' CTFs will fail to satisfy.
3. Efficiently? combine the structures.
I didn't really read it deeply, but from what I understood that seems to be the top-level strategy. I'm not 100% certain about it.
If you skip the first part about pre-processing, and instead look at the graph that is generated, you can come up with following:
1. There exists a graph of potential truth assignments on N unknowns in M clauses, consisting of 7M nodes and < 48M^2 edges.
2. Label each node uA,uB,uC,vA,vB,vC where u? is the unknown number, and v? is whether that unknown is true or false.
3. Edges exist between node I and node J if either nodes I and J share zero unknowns, OR both I and J share at least one variable AND all shared variables have the same value.
4. For there to be an assignment that causes the full statement to be correct, you must find an M-clique (a complete graph on M nodes within the 7M node graph).
Solving part 4 is NPC.
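Here's a rough sketch of points 1-4 as I read them (my own data layout, not the paper's; it assumes each clause has 3 distinct variables). Nodes are the 7 satisfying patterns per clause, and edges join compatible patterns from different clauses, so the formula is satisfiable iff an M-clique exists. Building the graph is cheap; finding the clique is the NPC part.

    from itertools import product, combinations

    def build_graph(clauses):
        nodes = []  # (clause_index, {variable: assigned truth value})
        for i, clause in enumerate(clauses):
            for bits in product([False, True], repeat=3):
                assign = {abs(lit): b for lit, b in zip(clause, bits)}
                if any((lit > 0) == assign[abs(lit)] for lit in clause):
                    nodes.append((i, assign))   # one of the 7 per clause
        edges = []
        for a, b in combinations(range(len(nodes)), 2):
            (i, ai), (j, aj) = nodes[a], nodes[b]
            # edge iff different clauses and no shared variable disagrees
            # (covers both the "share zero unknowns" and "agree" cases)
            if i != j and all(ai[v] == aj[v] for v in ai.keys() & aj.keys()):
                edges.append((a, b))
        return nodes, edges

    # (x1 v x2 v x3) & (!x1 v x2 v !x3) is satisfiable: a 2-clique exists.
    nodes, edges = build_graph([[1, 2, 3], [-1, 2, -3]])
    print(len(nodes), len(edges))  # 14 nodes, 6 edges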
His claim is, basically, that if you permute/partition the input (to construct the tables), you can construct a disconnected graph with far fewer edges (get rid of the first part of #3 above), then link the graphs in such a way that if you can find a path from a tier-1 node to a tier-2 node, ..., to a tier-M node, examining the labels is sufficient to solve the problem. The two nasty parts are finding an efficient permutation/partition step, then managing to properly link the disconnected graphs. As you say, the two steps that the author hand-waves.
My bid/ask is 0% / 0.02%
I think I'd wager at most a 1% chance that P = NP, and, I'll be generous and put the odds that this particular person cracked it first at 2% of 1% (I know he has code posted, but think of all the smart people who failed, and within my 1% is the case where P=NP but no human ever proves it). Would anyone offer better odds than 1 in 5,000, or make a bid?
My knowledge is limited to taking Sipser's intro class in school; but as a programmer, I always find subset-sum to be the most tangible and convincing example that NP-hard feels pretty tough.
I love these announcements though; I am always humbled and fascinated by the resulting discussion.
is a market for long bets, e.g.
“Over a ten-year period commencing on January 1, 2008, and ending on December 31, 2017, the S & P 500 will outperform a portfolio of funds of hedge funds, when performance is measured on a basis net of fees, costs and expenses.”
PREDICTOR: Warren Buffett
CHALLENGER: Protege Partners, LLC
This particular attempt involves someone claiming P=NP, and they have code. Even people who aren't complexity theorists can jump in on this one, by analyzing the code to verify it really is in P, and by trying to find problem instances it fails on.
I think both Gowers and Tao talked about it. I don’t remember in how much depth.
- It is an O(n^4 m) algorithm, where n is the number of boolean variables and m is the number of disjunctions.
- It not only tells you whether the formula is satisfiable or not, but also gives you the satisfying values.
- The author claims to have implemented and tested the algorithm on reasonably sized problems, though I don't see his code posted anywhere (?). Someone else appears to have implemented it here: https://github.com/anjlab/sat3.
- The author references no prior work, just review papers on the problem itself.
Any yes/no NP solver can be turned into one that does this with polynomial overhead. After finding there is a solution, you ask it the similar problem "does this have solution with the first bit set to true?", then "does this have a solution with the first bit set to <answer to previous question> and the second bit set to true?", and so forth until you've identified all bits.
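As a sketch, with any yes/no decider sat(cnf) standing in for the oracle (fixing a bit is just appending a unit clause; the names here are mine):

    from itertools import product

    def extract_assignment(n_vars, cnf, sat):
        if not sat(cnf):
            return None                  # no solution at all
        out = []
        for v in range(1, n_vars + 1):
            if sat(cnf + [[v]]):         # a solution with v = True exists?
                out.append(v)
                cnf = cnf + [[v]]        # pin v = True and continue
            else:
                out.append(-v)
                cnf = cnf + [[-v]]       # then one with v = False must exist
        return out

    def brute_sat(cnf):
        # Exponential stand-in; the point is only the n+1 oracle calls.
        n = max(abs(lit) for c in cnf for lit in c)
        return any(all(any((l > 0) == bits[abs(l) - 1] for l in c) for c in cnf)
                   for bits in product([False, True], repeat=n))

    print(extract_assignment(2, [[1, 2], [-1, 2]], brute_sat))  # [1, 2]

That's n+1 oracle calls, each on a formula only polynomially larger, so the overhead is polynomial as claimed.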
However, this guy seems very cool - he is humble about his work, appears very serious, and even puts his code on github. And even if wrong, his approach might be interesting.
Words worthy of a politician, but not of a hacker!
Politicians are not.
Edit: why the downvotes?
If I recall correctly, Ramanujan once wrote a proof for 1=2 that baffled mathematicians for quite some time before they figured out what was wrong. I believe that proof was rather short (< 1 page). For a long proof with a tiny error, things could be much worse.
Let a, b be equal integers
a = b
a^2 = ab // multiply by a
a^2 + a^2 - 2ab = ab + a^2 - 2ab // add a^2 - 2ab to both sides
2(a^2 - ab) = a^2 - ab
2 = 1 // divide both sides by a^2 - ab
2(a^2 - ab) = a^2 - ab
This is the step to watch: since a = b, a^2 - ab = 0, so dividing both sides by it to get 2 = 1 is a division by zero.
Someone should really verify his algorithm on various SAT sets. But I have to say that his approach is very good and humble... he suggests he may have solved the problem -- but it is up to us to verify! I'd like to follow this further, so I bookmarked it via an upvote.
The NP-hard property allows you to reduce any problem in NP to it.
As you stated it, generalizing (or reducing) 3-SAT to another problem proves nothing and is a common mistake for undergraduates learning complexity theory (reducing the wrong way).
To say you generalize an NP problem isn't even a statement that typechecks in the language of complexity.
Reduction on the other hand, is a formally defined term and is very specific in what it means exactly.
If moultano had actually said 'generalize [the problem]' then I wouldn't have even left a comment, because everything you have said is correct, I just don't think his statement needed that clarification. Now, however, I'm interested in whether the word 'generalize' can be used with respect to algorithms. (<--- someone not in complexity theory)
You're wrong. 3-SAT is NP-complete.
"But if any single problem in NP-complete can be solved quickly, then every problem in NP can also be quickly solved, because the definition of an NP-complete problem states that every problem in NP must be quickly reducible to every problem in NP-complete (that is, it can be reduced in polynomial time). Because of this, it is often said that the NP-complete problems are harder or more difficult than NP problems in general."
Though it's probably inaccurate, that's how I intuit the matter.
It's actually quite accurate, but you can do even better: any other NP problem can be "compiled down" to one 3-SAT statement.