The Silent (R)evolution of SAT (acm.org)
245 points by panic on May 26, 2023 | 103 comments



"Is progress in SAT solving the sole result of hardware advancement? Time Leap Challenge compared 20-year-old SAT solvers on new computer hardware and modern SAT solvers on 20-year-old computer hardware. Although hardware improvements make old solvers faster, algorithmic progress dominates and drives today's SAT solving."*

Pretty cool given computing progress over the last 20 years:

  1. CPUs sped up 40-60x
  2. GPU FLOPS/$ increased ~10,000x


Sources:

  1. https://www.cpubenchmark.net/year-on-year.html
  2. https://en.wikipedia.org/wiki/FLOPS#Hardware_costs
*Quote shortened for brevity


A fun exercise is to write a series of SAT solvers: brute force, DPLL, and CDCL. You get an immediate feel for the relative strength of the algorithms, and none of these implementations needs to go much beyond 50-100 lines in a modern language.

You can then of course spend months tuning and optimizing the hell out of your code to squeeze a further 10x performance.
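For a flavor, here is a minimal (and deliberately naive) DPLL sketch in Python -- just unit propagation plus branching, no learning, no heuristics; clauses are lists of signed integers as in DIMACS. It's only a sketch, not anyone's production solver:

    # Minimal DPLL: `clauses` is a list of lists of nonzero ints (DIMACS-style literals).
    def dpll(clauses, assignment=None):
        assignment = dict(assignment or {})

        # Unit propagation: keep assigning forced literals until nothing changes.
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                unassigned, satisfied = [], False
                for lit in clause:
                    val = assignment.get(abs(lit))
                    if val is None:
                        unassigned.append(lit)
                    elif (lit > 0) == val:
                        satisfied = True
                        break
                if satisfied:
                    continue
                if not unassigned:
                    return None                      # clause falsified: conflict
                if len(unassigned) == 1:
                    unit = unassigned[0]
                    assignment[abs(unit)] = unit > 0
                    changed = True

        # Branch on an arbitrary unassigned variable.
        variables = {abs(l) for c in clauses for l in c}
        free = [v for v in variables if v not in assignment]
        if not free:
            return assignment                        # every clause is satisfied
        v = free[0]
        for value in (True, False):
            result = dpll(clauses, {**assignment, v: value})
            if result is not None:
                return result
        return None

    # (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
    print(dpll([[1, 2], [-1, 3], [-2, -3]]))

Brute force is the same thing minus the propagation; CDCL adds clause learning, non-chronological backtracking, and watched literals on top, which is where most of the extra lines (and the speed) go.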

If you're really crazy like Sarek, you can also formally verify these implementations: https://sarsko.github.io/_pages/SarekSkot%C3%A5m_thesis.pdf


Excellent link. I would love to see a CDCL implementation in 100 (understandable) lines. From what I see in the repo, CreuSAT is probably more like 1000?


I found the description of CDCL as an abstract rewrite system illuminating. It's much shorter than an implementation. See e.g. [1]. There is/was a more readable version online, but I can't find it now.

[1] https://easychair.org/publications/open/b7Cr


This one's 250 understandable lines: https://github.com/marijnheule/microsat. You could probably get it to below 200 lines if you removed the restart logic and learned clause compaction, but it's actually a surprisingly competitive solver as is.


Indeed, but CreuSAT is actually a high-performance implementation (i.e., worst of the best); I should probably just write up a Rust gist of CDCL.


It's worth mentioning Mathias Fleury's work; he was the first to verify a CDCL implementation [1], and he has written follow-up work since.

[1] https://fmv.jku.at/fleury/papers/Fleury-thesis.pdf


What a lovely and well-written thesis; I am reading it with great interest.


You may want to read "A Time Leap Challenge for SAT Solving". It's just a really fun paper: https://arxiv.org/abs/2008.02215


And yet, despite tremendous progress accelerating continuous optimization workloads, there are still no competitive GPU-based SAT solvers. I wonder why?


Super short answer: SAT doesn’t exhibit the nice properties other optimization problems often have where you can tell when you’re near a correct answer. So it doesn’t resemble most continuous optimization problems. Regarding parallelism, there has been research into parallel SAT, but it’s hard in part due to the problem of sharing information between threads/tasks in conflict clause learning algorithms. I don’t remember specific papers but a quick search on google scholar or looking at past SAT competitions would turn up material if you’re interested in the topic.


> SAT doesn’t exhibit the nice properties other optimization problems often have where you can tell when you’re near a correct answer.

You mean like a metric? Not all SAT instances have a natural metric. Maybe a large fraction could be relaxed to MAXSAT, but parallel solvers don't seem very popular in that setting either: there is no parallel track, and parallel solvers are disqualified [1] from participating.

I'm not sure I follow how metrics are related to parallelism though. A naive strategy would be to try every assignment in parallel on 2^n processors. There are plenty of better strategies for search space splitting [2] and stochastic local search [3] algorithms that seem amenable to parallelization.

[1]: https://maxsat-evaluations.github.io/2023/rules.html

[2]: http://sat.inesc-id.pt/~ruben/papers/martins-ictai10-talk.pd...

[3]: https://link.springer.com/article/10.1023/A:1006350622830


n is like a million variables so you would need 2^n gazillion cores.

I'm no expert, but my thesis advisor is cited in the OP article, so I'm just guessing; it's an interesting question, though. It's not for lack of trying on the part of SAT researchers; after the implementation advances in SAT solvers in the early 2000s, they would have looked at parallelism and concurrency as well. But with continuous optimization (like training machine learning models), there's gradient descent, which tells you where to guess next. With Boolean logic there's nothing like that: the needle is either in the next bale of hay, or it isn't.


The 2^n idea is just a thought experiment, but you should be able to realize linear or close-to-linear speedup using a communication-free algorithm that shards the search space evenly across available cores, no? I don't see why this strategy isn't more widely used.

I can see how continuity accelerates the search for solutions, but I see that more as an orthogonal property, not a limitation of parallelism. Even if your domain is continuous, accelerating gradient descent is nontrivial, and many discrete optimization problems benefit from parallelism.


It is obviously true that the worst-case runtime for a problem with n variables and m clauses is O(m·2^n), and that the brute-force algorithm can be sharded out to 2^k cores and run in time O(m·2^(n-k)). If your approach is brute force, yes, you might as well use this technique.

But notice that a shard whose k preset bits are in direct conflict should take zero time -- it's smarter to detect the conflict and terminate than to carry on computing all 2^(n-k) invalid assignments. Each such conflicting shard decreases the benefit of your parallelization -- a depth-first solver would spend almost no time on that particular branch, but you've wasted a whole core on it.

The real question is, what if your algorithm is smarter than brute force? Such algorithms may spend considerable time preprocessing, which is not trivially parallelized, resulting in many cores duplicating tons of work with very little effective speedup.


It depends on the sharding granularity. If you choose a low sharding ratio, then you waste a whole core, but that's assuming a 1:1 ratio of shards to cores. If you shatter the search space into a large multiple of the number of cores, e.g. 1000:1, then shuffle the shards and have each core pick a shard off the queue. If any core gets stuck with a bad shard (i.e., a conflict), it discards it and picks a new one off the queue. This requires no communication and ensures all cores stay hot. I guess you might also need some preprocessing to remove trivially conflicting shards.
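Roughly what I have in mind, as a toy sketch (pure Python, brute force within each shard just to keep it self-contained; a real version would run a CDCL solver with the shard's fixed literals as assumptions):

    import itertools
    import random
    from multiprocessing import Pool

    def falsified(clauses, cube):
        # A shard is trivially conflicting if some clause has all its literals set false by the cube.
        return any(all(cube.get(abs(l)) == (l < 0) for l in clause) for clause in clauses)

    def solve_shard(args):
        clauses, n_vars, cube = args
        if falsified(clauses, cube):                 # detect and discard bad shards immediately
            return None
        free = [v for v in range(1, n_vars + 1) if v not in cube]
        for bits in itertools.product([False, True], repeat=len(free)):
            assignment = {**cube, **dict(zip(free, bits))}
            if all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses):
                return assignment
        return None

    if __name__ == "__main__":
        clauses = [[1, 2, 3], [-1, -2], [-2, -3], [-1, -3]]
        n_vars, k = 3, 2                             # fix the first k variables per shard
        shards = [dict(zip(range(1, k + 1), bits))
                  for bits in itertools.product([False, True], repeat=k)]
        random.shuffle(shards)                       # spread hard/easy shards across workers
        with Pool() as pool:
            jobs = [(clauses, n_vars, s) for s in shards]
            for result in pool.imap_unordered(solve_shard, jobs):
                if result is not None:
                    print("SAT:", result)
                    break

The catch others point out below is real, though: the shards can be wildly unequal in difficulty, so without propagation-aware splitting (cube-and-conquer) a few shards end up dominating the wall-clock time.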


You're not wrong, but it still only works for the most naive algorithm.


So I went down the rabbit hole a little, and it turns out they did try all this stuff; it just hasn't panned out. Yet. It's one of those questions where theoretically it should work, but empirically it doesn't seem to work all that compellingly. The PPfolio solver was built to answer the question of why not just parallelize independently across cores, and it now serves as a baseline for parallelization efforts. So people are out there keeping tabs on this problem.

Now apparently neither of us read the article, because indeed there are a couple sections where it discusses parallelization. You might want to look at those remarks. For example, there's current research using GPUs and tensor processors to parallelize SAT.

But going back further, I think there's historical context, and the article alludes to why it was not in fashion in the preceding decade. Parallel SAT was looked at for a long time, but that really depended on progress in parallel hardware (the transition forced by Moore's law ending, i.e., unicore to multicore). So in the 2000s, when consumer-level multicore CPUs arose, one would imagine that would've been a turning point for parallel SAT algorithms. But historically, it was in 2001 that sequential solvers had their implementation breakthrough: the Chaff solvers were a 10-100x sequential improvement and experimentally beat the parallel implementations of the time. It was a big deal. So in terms of fashion, there was a lot of interest specifically in sequential solvers at the time. Hence the ebb and flow of research focus.

And it makes sense as a justification. From a bird's-eye view, if n >> k, then the subproblems are still large NP-complete problems. That's a reasonable rationale for first investigating the best theory/algorithms for a sequential solver, one that would run at the core of every parallel hardware node. As the article points out, Knuth says to have a good algorithm first, before you resort to throwing ever more hardware at the problem. (In some papers, SAT algorithms that are complete as well as deterministic are also offered as theoretical justifications, and the article also touches on why requiring these properties makes parallelization more complex.)

What the new GPU/tensor approaches are doing is massively splitting up the space, perhaps so that n is no longer >> k, and this alternative approach is mentioned again in the concluding paragraph. Clearly that seems like a promising area of future research to the authors. And who knows, maybe one day there will be some new breakthrough that, for all intents and purposes, makes P = NP.


Thank you for the detailed reply; this is the first cogent answer I have read that explains why GPU solvers are not widely used in SAT. The argument that we should focus on getting sequential solvers to run fast, then parallelize that in the hope that it can be reused on each core, does make sense, although I can't help but wonder if there isn't a smarter way to leverage parallelism than simply running CDCL in parallel on n independent cores.

I guess if I were to summarize your argument, (current) solvers do not use GPUs because there are not many algorithms that can leverage the additional compute (and maybe there is some theoretical barrier?). On the other hand, the continuous optimization folks had neural networks for more than half a century before they figured out how to make them run fast, and were also ridiculed for pursuing what was widely believed to be a dead end. So I would be careful about falling for the "smart people tried really hard for a while" argument.


Yeah if there indeed is some kind of theoretical reason that makes NP-complete problems hard to parallelize, that would be interesting to know about and understand. Maybe somebody has studied this problem.

I think that unlike the story with neural nets, there wasn't some faction trying to say that parallelizing SAT is a bad idea. In fact the state-of-the-art sequential solvers use techniques to exploit properties of CPU caches to get their speedup breakthrough; that in itself was an important lesson within the research community to pay close attention to hardware capabilities. My impression is that ever since that success within their discipline, SAT specialists became interested not just in algorithms/formal theory but also hardware and implementation, and in today's case that would include having a close look at GPUs/TPUs for the possibility of new algorithms.


> there is no parallel track and parallel solvers are disqualified [1] from participating.

Maybe that's as much a cause as an effect. What's the incentive to create and improve parallel solvers if there's no place that evaluates and rewards your work?


No, parallel SAT solvers are NOT disqualified; in fact, there is a parallel track. Seriously? Every year there is a winner of the parallel track. Here's last year's:

https://satcompetition.github.io/2022/results.html


That’s SATComp, I was talking about the MAXSAT competition.


Unless I'm missing something, space splitting should be trivial. If you have two cores and three propositional variables x1, x2, x3, you could simply set x1 to true on one core and false on the other, and then perform two parallel SAT searches on a space with one fewer variable.


What you describe looks like the cube-and-conquer technique [1].

[1] https://www.cs.utexas.edu/~marijn/publications/cube.pdf

The thing is much deeper than what you described. First, you need to choose which variables you will assign. Then you have to assign them, and it is important to do so carefully, because the assignment may produce an imbalanced search space.

Splitting on one variable when there are hundreds of thousands of them is very, very inefficient.


The difficulty is which variables to pick so that all subspaces are of comparable difficulty. Cube and conquer is a nice paper on that problem.


Using a naive splitting strategy is trivial, but ensuring the distribution is well balanced across the subspaces is not. You need to preprocess the formula by setting the variables, then doing some propagation, then restarting; otherwise one core will get stuck doing a bunch of useless work.


GPUs are not friends with pointer chasing, which is the basis of contemporary CDCL solvers, because their internal lazily updated structures are multiply-linked lists.

They are not friends with bottlenecks like clause queues used in contemporary stochastic SAT solvers.

They can be good at something like survey and/or belief propagation [1].

[1] https://simons.berkeley.edu/talks/belief-survey-propagation

The problem is that survey/belief propagation is effective in the area of random CNF problems, and it is not a complete solution technique: it may compute a solution, but it cannot provide you with evidence that there is no solution.


My hunch is that CDCL is probably the wrong approach to take for parallelization due to synchronization issues. You want an algorithm that is communication-free or that makes belief propagation wall-clock competitive for a large set of instances. It should be possible to modify BP to be asymptotically complete by using a special-purpose PRNG that ensures samples are drawn uniformly without replacement from the search space, but I'm not sure how to make it more sample-efficient.


It's not just GPUs. There are not even good multicore SAT solvers. State-of-the-art SAT algorithms are serial, and they don't parallelize.


ManySAT: http://www.cril.univ-artois.fr/~jabbour/manysat.htm

It shares short conflict clauses between parallel solvers and achieves superlinear speedup in some cases, e.g., 4 parallel solvers solving in less than one fourth of the single-solver solution time.

Short conflict clauses are rare, so little communication between solvers is required.

CryptoMiniSAT: https://github.com/msoos/cryptominisat

The author's goal is to have a solver that is good across the computing range from a single CPU up to a cluster. Judging from CryptoMiniSat's successes, he has mostly reached that goal.


I was thinking there might be some theoretical barrier to parallelization, e.g., maybe naturally-arising SAT instances have poor empirical scaling constants [1], so sequential solvers with good branch heuristics are pretty close to optimal. There are some portfolio solvers (e.g., [2]) that have shown modest, but not dramatic speedups. Or maybe we just haven't tried hard enough.

[1]: https://en.wikipedia.org/wiki/NC_(complexity)

[2]: https://baldur.iti.kit.edu/hordesat/files/horde.pdf


You could reduce to 2XSAT and presolve the linear portion. That would parallelize. I think people are not doing it because they are not aware of the reduction.


You'd think that the handful of people who devote a significant fraction of their time to writing SAT solvers would be aware of such simple tricks, and that there might be other reasons why your idea isn't being applied.


I would think so too, but I couldn't find this particular trick anywhere.

Anyway, I believe it's similar to LP - we still use simplex method in practice, despite the fact that LP has a polynomial algorithm.

What likely happens in SAT is that even though there actually is (as I believe) a nice polynomial algorithm on the order of O(n^4) or so (which involves repeatedly solving linear systems), CDCL still wins on practical problems simply because, in most cases, the upfront cost of solving the equations is higher than that of trying to find a first solution.

However, not all hope is lost. You could actually presolve a more general problem, and then just rely on propagation to deal with a more specific instance. For instance, for integer factorization, you could presolve a general system for integers of a certain bit size, and then if you wanted to factorize a specific integer, you would just plug constants into the presolved instance. That would eliminate most of the upfront cost and bring it on par with CDCL.

Unfortunately, DIMACS is not that good a format for representing SAT, because it is not composable like this: it doesn't allow for the representation of presolved instances. But composability might be a win for different types of algorithms that do presolving.


If you know an O(n^4) algorithm for SAT, I'm sure a couple of people would be interested in your paper.


I don't really know the exact algorithm yet, but I have a strong feeling it is possible based on what I see. There are three major steps in the strategy: the first is the 2XSAT reduction, the second is a proof of refutation completeness (in particular, a form of deduction theorem) for a certain logic that models the 2XSAT class, and the last is a polynomial-sized representation of all true statements in that logic (which uses n sets of linear equations, from which the O(n^4) complexity follows).

The 2XSAT reduction is 100% correct (I have been using it to solve small integer factoring problems), the deduction theorem proof still has a small flaw which I think is fixable, and the third part is ongoing research that is trucking along. I will get there eventually, but if more people were looking at this approach, then we (humanity) could arrive there faster, that's all.


Just gotta keep on trucking. Either way, I would be interested in seeing how the 2XSAT reduction works. Don't get discouraged!


There is a solver that can make effective use of GPUs (shameless self-promotion): https://github.com/nicolasprevot/GpuShareSat

It is a VERY fun work. Code entirely written by Nicolas Prevot, a magician of CUDA. Paper link here: https://comp.nus.edu.sg/~meel/Papers/sat21-psm.pdf


Oh, you're one of the GPUShareSat guys; I forgot about that one. Would you say the main obstacle to scaling to a million processors is the synchronization overhead, or are there certain instances from the SATComp benchmark that exhibited poor scaling with increased parallelism? Do you think sharing is essential, or do you have any ideas for improving communication-free strategies, i.e., parallelism without clause sharing, using divide-and-conquer or randomized search? Six years ago, your co-author Mate wrote a very pessimistic remark on GPU-based SAT solvers here [1]. I'm curious whether he still believes this or if his opinion changed during the course of the project, and what changed his mind? Thanks!

[1]: https://news.ycombinator.com/item?id=13667565


Hahha, I am Mate :) I still think that GPGPU can't help in the core of the algorithm. However, I was pleasantly surprised with Nicolas' work. He is a proper magician, and he had a great idea and made it work. Notice that he didn't make the GPU do propagation/conflict generation. Instead, he used it to better distribute clauses between the threads that do all of that. In a way, he improved clause cleaning. When I saw his work, I was very-very happy.

I still hold that GPGPUs can't do CDCL. However, they may be helpful in some of its sub-parts, or may be able to run another algorithm for solving SAT that we haven't invented yet.

Just my 2 cents,

Mate


Because all known SAT algorithms rely heavily on data-dependent branching based on non-local data. That makes the existing algorithms slow on GPUs. An algorithmic breakthrough is needed to change this.


Unit propagation (a major component of CDCL-based SAT solvers) is P-complete, and thus has no efficient parallel algorithm unless NC = P.


Interesting. Do you have a link where I can read more about that? How does this relate to k-SAT or CNF complexity; isn't UP complete w.r.t. these problems?


Parallelizing SAT solvers seems like a strikingly important avenue for future research.


Similar to faster-than-light travel.


Really? Do parallel algorithms for SAT contradict any known laws of physics?


It contradicts mathematics. :)

Unit propagation is P-complete.


That's not really a proof that we can't get a nice thing, though. SAT is NP-complete, but here we are talking about it. There may be ways to use parallelism at a higher level. And even if there isn't an alternative to unit propagation, a purely constant factor speedup with parallel processors isn't out of bounds and would still be fantastic.


> a purely constant factor speedup with parallel processors isn't out of bounds and would still be fantastic.

The state of the art with this is about 2x the speedup with an unbounded number of threads.

That's not bad, but is not very useful because in most use cases of SAT you have multiple distinct problems solvable in parallel anyway.


Isn’t UP complete with respect to CNF satisfiability, though? How does P-completeness relate to the parallelizability of SAT?


It's related in the sense that we can guarantee that all threads have to communicate at every step of the algorithm.

It's not a deal breaker, but we really need some unprecedented advance in mathematics to negate this or to solve SAT without this communication.


I remain skeptical. Whether NC = P is still an open question, so there could be problems that are decidable in P but only when using a polynomial number of processors, i.e., superpolynomial on a single processor but polynomial on a polynomial number of processors. Furthermore, the amortized complexity of common instances could turn out to be tractable on average (e.g., random 3-SAT instances are tractable unless they fall in a very narrow region where the ratio of clauses to variables is ~4.25, even though the problem is NP-complete in the worst case).


You're not wrong, but people have been trying to do this for a while now without much success.

So much so that you could probably get a Turing Award by making a SAT solver that is 100-1000x faster on a GPU than on a CPU.

That would also lead to solving a bunch of NP-complete problems 1000x faster, so you can see why it would be a big deal.


I keep asking this question and not getting satisfying answers, so I'll try again: I'm sure I encounter problems every month that reduce to SAT (or SMT), and I could save a lot of time and energy (for both me and my computer) if I just asked a SAT solver to find the answer instead of writing custom code each time.

However, that thought never strikes me because I haven't gotten used to thinking about problems as SAT reducible. But since the thought never strikes me, I also don't get many opportunities to get used to that!

How does one get started with incorporating SAT (or SMT) solving as a natural part of going about one's day?


Keep your eyes open for problems where you have to do some form of brute-force search -- those that don't have a clear solution other than trying all the possibilities.

Another thing is that the best kinds of problems for SAT solvers are those that ask a yes/no question: is there a way to satisfy all the constraints, and if so, what variable assignments do so? It's also possible to use SAT solvers for other kinds of problems (e.g., finding the solution with the best score), but it's more advanced.
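A tiny worked example of the yes/no style, assuming the python-sat package (`pip install python-sat`) and its Glucose3 wrapper -- the interesting part is the encoding; "can 3 tasks go into 2 exclusive slots?" is just the pigeonhole principle in disguise:

    from pysat.solvers import Glucose3

    tasks, slots = 3, 2
    var = lambda t, s: t * slots + s + 1             # variable id for "task t goes in slot s"

    solver = Glucose3()
    for t in range(tasks):                           # every task gets at least one slot
        solver.add_clause([var(t, s) for s in range(slots)])
    for s in range(slots):                           # no slot holds two tasks
        for t1 in range(tasks):
            for t2 in range(t1 + 1, tasks):
                solver.add_clause([-var(t1, s), -var(t2, s)])

    print(solver.solve())                            # False: 3 tasks cannot fit in 2 exclusive slots

If solve() returns True, get_model() gives you the satisfying assignment as a list of signed literals, which you then decode back into your problem's terms.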


Try CSP instead. It works as a higher-level abstraction over SAT that is easier to model with.


Right, SAT can sometimes be thought of as the assembly language of discrete optimisation.


What is CSP?

Edit: looked it up, Constraint Satisfaction Problem.


I wonder if they'll write a followup on the SMT revolution that happened after the CDCL breakthroughs for SAT.

The lazy approach to SMT was a huge step forward, but it also spawned a whole lineage of refinements and alternatives, and it changed the state of automated reasoning entirely.

A lot of work today seems to be going into theory combinations, so that you can efficiently answer problems that, for example, involve arrays and integers with dependencies between the two.

'Solving' (it's an undecidable problem) SMT would open up a bunch of new possibilities.


> it's an undecidable problem

In what sense? People often use SMT (Satisfiability modulo theories) but don't state _which_ theories. I often SMT-solve with decidable theories, like Presburger Arithmetic.


I almost exclusively use it with quantifiers, which makes it equivalent to FOL. But even without quantifiers, theory combinations can introduce undecidability.

There’s the auxiliary problem that "decidable but slow" can often be practically equivalent to undecidable.


SAT is tantalizingly simple to describe and is an intense rabbit-hole if you are intellectually curious. I liked the article.

The SAT Competition [1] is a good place to find state of the art.

[1]: http://www.satcompetition.org/


For anyone who understands more easily through code, I suggest:

https://github.com/msoos/minisat-v1.14

It's an early version of MiniSat by Niklas Eén and Niklas Sörensson. You can get the original ZIP from minisat.se, but it's easier to read from GitHub. Enjoy!


I'm not an expert in any of these, but in the past few years, in addition to the success of LLMs for natural language processing, I have repeatedly read about the impressive advances in SAT solvers and in proof assistants. Given how impressive LLMs are in spite of their inability to reliably perform reasoning tasks or follow instructions with logical strictness, I wonder how much more impressive it could get if such systems were integrated.


You will enjoy "Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations". It prompts an LLM to generate a tree of explanations and runs a MAX-SAT solver over it: https://arxiv.org/abs/2205.11822


Thanks, I'll check it out!


I wonder what the target audience of articles like this is. I think the article is well-written; however, 90-95% of the terminology makes little or no sense to me, as someone who is not proficient in this field. Would the article be useful for someone who _is_ proficient in the field?


> Would the article be useful for someone who _is_ proficient in the field?

Yes, it is a very well-written summary of what has been going on with SAT solvers in the last decades. I personally think most of this (r)evolution can be attributed to miniSAT: a relatively easy-to-read (at least compared to other theorem provers), free-software implementation of a SAT solver, with instantiation if I remember correctly. Since variables only have two possible valuations (true/false) in Boolean logic, they can be instantiated and the problem can be split up, and sometimes greatly simplified in doing so. I.e., we can replace each occurrence of variable 'a' with 'true' to make a new, smaller problem, and then replace each occurrence of variable 'a' with 'false' to make another smaller problem; i.e., split one problem into two simpler problems, solve them in parallel, etc.


As I remember, Glucose was a larger advance than MiniSat. MiniSat was well-engineered but not exceptional. Glucose started as a hack, but the heuristic it introduced was so effective that you basically couldn't compete without copying it.


That is true, but there are many interesting heuristics in modern SAT solvers, such as kissat: https://github.com/arminbiere/kissat (winner for a few years now). LBD ("glues" in glucose) is one of them. Kissat is actually quite readable. More readable is CaDiCaL, also extremely fast and effective: https://github.com/arminbiere/cadical

I personally also develop a SAT solver called CryptoMiniSat, but the above two are easier to read :)


Glucose was a patch on minisat I think. So it couldn't have existed without minisat and its exceptional popularity. For a while there was even a category at the competition where entrants had to be small modifications of minisat.


You could be correct. I haven't worked in academia for about 12 years; and I mostly worked on other types of logic. I do remember miniSAT being praised for its openness and readability, and it seems to predate the Glucose solver by some years. I.e. "in my day" miniSAT had all the praise, but perhaps Glucose was a more important contribution to the field.


FWIW, Knuth's chapter on satisfiability mentions MiniSat once, for historical interest, and discusses the Glucose heuristic for five pages. This matches my impression of their technical contributions.


It's not just you. I found the technical sections of the article impenetrable. And yet my background is in resolution theorem proving (I did my PhD research on Inductive Logic Programming), and I read the article wishing to understand how CDCL is related to Resolution, but I couldn't figure that out.

I don't have a background in SAT solving, but the "clause learning" in CDCL looks a lot like what we call a Resolution derivation in Resolution theorem proving. In short, given a conjunction of two (propositional or first-order) clauses (¬φ ∨ ψ) ∧ (φ ∨ χ), Resolution eliminates the contradiction in the "complementary pair" (¬φ, φ) and derives the new clause ψ ∨ χ. If that's what CDCL folk call learning, then that's really interesting to me, because most of Inductive Logic Programming is an effort to use Resolution for inductive inference, i.e. machine learning, but in logic. I guess I should check the article's bibliography. They reference a textbook that might be useful ("Handbook of Satisfiability", though one of its authors is an author of the article, so that may not bode well for the textbook).


Conflict explanation in CDCL is entirely based around resolution, the idea being to take a conflicting set of clauses and apply resolution a bunch of times to “simplify” the conflict into its most general form.
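To make the rule concrete, a toy sketch (clauses as Python sets of signed integers; this is only the single resolution step, not the full conflict analysis):

    def resolve(c1, c2, var):
        # Resolve clauses c1 and c2 on `var`, assuming var is in c1 and -var is in c2.
        return (c1 - {var}) | (c2 - {-var})

    # (x1 or x2) and (not x1 or x3) resolve on x1 to (x2 or x3)
    print(resolve({1, 2}, {-1, 3}, 1))               # {2, 3}

Conflict analysis chains such steps backwards along the implication graph, resolving the conflicting clause against the clauses that propagated its literals, typically until only one literal from the current decision level remains (the "first UIP" scheme).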


If I understand correctly what "conflicting" means, that process can't possibly result in an empty clause, as in Resolution refutation proofs, correct?

What would be the gold standard reference for CDCL? Something that would be taught at an undergraduate class, say?

Edit: after reading a bit on the net it seems like clause learning in CDCL is basically Resolution-derivation by input Resolution. Cool beans.


Resolution is actually also what the 2SAT algorithm is doing.
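Concretely, the textbook polynomial 2SAT algorithm works on the implication graph: each clause (a ∨ b) is read as the implications ¬a → b and ¬b → a, resolving two 2-clauses corresponds to composing paths in that graph, and the formula is unsatisfiable exactly when some variable lands in the same strongly connected component as its negation (Aspvall-Plass-Tarjan). A minimal sketch:

    import sys
    from collections import defaultdict
    sys.setrecursionlimit(100000)

    def two_sat(n_vars, clauses):
        graph, rgraph = defaultdict(list), defaultdict(list)
        for a, b in clauses:                         # clause (a or b): edges -a -> b and -b -> a
            graph[-a].append(b)
            rgraph[b].append(-a)
            graph[-b].append(a)
            rgraph[a].append(-b)

        literals = [l for v in range(1, n_vars + 1) for l in (v, -v)]
        seen, order = set(), []
        def dfs(u):                                  # pass 1: record finishing order
            seen.add(u)
            for w in graph[u]:
                if w not in seen:
                    dfs(w)
            order.append(u)
        for u in literals:
            if u not in seen:
                dfs(u)

        comp = {}
        def rdfs(u, label):                          # pass 2: label SCCs on the reversed graph
            comp[u] = label
            for w in rgraph[u]:
                if w not in comp:
                    rdfs(w, label)
        for i, u in enumerate(reversed(order)):
            if u not in comp:
                rdfs(u, i)

        # Satisfiable iff no variable shares a component with its negation.
        return all(comp[v] != comp[-v] for v in range(1, n_vars + 1))

    print(two_sat(2, [(1, 2), (-1, 2), (-2, 1), (-1, -2)]))   # False (unsatisfiable)
    print(two_sat(2, [(1, 2), (-1, 2)]))                      # True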


I like how the Wikipedia page on 2SAT explains the use of Resolution:

>> Resolution, a method for combining pairs of constraints to make additional valid constraints, also leads to a polynomial time solution.

"Constraints" being a slightly odd-sounding way to put it, to me anyway, but I guess that's because I don't have the relevant background to think of clauses that way.


The target audience is members of the ACM, so that mostly includes computer science researchers, I guess.


Increasingly it seems that P=NP is not really all that interesting of a problem from a practical perspective.

The progress made on various NP-complete problems, SAT especially, shows that the vast majority of problem instances tend to be solvable, with the worst-case instances being relatively rare but carrying a much greater complexity cost.


I don't really get why people think P≠NP anymore. I mean, isn't everyone expecting some kind of superintelligent AGI that can recursively improve itself to turn the planet into goo? Don't people think that this kind of AGI will get so good at nonlinear optimization that it will turn all humans into paperclips as a way to maximize the output of a paperclip factory? Why do we think that this insane next-level intelligence will somehow be able to do things like that, but not be able to figure out what pattern of bits gets a bunch of AND and OR gates to output a "1"?

10-15 years ago, the basic idea was "P≠NP because otherwise computers could do crazy shit." Well, looks like they can do crazy shit!


P != NP is more likely akin to a fundamental mathematical law, like Noether's theorem [0]. No amount of self-improving AI will alter the fundamental laws of physics. More provincially, in our solar system there's a cap on energy, so any system that uses it will eventually hit a ceiling (like a self-improving AI that eats up energy resources at an exponential pace).

As motivation for why P != NP, one can think of it as a finite restatement of the Halting Problem [1] in the form of asking whether a (polynomially sized) Turing machine has an input that will halt in K steps.

[0] https://en.wikipedia.org/wiki/Noether%27s_theorem

[1] https://en.wikipedia.org/wiki/Halting_problem


> As motivation for why P != NP, one can think of it as a finite restatement of the Halting Problem

That's a really poor motivation. Analogously, you can think of the NP question as solving a polynomial equation over Z_2. What you're arguing is that because the problem is unsolvable over Z (due to the DPRM theorem), then over Z_2 we have to try all the solutions.

That argument fundamentally ignores that the source of unsolvability is the infinite nature of Z.

For the record, I believe that P=NP (see https://news.ycombinator.com/item?id=35729626). I wish people spent more time looking for a polynomial algorithm using a logic approach, as I am doing (trying to extend the 2SAT and XORSAT proofs to 2XSAT). This approach could give new algorithms that actually compose and can be parallelized (DIMACS is a really awkward representation).


Gödel's incompleteness theorem and the Halting Problem are very nearly equivalent. I believe Turing, very soon after hearing about Gödel's incompleteness theorem, used it to show the undecidability of the Halting Problem. The consequences of Gödel's incompleteness theorem and the Halting Problem wove into many outstanding problems, including Hilbert's 10th problem.

I would argue that computation is a much more natural setting for thinking about these ideas. Whether it is recursively enumerable sets, Diophantine equations, or set theory doesn't fundamentally matter so much, as they're all equivalent in some way. I wouldn't argue that you, or anyone else, shouldn't think of it in that way, but in my opinion, for many people, myself included, casting it through a computational lens makes more sense.

Not to harp on a small point, but I would argue the diagonalization proof of the Halting Problem relies on there being different orders of infinity, not on Z itself being infinite. Since Gödel's incompleteness, we know that the continuum hypothesis can be true or false depending on which (consistent) axioms you choose, so I would argue that it's not so much which axioms we choose for how we think about infinities that's fundamental, but the fact that consistent systems of sufficient complexity must be incomplete.

I'm firmly in the P!=NP camp, but I commend you for putting your beliefs out there and actually doing the work. I do believe people throw up their hands when they hear "NP-complete" when they absolutely shouldn't. The path from problems being "easy" (polynomial) to "hard" (exponential) is not very well understood, and even if a problem is, in general, NP-complete, that doesn't mean a particular ensemble it is drawn from couldn't be solved by some clever polynomial-time algorithm.

And of course, P ?= NP still remains an open problem so we still don't understand something pretty fundamental.


> Not to harp on a small point but I would argue the diagonalization proof of the Halting problem relies on there being different orders of infinity, not that Z itself is infinite.

I am philosophically very close to finitism, and I think from the perspective of Skolem's paradox it's not clear whether different infinities are not just a "mirage", so to speak. The distinction between finite and infinite seems much more "real" to me.

> but the fact that consistent systems of sufficient complexity must be incomplete

The problem is that if P!=NP, then it would seem to imply (from what I have seen) that there is a large class of finite deduction systems which are incomplete, no matter what rules of inference you pick. This to me is a lot more counterintuitive than simply accepting that finite and infinite are just very, very different worlds, and our intuition easily fails when we transfer results from one to another.

And as I describe in that other thread, with the 2XSAT being NP-complete, the idea that you have two finite sets in Z_2^n that are each relatively easy to describe (instance of 2SAT and instance of XORSAT), but their intersection (which is in 2XSAT) is somehow "difficult" to describe is just.. really really hard to take seriously. Of course it's just another intuition, but I don't see how anybody who believes that P!=NP can look at this and not start doubting it.

> I do believe people throw up their hands when they hear "NP-Complete" when they absolutely shouldn't.

I very much agree. I was originally agnostic on whether P=NP (now I am tilted toward believing that P=NP and there is actually an efficient algorithm). I always felt that assuming P=NP ("assuming" in the philosophical, not mathematical, sense) and looking for a polynomial algorithm (what would the polynomial algorithm have to look like? For example, it would have to represent an exponential number of solutions in order to look at them all at once) is a better way to make progress on the question than assuming P!=NP. (Kinda like if we assume that the world is knowable, we are more motivated to know it, rather than if we assume it's unknowable.)

It's also better in practice, because if you actually try to construct a polynomial algorithm and it fails, you will get a much more concrete idea of what the obstacle is (a specific set of instances where it fails). I think because of natural proofs, if you assume P!=NP, then you believe that progressing this way is essentially impossible. So I think that's the reason people give up: if you believe P!=NP, resolving it this way is hopeless, and at the same time, nobody has any better idea. However, if you ever so slightly believe that P=NP, and that humans are just not smart enough to have figured it out in 50 years (remember, it took several centuries to figure out Gaussian elimination, which seems completely obvious in retrospect), then you're not constrained by natural-proof limitations and there is no reason for hopelessness.


> I am philosophically very close to finitism, and I think from the perspective of the Skolem's paradox it's not clear whether different infinities are not just a "mirage", so to speak. The distinction between finite and infinite seems to be much more "real" to me.

I think I'm also in the same camp (finitism). My view is that infinities are kind of "verbs" rather than "nouns", even if we use them as if they were nouns for short-hand. I guess another way to say that is infinities have algorithms associated with them, where we kind of "turn the crank" and get more output.

> And as I describe in that other thread, with the 2XSAT being NP-complete, the idea that you have two finite sets in Z_2^n that are each relatively easy to describe (instance of 2SAT and instance of XORSAT), but their intersection (which is in 2XSAT) is somehow "difficult" to describe is just.. really really hard to take seriously. Of course it's just another intuition, but I don't see how anybody who believes that P!=NP can look at this and not start doubting it.

I don't understand the hesitation here. Once you have a fundamental gate (NOR, NAND, etc.) you have universal computation. In my opinion, 2-SAT is just barely in P because of the extraordinary restriction that each clause have 2 variables. As soon as you relax that condition, it's easy to do arbitrary computation.

> ... if you assume P!=NP, then you believe that progressing this way is essentially impossible.

I think we're in agreement here. My personal view is that it's not so much that it's better to assume P=NP as it is to understand that there are different ensembles of problem space to choose from, and even if the larger "space" is NP-complete, the ensemble might only produce "almost surely" polynomially solvable instances. This is the case for Hamiltonian cycle problems on Erdős–Rényi random graphs, for example.

> ... then you're not constrained with natural proof limitations and there is no reason for hopelessness.

I do wish I understood the natural proofs limitation a bit better. I suspect there's a loophole somewhere or, at the very least, that it points us to another region of proof techniques, but I just don't understand it well enough to know what's going on.

> ... now I am tilted to believe that P=NP and there is actually an efficient algorithm ...

I guess I don't have much hope of convincing you otherwise, but I would like to point out again the "finite Halting Problem" interpretation. For an arbitrary program F(inp, K) (taking input `inp` and running for K steps), you're saying that the following can effectively be short-circuited from an exponential search to a polynomial one:

    for inp in [0 .. 2^n - 1] do
      if F(inp,K) then return true
    done
    return false
To my eyes, if this can be short-circuited, it's essentially saying that for arbitrary functions, there's no actual reason to do computation as it can be short-circuited. In other words, you're saying one can do better than raw simulation in every single case. While potentially true, it seems like a pretty bold statement.


> In my opinion, 2-SAT is just barely in P because of the extraordinary restriction that each clause have 2 variables. As soon as you relax that condition, it's easy to do arbitrary computation.

I think your intuition is wrong here. The number of variables in a clause is not the reason why NP is difficult. XORSAT also has an arbitrary number of variables per clause, and it is doable in P. Furthermore, the 2XSAT reduction shows that you can represent any 3-SAT clause with only 2-SAT and XORSAT clauses.

> To my eyes, if this can be short-circuited, it's essentially saying that for arbitrary functions, there's no actual reason to do computation as it can be short-circuited. In other words, you're saying one can do better than raw simulation in every single case. While potentially true, it seems like a pretty bold statement.

I don't think that's what P=NP would imply. Remember, if P=NP, the algorithm is polynomial in the worst case; it can still be true that for a portion (even a majority) of actual instances, the simulation is faster than applying the algorithm. (It is also possible that for some instances -- functions that have already been properly "inverted" -- the decision algorithm and the simulation are equivalent; this is similar to linear equations, where once you have factorized the matrix to upper echelon form, applying the matrix is the same process as solving it.) What the result would say is that you don't need to be exponential in the worst case.


> I think your intuition is wrong here. The number of variables in the clause is not the reason why NP is difficult. XORSAT also has arbitrary number of variables per clauses, and is doable in P. Furthermore, the 2XSAT reduction shows that you can represent any 3-SAT clause only with 2-SAT and XORSAT clauses.

XORSAT is not NP-complete because XOR boolean functions can't be chained to create arbitrary computation. 2SAT is restrictive because, even though it has a "fundamental" gate (NAND/NOR), the "splay" of variable interaction is so limited. In some sense, 2SAT and XORSAT, each individually, are conspiring to prevent general-purpose computation.

> I don't think that's what P=NP would imply. ... What the result will say that you don't need to be exponential in the worst case.

If P=NP, then the program I listed above can be short circuited from exponential search to polynomial.


I don't think you understand. Sure, 2SAT and XORSAT are, in isolation, too weak to be NP-complete. However, if you combine them, and allow for both types of clauses in the same instance (on a common set of variables), then surprisingly it is NP-complete (I call that class 2XSAT), by reduction from 3SAT.

This is, to my knowledge, a previously unknown reduction and the reason why I now strongly believe that P=NP.


Yes, I think I understand, both why you believe P=NP and what the 2XSAT construction is.

You're basically saying, "Look at this polynomial problem and this other polynomial problem; if we combine them in a natural way, suddenly it becomes exponential?" My point is that your intuition is wrong. For example, XOR by itself is not Turing-machine-equivalent, nor is NOT, nor are AND and OR by themselves, but combine some subset and suddenly you get NAND, NOR, etc., which gives arbitrary computation.

One of the interpretations of the Halting Problem is that one can do no better than raw simulation. The intuition, by extension, is that for NP-Complete problems, one can also do no better than raw simulation (search the space for solutions). If we could do better than simulation, if there were a "short circuit" to the computation, then the program I listed above, which looks suspiciously like running arbitrary program with some time and space bounds, would not need to run in exponential time.

Arbitrary computation is the norm, not the exception. 2SAT and XORSAT are the exception because of their restrictions putting them in polynomial time and their inability to do arbitrary computation. If their fragility is disturbed, even by adding just a whiff of freedom in the form of more gates or arrangement, then they suddenly allow for arbitrary computation.

You believe your addition is just a slight nudge so the topple to arbitrary computation and, potentially, exponential time seems drastic, but you have a bias on what's "natural". For example, you could have decided to add some portion of 3 variable clauses to the 2SAT instance to see how quickly it blows up (if at all). Or you could have combined some different proportion of XOR clauses to the 2SAT instance to see how quickly it becomes intractable (or if it stays tractable). Which is more natural? Which is the smaller nudge? Why is your nudge not a shove?

FYI, there's a lot of work on NP-complete phase transitions. When I said this is poorly understood, I mean it. For all intents and purposes, "random" 3SAT (each clause choosing its variables uniformly at random, with equal probability of being negated) is solvable, even for millions of variables. As far as I know, there isn't a good proposal or understanding for a "natural" and "random" 3SAT ensemble that remains difficult as the scale increases.


> You're basically saying "Look at this polynomial problem and this other polynomial problem, if we combine them in a natural way, suddenly it becomes exponential?". My point is that your intuition is wrong.

Yes, but that's not the only thing I am saying. Look more closely at what 2XSAT really is: it is basically 2SAT transformed onto an affine subspace of Z_2^n. You claimed that the complexity of 3SAT comes from having 3 variables per clause rather than only two, as in 2SAT, but here I am showing an NP-complete class that has the same property as 2SAT!

We can also restate the 2XSAT reduction this way: the only clause you need to be NP-complete is a clause that looks like (A XOR B) OR C (for arbitrary literals A, B, C). As you can notice, this clause already has the property of a 2SAT clause (it forbids 2 combinations of A, B, C out of 8) rather than of a 3SAT clause (which forbids 1 combination of A, B, C out of 8).

I think you're actually assuming the conclusion, if you think that because something gives rise to "universal computation", then it must be difficult. But that's possibly silently assuming P!=NP.

Still, the main point the 2XSAT reduction brings (at least to me) is that it hints there is a clever generalization of the algorithms for 2SAT and XORSAT that runs in P (and if there isn't, it will at least explain more clearly what the issue is). So that's my strategy: I am generalizing the proof for 2SAT to handle what I call generalized literals -- XORs of any number of variables, i.e. linear equations -- rather than simple literals.

> Or you could have combined some different proportion of XOR clauses to the 2SAT instance to see how quickly it becomes intractable (or if it stays tractable).

I don't understand this paragraph. I am not sure how I would measure whether a particular instance of 2XSAT is tractable or not. (I think they are all tractable, but I am still working on it.) I think the "hard" but satisfiable instances in 2XSAT have a really small-dimensional affine subspace defined by the XORSAT portion, and a 2SAT portion weak enough to disallow all solutions.

> FYI, there's a lot of work done for NP-Complete phase transitions.

I am really not into phase transitions, but I think 2XSAT being NP-complete could explain them pretty nicely. I think most of the observed behavior comes from the XORSAT portion dominating hard-but-satisfiable instances. AFAICT, XORSAT exhibits a similar phase transition, where there is only a relatively small number of low-dimensional solutions to a random instance of XORSAT: either it has many solutions or the system becomes unsolvable. The only difficulty in analysing this is that the 2SAT portion of 2XSAT, if it is dense enough, can also contribute to the XORSAT portion (by forcing some variables to be constant or equal).


The catch here is that P vs NP talks about worst-case behavior. Many SAT solver inputs can be solved quickly with good heuristics, but there are some that are still difficult no matter what you do. One way to see this is via the phase transition behavior of SAT.

https://www.researchgate.net/profile/Lavallee-Ivan/publicati...

If you have too few constraints, it's easy to find a solution to the problem. If there are too many constraints, it's also easy to show there are no solutions. But in the middle there's a sweet spot in the ratio of constraints to variables where the problem becomes fiendishly difficult.
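You can see this for yourself with a quick, very toy-sized experiment (so the transition is blurry): generate random 3-SAT instances at various clause/variable ratios and check how often they are satisfiable. Brute force is fine at a dozen variables:

    import itertools
    import random

    def random_3sat(n_vars, n_clauses):
        clauses = []
        for _ in range(n_clauses):
            vs = random.sample(range(1, n_vars + 1), 3)
            clauses.append([v if random.random() < 0.5 else -v for v in vs])
        return clauses

    def satisfiable(n_vars, clauses):
        for bits in itertools.product([False, True], repeat=n_vars):
            assignment = dict(zip(range(1, n_vars + 1), bits))
            if all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses):
                return True
        return False

    n, trials = 12, 20
    for ratio in (2.0, 3.0, 4.0, 4.3, 5.0, 6.0):
        sat = sum(satisfiable(n, random_3sat(n, int(ratio * n))) for _ in range(trials))
        print(f"ratio {ratio}: {sat}/{trials} satisfiable")

At this size the threshold is smeared out, but the drop from mostly satisfiable to mostly unsatisfiable around the critical ratio should already be visible.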


> but not be able to figure out what pattern of bits gets a bunch of AND and OR gates to output a "1"? 10-15 years ago, the basic idea was "P≠NP because otherwise computers could do crazy shit." Well, looks like they can do crazy shit!

I think you've fundamentally misunderstood the meaning of the P!=NP problem. In very simplified terms, it's not about what a computer can do, but about how long it takes to do something in the worst case.


There already exist local-search based algorithms that can find solutions way faster than a SAT solver can... Or they get completely stuck unable to make any progress.

All an LLM does is guess the next token based on the previous tokens and its training weights. For it to give you a solution to a large SAT problem, it would have to spit out a million-character-long binary string.

The likelihood that most of that string will be entirely hallucinated is very high. LLMs are not magic.

Deep Learning is already used internally in some SAT solvers to heuristically pick in which direction the search should go.


> "10-15 years ago, the basic idea was "P≠NP because otherwise computers could do crazy shit."

Everyone is downvoting you because that's not the mathematical explanation, but it's absolutely the informal explanation that the 'pop science' sources were giving.


I don't mind the downvotes as I expected this would be somewhat provocative, going against the general grain that people think P≠NP. I agree that, in general, this is not a mathematical explanation for why P≠NP, but a rough heuristic to explain why people think it is true, predominantly in pop sci circles. I would also emphasize that there is no mathematical explanation for P≠NP - or else the problem would be solved - and at the end of the day, the reasons that computer scientists suspect this to be true are equally heuristic in nature, not amounting to much more than "we tried to solve these problems and couldn't do it."


I don't think that was ever the informal explanation. People think that P≠NP because they tried hard and did not find any NP-hard problem in P.


> "People think that P≠NP, because they tried it hard, and did not find any NP-hard problem in P."

That was the informal explanation given to computer scientists. But that's not so convincing to non-computer-scientists, and it's not the informal 'pop science' explanation.

An example (not the only one) of the kind of 'pop science' explanation I mean is Impagliazzo's Five Worlds (https://gwern.net/doc/cs/cryptography/1995-impagliazzo.pdf). One of those hypothetical worlds he called 'Algorithmica', and it's where P = NP. One of the amazing and outlandish things that could be accomplished in such an exotic world would be the following feat: "Thus, a computer could be taught to recognize and parse grammatically correct English just by having sufficiently many examples of correct and incorrect English statements, without needing any specialized knowledge of grammar or English."

It's not so wild to me, to think that if someone's understanding of P vs. NP was from that kind of pop science article, then they would think we should start considering more seriously that we are in the Algorithmica (P = NP) world where such feats are possible!


If P = NP because there's an O(N^{Graham's Number}) polynomial algorithm - a 'galactic algorithm' indeed - then the lay description is theoretically correct, but meaningless in practice.

In particular, the hand-waving is the phrase 'sufficiently many examples.' This may be impossible to provide even if only a googol (10^100) examples are needed, because of mass and energy limitations of the universe - there's only so much you can do with 10^80 atoms, and the speed of light is too slow.

"Only a googol" because Graham's Number is mind-boggingly huge - https://en.wikipedia.org/wiki/Graham%27s_number

The Wikipedia entry for "galactic algorithm" at https://en.wikipedia.org/wiki/Galactic_algorithm even mentions SAT: "a hypothetical large but polynomial O(n^2^100) algorithm for the Boolean satisfiability problem, although unusable in practice, would settle the P versus NP problem".

Algorithmica may therefore be very little different from our world, other than that people would no longer debate whether P = NP.


Thank you. I stand corrected; people indeed argued that, 10-15 years ago, the basic idea was "P≠NP because otherwise computers could do crazy shit" (one person at least).


Informal, as in poorly researched clickbait pop science. Most of the serious pop-science sources (Scientific American, Quanta Magazine, Computerphile, etc.) have explained it correctly, without resorting to woo.


> isn't everyone expecting some kind of superintelligent AGI that can recursively improve itself to turn the planet into goo?

I most certainly am not.



