NP-hard does not mean hard (2017) (jeremykun.com)
128 points by stefanpie on Aug 17, 2023 | 93 comments


Corollary: P does not mean easy.

Most things that you do on the computer, you want to be O(n) or less; maybe O(n log n) or even O(n log^x n), but no slower than that. That means if you get a bigger problem, you can generally just get a proportionally bigger computer and be all set.

Now, sure, there are plenty of O(n^2) problems where the n stays small enough, or you don't mind waiting or spending a ton of money on it. But just because something is in P doesn't mean that you want to be solving it with a polynomial time algorithm on a regular basis.


Our CS prof also had another interesting point: P = NP could be true without changing many things in reality.

This could occur if the reduction of an NP-complete problem to a polynomial-time algorithm results in a runtime of such monstrous polynomial degree that the exponential algorithms are just faster for every tractable problem size. Something like this already exists for some graph problems: theoretically faster algorithms are known, but in practice they are a lot slower than the theoretically slower algorithm until completely silly graph sizes.

This could turn out even more frustrating if the proof were nonconstructive.


Another interesting take: P != NP could be true, while changing MANY things in reality.

Basically all modern asymmetric cryptography in common use relies on the difficulty of either integer factoring or discrete logarithms (including discrete logarithms over elliptic curves). The problem is, none of those problems is proven to be NP-complete! Even if we proved P != NP, there could still be polynomial-time algorithms for integer factoring and/or discrete logarithms.


Staying in the realm of polynomial complexity, matrix multiplication comes to mind: the best known exponents keep creeping closer to O(n^2), while O(n^3) is the naive but more common implementation.


In terms of practical algorithms, the Strassen algorithm (O(n^2.8)) is the only one that has runtime advantages for matrix sizes that aren't enormous, and even then, it's not always used because it has two non-trivial costs: reduced numerical stability and more memory space requirements for intermediate results.
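For reference, Strassen's trick on 2x2 blocks looks like this. A rough recursive sketch in Python/NumPy (my own, power-of-two sizes only, allocating temporaries freely; real implementations are far more careful, as discussed below):

    import numpy as np

    def strassen(a, b):
        # Multiply two square matrices whose size is a power of two.
        # Textbook recursion; real code switches to a plain GEMM below a cutoff.
        n = a.shape[0]
        if n <= 64:                      # cutoff: naive multiply wins here
            return a @ b
        h = n // 2
        a11, a12, a21, a22 = a[:h, :h], a[:h, h:], a[h:, :h], a[h:, h:]
        b11, b12, b21, b22 = b[:h, :h], b[:h, h:], b[h:, :h], b[h:, h:]
        # seven recursive products instead of the naive eight
        m1 = strassen(a11 + a22, b11 + b22)
        m2 = strassen(a21 + a22, b11)
        m3 = strassen(a11, b12 - b22)
        m4 = strassen(a22, b21 - b11)
        m5 = strassen(a11 + a12, b22)
        m6 = strassen(a21 - a11, b11 + b12)
        m7 = strassen(a12 - a22, b21 + b22)
        c = np.empty_like(a)
        c[:h, :h] = m1 + m4 - m5 + m7
        c[:h, h:] = m3 + m5
        c[h:, :h] = m2 + m4
        c[h:, h:] = m1 - m2 + m3 + m6
        return c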


It actually doesn't require more memory for intermediate results (see the Strassen reloaded paper). It's more just that it's a ton of work to implement well (even compared to a regular gemm which is already hard), and the benefits only start showing up at pretty large (~4000x4000) matrices.


> This could occur if the reduction of an NP-complete problem onto a polynomial problem results in a runtime of such monstrous polynomial degree that the exponential algorithms are just faster for every tractable problem size.

You don't even need to go that far.

n^8 is smaller than 2^n once you get past 44, still well in the tractable range, but encrypting something against n^8 brute force is a pretty reasonable task.
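A quick sanity check of that crossover point (throwaway Python, nothing beyond the claim above):

    # smallest n >= 2 where an n^8 algorithm is cheaper than 2^n brute force
    n = 2
    while n**8 >= 2**n:
        n += 1
    print(n)  # prints 44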


This is similar to Knuth's reasoning for P=NP. Essentially there might be algorithms for these problems that are simply so complex that we might never know them.


A terrific teardown of tracking down an unexpected O(n^2): https://randomascii.wordpress.com/2021/02/16/arranging-invis...


The "accidentally quadratic" blog collected such things. Not sure if it's still being updated since Tumblr's mass exodus (the posts don't seem to show timestamps) https://accidentallyquadratic.tumblr.com


Last post was made in 2019. (If you click on the name of the post it shows the timestamp at the bottom)

Blog author stated on Reddit in 2021 that he wasn't maintaining it anymore[0].

[0] https://old.reddit.com/r/programming/comments/jdylxs/acciden...


Also reminds me of the GTA online case.

https://news.ycombinator.com/item?id=26296339


Interestingly, being P-complete has another unexpected consequence: it means your problem is going to be difficult to parallelize.

This is why parallel SAT solvers are barely faster than the usual ones.

SAT is NP-complete, but a crucial step in SAT solving (unit propagation) is P-complete.
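For anyone who hasn't seen it, unit propagation is the "forced moves" step: whenever all but one literal of a clause are falsified, the remaining literal must be true. A rough sketch (my own toy encoding, clauses as lists of signed ints, negative = negated variable):

    def unit_propagate(clauses):
        assignment = {}                      # variable -> bool
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                unassigned, satisfied = [], False
                for lit in clause:
                    var, want = abs(lit), lit > 0
                    if var in assignment:
                        if assignment[var] == want:
                            satisfied = True
                            break
                    else:
                        unassigned.append(lit)
                if satisfied:
                    continue
                if not unassigned:
                    return None              # conflict: clause fully falsified
                if len(unassigned) == 1:     # unit clause: forced assignment
                    lit = unassigned[0]
                    assignment[abs(lit)] = lit > 0
                    changed = True
        return assignment

Each forced assignment can enable the next one, which is exactly the kind of long dependency chain that resists parallelism.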


Is there some good intuition why P-complete problems are difficult to parallelize? This is the first I've heard of it (but then again, I'm usually interested in more obscure complexity classes)


This is the first I have heard of it as well (and I have properly studied NP-completeness in graph theory, so I don't know why this missed my attention). It seems that P-complete problems are difficult to parallelize by definition, as they are the set of tractable problems (as opposed to NP) which don't parallelize well.

"The class P, typically taken to consist of all the "tractable" problems for a sequential computer, contains the class NC, which consists of those problems which can be efficiently solved on a parallel computer. This is because parallel computers can be simulated on a sequential machine. It is not known whether NC = P. In other words, it is not known whether there are any tractable problems that are inherently sequential. " From: https://en.m.wikipedia.org/wiki/P-complete


It's not by definition, there's a bit of math behind it! :)

The intuition is that the hardest problems in P have linear dependency chains. If you had a good parallel algorithm for a P-complete problem, you could take a problem with a dependency chain and solve parts of it in parallel without waiting for the results those parts depend on.


Cerium is correct, we don't know if P is efficiently parallelizable.

Is there a formal proof of what you're talking about that we can read?


Are you perhaps confusing P with P complete?

https://www.researchgate.net/profile/Walter-Ruzzo/publicatio...


How does that matter? P-Complete is a subset of P.


What am I looking for in this 300 page document? Proof by intimidation, eh?


There is an easy everyday example: PBKDF functions, whose whole point is to be expensive to compute and impossible to parallelize efficiently.
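Concretely, it's a hash applied in a chain, so step i can't start before step i-1 finishes. Sketch with the Python standard library (the manual loop is a simplified picture of the chaining, not the exact PBKDF2 construction, which also XORs intermediate values and appends block counters):

    import hashlib, hmac

    # the real thing, from the standard library
    key = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 600_000)

    # why it doesn't parallelize: each iteration consumes the previous one
    state = b"salt"
    for _ in range(600_000):
        state = hmac.new(b"password", state, hashlib.sha256).digest()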


Yes, linear(-ish) dependency chains so that your threads have to wait for one thread to provide a result (infinitely often).


Are you stretching "this specific strategy for parallelizing P that I came up with won't work" to "there's no way to parallelize P"?


It's more like: you'd win a Turing award by finding a strategy to parallelize this problem, as you'd be able to use that approach to parallelize all problems in P, proving NC = P.


This applies both ways. You'll win a Turing award if you prove NC ≠ P, which is kind of what you said — at least, that's the best way I see of reading your first and a few following messages.


Matrix multiplication is O(n^2.73) time and O(n^2) space. That's what eats bigger and bigger chunk of electricity nowadays.


Are there approximations? I assume it's for machine learning right?


Not in the sense you mean. I think the other comment is talking about multiplication without carry, which is a simple form of hashing, but no one I'm aware of uses this for numerical computations.

However, floating-point multiplication is inherently an approximation. The precision of the input is first limited to the bit depth of the sensor channel no matter what, and then typically reduced further by norming everything to fit between -1 and 1, both to avoid overweighting input features with naturally larger values and to fit into the f16 registers a typical GPU might have tens of thousands of.

Plus, while I don't know what they're doing these days with vector embeddings in LLMs, with older-school NLP the probabilities you're dealing with are so small that the only way to reliably get joint distributions is to take the log and add instead of multiply. Otherwise, you'd very quickly round to 0 in any floating-point width.

Something to keep in mind is a lot of these matrices are sparse, though. When most of the entries are 0, specialized data structures that know this can avoid doing all of the pointless multiplication by 0 operations. This saves far more time than some kind of approximate multiplication would.
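As a rough illustration of the sparse point (SciPy, made-up sizes):

    import numpy as np
    from scipy import sparse

    n = 10_000
    a = sparse.random(n, n, density=0.001, format="csr")  # ~99.9% zeros
    x = np.random.rand(n)

    y = a @ x  # touches only the ~100k stored nonzeros,
               # not all 100 million entries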


There's multiplication by hashing, which is a very fun subject, look it up :-)


Where does 2.73 come from?


The naive method is O(n^3).

The Strassen algorithm drops it to O(n^log2(7)) ≈ O(n^2.8074), because it divides the work into 2x2 block multiplications that can be done with 7 multiplications instead of 8. Strassen is a practical algorithm.

You can still optimize and get down to O(n^2.371552) with complex trickery, but those are galactic algorithms, meaning the matrix sizes where they win are so big that you couldn't even construct the matrices on Earth.


In my experience, solving the NP-hard problem isn't the hard part; it's reducing the problem space and/or solution space to turn it into a P problem (when n is big enough to worry about it).

For example, changing how you're storing the data in the first place, or talking to PM/PO's to learn if you really need an exact solution or if a very close approximation is ok.

Sometimes, you're even trying to solve this problem while n has grown over months or years, and some customers are hitting the tipping point and the app is crashing. So there's a lot of pressure to perform.

That's what makes NP-hard problems hard, imho; not the problem itself, but all the bullshit you have to do to get rid of it.


Computing a competitive equilibrium is NP-hard in Fisher Markets. [0]

The economy is not controlled by some grand central institution calculating competitive equilibria. We have independent humans with some institutional soft constraints optimistically buying and selling things. Even if we assume that there is no information asymmetry and all preferences are known, everyone has to run the algorithm themselves in their head and everyone has to arrive at the same competitive equilibrium. That is what it takes to guarantee a competitive equilibrium.

The thing is, if you have anything less than the competitive equilibrium, you lose the property of "free market supremacy", aka that markets are superior in every situation and that government intervention can never make anything better. So the NP-hard problem has to be solved; you can't wiggle out of it with an approximate solution. Any x% approximation leaves a (100-x)% gap for something else to replace the free market. We now arrive in reality, where sometimes markets are really good and sometimes they are really bad, but that has not stopped people from worshipping Ayn Rand. The very same Ayn Rand who needed medical services provided by the government and therefore landed in the (100-x)% gap where markets aren't so nice any more.

The by far dumbest part, though, is that the economy in reality is an iterative process, not a static one that jumps from one competitive equilibrium to the next, and there are ways to use local rules and institutions to encourage getting as close to equilibrium as possible, but those are considered government intervention or against human nature or some other nonsense, even if they actually let markets be more "free" and with less government intervention overall. So you have these schizophrenic economists who argue in favor of unattainable competitive equilibria while simultaneously sabotaging the arrival at close-enough equilibria plus some government intervention as a fallback. The height of irony is that this results in the economy ending up in a strong disequilibrium, but disequilibrium is fine as long as the economy is growing. The moment growth stops, that disequilibrium will become more and more apparent over time.

[0] https://en.wikipedia.org/wiki/Fisher_market#Fisher_markets_w...


In my experience, every NP-hard problem I've run into came from trying to solve it naively or "up front", and could be reduced to something not-so-by-the-book, which usually involves storing the data in a different form or reducing the scope (i.e., ignoring impossible solutions/values). I've yet to see a real-world NP-hard problem that can't be handled this way; but such solutions would never be accepted scientifically because they don't solve the general case.


I guess you never tried to find optimal join orders for your SQL engine.


I actually have :), it was a time travelling database for $work. In our case, we assumed that queries wouldn't change too much (applications running on them tend to query the same tables over and over again). Thus we just picked an order "at random" (though IIRC, there were some heuristics) for table-sets, otherwise we used the best performing one seen so far -- but it would randomly run another query with a different order to keep finding the best one.

We just didn't care about trying to solve it 'up-front' and were more interested in empirically choosing the best solution.


A random search like that might give okay results, but it has no guaranteed approximation ratio.


Despite having done my undergrad in CS, I never understood what NP-hard really meant. I mean yeah, polynomial time, boolean logic, general-case not specific-case, transformable into other NP problems, I get it, but like, I don't get it? Anyway, after reading this article I feel like I've gotten one small step closer to filling in this gap in my CS knowledge.


It's probably due to unfortunate naming of things.

First, NP doesn't mean "non-polynomial", though people often assume that, since there are no known polynomial-time solvers for NP-complete problems. It's shorthand for "nondeterministic polynomial", but that naming isn't intuitive either. It's the class of problems whose solutions can be verified in polynomial time.

Second, NP-hard means the problem is at least as hard as everything in NP: either it's NP-complete, or it's not in NP at all. Which is another confusing naming, because problems that aren't even in NP can be NP-hard.

As a saying goes, there are two main problems in programming - naming things and cache invalidation. And in this case the former fails badly.


> Second, NP-hard means the problem is at least as hard as everything in NP: either it's NP-complete, or it's not in NP at all. Which is another confusing naming, because problems that aren't even in NP can be NP-hard.

But remember that P is in NP, so NP-hard implies not in P (assuming P ≠ NP).


And if P = NP any problem will be NP-hard. Until P ≠ NP is proved we can't heavily rely on the consequences.


That's an exaggeration. It's pretty safe to assume P ≠ NP in practice. In theory you'd have to be more careful, but people have relied on weaker assumptions in math proofs. But for statements about Super Mario, it's more than reasonable to assume.


So, NP-hard means that it's either NP-complete (the hardest NP can get), or even harder than that. Pretty intuitive to me.


Yeah, one way to always read it is that X-hard, for most Xs here, means at least as hard as X.


I came across the example of the "bounded halting problem" recently, which I found quite intuitive for understanding P vs NP: https://www.lesswrong.com/posts/8Af3X8b7f5piqtqZx/computabil...

We're given a program P of length n, let's say it's a Turing machine with binary symbols on its tape. We want to know if P halts in fewer than n steps for all inputs. Notice that only the first n bits of those inputs are relevant (since it would take n+1 steps to reach the n+1th bit, by which time we'd know it hasn't halted in fewer than n steps). Hence we can answer this in exponential time: run P(000...) for n steps to see if it halts, run P(100...) for n steps, run P(010...) for n steps, etc. There are ~2ⁿ bitstrings of length < n, so it takes O(n × 2ⁿ) steps to check all inputs.
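To make the brute force concrete, a sketch (run_for is a hypothetical step-limited interpreter, since the exact machine model doesn't matter here):

    from itertools import product

    def halts_quickly_on_all_inputs(program, run_for):
        # run_for(program, bits, steps) is assumed to simulate `program`
        # on input `bits` for at most `steps` steps and return True iff
        # it halted within that budget.
        n = len(program)
        # only the first n bits of any input can matter within n steps
        for length in range(n):
            for bits in product("01", repeat=length):
                if not run_for(program, "".join(bits), n):
                    return False   # found an input where it runs too long
        return True                # ~2^n simulations of n steps each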

This problem is in NP, since we can check a candidate input in polynomial time, e.g. checking `P takes at least n steps on the input 01010101` requires running P on that input for at most n steps. It's also NP-complete, since we can translate other NP problems into it (see the link for details).

The reason I like this example is that it gives an intuition for how "overpowered" a polynomial solution to it would be: able to infer seemingly-arbitrary information about any program, without having to actually run them. This is similar to how an oracle for the (usual) halting problem could be used to quickly prove/disprove arbitrary mathematical statements, just by feeding it a dumb, brute-force proof searcher. (Of course, we could state the same more directly, since theorem proving is itself NP-complete: checking a proof of size n is easy, but finding one seems to require exponential time; however, that seems more abstract than the running of a program on inputs)


> Despite having done my undergrad in CS, I never understood what NP-hard really meant

Not having done CS undergrad, I have never really understood even P/NP. The naive explanation of problems that are easy to solve and check vs problems that are difficult to solve but easy to check seems to leave something essential out.

I mean, with the naive explanation, I think you are left with one of two options:

1. A philosophical issue of not being able to prove a negative, you really can't prove that there is not a way to solve a problem fast.

2. That obviously P != NP. You just, for example, lose information before giving the problem to the solver: say I'm thinking of a float with n decimals, then I round it to the nearest integer and ask you to find the float given the integer. I can check a guess in linear time as n increases, but your difficulty increases exponentially as n increases.

I am pretty sure there is something in this problem that makes it not a legal P/NP problem, but I have no idea what. I have tried a couple of times to look for more formal definitions of P/NP, but the jargon immediately goes over my head.

(The "guess" is admittably there a bit informal, if you want a bit more formal problem, you take a process C that is mathematically proven to be sufficiently sensitive to initial conditions (e.g. chaotic), take integer x and calculate y = C(x), round y and ask what's x.)


You may want to take a look at https://cs.stackexchange.com/questions/38357/is-it-really-po...

Something along the lines of this is what you would likely need to do. Take some problem that is NP-complete. Prove it has an exponential lower bound on time complexity. You've just proved P != NP. Proof of impossibility is sufficiently common in mathematics that one of the oldest and best-known proofs of anything, Euclid's proof of the irrationality of sqrt(2), is an example of such a proof (it is impossible to express sqrt(2) as the ratio of two integers).

The philosophical issue relates more to physical non-existence. If I claim leprechauns don't exist, likely few people will argue with me, but I can't actually examine the universe outside of my own past and future light cones, so I can't really know for sure there is no part of larger existence that contains leprechauns. Even events that violate accepted physics may become possible if we live in a false vacuum that collapses to a state with a different physics at some time in the future. Proving mathematical non-existence is an entirely different animal. Trivially, any universally quantified proposition "for all X, predicate(X)" is logically equivalent to "for no X, not predicate(X)," yet proofs of such propositions abound.


> A philosophical issue of not being able to prove a negative, you really can't prove that there is not a way to solve a problem fast.

You can definitely prove negatives, and we do so all the time. There's the undecidability of the halting problem, for example: there is no algorithm that can be expressed in a Turing-complete language that can determine if a particular program will halt or not.

Another fun one is the unsolvability of the quintic. You know how there's a formula for solving a quadratic equation (https://en.wikipedia.org/wiki/Quadratic_formula)? Well, there's also one for order 3 polynomials (cubics) and order 4 polynomials (quartics). But order 5 (quintics)? There is no formula that can solve quintic equations using addition, subtraction, multiplication, division, exponents, and radicals (square roots, cube roots, etc.) in the general case. The theorem actually goes even further by providing explicit examples: the equation x^5 + x^3 + 1 = 0 has a root, approximately equal to -0.83762, which cannot be expressed in terms of the operations listed above. This is all a consequence of Galois theory.


Problems in P are about deciding whether an input passes or not, where there exists a 'Turing machine/algorithm/black box' that will tell you, in a 'fast' way, whether an input passes. A simple example is graph coloring: the box contains a graph, you pass in a color for each node, and the box checks that no two adjacent nodes have the same color. Note that this box has an input (the coloring) but also a sort of parameter (the graph); you can create many different instances of this box just by changing the graph.

You could also add another parameter, call it K. This will be the maximum number of colors you can use. It is still easy for the box to check whether a coloring meets this criterion.
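To make that box concrete, a polynomial-time checker might look like this (my own sketch, graph as an adjacency dict):

    def coloring_ok(graph, coloring, k):
        # graph: node -> list of neighbours, coloring: node -> color
        if len(set(coloring.values())) > k:
            return False                     # too many colors used
        for node, neighbours in graph.items():
            for other in neighbours:
                if coloring[node] == coloring[other]:
                    return False             # adjacent nodes share a color
        return True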

For problems in NP, there is an underlying set of boxes in P. The input to an NP problem is a set of parameters that describe a box in P. In other words, any input to an NP problem corresponds to a 'fast box'. The NP problem then asks: "is there any input this fast box will accept?" Such an accepted input is called a witness, because it proves that the box described by the input to the NP problem is solvable.

In the graph coloring example, instead of having a graph, a number of colors K, and a coloring to check, you now have: a graph and a number of colors, and the question 'can we find any coloring'.

We transformed "check a solution" into "does a solution exist". A key element here is that someone attempting the NP problem gets to look inside the box and figure out how it works.

In your example of 'you have to guess the number I thought of', you need to hand the guesser a box that will check that number. The guesser can then look inside the box to see how it works. So a simple comparison with a fixed value allows the guesser to 'read the hardcoded value from the source code'.

Do note that the box in P has to be fully deterministic.

In short: the input to an NP problem describes the parameters of one of these fast boxes, and the question is whether some witness makes that box accept.


> That obviously P!=NP ... but your difficulty increases exponentially as n increases

> I am pretty sure there is something in this problem that makes it a not legal P/NP problem

If I'm reading your post right, it's not in P and so isn't relevant to the question, since it's of exponential complexity.


Wait, how is it impossible to prove a negative?


Well, I guess you can prove that there are no odd numbers in the set {2, 4, 6}, so some negatives you can prove, but in the general case I do not know how to prove that something does not exist and will never exist.


That's the absence of proof. It's a bit different from the proof of a negative.

Don't worry you're not the first one mentioning it so that's something that must be some kind of colloquialism somewhere but I was asking to understand what people may have meant.

To top it all, you proved a negative (by providing a counterexample) :o)


No, it's absolutely a proof.

Claim: There is no odd number in {2, 4, 6}.

Proof: 2 is not an odd number, 4 is not an odd number, 6 is not an odd number, there are no other elements in {2, 4, 6}. Therefore, there is no odd number in {2, 4, 6}.

If there was simply an absence of proof, we would be forced to conclude that we don't know whether there's an odd number in {2, 4, 6}. That's clearly not the case.

The claim "S is a set with no odd number" is equivalent to the claim "S is a set where all numbers are even", which is something I assume you agree that we can prove.


I'm not talking about the first part of the comment but the latter parts regarding the general case.

(I even wrote that he proved it himself at the end)


When proving a negative in math, you often do a proof by contradiction

precondition: Either A is true or it is false.

question: is A false?

solution:

1. assume that A is true

2. <math stuff>

3. find a contradiction

4. ==> A can not be true

5. ==> therefore A is false


Proving a negation is not a proof by contradiction, that's just the proof of an (absurd) implication. Logic people actually write blog posts about this: https://math.andrej.com/2010/03/29/proof-of-negation-and-pro...


> I am discovering that mathematicians cannot tell the difference between “proof by contradiction” and “proof of negation”.

I am glad to learn that I am just as dumb as the average mathematician lol.

> Proving a negation is not a proof by contradiction

did I say that it is, though?


In P/NP case I would need to assume that I can solve a problem in polynomial time, and then find a contradiction. To me it is really hard to see a useful path forward from there.


That's what's called a "non-constructive" proof in some circles. It /doesn't/ have a useful path forward, which makes it frustrating. There are some weird sandwiches of complexity classes - like, IIRC, PSPACE, EXP, NPSPACE - where we know some of the containments must be strict, but which constellation actually holds? Dunno.

This is different from a constructive proof. "Here we have a problem, and all algorithms so far are in O(n^3). We can demonstrate the existence of a substructure in all instances of this problem which lowers the complexity to O(n^2.978)" would be a huge constructive proof in some fields. Such a breakthrough could lead to a large number of follow-up improvements.

A non-constructive breakthrough is like "Yup, this is a barrier. No clue why though, but it's hard."


> I would need to assume that I can solve a problem in polynomial time, and then find a contradiction

iirc that's actually how many of these proofs work.

> To me it is really hard to see a useful path forward from there.

yes, CS is hard.

There are resources online that explain the general idea, e.g. [1]. But to understand the specifics, you really do need a solid foundation in theoretical math and theoretical CS.

[1] https://www.quora.com/What-is-a-proof-by-contradiction-in-co...


If you see a path, you'll be one million USD richer. Don't forget about us. :-)


[deleted]


For cryptography, every instance is (conjectured to be) complex, not just laborious.

For worst-case NP-hardness, the complex instances are also laborious, while the simple instances are not laborious.


You should grab a textbook about NP-completeness like Garey and Johnson. It's actually quite easy to understand and you can follow along some proofs for NP-completeness even if you don't have experience with that. NP-hard will follow.


From what I remember:

NP: Finding the solution takes more than polynomial time, but you can verify the answer is correct in polynomial time.

NP-Hard: Finding the solution takes more than polynomial time, and it also takes more than polynomial time to verify that the solution is correct.

NP-Complete: NP-Hard, but it can be transformed into any other NP-Complete problem in polynomial time. This is special because it means if you find a solution for any NP-Complete problem, you have found a solution for all of them. Finding an NP-Complete solution always seemed rather optimistic to me, but computer science professors obsessed over these problems.

Caveat: It has been more than 20 years since I was quizzed on this stuff, so it might be wrong.


NP: Checking a solution takes polynomial time.

NP-Hard: "The problem is at least as hard as any problem in NP." Basically, if X is an NP hard problem and you are given an oracle to solve X, you can solve any problem in NP by first transforming it to an instance of X and then solving it.

NP Complete: The problem is NP Hard & in NP.
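To make "transforming it to an instance of X" concrete, here's a toy reduction from graph 3-coloring to SAT (my own sketch; variables numbered DIMACS-style):

    from itertools import combinations

    def three_coloring_to_sat(num_nodes, edges):
        # Returns CNF clauses (lists of signed ints) satisfiable iff the
        # graph is 3-colorable. Variable for "node v has color c" is 3*v + c + 1.
        var = lambda v, c: 3 * v + c + 1
        clauses = []
        for v in range(num_nodes):
            clauses.append([var(v, 0), var(v, 1), var(v, 2)])   # v gets a color
            for c1, c2 in combinations(range(3), 2):
                clauses.append([-var(v, c1), -var(v, c2)])      # ...at most one
        for u, w in edges:
            for c in range(3):
                clauses.append([-var(u, c), -var(w, c)])        # endpoints differ
        return clauses

Feed the output to any SAT oracle and you've answered the coloring question.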


To add to this NP can also refer to the more formal idea of a non-deterministic turing machine being able to compute the problem in polynomial time. We learned NP as "Non-deterministic Polynomial (time)". I think the more practical definition is "not polynomial"...lol.


> To add to this NP can also refer to the more formal idea of a non-deterministic turing machine being able to compute the problem in polynomial time.

Yes. This definition is equivalent to the one about being verifiable in polynomial time, since your non-deterministic TM can just have a different branch for every possible verification "oracle".

> I think the more practical definition is "not polynomial"...lol.

Well, there are non-polynomial algorithms harder than NP. If a solution can't even be verified efficiently, for example.


“Not polynomial” is an easy trap to fall into (and one I have fallen into) but it’s really not correct. All polynomial-time problems are also NP problems (P=NP is about whether the converse is true, and if P=NP, all problems in NP are actually polynomial-time). So even if talking to a layman, saying that NP is non-polynomial is really missing the point of the question.


Following more or less directly from this:

- If you walk backwards from where the non-deterministic Turing machine halted, the list of taken state machine transitions is polynomial in length (naturally, as the machine stopped in polynomial time).

- Walking that polynomial length list of actually-taken edges forward through the Turing machine constitutes a polynomial time verification of the solution.

This is the essence of the equivalence between being able to solve problems in NP in polynomial time on a (hypothetical) non-deterministic Turing machine and being able to deterministically verify a solution to those problems in polynomial time.


> - If you walk backwards from where the non-deterministic Turing machine halted, the list of taken state machine transitions is polynomial in length (naturally, as the machine stopped in polynomial time).

> - Walking that polynomial length list of actually-taken edges forward through the Turing machine constitutes a polynomial time verification of the solution.

These are really great explanations that would've saved me so much trouble in graduate school. Connecting the automaton itself to the term "nondeterministic Turing machine" is something that is sorely missed in most CS programs, I think. It's usually handwaved in automata theory in order to give you enough to head to compilers. Then you run into it again in graduate school algorithms, where it is once again handwaved (because not even the professor fully understands it).


I don't think the downvotes here are fair given the corrections this person has received. These misconceptions are wildly common. I'm sure 90% of the downvoters couldn't even accurately identify this stuff.


^ The most common misconceptions about NP/NP-Complete/NP-Hard, packed in one comment!

But seriously for anyone reading this, please refer to the other sibling comments.


--Edit: this was wrong, see reply comment--

NP: you can verify the answer is correct in polynomial time. (and no other clauses)

Anything in P is in NP


> --Edit: this was wrong, see reply comment--

You're not wrong tho. NP problems can be verified in polynomial time.

A nondeterministic Turing machine is like a Turing machine that has multiple "next steps" instead of one at a given time, and can magically choose the correct next step.

You can think of it as "taking all the possible paths at the same time (but at the end only the correct one matters)". But you can also think of it as "given all the 'choices' it made, check if there is actually such a path", in other words, verifying a certificate.


I think it's worth saying what NP stands for.

Nondeterministic Polynomial, where "nondeterministic" basically means you get to try every candidate solution in parallel, each checked in polynomial time.


Not wrong, just a different criterion that describes the same complexity class.


[flagged]


Is this a generated answer?


Look into the formal verification community, where PSPACE-complete is "easy" and undecidable is normal.


Hopefully they don’t need to deal with “big data”


Isn't there some graph problem about finding cliques that is NP-hard (presumably NP-complete in the decision version) but that has an expected _constant_ time algorithm over the usual distribution of random graphs?

I seem to remember the algorithm is something like "check a few samples at random, if that doesn't find what you're after then do exponential-time search" and the point is the chance that the samples don't find what you're after is exponentially small, because the thing you're looking for is so frequent.


If you're allowing for some error probability, then the answer is trivially yes. For example, the NP-Complete problem of "does this graph have a clique of size sqrt(n)" is trivial over G(n, 1/2) random graphs, because the answer is No with overwhelming probability (something decaying exponentially in n or n^2). So you don't have to read the input before responding.

If you're referring to Las Vegas style (always correct) algorithms, then I don't think something along those lines can work. Reading a constant number t of bits from a G(n,p) random graph yields each possible outcome with constant probability (1/2)^t. So the expected running time is still at least some (small) constant times that of the failure case, which is still exponential.

Are you perhaps thinking of the average case problem Planted Clique, where the task is to distinguish between a "clean" random graph G(n, 1/2) and a "planted" one where we force some random set of, say, k=n^(1/3) vertices to induce a clique? While distinguishing a graph with a k-clique from one without one is NP-hard on general graphs, you can show that G(n, 1/2) graphs virtually never contain cliques as large as 3 log n, and hence brute force searching all 3 log n sized subsets of vertices for cliques (in subexponential time n^(3 log n)) will almost always lead you to the correct answer. And once you find a clique of size 3 log n, expanding it to n^(1/3) can be done quickly using a greedy algorithm.
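A toy version of the "G(n, 1/2) almost never has a 3 log n clique" check (my own sketch, sizes kept tiny so it actually finishes):

    import math, random
    from itertools import combinations

    def gnp(n, p=0.5):
        # sample G(n, p) as a set of edges (u, v) with u < v
        return {e for e in combinations(range(n), 2) if random.random() < p}

    def has_clique(n, edges, k):
        # brute force over all k-subsets: n^O(log n) time when k ~ 3 log n
        return any(all(pair in edges for pair in combinations(s, 2))
                   for s in combinations(range(n), k))

    n = 18
    k = int(3 * math.log2(n))          # 12; far above the ~2 log n typical max
    print(has_clique(n, gnp(n), k))    # almost always False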


It was years ago I studied the problem and I've lost my notes, but what you say makes sense.


Traveling salesman is even solved to optimality (no heuristics) for pretty large instances with integer programming techniques.

Also, SAT is pretty much a solved problem in practice, with algorithms like CDCL.
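As a taste of how routine that is (sketch assuming the third-party python-sat package; clauses are lists of signed ints):

    from pysat.solvers import Glucose3   # pip install python-sat

    solver = Glucose3()
    solver.add_clause([1, 2])      # x1 or x2
    solver.add_clause([-1, 3])     # not x1, or x3
    solver.add_clause([-2, -3])    # not both x2 and x3
    print(solver.solve())          # True
    print(solver.get_model())      # a satisfying assignment, e.g. [1, -2, 3]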

This really surprised me. I also left my first complexity theory course believing that NP-hard = "not solvable in practice".


It's always about worst-case complexity; you may be able to solve typical problems much easier. This even applies to the halting problem, which is unsolvable in general but pretty easy for most real programs.

But also, "pretty large" is relative. Wikipedia says the largest known exact solution for a traveling salesman problem is 85,900 nodes, which is not really that many.


> SAT is pretty much a solved problem

We don’t really understand what makes some SAT problems harder than others.

You can go from a problem solvable with CDCL in a few seconds to one that would outlast the solar system by changing a couple of input bits.


Sure, but in practice that’s not what people worry about, from what I have seen. At least when dealing with SMT problems, the SAT part is the easy part.


I think that's because _in practice_ people don't waste their time working on intractable problems if they just want to get something done. There's almost always some way to avoid the intractable problem and approach it a different way.

It's sort of like how "in practice" it didn't matter for thousands of years that nobody understood electricity. Any problem that came up that would have required that knowledge to solve just got dropped, because they didn't have the tools to solve it.


I like to think about this in terms of "ensembles", or a distribution on the problem instances you're drawing from. NP-Hard talks about worst case in this set or distribution, even if the "average" or "normal" case is creating an instance that is "easy" to solve.

This is part of the problem of using technical terms in a lay context. "Hard" here means "worst case hard" or "provably hard" (via a reduction from 3-SAT). Whether a given instance of an NP-Complete problem drawn from an ensemble or distribution is "intractable" is a more subtle question.


> The class of problems solvable in a finite amount of memory is just the class of regular languages. The “finite memory” is the finite state machine used to solve them.

I think they meant to say "constant memory" since every halting Turing Machine uses finite memory.


Same thing about "halting problem". The fact that it's unsolvable in general case, doesn't mean we can't solve it easily for 99.99999999% of all problems in the category


Hey, I vouched for your submission. Just wanted to let you know that there seems to be some issue with your account causing your submissions to be automatically flagged. You may want to contact HN moderators (hn@ycombinator.com) about that.


I've fixed it now. Thanks for watching out for a fellow user!



