Solver Performance: 1989 vs. 2024 (solvermax.com)
146 points by stncls 8 months ago | 83 comments



These solvers really show that NP-hardness is no reason to give up. For example, they can solve surprisingly large Traveling Salesman instances to proven optimality.


Solvers aren't magic. But it turns out that many naturally occurring instances of NP-hard problems aren't the "really hard" instances that the NP-hardness proof depends on.

Solvers can also fail for really tiny problems. You simply have to try to figure out how hard (or how well adapted to the solver's heuristics) your particular problem instance is.


I’m saying this because most people got from their CS degree that NP-hardness is practically a death sentence. But it's really not true. Even if the problem is proven to be NP-hard (or worse) and every previous approach has failed, there can still be some trick or technique (non-heuristic!) that brings a breakthrough. Maybe you still wait 10^whatever years for a solution in some cases. But if you get an answer in seconds in 99.9% of cases, then it's not a big deal in practice.

Even the famous SAT problem can almost be considered solved nowadays with the solvers we have.
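
To give a flavor of how routine this has become, here is a minimal sketch of feeding a tiny CNF formula to an off-the-shelf CDCL solver, assuming the python-sat package is installed (clauses use DIMACS-style signed integers):

    # Three clauses: (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
    from pysat.solvers import Glucose3

    solver = Glucose3()
    solver.add_clause([1, -2])
    solver.add_clause([2, 3])
    solver.add_clause([-1, -3])

    if solver.solve():
        print("SAT, model:", solver.get_model())
    else:
        print("UNSAT")
    solver.delete()

Industrial instances with millions of clauses are often dispatched in seconds by essentially the same few lines.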


The true benefit of being able to tell NP-complete/NP-hard problems from the garden-variety bucket-o-bytes moving is not in giving up when you encounter them, as you correctly identified, but in knowing that even attempting an optimal solution is futile and proceeding, more or less instantly, to look for approximate solutions to the business problem.

People unfamiliar with the matter may attempt to solve the problem, might even get rewarded for effort and hard work if management is also unfamiliar with the matter. Maybe it's fine though...?


Optimal is overblown for business problems in general. Knowing there’s a mathematically optimal solution, people want it, even if it’s practically impossible to get. It feels like if you have the optimal, you don’t have to consider trade-offs (not usually true).

Having a solution within 1% of optimal with ten nines of confidence is usually indistinguishable.

Anyone ever notice how these CS problems are rarely famous business problems? It’s because the pure problems don’t usually matter and the approximate versions are usually quite easy. You almost never need the shortest route for a salesman, or need to pick a secretary with no chance of going back.


I'm quite convinced that if you had a 1% better solution to the salesman problem than what FedEx or UPS currently have, they'd pay good money for it, though :)


I would not be so sure.

What's the point of the theoretically perfect solution when the travel times are so unpredictable? The truck stops are 3-5 minutes apart on average (according to a random Reddit comment [0]), so 1% of that is about 3 seconds. Meanwhile, a single missed light (say, because some other vehicle was driving slowly or the UPS driver was guided onto an unfamiliar route) can add more than a minute.

So something like a better way to arrange packages in the truck, or a better traffic prediction model, would be much more useful than any TSP improvements.

[0] https://www.reddit.com/r/UPS/comments/89zw0h/how_long_does_o...


No individual truck would notice the difference. But averaged over many trucks and many days, it should result in a measurable change in gas spending.


The problem is that in practice, you don't have complete information, and the information you have is slightly incorrect. You also don't have a complete model, and the model you have is slightly incorrect.


Throw on a few more decimal places if you like. The point is that in the physical world, “best” isn’t usually categorically different from “extremely close to best with high probability”.


Yes, exactly this. You don't need the optimal solution in most cases; you just need a solution. If it is 9x% of the way there, you can approach the optimum by burning more cycles, but from an economics perspective you may well already have a viable solution in hand.


It also depends on if you actually need the optimal solution. Getting a solution in your time budget, even if not guaranteed to be optimal, can be appropriate for many problems.


>I’m saying this because most people got from their CS degree that NP hardness is practically a death sentence. But it's really not true.

It's a death sentence for economists, since they rely on every agent optimizing the entire economy of 8 billion agents in real time. 3 hours for 8 million variables isn't real time.


NP hard for a pathological case doesn't mean all practical cases are pathological. Many of the cases have optimal solutions in a reasonable amount of time without resorting to brute force because you can use structure in the dataset to limit the search space.

That you can construct a pathological case makes the problem NP-hard for arbitrary inputs, but it says nothing about your particular case. Compare with quicksort: it's very fast for most practical inputs, but you can construct a case where it performs quite badly, much worse than you'd expect given the name. In practice that isn't all that relevant.
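
A quick sketch of that quicksort point, for anyone who wants to see it: with a naive first-element pivot, already-sorted input hits the O(n^2) worst case, while random input stays close to n log n comparisons.

    import random

    def quicksort_comparisons(a):
        # Iterative quicksort with a naive first-element pivot; returns comparison count
        a = list(a)
        comparisons = 0
        stack = [(0, len(a) - 1)]
        while stack:
            lo, hi = stack.pop()
            if lo >= hi:
                continue
            pivot = a[lo]
            i = lo + 1
            for j in range(lo + 1, hi + 1):
                comparisons += 1
                if a[j] < pivot:
                    a[i], a[j] = a[j], a[i]
                    i += 1
            a[lo], a[i - 1] = a[i - 1], a[lo]
            stack.append((lo, i - 2))
            stack.append((i, hi))
        return comparisons

    n = 2000
    print("random input:", quicksort_comparisons(random.sample(range(n), n)))
    print("sorted input:", quicksort_comparisons(range(n)))   # ~n^2/2 comparisons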


This reminds me of the random polynomial time bound for the simplex algorithm; an algorithm that is, naively, exponential time[1].

I seem to vaguely remember — this is ~17 years ago, now — that it's possible to "characterize" the really bad edge-cases for simplex and work around them (the whole point of the paper).

[1] https://cstheory.stackexchange.com/questions/2373/complexity...


> Solvers can also fail for really tiny problems.

What that tells me is that the current metric we use for problem complexity (e.g. big-O) is woefully inadequate at measuring the actual complexity of problems.


Why would they be inadequate? They are a very good metric that lets one instantly recognize how certain properties of the problem change with respect to some other factor. Does it give a complete picture of everything? No. Why would it? But to say that they are woefully inadequate is just ridiculous.


> the current metric we use for problem complexity (e.g. big-O) is woefully inadequate at measuring the actual complexity of problems.

The complexity of all problems. But big-O isn't the only complexity metric available.

It's extremely useful and very adequate in almost all cases, but it doesn't work well when the numbers are very small and the problems are part of a small subset of all available problems.

But those are edge cases. In practice those are fairly rare and when the datasets are small enough normally all solutions are more or less viable. But as soon as your data set is non-trivial big-O is the right tool to apply at the outset.

Right tool for the job... small dataset, tricky problem: big-O may not apply.


Lots of CS theorists are working on non-worst-case analysis and have been for some time. The CS research community recognizes the limitations of worst-case analysis.

It makes sense to teach worst case in undergrad classes because it's easier to understand and basically a prerequisite for other kinds of analysis.


> they can solve surprisingly large Traveling Salesman instances to proven optimality.

For someone who studies computational complexity theory, is the ability to solve some instances of NP-hard problems efficiently due more to these instances having lower average-case complexity than worst-case complexity, or because they possess structural properties that allow for more effective algorithmic exploitation?

More formally, are certain NP-hard problems easier to solve because the expected time to solve a randomly selected instance from their language is polynomial? Or is it because real-world instances are not uniformly distributed across the language, and these instances possess some amount of Kolmogorov compressibility that leads to better-than-expected performance?


The latter. The real-world instances are not uniformly distributed across the language. It is pretty easy to randomly generate problems that are hard for current solvers.

Caveats:

* "uniform distribution" is a tricky concept over infinite countable sets. Famously, it doesn't exist. You have to restrict the length or something.

* I have no idea if there's a concrete result linking Kolmogorov complexity and solving ease. I suspect no, since even a complex but known problem can be solved rather quickly by special-casing.


I played quite a bit with vertex three-coloring in the past, and it has a surprisingly sharp phase transition. If you randomly generate graphs by including each possible edge with probability p, then the graph will have average degree p×n. Don't quote me on the exact numbers, but it's something like: if the average degree is about 3, then the graph is usually hard to color. If it is only 2.9, then the graph is usually easy to color; if it is 3.1, then there is usually either no valid coloring at all or it is easy to find one. So of all the graphs with average degree from 0 to n, mostly only graphs with average degree in a narrow band around 3 are hard.
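
If anyone wants to poke at this, here's a rough sketch of the experiment: generate G(n, p) with average degree d = p*n and count how much work a plain backtracking colorer does. The threshold quoted above is from memory, so treat the printed numbers as illustrative only.

    import random

    def random_graph(n, avg_degree, rng):
        # G(n, p) with p chosen so the expected average degree is avg_degree
        p = avg_degree / n
        adj = [[] for _ in range(n)]
        for u in range(n):
            for v in range(u + 1, n):
                if rng.random() < p:
                    adj[u].append(v)
                    adj[v].append(u)
        return adj

    def coloring_work(adj, cap=200_000):
        # Count backtracking steps of a naive 3-coloring search, giving up at `cap`
        n = len(adj)
        color = [-1] * n
        steps = 0

        def backtrack(v):
            nonlocal steps
            if v == n or steps > cap:
                return True
            for c in range(3):
                steps += 1
                if all(color[u] != c for u in adj[v]):
                    color[v] = c
                    if backtrack(v + 1):
                        return True
                    color[v] = -1
            return False

        backtrack(0)
        return steps

    rng = random.Random(0)
    for d in (2.0, 3.0, 4.0, 5.0):
        work = sum(coloring_work(random_graph(60, d, rng)) for _ in range(5)) // 5
        print(f"average degree {d}: ~{work} backtracking steps")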


> I have no idea if there's a concrete result linking Kolmogorov complexity and solving ease

I’ve only heard of the incompressibility method but don’t know too much about the details: https://core.ac.uk/download/pdf/301634196.pdf


NP-hard problems commonly come up as human-solvable puzzles. Like Sudoku... or perhaps a more applicable problem... layout and routing of electronic components on a PCB and/or chip. Or even assembly-language register allocation (coloring and packing problem).

Trained humans are surprisingly good at these problems, far better than expected given how much computational power we have today. So it's clear we don't understand something fundamental with regards to NP-hardness. And that's why the research continues: to bring human-like intuition into these problems and provide more automation for these tasks.


I’m not sure there’s necessarily going to be a satisfying general answer to find here though. I can invent an NP-complete problem with very easy instances by combining a linear time problem with an NP-complete problem, e.g. “Does this list of cities have a complete route shorter than X OR is one of the cities New York?”

All of the “yes New York” solutions are easy, but there’s no deep, satisfying reason for that. It’s just a hard problem glued onto an easy problem. For all we know, a lot of NP-hard problems could be like that, and the structure of the easy instances could be essentially unrelated between problems.


Maybe the answer then is to find what these hard cases are and use them in heuristics or approximate algorithms? If we could tell when part of the problem is hard, maybe bound the error for those parts, and use an exact algorithm for the rest of the problem, you could get a better result in most cases without it blowing up on you.


Well, you can just start an exact solver on the problem in one thread, and an approximate algorithm on another. If the former doesn’t finish in n seconds, you cancel it and use the result of the latter.
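
A minimal sketch of that race pattern, with stand-in solve functions (swap in a real exact solver and a real heuristic for an actual problem):

    import multiprocessing as mp
    import time

    def exact_solve(queue):
        time.sleep(10)                      # stand-in for a possibly very slow exact search
        queue.put(("exact", 42))

    def heuristic_solve():
        return ("heuristic", 45)            # stand-in: fast, not provably optimal

    if __name__ == "__main__":
        fallback = heuristic_solve()        # cheap answer, always available
        queue = mp.Queue()
        worker = mp.Process(target=exact_solve, args=(queue,))
        worker.start()
        worker.join(timeout=2.0)            # the "n seconds" budget
        if worker.is_alive():
            worker.terminate()              # exact search missed the deadline
            result = fallback
        else:
            result = queue.get()
        print("using:", result)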


The advantage of combining them though is that you might be able to treat different subparts of the problem differently (which is the hard part), so that you could use an approximate algorithm for the hard part of it, and an exact algorithm for the easy part.

This of course assumes the problem can be divided in this way, which is fairly speculative.


> Trained humans are surprisingly good at these problems

Are we? I don’t think we would even start working on problems with big enough `n` where the complexity actually ramps up.

Like, optimally scheduling even just a couple of things will have a shitton of combinations, and I really doubt we would be good at it.

Good at iteratively decreasing a cost function? Yeah, I guess with a good interface we could move around tasks to be scheduled on a timeline and optimize it. Finding the optimum? No chance.


Our monkey brains can get within a few % of optimal on written TSP problems.

For a lot of these problems, having a tight upper bound lets you narrow the search space, regardless of whether you're looking for the ideal answer or simply a less-bad one (Would you turn down saving $500,000 on fuel and labor in a year just because someone thinks $613,000 is the maximum savings achievable?)

The more we can get automation to do approximations as well as humans, the better.
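
For a sense of how cheap a usable upper bound is, here's a sketch of nearest-neighbour construction plus 2-opt improvement on random points. Neither step is exact, but the improved tour is typically within a few percent of optimal on instances like this.

    import math, random

    def tour_length(pts, tour):
        return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
                   for i in range(len(tour)))

    def nearest_neighbour(pts):
        # Greedy construction: always visit the closest unvisited point next
        unvisited = set(range(1, len(pts)))
        tour = [0]
        while unvisited:
            last = pts[tour[-1]]
            nxt = min(unvisited, key=lambda j: math.dist(last, pts[j]))
            unvisited.remove(nxt)
            tour.append(nxt)
        return tour

    def two_opt(pts, tour):
        # Repeatedly reverse segments while doing so shortens the tour
        improved = True
        while improved:
            improved = False
            for i in range(1, len(tour) - 1):
                for j in range(i + 1, len(tour)):
                    candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                    if tour_length(pts, candidate) < tour_length(pts, tour):
                        tour, improved = candidate, True
        return tour

    random.seed(1)
    pts = [(random.random(), random.random()) for _ in range(60)]
    t = nearest_neighbour(pts)
    print("nearest neighbour:", round(tour_length(pts, t), 3))
    print("after 2-opt:      ", round(tour_length(pts, two_opt(pts, t)), 3))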


It's often a modeling issue. Modeling languages are so abstract and general they give up a part of the problem's structure.

The most famous example is the pigeonhole problem, when given to SAT solvers. The definition of the problem is: I have n pigeons and m pigeonholes, with n = m + 1. Can I put each of the n pigeons in a different pigeonhole? The answer is obvious to a human: if I have 20 pigeons and only 19 pigeonholes, I won't make it. But even state-of-the-art SAT solvers will fail at solving this in a reasonable amount of time, because of the way the problem is represented (a conjunction of boolean clauses). With just a slightly more expressive modeling language (such as pseudo-boolean representation / reasoning), you solve it in the blink of an eye.

We humans are efficient because we recognize the underlying structure of the problem. You'll solve the pigeonhole problem in a split second. But if I encode the problem in CNF (the language of SAT solvers) and give it to you without more information, you won't be able to do it anymore.
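
Here's a sketch of that pigeonhole encoding, again assuming the python-sat package; the variable x(p, h) means "pigeon p sits in hole h". The formula is obviously unsatisfiable, yet the runtime explodes as n grows, because the clausal form hides the counting argument a human applies instantly.

    from pysat.solvers import Glucose3

    def pigeonhole_cnf(n_pigeons, n_holes):
        def x(p, h):
            return p * n_holes + h + 1      # 1-based DIMACS variable numbering

        clauses = []
        # every pigeon goes into at least one hole
        for p in range(n_pigeons):
            clauses.append([x(p, h) for h in range(n_holes)])
        # no two pigeons share a hole
        for h in range(n_holes):
            for p in range(n_pigeons):
                for q in range(p + 1, n_pigeons):
                    clauses.append([-x(p, h), -x(q, h)])
        return clauses

    n = 9                                   # try raising this and watch the runtime climb
    solver = Glucose3()
    for clause in pigeonhole_cnf(n + 1, n):
        solver.add_clause(clause)
    print("satisfiable?", solver.solve())   # always False, but the proof is expensive
    solver.delete()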


I do find this interesting...

> Trained humans are surprisingly good at these problems

Surprising why, and by what metric? (speed? Or quality of solution?)


A problem is NP-hard if other hard problems reduce to it. That means only some instances of the problem have to be hard. NP-hardness says nothing about "average" problem instances and even less about hand-picked ones.

And that's what Sudoku puzzles are. They are made for humans to solve. They are specifically crafted to contain challenging, but possible, solution paths.


I was asking about why specifically it's surprising that humans are good at such problems. I think that to be surprising there must be some prior expectation about how good humans should be and then evidence that they're actually better than that. Both sides of that are interesting and I'd like to know more about it: What is the prior expectation of human capabilities here (and why), and how does it compare to actual observed capabilities.

I'm not sure sudoku is a good example since the problem size there is so small that I don't have any intuitive sense that it should be something humans can't do.


Computers have to solve hard exact problems. You don't seem to understand that heuristics can significantly speed up NP hard problems, especially if you are willing to give up exact solutions.


Hmm.... A neural network for register allocation? Might be an interesting experiment to try.


Sudoku puzzles are usually set by humans.


Not really, as I understand it. The big blow-up of sudoku books full of puzzles came about because some guy figured out how to create them with a program rather than by hand. Earned the guy millions, I believe.

Edit: Wayne Gould https://en.m.wikipedia.org/wiki/Wayne_Gould


People who are more into it usually prefer human-made puzzles, since they often have a logical path that you're supposed to find, which can be quite satisfying.

Generating sudoku puzzles is actually quite easy: just put random numbers on the board until you have a unique solution. It runs surprisingly fast. The tricky part is deciding on the difficulty. I made a program that would identify all the different sudoku techniques needed to solve each puzzle (X-wings, pairs, all the way up to chains), then set the difficulty based on what techniques were required. Code is here for anyone interested: https://github.com/magnusjt/sudoku/

Sadly I don't think anyone would pay millions for this anymore.


Most instances of a problem class that is NP-hard are in fact easy. Usually NP-hardness manifests as exponential blow-up in a problematic edge case.


One of my favorite terms in this space is 'relaxation'.

When you set yourself against the worst case scenario, you are destined to have a very bad time. But there are subsets of the problem where one or two details are treated as trivialities, while still keeping the rest of the problem 'interesting' or 'useful'.

Compression cannot compress white noise. But it turns out humans don't find noise that interesting (except in the negative). We value signal, and meaning. Most of the communication we care to exchange has a much more straightforward message than the medium, and so we continue to find new ways to condense both the message, and the most valuable nuance, down.


Sadly, the field is still mostly dominated by commercial solvers. There are a few open-source ones, but their performance is nowhere near the commercial ones, which is kind of expected given the number of people working on them and the little funding they have. It is really a pity that the OR world hasn't embraced open source as much as the ML world has.


This is true, but there is HiGHS [1] which is "good enough" to use for a lot of real problems (though the commercial solvers are still better).

[1] https://highs.dev/


I'm working actively on this area – anyone interested in helping should please get in touch! carson@spindle.app


The thing is, ML building blocks are very simple and composable; I don't think that holds for solvers.


Yes, I think this is true. I've worked in both fields. But we should be really happy about the fact that we did not end up in the same place with ML.


> But we should be really happy about the fact that we did not end up in the same place with ML.

Why so?


I believe having our tools available as open source is good for the society as a whole.


Also ML gets a lot more funding (academic or commercial) than MILP.


The pricing of some of this software is absurd — easily enough for a company using more than one or two server licenses to dedicate an entire FTE toward open source.


A bit disappointed that there was no reimplementation and real benchmarking happening.


Next article apparently. I was very much looking forward to that, too.

> In our next article, we'll look to compile crosswords using a MILP model.

and

> In the next article, we attempt to formulate and solve a MILP to compile crossword puzzles. This is still a difficult problem, though given the improvement in computer and solver speed, there is some basis for optimism that the task is now possible.


Exactly. In particular here I suspect the perf is dramatically worse than 1800s/20e9=90ns. Constant factors are real-world annoyance :)


Does anyone have a good literature review (or anything similar) on AI techniques applied to ILP/constraint solvers?

I have seen a couple of results for specific domains (like place and route), but I am wondering how those new techniques fare in more general settings.


"Combining the computer hardware speed increase of 4,000 times with the solver software performance improvement of 5 million times, the total improvement from 1989 to 2024 is a factor of 20 billion times faster!"

No wonder people have started making jokes that programmers no longer know how to program! We can get away with a lot more now.


The software improvements are on the order of 1000x greater than the hardware improvements.

I.e., the software matters more. As in, today's programmers know more about this subject.


I'd say on average programmers know less about writing optimal code now, but certainly the best in the field know a lot more.


I'd say that the algorithm matters more. As in, today's mathematicians know more about the subject.

I suspect the actual implementation of said algorithms probably achieves a lower % of peak performance than the older ones (though to be fair they are /much/ more complex algorithms).


>I suspect the actual implementation of said algorithms probably achieves a lower % of peak performance than the older ones

In my experience I have found the opposite to be the case. Most old maths libraries are written in FORTRAN (generally an order of magnitude or more slower than a comparable C/C++) and the implementations of standard algorithms are often sub-optimal and naive. I got the same impression when I compared arctangent (Taylor series) implementations in π programs and Minimax in chess (see Bernstein's program for the 704). I would guess that in the worst case they were 100x slower than what those machines could theoretically achieve. The C+inline asm libraries of today might be 2-10x slower at worst and some are even bottlenecked by memory. In this case I doubt the programs discussed in the 1989 paper, which were written in Prolog and FORTRAN, are exceptions to this.


I went to a guest lecture in ~'96 on the performance of supercomputing and its applications to FEA, so basically matrix factoring.

In the time from the Cray-1 to then, there were 6 orders of magnitude of hardware gains, and 6 orders of magnitude in software as well.


Matrix factoring as in LU, Cholesky, QR, SVD etc? 6 orders of magnitude from mid-70s to mid-90s?

Unless I'm misunderstanding I'm shocked that there was that much left on the table.


I think it went from naïve Gaussian elimination through LU and SVD to approximate iterative forms for the top eigenvectors/values. So a good portion of that was not computing the higher-order terms that didn't significantly contribute to the results.

Hazy memory though, as it was 25 years back and I've been out of the FEA side of things for 20+ years now.

I will say though -- I was doing some stuff at the time that was burying SuperSparcs for 24 hours at a time, and would now probably run realtime on a watch or phone. (Again, a big mix of hardware advancement, reduced precision for insignificant terms, and generally optimized algos)


FEAs probably involve sparse matrices, which have a lot more complexity than simple dense matrices. For example, computing the optimal reordering of a generic sparse matrix is, IIRC, NP-complete.


Of course, it takes knowing how to program to take advantage of the solver improvements. I find that most programs do not, in fact, use any solver techniques. What's more, I think I can program, but I would also have trouble using solvers in a lot of my programs.


The problem with factorials is that 20 billion doesn't necessarily mean much.

Let's say we could solve a problem of 100 elements 35 years ago, with 100! operations. If the 2x10^10 multiplier only made the exact same calculation faster, that would let you solve n = 105 in the same time. Business problems don't grow that slowly. Not over 35 years.

You have to get better at solving the problem in less than n! steps by culling impossible scenarios, and doing it more aggressively.
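
The back-of-the-envelope version of that arithmetic, as a quick sketch:

    from math import factorial

    speedup = 2e10          # the combined hardware + software factor from the article
    n = 100                 # size solvable in 1989, assuming cost grows like n!
    m = n
    while factorial(m) / factorial(n) <= speedup:
        m += 1
    print(m - 1)                              # -> 105
    print(factorial(105) / factorial(100))    # ~1.2e10, just under the budget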


I wish they had tried to solve the 1989 4x4 crossword puzzle optimization with a modern solver, but with a small memory limit (~8MB) and perhaps a severely underclocked CPU, to showcase the algorithmic improvements.


It's kind of funny because comparable hardware would be hard to find nowadays.

Even the ESP32, which can be purchased for something in the neighborhood of $2, runs at 600 MIPS (technically DMIPS, but all that means is that it isn't benchmarked on floating-point operations: https://en.wikipedia.org/wiki/ESP32), although I am not sure it can run the exact same instruction set.


I heard about some competition like this: they made a boolean satisfiability problem. Then they ran a "race" -- an old solving algorithm running on modern hardware, versus a new algorithm on old hardware. The new solver won, even with a massive speed handicap!


So solvers are now much faster, but I haven't found a single hint in the article as to how they got faster (aside from the "more memory", "more CPU" aspect).

Was there a major theoretical development in the solver field that allowed this to happen?

Or is it a bunch of tiny tuning heuristics?

If so, how are those even possible, given that the solver is supposed to be a generic tool applicable to a large class of problems?

Are there problems whose structure fits a recognizable pattern where optimizations are possible?

I confess to being left hanging by the article.


The details are pretty math-heavy, but what these solvers do is try to find an optimal assignment to a bunch of variables x, y, z, etc., while respecting a bunch of constraints like

3x + 2y <= 10

4z >= 3.5

Additionally, there is an “objective function” that defines what optimal means. Something like:

maximize (3x + 10y - 2z)

It’s not obvious, but all kinds of problems can be modeled in this framework, like scheduling, graph, and routing problems. A big application is logistics: maximize profit / minimize costs under certain constraints.

So the solvers are just dealing with this inequality-solving business, and this is where a lot of the theoretical advances have happened. It’s a large field and I've barely scratched the surface, but it’s very interesting. Some keywords are: Operations Research, Mixed Integer Programming, Simplex Method.
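
For the curious, here's that toy model written out against a solver API, as a sketch using the open-source PuLP package (which bundles the CBC solver). Lower bounds of 0 are added so the toy objective stays bounded, and x and y are declared integer to make it a *mixed* integer program.

    from pulp import LpProblem, LpVariable, LpMaximize, LpStatus, value

    prob = LpProblem("toy_model", LpMaximize)
    x = LpVariable("x", lowBound=0, cat="Integer")
    y = LpVariable("y", lowBound=0, cat="Integer")
    z = LpVariable("z", lowBound=0)            # continuous

    prob += 3 * x + 10 * y - 2 * z             # objective function
    prob += 3 * x + 2 * y <= 10                # constraints
    prob += 4 * z >= 3.5

    prob.solve()
    print(LpStatus[prob.status], value(x), value(y), value(z), value(prob.objective))

Commercial solvers like Gurobi or CPLEX accept essentially the same kind of model through their own APIs; they differ mainly in how fast they crack large instances.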


I would answer "yes" to all your questions.

> Was there a major theoretical development in the solver field that allowed this to happen?

A few major theoretical developments did happen, although the really big ones were 25+ years ago (see Figure 4 in the OP): a 5x speedup in 1994 with the incorporation of the dual simplex method, and 10x in 1998, mostly because of cutting planes (Gomory cuts specifically).

> Or is it a bunch of tiny tuning heuristics?

Also yes. Bob Bixby, co-founder of CPLEX and Gurobi, describes mixed-integer programming as "a bag of tricks". And of course, there is a whole spectrum between pure theory and heuristic trickery; it's not black-and-white.

> If so, how are those even possible given that the solver is supposed to be a generic tool applicable to a large class of problems?

> Are there problems whose structure fits a recognizable pattern where optimizations are possible?

Yes, plenty! Commercial solver developers have a business to run. Clients have problems and need to solve them, regardless of the algorithm. The canonical example of problem-structure detection is the knapsack problem: CPLEX and Gurobi both detect when their input is a knapsack, and they then run a completely different algorithm to solve it.

At a smaller scale (but larger impact overall), there are a wide range of "presolve" techniques that each detect some microstructures in problems and simplify them [1]. Most of these techniques affect <20% of problems, but together they are extremely powerful.

Another example of a half-theoretical, half-tricky technique that has a great impact on a few instances by detecting structure: symmetry detection. The theory behind it is serious stuff. Implementing the techniques requires serious (and unpublished) engineering effort. Most problem instances aren't affected at all, but when it works, you can expect a 10x speedup.

[1] https://opus4.kobv.de/opus4-zib/files/6037/Presolve.pdf


Thanks for your interest in our article about solver performance.

We've now posted a follow-up pair of articles where we attempt to compile crossword puzzles using a Mixed Integer Linear Program.

Let us know if you have any questions.

https://www.solvermax.com/blog/crossword-milp-model-1

https://www.solvermax.com/blog/crossword-milp-model-2


1989: Odd that the superminicomputer that cost hundreds of thousands of dollars had 8MB RAM and managed 1 MIPS, while my Acorn Archimedes, which cost $2000, had 1MB (and you could get up to 4MB) and managed 8 MIPS.


Good catch. That might be a typo: the Prime 750 seems to date from 1979.


The paper was written in 1988 and published in 1989 (1 Jan 1989, so just). The Prime 750 is specifically named in the paper, it was probably the best system he had access to when doing the work.


I would side with their grain of salt until they have actually completed their promise to implement a crossword puzzle solver using integer programming.


We did that:

https://www.solvermax.com/blog/crossword-milp-model-1

https://www.solvermax.com/blog/crossword-milp-model-2

Though note that we're compiling new crosswords, rather than solving existing puzzles.


I wish they put all this increased computing power toward solving the real problems rather than crossword puzzles :/ We might already be living in a 4-day working week.


Are there good reference books on solver implementations?

I tried diving into the subject using online references but found them lacking in context and explanations sometimes.


For mixed integer programming solvers, this thesis is a good reference:

https://opus4.kobv.de/opus4-zib/frontdoor/deliver/index/docI...


Great, thanks!


Yet more reasons to try more central planning rather than relying on analog markets. If it's good enough for large capitalist firms, it should be good enough for sectors of the economy.


The map is not the terrain.




