
You could look at this as "CPUs are cheap, of course we can afford to leave them idle." But a more interesting angle is, "For one programmer's hourly cost, you could run 4000 CPU cores continuously. Can there really be no practical way to apply thousands of cores to boosting the programmer's productivity? What are we missing?"

For instance, couldn't https://github.com/webyrd/Barliman develop into something that makes computing worth spending on at that level?

> What are we missing?

We're missing the magic bit that can transform whatever the programmer needs to do into something the thousands of cores can calculate.

For example, the other day a colleague and I tried to optimize a query. This is something where thousands of cores could have tried all the variations of the problem while we sat back and let them figure it out.

The issue is that I don't have any magic way to tell the horde of cores what to try and how to verify the result. Also, there are so many variations to try that I'm not sure it would have been more cost-effective without some clever SQL-aware thing running the show.

In other words, superoptimization for SQL queries? That's a neat idea. I don't know that anyone has done that before.
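A toy sketch of the "what to try and how to verify" part, under heavy assumptions: the schema, the data, and the three candidate rewrites are all made up for illustration, SQLite stands in for whatever RDBMS you actually use, and the candidates are hand-written rather than generated. Each worker runs one candidate against its own copy of the data, any rewrite whose rows differ from the reference is thrown out, and the survivors are ranked by wall-clock time.

```python
import sqlite3
import time
from concurrent.futures import ThreadPoolExecutor


def make_db():
    """Build a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, "EU" if i % 2 else "US") for i in range(1, 201)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, (i % 200) + 1, float(i % 50)) for i in range(1, 5001)],
    )
    return conn


# Hand-written rewrites that are intended to return the same rows.
CANDIDATES = {
    "join": "SELECT o.id FROM orders o JOIN customers c ON c.id = o.customer_id "
            "WHERE c.region = 'EU' ORDER BY o.id",
    "in_subquery": "SELECT id FROM orders WHERE customer_id IN "
                   "(SELECT id FROM customers WHERE region = 'EU') ORDER BY id",
    "exists": "SELECT o.id FROM orders o WHERE EXISTS "
              "(SELECT 1 FROM customers c WHERE c.id = o.customer_id "
              "AND c.region = 'EU') ORDER BY o.id",
}


def run_candidate(item):
    """Run one candidate on a private database copy and time it."""
    name, sql = item
    conn = make_db()
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    conn.close()
    return name, rows, elapsed


def search(candidates, reference="join"):
    """Return (name, seconds) for every candidate that matches the reference output."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(run_candidate, candidates.items()))
    expected = next(rows for name, rows, _ in results if name == reference)
    # Verification: keep only rewrites whose rows equal the reference rows.
    valid = [(name, elapsed) for name, rows, elapsed in results if rows == expected]
    return sorted(valid, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, elapsed in search(CANDIDATES):
        print(f"{name}: {elapsed * 1000:.2f} ms")
```

The hard parts are exactly the ones this skips: generating candidates automatically, and trusting timings taken on toy data instead of the real tables and stats.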

It's dependent on the RDBMS, the schema, and the data (or at least the engine's current stats about the data). The good news, though, is that if you could extract just the stats that your RDBMS engine knows about your data, that could be a pretty small bit of metadata to send over the wire to an army of CPUs. You wouldn't need to send actual database tables over the wire (which would be big and slow, and probably a security red flag).

I once worked on a similar problem which also amounted to "enumerate every possibility in N-dimensional space to find the good ones". It was surprising to me how well this worked in practice. Starting with literally every possible solution and then chopping off obviously bad branches by hand will get you to a 90% solution pretty quick.

This seems like an entirely tractable problem.

RDBMS do query optimization by trying out different plans and picking the best via cost estimations. With prepared statements you can reuse that optimization so the cost is (kinda) amortized.

If you want something deeper you'd need an optimizing SQL compiler. The closest I know is DSH (Database Supported Haskell), which translates Haskell comprehensions into reasonably optimized SQL. The name is a nod to Data Parallel Haskell, which used the same flattening transformation for automatic parallelization of comprehensions. https://db.inf.uni-tuebingen.de/staticfiles/publications/the...

Though this system can't come up with answers like 'add an index', 'split this table' or 'add sharding' so for the complex cases it doesn't really help.

> Though this system can't come up with answers like 'add an index', 'split this table' or 'add sharding' so for the complex cases it doesn't really help.

And that's the answer we ended up with. Adding a couple of materialized views with indexes, and rewriting the original query to utilize the materialized views.

FWIW I think that DOES happen, but it happens on the "wrong" computers!

Google has at least tens of thousands of cores running builds and tests 24/7. And they're utilized to the hilt. Travis and other continuous build services do essentially the same thing, although I don't know how many cores they have running.

From a larger perspective, Github does significant things "in the background" to make me more productive, like producing statistics about all the projects and making them searchable. (Admittedly, it could do MUCH more.)

I think part of the problem is that it's cheaper to use "the cloud" than to figure out how to use the developer's own machine! There is a lot of heterogeneity in developer machines, and all the system administration overhead isn't worth it. And there's also networking latency.

So it's easiest to just use a homogeneous cloud like Google's data centers or AWS.

There's also stuff like https://github.com/google/oss-fuzz which improves productivity. I do think that most software will be tested 24/7 by computers in the future.

FoundationDB already does this:

"Testing Distributed Systems w/ Deterministic Simulation" by Will Wilson https://www.youtube.com/watch?v=4fFDFbi3toc&t=2s

Autonomous Testing and the Future of Software Development - Will Wilson https://www.youtube.com/watch?v=fFSPwJFXVlw

They have sort of an "adversarial" development model where the tests are "expected" to find bugs that programmers introduce. So it's basically like having an oracle you can consult for correctness, which is a productivity aid. Not exactly, but that would be the ideal.

Neat -- I haven't worked at Google but wondered what they might be doing on this score.

There's probably lots of potential for modern AI to hack at programmer productivity more directly. Machine learning so far has been more of a complement than a substitute, but I'm imagining a workflow where a lot of the time you're writing tests/types/contracts/laws and letting your assistant draft the code to satisfy them. You write a test, when you're done you see there's a new function ready for you coded to satisfy a previous test, you take a look and maybe go "Oh, this is missing a case" and mark it incomplete and add another test to fill it out.

Maybe in the sci-fi future programming looks more like strategic guidance; nearer term perhaps we might see 500 cores going full blast to speed up your coding work by 20% on average. Or maybe not! But it's one idea.

What practical problems can 4000 CPUs solve that 16 CPUs can't?

You get less and less benefit for each additional CPU you add to the problem, unless the CPU is the main bottleneck.
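One standard way to put numbers on that diminishing return is Amdahl's law (my framing, not something named above): if a fraction p of the work parallelizes perfectly and the rest is serial, the speedup on n cores is 1 / ((1 - p) + p / n). A minimal sketch, with the parallel fractions picked arbitrarily for illustration:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when `parallel_fraction` of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Compare 16 cores against 4000 cores for a few assumed parallel fractions.
for p in (0.90, 0.99, 0.999):
    s16 = amdahl_speedup(p, 16)
    s4000 = amdahl_speedup(p, 4000)
    print(f"p={p}: 16 cores -> {s16:.1f}x, 4000 cores -> {s4000:.1f}x")
```

With 10% serial work the speedup can never reach 10x no matter how many cores you add, so whether 4000 cores pay off depends almost entirely on how small the serial part is.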

Also if something requires 4000 CPUs, it is going to start getting expensive if you need to double the output. These types of problems don't scale well.

We've spent decades developing a huge software/hardware edifice that we all stand atop of, thinking in terms of a single thread. The majority of programmers still have to actively push themselves to think in more than one thread (and why would they - the majority of problems programmers come across are single-threaded).

I don't know if there is a whole other edifice of computing out there, built atop decades of thinking in terms of multiple threads, but I have sympathy for the idea that if it's out there, we'd have an awful lot of trouble conceptualising it, and an awful awful lot of trouble conceptualising it after decades of development.

I don't know what kind of practical problems 4000 CPUs will solve that 16CPUs can't, but I give weight to the argument that the way we think, the problems we've created for ourselves and subsequently solved, could have blinded us to them.

There is much that can be done that isn't easily done because of the way software and hardware evolved together. We think single-threaded because our languages express single threads, and because our languages express single threads, the processors we use have to do outrageous things to reorder instructions, trying to extract some meager parallelism across a couple of execution units, and do all this behind our backs. And, because they do that well enough, we don't bother inventing many new languages for doing that explicitly.

You can't easily express "do these two independent things as you can and, when finished, do this other thing" in C (or Python, or Java) and it's up to brave compiler writers to figure out (sometimes erroneously) what can be done with the independent execution flows.
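For comparison, here is roughly the closest library-level spelling in Python, using `concurrent.futures` from the standard library. The three functions are hypothetical placeholders. Note that the independence of the two tasks is something the programmer asserts by hand through a library, not something the language expresses or checks, which is arguably the point being made.

```python
from concurrent.futures import ThreadPoolExecutor


def task_a():
    # Hypothetical independent piece of work.
    return sum(range(1000))


def task_b():
    # Second hypothetical piece of work; shares no state with task_a.
    return max(range(1000))


def combine(a, b):
    # The "other thing" that may only run once both results exist.
    return a + b


def run():
    with ThreadPoolExecutor(max_workers=2) as pool:
        fut_a = pool.submit(task_a)  # both tasks are started before either is awaited
        fut_b = pool.submit(task_b)
        # .result() blocks until the corresponding task has finished.
        return combine(fut_a.result(), fut_b.result())


if __name__ == "__main__":
    print(run())
```

In standard CPython builds with the GIL, threads like these only overlap while waiting on I/O; for CPU-bound work you would swap in `ProcessPoolExecutor`, which has the same `submit` interface.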

> the majority of problems programmers come across are single-threaded

Surely it's solutions (or rather programs), not problems, that are single-threaded? A problem can probably be solved in many ways, and the fact that many programmers will first reach for a single-threaded program to solve it doesn't mean that's the only, or even the best, way to solve it.

One obvious answer is parallel builds. It's an immense waste of developer productivity to force builds on developer machines.

At sufficient speed too. A build on our build farm takes 2 hours; my local Mac does it in 50 minutes.

Am I reading this right that a build farm is slower than your local development computer? Shouldn't that launch a re-evaluation of the build farm?

Not just builds but running tests as well. Actually I think I've heard that Google does exactly that (and caches compilation results, so they could just return them if they are ready).
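I don't know how Google's system works internally, so this is only a sketch of the general idea of caching compilation results: key each result by a hash of everything that went into it, and return the stored output when the same inputs show up again. The class name and `fake_compile` are made up for illustration, and real tools also have to account for headers, compiler version, environment, and so on.

```python
import hashlib


class CompileCache:
    """Toy content-addressed cache: same source and flags give the same stored output."""

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(source: str, flags: tuple) -> str:
        h = hashlib.sha256()
        h.update("\0".join(flags).encode())
        h.update(b"\0\0")  # separator between the flags and the source text
        h.update(source.encode())
        return h.hexdigest()

    def build(self, source: str, flags=()):
        key = self._key(source, tuple(flags))
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._compile(source, flags)
        return self._store[key]


def fake_compile(source, flags):
    # Stand-in for a real compiler invocation (hypothetical).
    return f"obj({len(source)} bytes, flags={list(flags)})".encode()


if __name__ == "__main__":
    cache = CompileCache(fake_compile)
    cache.build("int main(){}", ["-O2"])
    cache.build("int main(){}", ["-O2"])  # served from the cache
    print(cache.hits, cache.misses)
```

In a shared setup the dictionary would be replaced by a network store, so that one developer's build can reuse results produced by anyone else's.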

Google's build and test execution infrastructure is both huge in size and absolutely amazing. One of the biggest positive surprises for me when I had my technical onboarding period.

Feel free to tell us more about it! :)

Thank you, I hadn't realized that was out there!

I think the bottleneck there is actually memory. When building LLVM I run out of system memory before I run out of cores.

In a lot of C++ game development this is a huge factor for productivity, especially since a lot of translation units may need to be touched when changing APIs. Being able to utilize an entire idle office for each .cpp file can make a big difference in build times.

Most places doing any non-trivial C++ development will have a centralized distcc and ccache cluster, surely?

Most C++ shops I've worked at use MSVC and Incredibuild, so yes! I think MSVC is the norm in game development, especially for PC games, but I could be wrong on that.

> You get less and less benefit for each additional CPU you add to the problem, unless the CPU is the main bottleneck.

I disagree. Each CPU you add to a problem comes with 2 to 8 memory channels of DDR4 (2 on consumer systems, 4 on Threadripper, 6 on Skylake-SP, 8 on EPYC). So each CPU increases your memory bandwidth in a very predictable manner.

> Also if something requires 4000 CPUs, it is going to start getting expensive if you need to double the output. These types of problems don't scale well.

Finishing the problem in 1/4000th of the time is often reason enough. That turns a problem that takes 10 years to finish into a problem that takes about 1 day to finish.

You only get good scaling when all the data fits in memory and the problem scales well, but that happens often enough that it's worth studying these cases.

> What practical problems can 4000 CPUs solve that 16 CPUs can't?

I am going to be the funny person and say that 4k CPUs can solve a scheduling problem fast enough that jobs for 10Mi idle CPUs can be assigned on time.

But yeah, there are problems where ~ 200x CPU power can make a lot of difference, especially if you're time bound (that's roughly solving in 2 days what 16 CPUs would solve in 1 year)


Computational fluid dynamics, FEA, physics-based simulations and the like.

So many tasks can only take advantage of a single machine (and too often, only a single thread!). While that doesn't change the cost of the CPU time it uses, it does mean that you have to wait much longer to get those N hours of CPU time to save that hour of programmer time. That wait time could completely undo the productivity gains.

It might be useful. For example, my IDE uses all cores for a build, but then those cores idle. If there's a local data center nearby (so latency is something like 10ms and speed is gigabit), theoretically my IDE could upload its compiler in advance to cache it there, and then just upload the source files and get back object files. That would allow extremely fast compilation times while I use a very lightweight computer (for example, an energy-efficient laptop). But I don't think it's possible to implement that transparently. Software must support this behaviour, and developers should decide carefully, because network latency could kill all the performance gains.

Regarding compilation, it's perfectly sensible: distcc has done that for a long time.

Part of the Plan9 design (which is almost 30 years old at this point) was the uniform access to computing resources across machines over the network. Unfortunately, we've mostly abandoned that and stuck with mainframe design.

It always pains me to think of what Could Have Been when it comes to things like this. We should have the everything-is-a-file, perfectly network-transparent system, but instead most people are still using the bastard child of DOS and VMS.

It's even more painful when you begin to realise that nearly every problem humanity has falls into this category - the solutions exist and are known, but don't get used because we're stuck in a local maximum (one that usually involves bank accounts).

>Can there really be no practical way to apply thousands of cores to boosting the programmer's productivity? What are we missing?

Surely that's the whole point of high-level languages?

That's using the end user's cores. Using something at programming-time could be something like static analysis. Then there are entire tests the programmer doesn't have to write and bugs that never get pushed.

I've seen people write Java or C# like I would write plain C. Those first two are insanely high level with large standard libraries.

I disagree with "Those first two are insanely high level" but totally agree with the "with large standard libraries." part.

Java: "I've got a huge standard library!"

C: "See, you're doing it wrong! I've got a really lean standard library!"

Java: "Ah, so you just focus on the basics. You have a dictionary structure? That's pretty basic."

C: "...no."

BSDs have <sys/queue.h> and <sys/tree.h>; the former is standard on Linux as well.

To my eyes just having generics and hiding away pointers and vtables makes them very high level, but I think my argument was more about what comes with the language (aka their standard libraries), and I've definitely seen people ignore the available libraries, even in Python.
