For instance, couldn't https://github.com/webyrd/Barliman develop into something that makes computing worth spending on at that level?
We're missing the magic bit that can transform whatever the programmer needs to do into something the thousands of cores can calculate.
For example, the other day a colleague and I tried to optimize a query. This is something where thousands of cores could have tried all the variations of the problem, and we could have sat back and let them figure it out.
The issue is that I don't have any magic way to tell the horde of cores what to try and how to verify the result. Also, there are so many variations to try that I'm not sure it would have been cost effective without some clever SQL-aware thing running the show.
It's dependent on the RDBMS, the schema, and the data (or at least the engine's current stats about the data). The good news, though, is that if you could extract just the stats that your RDBMS engine knows about your data, that could be a pretty small bit of metadata to send over the wire to an army of CPUs. You wouldn't need to send actual database tables over the wire (which would be big and slow, and probably a security red flag).
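As a sketch of what that extraction might look like, assuming PostgreSQL (the table name, connection string, and choice of columns are just illustrative): the planner's per-column view of the data lives in pg_stats, and the row/page estimates live in pg_class, so the whole bundle is a few kilobytes of JSON.

    # Hedged sketch, assuming PostgreSQL and the psycopg2 driver.
    import json
    import psycopg2

    def export_planner_stats(dsn, table):
        conn = psycopg2.connect(dsn)
        cur = conn.cursor()
        # Per-column statistics the planner actually uses.
        cur.execute(
            """SELECT attname, null_frac, n_distinct,
                      most_common_vals::text, correlation
               FROM pg_stats WHERE tablename = %s""",
            (table,),
        )
        columns = [
            dict(zip(("column", "null_frac", "n_distinct",
                      "most_common_vals", "correlation"), row))
            for row in cur.fetchall()
        ]
        # Table-level size estimates.
        cur.execute(
            "SELECT reltuples, relpages FROM pg_class WHERE relname = %s",
            (table,),
        )
        reltuples, relpages = cur.fetchone()
        conn.close()
        return json.dumps({"table": table, "rows": reltuples,
                           "pages": relpages, "columns": columns})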
I once worked on a similar problem, which also amounted to "enumerate every possibility in N-dimensional space to find the good ones". It was surprising to me how well this worked in practice. Starting with literally every possible solution and then chopping off obviously bad branches by hand will get you to a 90% solution pretty quickly.
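As a toy illustration of that shape of search (the dimensions, pruning rule, and scoring function here are all invented): enumerate depth-first, and cut a whole subtree the moment a cheap hand-written rule rejects a partial assignment.

    from math import inf

    def search(dimensions, prune_prefix, score, prefix=()):
        # Chop off the entire subtree below an obviously bad prefix.
        if prune_prefix(prefix):
            return None, -inf
        # A complete candidate: pay for a full evaluation.
        if len(prefix) == len(dimensions):
            return prefix, score(prefix)
        best, best_score = None, -inf
        for value in dimensions[len(prefix)]:
            cand, s = search(dimensions, prune_prefix, score,
                             prefix + (value,))
            if s > best_score:
                best, best_score = cand, s
        return best, best_score

    # 3 dimensions of 100 values each is 10^6 leaves; the hand-written
    # rule prunes most of them before they are ever enumerated.
    dims = [range(100), range(100), range(100)]
    prune = lambda p: len(p) >= 2 and p[0] > p[1]   # invented rule
    score = lambda c: -((c[0] - 40) ** 2 + (c[1] - 60) ** 2 + c[2])
    print(search(dims, prune, score))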
This seems like an entirely tractable problem.
If you want something deeper you'd need an optimizing SQL compiler. The closest I know of is DSH (Database Supported Haskell), which translates Haskell comprehensions into reasonably optimized SQL. The name is a nod to Data Parallel Haskell, which used the same flattening transformation for automatic parallelization of comprehensions. https://db.inf.uni-tuebingen.de/staticfiles/publications/the...
Though this system can't come up with answers like "add an index", "split this table", or "add sharding", so for the complex cases it doesn't really help.
And that's the answer we ended up with. Adding a couple of materialized views with indexes, and rewriting the original query to utilize the materialized views.
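For concreteness, a hypothetical reconstruction of that shape of fix (PostgreSQL syntax; the orders table, its columns, and the view are invented for the example): precompute the expensive aggregate once, index it, and point the hot query at the view instead of the base table.

    import psycopg2

    SETUP = """
    CREATE MATERIALIZED VIEW daily_totals AS
        SELECT customer_id, order_date, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id, order_date;
    CREATE INDEX ON daily_totals (customer_id, order_date);
    """

    # The original query aggregated over `orders` on every call;
    # the rewrite reads the precomputed rows instead.
    QUERY = """
    SELECT order_date, total
    FROM daily_totals
    WHERE customer_id = %s
    ORDER BY order_date;
    """

    conn = psycopg2.connect("dbname=app")  # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute(SETUP)   # one-time; refresh later with
                             # REFRESH MATERIALIZED VIEW daily_totals
        cur.execute(QUERY, (42,))
        rows = cur.fetchall()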
Google has at least tens of thousands of cores running builds and tests 24/7. And they're utilized to the hilt. Travis and other continuous build services do essentially the same thing, although I don't know how many cores they have running.
From a larger perspective, Github does significant things "in the background" to make me more productive, like producing statistics about all the projects and making them searchable. (Admittedly, it could do MUCH more.)
I think part of the problem is that it's cheaper to use "the cloud" than to figure out how to use the developer's own machine! There is a lot of heterogeneity in developer machines, and all the system administration overhead isn't worth it. And there's also networking latency.
So it's easiest to just use a homogeneous cloud like Google's data centers or AWS.
There's also stuff like https://github.com/google/oss-fuzz which improves productivity. I do think that most software will be tested 24/7 by computers in the future.
FoundationDB already does this:
"Testing Distributed Systems w/ Deterministic Simulation" by Will Wilson https://www.youtube.com/watch?v=4fFDFbi3toc&t=2s
Autonomous Testing and the Future of Software Development - Will Wilson https://www.youtube.com/watch?v=fFSPwJFXVlw
They have sort of an "adversarial" development model where the tests are "expected" to find bugs that programmers introduce. So it's basically like having an oracle you can consult for correctness, which is a productivity aid. Not exactly, but that would be the ideal.
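This isn't FoundationDB's actual machinery, just a toy sketch of the idea behind deterministic simulation testing: route every source of nondeterminism (scheduling order, message delivery, injected faults) through one seeded RNG, so any failing run replays exactly from its seed.

    import random

    def simulate(seed, steps=1000):
        rng = random.Random(seed)      # the ONLY source of randomness
        counters = [0, 0, 0]           # trivial "replicated" state
        network = []                   # messages in flight
        for _ in range(steps):
            node = rng.randrange(3)    # rng picks who runs next
            counters[node] += 1
            network.append((node, counters[node]))
            if network and rng.random() < 0.5:
                i = rng.randrange(len(network))   # rng picks delivery order
                sender, value = network.pop(i)
                if rng.random() < 0.1:            # rng injects message loss
                    continue
                for n in range(3):                # naive anti-entropy sync
                    counters[n] = max(counters[n], value)
        return counters

    # Hammer seeds; any seed whose run violates an invariant is a
    # perfectly reproducible bug report -- rerun simulate(seed) to replay.
    for seed in range(1000):
        state = simulate(seed)
        assert max(state) - min(state) <= 1000, f"diverged at seed {seed}"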
There's probably lots of potential for modern AI to hack at programmer productivity more directly. Machine learning so far has been more of a complement than a substitute, but I'm imagining a workflow where much of your time goes into writing tests/types/contracts/laws and letting your assistant draft the code to satisfy them. You write a test; by the time you're done, there's a new function ready for you, coded to satisfy a previous test. You take a look, maybe go "Oh, this is missing a case", mark it incomplete, and add another test to fill it out.
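A tiny, hypothetical sketch of that loop (every name here is invented): the human writes the first test, the assistant drafts the function to satisfy it, and review turns up the missing case as a new test.

    def parse_duration(s):
        """Assistant's draft, synthesized to satisfy test_simple_units."""
        units = {"s": 1, "m": 60, "h": 3600}
        return int(s[:-1]) * units[s[-1]]

    def test_simple_units():        # the human wrote this first
        assert parse_duration("90s") == 90
        assert parse_duration("2m") == 120

    def test_compound_durations():  # "oh, this is missing a case" -->
        assert parse_duration("1h30m") == 5400  # draft goes back for rework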
Maybe in the sci-fi future programming looks more like strategic guidance; nearer term, we might see 500 cores going full blast to speed up your coding work by 20% on average. Or maybe not! But it's one idea.
You get less and less benefit for each additional CPU you add to the problem, unless the CPU is the main bottleneck.
Also if something requires 4000 CPUs, it is going to start getting expensive if you need to double the output. These types of problems don't scale well.
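To put a number on the diminishing returns: by Amdahl's law, if even 1% of the work is stuck serial, 4000 CPUs top out near 98x, and doubling to 8000 CPUs buys almost nothing.

    # Amdahl's law: speedup = 1 / (serial + parallel/cpus)
    def speedup(cpus, serial_fraction):
        parallel = 1.0 - serial_fraction
        return 1.0 / (serial_fraction + parallel / cpus)

    print(speedup(4000, 0.01))   # ~97.6x
    print(speedup(8000, 0.01))   # ~98.8x -- double the cost, ~1% more output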
I don't know if there's a whole other edifice of computing out there, built atop decades of thinking in terms of multiple threads, but I have sympathy for the idea that if it is out there, we'd have an awful lot of trouble conceptualising it, and an awful, awful lot of trouble conceptualising it after decades of development.
I don't know what kind of practical problems 4000 CPUs will solve that 16 CPUs can't, but I give weight to the argument that the way we think, and the problems we've created for ourselves and subsequently solved, could have blinded us to them.
You can't easily express "do these two independent things as soon as you can and, when both are finished, do this other thing" in C (or Python, or Java), and it's up to brave compiler writers to figure out (sometimes erroneously) what can be done with the independent execution flows.
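You can spell the structure out explicitly with futures, of course; here's a minimal sketch in Python (step_a, step_b, and combine are invented stand-ins). The point is that the parallelism exists only because the programmer spelled it out, not because the language let a compiler discover it in ordinary sequential code.

    from concurrent.futures import ThreadPoolExecutor

    def step_a(): return sum(range(10**6))
    def step_b(): return sum(range(10**6, 2 * 10**6))
    def combine(x, y): return x + y

    with ThreadPoolExecutor() as pool:
        fa = pool.submit(step_a)   # "do these two independent things..."
        fb = pool.submit(step_b)
        result = combine(fa.result(), fb.result())  # "...then this other thing"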
Surely it's solutions (or rather programs), not problems, that are single-threaded? A problem can probably be solved in many ways, and the fact that many programmers will first reach for a single-threaded program to solve it doesn't mean that's the only, or even the best, way to solve it.
I disagree. Each CPU you add to a problem brings its own DDR4 memory channels with it (2 on consumer systems, 4 on Threadripper, 6 on Skylake-X, up to 8 on server parts). So each CPU increases your memory bandwidth in a very predictable manner.
> Also if something requires 4000 CPUs, it is going to start getting expensive if you need to double the output. These types of problems don't scale well.
Finishing the problem in 1/4000th of the time is often a good enough reason. That turns a problem that takes 10 years to finish into a problem that takes 1 day to finish.
You only get good scaling when all the data fits in memory and the problem parallelizes well, but that happens often enough that it's worth studying these cases.
I am going to be the funny person and say that 4k CPUs can solve a scheduling problem fast enough that jobs for 10Mi idle CPUs can be assigned on time.
But yeah, there are problems where ~200x the CPU power can make a lot of difference, especially if you're time-bound (that's roughly solving in 2 days what 16 CPUs would solve in 1 year).
Part of the Plan 9 design (which is almost 30 years old at this point) was uniform access to computing resources across machines over the network. Unfortunately, we've mostly abandoned that and stuck with the mainframe design.
Surely that's the whole point of high-level languages?
C: "See, you're doing it wrong! I've got a really lean standard library!"
Java: "Ah, so you just focus on the basics. You have a dictionary structure? That's pretty basic."