"In other words: we got plenty of unreliable CPU, memory, disk & network… but we can’t get at it with the same ease the hardware guys made possible when going to DRAM from multiple CPUs connected over internal buses. Until we break that ease-of-use barrier, we’ll never get every-day programmers coding distributed systems as easily as we do single machines now."
Isn't that the problem that MapReduce/BigTable/etc. are already solving?
And if the problem you're trying to solve is making distributed systems like MapReduce/BigTable themselves as easy to code as single-machine systems, I strongly believe that will never be possible. Distributed systems are far more complex and have failure modes that don't exist in single-machine systems. Trying to hide that would only give you a horribly leaky abstraction.
> Isn't that the problem that MapReduce/BigTable/etc. are already solving?
Not in real time, and it is somewhat cumbersome. I think Dr. Click's friendly debates with Rich Hickey regarding STMs have been taken to a new level. H2O and Datomic are somewhat competing visions.
Cliff (if you are reading): I think R on top of H2O is an excellent choice, but as I have been thinking about this same sort of architecture, I would suggest that front-ending H2O will be Scala's killer app. [I will take the pill if Trinity shows up :-)]
The GitHub repo for 0xdata's new "h2o" database is still empty, but there's another interesting project containing highly concurrent in-memory Java data structures: https://github.com/0xdata/high-scale-lib
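A note on why those structures are easy to adopt: high-scale-lib's NonBlockingHashMap implements the standard java.util.concurrent.ConcurrentMap interface, so code written against that interface can swap it in for ConcurrentHashMap. A minimal sketch of the kind of atomic per-key update such maps support (shown with the stdlib ConcurrentHashMap so it runs without the library; CounterDemo and bump are illustrative names, not part of high-scale-lib):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CounterDemo {
    // Increment a per-key counter without any external locking.
    // The map handles the atomic read-modify-write internally;
    // NonBlockingHashMap could be substituted for ConcurrentHashMap
    // since both implement ConcurrentMap.
    static void bump(ConcurrentMap<String, Integer> counts, String key) {
        counts.merge(key, 1, Integer::sum); // atomic: insert 1 or add 1
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
        bump(counts, "x");
        bump(counts, "x");
        bump(counts, "y");
        System.out.println(counts.get("x")); // prints 2
    }
}
```

The design point is that callers depend only on the ConcurrentMap interface, so the lock-free implementation is a drop-in choice rather than an API change.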