Fun to see this come full circle here!
Really eye-opening stuff! I'm still impressed that prefix sum (generalized over any associative operator, not just +) can be done in O(log n) span. And the algorithm isn't too complicated either; I was able to get the gist of it even as an undergrad.
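To make that concrete, here's a minimal sketch of the Hillis–Steele inclusive scan in Python. It runs sequentially here, but each round's combines are independent, so with n processors you get O(log n) steps; any associative `op` works. (This is an illustrative toy, not a production library.)

```python
from operator import add, mul

def scan_inclusive(xs, op):
    """Hillis-Steele inclusive scan: log2(n) rounds; within a round,
    every combine is independent and could run in parallel."""
    xs = list(xs)
    n, step = len(xs), 1
    while step < n:
        # "in parallel": every position i >= step combines with the
        # value `step` slots to its left
        xs = [xs[i] if i < step else op(xs[i - step], xs[i])
              for i in range(n)]
        step *= 2
    return xs

print(scan_inclusive([1, 2, 3, 4, 5], add))              # [1, 3, 6, 10, 15]
print(scan_inclusive([1, 2, 3, 4], mul))                 # [1, 2, 6, 24]
print(scan_inclusive(["a", "b", "c"], lambda x, y: x + y))  # concat works too
```

Note the string case: concatenation is associative but not commutative, and the scan still works, which is exactly the generalization that makes it so broadly useful.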
1. Data parallelism, esp. for super power-efficient parallel hardware (SIMD, GPUs, all the stuff that lets you go 10-100X over Spark for the same budget), is hard for computing over "irregular" data like trees and graphs. Prefix scan showed a basic way to do a bunch of basic math over these. A lot of the 90s / early 2000s became research into covering all sorts of weird cases. This shows data parallelism is broadly applicable, so worth pushing forward on.
2. Guy B, and then Manuel C & Gabriele K, showed you don't have to be a weird HPC person hand-rolling Fortran, nor rely on weird one-off special-case libraries, to use the above. We can build libraries of high-level primitives -- think map/reduce -- even to the point that compilers can make it look like "normal" Haskell. Most things can compile down to prefix scan. This is similar to Impala/Spark figuring out RDDs/dataframes and doing SQL on top, except, again, for 10-100X better performance, and you don't have to contort everything into 1970s SQL. Re: prefix scan in particular, we don't _have_ to implement all the algorithms as prefix scan, but it's a reliable base case, and where possible (most cases!) we nowadays swap in more efficient hand-written CUDA library calls instead.
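A classic example of "most things compile down to prefix scan" is filter (a.k.a. pack / stream compaction): an exclusive +-scan over keep-flags gives each surviving element a unique output slot, so all the writes can happen in parallel. A toy sketch in Python (the names `exclusive_scan` and `pack` are mine, not from any library; the scan is written sequentially as a stand-in for the parallel version):

```python
def exclusive_scan(flags):
    """Sequential reference for an exclusive +-scan; a data-parallel
    backend would compute the same thing in O(log n) span."""
    out, acc = [], 0
    for f in flags:
        out.append(acc)
        acc += f
    return out

def pack(xs, keep):
    """Data-parallel-style filter: keep xs[i] where keep[i] is truthy."""
    flags = [1 if k else 0 for k in keep]
    offsets = exclusive_scan(flags)
    total = (offsets[-1] + flags[-1]) if xs else 0
    out = [None] * total
    for i, x in enumerate(xs):
        if flags[i]:
            out[offsets[i]] = x  # each kept element writes a unique slot
    return out

data = [3, 1, 4, 1, 5, 9]
print(pack(data, [x % 2 == 1 for x in data]))  # [3, 1, 1, 5, 9]
```

The same trick underlies data-parallel sorting, partitioning, and flattening of nested structures, which is why scan makes such a good base case.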
Nowadays, when GPUs are in the cloud, Moore's Law has stalled out for CPUs but is still going strong for GPUs, and most big data systems realize their latency and data shuttling/serialization suck, all ^^^ is a big deal. See the 10yr AWS price chart I calculated: https://twitter.com/lmeyerov/status/1247382156487213057/phot... . You can just write your Python dataframe code in https://rapids.ai/ and not really think about it, just knowing you're doing better than the Hadoop stuff underneath, and without managing a dirty horde. The existence of NESL, and the research area that came after it, means we're still at the beginning of swapping out crappy CPU software for better data-parallel stuff, and _it's doable_.
Now if Cargo could install Haskell and its libs, I would jump on it in an instant.
Eh? Some sort of ML (or maybe Lisp) is going to be at the cutting edge in something like this, where you have JIT compilers and such.
> Lately it's PyTorch or PySpark
There's some GPU stuff you can do in Python via Futhark. It's very cool.
I find it kind of weird that, outside of academic circles and the like, Haskell is largely dismissed as "not practical"; I think the "cabal hell" days before Stack came out really hurt the language and its adoption. Libraries like Accelerate really demonstrate how powerful something like Haskell can be when applied correctly.
Similar to Haskell, it originated in academic research. It uses the static single assignment (SSA) concept, which works for both imperative and functional programming. Neat stuff. Hopefully someone can emulate this with a modern open-source hybrid functional language like Scala or D.
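The imperative/functional bridge in SSA is that every variable is assigned exactly once, so imperative reassignment turns into a chain of fresh names, which reads just like nested let-bindings. A toy renamer for straight-line code (my own illustration; real compiler passes also handle branches with phi-nodes, which this skips):

```python
def to_ssa(program):
    """program: list of (target, rhs_tokens). Returns SSA form where
    each use is replaced by the latest version of that variable
    (x -> x0, x1, ...), so every name is defined exactly once."""
    version = {}
    out = []
    for target, rhs in program:
        # rewrite uses to their current versions
        new_rhs = [f"{t}{version[t]}" if t in version else t for t in rhs]
        # each assignment mints a fresh version of the target
        version[target] = version.get(target, -1) + 1
        out.append((f"{target}{version[target]}", new_rhs))
    return out

# x = 1; x = x + 2; y = x * 3
prog = [("x", ["1"]), ("x", ["x", "+", "2"]), ("y", ["x", "*", "3"])]
for t, rhs in to_ssa(prog):
    print(t, "=", " ".join(rhs))
# x0 = 1
# x1 = x0 + 2
# y0 = x1 * 3
```

The output is equivalent to `let x0 = 1 in let x1 = x0 + 2 in let y0 = x1 * 3`, which is why the same IR serves both paradigms.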
At least when I program in MATLAB instead of Python, I think very differently.