
Lasp: a little further down the Erlang rabbithole - unbalancedparen
https://notamonadtutorial.com/lasp-a-little-further-down-the-erlang-rabbithole-febba29c8d0c
======
sigil
The LASP author also has a good reading list for CRDTs. [0]

Here's why I find LASP intriguing. It started with a simple question: can't
the OS just migrate a "process" to another machine for me?

For a couple decades now, we've been taking hardware and software originally
designed to run on little islands called PCs and trying to build datacenter-
scale distributed systems out of them. We've got kernels that can load balance
workloads across many CPU cores, but we can't reuse those to distribute work
across a large number of physical machines -- because of CAP, because of
unreliable networks, and because there are performance issues. So we're
reimplementing OS schedulers now [1] on machines instead of cores, but the
basic building blocks we're using don't seem to have anticipated distributed
systems very well. So there's tons of resource waste, we're still solving lots
of hard distributed systems problems at the application layer, and we're
maintaining a huge, somewhat-historically-accidental pile of abstractions
between the hardware and the distributed application running on it.

One wonders if and when that pile may collapse. One also wonders what a
distributed-first OS might look like. LASP _might_ be the first step towards a
whole new computing architecture. The syntax is hideous to my eye, and I have
serious questions about performance, but if I can write a long-running program
that efficiently scales from 1 machine to 100,000 machines with no code
changes, that might be a game changer.

PS. I think Meiklejohn's decision to start by modernizing Distributed Erlang
was a solid one. Distributed Erlang's fully connected graph topology and pre-
CAP approach to state probably weren't workable... we've learned a ton about
distributed systems since the 80s.

[0] [https://webcache.googleusercontent.com/search?q=cache:QJlbCk...](https://webcache.googleusercontent.com/search?q=cache:QJlbCkMI68YJ:https://christophermeiklejohn.com/crdt/2014/07/22/readings-in-crdts.html+&cd=1&hl=en&ct=clnk&gl=us)

[1] [https://kubernetes.io/docs/admin/kube-scheduler/](https://kubernetes.io/docs/admin/kube-scheduler/)

~~~
nickpsecurity
"One wonders if and when that pile may collapse. One also wonders what a
distributed-first OS might look like. LASP might be the first step towards a
whole new computing architecture."

What about Tanenbaum et al's Amoeba OS and Globe toolkit? Or the grid
computing platforms that have been around a long time? There's been steady
work in this area with some of it used in production in both CompSci and
industry.

~~~
sigil
> What about Tanenbaum et al's Amoeba OS?

 _Unlike the contemporary Sprite, Amoeba does not support process migration._
[https://en.wikipedia.org/wiki/Amoeba_(operating_system)](https://en.wikipedia.org/wiki/Amoeba_\(operating_system\))

> Globe toolkit

Got a link to this one?

> Or the grid computing platforms that have been around a long time? There's
> been steady work in this area with some of it used in production in both
> CompSci and industry.

Curious if any of these grid computing platforms support process migration
(aka "don't make me think about machine boundaries"), and whether any of the
more recent ones have been informed by CAP / designed to withstand unreliable
networks. LASP is interesting because it has been. As I understand it, process
state is a CRDT, so computations should survive network partitions and state
should reconverge afterwards.
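A minimal sketch may help make the "process state is a CRDT" claim concrete. This is illustrative Python of my own, not Lasp's actual API (Lasp itself is Erlang): a state-based grow-only counter whose merge is an element-wise max. Because that merge is commutative, associative, and idempotent, replicas converge no matter how a partition delays, reorders, or duplicates their state exchanges.

```python
# Illustrative state-based G-Counter CRDT (not Lasp's API).
# Each replica increments only its own slot; merge takes the element-wise
# max, so replicas can exchange state in any order and still converge.

def increment(state, replica_id, amount=1):
    new = dict(state)
    new[replica_id] = new.get(replica_id, 0) + amount
    return new

def merge(a, b):
    # Commutative, associative, idempotent join of two counter states.
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def value(state):
    return sum(state.values())

# Two replicas diverge during a partition, then merge in either order:
r1 = increment({}, "r1")       # replica 1 counts one event
r2 = increment({}, "r2", 2)    # replica 2 counts two events
assert merge(r1, r2) == merge(r2, r1)
assert value(merge(r1, r2)) == 3
```

The partition-tolerance claim falls out of the merge's algebraic properties, not out of any coordination protocol.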

The biggest questions for me are around the performance characteristics of
this approach.

~~~
nickpsecurity
"Unlike the contemporary Sprite, Amoeba does not support process migration"

Oops. My bad. It did a lot as a distributed system except for that apparently.
You also found the one that did.

"Globe toolkit Got a link to this one?"

It's an object-based system. The old site is dead, I think; I keep having to
dig it up. Here's a nice summary:

[https://cds.cern.ch/record/400321/files/p117.pdf](https://cds.cern.ch/record/400321/files/p117.pdf)

They also mapped the WWW protocols onto it to some degree, to show we could
reuse those legacy systems while benefiting from Globe's capabilities, or at
least start the transition that way.

"Curious if any of these grid computing platforms support process migration"

The first I saw was MOSIX:

[https://en.wikipedia.org/wiki/MOSIX](https://en.wikipedia.org/wiki/MOSIX)

Modern, smaller experiment that's similar in Linux:

[http://citeseerx.ist.psu.edu/viewdoc/download?rep=rep1&type=...](http://citeseerx.ist.psu.edu/viewdoc/download?rep=rep1&type=pdf&doi=10.1.1.103.417)

I found a nice survey paper that includes references to a number of systems
from 2006:

[https://www.cse.buffalo.edu/~vipin/book_Chapters/2006/2006_1...](https://www.cse.buffalo.edu/~vipin/book_Chapters/2006/2006_1.pdf)

I think most of the research is going into "VM" rather than "process"
migrations given most computing is going in that direction. If you research
that, you will get many more results whose techniques might apply to process
migration. Quick example:

[http://mvapich.cse.ohio-state.edu/static/media/publications/...](http://mvapich.cse.ohio-state.edu/static/media/publications/abstract/huang-cluster07.pdf)

The high-performance part of CompSci has mostly shifted in that direction
since most industry and FOSS have shifted toward containers/VMs. The point of
my original comment, though, was to say that what you say needs to be done has
been happening for a while. It's just a small few in academia who dig in that
deep to develop those mechanisms, and they've produced work worth finding even
with my quick searches. :)

~~~
sigil
All good stuff, thanks.

> I think most of the research is going into "VM" rather than "process"
> migrations given most computing is going in that direction.

The VM-migration-over-RDMA paper you cite is interesting, at the very least,
because it gives current upper bounds on the wait time for that operation.
They do mention there's application downtime. This might not matter at all for
scientific / HPC computing (the paper chooses CFD as a benchmark), but it
definitely matters to people building highly available services, which I'd
guess is most of HN.

Application downtime on a single migrating VM isn't a dealbreaker, but it does
assume you've either cleverly replicated state across VM instances, or you've
got some other load-balancing infrastructure that's routing traffic in front
of a mostly stateless app tier. In either case, you have to be careful with
state, and careful operationally as you bring these things up and down.
Wouldn't it be nice if the OS just did this for you? That's all I'm saying. I
can't remember the last time I migrated a running process from CPU0 to CPU1 by
hand. Or even wrote a script to do that.

> The high-performance part of CompSci has mostly shifted in that direction
> since most industry and FOSS have shifted toward containers/VMs.

It most certainly has. We might be at a local optimum though, and one
determined by increasingly irrelevant computing history. I think about that
every time I rent a VM to stand up a Unix box to run a container to run what's
supposed to be a totally stateless application. Seriously, at that point, why
aren't we running side-effect-free FP code as close to the metal as possible?

We've got all these PCs lying around, we've got the peace dividends of the
smartphone wars kicking in, we've got almost 50 years of Unix, we don't talk
about the Von Neumann architecture anymore because _everything_ is Von
Neumann, etc. From my perspective it sure looks like we're surrounded by
hammers. If we happen to be living in the Nail Universe, there's nothing wrong
with that.

> The point of my original comment, though, was to say that what you say needs
> to be done has been happening for a while.

I'm very interested in the history of distributed computing -- tons to learn
there. "Relearn" might be more accurate. At the same time, I don't trust
anything pre-CAP-theorem to work at datacenter-scales. We simply didn't
understand the scale problems well enough before the early 2000s.

------
inopinatus
My favourite thing about CRDTs (and event sourced data in general) is that
they force us to consider the meaning of time and order as it relates to
information. When handled with clarity of thought there are flow-on effects in
correctness of processing and (my favourite outcome) in POLA (principle of
least astonishment) consistency, i.e. the UX improves, especially in the
marginal or edge cases.

The downside is that popular frameworks don't mesh well, or at all, with this
paradigm. Trying to shoehorn event-based data into, say, Rails (which I've
done) is a frustrating exercise in resolving massive impedance mismatches
and growing technical debt. It turns out most database-driven MVC frameworks
have a very limited understanding of time and order.

CQRS helped here, because I realised that my MVC can and should just be a view
& query layer, and I should be building the event sourced/CRDT-based logic
elsewhere.
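For the unfamiliar, the split described above can be sketched as a left fold over an append-only event log (illustrative Python of my own, not any particular framework's API): the write side only appends events, and every read model is just a projection rebuilt by replaying them.

```python
# Event-sourcing sketch (illustrative): state is never mutated in place;
# it is derived by folding an append-only event log through an apply function.
from functools import reduce

def apply_event(balance, event):
    kind, amount = event
    if kind == "deposit":
        return balance + amount
    if kind == "withdraw":
        return balance - amount
    return balance  # events this projection doesn't care about are ignored

log = [("deposit", 100), ("withdraw", 30), ("deposit", 5)]

# The read model (here, a balance) is a query over the log:
balance = reduce(apply_event, log, 0)
assert balance == 75

# Time and order are explicit: replaying a prefix gives state "as of" then.
assert reduce(apply_event, log[:2], 0) == 70
```

The MVC layer only ever sees the folded result; the log itself is the source of truth, which is exactly the impedance mismatch with ORM-style frameworks that treat current state as primary.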

~~~
elcritch
Glad to hear someone else describe the benefits of using CRDTs on the UI side.
My most recent project currently ignores the time ordering of conflicting
events, but that's an issue we've deferred for now. The CRDT foundation gives
us a solid design for free to handle multi-user synchronization issues when we
need to. Unfortunately, I've come to believe the whole MVC architecture is an
anti-pattern. It's hard to test, hard to debug, and overall just takes a lot
of work to maintain.

Luckily I've found that many of the modern JavaScript UI libraries play very
well with event sourcing, IMHO. React and Vue.js in particular work very well
when combined with Redux/Vuex and a CRDT database. Both of those stores are
essentially event sources themselves, where the actions and mutators are
separated and the data store itself is updated only in one place. Very quick
to develop with!

There are a few server-side setups for both React and Vue.js which make them
usable in place of Rails, etc. Also Elixir and Phoenix work well, since views
and controllers in Phoenix are functional.

~~~
rkangel
What CRDT databases do you use in this sort of application?

~~~
elcritch
Currently I'm leveraging CouchDB and its incremental map-reduce to handle the
"CRDT" merging of an event stream. Technically I'm not using a CRDT library or
DB, but it's surprisingly easy to implement the core "commutative operations"
of CRDTs via a modified deep-merge versioning algorithm. It works pretty well:
I can handle 20,000+ user interactions on an embedded CouchDB instance with
steady performance.

It's interesting to note that while the actual data structures don't conflict,
the order in which multiple users modify data, and choosing an ordering among
conflicting user events, poses a challenge. With CouchDB's merging it's
possible to notify the user's interface that their changes were overwritten by
another user or device, allowing them to choose.
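The comment doesn't show the algorithm, but a hypothetical version of the "modified deep-merge versioning" idea might look like this (my own Python sketch, not the commenter's actual code): each leaf carries a version, merge recurses through the document and keeps the higher-versioned leaf, and losing leaves are collected so the UI can tell a user their change was overwritten.

```python
# Hypothetical versioned deep-merge sketch (not the commenter's actual code).
# Each leaf is a (version, value) pair; merge recurses through dicts and, at
# leaves, keeps the higher version. Ties break deterministically on the value
# so the merge stays commutative. Overwritten leaves are collected so the UI
# can tell a user their change lost.

def merge(a, b, overwritten=None, path=()):
    if overwritten is None:
        overwritten = []
    if isinstance(a, dict) and isinstance(b, dict):
        out = {}
        for key in a.keys() | b.keys():
            if key in a and key in b:
                out[key] = merge(a[key], b[key], overwritten, path + (key,))
            else:
                out[key] = a.get(key, b.get(key))
        return out
    # Leaves: higher version wins; repr of the value breaks ties.
    winner, loser = (a, b) if (a[0], repr(a[1])) >= (b[0], repr(b[1])) else (b, a)
    if loser != winner:
        overwritten.append((path, loser))
    return winner

doc_a = {"title": (2, "Lasp notes"), "body": (1, "draft")}
doc_b = {"title": (1, "Notes"),      "body": (3, "final")}
lost = []
merged = merge(doc_a, doc_b, lost)
assert merged == {"title": (2, "Lasp notes"), "body": (3, "final")}
assert ((("title",), (1, "Notes")) in lost)
```

Tie-breaking on the value is what keeps the merge commutative, which is the "commutative operations" property the comment relies on; the `lost` list is what would drive the "your change was overwritten" notification.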

------
tsukaisute
If you're thinking of dabbling in Erlang but are put off by the syntax, try
Elixir. It's a beautiful, actively evolving language with a healthy community,
and it compiles directly to bytecode for the Erlang VM. You can even still
call any native Erlang libraries if you happen to need that.

A short review [1] of Elixir from Joe Armstrong, author of Erlang. This is
from 2013, and the language has evolved quite a bit since then.

[1] [http://joearms.github.io/2013/05/31/a-week-with-elixir.html](http://joearms.github.io/2013/05/31/a-week-with-elixir.html)

~~~
macintux
I don't want this to devolve into a syntax flame war, but I'd encourage people
to step outside their comfort zone. I find that the native Erlang syntax helps
me to think in Erlang, vs imperative/OO/scripting/whatever.

But yes, better Elixir than yet another reheated Algol.

~~~
tsukaisute
Oh, I never meant it as a gripe at Erlang's syntax myself. However, there is
so much untapped power that the developer community is missing because Erlang
is immediately dismissed. People will try node.js, but not Erlang. I see
Elixir as a chance for the Erlang VM to finally get the wide adoption it
deserves.

~~~
macintux
Absolutely, agreed on all points.

I really appreciate the work Meiklejohn (disclaimer: former co-worker) is
doing to make Erlang more powerful. As I like to say, Erlang gives you better
distributed/concurrent primitives than other languages, but they're still
primitive.

