Lasp: a little further down the Erlang rabbithole (notamonadtutorial.com)
152 points by unbalancedparen on May 9, 2017 | hide | past | favorite | 23 comments



The LASP author also has a good reading list for CRDTs. [0]

Here's why I find LASP intriguing. It started with a simple question: can't the OS just migrate a "process" to another machine for me?

For a couple decades now, we've been taking hardware and software originally designed to run on little islands called PCs and trying to build datacenter-scale distributed systems out of them. We've got kernels that can load balance workloads across many CPU cores, but we can't reuse those to distribute work across a large number of physical machines -- because of CAP, because of unreliable networks, and because there are performance issues. So we're reimplementing OS schedulers now [1] on machines instead of cores, but the basic building blocks we're using don't seem to have anticipated distributed systems very well. So there's tons of resource waste, we're still solving lots of hard distributed systems problems at the application layer, and we're maintaining a huge, somewhat-historically-accidental pile of abstractions between the hardware and the distributed application running on it.

One wonders if and when that pile may collapse. One also wonders what a distributed-first OS might look like. LASP might be the first step towards a whole new computing architecture. The syntax is hideous to my eye, and I have serious questions about performance, but if I can write a long-running program that efficiently scales from 1 machine to 100,000 machines with no code changes, that might be a game changer.

PS. I think Meiklejohn's decision to start by modernizing Distributed Erlang was a solid one. Distributed Erlang's fully connected graph topology and pre-CAP approach to state probably weren't workable; we've learned a ton about distributed systems since the 80s.

[0] https://webcache.googleusercontent.com/search?q=cache:QJlbCk...

[1] https://kubernetes.io/docs/admin/kube-scheduler/


i've been thinking along these lines for a long time. i was originally very inspired by the plan 9 paper, but that model has a lot of weaknesses too. i'll admit that i don't have (m)any solutions, i've been busy with other projects, but this is definitely an area that i would like to have time to explore properly.


"One wonders if and when that pile may collapse. One also wonders what a distributed-first OS might look like. LASP might be the first step towards a whole new computing architecture."

What about Tanenbaum et al's Amoeba OS and Globe toolkit? Or the grid computing platforms that have been around a long time? There's been steady work in this area, with some of it used in production in both CompSci and industry.


> What about Tanenbaum et al's Amoeba OS?

Unlike the contemporary Sprite, Amoeba does not support process migration. https://en.wikipedia.org/wiki/Amoeba_(operating_system)

> Globe toolkit

Got a link to this one?

> Or the grid computing platforms that have been around a long time? There's been steady work in this area with some of it used in production in both CompSci and industry.

Curious if any of these grid computing platforms support process migration (aka "don't make me think about machine boundaries"), and whether any of the more recent ones have been informed by CAP / designed to withstand unreliable networks. LASP is interesting because it has been. As I understand it, process state is a CRDT, so computations should survive network partitions and state should reconverge afterwards.
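The "state as a CRDT" convergence property can be sketched in a few lines. This is a minimal grow-only counter (G-Counter) in Python as an illustration of the general idea; the class and names are mine, not LASP's:

```python
# Minimal grow-only counter (G-Counter) CRDT sketch.
# Each replica increments only its own slot; merge takes the
# element-wise max, so merging is commutative, associative, and
# idempotent -- replicas converge regardless of delivery order.

class GCounter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> count

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other):
        for rid, c in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)

    def value(self):
        return sum(self.counts.values())

# Two replicas diverge during a partition...
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)

# ...then the partition heals: merging in either order yields the same total.
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5
```

Nothing here requires coordination during the partition, which is the point: the merge function alone guarantees reconvergence.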

The biggest questions for me are around the performance characteristics of this approach.


"Unlike the contemporary Sprite, Amoeba does not support process migration"

Oops. My bad. It did a lot as a distributed system, except for that apparently. You also found the one that did.

"Globe toolkit Got a link to this one?"

It's an object-based system. I think the old site is dead; I constantly have to dig it up. Here's a nice summary:

https://cds.cern.ch/record/400321/files/p117.pdf

They also mapped the WWW protocols onto it to some degree, to show we could reuse those legacy systems while benefiting from Globe's capabilities, or at least start the transition that way.

"Curious if any of these grid computing platforms support process migration "

The first I saw was MOSIX:

https://en.wikipedia.org/wiki/MOSIX

Modern, smaller experiment that's similar in Linux:

http://citeseerx.ist.psu.edu/viewdoc/download?rep=rep1&type=...

I found a nice survey paper that includes references to a number of systems from 2006:

https://www.cse.buffalo.edu/~vipin/book_Chapters/2006/2006_1...

I think most of the research is going into "VM" rather than "process" migrations given most computing is going in that direction. If you research that, you will get many more results whose techniques might apply to process migration. Quick example:

http://mvapich.cse.ohio-state.edu/static/media/publications/...

The high-performance part of CompSci has mostly shifted in that direction since most industry and FOSS have shifted toward containers/VMs. The point of my original comment, though, was to say that what you say needs to be done has been happening for a while. It's just a small few in academia who dig in that deep to develop those mechanisms. And it's resulted in work worth finding with even my quick searches. :)


All good stuff, thanks.

> I think most of the research is going into "VM" rather than "process" migrations given most computing is going in that direction.

The VM-migration-over-RDMA paper you cite is interesting, at the very least, because it gives current upper bounds on the wait time for that operation. They do mention there's application downtime. This might not matter at all for scientific / HPC computing (the paper chooses CFD as a benchmark), but it definitely matters to people building highly available services, which I'd guess is most of HN.

Application downtime on a single migrating VM isn't a dealbreaker, but it does assume you've either cleverly replicated state across VM instances, or you've got some other load-balancing infrastructure that's routing traffic in front of a mostly stateless app tier. In either case, you have to be careful with state, and careful operationally as you bring these things up and down. Wouldn't it be nice if the OS just did this for you? That's all I'm saying. I can't remember the last time I migrated a running process from CPU0 to CPU1 by hand. Or even wrote a script to do that.

> The high-performance part of CompSci has mostly shifted in that direction since most industry and FOSS have shifted toward containers/VM's.

It most certainly has. We might be at a local optimum though, and one determined by increasingly irrelevant computing history. I think about that every time I rent a VM to stand up a Unix box to run a container to run what's supposed to be a totally stateless application. Seriously, at that point, why aren't we running side-effect-free FP code as close to the metal as possible?

We've got all these PCs lying around, we've got the peace dividends of the smartphone wars kicking in, we've got almost 50 years of Unix, we don't talk about the Von Neumann architecture anymore because everything is Von Neumann, etc. From my perspective it sure looks like we're surrounded by hammers. If we happen to be living in the Nail Universe, there's nothing wrong with that.

> The point of my original comment, though, was to say that what you say needs to be done has been happening for a while.

I'm very interested in the history of distributed computing -- tons to learn there. "Relearn" might be more accurate. At the same time, I don't trust anything pre-CAP-theorem to work at datacenter scale. We simply didn't understand the scale problems well enough before the early 2000s.


Addendum: MOSIX looks pretty interesting for small-scale things.


Doesn't IBM have the facility to migrate processes (by virtue of virtual machines) from core to core, processor module to processor module, and machine to machine?

I'm fairly confident I've seen that in their hypervisor on Power.


Process migration has been done in HPC, for instance: http://fox.eti.pg.gda.pl/~pczarnul/DAMPVM.html


If you're the author of DAMPVM or are familiar with the internals, can you comment on:

1. The costs of migrating a process. How long does it typically take to an order of magnitude -- ms, secs, mins? How much state needs to be transferred?

2. Limitations. Can all processes be migrated, or are there certain types of process state that prevent migration?

3. What process state needs to be migrated, and how does that process state get enumerated?

4. Are there cases where a process is migrated, but some of its state stays behind, and the new process uses a proxy to continue to read and write it?

Been curious about past attempts at this lately, and I've found a few by digging around. For instance, Sprite OS was a BSD offshoot in the late 80s that had some form of cross-machine process migration:

https://web.stanford.edu/~ouster/cgi-bin/papers/migration-sp...


I'm not the author; I attended a class by the DAMPVM author a long time ago.

There are some references to publications on the main page that should lead to the measurements you're interested in; look around the years 2000-2005. As far as I remember it was very efficient and shined in dynamic load scheduling using process migration. Very cool at the time. It was aimed at the HPC world (PVM); many interesting things were done in that space that were later replaced by the Beowulf attitude toward off-the-shelf components.


My favourite thing about CRDTs (and event-sourced data in general) is that they force us to consider the meaning of time and order as it relates to information. When handled with clarity of thought there are flow-on effects in correctness of processing and (my favourite outcome) in POLA consistency, i.e. the UX improves, especially in the marginal and edge cases.

The downside is that popular frameworks don't mesh well, or at all, with this paradigm. Trying to shoehorn event-based data into, say, Rails (which I've done) is a frustrating exercise in resolving massive impedance mismatches and growing technical debt. It turns out most database-driven MVC frameworks have a very limited understanding of time and order.

CQRS helped here, because I realised that my MVC can and should just be a view & query layer, and I should be building the event sourced/CRDT-based logic elsewhere.
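That CQRS split is easy to sketch: commands append immutable events to a log, and queries are just a fold over the log. A minimal illustration in Python (the event shapes and function names are invented for the example):

```python
# CQRS-style split: the write model is an append-only event log;
# the read side is a projection rebuilt from the log, which an
# MVC layer can treat as a plain query model.

events = []  # append-only event-sourced write model

def deposit(account, amount):
    events.append({"type": "deposited", "account": account, "amount": amount})

def withdraw(account, amount):
    events.append({"type": "withdrew", "account": account, "amount": amount})

def balance(account):
    # Read model: a fold over the log, never mutated directly.
    total = 0
    for e in events:
        if e["account"] == account:
            total += e["amount"] if e["type"] == "deposited" else -e["amount"]
    return total

deposit("alice", 100)
withdraw("alice", 30)
assert balance("alice") == 70
```

Because the log is the source of truth, the query layer can be rebuilt, cached, or swapped out without touching the write path.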


Glad to hear someone else describe the benefits of using CRDTs on the UI side. My most recent project currently ignores the time ordering of conflicting events, but that's an issue we know we'll have to deal with. The CRDT foundation gives us a solid design for free to handle multi-user synchronization issues when we need to. Unfortunately, I've come to believe the whole MVC architecture is an anti-pattern: it's hard to test, hard to debug, and overall takes a lot of work to maintain.

Luckily I've found that many of the modern JavaScript UI libraries play very well with event sourcing, IMHO. React and Vue.js in particular work very well when combined with Redux/Vuex and a CRDT database. Both of those stores are essentially event sources themselves, where the actions and mutators are separated and the data store itself is updated in only one place. Very quick to develop with!
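The Redux/Vuex pattern being described is itself just a pure fold over an ordered stream of actions, which is why the action log doubles as an event source. A language-agnostic sketch (in Python for brevity; the action shapes are invented):

```python
from functools import reduce

# Redux-style store: state is only ever produced by a pure reducer
# applied to an ordered list of actions. Replaying the same actions
# always reproduces the same state, which is the event-sourcing link.

def reducer(state, action):
    if action["type"] == "add_todo":
        return state + [action["text"]]
    if action["type"] == "clear":
        return []
    return state  # unknown actions leave state untouched

actions = [
    {"type": "add_todo", "text": "read CRDT papers"},
    {"type": "add_todo", "text": "try Lasp"},
]

state = reduce(reducer, actions, [])
assert state == ["read CRDT papers", "try Lasp"]
```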

There are a few server-side setups for both React and Vue.js which make them usable in place of Rails, etc. Elixir and Phoenix also work well, since views and controllers in Phoenix are functional.


What CRDT databases do you use in this sort of application?


Currently leveraging CouchDB and its incremental map-reduce to handle the "CRDT" merging of an event stream. Technically I'm not using a CRDT library or DB, but it's surprisingly easy to implement the core commutative operations of CRDTs via a modified deep-merge versioning algorithm. It works pretty well: I can deploy 20,000+ user interactions on an embedded CouchDB instance with steady performance.
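A versioned deep merge of the kind described has roughly this shape. This is a hand-rolled last-writer-wins sketch in Python, not the commenter's actual CouchDB code; versions here are just per-key logical timestamps:

```python
# Last-writer-wins deep merge: each leaf value carries a version
# (a logical timestamp); merging keeps the higher-versioned value
# per key. The merge is commutative, so replicas can apply updates
# in any order and converge. (Equal versions would need a
# deterministic tie-break, e.g. by replica id, in practice.)

def merge(doc_a, doc_b):
    out = dict(doc_a)
    for key, (ver_b, val_b) in doc_b.items():
        ver_a, _ = out.get(key, (-1, None))
        if ver_b > ver_a:
            out[key] = (ver_b, val_b)
    return out

# Documents map keys to (version, value) pairs.
a = {"title": (1, "Draft"), "body": (2, "hello")}
b = {"title": (3, "Final"), "body": (1, "hi")}

# Merging in either order produces the same document.
assert merge(a, b) == merge(b, a) == {"title": (3, "Final"), "body": (2, "hello")}
```

This commutativity is what lets an incremental map-reduce view fold an event stream without caring about arrival order.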

It's interesting to note that while the actual data structures don't conflict, the order in which multiple users modify data, and choosing which user's events win, poses a challenge. With CouchDB merging it's possible to notify a user's interface that their changes were overwritten by another user or device, allowing them to choose.


If you're thinking of dabbling in Erlang and may be put off by the syntax, try Elixir. It's a beautiful, actively evolving language with a healthy community. It compiles directly to Erlang VM bytecode, and you can still call any native Erlang libraries if you happen to need that.

A short review [1] of Elixir from Joe Armstrong, author of Erlang. This is from 2013, and the language has evolved quite a bit since then.

[1] http://joearms.github.io/2013/05/31/a-week-with-elixir.html


There's also LFE (Lisp Flavoured Erlang) [1], by one of Erlang's co-creators, Robert Virding, as an alternative to Erlang or the Ruby-like syntax of Elixir.

  [1]  lfe.io


I don't want this to devolve into a syntax flame war, but I'd encourage people to step outside their comfort zone. I find that the native Erlang syntax helps me to think in Erlang, vs imperative/OO/scripting/whatever.

But yes, better Elixir than yet another reheated Algol.


Oh, I never meant it as a gripe at Erlang's syntax myself. However, there is so much untapped power that the developer community is missing because Erlang is immediately dismissed. People will try node.js, but not Erlang. I see Elixir as a chance for the Erlang VM to finally get the wide adoption it deserves.


Absolutely, agreed on all points.

I really appreciate the work Meiklejohn (disclaimer: former co-worker) is doing to make Erlang more powerful. As I like to say, Erlang gives you better distributed/concurrent primitives than other languages, but they're still primitive.


What people don't get is that Erlang syntax can literally be described in one page. The syntax and the language are very shallow (in the minimalism/simplicity sense). When helping someone learn Erlang, most of the questions are around macros/dialyzer annotations, creating a Hello World app, why OTP names things the way it does, and so on.


It's interesting you say that. I loved the syntax of Erlang at first sight (although the semicolons took some getting used to). Whereas the syntax is the thing I like the least about Elixir - I find it much less elegant.


I very briefly dabbled in Erlang to try to contribute to a blockchain project that uses it. It turned out to be a terrible decision: it's not a language with a large and active community, and finding people to contribute to the codebase proved difficult for the project. On the surface, Erlang made sense for such a project, mainly due to the concurrency primitives and the OTP framework. I still believe Erlang is a bad choice for an open-source project that deals in difficult problems and depends on lots of eyes and contributors, such as blockchain development.





