
A Case Study in LASP and Distribution at Scale [video] - cmeiklejohn
https://www.youtube.com/watch?v=lhHtiBpDa54
======
elcritch
This is great! I'm continuously nonplussed by deployment systems which require
bootstrapping a master to bootstrap the system. Docker Swarm, Kubernetes,
Rancher, Project FIFO, etc. all require setting up host machines, designating
some coordinator node(s), and then turning on the system.

Working a lot in IoT, or even with on-site deployments of services, I find
this pattern becomes a blocker. It's more difficult than the traditional "set
up a server, turn it on, install software" approach, especially for small
deployments. There's a lot of need (IMHO) for orchestration systems that are
self-bootstrapping and fully P2P, for smaller teams deploying on clouds, IoT
devices in a given locale, research labs, etc.

I've been watching your work for a while, and it's great you're moving toward
usable systems! Is there a good place to watch your work and/or engage with
the community? I'm starting work on some new IoT devices, and while this won't
be directly applicable I'd like to play around with some of the core ideas.

~~~
cmeiklejohn
We have a Slack, so you can reach out to me on Twitter at @cmeik and I'll be
happy to invite you. We have several almost-production users we have been
fixing bugs for, and the Lasp Slack is the place to go to get more information
about what we are doing and how you can influence our future direction.

~~~
elcritch
That'd be great! I'll have to dig up a dusty Twitter account. I may be more of
a lurker for now, but seeing how Lasp is shaping up would be pretty valuable.

A public Slack would be nice. Looking at some of the project pages a few
months ago, it seemed like Lasp might be dead for a while. That's
understandable when you're focused on solving hard problems, but it's an FYI
from an outside perspective. This talk is really helpful in seeing what you've
learned and where it's going. Thanks again!

~~~
cmeiklejohn
How do you suggest we move forward? We tried Slack, but being invitation-only
prevented us from making a big impact.

What's the right way to build an open community? What do you recommend?

~~~
shalabhc
Another option is gitter.im, which links to your GitHub project and doesn't
have to be invite-only.

~~~
cmeiklejohn
I've set up a Gitter here:
[https://gitter.im/lasp-lang/lasp](https://gitter.im/lasp-lang/lasp)

------
cmeiklejohn
Related work:

PPDP '17 paper on scaling:
[http://christophermeiklejohn.com/publications/ppdp-2017-preprint.pdf](http://christophermeiklejohn.com/publications/ppdp-2017-preprint.pdf)

Ensuring monotonicity via types:
[http://prl.ccs.neu.edu/blog/2017/10/22/monotonicity-types-towards-a-type-system-for-eventual-consistency/](http://prl.ccs.neu.edu/blog/2017/10/22/monotonicity-types-towards-a-type-system-for-eventual-consistency/)

~~~
rdtsc
Great stuff, love your work.

I also enjoyed the details in the video about how things ran on a variety of
cloud environments. The Google one was funny: Kubernetes in Kubernetes on Borg
on hardware virtualization on hardware.

~~~
cmeiklejohn
We are open to any platform that provides free resources for running
experiments. Of course, any funding will be acknowledged in papers we publish.
:)

------
shalabhc
Very interesting - LASP looks like it is based on CRDTs, which I understand
work by attaching location and time metadata to the data operations
(add/remove) and then being able to merge history consistently.
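
(Roughly, as I understand it, something like this toy state-based OR-Set
sketch in Erlang -- my own illustration of "metadata on add/remove, then
merge", not Lasp's actual data structures or API:)

    %% Hypothetical sketch: a state-based OR-Set. Adds carry a unique tag
    %% (e.g. {Node, Timestamp}); removes record observed tags; merge is a
    %% union, so replicas converge regardless of delivery order.
    -module(orset_sketch).
    -export([new/0, add/3, remove/2, merge/2, value/1]).

    new() ->
        {sets:new(), sets:new()}.

    %% Add the element together with its per-operation metadata.
    add(Element, UniqueTag, {Adds, Removes}) ->
        {sets:add_element({Element, UniqueTag}, Adds), Removes}.

    %% Remove by recording the tags of every add observed so far.
    remove(Element, {Adds, Removes}) ->
        Observed = [Tag || {E, Tag} <- sets:to_list(Adds), E =:= Element],
        {Adds, sets:union(Removes, sets:from_list(Observed))}.

    %% Merge is commutative, associative and idempotent.
    merge({A1, R1}, {A2, R2}) ->
        {sets:union(A1, A2), sets:union(R1, R2)}.

    %% Visible value: adds whose tags have not been removed.
    value({Adds, Removes}) ->
        lists:usort([E || {E, Tag} <- sets:to_list(Adds),
                          not sets:is_element(Tag, Removes)]).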

Is there more work that generalizes this idea? E.g. attach metadata to more
generic 'messages' that operate on 'processes' and then have the processes
define how (or if) the merge occurs, given the history of all messages at
different cloned replicas? Can LASP support something like this?

~~~
cmeiklejohn
Lasp can't support this yet, but we're moving in this direction. We've built a
highly-available membership library called partisan, and the next step is to
make it so we can basically do this type of work inside of something like OTP
with normal messaging. However, it's not trivial, because Erlang doesn't allow
us to trivially include metadata in our messages the way Microsoft Orleans
does (this is how we transparently added transactions into Orleans without
changing user semantics: by transparently augmenting the message during
transmission).

~~~
cmeiklejohn
Right now, everything is manual and requires explicitly passing metadata
between messages -- this is less than ideal, and we're hoping to have some
sort of facility where we can do this transparently in Erlang -- a la
parse_transform or similar, where we can thread either a causal context or a
session context through to make certain types of guarantees. If this is
interesting, we'd love to talk to you to learn more about how we could bring
this to reality in a real application.
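
(To give a rough flavor of what "manual" means today -- hypothetical names,
not partisan's or Lasp's real API -- every send has to wrap the payload with a
causal context and every receive has to merge it:)

    %% Sketch only: explicit causal-context threading around plain messages.
    -module(causal_send_sketch).
    -export([send/3, loop/1]).

    %% Wrap the payload with the sender's causal context (here, a vector
    %% clock represented as a map of Node => Counter) before sending.
    send(Pid, Msg, CausalContext) ->
        Pid ! {causal, Msg, CausalContext}.

    %% The receiver has to unwrap and merge explicitly -- this is the
    %% boilerplate a parse_transform could thread through automatically.
    loop(LocalContext) ->
        receive
            {causal, Msg, RemoteContext} ->
                Merged = merge_contexts(LocalContext, RemoteContext),
                handle(Msg, Merged),
                loop(Merged)
        end.

    %% Pointwise maximum of two vector clocks.
    merge_contexts(A, B) ->
        maps:fold(fun(Node, Counter, Acc) ->
                      maps:update_with(Node,
                                       fun(C) -> max(C, Counter) end,
                                       Counter, Acc)
                  end, A, B).

    handle(Msg, Context) ->
        io:format("handling ~p with context ~p~n", [Msg, Context]).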

~~~
shalabhc
My interest in this at the moment is research-oriented. I do feel Lasp is a
great project, and I'd like to see more work in this area because I believe
there is a general lack of distributed programming systems that abstract away
an entire network of computers. I believe that is where the future lies -- we
do not want to be programming one OS process at a time and using a disparate
array of containers while building very large and complicated systems.

Passing through a causal context seems like a good idea. I'm not that familiar
with Erlang but perhaps it doesn't have to be fully transparent? I presume
there is a subset of processes that would participate in the distributed
implementation and each could be encapsulated by another LASP process that
wraps/unwraps messages as they go out/in and handles the metadata?
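
(Something like a per-process proxy, perhaps -- a made-up sketch, just to
illustrate the wrapping idea rather than anything Lasp provides:)

    %% Toy proxy: sits in front of a plain process, attaches metadata on
    %% the way out and strips/merges it on the way in.
    -module(meta_proxy_sketch).
    -export([start/1, proxy_loop/2]).

    start(InnerPid) ->
        spawn(?MODULE, proxy_loop, [InnerPid, #{}]).

    proxy_loop(InnerPid, Context) ->
        receive
            %% Wrapped message from a remote proxy: merge the metadata,
            %% deliver only the bare payload to the wrapped process.
            {wrapped, Payload, RemoteContext} ->
                NewContext = maps:merge(Context, RemoteContext),
                InnerPid ! Payload,
                proxy_loop(InnerPid, NewContext);
            %% The wrapped process asks us to send on its behalf: attach
            %% the current metadata before forwarding to the remote proxy.
            {send, RemoteProxy, Payload} ->
                RemoteProxy ! {wrapped, Payload, Context},
                proxy_loop(InnerPid, Context)
        end.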

Btw, some related ideas exist in Croquet/TeaTime as well
([http://www.vpri.org/pdf/tr2003001_croq_collab.pdf](http://www.vpri.org/pdf/tr2003001_croq_collab.pdf))

------
fenollp
> 10k EUR in 9 months

Did you take a look at Scaleway's "Dedicated ARM cores"? They start at ~3EUR
for 4c/2GB.
[https://www.scaleway.com/pricing/](https://www.scaleway.com/pricing/)

Their instances come up in under a minute. They have a nice client,
[https://github.com/scaleway/scaleway-cli](https://github.com/scaleway/scaleway-cli),
and they are from the EU, which is a great place to spend EU grants ;)

Not affiliated. Just a happy customer.

------
binarytemple
Nice work. Distributed computation is the future. Also glad to see someone
scaling Erlang node count.

~~~
cmeiklejohn
Thank you!!

------
rad_gruchalski
I've been looking forward to watching the video and learning a little more
about Lasp itself. I followed your work closely on Twitter a while ago.
Unfortunately, I got a little lost about halfway through. I'm not exactly sure
what point you were trying to put across, and I still don't really know what
Lasp is. It seems that it's somehow about disseminating state in distributed
systems? However, the digs at Mesos and distributed Erlang, and the use case
of launching a cluster within a short period of time, suggest that
disseminating state isn't the core of the talk.

I've done some work on gossip systems in the past;
[http://gossiperl.com](http://gossiperl.com) is the result of my research.
Gossiperl was based on work I did at Technicolor Virdata (shut down nearly a
couple of years ago). We built a distributed device management / IoT / data
ingestion platform consisting of over 70 VMs (EC2, OpenStack, SoftLayer). That
was before Docker became popular; virtually everyone was thinking in terms of
instances back then. These machines would hold different components of the
platform: ZooKeeper, Kafka, Cassandra, some web servers, some Hadoop with
Samza jobs, load balancers, Spark and such.

Our problem was the following: each of these components has certain
dependencies. For example, to launch Samza jobs, one needs Yarn (the Hadoop
one) and Kafka; to have Kafka, one needs ZooKeeper. If we were to launch these
sequentially, it would take a significant amount of time, considering that
each node would get bootstrapped from zero every single time (a base image
with some common packages installed) using Chef and installing deb / rpm
packages from the repos.

What we put in production was a gossip layer written in Ruby, 300 lines or so.
Each node would announce just a minimal set of information: what role it
belongs to, what id it has within that role, and its address. Each component
would know the count of each dependency it requires within the overlay to
bootstrap itself. For example, in EC2 we would request all these different
machines at once. Kafka would be bootstrapping at the same time as ZooKeeper,
and Hadoop would be bootstrapping alongside. Each machine, when bootstrapped,
would advertise itself in the overlay, and the overlay would trigger a Chef
run with a hand-crafted run list for the specific role it belonged to. So each
node would effectively receive a notification about every new member and
decide whether or not to take an action. Once 5 ZKs were up, Kafka nodes would
configure themselves for ZooKeeper and launch. Eventually the Kafka cluster
was up. A similar process would happen on all the other systems, eventually
leading to a complete cluster of over 70 VMs, running (from memory) about 30
different systems, being completely operational: databases, dashboards, MQTT
brokers, TLS, whatnot.

We used to launch this thing at least once a day. The system would usually
become operational in under half an hour, unless EC2 was slacking off. Our
gossip layer was trivial. In this sort of platform there are always certain
nodes that should be reachable from outside: web server, load balancer, MQTT
broker. Each of those would become a seed; any other node would contact one of
those public nodes and start participating.
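
(The trigger logic was roughly this -- the original was ~300 lines of Ruby,
but an Erlang-flavoured, made-up sketch of the dependency-count idea looks
something like:)

    %% Sketch only: each role declares how many members of each dependency
    %% role it needs before it can configure and start itself.
    -module(bootstrap_trigger_sketch).
    -export([on_member_announced/2]).

    requirements(kafka)  -> #{zookeeper => 5};
    requirements(samza)  -> #{yarn => 1, kafka => 3};
    requirements(_Other) -> #{}.

    %% Called by the gossip layer whenever a new {Role, Id, Address}
    %% announcement is observed. Members maps Role => list of peers seen.
    on_member_announced(MyRole, Members) ->
        Needed = requirements(MyRole),
        Satisfied = maps:fold(
            fun(Role, Count, Acc) ->
                Acc andalso length(maps:get(Role, Members, [])) >= Count
            end, true, Needed),
        case Satisfied of
            true  -> trigger_chef_run(MyRole, Members);
            false -> waiting
        end.

    %% Placeholder for "run Chef with a hand-crafted run list for this role".
    trigger_chef_run(Role, Members) ->
        io:format("dependencies met for ~p, configuring against ~p~n",
                  [Role, Members]),
        ok.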

From a capabilities perspective, the closest thing resembling that kind of
infrastructure today is HashiCorp Consul. Our gossip from Virdata is
essentially what the service catalog in Consul is, and our Chef triggers are
what watches in Consul are. With these two things, anybody can put up a
distributed platform like what you are describing in your talk and what we
built at Virdata. There are obviously dirty details, like the need for a clear
separation between installation, configuration, and startup of the systems
within the deployment: the packages can be installed concurrently on different
machines, applying the configuration triggers the start (or restart), and the
system becomes operational.

Or do I completely miss the point of the talk? I'd like to hear more about
your experiences with Mesos. You're not the first person claiming that it
doesn't really scale as far as the maintainers suggest.

By the way, HyParView -- good to know; I'd missed it in my own research. Maybe
it's time to dust off gossiperl.

* edit: wording

~~~
jamesblonde
Karamel is an interesting orchestration platform for Chef Solo that does what
you are describing. But it is general-purpose: add orchestration rules to Chef
cookbooks and then arbitrarily compose platforms of services in a single
compact YAML file. In contrast to gossiperl, it is centralized. Might be
interesting to combine the two!

------
polskibus
How does the Lasp stack compare to Akka's?

