
Proposal of a new concurrency model for Ruby 3 [pdf] - tenderlove
http://www.atdot.net/fp_store/f.iu25do/file.2016_rubykaigi.pdf
======
pmontra
Tl;dr The goal is to keep compatibility with Ruby 2. It introduces the concept
of guilds and channels to send objects between guilds. The bullet points below
are quoted from a couple of slides, the other text is mine:

* Guild has at least one thread (and a thread has at least one fiber)

* Threads in different guilds can run in parallel

* Threads in a same guild can not run in parallel because of GVL (or GGL: Giant Guild Lock)

A guild can't access the objects of other guilds.

About channels:

* We have Guild::Channel to communicate each other

* 2 communication methods

1\. Copy

2\. Transfer membership or Move in short

Copy is a deep copy and the object is duplicated into the destination guild. A
transfer removes an object from a guild and makes it available to another.
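
To make the copy/move difference concrete, here is a toy sketch in plain Ruby. It does not use the proposed Guild::Channel API, whose details are not settled in the slides; `ToyChannel`, `Box` and `MovedError` are hypothetical names made up for this illustration, and `Marshal` is just one way to get a deep copy.

```ruby
# Toy model in plain Ruby. This is NOT the proposed Guild::Channel API;
# ToyChannel, Box and MovedError are hypothetical names for illustration.
class MovedError < StandardError; end

# A handle around a value, so that a "moved" value can be invalidated.
class Box
  def initialize(value)
    @value = value
    @moved = false
  end

  def value
    raise MovedError, "value was moved to another guild" if @moved
    @value
  end

  # Hand the value over and invalidate this handle.
  def take!
    taken = value
    @value = nil
    @moved = true
    taken
  end
end

class ToyChannel
  def initialize
    @queue = Queue.new
  end

  # Copy: the receiver gets an independent deep copy (via Marshal here).
  def copy(box)
    @queue << Box.new(Marshal.load(Marshal.dump(box.value)))
  end

  # Move: the receiver gets the same object; the sender's handle goes dead.
  def move(box)
    @queue << Box.new(box.take!)
  end

  def receive
    @queue.pop
  end
end

channel = ToyChannel.new
original = Box.new(["a", "b"])

channel.copy(original)
copied = channel.receive
copied.value << "c"          # does not affect the sender's object

channel.move(original)
moved = channel.receive      # receiver now holds ["a", "b"]
begin
  original.value
rescue MovedError => e
  puts "sender lost access: #{e.message}"
end
```

The point of the sketch is only the contract: after a copy both sides have their own object, after a move only the receiver can use it.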

There are also immutable objects that are available to all guilds. Obvious
examples are numbers (which are objects in Ruby), booleans and symbols. I think
that other objects are frozen with
[https://ruby-doc.org/core-2.3.1/Object.html#method-i-freeze](https://ruby-doc.org/core-2.3.1/Object.html#method-i-freeze)

They already did some encouraging benchmarks.

~~~
catnaroek
> A guild can't access the objects of other guilds.

> 2\. Transfer membership or Move in short

How is this enforced? What exactly happens at runtime if a guild tries to
manipulate an object that belongs to another? (Absent a compile-time check,
this is always a possibility.)

~~~
dragonwriter
> How is this enforced? What exactly happens at runtime if a guild tries to
> manipulate an object that belongs to another? (Absent a compile-time check,
> this is always a possibility.)

It would seem that:

(1) Guild ownership would have to be tracked in the runtime, obviously.

(2) For any access from Ruby code, the runtime would also know which Guild
the access request came from as well as the Guild that owns the object being
accessed.

(3) The runtime would be required to fail in some well-defined way (presumably,
raising an exception in the requester) when the rules were violated.

It should be reasonably straightforward to assure this for all accesses within
the runtime, since you can just make sure that there is no method to _request_
access which isn't always attached to the Guild that the request comes from.
It may be possible to break the runtime with poorly-behaved extension code
that subverts the normal mechanisms, and it may be impossible to fully protect
against that, but that's pretty much always a potential with extension code.
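
A rough way to picture points (1) to (3) in plain Ruby is a proxy that records an owner and checks it on every message. This is only a simulation under assumptions: `ToyGuild`, `Owned`, `GuildAccessError` and `in_guild` are hypothetical names, and the "current guild" is faked with a thread-local variable rather than anything the real runtime would do.

```ruby
# Simulation only; none of these names come from the proposal.
class GuildAccessError < StandardError; end

# Hypothetical stand-in for a guild: just an identity with a name.
ToyGuild = Struct.new(:name)

# (1) Ownership is tracked: the proxy remembers which guild owns the target.
class Owned < BasicObject
  def initialize(target, owner)
    @target = target
    @owner = owner
  end

  # (2) Every message goes through here, so the requester is always known.
  def method_missing(name, *args, &block)
    current = ::Thread.current[:toy_guild]
    unless current.equal?(@owner)
      # (3) Fail in a well-defined way: raise in the requester.
      requester = current ? current.name : "none"
      ::Kernel.raise ::GuildAccessError,
                     "guild #{requester} cannot access an object owned by #{@owner.name}"
    end
    @target.__send__(name, *args, &block)
  end
end

# Run a block "inside" a guild by setting the thread-local marker.
def in_guild(guild)
  previous = Thread.current[:toy_guild]
  Thread.current[:toy_guild] = guild
  yield
ensure
  Thread.current[:toy_guild] = previous
end

g1 = ToyGuild.new("g1")
g2 = ToyGuild.new("g2")
list = Owned.new([1, 2, 3], g1)

size_seen_by_owner = in_guild(g1) { list.size }   # allowed
begin
  in_guild(g2) { list.size }                      # rejected
rescue GuildAccessError => e
  puts e.message
end
```

A real implementation would presumably do this check inside the VM rather than through a proxy, which is also why extension code that bypasses the VM could still break it.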

~~~
catnaroek
How would you transfer ownership of big linked data structures? See my other
comment:
[https://news.ycombinator.com/item?id=12455566](https://news.ycombinator.com/item?id=12455566)

I'm not particularly worried about C extensions. I already know those are a
lost cause.

~~~
dragonwriter
> How would you transfer ownership of big linked data structures?

Very carefully?

More seriously, I think with guilds, what you absolutely _don't_ want to do
is build yourself into a position where you ever _want_ to move a big linked
data structure (that is, unless you know you are only going to use it in one
guild, you never want to _build_ a big linked data structure of mutable
objects.)

Big structures of mutable objects should be guild local (or external, in a
store that has its own controls for concurrent access.)

~~~
_ko1
absolutely.

------
_ko1
Could you link to
[http://www.atdot.net/~ko1/activities/2016_rubykaigi.pdf](http://www.atdot.net/~ko1/activities/2016_rubykaigi.pdf)
instead? The current one is on temporary file space (it will be removed soon).

~~~
tenderlove
Apparently I can't change the link. I'm sorry! :-(

~~~
lake99
The mods can do it. I once left a comment about changing the link. They saw it
and changed it on their own. I don't know how to notify them though.

~~~
yxhuvud
I'm reasonably certain they are notified if the article is flagged, but I
suppose it may not be the correct way of doing it.

~~~
nitrogen
AIUI flagging the article can also affect its ranking or kill it entirely, so
it's probably not the best way.

------
kent1
I worked on a similar proposal during my PhD thesis. It is formalized for a
Java-like language and implemented in the Jikes RVM. We also carried out a
proof of isolation using Coq.

[https://tel.archives-ouvertes.fr/tel-00933072](https://tel.archives-ouvertes.fr/tel-00933072)

~~~
mattnewton
That looks really useful. If you have time, please chime in on the proposal!

~~~
kent1
The ownership check is required for each access to an object. However, it is
straightforward to see that successive checks of the same object can be
optimized out if the object has not been passed to another owner. In this
thesis I describe dynamic and static analyses to remove the unnecessary checks.

------
jph
The key points IMHO:

1\. This Ruby 3 proposal says that Ruby 2 compatibility is mission critical,
therefore this proposal rejects concurrency solutions from other languages
(e.g. Erlang) and concepts (e.g. functions) and data structures (e.g.
immutable collections).

2\. Instead the proposal is to create a fast copy-on-write with rules to "deep
freeze" some kinds of objects and primitives into an immutable sharable state.

~~~
nateberkopec
> This Ruby 3 proposal says that Ruby 2 compatibility is mission critical

Matz has been very public about his fear of a "Python 3" situation occurring
in the Ruby community.

~~~
awj
And rightly so, I should think. Given the presence of languages like Elixir
and Go, creating a situation where you are breaking people's code to introduce
multicore programming systems is a pretty bad idea.

I can easily see how people might (rightly or wrongly) say "Ruby 3 broke my
code, I'm rewriting in Go".

------
readittwice
Hmm, I am wondering how moving ownership would work in a GC'ed system. You
could have arbitrarily many references to the moved object (or subobjects).
The slides say that an exception is thrown if an object of a different guild
is accessed, but doesn't that mean that Ruby needs to check the guild at every
object access?

Transferring ownership would probably also mean that Ruby not only needs to
move one object but probably all subobjects recursively as well. I assume here
that "moving" just means updating the guild field for each object.

Is this really feasible or wouldn't just copying the object be faster... I
don't know of any system with gc that uses moving to transfer mutable objects
between threads. Do such systems exist? Are there better ways of implementing
this?

~~~
chrisseaton
> The slides say that an exception is thrown if an object of a different guild
> is accessed, but doesn't that mean that Ruby needs to check the guild at
> every object access?

Ruby is already checking the class of the object on every access. You could
combine the guild and the class into a tuple and compare against that instead,
so it adds no extra overhead.

There is a paper at OOPSLA this year on doing just that
[http://2016.splashcon.org/event/splash-2016-oopsla-efficient...](http://2016.splashcon.org/event/splash-2016-oopsla-efficient-and-thread-safe-objects-for-dynamically-typed-languages)

~~~
aardvark179
I can't comment on that paper since it doesn't appear to be available publicly
yet, so I'm just going to talk about guilds as proposed.

Adding a guild word to each object header is certainly a way to check
ownership, and should be a cheap check to perform in the interpreter, but will
obviously add some extra overhead to standard program execution.

The thing that concerns me is that explicit ownership passing can introduce as
many bugs as it solves. If I have two objects A and B, with A holding a
reference to B, then I can freeze A and freely pass it between guilds, but if
I try and touch B I'll get an error until that too has been frozen or its
ownership transferred. The same problem occurs with explicit ownership
transfer of a non-frozen A, which leaves you with the slower option of a deep
copy or a recursive ownership transfer, which can have equally unexpected
consequences.
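
The A/B situation is easy to reproduce in today's Ruby, because `Object#freeze` is shallow. The sketch below shows that, plus a recursive helper; `deep_freeze` and `Node` are hypothetical names, not part of Ruby core, and the helper only walks Structs, Arrays, Hashes and instance variables.

```ruby
Node = Struct.new(:label, :child)

b = Node.new(String.new("b"), nil)
a = Node.new(String.new("a"), b)

a.freeze
a_frozen = a.frozen?              # true
b_frozen = a.child.frozen?        # false: freeze did not reach B
a.child.label << "!"              # B is still mutable through frozen A

# Hypothetical helper: freeze an object and everything reachable from it.
# The identity-keyed hash guards against cycles.
def deep_freeze(obj, seen = {}.compare_by_identity)
  return obj if seen.key?(obj)
  seen[obj] = true

  case obj
  when Struct, Array
    obj.each { |value| deep_freeze(value, seen) }
  when Hash
    obj.each do |key, value|
      deep_freeze(key, seen)
      deep_freeze(value, seen)
    end
  end
  obj.instance_variables.each do |ivar|
    deep_freeze(obj.instance_variable_get(ivar), seen)
  end
  obj.freeze
end

deep_freeze(a)
b_frozen_after = a.child.frozen?  # true: B and its label are frozen now
```

Whether the runtime would do this walk for you, or raise when it meets an unfrozen B, is exactly the kind of thing the proposal would need to spell out.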

The "Ruby global data" slide also gives me the screaming heebie-jeebies, as did
finding stack overflow answers on how to unfreeze objects in MRI. I'm sure
nothing will go wrong. :-)

Having said all that, it probably can work nicely for the common use cases of
balancing requests between a group of worker guilds where the request is a
simple data structure whose ownership can be safely transferred, but it would
be hard to do a general work stealing solution that was always safe.

~~~
parenthephobia
_> Adding a guild word to each object header is certainly a way to check
ownership, and should be a cheap check to perform in the interpreter, but will
obviously add some extra overhead to standard program execution_

It needn't be done this way. When an object is invalidated it could have its
class pointer sneakily changed into a special "invalid object" class. Any
attempt to do anything concrete with the object would be rebuffed, but normal
object accesses wouldn't be changed.

 _> The thing that concerns me is that explicit ownership passing can
introduce as many bugs as it solves. If I have two objects A and B, with A
holding a reference to B, then I can freeze A and freely pass it between
guilds, but if I try and touch B I'll get an error until that too has been
frozen or its ownership transferred._

At least you get an error. Ultimately, the only alternative with comparable
performance is sharing mutable references. That avoids this specific problem
but is open to the full assortment of problems that can occur with concurrent
mutable state, many of which aren't automatically detectable in principle.

 _> The "Ruby global data" slide also gives me the screaming heebie-jeebies, as
did finding stack overflow answers on how to unfreeze objects in MRI. I'm sure
nothing will go wrong. :-)_

If this proposal is adopted it's a simple matter to prohibit unfreezing
objects that have been shared. :)

Not that it would really be necessary. If you're reaching into MRI's
internals to unfreeze an object, then it's up to you to make sure that things
don't break.

------
transfire
Hope they improve the syntax, it looks horrid -- code in strings and all.

~~~
_ko1
That's because of a current limitation. We'll improve it.

------
masterleep
How would you use this to parallelize Rails requests? I guess you would need a
pool of guilds, each with its own set of controllers, etc.

Since the requests would not be in the "main" guild, it might be painful to
call into gems.

~~~
artellectual
I guess you could boot up a pool of guilds in your process or, better yet,
generate them on demand as requests come in, process the request, and kill
the guild off when processing is done, since the request object shouldn't
be shared.

It all really depends on how much overhead there is to create and destroy
guilds. If it's easy then ideally you could start 100s of guilds or 1000s
should your hardware allow it.
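
The pool shape can be sketched with what Ruby has today, using threads and queues as stand-ins for guilds and channels. That substitution is an assumption for illustration: these threads still share one GVL, so this shows the hand-off pattern only, not the parallelism guilds are meant to provide.

```ruby
# Threads + Queues stand in for guilds + channels (illustration only).
POOL_SIZE = 4
requests  = Queue.new
responses = Queue.new

workers = Array.new(POOL_SIZE) do |id|
  Thread.new do
    # Each worker takes a request off the queue and is then the only
    # code touching it, which mirrors handing ownership to a guild.
    while (request = requests.pop) != :shutdown
      responses << { worker: id, path: request[:path],
                     body: "handled #{request[:path]}" }
    end
  end
end

paths = %w[/a /b /c /d /e]
paths.each { |path| requests << { path: path } }

# One shutdown marker per worker so every loop terminates.
POOL_SIZE.times { requests << :shutdown }
workers.each(&:join)

results = Array.new(paths.size) { responses.pop }
results.each { |r| puts "worker #{r[:worker]}: #{r[:body]}" }
```

A fixed pool avoids paying guild startup cost per request; spawning on demand would only make sense if that cost turns out to be small.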

I see guilds as a subprocess with its own isolated resources.

~~~
pmontra
Ideally guilds could be equivalent to lightweight processes at application
level (not OS), much like in Erlang. Then they could be scheduled to run
concurrently using OS threads (multiple guilds per thread) and take advantage
of multiple cores. That's part of BEAM, the Erlang VM. I think it's going to
take a while.

~~~
_ko1
Similar to an Erlang process, but more heavyweight (because it creates an OS
thread per Guild).

~~~
_ko1
More correctly, it makes an OS thread per Ruby thread, and creating a Guild
makes 1 Ruby thread.

------
ivoras
Ok, "guilds"? Is the principle behind this so much different than everything
done before that it requires repurposing a completely new word?

On par with Rust's "crates". Gives the impression some people just want to be
remembered as inventing names.

~~~
dragonwriter
> "guilds"? Is the principle behind this so much different than everything
> done before that it requires repurposing a completely new word?

Pretty much. I mean, if there is a standard name for a thing between a process
and a thread that is not a thread group, I haven't heard it.

------
DougBarth
If I'm reading this proposal correctly, locks will still be needed within
multithreaded guilds to guard mutations against complex object graphs.

Here's my reasoning. Since the GVL is insufficient to guard against data races
on Ruby 2, under the guild system, locks would be needed to guard against
concurrency issues if multiple threads are present.
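
For reference, this is the kind of locking meant here, written as ordinary Ruby 2 code. Under the proposal as described in this thread it would simply be code running inside a single guild; nothing in it is guild-specific.

```ruby
counter = 0
lock = Mutex.new

threads = Array.new(8) do
  Thread.new do
    1_000.times do
      # `counter += 1` is a read, an add and a write; the GVL does not make
      # that sequence atomic, so the mutex guards the whole update.
      lock.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join)
puts counter
```

With the mutex the final count is deterministic; without it the result depends on when the VM switches threads.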

It would seem like the intention would be to replace usages of Thread with
Guild to avoid the concurrency issues inherent with threaded code. Will there
be API support to create a Guild that only allows a single thread?

~~~
dragonwriter
> locks will still be needed within multithreaded guilds

It seems to me that is the intent; that is, any Ruby code that exists now is
single-guild Ruby 3 code -- if it's multithreaded, it needs locks, for the same
reason it does now.

> It would seem like the intention would be to replace usages of Thread with
> Guild to avoid the concurrency issues inherent with threaded code

I think that'll be a common use case, though running what amount to multiple
"legacy" Ruby 2 multithreaded systems in separate Guilds in the same Ruby 3
process seems also to be an intended supported use case.

> Will there be API support to create a Guild that only allows a single
> thread?

It certainly sounds like a good idea.

------
DanWaterworth
This is interesting. It doesn't mention GC, but since frozen objects can be
shared between guilds, I assume the GC remains global. Perhaps this will
trigger interest in immutable datastructures in ruby.

~~~
_ko1
Quoted from slides:

> GC/Heap
>
> * Share it. Do stop the world parallel marking- and lazy concurrent sweeping.
>
> * Synchronize only at page acquire timing. No any synchronization at creation time.

~~~
DanWaterworth
I stand corrected.

------
zeckalpha
How does this compare to the ongoing efforts to remove the GIL in Python? It
looks like the Ruby GVL would stay, but be scoped to a Guild, rather than a
Process?

~~~
artellectual
Anyone correct me if I'm wrong here.

Seems like a guild is just a subprocess with its own resources. And you copy
objects over as needed. And when the guild is done it will get garbage
collected. Like other objects.

~~~
Someone
I think I would consider implementing it as 1:1 threading where every
thread=guild runs its own set of green threads.

That likely would be faster than having OS threads in each guild that use OS
locks to prevent running >1 of them concurrently.

~~~
_ko1
CRuby/MRI supports C extensions, which can use TLS (thread-local storage), so
each Ruby thread runs on its own OS thread.

------
sciurus
This reminds me in some ways of Eric Snow's (rejected, afaik) proposal to
extend "subinterpreters" to allow parallelism in Python.

[https://lwn.net/Articles/650489/](https://lwn.net/Articles/650489/)

------
jellymann
The PDF appears to have been removed. I'm getting a "Not Found" page.

------
gamesbrainiac
Any idea where the video of the talk is?

~~~
steveklabnik
Ruby Kaigi has just started, so I'm guessing it will be a while.

~~~
_ko1
Yes.

------
claudiug
Do we have any date for this new way of doing concurrency in Ruby?

~~~
pkmiec
the new concurrency is part of ruby 3. matz says he wishes for it to be out by
2020. but who knows :).

~~~
cutler
So 4 years to go. Not quite Perl 6 but it could be a bit late in the day
considering the rate at which Ruby is losing mindshare.

------
porges
I ♥ COM

