Can someone explain the "write barrier" stuff? It's hard to glean what the problem is that they're solving here... And what they're saying about Arrays, Strings, Hash, etc, in relation to WB protected vs. WB unprotected.
I've not read the code, so someone may have to correct me, but I think this is the gist of it:
Ruby 2.1 introduces a generational garbage collector, which divides all objects into young and old generations. A regular GC run will only look at the young generation, with the old being collected less frequently. An object is promoted to the old generation when it survives a young generation run.
If you have objects in the old generation referring to objects in the young generation, but you're only looking at the young generation, it may seem like an object doesn't have any references, and you might incorrectly GC an in-use object. Write barriers prevent this by adding old generation objects to a 'remember set' when they are modified to refer to a young generation object (e.g. old_array.push(young_string)). This 'remember set' is then taken into account when collecting the young generation.
Most generational garbage collectors need these write barriers on all objects, but with the many 3rd party C extensions available for Ruby this isn't possible, so a workaround was devised whereby objects that aren't write barrier protected won't ever be promoted to the old generation. This isn't ideal as you won't get the full benefit of the generational GC, but it does maximise backwards compatibility.
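To make the idea concrete, here is a toy model in plain Ruby. It is only a sketch of the scheme described above, not MRI's implementation: ToyObject, ToyHeap, write and minor_gc are all made-up names, and the real collector is written in C and tracks far more state.

```ruby
# Toy model only: none of these classes or methods exist in MRI.
class ToyObject
  attr_reader :name, :refs, :wb_protected
  attr_accessor :old

  def initialize(name, wb_protected: true)
    @name = name
    @refs = []                    # outgoing references
    @wb_protected = wb_protected  # false models an object from a legacy C extension
    @old = false
  end
end

class ToyHeap
  attr_reader :objects, :remember_set

  def initialize
    @objects = []
    @remember_set = []
  end

  def allocate(name, wb_protected: true)
    obj = ToyObject.new(name, wb_protected: wb_protected)
    @objects << obj
    obj
  end

  # The write barrier: a reference store that records old -> young pointers.
  def write(from, to)
    from.refs << to
    @remember_set << from if from.old && !to.old && !@remember_set.include?(from)
  end

  # Minor GC: old objects are never traversed, so young objects that are
  # only reachable from old ones are found through the remember set.
  def minor_gc(roots)
    marked = []
    worklist = roots + @remember_set.flat_map(&:refs)
    until worklist.empty?
      obj = worklist.pop
      next if obj.old || marked.include?(obj)
      marked << obj
      worklist.concat(obj.refs)
    end
    # Sweep unmarked young objects; old objects are left alone.
    @objects.select! { |o| o.old || marked.include?(o) }
    # Promote WB-protected survivors; unprotected ones stay young.
    marked.each { |o| o.old = true if o.wb_protected }
    # Keep remembering old objects that still point at young ones.
    @remember_set = (@remember_set + marked).uniq.select do |o|
      o.old && o.refs.any? { |r| !r.old }
    end
  end
end

heap = ToyHeap.new
old_array = heap.allocate("old_array")
heap.minor_gc([old_array])
puts old_array.old                        # => true (survived one minor GC)

young_string = heap.allocate("young_string")
heap.write(old_array, young_string)       # goes through the barrier
heap.minor_gc([old_array])
puts heap.objects.include?(young_string)  # => true

lost = heap.allocate("lost")
old_array.refs << lost                    # bypasses the barrier
heap.minor_gc([old_array])
puts heap.objects.include?(lost)          # => false (wrongly collected)

ext = heap.allocate("c_ext_object", wb_protected: false)
heap.minor_gc([ext])
puts ext.old                              # => false (never promoted)
```

The "lost" case is the bug that write barriers exist to prevent, and the last case is the workaround for objects that can't guarantee a barrier on every write.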
Every time I read a Ruby version announcement/explanation, it reminds me that I've been a coder working and thinking exclusively in English, but my current programming language of choice is prominently steered by non-English speakers...and somehow things work pretty peachy. But it always makes me wonder how much richer and more in-depth the discussion of Ruby is amongst its Japanese maintainers, before they translate it for us into English?
You can subscribe to ruby-dev and find out. ;) It's actually much more low-traffic than ruby-core, and someone has volunteered to translate any email if you ask.
I really like the ObjectSpace.trace_object_allocations addition. I had a quick look at the 2.1 docs for ObjectSpace and there seem to be all sorts of interesting methods in there, reachable_objects_from(obj) and memsize_of(obj) for example. It looks like understanding memory usage and tracking down memory leaks will get a lot easier with 2.1.
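A minimal sketch of how these methods fit together, going by the objspace docs. The `cache` and `report` names are just for illustration, and the exact numbers and reachable objects will vary by Ruby version:

```ruby
require 'objspace'

report = nil
ObjectSpace.trace_object_allocations do
  cache = ["x" * 1024, { key: "value" }]
  # Query inside the block, while allocation tracing is active.
  report = {
    file:      ObjectSpace.allocation_sourcefile(cache),    # file that allocated the array
    line:      ObjectSpace.allocation_sourceline(cache),    # line of the allocation
    bytes:     ObjectSpace.memsize_of(cache),               # size of the array itself, not its elements
    reachable: ObjectSpace.reachable_objects_from(cache)    # objects it references directly
  }
end

puts "allocated at #{report[:file]}:#{report[:line]}"
puts "memsize: #{report[:bytes]} bytes"
p report[:reachable]
```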
Koichi gives some of the best presentations. It is hard enough to be funny in one language, so to be funny and understandable in two is quite an accomplishment. If you're ever at a RubyConf, go to his talks. Also love the speed work he's been doing with generational GC. For a primer, Pat Shaughnessy has a great talk you should check out; his book Ruby Under a Microscope is also a good dive into Ruby.
I'm uncertain why they've added an `f` suffix for frozen string literals/instantiation.
I can understand needing to freeze an existing String, and you can already do that with Object#freeze, but a literal syntax seems to overlap with the usage of Symbols. The only benefit I can think of is not needing to use Symbol#to_s when working alongside strings.
I'm not saying frozen strings are not useful, because they can be after taking in some input/params to work with/store, but a literal syntax seems extremely edge-case-y. Now instead of "foo".freeze it's "foo"f, but how frequently will that save you time/trouble?
The bigger implication is that slides 23/24 show that immutable strings and symbols will share their heap locations, if I'm reading the diagrams correctly, but have different object_ids.
This seems a bit bewildering and focused on micro-optimization, which is not the mental model I have in mind when I'm coding in a very high level language. It's also blurring the semantic difference between Symbols and Strings.
It's not sending a message - that would defeat the purpose. It's constructing a frozen string, reusing an existing one when possible.
Prior to 2.1:
def foo
  "bar".freeze
end
does these things every time `foo` is called:
1. copies the characters 'b', 'a', and 'r' into a mutable string
2. sends the message `freeze` to the new string, which...
3. marks the string as frozen.
In 2.1 `"bar".freeze` is equivalent to `"bar"f`, which will not make a copy every time `foo` is called. See https://bugs.ruby-lang.org/issues/8579 for more discussion.
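You can observe the difference with object identity. This is a quick check rather than a spec: the `copy` method is just a contrast case I made up, and the identity result is what I'd expect on MRI 2.1 or later, where `.freeze` on a string literal is optimized.

```ruby
def foo
  "bar".freeze           # literal + freeze: the optimized case discussed above
end

def copy
  String.new("bar")      # explicitly builds a fresh mutable string on every call
end

puts foo.frozen?         # => true
puts foo.equal?(foo)     # expected true on MRI 2.1+: the same object is reused
puts copy.equal?(copy)   # => false: two separate allocations
```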
My understanding is that frozen strings (unlike Symbols) are garbage collected. This is useful because symbol construction can be used to DOS an application.
(I agree that it's a weird, low level detail to have to think about.)
Wait, so Ruby has the same problem with symbols that Erlang does with atoms? Why don't I constantly see warnings against using String#intern the way I do about list_to_atom/1?
> Wait, so Ruby has the same problem with symbols that Erlang does with atoms?
Yes.
> Why don't I constantly see warnings against using String#intern the way I do about list_to_atom/1?
Because leaking memory over time is pretty much the natural state of being for Ruby apps. Periodic process restarts are culturally A-OK. Leaking symbols are likely to be the least of your perf problems.
But even in Erlang, the issue is only calling an interning operation on untrusted user data. In most real-world use cases, you'll leak until a constant limit, which is probably no big deal. However, I've seen many Rails apps vulnerable to trivial DOS attacks by sending 1MB of random nonsense in a field known to be .to_sym-ed.
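The usual defence is to intern only strings you already know about. A minimal sketch, where ALLOWED_SORT_KEYS and sort_key are hypothetical names rather than anything from Rails:

```ruby
# Hypothetical whitelist of values a request parameter may take.
ALLOWED_SORT_KEYS = %w[name created_at price].freeze

# Only strings from the whitelist are ever converted to symbols, so
# untrusted input cannot grow the symbol table without bound.
def sort_key(param)
  ALLOWED_SORT_KEYS.include?(param) ? param.to_sym : :name
end

p sort_key("price")        # => :price
p sort_key("x" * 1_000)    # => :name (falls back, nothing new is interned)
```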
That slide was a bit confusing for my underslept mind. Is that meant as an example of calling methods which work over the result of defining main (in this case static and void)?
The "private" method already accepts a symbol naming the method to make private, so in this case the return value of "def methodname(args); end" is ":methodname", which then gets passed to "private" and everything magically works.
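A small sketch of that behaviour on Ruby 2.1 or later. Greeter and DEF_RESULT are made-up names for illustration:

```ruby
class Greeter
  # `def` evaluates to the method name as a Symbol; capture it to show that.
  DEF_RESULT = def greet
    "hello, #{name}"
  end

  # Same as writing `def name ... end` followed by `private :name`.
  private def name
    "world"
  end
end

p Greeter::DEF_RESULT                      # => :greet
puts Greeter.new.greet                     # => hello, world
p Greeter.private_method_defined?(:name)   # => true
```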
It's unclear from the intro summary, but I'm assuming from the lack of further explanation that refinements are just GONE, and not just "no longer experimental"?