RTL MJIT – Register transfer language VM and JIT for Ruby (github.com)
85 points by claudiug 6 days ago | 40 comments





According to Aaron Patterson (tenderlove, Ruby and Rails core committer), this may be the Ruby 3x3 branch [1].

[1]: https://twitter.com/tenderlove/status/875467599290613760


That would be quite a bold move from the Ruby core developers :)

what does 3x3 mean?

Ruby 3.0 will be 3 times faster than 2.0

There are a whole host of alternative Ruby language implementations out there, but I only see MRI and sometimes JRuby in the wild. The same goes for Python: largely compatible alternative interpreters like PyPy have existed for years, and often have better performance than the CPython standard interpreter.

Why don't we see more adoption of alternative runtimes? Why doesn't the community push for the adoption of higher performing implementations as the standard? Does anyone run a more esoteric interpreter for one of these languages in production?


Specifically Python is just a very problematic language to re-implement, since CPython has leaked internals literally everywhere. Last time I looked into Pypy (to be fair, that's now a few years ago) that was the main pain point.

Implementing a sane subset of Python is not so difficult, because, well, d'uh, it's sane. Implementing all the leaky details, the insane data model (slightly less insane in Python 3), all the invasive interpreter APIs which are frequently used by frameworks and extensions, is a whole different story. More importantly, if you implement all those, there's not much to gain in performance any more. Pypy has made tremendous progress there, but I'd expect that it still suffers from this — bottom line is, there is no way to efficiently implement what CPython does, and I'd be surprised if Pypy is significantly faster than CPython in code that uses all this shit. I expect Pypy produces a lot of gain as soon as code doesn't use that.


PyPy today is nothing like PyPy years ago. It has improved in many areas, including cext, and Python 3.5 is fully supported now. A complex suite like Odoo can run on it effectively. Their next goal is to run the entire SciPy suite on it without modification.

We are running production systems on PyPy and the performance improvement is significant: on average 40 percent faster than Node in many areas.


Can you elaborate on what internals you've seen leaked?

Instead of typing a long list, I'll just refer you to this talk: https://www.youtube.com/watch?v=qCGofLIzX6g

Some of that has been fixed, or at least had makeup applied to play pretend.


Thank you!

They're moving towards standardizing the ordered dict layout

https://docs.python.org/3/library/inspect.html

PyPy has a glue layer to implement the C API. The C API is designed around reference counting, which PyPy doesn't do. This also shows up in Python code that assumes files will be closed immediately, in lines like 'x = open(fname).read()'

Method resolution order. __slots__. globals(). id
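
To make the file-closing point concrete, here is a minimal sketch. The helper name `read_text` and the temporary file are only for illustration. With 'x = open(fname).read()' the file is closed whenever the runtime gets around to reclaiming the file object, which is prompt under CPython's reference counting but may be delayed on a runtime with a tracing GC. A `with` block does not depend on that:

```python
import os
import tempfile


def read_text(path):
    # The with-block closes the file when the block exits,
    # regardless of how the runtime manages memory.
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Small demonstration using a throwaway file (hypothetical name).
with tempfile.TemporaryDirectory() as tmp:
    fname = os.path.join(tmp, "example.txt")
    with open(fname, "w", encoding="utf-8") as handle:
        handle.write("hello")
    content = read_text(fname)
```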


Ok, fair enough, thanks for providing some examples, but I'm not sure how many are valid though (Don't get me wrong, there are a bunch, the main one being reliance on the GC/ref count semantics).

> They're moving towards standardizing the ordered dict layout

Can you elaborate? Not sure what you mean.

> inspect library

Yes, a fair few functions in that module are CPython specific (or assume a CPython-ish runtime).
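
One example of that, as a hedged sketch: the docs for `inspect.currentframe()` say it relies on stack frame support and returns None on implementations without it, so portable code has to handle both cases. The function names below are made up for illustration:

```python
import inspect


def caller_name():
    # currentframe() may return None on runtimes without
    # Python stack frame support, so check before using it.
    frame = inspect.currentframe()
    if frame is None or frame.f_back is None:
        return None
    try:
        return frame.f_back.f_code.co_name
    finally:
        # Drop the reference to avoid keeping the frame alive in a cycle.
        del frame


def example():
    return caller_name()


result = example()
```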

> C API

Isn't the C API CPython specific? It is not a Python-language feature. PyPy implemented it to drive up adoption. How would the C API be expected to work with Jython, or IronPython, or a JS implementation?

> Method resolution order

This is not CPython leaking, it's a specified thing. Python uses the C3 algorithm.
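
A small sketch of what C3 linearization gives you on a diamond hierarchy (the class names are invented for the example):

```python
class Base:
    def who(self):
        return "Base"


class Left(Base):
    def who(self):
        return "Left"


class Right(Base):
    def who(self):
        return "Right"


class Child(Left, Right):
    pass


# The linearization lists Child, then its bases left to right,
# and only then the shared ancestor.
mro_names = [cls.__name__ for cls in Child.__mro__]
winner = Child().who()
```

Here `mro_names` is ['Child', 'Left', 'Right', 'Base', 'object'], so `who` resolves to the `Left` version.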

> __slots__

Again, this is a language (or object model) feature?

> globals()

Again, specified by the language? Not a CPython specific thing?

> id

id() returns a unique identifier per object. That's all; in CPython it happens to return the memory address. Is that what you mean?
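
As a sketch of what the language actually promises for id(): an integer that is unique among objects alive at the same time. Anything beyond that, such as the value being an address, is an implementation choice:

```python
first = object()
second = object()
alias = first

# Two objects that are alive at the same time must have different ids.
distinct_ids = id(first) != id(second)

# Two names for the same object share one id.
same_id = id(alias) == id(first)

# The value is an int, but code should not assume it is an address.
is_int = isinstance(id(first), int)
```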

---

There are also only 5 'CPython implementation detail' warnings in the language data model docs[1]

1. https://docs.python.org/3.6/reference/datamodel.html


> Isn't the C API CPython specific? It is not a Python-language feature.

But the Python ecosystem depends on extensions written using the C API. If you are trying to write a new Python implementation and they tell you they need to be able to run some C extension you can tell them 'actually that's a CPython feature' all you want, but they'll go away and not use your implementation.

> How would the C API be expected to work with Jython, or IronPython, or a JS implementation?

Well you'd implement the same API using JNI, PlatformInvoke, or the V8 C extension API, or whatever else the platform you are building on has, respectively. But it's hard!


Calling __slots__ a language/object feature is a crutch. It was made a language feature based on how CPython implements objects, with a __dict__ field exposing the backing dictionary. Also there's the 'is' operator, which has unspecified behavior with integers/strings due to the integer cache and string interning. If you used unboxed integers then 'is' would likely work as equals for all integers that weren't bignums. Python has tried to define a standard outside CPython's details, but CPython is essentially the de facto standard, so for serious implementations its details are important to follow. A similar thing happened with clang needing to implement many GCC C extensions.

Ordered dict layout: https://mail.python.org/pipermail/python-dev/2016-September/...
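
A minimal sketch of the two points above (the class names are made up). The __slots__ part behaves the same everywhere; the result of 'is' on equal integers is deliberately left without an expected value because it depends on the implementation:

```python
class Plain:
    def __init__(self):
        self.x = 1


class Slotted:
    # No per-instance __dict__ is created for this class.
    __slots__ = ("x",)

    def __init__(self):
        self.x = 1


def rejects_new_attribute(obj):
    # Returns True if assigning an undeclared attribute fails.
    try:
        obj.y = 2
    except AttributeError:
        return True
    return False


plain_has_dict = hasattr(Plain(), "__dict__")
slotted_has_dict = hasattr(Slotted(), "__dict__")
slotted_rejects = rejects_new_attribute(Slotted())

# Build two equal integers at runtime to avoid constant sharing.
a = int("1000")
b = int("1000")
equal_by_value = a == b  # always True
same_object = a is b     # implementation-dependent, do not rely on it
```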


> There are a whole host of alternative Ruby language implementations out there, but I only see MRI and sometimes JRuby in the wild.

Well I think those are the only ones actively maintained, aren't they? As a language community we're pretty lucky to have even two production-ready implementations.

Rubinius seems to have lost all its contributors, deleted its JIT, and ground to a halt. Topaz was just an experiment. I don't think Maglev or IronRuby have been maintained for a few years. Opal is still going but it's not quite what you mean when you talk about Ruby implementations.

> Why don't we see more adoption of alternative runtimes?

Because it's a huge volume of work that needs very specialised skills that few people have, so they often aren't maintained very long. Starting one is fun, finishing and maintaining is more of a slog.


JRuby spent years getting to the point where it could run Rails. If you can't do that in Ruby, you are not capturing the biggest use case of the language. I'm not sure any of the alternatives got there.

There's a good list of the problems in making a proper Ruby implementation here: http://blog.headius.com/2012/10/so-you-want-to-optimize-ruby...

And a list of what remains different in JRuby here (mostly C extensions and threads): https://github.com/jruby/jruby/wiki/DifferencesBetweenMriAnd...


And on the flip side once you get to that point the critical mass is with you. I know plenty of enterprises that run JRuby because technically it's "just Java". Being able to run Rails apps on the sly is huge.

Nobody is going to use a new Ruby or Python unless it brings something new to the table. PyPy has it with speed. JRuby has Java compatibility. Without an X factor it's hard to get people excited to try it, let alone contribute.


Poor compatibility with C extensions is often an issue. That hurts if you're building something large that interfaces with native code that's expensive/practically impossible to replace (stores, custom network transports, scientific libraries...).

There is no official spec, IIRC, for Python or Ruby.

The guy who was writing a spec for Ruby in his spare time got frustrated, believed the official Ruby team was trying to kill the competition, and left, IIRC. Rubinius or whatever was built to that spec, and Engine Yard was funding them, IIRC.

Drama. There was drama between JRuby and Ruby for a while. And then it was OG Ruby vs. JRuby devs.

If you have a language spec then it's easier to implement, kinda like Java and its TCK. But at the same time Oracle and Sun were really, really into the fact that only they could control that language.

Otherwise the edge cases will beat you nasty.

R has an alternative because Revolution R made money, and now Microsoft has bought them, so there's a huge company backing it. I don't think the people behind R really care if there's a competitor. They just do their thang.


> Why don't we see more adoption of alternative runtimes?

What exactly would you like to see? For python:

- PyPy is seeing a lot of use,

- Cython is seeing a lot of use, mostly as a way to implement C glue, but also on its own

- MicroPython is seeing a lot of use on limited environments

- Nuitka is a "secret sauce" for people who want to compile their Python to an .exe without leaving any easily-decompilable .py files. (And they get a speedup as a bonus)

Even TinyPy had gotten some use in its day (before it was abandoned - nowadays, MicroPython fills the same niche).

Except for older languages (C++, C, Pascal, Basic, COBOL, Fortran, LISP, APL), it is actually rare to have more than two mature implementations of a language (see e.g. D and Go, which each have their own compiler and a GCC backend; C# has its own and Mono).


Because nobody pays for this.

This is the ultimate truth.


You've completely missed his point. He is asking why they are not widely adopted, not why they don't exist (which they do).

No, parent does make a good point.

The only way for an alternative implementation to compete is for someone to pay for its creation, documentation, maintenance, etc.

That's how we got all these great JS JITs and stuff: with Apple, Google, and co paying the bills, and pushing for them to succeed.

PyPy managed to get close to something working, but it lacks lots of the polish and resources available for CPython.


They exist, but most cannot serve as drop-in replacements for various reasons. Appropriate funding (which most lack) would enable such projects to tackle the many, often hard-to-solve details needed to get there. Moreover, these things come and go: you don't want to jump ship and then realize that there are effectively no contributors left, because the few there were turned out to be hobbyists or researchers who no longer have time or moved on to something else.

Because Ruby and Python are relatively good languages. A new implementation would need to be an order of magnitude better to gain traction. That also makes large improvements difficult.

I am rooting for Crystal, but without massive backing like Go and Rust enjoy, it is difficult.


You're talking about new languages, not new implementations of a (specified) language.

Nim is an excellent contender for a Python-esque compiled language, but it also lacks corporate and community backing.

> lacks [...] community backing

People keep complaining about the small community. Please join and help.

> lacks corporate [...] backing

Is that a bug or a feature? ...


> People keep complaining about the small community. Please join and help.

I will when I find the chance to start working with it! I'm currently using Scala.

> Is that a bug or a feature? ...

It seems like a bug to me. Yes, corporate backing has its downsides, but it's a great way to ensure that a language will have long-term support. Look at React and TypeScript, for example. If Facebook and Microsoft didn't push them down our throats at the start, do you think their use would end up being so widespread?


> great way to ensure that a language will have long-term support

If anything, community-driven projects can stay on track longer (Linux, Python, non-commercial distributions), while corporate-driven ones can change direction or be dropped (Java, Visual Basic, MySQL, Solaris, a lot of Google projects).

It's much more difficult to steer a whole community in the wrong direction or convince most contributors to drop a project.


I remember a talk by Chris Seaton where he said something like: optimizing the static part of Ruby isn't hard. It is the C extensions, the core library, and especially the metaprogramming nature of Ruby that make it nearly impossible for a compiler to optimize, especially for the Rails use case.

Just fix the VM then. tinyrb based on potion based on lua with classes is ~200x faster in method calls, and has a full dynamic MOP. Common Lisp ditto. Traditional optimizations don't work when the base VM is poorly designed.

Always take advice from people behind failed products with a grain of salt.


> Just fix the VM then

No, the point isn't that the VM is broken. The point is that the Ruby language has semantics and is used in idioms that nobody knows how to optimise well yet.

> tinyrb based on potion based on lua with classes is ~200x faster in method calls

tinyrb is interesting, but it just isn't the same semantics as Ruby:

https://github.com/macournoyer/tinyrb#what-wont-be-in-tinyrb...

tinyrb doesn't have a better VM - they're implementing a different programming language with all the hard bits left out!

In my project we've had to do new, original research into how to optimise Ruby, introducing concepts such as dispatch chains, which haven't been needed before because people don't push metaprogramming in most languages the way they do in Ruby.

http://stefan-marr.de/downloads/pldi15-marr-et-al-zero-overh...
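
For readers who haven't met the idea, here is a toy sketch of a per-call-site cache for a reflective send, loosely in the spirit of dispatch chains. It is not the implementation from the paper: the `SendSite` class, its fields, and the entry limit are all assumptions made up for illustration, and a real VM would also invalidate entries when methods are redefined and would compile the chain rather than walk a list.

```python
class SendSite:
    """One reflective call site, e.g. receiver.send(name, *args)."""

    def __init__(self, max_entries=4):
        self.entries = []  # chain of (name, class, function) tuples
        self.max_entries = max_entries
        self.hits = 0
        self.misses = 0

    def send(self, receiver, name, *args):
        cls = type(receiver)
        # Fast path: walk the chain looking for a matching name and class.
        for cached_name, cached_cls, function in self.entries:
            if cached_name == name and cached_cls is cls:
                self.hits += 1
                return function(receiver, *args)
        # Slow path: full lookup, then extend the chain if there is room.
        self.misses += 1
        function = getattr(cls, name)
        if len(self.entries) < self.max_entries:
            self.entries.append((name, cls, function))
        return function(receiver, *args)


class Greeter:
    def hello(self, who):
        return "hello " + who


site = SendSite()
greeter = Greeter()
first = site.send(greeter, "hello", "a")   # miss, fills the chain
second = site.send(greeter, "hello", "b")  # hit, served from the chain
```

The point of keeping the cache at the call site is that each site usually sees only a handful of method names and receiver classes, so the chain stays short.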


Your interpretation of the tinyrb omissions is wrong. I'm the current maintainer of potion, which is the underlying VM for tinyrb. What is missing is just sugar and PEG parsing grammars for some trivial stuff. And then of course the huge stdlib.

The simple potion/IO MOP and ABI layout is far superior to matz ruby; only the method cache and thread support are missing. And this compiler has no optimizations at all yet, not even trivial constant folding. Still 200x faster. This Lua-with-MOP VM can be used for every dynamic language, like Ruby, Perl, Python, PHP, ... and will beat RPython or Truffle/Graal by lengths. Ditto the other VM based on this, tvmjit, which even uses LuaJIT with s-expressions.


I am disappointed to see Topaz missing from the lineup, but I suppose that it never had strong adoption in the Ruby community.

Please don't use RTL, since it's commonly used in the context of hardware/chip design languages such as Verilog and VHDL.

That's register transfer level. This is register transfer language. The names are similar, but the technology is unrelated.

https://en.wikipedia.org/wiki/Register-transfer_level

> Register-transfer level ... Not to be confused with Register transfer language.


And it is already an established term: https://en.wikipedia.org/wiki/RTL

HN shows there are 37 comments in this thread, and I am only seeing 11.

What is the cause of this?


Refresh the page? If it's still not the right number, that sounds like a bug and you should contact the mods with details. For me everything looks right. Their e-mail address is in the footer.


