There's a third way here - run the C code using the same JIT as the Python code, instead of compiling it natively.
That might sound mind-bending, but C is just a language like any other. It's actually fairly simple and consistent to implement compared to something like Python. You can interpret and JIT compile it if you want to - there's no major magic to that.
Then you can optimise the C code and the Python code at the same time, inline the two, do the same optimisations as you do on Python code, etc. We're using this technique to run Ruby C extensions in JRuby and the results so far are great - running real C extensions faster than native code because we can optimise both the Ruby and the C at the same time.
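To make the idea concrete, here's a toy sketch (nothing like the real Graal/Truffle machinery; the tiny IR is invented purely for illustration): two "languages" are lowered to one shared IR, so the optimiser can inline and constant-fold straight across the language boundary.

```python
# Shared IR: nested tuples ('const', n), ('add', a, b), ('call', name)

def lower_py(n):        # pretend this node came from Python source
    return ('add', ('const', n), ('call', 'c_helper'))

def lower_c():          # pretend this node came from a C extension
    return ('const', 40)

FUNCS = {'c_helper': lower_c}

def optimise(node):
    op = node[0]
    if op == 'call':                      # inline across the language boundary
        return optimise(FUNCS[node[1]]())
    if op == 'add':
        a, b = optimise(node[1]), optimise(node[2])
        if a[0] == b[0] == 'const':       # constant-fold the inlined result
            return ('const', a[1] + b[1])
        return ('add', a, b)
    return node

print(optimise(lower_py(2)))  # ('const', 42)
```

Because both front ends target the same IR, the "C" call is just another node to inline; a native-compiled extension would have been an opaque barrier instead.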
Even if you could interpret that code, your JIT is not going to generate exactly the same assembly as a native C compiler, and will thus perform worse. You can very easily write a C implementation that is simple to interpret, but compared with the output of an optimising C compiler, that performance is nowhere near where you want to be.
(I realize this is not a problem though, just have the JIT call the libraries, as long as the wrapper can be interpreted you are good.)
There is an interpreter in the LLVM codebase, but this is essentially only possible because it is part of the codebase and so tracks the rest of the project very closely.
Note that chris is not talking in the abstract, but about an existing codebase: https://github.com/graalvm/sulong
How do you handle the other reason people write C code-- to bridge Python code to an external native library? In that case, you'd be JIT-compiling the bridge itself, but your native library remains "opaque [insert language here] stuff" that the JIT can't do anything with.
So yes, that'll be an optimisation barrier, but it's the same barrier as you would already have if you were using an interpreter for Python making native calls to the same library. So it won't be any slower than our current best case.
(In fact the JIT can do some clever things to make the native call more efficient than usual, as it can do things like schedule values to be in the correct registers from the start rather than copying them into place just for the call, but that's probably not a significant source of inefficiency anyway.)
And for the Ruby C extensions we tried not only did we not have to port them from C to Ruby, they actually ran faster than either the compiled C version or the pure Ruby version because we can inline and optimise between the two languages.
Same reason applies here.
Userify's architecture is probably a bit unusual. It's distributed as a single binary that uses an external redis server for locking and stores data in either S3 or a local filesystem (NFS, EBS, iSCSI, etc.) When it first launches, it automatically installs any missing python prereqs (if pip is available) and explodes out a web server static file repo in case you want to front-end it with nginx (etc) and then starts up multiple web servers (HTTPS, HTTP, alternative ports, etc). We've found Python to work extremely well for all of this. We've looked at a few other languages that we think fit our core requirements (incl lua, scala, and haskell) but Python works pretty much perfectly for this use case, we can scale it horizontally using Redis as a synchronization mechanism, and the GIL hasn't caused any problems for us because we just add more processes as needed. Cython is really fantastic and 'just works'.
Things aren't perfect.. we're still on Python 2.x, but we're going to be moving forward as soon as all of the third-party libraries we rely on move to 3.
I see a fourth option:
- compile the scripting language to C (like Crystal for Ruby)
Not to mention, bootstrapping is about the compiler executable and the problem I was talking about applies to any attempt to use an ahead-of-time compiler for a dynamic language.
The approach chrisseaton describes is generally only feasible if you're implementing a Futamura projection à la PyPy or Truffle.
That said, it's only a matter of time before a high-performance Python pops up on Truffle and JITs extensions like RubyTruffle.
Has anyone in the Python community thought of doing something similar, e.g. using PyPy to build a C interpreter, maybe re-using front-end components (pre-processor, parser, type-checker, etc.) from an existing C compiler? In fact, it might be useful to build an interpreter for something like LLVM IR, or even x86 machine code, in order to gain access to a bunch of existing languages.
Once access is gained at that low (ABI?) level, abstractions and interfaces can be built to hide the horribleness. The performance would initially be terrible, but some targeted, profile-guided optimisation might get it down to reasonable levels, in a similar way to a JS engine adding specific optimisations to make asm.js code run fast.
To get some perspective on this, consider the alternatives for a scientific programmer: MATLAB, R, FORTRAN, Mathematica, or if you're hip, Julia -- all specifically made for scientific programming with 0% general purpose/web/etc. development going on.
So I would say that the scientific Python community has been doing extremely well in terms of even using a language that isn't designed ground up for scientific computing.
I could write a lot about why that is (and how some of the CS and IT crowd doesn't "get" scientific computing..) -- I'll refrain, I just wanted to say that where you see something and get frustrated, I see the same picture and think it's actually an incredible success, to bring so many scientists into at least the same ballpark as other programmers, even if they are still playing their own game.
I've been doing scientific software development now for 20 years. I do non-numerical scientific computing, originally structural biology, then bioinformatics, and now chemical informatics, the last dominated by graph theory.
I rarely use NumPy and effectively never the languages you mentioned. Last year on one project I did use a hypergeometric survival function from SciPy, then re-implemented it in Python so I wouldn't have the large dependency for what was a few tens of lines of code.
Biopython, as another example, has almost no dependencies on NumPy, and works under PyPy.
I don't see what's awesome or all that interesting about it.
You are so correct. I have not used Julia, but what do the first four have in common? As languages, they suck. Each in their own way can be used to do amazing, incredible things. But from a development perspective, they are pure torture to code in.
The joy of writing scientific code in Python is that there is a whole set of users, the majority really, who have nothing to do with science. The language must stand on its own, so Python cannot suck.
Please do! I'm a PhD student in CS, and I don't think I "get" scientific computing (I'm in compilers myself).
Most people doing it did not have a formal CS education. They are biology, physics, mathematics or chemistry majors that have had one or two courses on programming, from other scientific programmers.
There are two main families, one that comes from the Fortran background, which still writes programs like they did in the 80s, with almost no new tooling. Programs are written for some time, and then they are scheduled for clusters that spend months calculating whatever it is.
The other family of scientific programmers, which I believe is the majority, uses a tool like Matlab, or more recently R, to dynamically inspect and modify data (RStudio is a Matlab/Mathematica-like friendly environment for this task) and use libraries written by more proficient programmers to perform some kind of analysis (either machine learning, DNA segmentation, plotting or just basic statistics).
Most of these programmers know 1 or 2 languages (maybe plus python and bash for basic scripting). They write programs that are relatively small and the chances of someone else using that code is low. Thus, the deadline pressure is high and code maintainability is not a priority.
For a non-CS programmer, learning a new programming language is almost impossible, because they are used to that way of doing things, and those libraries. They take much more time to adjust to new languages because they do not see the language logically, the way anyone who has taken a basic compilers course does.
Given this context, web apps, rest APIs and all the other trending tech in IT are not commonly used in scientific programming, because they typically do not need it (when they do, they learn it). Datasets are retrieved and stored in CSV and processed in one of those environments (or even in julia or python-pandas).
If you're doing web development you have an insane number of languages to choose from, because after String, Array, and File are implemented, HTTP is next. Having done a bit of web development, I'd also say a typical project only uses a subset of libraries that is surprisingly small.
Scientific computing is quite different: a paper in structural biology (my former stomping grounds) can easily require a few dozen algorithms that each once filled a 10-page paper. These could easily be packaged as libraries, but it's a niche so it rarely happens. Newer languages quite often don't even have a robust numerics library. Leave the beaten track and your workload just increased by an order of magnitude.
That's also why science, unlike "general purpose" programming, often uses a workflow that connects five or more languages or so: a java GUI, python for network/string/fileIO, maybe R for larger computations, all held together by a (typically too long) shell script.
But these workflows are getting better. There's a build tool that formalizes the pipeline somewhat (I forgot the name) and APIs are surprisingly common. The reason why csv will never die is that the data fetched from APIs is usually more static than it is in a typical web app (-> local cache needed) and that scientists often work with data that just isn't a good fit for a database. Postgres just doesn't offer anything that enriches a 15MB gene sequence.
The way he painted the scientists' skills matches my experience thus far.
But I don't think that is only fixed by more education and making scientists behave more like programmers. I think that to change things one also needs far better alternatives than the options available today, so that people are really encouraged to switch. Somehow, these must be written by people who know their CS and can write compilers, yet who engage with why scientific computing is a mess on the tool side too, instead of dismissing it as laziness.
I started out as a programmer, I have contributed to Cython, past two years have been pure web development in a startup. So I know very well why MATLAB sucks. Yet, the best tool I have found myself for doing numerical computing is a cobbled mess of Fortran, pure C, C/assembly code generated by Python/Jinja templates, Python/NumPy/Theano...
The scientific Python and Julia communities have been making great progress, but oh how far there is left to go.
Because the majority of programmers in areas where software isn't the core product being sold don't spend one second thinking about code quality.
As such, tooling that on one side is more forgiving and allows for fast prototyping, but at the same time enforces some kind of guidelines, is probably the way to improve the current workflows.
You can do that natively in C, and the result is very fast. You can do that natively in Fortran. And in Matlab, etc.
You cannot do that at all in Python. Well, you can, but it will be orders of magnitude slower.
NumPy + Cython would beg to differ.
Personally I gave up writing numerical code in Cython and instead wrote it in Fortran and merely wrapped it with Cython...
But yes, being 2x-10x slower is something one can live with for productivity, vs the sometimes 1000x slowdown of pure Python.
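For a rough feel of that gap, here's a sketch timing a pure-Python loop against the equivalent NumPy expression (the exact ratio depends heavily on machine and workload, so no particular multiplier is promised):

```python
import time
import numpy as np

n = 1_000_000
xs = list(range(n))
arr = np.arange(n, dtype=np.int64)

t0 = time.perf_counter()
total_py = sum(x * x for x in xs)   # interpreted: one bytecode loop iteration per element
t1 = time.perf_counter()
total_np = int((arr * arr).sum())   # one vectorised C loop under the hood
t2 = time.perf_counter()

assert total_py == total_np
print(f"pure Python: {t1 - t0:.4f}s  NumPy: {t2 - t1:.4f}s")
```

Cython with type declarations closes the gap further by compiling the loop itself to C, which is where the 2x-10x figures above tend to come from.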
I want to clarify that when I say "0% web development going on", I don't mean that scientific programmers don't do web development (they do! a lot!), I meant that people don't pick those languages in general if they are only doing web development without a numerical/scientific/statistical aspect to it.
What would be interesting is, do you know of any teams or companies using Julia in anger in a non-scientific setting, with programmers from a non-scientific background?
The scientific Python community certainly makes good use of all the Python web tools! And being a "scientific stack" in no way precludes the need for general purpose frameworks that can also be used by others. It's at least as much about people and community and habits as about the tooling...
Julia is designed for general purpose computing from day one, but this community is not dealing with the same painpoints that scientists have been.
It's catching on among early-adopters in finance. What do you consider that?
Having a few libraries of each kind (of varying quality and with very small adoption) != tons.
Given all this, what's the best way for someone to pick up and start contributing to Cython?
If you don't already, subscribe to and start following cython-devel. The first step is probably to repeat the question there for more up-to-date info than what I can give (I don't even follow it any longer).
Think about what you want to achieve/change in Cython. A new feature may be easier than a bugfix, though I'm not sure how many "low-hanging features" are left at this point... Anyway, make sure you understand what the change would involve in the generated C code.

Write a testcase that uses Cython the way you would want it to work (elicit the failure/bug/feature); look at the C code that Cython generates, and make sure you understand why that C code is the wrong code and that you know how you'd want it to look. (Understanding the generated C code at this level and reading it almost fluently may take a bit of getting used to, but it is an absolute requirement for working with Cython -- eventually you look past all the __pyx_ everywhere.)
Then somehow try to beat the Cython codebase into submission so that it generates that C code... as you repeat this process you'll gradually learn the codebase.
It seems like Robert Bradshaw and Stefan Behnel are still around; they are very capable and friendly people, and I learned a lot of what I know about programming from them. They were very welcoming to me as a new contributor.
Anyway, I think PyPy doesn't get the attention it deserves - the performance gains are fantastic and the vision of the Python ecosystem as Python-only packages with occasional lightweight C libraries integrated with cffi looks very nice to me.
The exact phrasing was:
> PyPy is ten years old at this point, but to a first approximation, no one is using it.
Using your 0.5% - 1.0% PyPI metric as an approximation for how many people are actually using pypy, I think it's reasonable to say that. I personally would definitely have phrased it differently, but if >=99% of the market isn't using PyPy, then it is far, far, far removed from the "mainstream" market. In fairness, the Python community is, on the whole, large enough to give you a reasonably sustainable niche, but it pales in comparison to the market itself.
It's a bit like comparing Facebook to CouchSurfing, to be honest. You know about it, maybe even have some friends who have done it, and hey, it's still 3 million people, but in comparison to Facebook... no one uses it. And that's not a value judgment or anything, it's just a scale of comparison kind of thing.
That being said, I share your frustration about the deep schism between web vs scientific in the Python community, and I think it's really a shame for the community as a whole. Unfortunately I'm not sure that's going to change very soon: the web side of things has a tremendous amount of momentum (and money) invested in "their" architecture, and at the same time, the scientific side of things has far, far less patience for pain points in their language tooling.
Put differently, if your background and job is programming, you're more likely to view "dealing with this programming problem" as actual work, but if your background and job is "I have this data, and I need to analyze it", then "fiddling with this programming problem" is, at best, a frustrating distraction from your actual task. And I think both arenas need to have a better appreciation for the other: as programmers, our tools generally really do suck for everyone, we're just used to it; as data scientists, we are woefully under-aware of how difficult these programming problems can be. Some more unity would be tremendously beneficial for all.
And then on top of it all, there's this whole group of weirdos using Python for stuff like desktop applications (I happen to be in this camp). Good luck finding a packaging and deployment solution there!
The amount of C code used with Python declined somewhat with Python 3, because the C interface changed and things had to be reimplemented. There's pymysql, for example, which is a database connector in pure Python. No more need for the C version.
I have the opposite view. I'd rather have all of the important libraries written in a language with a safer type system than Python's. C's type system has some safety issues, but it's still safer than Python's.
If Python 3's optional type hints could actually be enforced at runtime, I'd be sold on using Python for pretty much everything.
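Runtime enforcement can be approximated today with a decorator; here is a minimal sketch (the `enforce` name is made up, and it only handles plain classes, not generics like `list[int]` or `Optional`):

```python
import inspect
from functools import wraps

def enforce(func):
    """Check annotated argument and return types at call time.

    Minimal sketch: only plain classes are checked; typing generics
    are silently skipped by the isinstance(expected, type) guard.
    """
    sig = inspect.signature(func)
    hints = func.__annotations__

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        result = func(*args, **kwargs)
        expected = hints.get('return')
        if isinstance(expected, type) and not isinstance(result, expected):
            raise TypeError(f"return value must be {expected.__name__}")
        return result
    return wrapper

@enforce
def add(a: int, b: int) -> int:
    return a + b

print(add(1, 2))   # 3
# add(1, "x")      # would raise TypeError: b must be int
```

The per-call overhead is real, which is presumably part of why CPython leaves hints unenforced.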
Uhm. An error in Python code can lead to a runtime crash with an exception or a bad result... most of the time. An error in C can lead to anything from an impossibly hard to diagnose memory leak to an exploitable buffer overflow... most of the time.
C may be "statically typed", but I think the number of people capable of writing (and maintaining!) secure C code is incredibly low. Heck, even the openssl devs failed at this at least once.
And the fact that C is "static" and Python is "dynamic" does not make C's type system safe. Heck, even JS's type "system" is safer than C's when you use the word safe to refer to, you know, security.
So, if you write a Python library in C or in Python, you'll have undefined behavior either way. Undefined behavior is not a good criterion for choosing which language to use.
A language can say that certain operations are undefined, like dereferencing NULL in C. What of the Python language is undefined? (Some things, like garbage collection, are implementation defined. That is different.)
Even if a language contains undefined behavior, a program written in that language can avoid the undefined operations. What of the Python implementations use or depend on undefined operations in the lower-level language?
An implementation can also be flawed. But that's not "undefined behavior" but non-conformant behavior. "A bug", in the vernacular.
According to your definition, is there any language that doesn't have undefined behavior? After all, even the hardware can have flaws and undefined behavior, so to mix metaphors, it's all a house of cards built upon sand.
That's correct, and it's not pedantic to point that out. It proves that undefined behavior is not a useful criterion for choosing which language to use.
But even if we accept your more restricted definition of what constitutes undefined behavior, we still must accept that any language that is implemented in C could possibly exhibit any of C's undefined behavior. Therefore, writing a Python library in Python instead of C cannot help you avoid C's undefined behavior. It might prevent you from introducing more undefined behavior, but that's not the same as avoiding it entirely -- if the implementation is written in C, then that ship has sailed.
No, it means that you've warped the definition of "undefined" to the point where it's no longer useful. It's entirely possible to have fully-defined languages where every possible string of symbols is either a valid program with single well-defined behaviour or not a valid program.
And no, writing the implementation in C does not introduce undefined behaviour, although it may require a number of compile and run time checks to ensure that you're never invoking it.
This is a valid C program that contains an undefined behavior, from http://blog.regehr.org/archives/213:
#include <stdio.h>
#include <limits.h>

int main (void)
{
  printf ("%d\n", (INT_MAX+1) < 0);
  return 0;
}
It is a perfectly valid program. The outcome is undefined (integer overflow is considered undefined behavior in C by every authoritative source I've ever seen). GCC does not emit any warnings when this program is compiled, even with all warnings turned on. There certainly aren't any runtime checks for it. Any program that adds numbers which are passed into it can run into undefined operations.
Here is another example:
i = i++ + 1;
The result of that operation is undefined.
These examples demonstrate that undefined behavior is not something you "invoke" in a special way -- it's often the result of a mistake. It is not restricted to certain operations -- these examples use simple addition! The compiler and runtime checks can't help you avoid it in many cases -- you sometimes won't even get a compiler warning.
Here is an example of a bug in the C implementation of Python that caused undefined behavior because of integer handling: https://bugs.python.org/issue23999
It's basically impossible to avoid. Even if your code is perfect, the compiler might optimize it into something that might contain undefined behavior. I suppose that if you never did any math, or anything with strings, or any pointer dereferences of any kind, or any casting, or any recursion, and you made sure the compiler didn't try to optimize anything, you could end up with a C program that could not have undefined behavior. But it's clearly impossible to implement Python without doing all of those things.
> Here is an example of a bug in the C implementation of Python that caused undefined behavior because of integer handling: https://bugs.python.org/issue23999
The comments in that bug report suggest it was a false positive in Coverity.
In any case, the Python language defines how left and right shift are supposed to work. https://docs.python.org/3/reference/expressions.html?highlig... . What you pointed to, if it were a true positive, would be an example of where the implementation didn't comply with the specification.
It wouldn't be an example of undefined behavior as given at https://en.wikipedia.org/wiki/Undefined_behavior : "undefined behavior (UB) is the result of executing computer code that does not have a prescribed behavior by the language specification" because Python prescribes that behavior.
But the C spec doesn't prescribe any behavior in this case, and the code that is running in this case is C, not Python! It was written in C, it was compiled by a C compiler -- it's C! Whether or not the behavior is undefined for anything a C program does is determined by the C spec only. The Python spec is not relevant. Any C program that overflows an int has done something undefined, whether or not that C program happens to be a Python implementation.
The Python specification defines what the implementation is supposed to do. It happens that the implementation doesn't comply with the specification. That doesn't mean the Python specification is undefined, it means the implementation is in error.
Remember too that the actual implementation is a binary. The C compiler converted the C code into that binary, but in theory it could have been generated manually, as a byte-for-byte equivalent, with C completely out of the picture.
Would you still say it's "undefined behavior" if there were no C compiler? How do you tell the difference in the binaries?
You are using a non-standard definition of "undefined behavior". In common use, "undefined behavior" is only meaningful relative to a language specification, not an implementation.
Why do you think you are using the common definition when you include implementation bugs as part of UB?
Case 1: A C program overflows an int (which is listed as an undefined behavior in the C spec). Under your definition, and pretty much any other, that program has executed code that results in undefined behavior.
Case 2: A C program overflows an int. That C program happens to be the Python interpreter. Under your definition, based on your responses to previous comments, that did not result in undefined behavior, but was an implementation bug instead.
There is no difference between case 1 and case 2. They are both C programs, so the spec for C determines what is undefined behavior for them. They both did the same thing. Either both are executing code that results in undefined behavior, or neither is.
Case 2 is also an example of an implementation bug if it affects the result of the Python code running in the interpreter.
Programs written in Python follow the Python language specification.
When run in CPython, the implementation follows the C language specification.
If the implementation uses C UB, which causes it to be out of compliance with the Python specification, then it is both undefined behavior for C and a failure to follow defined behavior for Python.
It is not undefined behavior for Python.
I have said multiple times that I don't like how your non-standard definition mixed the two together. It is no longer interesting to come up with new ways to restate my statement.
That's true, but it's not what you said. You said "the C specification is irrelevant" when I brought up bugs in the C code in CPython, and you said they were implementation bugs but not UB, because the Python spec doesn't define them as UB -- even though the code we were talking about was C! In other words, case 2 in my previous comment.
Now you say "When run in CPython, the implementation follows the C language specification," and "If the implementation uses C UB, which causes it to be out of compliance with the Python specification, then it is both undefined behavior for C and a failure to follow defined behavior for Python."
Those arguments contradict each other. Which one do you believe?
This is false. If your code is perfect, the compiler shouldn't do such optimization. Of course compilers have bugs, but those are compiler bugs, not a problem with the language standard. (Not to speak of bug-free compilers like CompCert.)
Then it's not false!
Pretty much every language has some undefined behavior. That undefined behavior can be invoked intentionally, or because of bugs.
C has a lot of undefined behaviors. Even simple addition can result in an undefined behavior because int overflow is undefined. Any C program that takes numbers as input (via the shell, or FFI, or whatever) and adds them together can exhibit undefined behavior. Null pointer dereferences are also undefined, and pretty much any C program of reasonable complexity can encounter those, in the form of bugs.
A bug in hardware can also cause these undefined behaviors to be invoked -- for example, a friend of mine who does embedded programming had some null pointer dereferences or something like that because the hardware did not set a value when he told it to set the value. Null pointer dereferences are undefined, and if they happen because of hardware bugs, that is a case of hardware causing undefined behaviors to happen. It would not be reasonable to say they were not undefined behavior simply because they were caused by hardware instead of programmer error. The fact that it happened at all is what matters, not what caused it.
Even if a language does not have any undefined behaviors whatsoever, that's not the end of it. If a program in that language interacts with other programs that do have undefined behavior, such as C programs, it could potentially trigger or be affected by undefined behaviors in those other programs. If that happens, it's not reasonable to say that it was not undefined behavior simply because it happened in a piece of code in a different language. (So, if the result of running a Python program is affected by undefined behavior in the C implementation of Python, that program has undefined behavior -- even if the Python code does not. The undefined behavior is C's, not Python's -- but it happens because of running the Python code, which caused the C code to run.)
In other words, any invocation of code that causes undefined behavior to happen -- whether it is intentional, or caused by a software bug, or even caused by a hardware bug -- counts.
Those are compliance errors. That is, if the spec allows it (as "undefined behavior") then it's not a bug. If the spec doesn't allow it, then there is a defined behavior and it is a bug.
For example, a hardware bug is one that does not comply with the specification. Either the spec must be changed (perhaps allowing the existing behavior), or the bug fixed.
I suspect that is where we differ. It seems to me that if a C program does something that results in undefined behavior according to the C spec, it is undefined behavior period, regardless of any other factors, because the spec says it's undefined and that's all that matters.
I'm saying that when the Python language specification prescribes a behavior, and the Python implementation in C has a different behavior because it does something which the C language specification considers undefined, then the Python implementation in C does not comply with the Python language specification.
I'm saying that it's incorrect to say that non-compliant behavior with respect to the Python specification is the same thing as undefined behavior. It makes no sense to say something is undefined when the specification defines what is supposed to happen.
The implementation being buggy does not mean that the behaviour is undefined in the spec.
It's entirely possible that executing a Python program could cause C code in the Python interpreter, or C code called via FFI, to do something that the C spec defines as undefined behavior. That could affect the result of the Python program. If that happened, it would not be reasonable to say that the Python program did not have undefined behavior -- even if nothing in the Python code was undefined according to the Python spec.
Personally, I started using Anaconda for everything and just drop down into pip when I find a package that isn't in there already, and it's been working very well.
wait, what? I always use pip and/or the ubuntu repositories to install scipy. Is there another package manager for it?
Even gcc (egcs) was developed by Cygnus during an important stage.
I prefer programming languages and utilities that are free and open source. And that is some really bad stock photography.
ND4J: N-dimensional arrays for the JVM
Libnd4j: The C++ engine powering the above
JavaCPP: The bridge between Java and C++ (Cython for Java)
Fwiw, all that works on Spark with multi-GPUs.
* CPython, Jython, Lua, MicroPython, Ruby (1.8): >100 seconds
* Cython (naive), IronPython, Ruby (2.3): 50 - 80 seconds
* LuaJIT, PyPy, Cython (with type hints): ~4 seconds
Recently, however, I came across https://www.reddit.com/r/Julia/comments/4c09m1/, which suggests running with `--precompiled=yes` to reduce startup time. I have now run the Julia version with that command-line option, and the runtime is impressive - comparable to C#/Go/V8.
It almost seems to me that just as we're about to get C speed in Python via JIT tech, it'll almost immediately be left behind because the GPU is where it's at.
However, array programming is basically functional programming, and does map very well to parallel execution in general. While most array programming is nowadays of the fairly simple form you see in libraries like Numpy, older languages like APL show how far you can go. It's a somewhat alien style of programming to most modern programmers, though.
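A small NumPy sketch of that array style -- whole-array primitives composed without explicit loops, in the spirit of APL's outer product and reductions:

```python
import numpy as np

# Outer product: a 10x10 multiplication table in one expression
# (roughly APL's 1..10 jot-dot-times 1..10)
table = np.arange(1, 11)[:, None] * np.arange(1, 11)[None, :]

# Whole-array reduction and a boolean mask, again with no Python-level loop
row_sums = table.sum(axis=1)
evens = table[table % 2 == 0]

print(table[2, 3])    # 12  (3 * 4, zero-based indexing)
print(row_sums[0])    # 55  (1 + 2 + ... + 10)
```

Every operation here is data-parallel over the whole array, which is why this style transfers so naturally to GPU execution.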
My rule of thumb: parallel programming is hard. Functional programming makes it trivial to make parallel programs that are correct, but they still might not be fast.
One problem is that you have to codegen to meager intermediate languages that are fragmented & buggy (OpenCL or GLSL), or go proprietary with just one set of bugs and features (CUDA), but with a much more limited user base.
Also GPU software stacks are very crashy; they have roots in a culture of testing all shaders for driver bugs before shipping them in apps ("your game/app is buggy and poorly QA'd if the user's PC crashes"), and have only recently been slightly exposed to dynamic workloads by WebGL (which is only an OpenGL ES 2 subset, whose implementations invest major engineering effort in working around driver bugs, and which still had/has big crash problems).
It's also possible that everyone ends up implementing CUDA (AMD, Google, OTOY, PGI have impls out or underway) and it'll get supported as a codegen target.
None of these solve the problem that the underlying drivers are still terrible (when I test Futhark code I routinely have to restart our GPU servers, as the NVIDIA kernel module crashes). Hopefully, the smaller surface area of OpenCL and Vulkan compared to OpenGL will eventually result in better drivers. I am also very hopeful about AMD's switch to an open source driver model.
Getting a 30x speedup with compiled code using optimized data structures (e.g. pointer compression for better data cache usage, etc.) may not reach the performance of the best JIT implementations, but it could be good enough.
Also PyPy not supporting the latest Python 3.5 is kind of a big deal for adoption. I would rather have 3.5 support than NumPy/CPython compat, but I don't do scientific computing :)
While one can argue Ruby gets the best and most leading-edge JIT from JRuby / Graal / Truffle, I don't think it is used much in production. It seems the majority of Python and Ruby users have stuck to the default CPython and CRuby runtimes. Both PyPy and JRuby seem to be rather small in usage.
Why may that be?
Which libraries are they talking about? I tried searching but the only mentions of JITs I found on Google were IBM's JVM, and Microsoft's CLR.
Back when Unladen Swallow was being worked on, I'm unaware of anything except OS X's OpenGL implementation using LLVM's JIT support—and that ultimately was very different code being JIT'd (they essentially just inline a sequence of interpreter states, so their source IR comes from Clang).
Nowadays… I'm not even going to try listing even dynamically typed language VMs that use LLVM for JITing.
It's also worth pointing out that even "long-running" in JS terms is often considered "short-running" in more general JIT terms, so compile-time performance is even more critical in the JS case than elsewhere.
They say that like it's a bad thing.