Wrestling Python into LLVM Intermediate Representation (2019) [video] (youtube.com)
72 points by tomrod 13 days ago | 19 comments





I'm not sure if it's an exact fit, but Dropbox used to have Pyston: https://github.com/dropbox/pyston

The blog posts by Kevin Modzelewski went into the internals: https://blog.pyston.org/

Also by the same person, a good article on Python's performance: http://blog.kevmod.com/2016/07/why-is-python-slow/

As a side comment: On the subject of type inference, I've really come to like Python's type hints. I don't use mypy itself yet, but I already had a habit of adding types in docstrings, and I like being able to add a clue when a signature/return is dealing with something that's not a basic type.

There are some nice little things out there for advanced typing, with more being added now and then:

- TypedDict and Literal: https://github.com/python/typing/blob/master/typing_extensio... (TypedDict was accepted in PEP 589[1])

- NamedTuple had variable annotation added in Python 3.6: https://www.python.org/dev/peps/pep-0526/
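To make those concrete, here's a minimal sketch of TypedDict, Literal, and an annotated NamedTuple in use (assuming Python 3.8+, where TypedDict and Literal live in typing; earlier versions need typing_extensions):

```python
from typing import Literal, NamedTuple, TypedDict

class Movie(TypedDict):
    # A plain dict at runtime, but mypy checks keys and value types.
    title: str
    year: int

class Point(NamedTuple):
    # PEP 526 variable annotations instead of the old field-list syntax.
    x: float
    y: float

Mode = Literal["r", "w"]  # only these two strings type-check

def describe(movie: Movie, mode: Mode) -> str:
    return f"{movie['title']} ({movie['year']}), opened '{mode}'"

m: Movie = {"title": "Blade Runner", "year": 1982}
print(describe(m, "r"))  # Blade Runner (1982), opened 'r'
```

None of this is enforced at runtime; it only pays off once a checker like mypy sees the annotations.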

I wonder whether, once type annotations are added to a codebase, we could give LLVM one more shot. It'd be pretty crazy to be able to build a large GraphQL server into a statically linked binary, like Go does.

[1] https://www.python.org/dev/peps/pep-0589/


You should check out mypyc

https://github.com/python/mypy/tree/master/mypyc

and these recent issues that mention targeting LLVM or other targets:

https://github.com/python/mypy/issues/8373

https://github.com/mypyc/mypyc/issues/709


This looks awesome. Have you used it for anything yet?

It would already be nice if it were Julia-style LLVM JIT compilation.

Many of the "why Python is slow" articles hand-wave away the fact that other languages that are just as dynamic, like Common Lisp and Smalltalk, have quite capable JIT compilers.

So it is a matter of having enough resources to throw at it.

Maybe PyPy and Numba are as good as it gets, unless some big corporation is willing to spend big bucks on improving Python's JITs.


Let me chip in here to clarify some things.

> other languages, just as dynamic, like Common Lisp and Smalltalk, have quite capable JIT compilers.

Common Lisp implementations don't need a JIT compiler to be fast. In fact, most of them don't use JIT compilation at all.

What makes Common Lisp implementations (especially SBCL) so fast is a combination of sophisticated static analysis, optional type declarations, and the fact that Common Lisp has been carefully designed to allow for high performance. I cannot stress enough how important the last point is. The Common Lisp standard is a contract between the programmer and the compiler writer that allows the former to write portable programs, yet gives the latter enough freedom to optimize.

In contrast, the language "standard" of Python doesn't clarify what portable programs may rely on. Instead, programmers tend to rely on the specific behavior of CPython. And reproducing the exact behavior of CPython is much harder than implementing a carefully designed standard.

> So it is a matter of having enough resources to throw at it.

No amount of resources can heal the design decisions of Python. The only way to get Python fast is by going through a painful standardization effort and by breaking some existing code. And I don't see that happening anytime soon. (Python 4 anyone?)


Right, JIT compilers are mostly unknown in Common Lisp. There are only a few bytecode interpreters where one could be used: CLISP, CMU CL with its bytecode interpreter, ...

Common Lisp has several execution modes; for compiled code, the important ones are:

1) AOT compiled, but fully safe with optional debug info: safety = 3 and debug = 2

2) AOT compiled, but fast and potentially unsafe with little debug info: speed = 3, safety = 0, debug = 0

The usual goal is to be able to run much of the code in mode 1) and only compile portions in mode 2).

Thus much of the language is optimized around possible compilation. The language is very dynamic, but there is also a core which is not object-oriented, and thus potentially easier to compile to fast code.

Also incremental in-memory compilation in Common Lisp is AOT and not JIT.


Fair enough that relying on CPython-specific behaviour might be an issue.

Although stuff like dictionary ordering, GC implementation, or the GIL shouldn't impact a JIT implementation.

However, going back to Smalltalk, which you didn't mention: not only is it as dynamic as Python, but at any given moment the image can change its contents, and via messages like becomes: an object can completely change its internal structure.


> not only is it as dynamic as Python, but at any given moment the image can change its contents, and via messages like becomes: an object can completely change its internal structure.

Another thing: in Smalltalk the idea of a "stack frame" is actually encapsulated in an object called Context, and you can always inspect the current context. It's just an object like everything else in the system. I don't think Python has something like that, but I could be mistaken.


You can walk and inspect the stack using sys._getframe() or the friendlier stack() function in the inspect module.

* https://docs.python.org/3/library/sys.html#sys._getframe

* https://docs.python.org/3/library/inspect.html#the-interpret...
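A quick sketch of both APIs (the function names here are just illustrative):

```python
import inspect
import sys

def inner():
    # sys._getframe(1) is the caller's frame (a CPython-specific API).
    caller = sys._getframe(1)
    # inspect.stack() gives friendlier FrameInfo records, innermost first.
    functions = [fi.function for fi in inspect.stack()]
    return caller.f_code.co_name, functions

def outer():
    return inner()

name, functions = outer()
print(name)           # outer
print(functions[:2])  # ['inner', 'outer']
```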


Common Lisp can also change the class of a CLOS object, via CHANGE-CLASS:

http://www.lispworks.com/documentation/HyperSpec/Body/f_chg_...

But these are not the parts where Common Lisp will be fast...
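For what it's worth, Python has a rough analog to CHANGE-CLASS: reassigning an instance's __class__ mutates the object in place, so every existing reference sees the new behavior. A toy sketch (it only works for classes with compatible layouts):

```python
class Caterpillar:
    def speak(self):
        return "munch"

class Butterfly:
    def speak(self):
        return "flutter"

bug = Caterpillar()
alias = bug                # a second reference to the same object
bug.__class__ = Butterfly  # change the object's class in place
print(alias.speak())       # flutter -- old references see the change
```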


Smalltalk's become: is a crazy-ass feature which walks the entire graph of reachable objects, and replaces all occurrences of one object with another. A becomes B by virtue of all known references to A being replaced with references to B.

> unless some big corporation is willing to spend big bucks

It seems like the outcome of the Unladen Swallow project was that there was no point to a general-purpose JIT, because the language semantics prevent the sort of accelerations possible in other languages. (I'm aware of how PyPy and Numba are successful exceptions.)


LLVM just isn't that good for a JIT compiler:

http://qinsb.blogspot.com/2011/03/unladen-swallow-retrospect...

Azul made it work for Java with what they said was ~20 man years of work.


They're really in a different league. My favorite example is the ad-hoc stack analysis that Python lets you do -- and not just of the current stack, but of already-exited stack frames! This is free in CPython but prevents sizable performance gains from JITs (I don't remember the exact numbers, but I think it was multiple percent).
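A sketch of what inspecting an already-exited frame looks like in practice: a traceback keeps alive the frames of functions that have already returned (here, unwound by an exception), and their locals stay readable:

```python
import sys

def failing():
    local_secret = 42
    raise ValueError("boom")

frame = None
try:
    failing()
except ValueError:
    tb = sys.exc_info()[2]
    while tb.tb_next:    # walk to the innermost traceback entry
        tb = tb.tb_next
    frame = tb.tb_frame  # frame of failing(), which has already exited

# The exited frame's locals are still inspectable.
print(frame.f_locals["local_secret"])  # 42
```

Guaranteeing this kind of visibility is exactly what forces a JIT to keep frames materialized instead of optimizing them away.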

There was a Python+LLVM talk last weekend at FOSDEM too (video is up):

https://fosdem.org/2020/schedule/event/llvm_python/

"Python with LLVM has at least one decade of history. This session will be going to cover-up how python implementations tried to use LLVM such as CPython's Unladen Swallow branch (PEP 3146) or attempts from PyPy and why they failed. After that it will show what are the current python projects that use LLVM for speed, such as numba and python libraries for working with LLVM IR. In the end, it will mention about new ideas that would unite the powers of both LLVM and Python."



I think this is actually from PyCon Israel 2016 and was uploaded in 2019.

I think that's accurate. I think it's this repo: https://github.com/aherlihy/PythonLLVM

Which hasn't been updated in 5 yrs :(


Brutal editing


