RustPython: A Python Interpreter Written in Rust (github.com/rustpython)
196 points by bovem 42 days ago | 70 comments



I don't understand how it's possible that we just randomly come across a project that just casually implements a Python interpreter in Rust. Don't these things take a massive amount of effort? Wouldn't this be making waves much earlier in its development process?

I feel the same way about Ruff, for example. One day it was "black all the things" and the next it's "btw we just reimplemented the entire Python formatting/linting ecosystem in Rust, and it's 100x faster, no biggie".

What's happening? Is it just so much easier to write stuff in Rust that projects like these pop out of people's heads, fully-formed? It boggles the mind.


For me the "uv" manager has changed my Python experience because: (1) it has a correct resolver whereas "pip" certainly doesn't and I'm not sure about poetry, (2) it is crazy fast, (3) "uv" is just a binary which I can pip into my system.

(3) is important because if it was written in JavaScript or Java or Python or .NET or many other languages I'd have to learn something about the runtimes of those environments to get it working. If it was written in Python it would have to deal with the bootstrapping problem that it ought to have its own Python installation separate from the one that it is manipulating so it can't have conflicts with that environment. (e.g. how many times have I busted my poetry?) I can use "uv" or "ruff" without learning anything about Rust!

As for (2) the speed of "uv" has as much to do with better algorithms and caching as it does with being in Rust and thus much faster than Python. I think you could have done better than Poetry in Python but "uv" is transformative in that it can often build an environment in seconds or less whereas with "poetry" or "pip" or "conda" I might have time to pound out a few posts on HN. I used to avoid creating new Python environments as much as possible but now it is fast, easy, and even fun.

I bet it is more work to write "uv" in Rust as opposed to a similar tool in Python but the impact on the community is so huge because we can finally put problem (1) behind us and do it with speed, reliability and grace. I had notes on how to build a better Python package management system and sometimes thought about trying it but I'd become convinced that the social problem of too many people finding half-baked tools like "pip" and "poetry" acceptable was intractable. Thanks to "uv" nobody will ever have to write one.


(Thank you for this, it was really inspiring for me to read.)


I'm really looking forward to uv being a drop-in replacement for Poetry. I don't know if that's what they're planning to do, though. Does it currently have all the niceties of Poetry (dependency management, locks, building wheels, etc.)?


Isn't it Rye? (https://rye.astral.sh)

"Rye supports two systems to manage dependencies: uv and pip-tools. It currently defaults to uv"

I've been evaluating it lately and it has pretty much the same CLI commands as Poetry except it's faster and comes with complete Python interpreter management (which is to me the real killer feature as I don't really care about speed of dependency resolution, but I do care about the DX).


Yeah, that's definitely within scope for what we're trying to build, and we've been hard at work on extending uv to support those workflows (platform-agnostic resolution, lockfiles, etc.). Honestly, a lot of it is already implemented, but not yet stabilized or announced. Coming soon.


Fantastic, I'm really looking forward to that!


> Don't these things take a massive amount of effort?

Yes, RustPython has been in development since at least 2018.

> Wouldn't this be making waves much earlier in its development process?

It's been posted on HN several times before: https://hn.algolia.com/?q=rustpython


Ah ok, it's at least comforting to know that I missed it, rather than there are superhuman developers that crank these projects out in an afternoon.


I’d wager they don’t hit major spread from opinion leaders and upvotes in social media until they are mostly usable.

It’s “I’m making a Python interpreter in rust,” claims emitted into the void with increasing engagement as it grows in usefulness.

Edit: and you can even see that in the HN search above. Every year it’s had a little more functionality and a little more engagement than the last.


Implementing an interpreter like that isn't as hard as you probably think, as the standard library does a lot of the heavy lifting once you have the basics.

It's still a lot of work, but they only need to implement the "built in" parts of the language, and that's a much smaller subset.

Example of what I'm talking about: https://github.com/RustPython/RustPython/pull/3858


I've had some fun converting some of my Python scripts into Rust and it's really not that difficult with the help of modern tools once you wrap your head around Rust. Python is too huge to crank out in an afternoon, for sure, but on the human level, the translation from python to something compiled is a well trod path.


The one-time cost of learning borrow rules and traits is steep, but the lifetime savings of Cargo vs. pip probably hit break-even after a few months.


Also if evolution has shown us anything we will all one day evolve into crab. Crab is the final form (Carcinisation).


Not to mention the change in rate of runtime errors.


I cranked out a Lua interpreter implemented in Rust in a week or two.

It only ran about 3x slower than PUC Lua... And never collected garbage either :P


A quick check on the contributors page shows ~8ish heavy contributors working over the course of 6 years and 13k commits. That's a good thing to check for any project you're thinking about integrating with IMO.

That said, my experience has been that adding business features in Rust apps is quite fast indeed!



Interesting that it relies on OpenSSL, either dynamically from the OS or vendored at compile time. I wonder what the implications would be for using something like rustls. You’d get TLS batteries included and kill a large external dependency… but possibly introduce behavior changes to low-level cryptographic operations, which is scary.

Still, the maintainers stated that they don’t plan to implement Python’s readline module because they already have a rust implementation of readline. A similar argument could apply here - use native rust implementations of dependencies and expose them via the expected Python APIs. This would break some ambitious Python programs, but those probably wouldn’t consider alternative runtimes anyway.

https://github.com/rustls/rustls


Does NumPy run on RustPython? And other libraries used in ML (not expecting compatibility with huge libraries like torch or tensorflow, but rather, getting the leaves to work should be doable)?

If not, is it at all possible to get NumPy and other libraries written in native code to work? I see that RustPython also works in wasm: but what about compiling NumPy's native code to wasm as well?


Does this have faster startup times than CPython?

Every time I want to rewrite a shell function in python, I always hesitate due to the slow startup.


How fast does it really need to be? On my M2 MacBook Air:

    $ time A=1 B=1 python -c "import os; print(int(os.getenv('A'))+int(os.getenv('B')))"
    2

    real 0m0.068s
    user 0m0.029s
    sys 0m0.026s


Eh. Once you start using imports, python slows down dramatically.

So I guess it really just depends what your scripts use.
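
For anyone who wants to put numbers on this, here is a minimal sketch that times a few standard-library imports in-process. The helper name `time_import` and the choice of modules are mine, purely for illustration, and the result is only a rough lower bound because dependencies that are already loaded stay cached.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return the seconds spent importing module_name in this process."""
    # Drop the cached copy so the module's top-level code runs again.
    # Its dependencies stay in sys.modules, so this is a lower bound.
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Arbitrary example modules; swap in whatever your scripts import.
    for name in ("json", "subprocess", "argparse"):
        print(f"{name}: {time_import(name) * 1000:.2f} ms")
```

For a per-module breakdown of a real script, CPython's `python -X importtime your_script.py` is probably the better tool.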


Regarding import cost, as it’s doing heavy IO traversing the file system, the cost heavily depends on how fast you can do IO in the hardware, and also the file system (and the OS).

So a fast SSD will help, and somewhat surprisingly putting it inside Docker helps (in an HPC context; I’m not so sure about its implications here, as we’re talking about short scripts).

But the context here is porting shell scripts to Python, so I’m not sure how much huge numbers of imports matter.

And it is probably an intrinsic problem of the language: unless we start talking about compiling the Python program (somehow statically) rather than interpreting it, switching to a different implementation of the language probably won’t help the situation.

Lastly, if the high startup cost of the script becomes relevant, perhaps the orchestration is wrong. This is an infamous problem with Julia, and the practice there is to just keep the Julia instance alive and use it as “the shell”. You can do the same in Python. I.e., rather than calling a script from the shell on millions of things, write a wrapper script that starts the Python instance once. Memory leaks could be a problem if it or its dependencies are not well written, but even in that case you have ways to deal with that.
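
A minimal sketch of the "start Python once" idea, assuming the per-item work can live in a function. `handle` and `run_batch` are hypothetical names, and the uppercase transform is only a placeholder for real work.

```python
import sys


def handle(item):
    # Hypothetical per-item work; a stand-in for whatever the script does.
    return item.strip().upper()


def run_batch(items):
    # One interpreter startup and one round of imports for all items,
    # instead of paying that cost once per item in a shell loop.
    return [handle(item) for item in items if item.strip()]


if __name__ == "__main__":
    # Example invocation: python batch.py alpha beta gamma
    for result in run_batch(sys.argv[1:]):
        print(result)
```

So instead of a shell loop that launches the interpreter for every item, all items go to a single invocation and the startup cost is paid once.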


>But the context here is to port shell scripts to Python, I’m not sure how huge amounts of imports matters.

Never question the modern developer's ability to import 1500 heavy libraries to accomplish something that only takes 10 lines of code.


I'm curious why you would think so, since CPython is already written in C.


Well CPython has a lot of backwards compatibility to deal with that RustPython doesn't, so "import subprocess" might result in very different behavior.


I wonder if this would make Python web applications more secure at interpreter and library level.

Running it on hardened Linux, OpenBSD, or FreeBSD was a start. A Rust implementation might help.

I also miss setups like eCos RTOS where a GUI determined which features got compiled in. Strip each Python app down to just what it needs in the interpreter. Might squeeze it in L1-L2 cache that way, too. Aside from embedded (eg MicroPython), has anyone anything like that for use on servers?


If Rust makes its way into the Linux Kernel, another mature C project, I wouldn’t be surprised if a Rust interpreter replaces CPython.


https://notes.eatonphil.com/lua-in-rust.html It's some kind of developer trend.


I don’t see it on the main README: what are the limitations/incompatibilities with CPython at this point in time? How “drop in ready” is it?


They have a "What's Left?"[1] and the results of running RustPython against the CPython test suite[2] up on the website.

[1] https://rustpython.github.io/pages/whats-left

[2] https://rustpython.github.io/pages/regression-tests-results....


sqlite3 is the main one on that list that jumps out at me.


This seems very weird to me. Anyone who is just slightly interested in the project would want to know if specs are fully implemented and this has parity with the "official implementation". Can't believe it's not in README.


[flagged]


>The modus operandi of the shrinking set of people who took over CPython is to add new garbage so other implementations cannot keep up.

Ridiculous accusation bordering on paranoid.


That's a bit harsh. C++ has a specification, and programs still break easily just by moving to a different compiler.


Do you mean they would work like that to prevent an alternative implementation? This seems a bit far-fetched.


I suppose you consider new Python features "garbage" because you don't care about them?


Does RustPython support the GIL? It would be ironic for a language with a crab as a mascot to not depend on having gills or at least one.


There's no installable version for pyenv. In general, how well does this work with virtualenvs?


Are there any benchmarks made public in comparison with the python3 interpreter?


We're waiting for Ruby ;)


Look up Artichoke Ruby.


A Python interpreter has to be written in every language.


Just as Doom runs on about any hardware you can think of.


Is there a GoPython?



Yes, and it's called Grumpy.


hopefully not


While you're at it fix Python's crippled lambdas and ...


I wondered why you stopped midway through the sentence, and only after reading the other comments did I get what you meant … What problems are you referring to exactly, and how would they be fixed in an implementation but not at the language level?


What's wrong with Ellipsis?


This project doesn't seem to be adhering to the terms of the license of the original CPython modules that have been copied into its source repository.


What are they violating, out of curiosity?


They list the repository as MIT licensed, but the Python modules are distributed under the Python Software Foundation License, which says:

2. Subject to the terms and conditions of this License Agreement, PSF hereby grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use Python alone or in any derivative version, provided, however, that PSF's License Agreement and PSF's notice of copyright, i.e., "Copyright (c) 2001-2024 Python Software Foundation; All Rights Reserved" are retained in Python alone or in any derivative version prepared by Licensee.



Thanks. I definitely missed it.


Does it have a GIL?


idk but Jython didn't. So I don't think there's anything inherent in the language outside of CPython that calls for it.


The complete semantics of Python object lifetime are expensive to implement in a compatible manner without a GIL. Jython got around this by not doing it, making it not fully compatible (yes, people do depend on objects being eagerly freed), just using the JVM GC instead. If you do want full compatibility, the choice is between single-threaded performance and parallelism.
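
To make "eagerly freed" concrete, here is a small sketch that uses a weak reference to observe when an object actually dies. `Resource` and both function names are made up for the illustration. I would expect the first check to succeed on CPython because of reference counting, while an implementation that relies on a tracing GC may keep the object alive until a collection runs.

```python
import gc
import sys
import weakref


class Resource:
    """Hypothetical object whose lifetime we want to observe."""


def freed_immediately():
    obj = Resource()
    ref = weakref.ref(obj)  # a weak reference does not keep obj alive
    del obj                 # drops the last strong reference
    # With reference counting the object is destroyed right here.
    # With a tracing GC it may still be reachable through ref().
    return ref() is None


def freed_after_collect():
    obj = Resource()
    ref = weakref.ref(obj)
    del obj
    gc.collect()            # explicitly ask the collector to run
    return ref() is None


if __name__ == "__main__":
    print("implementation:", sys.implementation.name)
    print("freed right after del:", freed_immediately())
    print("freed after gc.collect():", freed_after_collect())
```

Code that opens a file without closing it and counts on the handle being released as soon as the last reference goes away is leaning on the first behavior.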


> yes, people do depend on objects being eagerly freed

I get that this must be one aspect of the necessity of the GIL but I mean, C++ also has eager free behavior due to RAII and threads are working fine there, as long as you know what you're doing. Perhaps that's the rub though, it's pretty easy to crash/deadlock in C++ and we blame the programmer rather than the language.


Idiomatic C++ relies much more heavily on ownership and not so much on refcounting. If you have code that's a rat's nest of shared_ptr, it's going to perform very poorly in a multithreaded environment. But that's why any C++ guru will tell you to not make a rat's nest of shared_ptr. When refcounting is commonly used in C++, like with GUI code or dependency graphs of network requests, it's usually in non-performance-critical sections.

In Python, by contrast, all variables default to object references, and so nearly everything you do involves updating a refcount.
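
A rough way to see this on CPython is `sys.getrefcount`. The sketch below only compares counts before and after, since the absolute number includes temporary references and varies between versions. `refcount_delta` is a name I made up for the example.

```python
import sys


def refcount_delta():
    obj = object()
    before = sys.getrefcount(obj)
    alias = obj       # plain assignment: one more reference to obj
    holder = [obj]    # storing it in a list: one more reference
    after = sys.getrefcount(obj)
    del alias, holder  # dropping both names releases those references
    restored = sys.getrefcount(obj)
    # Report changes relative to the starting count, not absolute values.
    return after - before, restored - before


if __name__ == "__main__":
    gained, leftover = refcount_delta()
    print("references gained:", gained)
    print("left over after del:", leftover)
```

Every one of those bumps is a write to a counter shared between threads, which is the bookkeeping that has to be synchronized somehow once the GIL goes away.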


Right so you're saying that Python's need to keep ref counts is what leads to the need for synchronizing updates, leading to the need for a lock, more or less. Which is only needed in C++ if you program in a kind of Python style. Makes sense and is a good point.


IIRC, it mostly impacts C modules, in terms of the guarantees that are offered or not offered with GIL vs. no-GIL.


Yeah, but the ecosystem of C modules is what makes Python so great.


There is a competing C interface that NumPy and a few other projects are adopting that allows for no GIL. Last time this came up I thought the Rust implementation used that one.


It's what makes Python tolerable.


Hadn't thought about Jython for a while. Whatever happened to that?


One of the casualties of the Python 2 => Python 3 debacle, it seems.


It's special-purpose, and always has been. You use Jython when you want to embed a Python-like interpreter into a Java program. Usually when you do so, you're scripting the objects of the Java program, and don't need or want to import arbitrary Python packages. Indeed, that's often the whole point of Jython - the system designer wants a language that's familiar to Python programmers, while also being able to control the environment that those Python scripts can access.

This is not different from the Python 2 days. Jython has always had subtly different semantics from Python (eg. it uses Java strings instead of Python ones, there's no C API, it relies on the Java GC so no eager free), so many common libraries wouldn't work with it. Just try to run NumPy on Jython - you can't, despite the same developer authoring both Jython and NumPy's predecessor.



