Hacker News new | comments | show | ask | jobs | submit login
Speed up your Python using Rust (redhat.com)
291 points by jD91mZM2 on Nov 17, 2017 | hide | past | web | favorite | 98 comments

About as fast as numpy.. More tools to create fast code is always great, but the tooling for Rust/C in Python needs to be easier, I just can't be bothered most of the time.

This in numpy gets a better relative boost on my machine YMMV.

    import numpy
    def count_double_chars_np(val):
	return np.sum(ng[:-1]==ng[1:])

    def test_np(benchmark):
        benchmark(count_double_chars_np, val)

Good numpy implementation of the algorithm. If, for whatever reason, numpy isn't available, you can also pull it with a good comprehension:

    def count_doubles2(val):
        return sum(1 for c1, c2 in zip(val, val[1:]) if c1 == c2)
Which will also allow you to avoid a function call entirely, if it was useful in some way:

    In [56]: %timeit count_doubles(val)
    198 ms ± 13.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    In [57]: %timeit count_doubles2(val)
    189 ms ± 21.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    In [58]: %timeit sum(1 for c1, c2 in zip(val, val[1:]) if c1 == c2)
    135 ms ± 3.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

    In [59]: %timeit count_double_chars_np(val)
    6.95 ms ± 782 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
(Numpy still beats it, for long strings).

Hi, can you send a Pull Request including your numpy implementation? https://github.com/rochacbruno/rust-python-example I would like to add it there just for the record and then I will update the article.

Thank you for the very nice, educative article, Bruno!

If performance comparison of counting character pairs really were the issue here, in addition to the already suggested numpy approach, an implementation I'd dare wager to be as competitive is re2, e.g. [1], a drop-in replacement for the standard re package.

But I want to point out that I think all this performance comparison of this trivial character counting distracts from the core idea here: You'd use a low-level implementation in Rust (or C/C++/Cython, for that matter) when such "nifty tricks" are not available, after all. So again thanks for the article, and do think if you really want this performance issues to degrade the article to a only marginally relevant performance "showdown".


Thanks would be nice, you should be able to just copy and paste that oneliner, but I'm not sure you blog post is better for it. The idea that Rust is easier to include in Python is important enough, and Numpy is a bit of an edge case imho.

Ideas are always CC-zero

See also “Fixing Python Performance With Rust” previously discussed here:


And “Evolving Our Rust With Milksnake”:


Both from Armin Ronacher at Sentry. About Milksnake:

Milksnake helps you compile and ship shared libraries that do not link against libpython either directly or indirectly. This means it generates a very specific type of Python wheel. Since the extension modules do not link against libpython they are completely Python version or implementation independent. The same wheel works for Python 2.7, 3.6 or PyPy. As such if you use milksnake you only need to build one wheel per platform and CPU architecture.

Yeah, Milksnake is mentioned in the article :)

Doh, completely missed that reading the article on my phone.

I never got around to talking about it, but as part of my "month of Rust", I ported permission-based authorization logic from Python to Rust and then ran performance benchmarks of the Rust implementation and a pypy-compiled version. The pypy-compiled python ran slightly faster.

I've been told not to expect similar results in other implementations. These findings cannot be used to draw any conclusions about pypy.

My rust project: https://github.com/YosaiProject/yosai_libauthz

That's pretty interesting - JIT's can do a lot of great optimizations with runtime information, but I'm still surprised to hear that Pypy was faster. It would be cool to see the benchmarks, methodology, and Python code.

Personally, Pypy has never been an option due to the nature of the codebases I work on - or at least it wasn't. I was using pandas, numpy, scipy etc and I don't think it was compatible.

Very interesting. I glanced at the code to see if there were obvious performance issues. All I noticed was a triple `map` invocation in https://github.com/YosaiProject/yosai_libauthz/blob/master/s...

Not sure what the compiler does with that, but I'd expect that it means you're running through those elements three times. I imagine it would be better to reduce this to one `map` call.

ah you're looking at the CABI stuff.. that part wasn't benchmarked because I wanted as close to apples to apples as I could get and know that the cffi bridge taxes performance

here's the actual rust library: https://github.com/Dowwie/rust-authz/blob/master/src/authz.r...

the project includes a bench

I don't understand all the implications, but I often hear from JIT language people that their language could be as fast as a AOT language if it was used correctly.

What are the use-cases that lend itself to be faster in Rust, than JS/Ruby/Phython?

Fwiw, Ruby does get much more performant with JRuby but nobody cares that much because the benefits are mostly lost with Rails.

I would like to know this as well. This code is one instance of it.

For comparison, I just implemented the same as C SWIG extension[1]. It's about 10% faster, but it's cheating by comparing bytes instead of utf-8 encoded characters. The more interesting part to me is the comparison of the amount of boilerplate code required.


One thing though that gets very complicated about using SWIG is ownership semantics. With anything more complicated than passing scalar values, it is very easy to introduce a memory leak or double-free if you don't get the flags right. I wonder if Rust types naturally allow a much better inference of ownership semantics across the language boundary?

If you try to wrap any non-trivial type using SWIG typemaps you will quickly go insane. However to speed up an inner loop you can often get away with a few PyObject* arguments/returns. SWIG will pass those through and you can use the Python/C API directly, e.g. to return a numpy array. Allow SWIG to handle only simple types. The Python/C API is relatively sane, but you'll have to learn the reference counting conventions.

Agreed! On my current project (C++) I found that things get extremely complicated with shared_ptrs and directors. I even ended up contributing some solutions to SWIG.

It all appears to be due to a lack of semantics in the C header. SWIG depends on specifying this stuff in the interface file, but I've often wondered if it wouldn't be better to enhance the C-side, either by standard parameter name conventions or by some Doxygen-like standard comments to indicate ownership and other stuff.

SWIG has this nice potential to generate wrappers for (m)any language(s), but in practice as you said it's often just easier to use the Python API directly instead of trying to make it too general. Shame.

I have a C++ project that is available in multiple languages (Python, Rust, Fortran, Julia and JS), and I decided not to use SWIG partly because of this issue. Instead, I manually maintain a clean C API with everything that I need, and manually wrap this API using whatever is available in other languages. It is a bit more work (and thus incentivize me not to break the API ^^), but allow me to mix and match various ownership semantics throughout the API.

> It's about 10% faster, but it's cheating by comparing bytes instead of utf-8 encoded characters.

I'm really glad that you acknowledged this — I work with a lot of non-ASCII text and have run into that more than a few times in real code.

I find pybind11 [1] to be perfect for my C++ code. There's so little boilerplate, and I get RAII-guaranteed memory safety and all the speed my C++ development can bring.

For example, the binding of an accelerated HyperLogLog implementation only requires tiny amount of work, plus a line in my Makefile:

  PYBIND11_MODULE(_hll, m) {
      m.doc() = "pybind11-powered HyperLogLog"; // optional module docstring
      py::class_<hll_t> (m, "hll")
          .def("clear", &hll_t::clear, "Clear all entries.")
          .def("resize", &hll_t::resize, "Change old size to a new size.")
          .def("sum", &hll_t::sum, "Add up results.")
          .def("report", &hll_t::report, "Emit estimated cardinality. Performs sum if not performed, but sum must be recalculated if further entries are added.")
          .def("add", &hll_t::add, "Add a (hashed) value to the sketch.")
          .def("addh_", &hll_t::addh, "Hash an integer value and then add that to the sketch.");
[1] https://github.com/pybind/pybind11

If working in C++ land, I'd agree this is the nicest approach. It does, however, require linking against a specific libpython version [1], unlike Milksnake. But I'm not sure that's a bad thing...

[1] http://pybind11.readthedocs.io/en/master/basics.html#creatin...

True. It's not that bad, though -- "python3-config --extension-suffix" or "python-config --extension-suffix" is all it takes to generate the suffix you want, and you can drop it straight in your site-packages folder. At that point, you're just dropping the 3 or not depending on your version.

It's not as simple as milksnake. I would like to see some smarter extensions added to pybind11, but I'm okay with that for now.

I'm not familiar with Rust libraries, but I would guess it's just counting code points and not characters, so strictly speaking both are cheating.

I would love to see people showing how to do simple string processing, like counting characters in proper grapheme cluster level in their favorite programming language.

   for (c1, c2) in val.chars().zip(val.chars().skip(1))
chars() iterates by unicode scalar values. It'd be bytes() for bytes.

If you wanted to do it by grapheme clusters, you'd add https://crates.io/crates/unicode-segmentation to your Cargo.toml, add the relevant imports you see on that page to your code, and change the above line to

   for (c1, c2) in UnicodeSegmentation::graphemes(val, true).zip(UnicodeSegmentation::graphemes(val, true).skip(1))
... possibly splitting that up into variables becuase dang, that's a long line.

Then, you're getting &strs instead of chars for the iteration, but I think the body still says the same, as == checks by value.

Another possibility for this kind of examples that I used and has the minimum boiler pate is to just generate a .so file with your native function in C and call it from ctypes. Basically almost no boilerplate at all.

> with your native function in C

Works well this way with Rust too.

pytest.benchmark really needs to default to a smaller width of it's stats, those stats are really just meant to be used in a terminal..

I like the article, but the following advice confused me, especially since this comes from RedHat i.e. Linux people:

> Having Rust installed (recommended way is https://www.rustup.rs/).

This essentially recommends unconditionally using the "curl | sh" anti-pattern.

Shouldn't they recommend instead e.g. "apt-get install rustc" for Debian users?

Since this doesn't make use of too recent Rust features, using Rust 1.14 of Debian/Stable should be fine, shouldn't it? Same of Fedora, etc.

If you look at the way the rustup-init.sh script is written it's safe to be used with this "anti pattern". I see your objection though but unfortunately this ship has sailed, you might as well complain about websites that don't work without Javascript...

The advantage of this method is that it will work on any linux distro (and even BSD, Darwin and mingw) and you'll get the latest stable version. I don't see the advantage of using potentially outdated OS packages for installing a compiler, it's not like it's a dependency for other packages.

It also makes it easy to manage the various components of the toolchain, for instance if you later want to crosscompile for an other target, use the nightly version etc...

That makes no sense. If one curls to bash obviously they can't "look at the way the rustup-init.sh script is written".

The ship hasn't sailed and neither has the sites without JS one. You have made a decision to favor covenience instead of security and you're trying to make it look like it's the normal state of affairs.

Rust can be installed by downloading the appropriate package and checking it with gpg.

Recommending curling stuff to bash is ridiculous and makes a mockery of the idea of safety.

You're planning on downloading binaries and gpg keys from that site anyways. Either you trust it, in which case you might as well curl | bash, or you don't, in which case you shouldn't be running that script no matter how carefully you inspect it.

And of course you can inspect the bash script (not that it does you any good), curl > file; bash file. It's just that most people don't so that's not what is recommended .

No, you're completely wrong. The key is available in multiple places and has been available for a while, so there is some verification that can be done.

The binary will be checked by gpg, it shouldn't matter where it's from.

Finally, if the recommmendation is to run curl foo | sh, the bash script can literally not be inspected.

Just separate the steps? Curl to a file, inspect it, and then execute it? I don't see the problem. Most users just don't care because it's official anyways.

Rust is a language in active development, continually getting improvements and new features. Using Rustup is the best way of managing up to date toolchains (and multiple toolchain versions if you have to).

It's no harder than apt-get install, and sets the best practice early on so that someone doesn't get confused and have to switch later.

"curl | sh" is only an anti-pattern in the sense that you have to trust the source (and therefore "curl | sh" without https is bad). It gives exactly the same ability to execute arbitrary code on your machine as downloading an RPM/DEB does, or adding a vendor specific repo (e.g. Docker). Distro package repos probably have broader scrutiny of the contents of packages, but there are a lot of packages so how sure can you be?

I agree with your main point, but would like to add that apt does have additional signature verification with gpg, so it's a bit more secure than just https (e.g. anyone with access to a trusted CA and your network can mount an active attack against you).

HTTPS also doesn't guard you against someone replacing the binaries on the server (e.g. what happened to transmission). It also doesn't protect you from misconfigured corporate or state level MITM firewalls that don't check certificate validity.

HTTPS is intended for transport security. Using it for package authentication is generally a mistake. That's why most distributions accept the additional complexity of PGP instead of only relying on HTTPS.

I think distros package rustup too. Here on Arch it is pacman -S rustup (instead of the curl | sh thing) then proceed normally.

Installing an outdated toolchain makes little sense because after this example people interested in Rust may want to do other things, and will encounter an artificial roadblock when they (or a worse, a dependency) needs a newer Rust.

I think the Rust packaged in the distros is meant to be a build-dependency of software written in Rust (for example, ripgrep), not for Rust developers.

This is true for the rust compiler included in the opensuse repository - its only there to build packages (of which the newly released Firefox is and has been since 54)

Arch has Rustup? Today I learned. That's amazing!

I edited the article including a reference to rust-toolset which is available on RHEL repositories `yum install rust-toolset-7` https://developers.redhat.com/blog/2017/11/01/getting-starte...

Rustup is very useful for managing toolchains. For someone new to Rust, it's probably best to be familiar with it from the outset.

> using Rust 1.14 of Debian/Stable should be fine, shouldn't it? Same of Fedora, etc.

That is a RedHat developer blog. Is rust available on RHEL already? Looking at CentOS (which should have nearly the same packages), it doesn't appear to be available yet.

You can download the rustup bootstrapper if you don’t like curl to bash. I would recommend against using the debian packages.

Debian packages integrate better with the system, are authenticated by gpg, and have at least one more critical pair of eyes on them.

If you're willing to stick with an older version of cargo and rustc, why not?

Because the ecosystem is not. A lot of what’s on crates.io needs the latest and greatest version.

That's a very good point. However, choosing to use a specific version already implies forgoing anything newer than the next release. This would include language features and crates (generally).

Thankfully, crates.io lists the dates next to crates so that users can select a version that is released before or during the release cycle of their compiler, thus guaranteeing compatibility.

Packaging Rust for Fedora/RH distributions is a work in progress ATM:


You can install rustup using your package manager and then use it to manage your Rust installs.

> Rust is a language that, because it has no runtime, can be used to integrate with any runtime; you can write a native extension in Rust that is called by a program node.js, or by a python program, or by a program in ruby, lua etc. and, however, you can script a program in Rust using these languages. — “Elias Gabriel Amaral da Silva”

Can someone explain why is "having a runtime" problematic for writing extensions and calling them from Python ? From what I gather Go does have a runtime, so implicitly it should be suboptimal for calling from Python. Yet since 2015 (Go 1.5) can be called directly from Python. I'm a Python programmer looking to expand my tool belt. I'm wondering of relative pros and cons of Rust and Go. I have only written small toy programs in C and other compiled languages.

Is Go better suited to completely rewriting software rather than using it for extensions ? Why ?

I would appreciate a benchmark with a Go extension, too.

If I understand the matter correctly, FFI-ing with a language that has runtime has more friction in extra overhead of initializing runtime, i.e. more state management in your app. Not that it is not doable.

EDIT: maybe indeed someone more knowledgeable will explain it or point to good condensed reads.

"Can someone explain why is "having a runtime" problematic for writing extensions and calling them from Python ?"

Perhaps instead of saying "having a runtime" it would be better to examine the situation in terms of what the code assumes. Python assumes that it has the Python GC running on its code, that everything is a PyObject of one sort or another, that it has a Global Interpreter Lock that if taken will prevent anything from modifying anything it thinks it owns, and so on. Go assumes that it has the Go GC running (despite both "having GC", there's enough differences that it must be specified as a difference), that its objects are laid out in certain manners such that most field references are compiled down to static offsets rather than dynamic lookups, that it can run its core event loop and dispatch out work to its internal goroutines without asking anyone else, etc.

You could go on for quite a while; I don't intend those as complete lists. I just want to convey the flavor of conceptualizing the runtime in terms of assumptions that the code running in that runtime can make.

Once you look at it this way, it should be more clear why trying to jam two runtimes into one OS process gets to be tricky. I use the word "jam" quite carefully, because it always feels that way to me. The more differences between the assumptions of the two runtimes, the more translation the code is going to need. For instance, Python to anything else is going to involve unwrapping the data from the internal PyObject wrappers, and wrapping anything coming back from somewhere else back into PyObjects. Threading models have to be matched up. Memory layout has to be harmonized. Memory generally has to be kept strictly separated, because the two runtimes both expect to be able to manage memory, so you can't hand memory allocated by one of them to the other, which further implies that you're almost certainly copying everything across the boundary. Etc. etc.

I'd also separate out the way there can be differences in the affordances of the languages. For instance, Python doesn't have what Rust or Go would call "arrays". Rust and Go are fine with getting arrays of pointers, but the languages afford the use of memory-contiguous arrays without pointers, so especially if you're integrating with a third-party library, you have no choice but for some layer somewhere along the way to convert Python lists into the correct sort of array. The runtimes technically don't force this, but the structure of the libraries and code afforded by the other languages do. By contrast, if you were integrating with lisp, you might find many points where you need to turn things into singly-linked lists, again, not because Lisp can't handle arrays, but because you're likely to encounter pre-existing Lisp code that expects Lisp cons lists.

As another example, despite the fact Go and C generally see eye-to-eye on how to layout structs, the C support from Go is still extremely expensive due to the need to convert from how Go sees the concurrency world to how C sees the world. C, contrary to popular belief, actually does have a runtime, and that runtime tends to assume it has very deep control of the OS process it is running in. Go has to do a lot of work to isolate the running C code in an environment it is comfortable with, where it won't be pre-empted by the green thread code (on account of the fact that it can't be, C doesn't support that). There's also some tricksy code you may need to write to harmonize C's memory-management-via-malloc model with Go's "lifetimes determined via the GC" model. (If you listen carefully, you can hear the Go runtime go "klunk" every time it runs cgo code.)

Rust has a runtime too, but unlike a lot of languages, it has the ability to shut it off. You lose some services and capabilities, but on the upside, you significantly reduce the number of assumptions the Rust code is making, making it easier to integrate with other runtimes. (I say reduce because technically, it still doesn't make it to zero if you are precise enough in your thinking, but I'd expect that of all the current "cool" languages, Rust with the runtime off probably makes fewer assumptions than anything else.) That said, I'm not sure if this code is working in that mode. I see the rust code doesn't directly turn off the runtime, but I don't know what that "#[macro_use] extern crate cpython;" line fully expands to. It's possible that the full Rust runtime is still in play, which looks enough like C anyhow (by explicit design of the Rust team) that Python's existing C integration can just be reused. Either way Rust is still making many fewer assumptions that Go's relatively heavyweight (in terms of assumptions moreso than resources) runtime.

Rust's runtime is basically the same weight as C, that is, crt: https://github.com/rust-lang/rust/blob/master/src/libstd/rt....

What you're talking about is more of dropping the standard library.

> C, contrary to popular belief, actually does have a runtime

I've been left wondering what you meant by this. Are you referring to the stack and heap management? Or OS processes and threads?

If not, could you please explain what you mean by C runtime, and how does Rust differs from it when it is shut down??

"could you please explain what you mean by C runtime,"

There's two components to the C runtime, what is specified by the C standard, and what is specified by POSIX and the operating systems. I am not sufficiently familiar with the C world to tell you exactly which thing is defined in which part. Fortunately, for this discussion of how integrating C code into another runtime goes, it doesn't really matter.

The C runtime includes the assumption that there is a malloc-compatible memory allocator available (note it's swappable), the process of linking programs when they start up and the whole surrounding "symbols" they can obtain. It has certain assumptions about what state needs to be saved when a function is called; for instance, it won't save the flags on the processor controlling IEEE FPU conformity. Function calls have a "stack" and there's a "heap", and the language itself distinguishes between them. C itself, IIRC, has no specification for threads whatsoever, but the OSes seem to have converged on a fairly similar model that could be fairly called part of the runtime now.

It's hard to "see" the C runtime because it has won so thoroughly that it just looks like "how computation is done", or is so deeply integrated into the operating system that it forces parts of the model on everything that runs on that OS. You kind of have to piece together what C does by looking at what it does that other languages do differently. Yes, most programs at some point will do some linking and symbol resolution, but once the interpreter has started up, dynamic languages have no concept of a static symbol table. Loading another Python module doesn't even remotely resemble loading a C library, either at startup or dynamically later. The language Go doesn't have a stack or a heap. The implementation does for practical reasons, but the language does not. Most other languages now will save the same things on the call stack as C, but that's not a requirement of computation; you could save a lot more of the processor's state, but it'll trash your function performance to do it. A "stack" and "heap" model is not necessary; Haskell for instance does not have a clear "stack" at all. (It does stack-like things, certainly, but it turns out getting what most people call "a stacktrace" from the runtime is actually fairly hard. I believe still not possible on GHC.) There are alternate methods for threading, including models that still use the C-style threads under the hood but include mandatory code to be run at startup and shutdown to be "part" of the runtime.

C is not as thin as it looks; it's just that history has made it appear to be the baseline. And as I know my internets, let me say that nothing in this post is criticism. Something has to be the baseline. While I think the C baseline is getting long in the tooth, it won for a reason, and I don't know that we could have gotten much better from the 1970s. (The other competition usually cited was either a performance non-starter (the Lisp of the time), or had it survived for 40+ years, we'd be able to write a very similar post about how it is getting long in the tooth too in 2017 (Pascal, for instance).)

A great answer!

> I am not sufficiently familiar with the C world to tell you exactly which thing is defined in which part.

I've got some bits of knowledge here. I could be wrong, as it's not my expertise...

> Function calls have a "stack" and there's a "heap", and the language itself distinguishes between them.

I don't believe this is true or at least, not literally but the details are interesting! http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf is what I usually go by when talking about C11. Malloc is defined in, and says:

> The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.

In 7.22.3, the overview for all the memory functions, it says stuff like

> The lifetime of an allocated object extends from the allocation until the deallocation.

which restricts how you can implement it, of course, but it doesn't use the words "heap" and "stack" at all; "stack" is never mentioned in the document. 6.2.4 talks about storage durations, this is usually what we think about when we talk about "stack" and "heap" and such. "stack allocated" is more properly termed "automatic storage duration" and "heap allocation" is "allocated storage duration."

This is a side effect of the fact that C itself is defined in terms of a virtual machine! They call it the "abstract machine".

Anyway, all of this is in service of your point about history and such. Many people just assume all of this is how it has to be, rather than something that came to be thanks to history. It's all very interesting!

> C itself, IIRC, has no specification for threads whatsoever, IIRC

C11 added this, actually, but before that, you're 100% right.

Thank you for the elaboration. Now that you remind me, I remember about the C11. Which also adds "The C memory model" as part of the runtime, IIRC. Other languages have different memory models. Usually simpler, though it's hard to hold that against C11 since it was in the unenviable position of trying to codify decades of implicit and divergent practice in one of the trickiest places in software engineering.

Rust has a big standard library that is linked by default. Using it with no runtime is pretty rough going and is usually done only by people targeting bare metal microcontrollers or OS kernels, because it loses a lot of the power normally available in the language.

You may be confused about the difference between a standard library and a runtime. Rust has a standard library (which can be disabled in resource-constrained environments) but no runtime - there's nothing to initialize or tear down, code is just executed. Languages like Go, Python etc. have a runtime (which includes for example the garbage collector.)

What power are you missing? I'm usually surprised by how much still exists without libstd. The biggest thing is collections.

I have to confess to not actually being a Rust user. But, isn't for example std::io used by most Rust programs?

std::io doesn't use a runtime, it just makes system calls directly.

(It used to use a runtime because it had optional M:N threading, but that was removed when it turned out not to be any faster than 1:1 threading, caused all sorts of problems with stack switching, and of course pulled in a problematic runtime.)

If anyone is looking for something like this for Ruby, check out Helix:


Awesome, thanks! It feels like a lot of people dislike Ruby, but I'm glad some people still like it. I think it's a good Python alternative, and has syntax that reminds you of Rust.

What's the advantage of doing this over using cython or pypy?

More perfs since you are going lower level. You are not limited by cython types or pypy git warming up. You can use the full range of c libs AND rust libs. You benefit from the cargo build tools.

I'm not sure how different this is with Cython since you can also write C/C++ code with it, and hence use any C/C++ libraries. I say that the benefit offered by the example is more of bringing the Rust ecosystem to Python than solving performance issues.

Well, with cython, either you write python compiled to C, and you won't match rust perf nor types, or you write c/c++ and bind it to python, and you won't match rust safetiness.

why would rust have better performance than cython? cython is as fast as C

Cython just transpiles to C. It's not "as fast as C", it's as fast as the runtime support structures it uses, and the communication with Python allows.

You can write very efficient Cython code but it's true that in this case, you tend to adopt a lower level code style that is very close to C/C++. Basically, you need to think about the C/C++ code that will be generated by Cython.

C/C++ compilers might be able to generate more optimized native code than what rutsc does though. Actually, this is a question: how good is rustc with numerical / math intensive code? For instance, does it implements loop unrolling and SIMD vectorization?

Most of the time when a developer needs loop unrolling - numpy will work best anyway. Why does everyone always start mentioning this fact when performance is mentioned?

For example, in my case I always need high-performance code to work with strings loaded from loads of CSV files. That includes: merging strings, matching them, comparing them. Loop unrolling/SIMD would not really help here, while an ability to write safe, checked code fast - would.

On the other hand I do need the pythonic dynamics, so that's what I stick to.

yes, exactly as I said.

Total layman here, but I thought one would use Rust because it’s easier to write “safe” code with it than C/C++, while maintaining an equivalent low-level speed advantage.

you should also quote "easier" ;)

it's certainly easier once you sacrifice a goat to the compiler to stop insulting you and your mother.

disclaimer: i love rust

Well it's better than your client insulting you and your mother because your software killed their goat by mistake.

lol. It's sad that I can only upvote this once :)

Cython is unsafe.

It's much easier to learn Cython once you know Python. That's the biggest selling point.

But the tooling is terrible in comparison. I find rust sigificantly easier as a python developer than cython. There is so much more rust ecosystem to take advantage of.

To be fair to Cython, they have access to the entire C++ stdlib, so that's a fairly good amount of tooling. The main thing is lacks is good documentation and memory safety.

And C++ has absolutely no package distribution system at this point.

Cython deceptively similar to Python with a lot of the pitfalls of C baked in. I personally found their documentation lacking and had to read through tons of Cython projects to discover how it behaved in many scenarios. I've had a much better experience with Rust documentation in comparison.


We've banned this account for mostly posting against the guidelines.


> too much commercial exploitation

Could you elaborate here? I'm interested.

I've mostly been saying "Rust has over 100 companies using it in production right now, but like, that's still 100 companies. It's good for where we're at, but it's not a massive number."

Rust would have to have a significant and meaningful presence in the commercial space for this exploitation you mention to exist. It doesn't. Rust adoption right now is trivial and tiny compared to commercial programming at large.

Rust is a great language with some over-zealous fans who seem to not understand that frothing at the mouth about how C and C++ are literally weapons of murder and how Rust-safety is the most important thing any programmer should think about for any project is...not the best way to advocate for the language.

Now we got Numba, Cython and Numpy results for comparison https://github.com/rochacbruno/rust-python-example#new-resul...

https://news.ycombinator.com/item?id=14588333 (beautifulsoup/lxml upgrade)

>Python: interactive glue language between high performance C libraries.

Appreciate this walkthrough for Rust!

How does it compare with https://github.com/servo/html5ever (someone with free time do run some benchmarks)

That's a great question, and fits nicely in the context of the current discussion.

I think the primary claim to fame for this C-based https://github.com/kovidgoyal/html5-parser is serving as a drop-in performance boost for lxml (at the API level; it parses invalid HTML differently/more consistently).

I too would be interested in a performance comparison to help decide which project makes more sense for new projects. The existing Python layer in html5-parser might give it a leg up if the language of choice is Python - is there a similar project for the Rust-based html5ever?

Shouldn't Python, et al have more "native" ways to achieve these sorts of performance improvements?

Maybe because of the existence of `Numpy` and `Cython` and `PyPy` + all other `FFI` possibilities it is not on Python Roadmap.


Nim probably makes more sense if you're coming from Python.

Oh, and Python is perfectly fine for most things. If you like it, sticking with it and just speeding up the rare thing that needs that performance makes much more sense than switching language, especially if you already are well into a project.

Yeah, that is exactly what I said in the article! Speed up your Python using Rust; Only for those rare cases when Python is detected as the bottleneck. And of course, it can be C/C++ or Nim or Go etc...

Or Rust. ;-)

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact