
Migrating from pandas to polars took some code that reads in a bunch of data and performs a few rounds of pretty standard operations (groupby, filtering, calculating means/stdevs) from ~1 minute per dataset down to ~1 second (yes, I tried the Arrow backend for pandas too). This was after spending some time profiling the pandas code and fixing up the slowest parts as best I could. The translation was pretty straightforward. The pipeline's output was a few different dataframes (each to be inserted into a separate table), each produced by its own function, so I was able to migrate one function at a time, asserting that the outputs of the two versions were identical and that all relevant tests passed (I used `to_pandas()` where needed).

I'm not sure how much faster I could go, since ~1 second/dataset was enough to answer some questions I had that required scanning values for a few parameters. The biggest wins for me were in grouping and merging operations.

I'm a complete convert now. The API is simpler and more obvious IMO, and the ability to compose expressions (`polars.Expr`) is awesome. The performance benefits are nice and what motivated me in the first place, but I'm more swayed by the aforementioned benefits.



Running on a very high core count server? Polars is definitely faster in single-threaded applications, but not 60x faster unless the work isn't comparable. Are you reading from parquet and only operating on some columns? That could also be it.

But yeah, polars is awesome, I'm all in on it.


I'm not including parsing time; both the pandas and polars versions started from an in-memory data structure parsed from two XML files (low GB range). This is on my workstation with a single Xeon 4210 (10 cores, 20 threads @ 2.20-3.20 GHz).

Perhaps I can focus on a subset of this processing and write it up, since it seems like there's at least some interest in real examples. As pointed out in a reply to a sibling comment, I don't guarantee that my starting code is the best that pandas can do -- to be honest, the runtime of the original code did not line up with my intuition of how long these operations should take. Maybe someone will school me, but either way, switching to polars was a relatively easy win that came with other benefits, and it feels right to me in a way that pandas never did.


Is polars not parallelizing some ops on the GPU?


It has zero GPU support for now.


Important point.

Nowadays, we write a pure pandas version, and when the data needs to be 100X bigger and faster, change almost nothing and have it run on the GPU via cudf, a GPU runtime that fully follows the pandas API. Most recently, we ported GFQL (Cypher graph queries on dataframes) to GPU execution over the holiday weekend, and it already beats most Cypher implementations. Think billions of edges traversed per second on a cheap 5-year-old GPU.

We're planning the bigger-than-memory & multi-node versions next, for both CPU + GPU, and while cudf leans towards dask_cudf, plans are still TBD. Polars, Ray, and Dask all have sweet spots here.


According to GitHub, 90% of pandas' codebase is written in Python, which probably means there's a lot of language overhead during operations compared to the Rust code in polars.

That, plus parallelism, probably explains the performance difference. If anything, 60x sounds conservative to me.


I think with parallelism that difference is realistic, but definitely not in single-core performance; most of pandas is implemented in numpy, which should be pretty fast.


Bloody hell!! Thanks, that's exactly the kind of comment I was hoping to see. Sounds like a bit of an Apache --> Nginx moment for dataframes. Super cool!!


To add some balance:

- I can't rule out that a pandas wizard could have achieved the same speed-up in pandas

- polars code was slightly more verbose. For example, when calculating columns based on other columns in the same chain: in pandas, each new column can be defined as a kwarg in a single call to `assign`, whereas in polars, columns that depend on others must be defined in their own calls to `with_columns`

- handling of categoricals in polars seemed a little underbaked, though my main complaint, that categories cannot be pre-defined, seems to have been recently addressed: https://github.com/pola-rs/polars/issues/10705

- polars is not yet 1.0, breaking changes will happen


Regarding your second point, you can use the walrus operator to retain the results of a computation within a single `.with_columns()` call. See https://stackoverflow.com/a/77609494

Edited to add: also, if you're using a lazy dataframe, you can just naively write the same operation twice (once to store it in a named column and once again in the subsequent computation), and Polars will use common subexpression elimination (CSE) to avoid recomputing the result. You can verify this using the `.explain()` method of a lazy dataframe operation containing the `.with_columns()` call.


That's awesome, thanks for sharing! Though tbh I'm not likely to use it... it's a bit too magical, if still a delicious hack.


I just edited my comment above to add more info about common subexpression elimination. It’s magic that happens behind your back on lazy dataframes. Polars is great!




