
Some of the examples provided in the paper are eloquent:

• df[df.a>3]
• df[df["a"]>3]
• df.loc[df.a>3]
• df.loc[df["a"]>3]

Not sure that I'd consider those all that eloquent, since they're just the product of two independent pieces of syntactic sugar (df.a being shorthand for df["a"], and df[<index filter>] being shorthand for df.loc[<index filter>]).
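To make that concrete, here is a minimal sketch showing the four spellings selecting the same rows. The DataFrame contents and the column "b" are made up for illustration; only the four expressions come from the quoted examples.

```python
import pandas as pd

# Toy data, invented for this example.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": list("vwxyz")})

# The four spellings from the quoted list.
results = [
    df[df.a > 3],
    df[df["a"] > 3],
    df.loc[df.a > 3],
    df.loc[df["a"] > 3],
]

# Every variant should be identical to the first one.
all_equal = all(r.equals(results[0]) for r in results)
print(all_equal)

# One caveat with attribute access: it only reaches a column when the
# name is a valid identifier that does not clash with an existing
# DataFrame attribute. A column called "count" is shadowed by the method.
df2 = pd.DataFrame({"count": [1, 2]})
shadowed = callable(df2.count)          # the method, not the column
column_ok = list(df2["count"]) == [1, 2]  # bracket access still works
```

So the shorthand is equivalent here, but df["a"] is the spelling that keeps working for arbitrary column names.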


Here's another example: https://stackoverflow.com/questions/49936557/pandas-datafram...

What's the difference between query() and loc()? Do they evaluate to the same thing under the hood? Is one better than the other? In what cases?

These are questions that don't have obvious answers at first sight.
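As a rough illustration of why the question comes up, here is a sketch comparing the two on a made-up DataFrame (the data and column names are assumptions, not taken from the linked question). It only checks that both return the same rows for a simple filter; it says nothing about what happens under the hood or which one is faster.

```python
import pandas as pd

# Toy data, invented for this example.
df = pd.DataFrame({"a": [1, 5, 2, 7], "b": [10, 20, 30, 40]})

# Boolean-mask selection through .loc
via_loc = df.loc[df["a"] > 3]

# The same condition written as a query string
via_query = df.query("a > 3")

same = via_loc.equals(via_query)
print(same)
```

The results match for this case, which is exactly why the choice between them is not obvious from the API alone.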


Well, that's kind of the point. What's the purpose of the syntactic sugar? Is it just that, or is there some hidden performance difference? This is not clear at first sight.


The point is to "Huffman encode" the API for expressing near-boilerplate, like Unix command names and flags.

The problem is that there is no simple, logically coherent API to fall back on when you haven't memorized all the shortcuts. And the author only allows "tax form" APIs (what he calls "Pythonic/Pandonic"), where every parameter is a single atomic step, so it's laborious to express things like tree-structured queries that are more complex than parameter dictionaries.



