
As a daily user of pandas for a few years now, I really must suggest that anyone looking to use it for serious data analysis familiarize themselves with the Split/Apply/Combine paradigm [0].

Lots of data munging has been enabled or sped up by judicious application of those concepts.

[0] https://pandas.pydata.org/pandas-docs/stable/groupby.html
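For anyone who hasn't met the idea yet, here is a minimal sketch of the three steps in plain Python, so it runs without pandas installed. The `readings` data and the helper names `split_apply_combine` and `mean` are made up for illustration; in pandas the rough equivalent would be something like `df.groupby("city")["temp"].mean()`.

```python
from collections import defaultdict

# Hypothetical sample data: (city, temperature) readings.
readings = [
    ("Oslo", 3.0),
    ("Lima", 19.0),
    ("Oslo", 5.0),
    ("Lima", 21.0),
    ("Oslo", 4.0),
]


def split_apply_combine(rows, apply):
    """Group (key, value) rows by key, reduce each group, return one dict."""
    # Split: bucket the values by their key.
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    # Apply the function to each group, then combine the results into a dict.
    return {key: apply(values) for key, values in groups.items()}


def mean(values):
    return sum(values) / len(values)


mean_by_city = split_apply_combine(readings, mean)
print(mean_by_city)
```

Swapping `mean` for `max`, `len`, or any other function of a list changes only the apply step, which is most of the appeal of the pattern.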

I agree. Hadley Wickham (a very prolific author of important R libraries) wrote a great paper about this strategy, built around his plyr library. I'm a Python + pandas user, but his paper really helped me understand the approach better: https://vita.had.co.nz/papers/plyr.pdf
