Scaling Pandas with Devin Petersohn – Weaviate Podcast #101
2 points by CShorten on July 17, 2024
Hey everyone! I am SUPER EXCITED to publish our 101st Weaviate Podcast with Devin Petersohn from Snowflake!

Devin has had a remarkable career so far in scaling dataframes from building Modin while at UC Berkeley to then marrying the project with Lux at Ponder, and eventually joining Snowflake!

This was one of the most educational conversations of my time hosting the Weaviate Podcast!! Devin explained all sorts of things from:

• Origins of working on the scaling dataframes problem
• What makes Pandas slower than SQL?
• Separating the API from the Execution Engine
• What is a Task Execution Engine?
• Query Optimization
• Materialized Views
• Innovation in File Formats
• How to read CSVs faster?
• gRPC, Serialization, and Apache Arrow
• The Separation of Storage and Compute
• CUDA Dataframes and RAPIDS
• Ponder
• And of course... Large Language Models!!

I hope you find this useful! Thank you so much Devin!!

YouTube: https://www.youtube.com/watch?v=r4XSsgyYR9c

