
Flink is pretty cool, but the Flink Python API and Beam Python APIs are pretty atrocious, and some of the higher-level APIs (like Flink SQL) are pretty hard to grok. I kicked the tires on Flink and tried to love it, but I couldn't get there. Storm (which is Heron's inspiration and is still better than Heron, IMO) is a lot simpler conceptually. But it's pretty clear Flink was built to avoid the need to combine Storm Streaming + Spark Batch for "Lambda Architecture" style setups.



Python's cool, but "right tool for the job" and all that. Just dive in with the Java or Scala APIs and you'll have a better time :-).


Thanks for the suggestion, but, pragmatically, it's not an option. My colleagues and I haven't written data processing, distributed systems, NLP/ML/datasci, or web tier code in Java since 2006, and we don't plan on starting now.

We've used Python at scale on petabyte-sized production data, multi-billion-request API tiers, and high-concurrency low-latency data processing all the while.

Java is the right tool for writing system-level code in some isolated contexts, but Python is the right tool for the job for a huge number of important use cases, with those use cases growing by the day.

I love Java (and appreciate Kotlin and Clojure), but with parallel compute power cheaper by the day, Python's focus on code simplicity, open source ecosystem, and programmer happiness continues to win the day.



