Show HN: Gazette: towards unification of batch and stream processing (github.com/gazette)
12 points by jgraettinger1 on Nov 11, 2019 | 2 comments



Hi HN,

I've been working on Gazette for a while now. I used it as a proprietary solution at my last couple of companies, and have now open-sourced it. I'd appreciate the community's take on it.

One of Gazette's contributions is that it unifies "my real-time event stream" and "all of my historical data" into a single source-of-truth with a representation that's super easy to integrate (plain old files on S3).

Going forward, what excites me is its potential to unify 1) stream processing, 2) Data Lake build-out, and 3) ad-hoc analysis (via BigQuery/Athena/Hive/etc external tables) into a single system. If that's interesting, I'd love to talk with you about it.


This looks very neat. I think you have exactly the right idea about offloading the physical storage to object stores like S3. I had the same idea some time ago, and am using a version of it for an internal analytics streaming system.

Thanks for making this open source. I'm considering an application where I might be able to use this.



