
Show HN: Gazette: towards unification of batch and stream processing - jgraettinger1
https://github.com/gazette/core
======
jgraettinger1
Hi HN,

I've been working on Gazette for a while now. I used it as a proprietary
solution at my last couple of companies, and have now open-sourced it. I'd
appreciate the community's take on it.

One of Gazette's contributions is that it unifies "my real-time event stream"
and "all of my historical data" into a single source-of-truth with a
representation that's super easy to integrate (plain old files on S3).

Going forward, what excites me is its potential to unify 1) stream processing,
2) Data Lake build-out, and 3) ad-hoc analysis (via BigQuery/Athena/Hive/etc.
external tables) into a single system. If that's interesting, I'd love to talk
with you about it.

------
atombender
I think this looks very neat. I think you have exactly the right idea about
offloading the physical storage to object stores like S3 -- I had the same
idea some time ago, and am using a version of it for an internal analytics
streaming system.

Thanks for making this open source. I'm considering an application where I
might be able to use this.

