
Google Dataflow: A Unified Model for Batch and Streaming Data Processing [video] - espeed
https://www.youtube.com/watch?v=3UfZN59Nsk8
======
buremba
It's strange: the service looks quite promising, but I don't know of any
company that uses it. Is it not mature enough yet (which would be kind of a
strange argument for a managed service), or is it hard to use? I skimmed the
documentation, but the pricing model seemed unclear compared to AWS's
managed services.

~~~
alooPotato
We use it here at Streak for streaming log processing. We have it set up such
that our backend and client-side logs are streamed to our "dataflow job", where
we do some pretty simple processing/transformations, and then the results are
output to BigQuery. It sounds simple, but there is a lot of complexity it's
hiding when you're streaming at a large enough scale. We know because we built
our own infrastructure for this at first and it sucked.
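
To give a rough idea of the "simple processing/transformations" step, here is
a minimal sketch in plain Python. It is not our actual job and does not use
the Dataflow SDK; the JSON log format, the field names (level, message, ts)
and the function names are all assumptions made up for illustration.

```python
import json
from typing import Iterable, Iterator, Optional


def parse_log_line(line: str) -> Optional[dict]:
    """Parse one JSON log line; return None if it is malformed."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    # Only JSON objects are treated as valid log records.
    return record if isinstance(record, dict) else None


def to_row(record: dict, source: str) -> dict:
    """Turn a parsed record into a flat row for a table sink (hypothetical schema)."""
    return {
        "source": source,  # e.g. "backend" or "client"
        "level": str(record.get("level", "INFO")).upper(),
        "message": record.get("message", ""),
        "timestamp": record.get("ts"),
    }


def process(lines: Iterable[str], source: str) -> Iterator[dict]:
    """Lazily parse and transform log lines, dropping malformed ones."""
    for line in lines:
        record = parse_log_line(line)
        if record is not None:
            yield to_row(record, source)
```

The hard part is not this logic but running it continuously over an unbounded
stream (scaling workers, retries, ordering), which is what the managed service
takes off your hands.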

As for pricing, it just consumes your regular Google Compute Engine
instances, so it's all based on how big your jobs are and how long they run
for.
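
As a back-of-the-envelope sketch of that "size times duration" idea (the
function name and the hourly rate parameter are hypothetical; plug in whatever
your instance type actually costs, and note this ignores any other charges):

```python
def estimate_job_cost(num_workers: int, hours: float, hourly_rate_per_worker: float) -> float:
    """Rough cost estimate: workers x hours x per-instance hourly rate.

    hourly_rate_per_worker is a placeholder you must supply yourself;
    no real price is assumed here.
    """
    if num_workers < 0 or hours < 0 or hourly_rate_per_worker < 0:
        raise ValueError("all inputs must be non-negative")
    return num_workers * hours * hourly_rate_per_worker
```

So a job that uses twice as many instances, or runs twice as long, costs
roughly twice as much.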

