Have you actually seen this happening?

I have terabytes of new data arriving per day that need to be verified, ingested and translated. There simply is no way to do this via online or stream processing. If you're ingesting clickstream or Twitter data then sure, it will work. But more often than not you need to work with sets of data. And for that, batch processing is the only option.

JD @ Pachyderm here. We try to talk to as many people as possible about their data infrastructure, so I have a decent sample size on this. My experience is that there are specific use cases for which streaming solutions are gaining ground, but there are still a lot for which they aren't and probably never will be. Streaming is great, but there are a lot of situations where it just doesn't fit. I think what might be going on is that there isn't a lot of chatter surrounding batch-based processing, so it seems like it's going out of style, but the use cases are still real.

What is it about stream processing that doesn't fit your needs? I do research and development on a streaming system, so I'm very interested to hear about how you need to use your data.

So, I've got to ask... what do you actually get for that data? Like, what does your business actually use it for?

Are you folks doing sensor analysis or something, or eating logfiles, or what?

Financial Services with millions of customers (and growing rapidly).

What we can get for that data is a major competitive advantage. We can offer much cheaper financial products since we model risk individually rather than as a cohort. It also allows us to have a single customer view despite reselling other companies' products.

Building a single customer view with lots of disparate data sets is a big trend right now.

Ah, okay, so that makes sense. I've seen a few cases where data is just indiscriminately hoovered because reasons, and I can never help but wonder what the expected ROI on it is.

