
How much data do your "data pipeline apps" process with Kafka, Kubernetes and probably other K-named things?



Several terabytes a day, but it varies, which is why I love the auto-scaling.


That's 10 TB = 10,000 GB, so 10,000 GB / (24 × 60 × 60 s) ≈ 0.12 GB/s. My 2015 desktop could handle that.
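Sanity-checking that arithmetic (assuming the 10 TB/day figure and decimal units), a quick Python sketch:

    # Back-of-the-envelope throughput check: 10 TB/day, decimal units.
    tb_per_day = 10
    gb_per_day = tb_per_day * 1000            # 10,000 GB
    seconds_per_day = 24 * 60 * 60            # 86,400 s
    gb_per_second = gb_per_day / seconds_per_day
    print(f"{gb_per_second:.2f} GB/s")        # ~0.12 GB/s, i.e. roughly 120 MB/s sustained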

Where do you work? I'd like to pitch my radical idea of edge consolidated cloud computing.


"Big data" frameworks are very good at throwing a lot of hardware at problems. See eg the classic big data vs laptop treatment: http://www.frankmcsherry.org/graph/scalability/cost/2015/01/...


The inevitable comment lol. I'm sure it could. To clarify, when the app scales, it's the pods (container instances) that are scaled, not necessarily machines; the machines are managed separately by k8s, depending on how much of the cluster is currently in use.

Yep, we could handle it easily on one machine if we gave it unfettered access to all the resources, but our throughput varies throughout the day, so dedicating a machine sized for peak throughput wouldn't make financial sense at 3am. It's far simpler to scale pods with known resource usage as needed.

It also helps prevent issues when throughput suddenly doubles because of a business decision you were left out of the loop on.

So autoscaling pods is ideal for our use case.
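For anyone unfamiliar, the usual way to get that behaviour on k8s is a HorizontalPodAutoscaler. Purely an illustrative sketch (the name, replica bounds and CPU target are made up, and a Kafka pipeline might well scale on consumer lag instead), written as a Python dict matching the autoscaling/v2 schema:

    # Hypothetical HPA for a "pipeline-consumer" deployment:
    # replica count tracks average CPU utilization across pods.
    hpa = {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": "pipeline-consumer"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": "pipeline-consumer",
            },
            "minReplicas": 2,     # floor for the 3am lull
            "maxReplicas": 20,    # ceiling for the daytime peak
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization", "averageUtilization": 70},
                },
            }],
        },
    }

You'd apply something like this via kubectl or the Kubernetes API; the point is that replicas, not machines, track the load.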


We're talking about $5k of hardware to meet your needs 20 times over with the new Threadrippers.

Is your AWS bill really under $500 a month?
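Naive break-even on those two numbers (ignoring power, rack space, redundancy and ops time):

    # $5k one-off hardware vs a hypothetical $500/month cloud bill.
    hardware_cost = 5000
    monthly_cloud_bill = 500
    print(hardware_cost / monthly_cloud_bill)   # 10.0 months to break even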


On AWS you would be paying for many other things besides elastic CPU power. CPU and bandwidth are infamously expensive there. (But I don't think AWS was mentioned anywhere in this thread...)



