
Serverless is fantastic for ETL and data analysis, especially for workloads that vary in scale (e.g. cron jobs). Feed data in, get data out, with scaling as needed.



But how do you feed data in? Usually, it's some other service on one of the big 3 cloud providers. I'm using Google for my projects these days, so it's a mix of Google Pub/Sub and Dataflow.

I think this is the issue/risk with serverless. You either get locked into one of the big 3, or you end up doing all of the ops work to run your own stateful systems. As some of the people above you said, managing and scaling the stateless HTTP components is not the hard/expensive part of the job.


Can't you use a queue service that's essentially just managed Kafka/ActiveMQ/other-standard-system? I mean, sure, if you wanted to move off the cloud vendor you'd have to run your queues yourself, but if you're programming to the API of a well-known open-source queue system then you're never going to be very locked into a particular vendor.
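To make the "program to a well-known API" idea concrete, here is a minimal sketch, assuming the application only ever touches a small queue interface of its own. The names `MessageQueue`, `InMemoryQueue`, and `process_all` are hypothetical and not from any vendor SDK; a real Kafka or ActiveMQ adapter would implement the same two methods.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, List, Optional


class MessageQueue(ABC):
    """Hypothetical interface the application codes against."""

    @abstractmethod
    def publish(self, message: bytes) -> None:
        """Append one message to the queue."""

    @abstractmethod
    def poll(self) -> Optional[bytes]:
        """Return the next message, or None if the queue is empty."""


class InMemoryQueue(MessageQueue):
    """Stand-in backend for local runs; a managed or self-hosted
    queue adapter would replace this without touching callers."""

    def __init__(self) -> None:
        self._messages: Deque[bytes] = deque()

    def publish(self, message: bytes) -> None:
        self._messages.append(message)

    def poll(self) -> Optional[bytes]:
        return self._messages.popleft() if self._messages else None


def process_all(queue: MessageQueue) -> List[str]:
    """Stateless 'function' part: drain the queue and transform each message."""
    results: List[str] = []
    while True:
        message = queue.poll()
        if message is None:
            break
        results.append(message.decode("utf-8").upper())
    return results
```

Swapping backends then means writing one new adapter class, although the operational work of running the queue yourself is a separate cost that the interface does not remove.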


The short answer is yes, you can do that, but it starts to get nuanced rather quickly. The context of this is a desire to go “serverless” and that solutions like this only give you serverless for the relatively easy parts of your stack. If your goal is to go “serverless” I take that to mean a few things listed below.

    1) you don’t have to manage infrastructure
    2) you don’t have to think about infrastructure (what size cluster do I need to buy?)
    3) you pay for what you use at a granular level. (GB stored, queries made, function invocations, etc)
    4) scale to zero (when not in use, you don’t pay for much of anything)

Few things hit all of these points, and typical managed services hit very few of them. Sure, I can use a managed MySQL, but it only satisfies 1 of the 4 points.
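The four criteria above can be written down as a simple checklist. This is only a sketch: the class and field names are hypothetical, and marking managed MySQL as meeting just the first criterion (no infrastructure to manage) is my reading of the "1 of the 4" remark, not something stated explicitly.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ServerlessChecklist:
    """One flag per criterion in the list above (names are hypothetical)."""
    no_infra_management: bool    # 1) you don't manage infrastructure
    no_capacity_planning: bool   # 2) you don't size clusters
    granular_billing: bool       # 3) you pay per GB, query, invocation
    scales_to_zero: bool         # 4) idle costs next to nothing

    def points(self) -> int:
        """Count how many of the four criteria are met."""
        return sum(getattr(self, f.name) for f in fields(self))

    def unmet(self) -> List[str]:
        """Names of the criteria that are not met."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Assumption: managed MySQL removes server management, but you still
# pick an instance size and pay for it while it sits idle.
managed_mysql = ServerlessChecklist(
    no_infra_management=True,
    no_capacity_planning=False,
    granular_billing=False,
    scales_to_zero=False,
)
```

Here `managed_mysql.points()` is 1, and `managed_mysql.unmet()` lists the remaining three criteria.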


How does one get locked in when it’s a simple function in X language? Seriously, serverless is just an endpoint they provide. You write the code and they handle everything else.


Because the function is the stateless, easy part. To make any non-trivial system in a serverless way, you have to use their proprietary stateful systems. In my case: Google Pub/Sub, Google Dataflow, Google Datastore, Spanner, etc. That’s where the lock-in happens.


Right, because serverless is actually just a cover for "de-commoditizing" the cloud services that companies like AWS built to commoditize datacenters. You hit the nail on the head. It's not completely useless: it helps less technical people solve the problems that folks like you and me consider "the easy part", so people will find a use for it.

But the primary utility of serverless is an attempt at solving Amazon's problem of being commoditized by containers.


I’d say something more nuanced. Serverless increases commoditization of one layer of the stack at the cost of de-commoditizing a higher layer of the stack. This is what makes it a hard decision to grapple with. You’re getting very real benefits from it, and potentially paying a very real cost sometime down the road when being locked into the proprietary system bites you.


> all of the ops work to run your own stateful systems

Can we please call it "stateless" instead of "serverless"?


Well, there are stateful serverless products like Google Datastore.



