This is a very neat approach, and I agree with the premise that we need a framework that unifies some of the architecture of the cloud; shuttle.rs has some thoughts here. I do take issue with this framing:
- Trigger the lambda via HTTP endpoint, S3, or API gateway ($)
* Pretending that starting a fly machine doesn't cost the same as triggering via s3 seems disingenuous.
- Write the bespoke lambda to transcode the video ($)
* In Go this would be about as difficult as FLAME -- you'd have to build a different entrypoint, which would be about one line of code, but it could be the same codebase. In Node it would depend on bundling, but in theory you could do the same -- it's just a promise that takes an S3 event; that doesn't seem much different.
- Place the thumbnail results into SQS ($)
* I wouldn't do this at all. There's no reason the results need to be queued. Put them in a deterministically named s3 bucket where they'll live and be served from. Period.
- Write the SQS consumer in our app (dev $)
* Again -- this is totally unnecessary. Your application *should forget* it dispatched work. That's the point of dispatching it. If you need subscribers to notice it or do some additional work I'd do it differently rather than chaining lambdas.
- Persist to DB and figure out how to get events back to active subscribers that may well be connected to other instances than the SQS consumer (dev $)
* Your lambda really should be doing the DB work not your main application. If you've got subscribers waiting to be informed the lambda can fire an SNS notification and all subscribed applications will see "job 1234 complete"
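The "same codebase, different entrypoint" and deterministic-naming points above can be made concrete with a minimal Go sketch. All names here are hypothetical, and the `S3Record` struct stands in for the real event types from the aws-lambda-go `events` package:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Shared application code: the same function the main app would call
// directly. (Hypothetical name, for illustration only.)
func ThumbnailKey(videoKey string, index int) string {
	base := strings.TrimSuffix(videoKey, path.Ext(videoKey))
	return fmt.Sprintf("thumbnails/%s/%03d.jpg", base, index)
}

// Minimal stand-in for an S3 event record (a real handler would use the
// aws-lambda-go events.S3Event type instead).
type S3Record struct {
	Bucket, Key string
}

// The "different entrypoint": a thin handler that reuses the shared code.
func HandleS3Event(records []S3Record) []string {
	var keys []string
	for _, r := range records {
		// In a real lambda this is where the transcoder would run; here
		// we just compute the deterministic key the result would land at.
		keys = append(keys, ThumbnailKey(r.Key, 0))
	}
	return keys
}

func main() {
	out := HandleS3Event([]S3Record{{Bucket: "videos", Key: "uploads/cat.mp4"}})
	fmt.Println(out[0]) // thumbnails/uploads/cat/000.jpg
}
```

Because the key is a pure function of the input, both the lambda and the main app can compute where a thumbnail lives without any queue in between.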
So really the issue is:
* s3 is our image database
* our app needs to deploy an s3 hook for lambda
* our codebase needs to deploy that lambda
* we might need to listen to SNS
which is still some complexity, but it's not the same and it's not using the wrong technology like some chain of SQS nonsense.
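The "S3 hook for lambda" piece of that list is a bucket notification configuration. A sketch of roughly what it looks like -- the ARN, account ID, prefix, and suffix are all placeholders:

```json
{
  "LambdaFunctionConfigurations": [
    {
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:make-thumbnails",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "prefix", "Value": "uploads/"},
            {"Name": "suffix", "Value": ".mp4"}
          ]
        }
      }
    }
  ]
}
```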
Thanks for the thoughts – hopefully I can make this more clear:
> * Pretending that starting a fly machine doesn't cost the same as triggering via s3 seems disingenuous.
You're going to be paying for resources wherever you decide to run your code. I don't think this needs to be spelled out. The point about costs is that rather than paying to run "my app", I'm paying at multiple layers to run a full solution to my problem. Lambda gateway requests, S3 puts, and SQS inserts each have their own separate costs. You pay a toll at every step instead of a single toll on Fly or wherever you host your app.
> * I wouldn't do this at all. There's no reason the results need to be queued. Put them in a deterministically named s3 bucket where they'll live and be served from. Period. This is totally unnecessary. Your application should forget it dispatched work. That's the point of dispatching it. If you need subscribers to notice it or do some additional work I'd do it differently rather than chaining lambdas.
You still need to tell your app about the generated thumbnails if you want to persist the fact they exist, where you placed them in S3, how many exist, where you left off, etc.
> * Your lambda really should be doing the DB work not your main application. If you've got subscribers waiting to be informed the lambda can fire an SNS notification and all subscribed applications will see "job 1234 complete"
This is exactly my point. You bolt on ever more Serverless offerings to accomplish any actual goal of your application. SNS notifications are exactly the kind of thing I don't want to think about, code around, and pay for. I have Phoenix.PubSub.broadcast and I continue shipping features. It's already running on all my nodes and I pay nothing for it because it's already baked into the price of what I'm running – my app.
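Phoenix.PubSub is Elixir-specific; for readers outside that ecosystem, the in-process broadcast idea looks roughly like this Go toy (single-process only -- Phoenix.PubSub additionally fans out across every connected node, which is the part you'd otherwise pay SNS for):

```go
package main

import (
	"fmt"
	"sync"
)

// Hub is a toy in-process pub/sub: subscribers register a channel per
// topic, and Broadcast fans a message out to every subscriber. This is
// a single-node analogue, not a distributed PubSub.
type Hub struct {
	mu   sync.RWMutex
	subs map[string][]chan string
}

func NewHub() *Hub {
	return &Hub{subs: make(map[string][]chan string)}
}

// Subscribe registers interest in a topic and returns a receive channel.
func (h *Hub) Subscribe(topic string) <-chan string {
	ch := make(chan string, 1) // buffered so Broadcast never blocks here
	h.mu.Lock()
	h.subs[topic] = append(h.subs[topic], ch)
	h.mu.Unlock()
	return ch
}

// Broadcast delivers msg to all current subscribers of topic.
func (h *Hub) Broadcast(topic, msg string) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	for _, ch := range h.subs[topic] {
		ch <- msg
	}
}

func main() {
	hub := NewHub()
	done := hub.Subscribe("jobs")
	hub.Broadcast("jobs", "job 1234 complete")
	fmt.Println(<-done) // job 1234 complete
}
```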
> This is exactly my point. You bolt on ever more Serverless offerings to accomplish any actual goal of your application. SNS notifications is exactly the kind of thing I don't want to think about, code around, and pay for. I have Phoenix.PubSub.broadcast and I continue shipping features. It's already running on all my nodes and I pay nothing for it because it's already baked into the price of what I'm running – my app.
I think this is fine if and only if you have an application that can subscribe to PubSub.broadcast. The problem is that not everything is Elixir/Erlang or even the same language internally to the org that runs it. The solution (unfortunately) seems to be reinventing everything that made Erlang good but for many general purpose languages at once.
I see this more as a mechanism to signal the runtime (the combination of Fly machines and Erlang nodes running on those machines) that you'd like to scale out for some scoped duration, but I'm not convinced that this needs to be initiated from inside the runtime in most cases -- why couldn't something like this be achieved externally, by noticing a high watermark of usage and adding nodes, much like a Kubernetes horizontal pod autoscaler?
Is there something specific about CPU-bound tasks that makes this hard for Erlang that I'm missing?
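The HPA-style loop suggested above is simple enough to sketch. This is illustrative only -- the thresholds are made up, and a real controller would poll a metrics endpoint and call the platform API (e.g. the Fly Machines API) with the result:

```go
package main

import "fmt"

// DesiredNodes is the core decision of a high-watermark autoscaler: if
// utilization crosses the high mark, add a node; below the low mark,
// remove one (never dropping below a single node); otherwise hold.
// Thresholds and step size here are purely illustrative.
func DesiredNodes(current int, utilization, highMark, lowMark float64) int {
	switch {
	case utilization > highMark:
		return current + 1
	case utilization < lowMark && current > 1:
		return current - 1
	default:
		return current
	}
}

func main() {
	fmt.Println(DesiredNodes(3, 0.92, 0.80, 0.30)) // scale out: 4
	fmt.Println(DesiredNodes(3, 0.10, 0.80, 0.30)) // scale in: 2
	fmt.Println(DesiredNodes(3, 0.50, 0.80, 0.30)) // hold: 3
}
```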
Also, not trying to be combative -- I love the Phoenix framework and the work y'all are doing at Fly, especially you, Chris. Just wondering if/how this abstraction leaves the walls of Elixir/Erlang, which already has it significantly better than the rest of us for distributed abstractions.
You're literally describing what we've built at https://www.inngest.com/. I don't want to talk about us much in this post, but it's so relevant it's hard not to bring it up. (Huge disclaimer here, I'm the co-founder).
In this case, we give you global event streams with a durable workflow engine that any language (currently TypeScript, Python, Go, Elixir) can hook into. Each step (or invocation) is backed by a lightweight queue, so queues are cheap and are basically a one-line wrapper around your existing code. Steps run as atomic "transactions" which must commit or be retried within a function, and are as close to exactly-once as you can get.