No need to use SQS here, assuming you don't need strict ordering behavior, since SNS can use Lambda functions as notification targets. The nice thing about SNS is that it can automatically fan out to multiple Lambda functions in parallel.
And then what if your Lambda function fails because a service dependency is down? The answer is, it retries twice and then you can have it go to a deadletter queue. If you want to have any configuration control over your retry or deadletter behavior, SQS is really handy.
It's also a useful pattern to, instead of configuring the SQS event itself as your Lambda trigger, have a scheduled trigger and then perform health checks on your service dependencies before pulling messages from the queue. This technique saves you wasted retries and even gives you more fine-grained control over deleting messages (the SQS mechanism for saying, "this is done and you don't need to retry it after the timeout). The SQS trigger, in contrast, deletes the entire batch of messages if the Lambda exits cleanly and does not delete any of them if the Lambda does not exit cleanly, which is a bit messy.
Also, you can have an SQS queue subscribe to multiple SNS topics, which means you can set up separate SNS topics for backfill or other use cases. This is especially useful in cross-account use cases, where often your online events are coming from an SNS from a different account, but you want to be able to inject your own testing and backfill events.
I usually don’t let the lambda SQS back end control message deletion for just that reason. I delete individual messages as I process them and throw an exception if any fail.
If a specific message fails to process multiple times, most often it’s because that particular message is malformed or invalid, or exposes an unhandled edge case in the processing logic, and will never successfully process, at which point continued attempts are both fruitless and wasteful.
It’s better to retry a finite number of times and stash any messages that fail to process offline for further investigation. If there is an issue that can be resolved, it’s possible to feed deadletter messages back in for processing once the issue is addressed.
Feeding your deadletter events back into your normal processing flow not only hides these issues but forces you to pay real money to fail to process them in an infinite loop that is hard if not impossible to diagnose and detect.
Using SQS as an event source, it will also fanout with the advantage of being able to process up to 10 messages at once, dead letter queues for SQS make a lot more sense than lambda DLQs etc.
But, I prefer to use SNS for producers and SQS for consumers and assign the SQS queue to a topic. With SNS+SQS you get a lot more flexibility.
Be careful the if there is possibility for a job to fail and need retry. SQS visibility and automatic dead lettering can give you this basically for free. I don't think SNS will.
1. You can have one SNS topic, send messages to it with attributes and subscribe to different targets based on the attribute.
2. You can have one SNS message go to different targets. There may be more than one consumer interested in an event.
3. “Priority” processing. I can have the same lambda function subscribed to both an SQS queue and an SNS topic. We do some things in batch where the message will go to SNS -> SQS -> lambda. But with messages with a priority of “high” they will go directly from SNS -> Lambda but they will also go through the queue. High priority messages are one off triggered by a user action.
By “fan out” I mean that SNS will execute as many Lambda functions as needed to handle the published events as they occur. This gives you much greater concurrency than a typical pub/sub architecture where you typically have a ceiling on the subscriber count or are forced to wait on consumers to process the earlier enqueued messages before processing subsequent ones (i.e., the head-of-line blocking problem).