I think the SQS and SNS naming convention is a bit confusing. Realizing that SNS should be the typical choice for an event-based architecture is weird to people used to working with a message queue.
SNS and SQS work together but serve two different purposes.
SNS is for producers. You use SNS to say something happened. You have no guarantee that any consumer actually successfully processed the message and if the consumer is down, that message is lost to them.
The only way a consumer can process a message directly is by subscribing via HTTP or Lambda. But again, if the consumer errors or is down, you’re out of luck.
SQS is a traditional simple queueing mechanism. It has no fanout capability on its own but you do get the traditional queuing functionality. But it doesn’t make sense logically for more than one process (or group of processes doing the same thing) to consume the queue.
If you want the traditional fanout, filtering, multiple queues that do different things on the same event/message, you use SNS and SQS together.
>If you want the traditional fanout, filtering, multiple queues that do different things on the same event/message, you use SNS and SQS together.
For this sort of use case, I've generally opted to go with a Kinesis stream, and I'm having a bit of a hard time understanding why a mix of SNS and SQS would be better here.
For a simple “something happened” with a message and attributes, Kinesis is overkill and not as flexible.
You can publish an SNS message with attributes and subscribe to that topic with any combination of SQS queues, Lambda functions, emails, HTTP endpoints, etc. For any of those subscriptions, you can set it up so that it only gets messages that match conditions on the attributes.
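To make the attribute filtering concrete, here is a minimal in-memory sketch of the idea. It is not the SNS API: the subscription names and the `event_type` attribute are made up for illustration, and only exact string matching is modeled, whereas real SNS filter policies support more operators.

```python
# Hypothetical subscriptions, each with a filter policy that maps an
# attribute name to the list of values that subscription accepts.
SUBSCRIPTIONS = {
    "billing-queue": {"event_type": ["order_placed", "order_refunded"]},
    "shipping-queue": {"event_type": ["order_placed"]},
    "audit-lambda": {},  # no conditions: receives every message
}


def matches(policy, attributes):
    """Return True if every attribute named in the policy has an accepted value.

    Simplified model: exact string matching only.
    """
    return all(attributes.get(name) in accepted for name, accepted in policy.items())


def fan_out(attributes):
    """Return the sorted names of the subscriptions that would get the message."""
    return sorted(
        name for name, policy in SUBSCRIPTIONS.items() if matches(policy, attributes)
    )
```

Calling `fan_out({"event_type": "order_refunded"})` returns only the billing and audit subscriptions, which is the "different consumers do different things with the same event" behavior described above.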
Also with SQS, you get the standard granular retries, dead letter queues, etc. Yes, with Kinesis you can do shards, but it really doesn’t make sense to have more than one process reading messages from the same shard to scale out processing. With SQS, you can autoscale instances to read from the queue based on the queue size, or just subscribe an SQS queue to the SNS topic, have a Lambda consume that queue, and let AWS work its magic.
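For anyone who hasn't used the retry and dead letter queue behavior, here is a rough in-memory model of it. It is a sketch, not SQS itself: `drain`, `handler`, and the default of 3 are assumptions for illustration, with the limit playing the role of `maxReceiveCount` in a queue's redrive policy.

```python
from collections import deque

MAX_RECEIVE_COUNT = 3  # assumed value, analogous to maxReceiveCount


def drain(messages, handler, max_receive_count=MAX_RECEIVE_COUNT):
    """Process messages with retries and return (processed, dead_lettered)."""
    # Each entry is (body, number of times it has been received so far).
    queue = deque((body, 0) for body in messages)
    processed, dead_lettered = [], []
    while queue:
        body, receives = queue.popleft()
        receives += 1
        try:
            handler(body)
            processed.append(body)  # success: the message is deleted
        except Exception:
            if receives >= max_receive_count:
                dead_lettered.append(body)  # retries exhausted: goes to the DLQ
            else:
                queue.append((body, receives))  # becomes visible again for a retry
    return processed, dead_lettered
```

The point is that one bad message gets a bounded number of attempts and then lands somewhere you can inspect it, instead of blocking or silently disappearing.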
This actually has me rethinking the architecture on a project I'm working on right now. It looks like Kinesis would be a little bit cheaper at the volume of data I'm looking at... But the SNS/SQS method will let me sidestep some potential future scale-up concerns I had with the 5 reads per second limit without making a Rube Goldberg machine of Kinesis Analytics feeding into additional Kinesis Streams, which would drive the cost up higher than an SNS/SQS fanout.
5 reads per second - yeah, that’s kind of low. At maximum scale-up, I am processing 80 messages simultaneously on 8 instances; each instance is running 10 (I/O bound) asynchronous threads. I could push it higher but the database starts screaming.
But with Kinesis, it’s true that you can do only 5 reads per second per shard, but each read can have up to 10,000 records. With SQS, you can only get 10 records per call. I would think you could get much higher throughput with Kinesis; you would just have to handle storing your iterator/sequence number per shard somewhere to know where you left off in case of a crash.
Kinesis is much better for higher throughput and you can always scale up instead of out if you need consistent throughput. It depends on your use case.
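A quick back-of-the-envelope version of that comparison, using only the numbers quoted in this thread. It counts records and ignores any size-based limits, so treat the results as upper bounds rather than what you would see in practice.

```python
import math

KINESIS_READS_PER_SECOND = 5       # read limit quoted in the thread
KINESIS_RECORDS_PER_READ = 10_000  # maximum records per read quoted in the thread
SQS_MESSAGES_PER_CALL = 10         # maximum messages per receive call quoted in the thread


def calls_needed(records, batch_size):
    """Number of read calls needed to pull `records` items at full batch size."""
    return math.ceil(records / batch_size)


def kinesis_records_per_second():
    """Record-count ceiling for one reader at the quoted read limit."""
    return KINESIS_READS_PER_SECOND * KINESIS_RECORDS_PER_READ
```

With these numbers, `kinesis_records_per_second()` is 50,000, and pulling a million records takes 100 full Kinesis reads versus 100,000 SQS receive calls.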
Yeah, my primary concern is that there were some other teams expressing interest in hooking up to the stream for working with the data in real time.
I know of a few workarounds - if the data is also being sent to S3, having some services working off of S3 events instead of the stream directly, or using Kinesis Analytics to fan out to additional Kinesis streams.
I might also just hook up SNS to the stream via Lambda and have them fan out from SNS. Hmm.
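Roughly what I have in mind for the Lambda in the middle, as a sketch. `forward_records` and the `publish` callable are hypothetical names; in a real function `publish` would presumably wrap an SNS client call for a specific topic. The only thing taken from AWS here is the event shape, where each record's payload arrives base64-encoded under `kinesis.data`.

```python
import base64


def forward_records(event, publish):
    """Decode each Kinesis record in a Lambda event and pass it to publish().

    Returns the number of records forwarded.
    """
    count = 0
    for record in event.get("Records", []):
        # Kinesis payloads are base64-encoded inside the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        publish(payload)
        count += 1
    return count
```

Passing `publish` in keeps the decoding logic testable without touching AWS, and the other teams would then subscribe their own queues to the topic instead of reading the stream directly.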
Keep in mind -- SNS is a push source for Lambda. When doing an approach like SNS->Async Lambda->...->Async Lambda it can become easy to saturate the event bus for your account.
(Amazon employee here, but I have no idea what the SQS/SNS groups are doing.)
I always wished that SQS/SNS/Kinesis/et al. could all be grouped under an "AWS Pub/Sub" brand that encompasses those features. I understand how those systems interact, but damn, it can be confusing to someone who's new.
I agree. When I first started getting my feet wet in AWS, I couldn’t for the life of me figure out why something as basic as the fanout pattern would be missing from SQS. I read the documentation and thought I had to be missing something until I figured out that you had to use SNS+SQS. That still is completely illogical.
That being said, I made it an edict on my team that no process could put anything directly in an SQS queue; they had to publish to SNS and subscribe a queue to it.
But, keep making things obtuse. The more obtuse things are with AWS the more money I will be able to charge in my next life as an overpriced AWS Architect.