Amazon Web Services in Plain English (expeditedssl.com)
483 points by ccnafr 4 months ago | 105 comments

Azure service names by comparison: Virtual Machines, Storage, Functions, Pipelines, Archive Storage, Active Directory, Repos [1]. Personally, I am not irked by less-descriptive names as much as by the horde of acronyms: S3, EC2, SQS, SES, SNS.

[1]: https://azure.microsoft.com/en-in/services/

What is:

- Databricks?

- HDInsight?

- Data Factory?

- CycleCloud?

- SignalR? (Does that have anything to do with the R programming language?) I know what SignalR is, but if I weren't a C# developer I wouldn't.

I’ll give you that most of the names make sense, but it will still take me the same year to be confident that I was architecting the correct solution on Azure that it took me on AWS.

On the other hand, using "Azure DevOps" for CI/CD is just plain misleading.

Sure, a lot of AWS services could have been named better. But a lot of these are terrible, downright misleading names.

Some examples:

S3 as "Amazon Unlimited FTP Server": S3 has nothing to do with the FTP protocol.

VPC as "Amazon Virtual Colocated Rack": VPCs don't have anything to do with colocation (they can span AZs), much less physical racks.

Lambda as "AWS App Scripts": really disagree with this one. Their intent is to be microservices or event handling functions, not scripts. "Scripts" isn't really descriptive but usually implies a manually invoked automation that modifies some kind of stateful resource (such as files). Lambda is basically the exact opposite.

There are lots more that are problematic or overly simplified as well.

I don't say this to crap on the effort, just... be careful if you're relying on this for your understanding of what AWS does.

You can tell the age/mentality of the person who wrote some of these names as they match very closely with my mental models. An "FTP Server" is inaccurate, but perfectly descriptive for me. Scripts is the same thing, "a small program which gets run repeatedly for a very specific task." So I think this is more of a guide for a 30-40 something dev who hasn't bothered to keep up with the proliferation of services.

I agree with you, and to add: I think FTP being used to describe S3 is kinda like "Coke" being used to describe all cola-based soft drinks. FTP here is a generic descriptor for transferring files; the protocol part is as unimportant as the difference between Coke and Pepsi when you say "I'd like a Coke."

All of the descriptors used for the AWS stack have a similar ring to them, with an intended audience of someone a little older and with a little, or more, legacy knowledge about tech. Oddly, I could see this being useful for two use cases:

1. Explaining something to my mom, now over 60 and no longer in the know with regard to tech. She might remember what FTP or colocation is.

2. Explaining to an older, very siloed technical person; I'm thinking enterprise/gov networking in a place where they don't get exposure to newer cloud-stack stuff (banking, healthcare, FEMA).

I'm in that 30-40 range and have been programming for 20 of those years. I definitely would shy away from anything with FTP in the name because of the protocol's historical problems, even if that wasn't a fair assumption for the service. Not having FTP in the name was a good decision by Amazon in my opinion. Now, is S3 a good name? I don't know.

Whenever I read FTP, I assume very legacy, very insecure protocols.

Mostly because FTP has been considered insecure and kinda deprecated since 1999, and we've had better things (eg: SFTP) since.

So, for somebody in their forties it might be a better name, but for somebody younger, it's definitely misleading.

yeah but starting around 2005 if you said "ftp server" it was implied that you would still use encryption via sftp or ftps. Few people will actually spell out sftp server.

SFTP servers are still the backbone of a lot of data transfers.

Apparently APIs are really hard for most companies.

You are missing the whole point. As a designer that dabbles a bit in code and servers, I literally have NO idea what most of the words you just used mean.

I know that S3 has nothing to do with FTP, but it's a nice analogy that I can relate to. Get off your high horse.

Here are some better names for S3 that aren't misleading:

- Binary Storage

- Blob Storage

- File Storage

- Reliable Storage

FTP is a specific protocol. Hard drives (as a sibling comment mentions) are a specific technology. Our discipline is confusing enough that names should be precise, not analogical.

That still seems unnecessarily nitpicky and like it misses the point of the parallel. I'm reminded of the running joke on Californication:

"This is Santa Monica Cop. It's my innovative, original series about a hip, streetwise cop who brings his own brand of policing to a much wealthier area than he's used to."

'Oh, like ... Beverly Hills Cop, then?'

"Nah, you're not getting it! See, it's in Santa Monica!"

Yes, S3 doesn't natively allow retrieval/upload via the FTP protocol. But at the level of abstraction of "what does this service do for me, and why would I want to dig into the docs and incorporate it into my system?", the FTP analogy communicates the use case.

"This is S3. It's a way to map a bunch of keys -- which look like directories and naturally follow a kind of hierarchical structure -- to opaque blobs of data, allowing CRUD operations."

'Oh, like an FTP server, then?'

"Nah, you're not getting it! See, you talk to it with a different protocol."

"binary" and "blob" storage makes it even more confusing IMO. How about "file storage where you can only ever read, write or completely re-write (but never only partially modify) an existing file and there are no directories but the file names allow slashes so you can fake directories, and paths are called 'keys', where all the files you upload are publicly accessible via HTTP if you enable that and otherwise are accessible via a widely supported but annoying REST-based API and signed URLs, and where only download bandwidth and monthly storage (but not upload bandwidth) is billed, and is billed based on exact usage with no minimum recurring fee". I guess that's too long though.
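That mouthful of a description can be condensed into a toy in-memory model (purely illustrative, not the real S3 API or boto3): keys map to opaque blobs, writes replace whole objects, and "directories" are faked by grouping keys on a delimiter, the way S3's list operation does with prefixes.

```python
# Toy in-memory model of S3's object semantics (illustrative only, not boto3).
class ToyBucket:
    def __init__(self):
        self._objects = {}  # key -> bytes; keys are flat strings, not paths

    def put(self, key, data):
        # Writes always replace the whole object; no partial modification.
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def list(self, prefix="", delimiter="/"):
        # "Directories" are faked by collapsing keys up to the next delimiter,
        # mimicking S3's prefix/delimiter listing.
        keys, common = [], set()
        for key in self._objects:
            if not key.startswith(prefix):
                continue
            rest = key[len(prefix):]
            if delimiter in rest:
                common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
            else:
                keys.append(key)
        return sorted(keys), sorted(common)

bucket = ToyBucket()
bucket.put("logs/2019/01.txt", b"jan")
bucket.put("logs/2019/02.txt", b"feb")
bucket.put("readme.txt", b"hi")
print(bucket.list())                    # (['readme.txt'], ['logs/'])
print(bucket.list(prefix="logs/2019/")) # (['logs/2019/01.txt', 'logs/2019/02.txt'], [])
```

Nothing here is a directory; "logs/" only appears in the listing because the keys happen to contain slashes, which is exactly the part that both "FTP server" and "file storage" gloss over.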

"File Storage" is a bad idea for the same reasons comparing to FTP is. It might work for a near layman, but it can actually communicate misinformation to someone who knows slightly more about storage. Specifically, S3 is Object Storage, as opposed to File or Block Storage.

This is why "simple descriptions" are hard. Leaving nuance behind makes a lot of room for confusion.

Since S3 was more or less the first service to have the concept of Object storage, I think we would have been able to deal with ‘File’ storage.

Also, I think the difference is more between filesystem and object storage. You are still uploading files to both.

Yes, I'm still waiting for an "unlimited pay-as-you-go SFTP service" that uses something like S3 as a backend, but you connect with SFTP.

> Get off your high horse.

Please keep the discussion respectful.

the original comment offers some constructive criticism because the article does have the potential to misinform novice users about what certain services do and the technologies they use. yours is more of a directed personal attack; keep in mind that this is hn. good on you for knowing a priori s3 is not ftp. but i wouldn't expect someone new to development to necessarily know this. as a result, they could have an even harder time figuring out how to do things if, for example, they search for advice on how to connect to the s3 ftp server.

That said, I would expect someone learning web development to start out the old way like we all did with static HTML and a simple apache server and work their way up to newer things from there. Not doing this runs the risk of completely missing the point of how everything works.

To what end? Would you also teach someone binary, assembly, and C before you taught them Java?

I see your approach in a lot of fresh graduates. They’ve learned things from the ground up, which means a good chunk of what they’ve learned is useless historic knowledge that doesn’t actually teach you a lot about how the modern eco-system of the web works.

That’s time they could have spent learning docker and authentication security. I mean, it’s often the one guy/girl who’s spent their free time setting up ADFS authentication in AWS for the fun of it, who gets the job.

But why wouldn’t you teach them that instead of teaching them legacy stuff they’ll never use, which also isn’t really that handy for understanding how things work now?

We’re supposed to stand on the shoulders of giants, and having struggled with Apache, or better yet IIS, is just so utterly useless.

There is still a good reason to teach C and assembly even if they end up never using it in the real-world. Just having to manually manage memory teaches one how to be careful because the result is a hard-to-debug segmentation fault if one isn't. It is also teaches how the machine works at a low-level which is extremely helpful in writing performant code in any other language.

Should they learn older technologies like Apache or IIS? No, learning those technologies teach few, if any, transferable skills. Should they build a web server from scratch in C? Yes, that definitely teaches transferable skills even if they use Node or Go or whatever the latest web tech stack is later. Should they also learn the latest web tech stack? Absolutely.

Does C really teach you how to be careful with memory in Java?

I think teaching big O and focusing on efficiency is much better than teaching someone C. Especially in the modern world, where garbage collection isn’t bad and memory is abundant.

I mean, we use a lot of Python and a lot of JS; both are fairly inefficient on the tech side, but very productive on the human end, and human resources are a lot more expensive than memory.

If you spent one week writing something that was half as efficient as if you’d spent two weeks on it, that extra week of pay will still be paying for the additional hardware after you die of old age.

Not the best CS lesson of course, but having hired people who learned C before X, they really don’t seem to have learned the memory lesson anyway.

I have definitely run into memory bugs in managed languages that someone without my experience would have no clue how to debug since the managed language hides that information. Having said that, learning to be "careful" is much more than just avoiding memory bugs. It is about learning to avoid mistakes by really thinking about the code you are writing so that you don't end up screwing yourself or the team later. That can be applied to memory management, class design, API design, complex algorithm development, etc. There are many ways to learn to be careful but manual memory management is one of the best because the consequences are so severe, in terms of debug time, if you aren't careful.

Big O is useful, but I think its usefulness is overblown. There are cases where, by understanding how the kernel allocates memory, how CPU caching works, or how networking works, you can develop an algorithm that is inefficient in terms of Big O but is still performant. Sometimes you can even beat the best theoretical algorithm. The reason is that the constant factor k is extremely important. If I can make k extremely small using my knowledge of the computer, and n is within reason, then Big O often doesn't matter. So I don't have to waste time implementing fancy algorithms as a result. But I also have a better feel for when Big O does matter. If I can't make k small, or n is a huge number, or both, then I spend the time on algorithm optimization, and then understanding Big O is helpful.

I will give you a real example with GPUs. Let's say you want to do some GPGPU and you have data that is too big to fit in RAM. How do you process this data efficiently? Knowing how the GPU works is very important to solving this problem. You could spend days, weeks, years optimizing the crap out of Big O in your code, but it won't really move the needle in a lot of cases because that isn't the bottleneck. The primary bottleneck in this case is the PCIe bus and a high-level understanding of how it works is needed to keep it full of data as the GPU is processing it. Once that is solved, the next bottleneck will likely be the data format. The GPU is most efficient when each data sample is independent which is related to how the GPU cores work, caching works, etc. So putting the data in GPU-friendly format (not always possible) will make everything go faster even before worrying about Big O.

What C does is it forces you to learn how the computer really works because it is hard to be productive in C without that knowledge. And that knowledge is largely transferable.

I certainly agree with learning and using Python or JS when it makes sense because they are very productive languages. But from an education perspective, people who only learn JS are less likely to be able to solve the really hard problems because they lack that foundation of how the computer works. And there are plenty of jobs where that doesn't matter too much, but you should probably have at least one person around who really understands how the computer works for the times when it does.

I wouldn't teach binary, assembly and C before teaching Java. That's a bad analogy. It's different with the web. You have to understand what clients and servers are, and it's much more clear with a basic apache setup than it is with, say, the node.js ecosystem which blurs that line between client and server to a confusing degree.

So you would advocate for, say, learning JSX and react before learning how to use vanilla JS, and maybe a little bit of JQuery to manipulate the DOM, or learning HAML before learning HTML? No wonder the current generation of junior devs is so damn confused.

I think you should learn the foundations of JS before you use a JS library, but I see no value in teaching people JQuery before you teach them react.

I politely disagree. JQuery is (largely) syntactic sugar over the built in node selection engine and various CSS properties, animations, etc, that are built in and very much a part of normal everyday browser javascript. React is something else entirely, and creates a system parallel to the DOM rather than simply using the DOM the way it was intended. You have to understand the DOM and how it is typically used before you jump to something like React, and a great way to do that is vanilla js with a little JQuery sprinkled in. There is less friction this way, and it demonstrates the impetus for separating javascript from HTML and not using selectors for everything. Otherwise you miss the point of why React is good, and you run the risk of misusing (or in the very least under-appreciating) it. People with all this background still often make terrible React apps, so imagine how bad the people who don't even understand what or why React is will do.

Why is he the one on his high horse? It would take watching one or two videos on YouTube or Pluralsight to know at a high level what these services are.

If I came to you not knowing anything about the vocabulary used in your field of expertise is that your field’s fault or mine?

It might be descriptive, but it’s still a terrible name for a product that doesn’t have anything to do with FTP.

“Simple Storage Service” is one of those products that they actually named sensibly.

I guess the article just had to make up an alternative.

Dropbox would be a better analogy.

To me, "Dropbox" means "a folder that syncs across your devices"

To me a "script" is a piece of code that runs in response to a thing to do a thing. Either manually or automatically. Arguing that it's "basically the exact opposite" is completely non-illuminating.

I got the impression that overly simplified was sort of the goal. They don't seem like genuine suggestions for alternative names; the column might have been better named "Think of it as sort of like."

S3 has nothing to do with the FTP protocol, but if someone completely new to AWS and similar platforms asked "What's this S3 thing?", you might say "Think of it as sort of like an unlimited FTP server." You can connect to it through programs like CrossFTP, browse a folder hierarchy, upload files, link those files from a website, and in general do all the things people commonly use (S)FTP for, with the same tools they use for (S)FTP stuff. It's similar enough that it makes a good starting analogy.

App Scripts is another description I think is reasonable. Personally, I don't think manual invocation or modification of a stateful resource is key to the term 'script.' I think of a script as a relatively simple program that is invoked, accomplishes one task, and then ends without interaction. "Erase duplicate files in folder X" and "send me an email listing all files added today" are examples of scripts you might run as a cron job or in response to an event (like free space dropping below some level). Lambda is close enough to this that I think "Lambda's where you store scripts and define when they get run" is a reasonable explanation.
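The "script that runs in response to an event" mental model can be sketched as a minimal Lambda-style handler. The handler shape follows AWS's Python convention (event plus context); the event fields below are a trimmed-down invention, not the real S3 notification schema:

```python
# A Lambda-style handler: a small function invoked per event, then it exits.
# The event here is a simplified stand-in for a real notification payload.
def handler(event, context=None):
    # "Send me a list of files added today" style task, as a pure function.
    added = [rec["key"] for rec in event.get("records", [])
             if rec.get("event") == "created"]
    return {"added_count": len(added), "added": added}

# Invoking it locally, the way Lambda would on an incoming event:
sample_event = {
    "records": [
        {"event": "created", "key": "uploads/a.txt"},
        {"event": "deleted", "key": "uploads/b.txt"},
        {"event": "created", "key": "uploads/c.txt"},
    ]
}
print(handler(sample_event))
# {'added_count': 2, 'added': ['uploads/a.txt', 'uploads/c.txt']}
```

There is no long-running process to manage: the function is the whole deployable unit, which is what makes "where you store scripts and define when they get run" a workable first approximation.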

I agree 100%, these descriptions are too simplistic. The article should open with a warning saying that this view of what AWS has to offer is just for beginners, to help them structure AWS in their heads, but that they should then dive deep on their own. For example, the S3 description mentioned above.

It is nothing like FTP. It is more of a database than file storage nowadays. You can literally use SQL to query data stored in S3 (S3 Select), and S3 is literally used as a database when you store JSON or CSV files. If this is all you have, the costs are virtually nonexistent, but nobody thinks about S3 this way. I personally have had countless clients and developers use it as an FTP server and be scared to make thousands of queries a second, thinking they were dealing with a hard drive.

This type of thinking limits you for no good reason.

Yes, AWS is vast and complicated, and sure, you need a starting point. But you need to be careful when you try to simplify AWS too much, because you'll limit yourself and others from taking advantage of what AWS can actually do for you.
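The S3 Select point above can be sketched with the parameters for boto3's select_object_content call. Everything here except the parameter names is made up for illustration (bucket, key, and query); building the dict needs no AWS credentials, and the actual call is left commented out:

```python
# Sketch of an S3 Select request: query a CSV object in place with SQL.
# Parameter shape follows boto3's select_object_content; values are invented.
def build_select_request(bucket, key, sql):
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"JSON": {}},
    }

params = build_select_request(
    "my-data-bucket",
    "events/2019.csv",
    "SELECT s.user_id FROM s3object s WHERE s.country = 'PL'",
)

# With credentials configured, this would run server-side in S3:
#   import boto3
#   resp = boto3.client("s3").select_object_content(**params)
#   # results arrive as an event stream in resp["Payload"]
print(params["Expression"])
```

Only the matching rows come back over the wire, which is what makes the "it's more of a database" framing reasonable for JSON/CSV data.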

Fully agree on the S3 description, actually S3 stands for something: "Simple Storage Service". Plain enough for me.

All those abbreviations should be written out on that webpage.

And many of the names proposed don't seem better in any way.

- Amazon Transactional Email vs Simple Email Service (SES)

- Amazon Queue vs Simple Queue Service (SQS)

- Amazon Github vs CodeCommit

(disclaimer: I worked at AWS from 2008 to 2014 as Technology Evangelist)

I always disliked most of the acronyms. The worst for me was Elastic Beanstalk, a name allegedly picked by Bezos himself. I thought that choice was really poor, considering that most of the world doesn't know what Beanstalk refers to.

This is what happens when you grow up in an English-speaking culture and assume the rest of the world functions like yours. I had the luck of being from a different culture (Italy), and therefore an easier time discerning such things.

It doesn't mean that this "type" of fairy tale is unknown; in fact, its origins might trace back 5,000 years.

In addition, "Jack and the Beanstalk" is an Aarne-Thompson tale-type 328, The Treasures of the Giant, which includes the Italian "Thirteenth" and the French "How the Dragon was Tricked" tales. [0]

[0]: https://en.wikipedia.org/wiki/Jack_and_the_Beanstalk

I don't think this is so much a facet of growing up in an English culture, because their names make no sense to me either; I was born and raised in America, and work in web technologies. This isn't an American/English speaking cultural thing, it's something else.

Jack and the Beanstalk is known to almost everyone in the United Kingdom, a version with moral overtones is usually taught to school children, and it's a popular choice for pantomimes (in fact, it's on right now near me: https://www.atgtickets.com/shows/jack-and-the-beanstalk/live...)

I'm British, and well aware of the tale; the name EBS still makes no sense to me.

Wait, EBS and Elastic Beanstalk are completely different things in AWS.

Well, that really wasn't intentional, but I think it makes the point perfectly.

Author here - I haven't updated this in a while, but am always grateful for the appreciative comments and emails I get when it gets rediscovered.

There's also usually two other comments:

1. Why isn't it funnier / meaner? (an early version had a couple more pointed descriptions).

2. Why aren't these names more accurate?

The answer to both is that I was trying to help people form a rough mental model of what, how, and in what context they would use the services, which they could then take as a jumping-off point for their own research.

Nobody is relying on my jokey 2 sentence description of these services to make deep architectural choices about their apps. What I have heard many many times though is: "Oh! I didn't realize that's what AWS Service X did."

> I was trying to help people form a rough mental model of what, how, and in what context they would use the services, which they could then take as a jumping-off point for their own research.

This is exactly what I'm trying to do with Hackterms [1] - would love it if you could check it out and maybe contribute AWS definitions. We're up to 1200+ from hundreds of contributors like you.

[1] https://www.hackterms.com

Reminds me of this book I found at a thrift shop, "The New Hacker's Dictionary": https://www.betterworldbooks.com/product/detail/the-new-hack...

(though probably with a lot less snark)

The Jargon File was the old ultimate reference: https://en.m.wikipedia.org/wiki/Jargon_File

A lot of old hacker culture can be learned from it. Might be actually a good thing that it's not updated anymore.

I really enjoyed your post, having never used AWS, and always been a bit intimidated by it. Your list maps perfectly onto my existing mental models in a very reassuring way. Thanks for taking the time to write it up.

In 2011, a Polish bitcoin wallet company, "Bitomat", lost 17,000 bitcoins ($231k at the time...) because they were running their operation off Amazon EC2 and wanted to upgrade their servers.

EC2 servers are "ephemeral". It's believed the operators glossed over the significance of the word when architecting their system and restarting their servers.


Sidenote: MtGox acquired them and made the users whole. Everyone wondered how they could do that. Maybe they never really did.


The storage for EC2 can be either instance store (storage on the same server as the instance, which is ephemeral) or EBS-backed (networked storage that survives restarts and instance upgrades). This is made super clear when you terminate and/or resize instances.

It wasn't always that way, but I'm pretty sure the option was there by 2011, as I believe it was released around 2010.
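The EBS-backed option above is the one that keeps data around. A hedged sketch of the relevant knob, shown as the block-device mapping you would pass to EC2's RunInstances (parameter shape follows boto3; the device name and volume values are invented):

```python
# Block-device mapping for an EBS-backed EC2 instance.
# DeleteOnTermination=False keeps the volume (and its data) even after the
# instance is terminated; instance-store volumes offer no such option, since
# they live on the host and vanish with the instance.
block_device_mappings = [
    {
        "DeviceName": "/dev/xvda",          # root device name is AMI-dependent
        "Ebs": {
            "VolumeSize": 100,              # GiB
            "VolumeType": "gp2",
            "DeleteOnTermination": False,   # survive instance termination
        },
    }
]

# With boto3, this would be passed as:
#   ec2.run_instances(..., BlockDeviceMappings=block_device_mappings)
print(block_device_mappings[0]["Ebs"]["DeleteOnTermination"])  # False
```

Had the wallet operators used EBS-backed storage with a mapping like this, a server upgrade would not have taken the data with it.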

I know this is a joke, but I’m glad this person wasn’t in charge of naming these services. Amazon’s concise and consistent names often translate directly to API calls, making it possible to guess CLI methods.

Solid point, but I'd like to offer that the aws-cli has built-in autocomplete, which should help with some of the 'guessing'.


Good point. That’s great if the SDK for the language you are using has a utility for autocomplete and the terminal or IDE you are using has that utility installed, which may not always be the case.

I got a kick out of this site and agree with all of these suggestions. As a video guy of sorts, this one was my favorite:

Elastic Transcoder

- Should have been called Amazon Beginning Cut Pro

- Use this to deal with video weirdness (change formats, compress, etc.).

I don’t agree with EC2. When people think about EC2, they only think about VMs. Amazon considers EC2 a service that encompasses VMs, load balancers, and a lot of other stuff.

“Amazon SQL” makes it sound like Amazon’s bespoke SQL database. RDS is hosted Oracle, SQL Server, MySQL, MariaDB, and Postgres. They all behave so differently that it really doesn’t make sense to combine them into one service.

My favorite is the original page from 2015:


Don't forget to see the DirectConnect description.

This page is hilarious, and reminds me of the early days of both open source and cloud services when everyone thought you needed some complicated sounding new phrase & acronym to be taken seriously. Just because that's the way IBM, Oracle, etc did it so we had to emulate that.

Call this 'the language they speak'.

Let me explain why I believe (in part) they are doing this (and Jeff Barr can chime in).

AWS does this so that people learning cloud computing from the start ('children') get used to the Amazon term (trademarked), and then it's harder for them to leave.

Because all of the other things they would need to do elsewhere have unfamiliar terms. And now the Amazon term is what resonates in their head 'the language they speak'.

So the brand name is the moat.

Meanwhile if you are Linode or Rackspace and you call it a VPS then it's easy to find an alternate provider of a VPS somewhere else.

Business-wise it's on purpose, and honestly (it wouldn't work for everyone, I might add) it's really smart.

I remember the days when getting a new VPS instance meant placing an order and waiting for hours if not days for the server to be provisioned manually.

So when I hear the term VPS I don’t think cloud, I think ‘inflexibility’ and ‘manual setup’.

It only takes a little Googling to find mappings between AWS/Azure/GCP.

The brand name isn’t even a puddle. The moat is the APIs.

I mean... shortened obscure names for things in the tech world isn't exactly unique. Here's one for basic UNIX commands:

   ls    list_files   Lists files in a directory
   cd    change_dir   Changes the current working directory 
   cat   cat_barf     Takes stuff in and immediately hacks it up onto stdout

Is there truth to that last one? I always assumed cat was short for concatenate and my brief googling appears to confirm that. Would love to read more on this if someone has a reference to cat_barf

Cat is for catenate. Later the word became concatenate.

It appears not:


My understanding is that when you catenate A and B you concatenate A with B.

That is, the catenation of A and B is AB, but the concatenation of A and B is A (catenated with what - we haven't said) and B (ditto).

Of course this is basically trivia and the word that's used is concatenate.

No. I was joking. :-P

    cat  concatenate  Concatenate all files given as arguments as well as stdin.
Similar to: Your twitter feed

Not saying I'm a huge fan of AWS service names, but I don't mind them either. I didn't feel they were a blocking issue, and I got familiar with them quite fast.

Yeah the names are annoying for all of a few minutes. I suppose if you weren’t working directly with them a lot (like a management role perhaps) it could be really annoying but it’s not hard to make a cheat sheet.

I’d rather google “ebs <issue>” because it’s pretty unique. Google eventually picks up on your preferences for non-unique names so not the worst thing though.

> I’d rather google “ebs <issue>” because it’s pretty unique. Google eventually picks up on your preferences for non-unique names so not the worst thing though.

I appreciate it for this.

As a developer, I've tried to research Django-related items before (The Python Django). But as a guitarist, I've also done a lot of research about Django (Reinhardt). Getting the right Django is sometimes confusing when Google knows you have a deep interest in both unrelated items. Codenames are great like this if they're not reused.

I work with a service every day named Connect.

I can't even find articles which I know exist using Google and nearly exact names.

"Amazon connect best practices troubleshooting" gets you to other AWS services like direct connect.

This is the only service name that truly bugs me.

Try adding quotes, e.g. https://www.google.com/search?q="aws+connect"+best+practises... brings back results only to do with Amazon Connect.

Right, but as a user who has no idea what's available I'm sure that the ux is miserable

Probably this is better - https://github.com/open-guides/og-aws

Amazon Drawer of old Android devices is the best alternative name.

This is a fantastic resource, I wish I'd had it when I first dived (dove?) into AWS. Maybe these shouldn't actually be names, but it would be nice to have nice descriptions like this in the AWS documentation.

Simply brilliant. I've genuinely been struggling to comprehend what most services are.

I think most of the names are fine, now it’s easy to Google on issues related to a particular service. Try Googling on “amazon queue issue” instead of “aws sqs issue”...

AmazonVS should be pretty unique.

EC2 is actually a really good name, as it has the important part of cloud right in the name: elastic. Generally, if your load is elastic, and you can design your solution to take advantage of the elasticity of EC2, you will do well in the cloud. If you just lift-and-shift your virtual machines into EC2, with no elasticity or scaling, you're going to hurt, in either performance, cost, or both.

I guess it's up for debate. While I agree some are silly (Elastic Beanstalk/Kinesis), renaming the product does not necessarily increase discoverability, which is a bigger issue. As for this list, I find a lot of these too dumbed down or missing some huge use-cases.

S3 is simply an object store, which is a fairly common term. So "Unlimited FTP Server" is less obvious to me, and sounds old, outdated, and a security risk (although S3 can be that, too). Lambda is functions-as-a-service, or serverless. "App Scripts" would also tell me nothing, or sound like something for Office. SNS main use-case for us is not to "send mobile notifications, emails and/or SMS messages", but to link two systems together where some loss is acceptable, so "Amazon Messenger" is worse. SQS and SNS fit together well.

I think the SQS and SNS naming convention is a bit confusing. Realizing that SNS should be the typical choice for an event-based architecture is weird to people used to working with a message queue.

SNS and SQS work together but serve two different purposes.

SNS is for producers. You use SNS to say something happened. You have no guarantee that any consumer actually successfully processed the message and if the consumer is down, that message is lost to them.

The only way a consumer can directly process a message is by subscribing via HTTP or Lambda. But again, if the consumer errors or is down, you’re out of luck.

SQS is a traditional simple queueing mechanism. It has no fanout capability on its own but you do get the traditional queuing functionality. But it doesn’t make sense logically for more than one process (or group of processes doing the same thing) to consume the queue.

If you want the traditional fanout, filtering, multiple queues that do different things on the same event/message, you use SNS and SQS together.
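The fanout-plus-filtering combination described above can be sketched as a toy in-memory model. This is purely illustrative of the shape of SNS filter policies: real SNS supports many more matching operators, while this sketch does exact string matching on message attributes only:

```python
# Toy model of SNS -> SQS fanout with attribute-based filter policies.
# Not the real API; queues are plain lists and delivery is synchronous.
class ToyTopic:
    def __init__(self):
        self.subscriptions = []  # (queue, filter_policy) pairs

    def subscribe(self, queue, filter_policy=None):
        # An empty policy means "deliver everything", as with a plain SNS sub.
        self.subscriptions.append((queue, filter_policy or {}))

    def publish(self, message, attributes=None):
        attributes = attributes or {}
        for queue, policy in self.subscriptions:
            # Deliver only if every policy attribute matches an allowed value.
            if all(attributes.get(k) in allowed for k, allowed in policy.items()):
                queue.append(message)

orders, audit = [], []
topic = ToyTopic()
topic.subscribe(orders, {"type": ["order_placed"]})  # filtered queue
topic.subscribe(audit)                               # no policy: gets everything

topic.publish("order #1", {"type": "order_placed"})
topic.publish("login",    {"type": "user_login"})
print(orders)  # ['order #1']
print(audit)   # ['order #1', 'login']
```

The point of the pattern is visible even in the toy: one publish, many queues, and each queue sees only the messages its policy allows.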

>If you want the traditional fanout, filtering, multiple queues that do different things on the same event/message, you use SNS and SQS together.

For this sort of use case, I've generally opted to go with a Kinesis stream, and I'm having a bit of a hard time understanding why a mix of SNS and SQS would be better here.

For a simple “something happened” with a message and attributes, Kinesis is overkill and not as flexible.

You can send an SNS message with attributes and subscribe to that SNS message with any combination of SQS queues, lambda functions, emails, http endpoints, etc and with any of the subscriptions you can design it so that any of the subscriptions only get messages based on attribute conditions.

Also with SQS, you get the standard granular retries, dead letter queues, etc. Yes, with Kinesis you can do shards, but it really doesn’t make sense to have more than one process reading messages from the same shard to scale out processing. With SQS, you can autoscale instances to read from the queue based on the queue size, or just subscribe the SNS topic to an SQS queue, then subscribe the SQS queue to a lambda and let AWS work its magic.

Much appreciated!

This actually has me rethinking the architecture on a project I'm working on right now. It looks like Kinesis would be a little bit cheaper at the volume of data I'm looking at... But the SNS/SQS method will let me sidestep some potential future scale-up concerns I had with the 5 reads per second limit without making a Rube Goldberg machine of Kinesis Analytics feeding into additional Kinesis Streams, which would drive the cost up higher than an SNS/SQS fanout.

5 reads per second - yeah, that’s kind of low. At maximum scale-up, I am processing 80 messages simultaneously on 8 instances; each instance is running 10 (I/O-bound) asynchronous threads. I could push it higher but the database starts screaming.

But with Kinesis, it’s true that you can do only 5 reads per second, but each read can return up to 10,000 records. With SQS, you can only get 10 records per call. I would think you could get much higher throughput with Kinesis; you would just have to handle storing your iterator/sequence number per shard somewhere to know where you left off in case of a crash.
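The per-shard checkpointing mentioned here can be sketched like this (the `checkpoints` dict stands in for whatever durable store you use, e.g. DynamoDB; record tuples are made up):

```python
checkpoints = {}  # shard_id -> last processed sequence number (would be durable in practice)
seen = []

def handle(payload):
    seen.append(payload)

def process_batch(shard_id, records):
    """Process a GetRecords-style batch, checkpointing after each record
    so a restarted worker can skip anything already handled."""
    last = checkpoints.get(shard_id)
    for seq, payload in records:
        if last is not None and seq <= last:
            continue  # already processed before the crash/restart
        handle(payload)
        checkpoints[shard_id] = seq

process_batch("shard-0", [(1, "a"), (2, "b")])
process_batch("shard-0", [(2, "b"), (3, "c")])  # replay overlaps after restart
assert seen == ["a", "b", "c"]  # no duplicates despite the overlap
```

This is the bookkeeping you get for free with SQS visibility timeouts but must manage yourself (or via the KCL) with raw Kinesis reads.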

Kinesis is much better for higher throughput, and you can always scale up instead of out if you need consistent throughput. It depends on your use case.

Yeah, my primary concern is that there were some other teams expressing interest in hooking up to the stream for working with the data in real time.

I know of a few workarounds - if the data is also being sent to S3, having some services working off of S3 events instead of the stream directly, or using Kinesis Analytics to fan out to additional Kinesis streams.

I might also just hook up SNS to the stream via Lambda and have them fan out from SNS. Hmm.

Keep in mind -- SNS is a push source for Lambda. When doing an approach like SNS->Async Lambda->...->Async Lambda it can become easy to saturate the event bus for your account.

Exactly. That’s why I suggested

Event source -> SNS -> SQS -> Lambda, and set the concurrency limit on the Lambda.
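Why the queue in the middle helps: it buffers bursts so the concurrency cap throttles processing instead of dropping pushes. A toy model of that backpressure (the poll loop and limit below are illustrative, not the Lambda/SQS API):

```python
from collections import deque

RESERVED_CONCURRENCY = 2  # stand-in for a Lambda reserved-concurrency limit

def drain(q, handler):
    """Poll loop: each cycle invokes the handler on at most
    RESERVED_CONCURRENCY messages; the rest simply wait in the
    queue, which is the backpressure a direct SNS->Lambda push
    subscription does not give you."""
    cycles = 0
    while q:
        batch = [q.popleft() for _ in range(min(RESERVED_CONCURRENCY, len(q)))]
        for msg in batch:
            handler(msg)
        cycles += 1
    return cycles

q = deque(range(5))
handled = []
assert drain(q, handled.append) == 3  # 5 messages, 2 at a time -> 3 cycles
assert handled == [0, 1, 2, 3, 4]
```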

(Amazon employee here, but I have no idea what the SQS/SNS groups are doing.)

I always wished that SQS/SNS/Kinesis et al. could all be grouped under an "AWS Pub/Sub" brand that encompasses those features. I understand how those systems interact, but damn, it can be confusing to someone who's new.

I agree. When I first started getting my feet wet in AWS, I couldn’t for the life of me figure out why something as basic as the fanout pattern would be missing from SQS. I read the documentation and thought I had to be missing something until I figured out that you had to use SNS+SQS. That still is completely illogical.

That being said, I made it an edict on my team that no process could put anything directly in an SQS queue, they had to use SNS and subscribe it to a queue.

But, keep making things obtuse. The more obtuse things are with AWS the more money I will be able to charge in my next life as an overpriced AWS Architect.

To quote Aziz Ansari in the human giant: "I don't want you out there on the bully plane calling me up, saying 'hey, I gotta pull up my marshmallow pants and hit the boohoo button, 'cause I don't know where the cranberries are.' You know what I mean?"

Now if someone can translate Google’s services...

GCP products, for the most part, have a name that describes it very well. https://cloud.google.com/products/

This is badly needed for so many things. Amazon is the worst offender. Even the logos for the services are terrible

We live on AWS. This is frickin' brilliant, and I just sent it around the technology department.

I also love the sponsorship message:

> Hey, this is sponsored by SendCheckIt - and by "sponsored" I mean that's what I've been working on instead of updating this list.

Amazon high throughput? How about Amazon distributed queues?

Amazon EC2 Queue? How about Amazon State Machines? Or Amazon BPM?

FWIW, "Amazon EC2 Queue" actually covers a lot of real-world use cases for SWF where a "queue" is needed to manage state and distribute work among workers.

The list is a bit outdated, and as for Amazon State Machines, it's a name better suited to AWS Step Functions, which is the successor of SWF (and defined as an actual state machine). Step Functions do cover lots of SWF use cases (rightfully so, as a successor), but also more cases like service orchestration, etc.

Yeah the SWF one is off. I feel like SWF doesn't get enough attention. It's pretty brilliant.

I suspect that the person who made this site worked at Amazon. The way they describe SWF makes me think that the author used it when it was in its prerelease, "Decider" form. The "Flow" version that went on to become the one that was publicly released has a slightly different set of terminology and doesn't expose the use of queues in the same way the Decider version did. I love the publicly released version of SWF and think it's a severely underrated product.

I disagree. WAF is very transparently named

How the hell can Amazon afford to host all of these services?!

From the tonne of money it makes doing so.
