The reason for recommending against S3 for static hosting is that there are much better alternatives outside of AWS. The dev experience you get with Netlify and Zeit will almost certainly never be possible with AWS.
You can set custom TTL for CloudFront to 0 for instant refresh in production. Dev server is `hugo server`, `hugo` is a single Go binary you can Dockerize with --net=host and --volume=$(pwd):/app to use your own dev environment. If you don't maintain too many large static assets, your cost is like 5 cents a month, and I think that's a competitive price, not AWS being generous. $12 / year for a custom domain via ICANN, and Route 53 manages A/AAAA/CNAME/MX records for free.
Again, not to refute your claim. I'm a single author who publishes as a hobby, with a single dev and a single prod environment, so if things break down (they haven't for years), it's easy to fix. It's much different from content marketing and enterprise use cases. Even so, I would rather start here and add a static CMS like Forestry before considering a custom solution like Netlify (which is probably built on top of AWS anyway).
Also Netlify doesn't scale in pricing as well. 3 developers is $45 a month, but 4 developers is $1000 a month. There's also no SLA until the $1000 a month tier. They charge almost double for extra bandwidth. It's great for small teams, but they're not objectively better.
Never used Netlify, but that seems odd. Isn't it a hosting provider? Unless you're a really tiny team, usually you'd want a centralized process for running the build & deploy (rather than having any developer who wants to release something run the build on their slightly different local environment and deploy it directly), possibly hooked into a CD pipeline. Why do you need to pay per developer?
That's pretty cray-cray. I feel like you could just create a CloudFormation template, have a "Launch Stack" button, and get people to pay you to maintain it for $10 / user / month and have a company easy-peasy. Throw in backups via Batch; I think they use EC2 Spot instances so it's crazy cheap.
I get that you should always charge more if possible, but I'm much more concerned with surviving during hard times, and there's just less to ablate away if you stick more or less to first principles.
One benefit Netlify has is that it keeps a reference to multiple deploys, so you can always go back or deploy a development branch separately from production. Plus it does the building for you, which for a Hugo site is not really important but in our case (product website + shop) we used Gatsby which pulled data from several data sources (= secure access keys that ideally don't linger on dev machines).
I actually noticed I had a few incorrect dates, about two weeks after I published the posts. One annoying thing about hugo server is that you can't see draft posts timestamped in the future, which makes it hard to schedule things. My stack definitely could use improvement in a real-life production setting :)
Whether that helps or not depends on how the script is called. Unfortunately. Better to check the result of the command and call exit explicitly if you want to be sure.
Not refuting the UX point. It’s possible (at most an hour of work) to set up webhook-based deploys to s3. Not as turnkey as Netlify, but it is very possible to build your own great developer UX with a bit of elbow grease and pipes :).
If your time frame for hosting is 10+ years, I’d personally recommend putting your eggs in S3 (or the “S3 interface”, to be specific, whether it’s with AWS, DO, etc).
That's not hard. What I'm talking about is giving each PR its own unique domain URL. That's a much bigger pain in the ass with S3+CF. I'm not even sure it can be done quickly because starting a new CF distribution takes like 10 minutes. I guess you'd need to hook into Route 53 somewhere or something?
I could never figure out the Amplify docs. Maybe I was stuck in the marketing part of it, but I never saw something that stated clearly what it is or how it works. It was like "just click a, then b, then c" which is not useful for understanding what it's actually doing.
So, is Amplify just a simplified interface to S3+CF?
AWS Amplify makes it incredibly simple to deploy a static website to S3 with CloudFront in front of it.
It includes CD, and Amplify itself (like CloudFormation) is free; you only pay for the resources you create.
AWS Amplify has been out for a couple years now, and it is also able to automate much more than just static websites.
So taking a look at the 2 options you mention, I am not seeing anything that those do that Amplify does not (but the opposite is not true, Amplify does a lot of things for you out of the box)
I agree that AWS doesn't yet provide a clean all-in-one package for static website hosting. I think AWS Amplify Console [1] is the current official "all-in-one" service but I haven't used it.
With the recent GA of AWS CDK [2] I would say that it is getting more and more possible to define your own development workflow that is tailored to your own static website hosting needs and better than just raw S3. AWS CDK is still difficult to use with some rough edges, and I know that setting up e.g. a CodePipeline and CI/CD is difficult.
Here is the CDK example for a static website with an S3 origin + CloudFront CDN + custom TLS certificate + CloudFront invalidation on deployments [3]. Here is my personal version of it, combined with Hugo, that took me a day to set up [4]. It's still complicated, with a lot of rough edges, but it offers a pleasant dev experience [5]. I haven't yet built up a CI/CD pipeline.
Disclaimer: I work for AWS but my views are my own.
> I agree that AWS doesn't yet provide a clean all-in-one package for static website hosting. I think AWS Amplify Console [1] is the current official "all-in-one" service but I haven't used it.
That's exactly what Amplify Console is, and it's really good.
After initial configuration it's literally one command using the AWS cli to upload your site to s3.
I don't understand the eagerness to trade-off $/user instead of $/bandwidth (which is far, far cheaper in nearly every case) just to save a day or two of learning how to use a tool.
> The dev experience you get with Netlify and Zeit will almost certainly never be possible with AWS.
For static hosting, AWS Amplify is already a very similar experience to Netlify. Each is slightly better than the other in some ways, but overall they are really very similar.
I find the experience using AWS Amplify Console to be great, not dissimilar to Zeit or Netlify. It's a 'productisation' of the S3 and Cloudfront pattern and does its job very well imho.
https://aws.amazon.com/amplify/console/
Firebase Hosting is superior to any of those alternatives you mentioned (netlify, amplify, zeit etc) in terms of capability. With the added benefit of tight integration with other firebase products (analytics, authentication and cloud storage).
Based on this and other comments of yours in the thread, it feels like you have some vested interest in advertising "Netlify and Zelt" (whatever those are) without declaring your interest in them.
If you truly have no interest in them other than as a user, then it seems as if you're not aware of alternatives such as AWS Amplify. You haven't responded to any of the comments pointing out the issues with your claims.
Whatever book you're a co-author of doesn't sound as though it's worth the price.
Here is a link to the AWS docs referencing the limitation of local secondary indices. Any table with a local secondary index is limited to 10GB per partition key.
The authors may have misinterpreted this limitation.
> Once you create a local index on a table, the property that allows a table to keep growing indefinitely goes away.
It limits the value for that partition key but does not limit the overall size of the table...
Yup, having had AWS experience from work, this is the exact stack I always use whenever I need to build a static site; I could use an ordinary host but I'm less sure about the security guarantees there and to what extent they will be able to handle load.
+1. Although the dev experience is convoluted on AWS, you end up learning more and I think it is quite cheap as well. I have a tutorial as well on how to deploy a static website using CloudFront, S3 and Route53 : https://raviramanujam.com/post/blog/meta.html
Do you know what the cost difference is between CloudFront and S3 in terms of serving big files like video files?
For convenience and other reasons, I use S3 to publish some videos for a small group (~30) I teach. Wondering if serving that thru CloudFront would be cheaper.
This is especially important to me now as I'm doing this while everyone here is under voluntary self-isolation so I'm relying on video more.
Not that S3 was ever that expensive for me but if CloudFront is cheaper...
CloudFront will likely help you save costs if there is a benefit to caching the content. For example, if all of those 30 download the same file, from a similar geographic area around the same time, you should see some cost reduction.
Otherwise, it might not be worth the cost of the distribution!
Sorry, I don't know the answer to that specific question. I do want to point out that CloudFront is just a CDN, you would use it in conjunction with something like S3.
Getting HTTPS set up with S3 is the easy part. What I'm struggling with is getting the www->root redirect to work with HTTPS. I've tried quite a few setups and none worked perfectly.
You don't have to. You can, however, if you want to.
I used this stack (S3, CF, R53, & ACM for SSL cert) to deploy a static site built using Hugo, which can deploy directly to an S3 bucket.
If you're crafting the site yourself without a builder, you can just use the AWS CLI to `cp` the files to the S3 bucket. You may need to invalidate the CF distribution if you want everything updated quickly worldwide.
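A minimal sketch of that upload-then-invalidate flow, assuming the AWS CLI is installed and configured. The `public` directory, bucket name and distribution ID are placeholders, not values from this thread. By default it only prints the commands (`dry_run=True`), so nothing is uploaded until you opt in.

```python
import subprocess


def build_deploy_commands(site_dir, bucket, distribution_id):
    """Return the two AWS CLI invocations as argument lists."""
    # Recursively copy the built site into the bucket.
    upload = ["aws", "s3", "cp", site_dir, f"s3://{bucket}/", "--recursive"]
    # Ask CloudFront to drop its cached copies of every path.
    invalidate = [
        "aws", "cloudfront", "create-invalidation",
        "--distribution-id", distribution_id,
        "--paths", "/*",
    ]
    return [upload, invalidate]


def deploy(site_dir, bucket, distribution_id, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in build_deploy_commands(site_dir, bucket, distribution_id):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a failed upload stops before the invalidation runs.
            subprocess.run(cmd, check=True)


# Hypothetical names, shown as a dry run only.
deploy("public", "example-bucket", "EXAMPLEDISTID")
```

Invalidating `/*` is the blunt option; if only a few files changed, passing just those paths should work as well.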
This appears pretty superficial and it gets many things wrong.
> Choose between [DynamoDB] on-demand option (no capacity management) or provisioned option (cheaper).
It's very, very easy for provisioned capacity to be more expensive than on-demand. You need to overprovision because, if you exceed the provisioned throughput, you will get throttled and your application will suffer. Scaling up and down is a slow, opaque process, and you're only allowed to do it something like 5 times per day. If your workload has any idle periods, you're just burning money.
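To make the "idle periods burn money" point concrete, here is a rough break-even sketch. It is a simplification, not AWS's billing formula: it assumes one capacity unit serves one request per second and ignores item size, and the prices in the example are made-up placeholders, so plug in the current ones for your region.

```python
def provisioned_cost(capacity_units, hours, price_per_unit_hour):
    """Cost of holding capacity for a period, whether it is used or not."""
    return capacity_units * hours * price_per_unit_hour


def on_demand_cost(requests, price_per_request):
    """Cost of paying only for the requests actually made."""
    return requests * price_per_request


def break_even_utilization(price_per_unit_hour, price_per_request):
    """Average fraction of provisioned capacity that must be in use
    before provisioned becomes cheaper than on-demand.

    Assumes one capacity unit handles 1 request/second (3600 per hour).
    """
    return price_per_unit_hour / (3600 * price_per_request)


# Placeholder prices (NOT real AWS prices).
UNIT_HOUR_PRICE = 0.001
REQUEST_PRICE = 0.000001

threshold = break_even_utilization(UNIT_HOUR_PRICE, REQUEST_PRICE)
print(f"provisioned only wins above {threshold:.0%} average utilization")
```

Under this model, any table whose average utilization sits below the threshold (for example a dev table that is idle most of the day) costs more when provisioned.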
> Avoid using S3 for static web hosting (No HTTPS)
I won't argue that S3 is ideal for static web hosting, but "avoid" is pretty strong and IMO not warranted.
> Do not use AWS Lambda as a general EC2 host
What does this mean? As general compute? This is too simplistic a judgment. Lambda is a versatile service and can be useful in many situations.
> Kinesis: unlimited consumers
No, Kinesis is an extremely limited service. 5 reads per second per shard. In my experience, if you put even two consumers on a stream, you will start to see throttling. The official solution is to fan out to multiple streams. Hacky and super expensive.
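A quick back-of-the-envelope sketch of why a second polling consumer already hurts, assuming the 5 reads per second per shard limit quoted above is shared evenly by every consumer polling that shard (the even split is my assumption).

```python
SHARD_READS_PER_SECOND = 5  # per-shard read limit quoted above


def min_poll_interval_seconds(consumers, reads_per_second=SHARD_READS_PER_SECOND):
    """Shortest interval each consumer can poll at without the group
    exceeding the shared per-shard read limit."""
    if consumers < 1:
        raise ValueError("need at least one consumer")
    return consumers / reads_per_second


for n in (1, 2, 5):
    interval = min_poll_interval_seconds(n)
    print(f"{n} consumer(s): poll no faster than every {interval:.1f}s each")
```

Any consumer polling faster than its share pushes the shard over the limit, which is where the throttling comes from.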
> Kinesis 30x cheaper than SQS
This is hilarious. Maybe it's true for some workloads. In my experience, Kinesis is incredibly expensive and SQS is not.
> The official solution is to fan out to multiple streams
Not anymore. Enhanced fanout allows up to 20 consumers, doesn't count against read limits and has lower latency than polling.
> It's very, very easy for provisioned capacity to be more expensive than on-demand.
Reserved capacity will beat on-demand pretty handily, even with massive overprovisioning. Assuming a 50% target and a 16 hour daily duty cycle, reserved provisioned is 20% the cost of running on-demand.
Thanks, I misspoke about fanning out to multiple streams. Dedicated stream consumers are still a hack. They're also expensive and have limitations as Lambda event sources. Kinesis is a service that needs to be used very carefully. It's riddled with landmines for cost and performance.
Regarding Dynamo: I'll echo plexicle's experience that switching to on-demand was an immense cost savings (these were tables that were not being used often, many of them in dev environments).
> Reserved capacity will beat on-demand pretty handily, even with massive overprovisioning. Assuming a 50% target and a 16 hour daily duty cycle, reserved provisioned is 20% the cost of running on-demand.
I don't have the math on hand at the moment.. but switching to on-demand for us resulted in HUGE savings. We weren't overprovisioned either, but we were kind of spiky (in fact we were underprovisioned for a lot of the day).
Also, single consumer for SQS? Completely wrong. It is a common pattern to autoscale a group of workers based on the number of visible messages in a queue. Hundreds or thousands of workers are not out of the question. SQS has no practical limit to its ability to scale.
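The scaling rule behind that pattern can be sketched as a pure function. This is only the sizing arithmetic, not an Auto Scaling policy; `messages_per_worker` and the min/max bounds are hypothetical tuning knobs. In practice the input would typically come from the queue's `ApproximateNumberOfMessagesVisible` metric.

```python
import math


def desired_workers(visible_messages, messages_per_worker,
                    min_workers=1, max_workers=1000):
    """Size the worker group so each worker has at most
    `messages_per_worker` visible messages of backlog, within bounds."""
    if messages_per_worker <= 0:
        raise ValueError("messages_per_worker must be positive")
    target = math.ceil(visible_messages / messages_per_worker)
    # Clamp to the configured floor and ceiling.
    return max(min_workers, min(max_workers, target))


for backlog in (0, 250, 10_000_000):
    print(backlog, "visible ->", desired_workers(backlog, 100), "workers")
```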
It seems like they're saying "Don't use this for generic services, only use it as an integration with other AWS services" to which I would say, "What?"
If I recall, https://acloud.guru for example is hosted entirely using Lambda, API GW, S3 and some other AWS services.
> Avoid using S3 for static web hosting (No HTTPS)
Ya, "avoid" is a strong word. I use S3 for tons of static web hosting, because it is exceptionally cheap and I don't have to worry about patching CentOS/Apache/MySQL/PHP and all the other binaries/daemons that I used to. I know how to, I just would rather not when S3 is a couple pence per GB.
That's great if all my company makes is static websites, but if we have a bunch of other CI/CD for other projects, I'm not going to introduce a new tool for people to use to make an already simple pipeline simpler.
Setting up CI/CD via CodePipeline for a static website is trivial. Beyond that, Amplify Console offers an extremely easy experience deploying static websites, with built in blue/green deployments.
THIS! I too was taken aback by the comparison of SQS and Kinesis. In my experience kinesis is significantly more expensive and significantly harder to scale. The only reason you should use it is if you need the throughput and delivery guarantees.
I'm not sure where the article author is going with this, but the official way to host a static web site on S3 with HTTPS is to use ACM and CloudFront:
This opinion piece has too many inaccurate statements.
- S3 for static assets
- SQS vs Kinesis
- Lambda ('Small code that doesn't change'... what?).
- ELB completely overlooked, if you need session persistence then you need an ELB so for more traditional software you've completely omitted that solution because it's "Legacy".
Missing
- No discussion on when to pick Lambda, Containers or EC2
- No mention of CloudFront in front of S3 or of any API GW or ALB-fronted Lambda.
- Where is the use of SNS?
- What if I want to fan out on a message/event? It completely misses the why of SQS vs SNS vs Kinesis and when to use which.
- No RDS... so if I've got relational data, are you advising I should put it in DynamoDB?
- No container context in any way.
The quality of the content and advice is really poor and needs an urgent update.
Right, the SQS part is very misleading: "SQS has 1 consumer" is technically true (although it's actually one group of competing consumers), but in practice, you get fan-out to zero or more consumers easily, by wiring SNS and SQS together (one SNS topic, multiple attached SQS queues).
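A toy in-memory model of that wiring, just to show the shape: one topic copies each message into every subscribed queue, and within a queue the consumers compete. The class names are made up for illustration; real SNS and SQS add at-least-once delivery, visibility timeouts and so on, none of which is modeled here.

```python
from collections import deque


class Queue:
    """Stands in for one SQS queue with competing consumers."""

    def __init__(self, name):
        self.name = name
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        """Give the next message to whichever consumer asks first."""
        return self._messages.popleft() if self._messages else None


class Topic:
    """Stands in for one SNS topic."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, queue):
        self._subscribers.append(queue)

    def publish(self, message):
        # Fan-out: every subscribed queue gets its own copy.
        for queue in self._subscribers:
            queue.send(message)


orders = Topic()
billing, email = Queue("billing"), Queue("email")
orders.subscribe(billing)
orders.subscribe(email)
orders.publish("order-1")

print(billing.receive(), email.receive())  # each queue got a copy
print(billing.receive())                   # None: only one consumer per queue gets it
```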
> No RDS...
Without also mentioning RDS with DynamoDB, you can't make meaningful choices between them.
It also omits the one big benefit of DynamoDB: single-digit number of milliseconds to read a record by primary key, regardless of scale, due to sharding on that key.
> - ELB completely overlooked, if you need session persistence then you need an ELB so for more traditional software you've completely omitted that solution because it's "Legacy".
Additionally, the legacy ELB is great for when you want to do your own TLS/SSL termination with Let's Encrypt. (Especially with Kubernetes, ingress, and cert-manager.) Also with the legacy feature, you can enable Proxy Protocol, which passes on the public IP info after the NAT.
Kinesis has strict ordering PER SHARD. Which dilutes your throughput, unlike SQS. And it has limitations on throughput in terms of total event size, which are different from SQS's 300 TPS for FIFO.
It's apples and oranges. More significantly, this comparison really makes it seem like you should pick Kinesis for a simple starting project rather than SQS, which is the exact opposite of the truth. SQS is the simpler choice for 90% of the cases, and Kinesis is an advanced, powerful tool for the other 10%.
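To illustrate the per-shard ordering point: Kinesis maps a record's partition key to a 128-bit integer with MD5 and uses that to pick a shard, so ordering only holds among records that share a shard. The sketch below assumes the shards split the hash space evenly, which is the default for a fresh stream but not guaranteed after resharding.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 digests are 128 bits


def shard_for_key(partition_key, shard_count):
    """Index of the shard a partition key lands on, assuming the hash
    key space is divided evenly between `shard_count` shards."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


for key in ("user-1", "user-2", "user-3"):
    print(key, "-> shard", shard_for_key(key, 4))
```

Records for the same key always land on the same shard, so a hot key cannot use more than one shard's worth of throughput no matter how many shards the stream has.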
Eh. The ALB/ELB/NLB section is off the mark for failing to consider inter-AZ costs, perplexingly indicates NLB has something to do with TLS, even more perplexingly suggests NLB as the default choice. Dynamo isn't something I'd suggest at all unless one has well studied the costs and benefits. Kinesis is not much like SQS, and new users might be more interested in EventBridge...
>Kinesis is not much like SQS, and new users might be more interested in EventBridge...
As someone who has done a lot of work with AWS, I have at times in the past picked the wrong one of Kinesis vs SQS for different workloads - largely due to my inexperience with Kafka and similar at the time, and misunderstanding the point of some of the "benefits" that I saw for using Kinesis vs SQS. Most of those times, I would have been better suited to a combo SQS/SNS for fanning out (something a helpful HN reader actually suggested to me!).
Anyway, EventBridge steps in perfectly for most of those use cases, my new projects are using it, and I am considering going back to some old services and reworking them to use EventBridge instead. The rules/filtering parts in particular are going to let me decom several lambdas and save money. Definitely worth people investigating.
There is a tremendous amount of incorrect and misleading information on this post.
The first sentence re: Dynamo discusses the inability to do filtering or sorting which is flat out wrong:
“DynamoDB requires data operations (aggregations, filtering, sorting ..) to be done by your application. → All data needs to be sent over the network.”
Sorting - really? Isn't the first thing to learn about Dynamo the primary key and the sort key? Filtering you can do on the database, but you still pay for the reads as if it wasn't filtered. To me, Dynamo's biggest cons are that it's proprietary technology and that it's just not really any good for aggregations.
Can't agree on the NLB part. Everyone should be using ALB by default, and only reach for NLB in very specific cases (you'll know when), in my humble opinion having worked with AWS for 8 years and being a certified architect.
Enable the dropping of invalid headers, enable HTTP/2, let the ALB terminate the TLS, and you'll see benefits even if your full backend isn't HTTP/2 enabled and you'll have eliminated a whole range of other headaches you no longer need to manage. It's one of the most reliable services AWS has.
Book co-author here. (Not article author.) The recommendation is based on the multi-tenant behavior of NLBs (no need to warm up the LB to handle traffic spikes). If you need any ALB features, use ALB. Otherwise, NLBs give you one less thing to worry about (and slightly cheaper and faster too.)
Does anyone else get the feeling that the author has not used any of these services before and is just writing this from reading AWS documentation and pricing?
> DynamoDB is a non-relational database that has two main features : it’s immediately consistent and ...
that's ... only true if you're exclusively using strongly consistent reads, but if you're exclusively using strongly consistent reads, why are you using DynamoDB to begin with?
Not sure where they came up with that SQS pricing? It is free for first million requests per month. I have been using SQS as a message queue in my app and based on AWS Cost Explorer I have never reached that in a month .. always a $0 charge. Looking at Cloudwatch I never even got to 100K requests in a month and the app is moderately active. It does not generate a ton of messages, maybe hundreds a day and the workers receiving the messages use long poll. Anyway, the costs in the doc seem way off.
Touching only the surface of some AWS services and calling them "the good parts" is really misleading. This does not run the gamut of all AWS services available, skips over a number of other popular databases and developer productivity services, and makes broad generalizations as another commenter points out...
Why does it say that SQS has duplicates, while Kinesis does not ? AWS documentation explicitly says that Kinesis has at least once semantics due to consumer/producer retries.
There are so many plain wrong points here I simply don't know where to start. It feels like the person who wrote it has no real commercial experience, or very little experience with the described services.
A lesser-known limitation of S3 is that it does not have read-after-write consistency for overwrite PUTs and DELETEs. Netflix has reported observing consistency taking hours for outliers. https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction...
Because it's a distributed system and it takes time to propagate?? I think it's very well known; people just understand it's normal and that in 99.999% of cases it does not matter.
I liked the book, but it fell apart when it got to CloudFormation. It hand-waved a ton of the actual IaC part, so instead you end up copying from the book, which had some differences from the git repo they shared. Good to get the basics of what to do and best practices for architecting, but incredibly frustrating when actually following along and building with CloudFormation.
We had a couple of frustrating typos in the first version. Very sorry about that. We fixed them now, and the latest book version is in sync with the GitHub repo.
I'd agree with a few of the other comments here. I read it and can't say that I found it particularly helpful. I've deployed a lot of Node.js apps to Heroku and Digital Ocean, and was looking for something to help me "level up" to AWS. I was left thinking that maybe I should just stick to Heroku.
I really like the summary format and the visual design here. It feels slightly weird to specifically call out Redis in a negative way in the Dynamo summary when no other tools listed give a similar callout to a specific product.
It's nice to have some proper opinions and recommendations. It seems the tech world is so full of articles and blogs that show you how to do something, with very little saying don't use this because...
I've seen this done for new projects, and it works really well. If your data access patterns are truly relational (varied lookup paths) then it is probably not the right tool, but many apps can be modeled in a way that DDB handles well.
This article is wildly inaccurate. My corrections:
- DynamoDB allows for a lot of filtering, and Lambda + DDB Streams can do aggregations that are written as state changes and provide incredible read speeds. Use it if it's a good fit, and it's only a good fit if you're ready to learn its query language and quirks. When it is a good fit, it's a world-class database.
- SQS pricing is $0.0000004 per request [1] with 1 Million monthly messages free.
- SQS + SNS is far more flexible and reliable than Kinesis, and with Kinesis you're paying for and managing shards by the hour. Kinesis is great for streaming analytics data processing and distribution (within the Kinesis suite there is stream processing à la Spark and streaming ingest to Elastic, S3, and Redshift). If you're not doing data engineering on a streaming set of events that can't fit in a spreadsheet, it's just not the right fit.
- Lambda is not expensive to use with DynamoDB at all; the pipes are so fast that 50-100ms is typical for a call and response. Lambda does get gnarly with SQL databases that require connection management and pooling, but they have improved that situation on multiple fronts in the past 6-9 months.
- Also, Lambda ought to be considered the first choice for runtimes for greenfield apps, not just for AWS plumbing. This is strictly opinion and definitely depends on your team and existing codebase, but it makes sense from a cost and scaling standpoint for all kinds of projects.
- NLB should be the first choice only for specialized use cases like UDP services or those that require extraordinary throughput. ALB gives you an incredible feature set with things like HTTPS, Auth, access logging, routing requests to different machines or Lambdas based on url path, canary deployments, and more. Use as much or little as you like. The cost difference between ALB and NLB is negligible for most use cases.
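Using the SQS figures quoted in the list above ($0.0000004 per request, first million requests per month free), the monthly bill can be sketched like this. It only models those two numbers; anything else on a real bill, such as data transfer, is out of scope.

```python
def sqs_monthly_cost(requests, price_per_request=0.0000004,
                     free_requests=1_000_000):
    """Monthly SQS request cost using the per-request price and free
    tier quoted in this thread."""
    billable = max(0, requests - free_requests)
    return billable * price_per_request


for monthly_requests in (100_000, 1_000_000, 11_000_000):
    cost = sqs_monthly_cost(monthly_requests)
    print(f"{monthly_requests:>12,} requests -> ${cost:.2f}")
```

An app that stays under a million requests a month pays nothing for requests under these figures, which matches the $0 charges reported elsewhere in the thread.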
If you're new to AWS and cloud stuff: welcome, the water's fine. Sure, it's hard at first to grok all the breadth and depth of the offerings, but you get to build with the same tools the largest services on the web depend on. Start small and pick a service, learn it from the console, then the CLI. Hit up the docs a lot and ask questions; there are lots of people who can help. Do NOT get led astray by this article or opinion pieces on Medium.
Something like 75% of AWS services are geared at serving enterprises with on-premise data centers that want to move some or all of their workloads to the cloud. Stick with the basics like S3, Lambda, SNS, SQS, and whatever servers you need. Pick a managed service vs servers when you can unless you have solid sysadmin resources and a good reason. Avoid hourly billed services whenever you can until you can get cost management figured out. VPC is daunting unless you're good with networking, but you can avoid it entirely by using the primitives mentioned above. IAM takes some real grokking, but there is almost nothing you can't do with it to scope permissions both internally and externally.
AWS YouTube Channel is full of quality material for all levels. Try to start forming your opinions with those and your own experience instead of unsubstantial articles or random HN comments :)
I really like this 'tl;dr, consider this, start with this rather than that until/unless you really need it' style, was left disappointed it didn't cover more services!
I would absolutely recommend S3 for static web hosting. Just add CloudFront on top if you need HTTPS, it takes like 2 clicks.
If anyone is interested, I answered on StackOverflow how to deploy a front end app (React, Vue, etc) to S3: https://stackoverflow.com/q/41250087/652669