How I use the good parts of AWS (twitter.com/dvassallo)
357 points by DVassallo on July 28, 2019 | 165 comments

I think this is really great for getting stuff up and running as quick as possible, maybe if you're just starting out, but I'm really surprised at some of the things said. Maybe if this is a site that you run on the side.

Doing test/staging on a nano and then pushing to production on an m5? What? Like get ready to troubleshoot random issues completely unrelated to your code. And then equating different AMIs to containers is vastly oversimplifying things.

With docker I can just get up and running on my local box and every dev I share this with has the exact same environment and config.

I can package up those images and send them up to ECS/Kubernetes and not deal with the same headaches. The learning curve is a bit steeper, but absolutely worth it.

I walked the same path here (starting out with single VMs for environments, installing services locally on my laptop) and the headaches down the line are not worth it and it isn't even a case of over-thinking the solution. You don't need to ride the cutting edge but modern tooling saves a ton of time.

It was a PITA migrating our old VM stuff over, but absolutely worth it. If you don't want to deal with maintaining systems there are other solutions mentioned ranging from full PaaS to something like GKE.

> With docker I can just get up and running on my local box and every dev I share this with has the exact same environment and config.

Well, that's the dream. The reality is like that sometimes, but also sometimes like "I ran `docker-compose up -d` like you said, and the local-dynamodb container seems fine, but the app container output 'Cannot find /volumeforsomething' and died...?" and then there's a slack thread for a couple hours about which version of Docker and is it native or Docker for Mac, and whether to try upgrading Docker for Mac first or just `docker volume rm -f volumeforsomething` or even `docker volume prune -f`...

Yeah but some of us have paid those dues over time and don't have that level of issue with Docker any longer.

I used to really struggle getting Docker and docker-compose to do what I want, but after a few years working with it, I'm not blocked by the various volume or network or what-have-yous that used to come up.

Alternative proposal: run your containers in docker-compose environments on an ec2 instance. Everyone wins?

If it takes years to figure out how to do things "properly" with Docker, then maybe it isn't very good.

Maybe, but now that I have, it's very very good.

It's also totally reasonable to say, "It shouldn't have taken you years." It probably shouldn't have!

We started that way. I'm honestly surprised this isn't suggested more often. It allowed our devs to benefit from containers while not totally uprooting our existing Infrastructure.

We've mostly moved to EKS now but we had plenty of time to do it right thanks to this approach.

How would you handle ci/cd with docker compose on an ec2 instance?

Depends on... literally everything about your infrastructure. All kinds of options, pretty much anything is possible here, depending on what you want to do.

My favorite part about using ECS was never being able to deploy because our nodes had jacked up ecs agents on them. So the application was still running but you couldn’t touch it. We switched to GKE last year and haven’t had a single problem with infra since. YMMV but for simpler docker/k8s apps GKE has been so easy in my experience.

ECS Fargate is an option for fully managed container runtime/scheduling on AWS today. We're looking into using this for deploying applications. Anyone here had experience with it?

Fargate is nice but expensive. Run through the numbers for your app.

I’m moving away from ECS and going to EKS. The rationale is k8s has a significantly larger user base and I will benefit from that.

I’m using Fargate for almost everything now, and am never going back. It’s extremely simple and easy, does exactly what it says it does, and saves me a ton of time and headaches. (I use Terraform to set up services, and CircleCI (with the ECS orb) to deploy updates.)

What happens with Fargate if you need to SSH into the underlying instance? I haven't used it myself, and I'm not sure that it truly abstracts away the EC2 instance, but the description of Fargate always made me assume that to be the case.

I assume eatonphil is correct, but to be honest I’ve never even tried, and in my view that’s actually part of the point: full commitment to immutable infrastructure, made really easy. If something needs to change, I tweak the Dockerfile or the task definition or a config file and redeploy. No more SSH.

There is no underlying instance, it's a fully managed service.

I’ve used it for both running a website and backend batch processing, and it has worked great so far.

I am in agreement with everything generally except for CloudFormation and setting up Route53 zones manually.

I use Terraform to set up my infrastructure as service on AWS, including Route53 zones.

After the latest 0.12 upgrade to the language, Terraform is quite a bit more user friendly than CloudFormation, and importantly, not locked down to just Amazon -- it supports multiple clouds and on-premises solutions for declarative orchestration of resources.

My company[1] has written hundreds of thousands of lines of Terraform code as part of a commercially maintained library of prod-grade Terraform modules. We've also used Terraform to set up 100+ teams on AWS with prod-grade infra, and I can confirm that Terraform works very well for robustly launching on AWS.

There's also a fast-growing ecosystem around Terraform: lots of open source modules, automated testing frameworks, and a growing number of tooling solutions. In the early years of Terraform, bugs and stability were major issues. With 0.12, the maturity factor is becoming very compelling.

On a separate note, I'm surprised the author endorses plain old EC2 over Docker. I get the point of "choose boring tech," but it seems like launching your app on EC2 requires a whole bunch of rework around automation that's already done for you in ECS, EKS, or dare I say even Elastic Beanstalk + Docker.

[1] https://gruntwork.io

> I get the point of "choose boring tech," but it seems like launching your app on EC2 requires a whole bunch of rework around automation that's already done for you in ECS, EKS, or dare I say even Elastic Beanstalk + Docker.

I have a service that has to run tasks immediately that can take a while to complete, often in a minute, but occasionally taking hours. (And we're working on the obvious solutions of making it both fast and interruptible, but this is tricky.)

With ECS, if you want the automation to place your services, you run into trouble with these when you're deploying a new version because ECS assumes that tasks can be stopped quickly.

Especially, if ECS has asked a task to stop and the task is taking time to shut down, it can't deploy new work to that host until that process reports that it's done.

Does EKS handle this better? I'm inclined to drop ECS entirely and just bake an AMI because the automation feels like it's getting in the way more than anything.

What are the downsides of using Terraform? We are currently in the process of redoing a lot of our infrastructure and are considering Terraform. We had some bad experiences in the past with AWS (probably 12-18 months ago) and Terraform, especially when it comes to manual changes to resources in environments where manual changes for testing purposes are common (think changing security group rules, for example). It resulted in us having a broken state and being unable to apply changes to our Terraform deployment without tracing the manual changes and undoing them, so I'm a bit cautious about moving forward with Terraform. Have you experienced this recently? I'm intrigued by your comment and would love it if you could expand on it.

Ideally, don't allow manual changes to happen. It's not that hard to set up for different environments and testing, so IME, it's not been much of an issue.

However, if you really can't change your ways of working, which I understand if you can't, then try out the "terraform refresh" command. I've been importing state recently, to move some of our own infrastructure over to TF, and have found it to be quite useful for things like manual security group changes. Basically, I'm building things up bit by bit, and when one of my states gets out of sync I've been updating the local config and running that command, which brings the state back in line.

In general, once you get your workflows sorted out and running for a while, you're unlikely to have any major issues with Terraform. Just make sure to use remote states and version them whenever you can (for example, turn on versioning on the S3 bucket if you use S3 as the remote).

Terraform will generally just undo the manual changes for you. The issue is trying to mix manual and automation.

Author here. Fair. I never used Terraform beyond playing with it briefly, so I can’t really comment on the differences. I tend to choose CFN mostly because I’m familiar with it.

I'm in charge of cloud infrastructure at my company, and I've found that sticking with CloudFormation on AWS is more powerful and versatile than Terraform, since TF isn't always a perfect mapping, where CFN is, with the caveat that it's sometimes months behind what their API can do.

To that end, over the years, I've come across a couple of tools which help write CFN templates and manage rollout keeping cross-stack dependencies in mind. If anyone does a lot of CFN, I highly recommend Troposphere (Python abstraction for CloudFormation) [1] and Stacker (manages stack rollout via dependency DAG) [2]

[1] https://github.com/cloudtools/troposphere

[2] https://github.com/cloudtools/stacker

Fwiw terraform also lags behind what the underlying apis can do -- but it's a community effort and you can contribute to add the functionality you need, whereas with CFN you can only ask your TAM (iff you have one!) to ask the internal team(s) (not sure if only CFN team or only the service team or both have to implement something) to prioritize your feature request.

As a counterpoint: my org uses both Ansible and CloudFormation.

If your use case is to create 10+ objects in parallel (as a response to a customer action) I really like CloudFormation...

We built a system around it, and I love the robustness of CF... It's "fire and forget". We can literally create a virtually unlimited number of stacks in parallel and check on them later to make sure they are successful... Granted this is probably an unusual use case, but we use a stack to represent a customer... In essence... And we're providing a real (temporary) server or set of servers for student/education (so it's not like we can redesign this for something other than making EC2 instances). We can support an influx of 100 users per minute using CF, but using Terraform or Ansible we had to spin up temporary worker nodes to run their scripts and could only construct as many stacks as we had worker nodes... So CF lets us have one $30/month server support nearly infinite scalability.. but Ansible was requiring us to spin up nodes 1:1 with how many we wanted to construct in parallel and Terraform looked no better.
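A minimal sketch of the "fire and forget" pattern described above, in Python. It assumes a boto3 CloudFormation client (`boto3.client("cloudformation")`), whose `create_stack` call returns once the request is accepted and whose `describe_stacks` call reports a `StackStatus` later. The function names, the `customer-` naming scheme, and the template are hypothetical, not the commenter's actual system.

```python
from typing import Dict, Iterable, List

# Statuses CloudFormation reports while a create is running or has succeeded.
IN_PROGRESS = {"CREATE_IN_PROGRESS"}
SUCCEEDED = {"CREATE_COMPLETE"}


def launch_stacks(cfn, customer_ids: Iterable[str], template_body: str) -> List[str]:
    """Request one stack per customer and return the stack names immediately.

    `cfn` is assumed to be a boto3 CloudFormation client. create_stack only
    submits the request; CloudFormation builds the resources on its own side.
    """
    names = []
    for customer_id in customer_ids:
        name = f"customer-{customer_id}"  # hypothetical naming scheme
        cfn.create_stack(StackName=name, TemplateBody=template_body)
        names.append(name)
    return names


def check_stacks(cfn, stack_names: Iterable[str]) -> Dict[str, str]:
    """Classify each stack as 'pending', 'ok', or 'failed' at some later time."""
    results = {}
    for name in stack_names:
        status = cfn.describe_stacks(StackName=name)["Stacks"][0]["StackStatus"]
        if status in IN_PROGRESS:
            results[name] = "pending"
        elif status in SUCCEEDED:
            results[name] = "ok"
        else:
            results[name] = "failed"
    return results
```

The calling host only submits requests and polls, so the number of stacks being built at once is not tied to the number of worker nodes.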

In addition.... And on this I can't speak for specifically Terraform.. but we also have been burned by community modules changing and breaking backward compatibility... We've been version locked on Ansible because version 2.6 broke security groups and version 2.7 broke another module... These are the popular built-in but community-maintained modules... But people redesigning modules don't know all use cases that could be available in CF (like referencing a security group in another account across a VPC link).. so a regular expression update and presto.. a valid SG reference is no longer usable in that Ansible version.

Since we use Infrastructure as Code, obviously we still see benefits. I 100% recommend codifying your system! However, I still like to add a caution that these solutions are typically not as fully featured as the native tools. The third-party tools are also not always the "correct" solution. In addition, "cloud agnostic" seems like a misleading term, since you can't pick up your Terraform or Ansible code and go to another provider. So I think it's more accurate to say "they can do multi-cloud orchestration of components" than "they are cloud agnostic".

Yeah. Terraform is the mature standard for this. To me it feels like OP isn't that familiar with large scale infra work.

Author here. I helped manage thousands of servers at Amazon across 20 regions. Obviously Terraform wasn’t an option :)

My company runs on AWS in all 20 of the non-government regions, as well as Google Cloud and our own data centers around the world. Yeah, certain regions (hello eu-north-1) are problematic, but we have workarounds for that. Terraform has been fine.

> Obviously Terraform wasn’t an option :)


Because I worked at AWS for the last 8 years :)

Tbh, I don’t think it’s prohibited, but CloudFormation was the default choice.

You probably would have run into security issues with using Terraform, as each version would have to be vetted, along with any extensions you use.

The Terraform AWS providers are maintained by AWS engineers, but the rest of the code will still need vetting.

Sometimes Terraform even supports new features earlier than CF. It seems a separate team inside AWS works on CF.

One of the nicest things about my current infrastructure is terraform + ECS + Route53 + ACM. There's basically no work required to set up a new server behind TLS loadbalancer and get the hostname registered, etc.

Check out CDK

Completely agree on just using a server instead of the various lambda-style systems. All modern languages have great web/app frameworks that make it incredibly easy to build, whether it's a single endpoint or a giant app. The ability to just include whatever code you need and deploy it atomically is massively underrated. Also agreed on scale: servers are fast and cheap, and the savings from Lambda rarely pay off in the extended effort. When you do scale, Lambda becomes more expensive anyway.

I do recommend using Docker though. Containers are more portable and easier to deploy and replace on a running server, along with the ability to run multiple instances, mount volumes, set up ports and local networks, and eventually migrate to something like ECS/K8S if you really need it.

Extended effort to push up a lambda function and not have to worry about automating deployment and configuration and patching and monitoring and upgrading and failing over and, yes, scaling? Maybe it's just me but I'd rather not see the backend of a server ever again for anything other than development.

That's why I recommend containers, because automating deployment and config would be the same regardless of destination, right? Monitoring also seems to be the same if you're using built-in cloud stuff.

As for scale, I think that's massively overstated. Servers are really fast and most apps aren't anywhere near capacity. Even a $10 digitalocean server is plenty of power, and there's no cold starts. Even YC's advice is to focus on features and dev speed, and worry about scaling when it truly becomes an issue.

But a lambda is just a container that you don’t have to manage.

I don’t get this sort of anti-serverless sentiment. If you have even one good SRE, then it’s an absolute breeze. Writing a lambda function is writing business logic, and almost nothing else. I can’t see how you could possibly do any better in terms of development velocity. I don’t get this ‘testing functions is hard’ trope either. Writing unit tests that run on your local machine is easy.

Your code becomes AWS specific, it is more expensive if you need to scale, it has higher latency, it is harder to test locally etc etc.

IMO lambda is awesome to handle infrastructure automation.

> Your code becomes AWS specific

Not really, aside from the other AWS services you consume (KMS, parameter store...). A cloud function takes an event, executes your business logic, and returns a response. The structure of the event can change slightly, but they’re remarkably portable, and I’ve moved them before. If you’re doing it right, most of your API gateway config will be an OpenAPI spec, and equally portable.
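To illustrate the portability point, here is a sketch in Python that keeps the business logic free of provider details and wraps it in thin adapters. The `handler(event, context)` signature and the `queryStringParameters` / `statusCode` / `body` shapes follow the API Gateway proxy integration for Lambda; `greet` and the `other_provider_handler` adapter with its plain `request` dict are hypothetical.

```python
import json


def greet(name: str) -> dict:
    """Business logic: knows nothing about AWS or any other provider."""
    return {"message": f"Hello, {name}!"}


def lambda_handler(event, context):
    """Thin adapter for AWS Lambda behind an API Gateway proxy integration."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    result = greet(params.get("name", "world"))
    return {"statusCode": 200, "body": json.dumps(result)}


def other_provider_handler(request: dict) -> str:
    """Hypothetical adapter for a provider that passes a plain request dict."""
    return json.dumps(greet(request.get("name", "world")))
```

Because the handler is a regular function, calling `lambda_handler({"queryStringParameters": {"name": "HN"}}, None)` from a local test exercises the same code path that runs in the cloud, and moving providers means rewriting only the adapter.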

> it is more expensive if you need to scale

This is context specific.

> it has higher latency

Again context specific, and likely not something actually worth caring about.

> it is harder to test locally

This is one I simply cannot understand. You can run your functions locally, they’re just regular code. I’ve never had a problem testing my functions locally. If anything I’d say it’s easier.

There’s upsides and downsides to any architecture design. Serverless models have their downsides, but these anti-serverless discussions tend to miss what the downsides actually are, and kinda strawman a bunch of things that aren’t really.

I’d say the most common downside with serverless is that the persistence layer is immature. If you want to use a document database, it’s great, if you want to use a relational one, you might have to make a few design compromises. But that said, this is something that’s improving pretty quickly.

Focus on features and dev speed by managing a container mesh, the underlying server, system libraries, patching for security, handling a potential spike, solving each problem with your architecture as if it were novel, etc.?

There are times to go serverless and times to avoid it, but with what you're saying you want to optimize for, serverless is the answer.

I guess you can make either one as complicated as you want, but surely just putting a container on a server is rather simple? There's no mesh for a single server, and is a potential spike a realistic concern?

I get your point but I think with products like Knative/Cloud Run everything will converge on a lambda-for-containers model eventually which combines the best of both worlds.

There's still a mesh. Containers need to know which containers to communicate with and across which ports.

If putting a container on a service at scale were simple then services like Lambda would have never been popularized and orchestration frameworks like kubernetes wouldn't exist

I don't think popularity means it's the best option. That's the point of the blog post.

I'm also the first to recommend Kubernetes as soon as you need it as it's a solid platform, but most apps stay small and don't need all that upfront complexity. However I stand by Knative being the best of both, have you had a chance to look at that?

I agree: servers and containers to get started...

However, once you are running enough containers, your server bill becomes something you can't ignore...

Making enough servers with enough memory capacity to keep all our containers running with failover support was $400/month.. and that was just 60 containers (an easy number to hit in micro-services architectures).

And you're right, we never got near server capacity by CPU usage .. completely agree there, we ran out of memory to keep the containers in memory ready for use.

> not have to worry about automating deployment

How do you validate your code before deploying to production? If you test in environments besides production how do you manage configuration settings for the different environments (i.e. a db connection string)? How do you avoid patching? Almost any code I've written takes dependencies on 3rd party libraries and those will often have security vulnerabilities (usually some time after I wrote the code).

I mean of infrastructure, servers etc. I.e. the days of Puppet and Ansible are behind me. And patching a Lambda is as simple as pushing up a new version. No downtime, restarting services etc. And no patching of OS or Docker containers. Or building them. Configuration is as simple as environment vars or SSM for secrets.
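As a sketch of the environment-variable approach mentioned above, the same code can read per-environment settings at startup. This is Python using only the standard library; the variable names `DB_CONNECTION_STRING` and `APP_ENV` and the `load_config` helper are made up for illustration. Fetching secrets from a store such as SSM is not shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Config:
    env: str
    db_connection_string: str


def load_config(environ: Mapping[str, str] = os.environ) -> Config:
    """Build the config from environment variables, failing fast if one is missing."""
    try:
        db = environ["DB_CONNECTION_STRING"]  # hypothetical variable name
    except KeyError:
        raise RuntimeError("DB_CONNECTION_STRING is not set") from None
    # APP_ENV is also a hypothetical name; it defaults to "dev" when unset.
    return Config(env=environ.get("APP_ENV", "dev"), db_connection_string=db)
```

Passing the mapping as a parameter keeps the function easy to test: staging and production differ only in the values set on the function or container, not in the code.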

I agree with this 100%. The only logical reason yes is job security and that's about it.

How coupled is your code to your host? You should always design to lift-and-shift between these as easily as possible. That way you keep the benefits and aren't locked in.

Cloud functions products have been an incredibly useful trade-off for me. Whenever I want to process a lot of small files (read: tens or hundreds of thousands), I'll wrap some JavaScript into a function and it'll magically horizontally scale a function for a huge amount of tasks at the same time.

Has it done in seconds.

What about for bursty workloads? Lambda seems to really shine there.

Yes, they do make it incredibly easy for you to do node, you know what else they do?! They make it completely insecure by default.

If you don't know what I mean by that, then you should probably go with a serverless architecture instead of whatever your company has going right now.

Can you please explain what you do mean by that? Are you talking about node/js apps being insecure by default? I guess that's a fault of the specific app framework, rather than an inherent issue with running on a server.

Our company runs services written in C# running on .NET Core in containers. It's fast, secure, and makes development simple.

If you're using Node on the server side, all packages/dependencies included and running on your Node.js backend have full access to your network and filesystem. Even if it's just a CSS styles library, once it's updated there are no permissions stopping it from grabbing files or monitoring the network.

Instead of wrapping security layers around it ourselves with docker, selinux configs etc, it's safer to let gcp or aws filter that out for you because they're likely to have way better security.

Serverless ( there's still servers/containers ) just means that you don't touch the devops and scaling. You can still have your DB and APIs separately in order to be cost effective.

In your case your servers are not using node on the backend to run the servers thus you don't have this vulnerability.

Perhaps I'm alone here, but I'd say 50% of my time with AWS goes to (attempting to) properly configure the VPC as well as the IAM roles. For the majority of small-ish projects, the hardware isn't nearly as important or difficult as properly configuring access rules between services and outside parties.

The lack of this hassle is a big part of what has attracted me to Azure.

I had a list at one point, not much VPC and IAM Roles, but routing tables, CIDR blocks, NAT Gateways, I can't remember what all else.

I just remember thinking that, while such configuration options might be powerful in the right hands, if you expect your application developers to manage all that, your cloud services are designed at the wrong level of abstraction.

Completely agree. I'm an information security consultant and my organization has been getting into cloud security more and more over the past few years. Fundamentally there's not a big difference in cloud vs on-premise security, the risks are the same and the technology is more or less the same so the steps you take are more or less the same... with that one huge caveat.

Far too often I see clients who move their development and operations teams to the cloud but not their network or IAM/GRC teams. And they expect that because AWS (et al) is the cloud, it's just the DevOps people who need access and they can handle it all. So now you have developers and system administrators doing networking, firewalls, IAM, etc.

There's a big difference between enterprise cloud environments and developer-friendly cloud environments.

This is where GCP shines. The IAM and user system along with the organization/folder/project hierarchy is magnitudes better than anything in Azure or AWS. Makes managing everything from hobby projects to big corps simple.

The only annoying thing is that GCP uses JSON credential files for service accounts rather than just API key/secret pairs which are more portable, but you can usually avoid dealing with this by setting the account for the underlying VM or managed service.

> I had a list at one point, not much VPC and IAM Roles, but routing tables, CIDR blocks, NAT Gateways

Just curious - how is the developer experience different on Azure? There are different types of developers. Some like to have full control and end up building their own cloud infrastructure while others like to have no control and go with something like Heroku (or maybe Azure, but I'm not too familiar with that hence my question).

To set up basic App Services, Azure Functions, Web Jobs and the few other bits that I have been involved with, I have needed to do zero networking or permissions configuration.

Perhaps this is severely less secure by default, but it sure does make for a lower barrier to entry, and for reduced distraction from app development.

I also liked Heroku quite well. Felt at one point like I needed to move to Azure for some technical reason but right now, I can't recall the reason.

Originally in AWS I wanted to send emails from within a Lambda Function, and doing so required me to rework my networking configuration that I felt like I had barely gotten running AND pay ~$30/mo for some additional required networking service. IP Gateway or NAT Gateway or some other such. That was what finally ran me off from AWS.

It is awesome if you use direct connect with on prem, then use on prem as baseline and AWS for spikes.

Don’t you have to do it just once though? I have a CloudFormation template that I wrote in 2014 to set up all VPCs, Subnets, Gateways, Security Groups, etc, and it’s still all valid today.

I think I'd agree if I had a better grasp of networking in general. VPC still requires a solid understanding of networking, which is my biggest challenge.

What part of VPC networking is challenging to you? I'm genuinely interested, because I find the basics of internet (IP) routing and addressing to be very straightforward. Not trivial, certainly, but it's not black magic, and I don't think VPCs require anything esoteric.

It all essentially boils down to some basic operations with binary numbers; the decimal representations of IP addresses are useful shortcuts, but to really get a sense of what's happening, you need to look at the binary forms.
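As a small illustration of those binary operations, the sketch below uses Python's standard `ipaddress` module to show that "is this address inside this CIDR block" is a bitwise AND against the netmask. The addresses are arbitrary private-range examples, not taken from any real VPC, and the helper names are made up.

```python
import ipaddress


def in_block(address: str, cidr: str) -> bool:
    """Check CIDR membership by hand: (address AND netmask) == network address."""
    network = ipaddress.ip_network(cidr)
    addr_bits = int(ipaddress.ip_address(address))
    mask_bits = int(network.netmask)
    return (addr_bits & mask_bits) == int(network.network_address)


def to_binary(address: str) -> str:
    """Render an IPv4 address as four dotted groups of 8 bits."""
    return ".".join(f"{octet:08b}" for octet in ipaddress.ip_address(address).packed)


print(to_binary("10.0.1.25"))      # 00001010.00000000.00000001.00011001
print(to_binary("255.255.255.0"))  # 11111111.11111111.11111111.00000000
print(in_block("10.0.1.25", "10.0.1.0/24"))  # True
print(in_block("10.0.2.25", "10.0.1.0/24"))  # False
```

A `/24` keeps the first 24 bits as the network part, so two addresses are in the same subnet exactly when those 24 bits match.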

> What part of VPC networking is challenging to you?

It can be a source of friction for someone that is just trying to accomplish a specific goal, in this case it might just be to use AWS to go live.

If building a database-backed prototype is your goal, designing a schema is only tangentially related to that goal. Schema design ability can slow you down if you have limited familiarity with SQL and/or the entity relationship model used by RDBMS.

NoSQL allows you to reach that specific goal faster than with an RDBMS by lowering the friction of creating database records and it does it by flipping the schema model from schema-on-write to schema-on-read which is a huge time saver. This made NoSQL great for building prototypes, especially for devs with little or no familiarity with SQL, because it eliminates the upfront effort of designing a schema (that is certain to change anyway) before you can begin to store records in the database.

In other words, the use of a far more familiar language (JavaScript vs. SQL) and the simpler programming model (document-model vs. entity relationship model) is a big part of why MongoDB was able to earn mind share among devs evaluating NoSQL databases.
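To make the schema-on-write versus schema-on-read distinction concrete, here is a sketch in Python using only the standard library. SQLite stands in for an RDBMS and a list of JSON strings stands in for a document store; neither is MongoDB, and the `cars` table and its fields are invented for the example.

```python
import json
import sqlite3

# Schema-on-write: the table's columns are declared before any row is stored.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cars (make TEXT NOT NULL, model TEXT NOT NULL)")
db.execute("INSERT INTO cars (make, model) VALUES (?, ?)", ("Honda", "Civic"))


def insert_with_trim(conn: sqlite3.Connection) -> str:
    """Try to store a field the schema does not know about."""
    try:
        conn.execute(
            "INSERT INTO cars (make, model, trim) VALUES (?, ?, ?)",
            ("Honda", "Civic", "EX"),
        )
        return "stored"
    except sqlite3.OperationalError:
        return "rejected"  # the table would have to be altered first


# Schema-on-read: documents are stored as-is; structure is interpreted when read.
documents = [
    json.dumps({"make": "Honda", "model": "Civic"}),
    json.dumps({"make": "Honda", "model": "Civic", "trim": "EX"}),
]


def trims(docs):
    """The reader decides how to handle documents that lack a field."""
    return [json.loads(d).get("trim", "unknown") for d in docs]
```

The document side accepts the new `trim` field with no upfront work, but the handling of records that lack it moves into every piece of code that reads them.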

If you're just setting up a prototype environment then yes, sure, set up a VPC and go with the defaults and don't worry overmuch about what you're doing. However, when you start setting up a production environment, the story changes. If you're a developer creating network-facing applications, then I am convinced that at least a basic understanding of computer networking is not optional.

I think the attitude that not everyone needs to know these things leads to people mistakenly thinking they can always have them be someone else's problem which furthermore results (needlessly!) in brittle systems.

I'm not asking for every developer to become a networking expert, but learning at least the very basics about your dependencies (you depend on networking, after all) is what I think a professional should do.

To address your NoSQL example, certainly it can be convenient to prototype with a dynamic schema, but you as a developer should still understand that even if it is dynamic, your data still has a schema and a concrete structure, which will affect your application in various ways. For rapid prototyping it's fine to ignore all this, but again once you start thinking about production, you no longer have the luxury.

In both cases, NoSQL or cloud providers that provide predefined networking, it is easy to start, but it will bite you in the butt once your project becomes bigger.

Using an RDBMS has a steeper learning curve, but organizing your data right will pay back in bundles.

VPC is similar.

It's not that hard to configure; it is actually trivial compared to setting up an on-prem data center. You can learn it or hire someone who knows it.

Same with RDBMS, learn it (it is not hard) or hire a DBA.

I worked at one of the popular car shopping sites, and while they used PostgreSQL they stored car information as a Thrift-encoded blob. Every car trim was a separate row. The data had to be ETL'd to a NoSQL database to be extracted, which meant that looking up all car makes took 7 hours (the ETL happened through AMQP, don't ask). Of course it was then stored in a dedicated collection, but it was ridiculous. Out of curiosity I wrote code to convert the data from that form to a relational database (PostgreSQL); this actually helped to find that there were duplicates. The queries on the PostgreSQL database also had lower latency, and it didn't look like it would have a problem handling the traffic. This data is also read-only for users, so it is trivial to scale out.

They also had a database to map IP and latitude/longitude to a zip code. The data in MongoDB was taking 30GB and they had to use instances with enough RAM to hold all the data, otherwise MongoDB would choke. The exact same data, when stored in PostgreSQL using the right types (PostGIS and ip4r), took a bit over 600MB.

They also used Solr for storing inventory data.

I'm sure you are shaking your head, thinking that you would not do such things, but if you have NoSQL blinkers on you will often run into problems that aren't problems at all.

I like how you managed to turn a question about VPC networking on AWS into an advertisement for MongoDB!

LOL :-)!

That was never my intention and I have absolutely no connection to them in any way whatsoever.

MongoDB just turned out to be an example I felt a lot of HNers could readily relate to, considering some of their gaffes as they grew made the front page, like this one: https://stackoverflow.com/questions/16833100/why-does-the-mo...

Access control in vpc is strictly more powerful than in ec2 classic, but it also forces you to deal with a lot of stuff you didn't have to in ec2 classic (availability zones, subnets, routing, gateways).

If vpc is "ec2 v2", I would like to see a v3 that allows you to start off with a naive/simple setup and gracefully transition to the complicated setup of vpc when/if you need to.

Depends on how fast your project grows and evolves. Hub and spoke simplifies a lot of things but not everything.

IAM is a killer. You’re absolutely right. Are GCP permissions easier? I haven’t used them as much.

I’ve stopped worrying about minimising IAM permissions and tend to just use the built in AWS roles for most things now.

Yes, I do the same. For service roles I just use the PowerUser managed role. I don’t see the need to put access control on Amazon’s ability to call its own services. I only restrict my EC2 instance profile, since that’s a bit more vulnerable, and I tend to know very precisely what it should have access to.

What if you have a lambda with a full admin role that is not sanitizing its inputs? Or maybe it's using an outdated file parsing library (csv/yaml) with a vulnerability. Now your entire AWS account could potentially be compromised.

Yes, I would use a restricted role for Lambda too. Anything that gets creds in user space gets restricted permissions: EC2, Lambda, ECS, etc.

Some of the AWS built-in roles are an absolute car crash, no idea how they got through review (EMR is a good example). I use the built-in roles by default, but only after thoroughly reviewing the policies; I create my own based on them if I find anything I don’t like.

It’s not restricting amazon’s access that I’m worried about, more privilege escalation (e.g non-constrained iam:PassRole in combination with anything is a good one)
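To make the iam:PassRole point concrete, here is a minimal sketch of the kind of check I mean when reviewing a policy. It only looks at the policy JSON itself; the function name and the example policy are hypothetical, and it is no substitute for a real policy analyzer (it ignores NotAction, NotResource, permission boundaries and so on).

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single string or a list for Action/Resource; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def unconstrained_pass_role(policy):
    """Return Allow statements that grant iam:PassRole on Resource "*" with no Condition.

    Hypothetical helper for illustration only.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may be given as an object
        statements = [statements]

    risky = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        # Action names are case-insensitive and may contain wildcards ("iam:*", "*").
        patterns = [a.lower() for a in _as_list(stmt.get("Action"))]
        grants_pass_role = any(fnmatchcase("iam:passrole", p) for p in patterns)
        any_resource = "*" in _as_list(stmt.get("Resource"))
        if grants_pass_role and any_resource and "Condition" not in stmt:
            risky.append(stmt)
    return risky


# Made-up example policy: only the first statement should be flagged.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "iam:PassRole", "Resource": "*"},
        {"Effect": "Allow", "Action": ["iam:PassRole"],
         "Resource": "arn:aws:iam::123456789012:role/app"},
        {"Effect": "Allow", "Action": "iam:*", "Resource": "*",
         "Condition": {"StringEquals": {"iam:PassedToService": "ec2.amazonaws.com"}}},
    ],
}
flagged = unconstrained_pass_role(example_policy)
```

Anything this flags is worth a closer look, since whoever holds that permission may be able to hand a more powerful role to a service they control.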

https://github.com/tilfin/aws-extend-switch-roles/blob/maste... may be interesting to you.

The fact that this has to exist is also interesting...

I'll hate on GCP all day long but the one thing it has going for it is the permissions.

Wow, I was gonna say “haven’t tried IAM on other platforms than GCP but I can’t imagine it being more difficult than this”, but damn ...

night and day imho

“Step 1: Forget that all these things exist: Microservices, Lambda, API Gateway, Containers, Kubernetes, Docker.

Anything whose main value proposition is about “ability to scale” will likely trade off your “ability to be agile & survive”. That’s rarely a good trade off.”

I don’t see these as being about scalability. Rather, they’re about fast time to market and ability to change. Moving up the stack and adding managed services such as API Gateway will definitely give your product a better chance of survival.

The disadvantages I’m highlighting are about the restrictions of the Lambda abstraction.

What if you want to send telemetry to a third party? Or use a cache? Or deploy something bigger than 250MB? Or handle WebSockets (without having to read/write state in DDB on every message)? Or buffer something on the filesystem? Or run something for more than 15 mins? etc etc.

How do all those things not impact agility? (In the web app/service space at least.)
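As a rough illustration of what I mean, here is a sketch that checks a workload against the two numeric limits I quoted above (250MB package, 15 minutes). The `Workload` type and function name are made up for the example, and the limits are the ones from this comment, so check the current AWS docs before relying on them.

```python
from dataclasses import dataclass
from typing import List

# Limits as quoted in the comment above (assumptions, not an authoritative list).
MAX_PACKAGE_MB = 250
MAX_RUNTIME_MINUTES = 15


@dataclass
class Workload:
    """Hypothetical description of a job we are considering running on Lambda."""
    package_mb: float
    runtime_minutes: float


def lambda_blockers(workload: Workload) -> List[str]:
    """Return a human-readable reason for each quoted limit the workload exceeds."""
    blockers = []
    if workload.package_mb > MAX_PACKAGE_MB:
        blockers.append(
            f"package is {workload.package_mb} MB, limit is {MAX_PACKAGE_MB} MB"
        )
    if workload.runtime_minutes > MAX_RUNTIME_MINUTES:
        blockers.append(
            f"runs for {workload.runtime_minutes} min, limit is {MAX_RUNTIME_MINUTES} min"
        )
    return blockers


small_api = Workload(package_mb=40, runtime_minutes=2)
nightly_batch = Workload(package_mb=300, runtime_minutes=20)
```

An empty list for `small_api` only means it passes these two checks; the other items on my list (telemetry, caching, WebSocket state, filesystem buffering) don't reduce to a number like this.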

I've been working with Lambda for about two years now, I'd like to answer all of your concerns. I don't work for AWS but I do love Lambda and I think it has its place amongst everything else. Previously I built node.js+Postgres apps hosted on DigitalOcean for ~4 years.

>What if you want to send telemetry to a third party?

Can't you do this from your own Lambda code? Sure, your code could crash before it can reach your telemetry service - but isn't this a concern on a server based app as well?

>Or use a cache?

Elasticache or Mongo or whatever NoSQL 3rd party service you want to use works straight from Lambda. If you're talking about caching Lambda responses, you can again add custom code, which you would also have to do in a server environment.

>Or deploy something bigger than 250MB?

Yeah, you're SOL here. 250MB is huge for any non-GUI software and it would take a long time to get set up on Lambda's containers. If you're in this spot, I wholeheartedly recommend ditching Lambda for EC2. However, don't count Lambda out entirely - you can still have it take over repetitive, simple tasks so the server hosting your monster 250MB backend doesn't get overwhelmed!

>Or handle WebSockets (without having to read/write state in DDB on every message)?

I'm sure when PHP introduced sessions there were a bunch of devs complaining about having to maintain a MySQL table for session keys. Ultimately, if you're making a web app, 95% of its routes are probably glorified Excel formulas, so you'd need to pull and push state through a database anyways.

Where else do you store WebSocket state? In memory? What happens when you got millions of connections at once (think slither.io scale)? At the end of the day you have to put it into something that can scale. I'm assuming if you're using Lambda you care about scaling up - otherwise you could literally run your backend on an IoT toaster with a MySQL database hosted on some IoT coffee maker that had "admin" as its root password and nobody would tell the difference. Again, if this is you - Lambda wasn't meant for your use case. Go buy a coffee maker.

>Or buffer something on the filesystem?

I'm not really sure how to answer this one. There is obviously no permanent file system on Lambda unless you count S3 (although I'm sure you know that).

I did some searching and found this: https://stackoverflow.com/a/31660175 If you're talking about uploading huge files, direct upload to S3 seems like your best bet.

>Or run something for more than 15 mins?

Lambda wasn't meant for this. Set up a server and schedule a cron job. I'm guessing if it takes >15min it's probably some kind of backup, statistical analysis, ML model training, database dump parsing, yadda yadda yadda...Lambda is for handling events. Everything I mentioned seems like an internal business operation the users have no part in, so that also seems like a good candidate for just having one server instance floating around and throwing all your odd long jobs onto it.

I haven't used Lambda Step Functions, but I vaguely recall hearing something about being able to run long tasks with those? Not sure. I wouldn't bother, though, I'd just head straight for a server (the example Amazon gives is starting a job to retrieve a specific item in a warehouse, sending an order to an inbox, and waiting for a warehouse worker to mark the item as retrieved...I don't know why Amazon chose that specific example, but it sounds like they have a very specific target audience!)

>How do all those things not impact agility?

I'm of the opinion that tools do not impact agility, decisions do. If you decide to use Lambda in a situation where a server would prevail, you're wasting time on the wrong thing. If you decide to use a server in a situation where Lambda would prevail, you're...not really doing anything wrong, I think. It'll likely cost you more than a Lambda function but those numbers only start to matter once you actually have to care about scale.

Like I said, it's ultimately about what you're trying to do. Don't put a square hole through a rectangular screw, or something like that.

Thanks for writing and explaining all of that! I’m sure one way or another there’s a workaround for everything. But my point was exactly that. The fact that you need workarounds hinders agility. I don’t disagree with the benefits.

BTW, about this “What happens when you got millions of connections at once (think slither.io scale)? At the end of the day you have to put it into something that can scale.”

API Gateway has a hard limit of 500 WebSocket connections per second. It can’t be increased! https://docs.aws.amazon.com/apigateway/latest/developerguide... —- That’s about the capacity of 1 C5.4XL instance :)

The whole scalability argument of API Gateway and Lambda is highly overrated IMO. There are all sorts of soft and hard limits, and you still have to monitor utilization of concurrency rate and invocation frequency and manually request limit increases when approaching them. Doesn’t sound much different than using EC2.

> API Gateway has a hard limit of 500 WebSocket connections per second

That particular limit is new connections per second, not total connections per second. Starting from zero, you could have 1.8M connections after an hour (per account, per region).
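The arithmetic behind that figure, as a quick sketch. It assumes every connection opened stays open for the whole period and that no other limit (idle timeouts, maximum connection duration, etc.) kicks in first; the function name is just for illustration.

```python
NEW_CONNECTIONS_PER_SECOND = 500  # the limit quoted above


def max_open_connections(seconds, rate=NEW_CONNECTIONS_PER_SECOND):
    """Upper bound on open connections after `seconds`, starting from zero,
    assuming nobody disconnects in the meantime."""
    return rate * seconds


after_one_hour = max_open_connections(60 * 60)
print(after_one_hour)  # 1800000
```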

> Doesn’t sound much different than using EC2.

I don't understand when people see "serverless" and take it literally, without realizing that Lambda is still using EC2 instances running on your behalf; it just hides that away from you. You should only use Lambdas when your use case would reduce the cost (Lambda charges per invocation), which means it is for services with a low number of requests. Otherwise it will end up being more expensive.

I can’t really imagine going back to scripting servers remotely to boot up without docker. It sucked. And if you’ve got docker you want something like ECS, cause why manage your own servers? It’s a bit rough to learn but man life sucked before PaaS. You had to build your own. Now I write a simple install script for docker on my machine, edit a menu and as many servers as I want run that app.

Otherwise I dig.

Author here. Docker is the only one in that list I’m on the fence on. However, I feel that the EC2 AMI can be the equivalent of the container image, and Docker would only be adding another layer and another OS to deal with. Sure, the AMI is not as portable as a docker image, but all in all I prefer working directly on the EC2 VM just for the sake of reducing layers.

Not even in the same universe. Using containers on a managed k8s cluster (EKS or GKE) with helm makes it ridiculously easier to manage a dynamic set of running applications than bare AMIs do. I have a k8s cluster that runs across >1k ec2 instances where I have thousands of containers running various jobs. I can apply/delete any massive deployment across that literally in seconds, with a single command.

The layers make it a lot lot easier here.

My perspective is shaped by my experience developing and maintaining mostly monolithic apps on fleets of servers. I never had the problem of having to manage dozens/hundreds of independently deployable jobs/services. And I hope I never will :) I can see how K8 can be super useful there, but it’s a situation I try to avoid :)

You know, that makes sense. I was missing that context.

If you ever do, please give container orchestration a look. It makes it honestly easy.

Weird -- the services you're utilizing are developed exactly the way you're trying to avoid!

When developing a monolithic application your choices make sense. When you're taking an approach designed to allow for feature development by multiple teams in an Enterprise setting they start to be a bottleneck.

I’ve seen AWS from the inside and I helped build a small part of it. Each individual product is closer to a monolith than a constellation of independently deployable jobs/services. The 3 products I worked on were definitely monoliths because we intentionally wanted them to be so. Obviously I don’t mean that they didn’t horizontally scale. Only that the entire app could run from 1 process, the build system produced 1 build artifact, every deployment deploys to all servers, and there’s only 1 version of the app running in prod.

K8S and containers in general encourage people to split applications into micro services even if there is no good reason for it.

What about development-wise? One of the big advantages I’ve found with Docker (at least on my teams) is that we’ve mostly eliminated discussions about things breaking on local dev machines due to environment differences. When things break it’s almost always due to code and configuration within our application itself rather than some OS package being a wrong version or something similar. How do you achieve that just using EC2 AMIs without maintaining remote EC2 instances for each developer and keeping them up to date?

That was exactly the reason I was very close to adopting Docker recently. But then I realized it was a very rare problem, in my experience. Maybe it’s because I always worked in small teams of <= 10 people, or maybe the stuff I work on tends to not cause local dev problems frequently. But I definitely see how Docker can help a lot there.

Even on small teams (< 10 people) you can quickly run into issues if you don't develop in the same environment you deploy to. For example, at one company I worked for, developers would use pypi packages but the operations team required us to deploy using the version of the python package available on the Debian release our hosts ran. This would create not infrequent discrepancies that certainly could have been dealt with in a number of ways, but it would have been pretty simple if we could just bundle a single environment and use it locally and in production. Vagrant and VirtualBox VMs are another alternative (that we did use too!) but Docker allows you to get even closer more easily if teams can agree on it.

Being in operations and a dev, I can tell you that is something the operations people are doing wrong. They make their own life harder. What about dependencies that are not in the repo? Will they be building their own packages? What if one of the packages has a bug discovered in production: will they create their own package with a bugfix, risking breaking system components that might depend on the old version? And when the OS version reaches EOL and you have to migrate, your app needs to be tested with the new OS, because it is tied to it. I have seen an upgrade from Python 2.6 to 2.7 being unnecessarily painful because of this, even though Python 2.6 code can run on Python 2.7 without changes.

You absolutely want your application to be disconnected from the system you are running it on. Languages like Python or Java were designed to run the same not only on different OSes but also on different architectures, so why would you ruin that?

With Python, if you use Red Hat or CentOS, I highly recommend the IUS repo, which lets you choose the exact Python version you want to use. Then create a virtualenv with the exact dependencies you need. If your ops are standing in the way of this they are misinformed, because what they are suggesting adds a lot of unnecessary work for themselves as well, with no tangible gain.

The AMI build process is also incredibly slow compared to docker. When worked into the build process, creating AMIs probably accounts for an extra ~30 minutes for every build, as you have to spin up a separate ec2 instance to create the AMI, snapshot it, copy full AMIs around between regions...

I don’t build AMIs on every build. In fact, nowadays, I just use the latest Amazon Linux AMI and set up the necessary stuff on instance boot using a UserData script. Example: https://github.com/encrypted-dev/proof-of-concept/blob/af60b...

That certainly works and is faster. But doesn't it turn UserData scripts into (part of) a provisioning management system? For instance in that example, you could have a machine die in the middle of the night, the autoscaling group replace it, and end up with a different version of node.js than the one you had before. Or the package server goes down and you can't bring up new instances at all.

I suppose you could bake everything into the AMI except your code, and hardcode the git sha into UserData to ensure your builds are reproducible. It just seems like it might get complex when coordinating lock-step code/dependency changes.

Don't be discouraged by the docker love in tech right now. I'm not saying it's a fad (it's basically critical for local dev these days), but nothing will replace the joy of simplicity and empowerment of working with the Linux virtual machine directly instead of through endless PaaS or compose or k8s abstractions.

With Fargate you don’t need Kubernetes or any crazy abstractions. I went from not knowing anything about Docker to setting up an autoscaling (didn’t need autoscaling, I needed HA) Node/Express microservice with CloudFormation within two days.

For me, modern Linux VMs keep changing with all the systemd stuff. It’s like every time I try to do something basic like configuring networking I have to look up whatever they’re doing this week. Fargate and k8s seem better documented nowadays and just run my app.

That’s the second part of the story for me. From 2000 until last year, all of my development and deployments have been on Windows.

I just started developing with .Net Core, Node, and Python last year. My “Linux deployments” were all Lambda. The Node/Express app used the lambda proxy integration.

I needed to move past the lambda limitations - 6MB request body and cold starts - and I didn’t want to introduce Linux servers into a Windows shop or host Node on a Windows server and I didn’t know Linux well enough myself. I was able to deploy a Linux Docker Container on Fargate and have all of the console output to go to Cloudwatch.

If we assume there's a way to quickly build an image for a vm (like docker build), are there workflows to quickly spin up and throw away those images?

There is packer. I highly recommend using it together with the salt module to do all the configuration.

There is also NixOS together with NixOps. It has a somewhat steep learning curve, but I think it is worth it. You edit configuration.nix, which is like a Dockerfile, and then you can either use a standard AMI and provide the configuration, which results in a setup similar to salt, ansible, chef etc., or use NixOps to provision such a machine. You can then place an /etc/NIXOS_LUSTRATE file, which removes state after the next reboot, stop the machine and generate an AMI, which essentially emulates what packer is doing. You can also use nix build to generate an image file and upload that to AWS.

NixOps is also a great tool for testing: you can deploy your configuration to a local VirtualBox or to a public cloud provider.

From a thread a while ago I know of these that work on Dockerfiles but build fully bootable disk images:





I think Packer (https://www.packer.io/) might be what you want there.


1. That's docker. 2. I should have specified "in local development".

I started playing with Nix and recently also tried NixOS and NixOps and I must say that it actually did correctly what other tools (salt, chef, ansible, docker, vagrant, packer and others) failed to do.

Nix approach is to build everything from ground up without depending on anything outside of it. It caches results so you don't have to rebuild the whole system when building your application. This approach makes fully reproducible builds[1], because the entire environment is known.

Nix by itself can be used for deployments: you can install your application on any machine that has nix installed and don't need to worry about any other dependencies, since nix will take care of all of them. It can generate a docker image if you need it, and it will only contain your application with its dependencies. You can use nix-shell to define a CDE with all developer tools installed at the exact same versions; that way developers only need to have nix installed and nix will take care of all the other dependencies you need.

NixOS takes what Nix does one step further and uses a configuration that similarly describes an entire operating system. The single configuration.nix describes what your system is supposed to have installed and configured. You can deploy that and have nix configure the machine on boot; or configure a machine, create the /etc/NIXOS_LUSTRATE file, which removes all state on the next reboot, and create an AMI out of it (equivalent to what packer does); or have nix generate an image file and upload that to AWS.

NixOps is supposed to be for deployments, but to me it replaces vagrant and docker: you can create a configuration.nix and deploy it to a local vbox, ec2 and other cloud providers. The great thing is that your configuration file will just work no matter which provider you use.

There are some rough edges though, for example I needed to update NixOps to use boto3 so it works with assume role and MFA, I hope it will be merged soon.

I believe the issue is that what they are doing is very ambitious and they have a limited number of developers to handle all of the work, but of all the devops tooling I have used they seem to have the right approach. They provide immutability and reproducibility at the right level and do it the right way (declarative, through use of a pure, functional, lazily evaluated language starting with a fully known state, vs. an iterative language with a partially known state).

[1] you need to pin your build down to a specific version of nixpkgs

Very interesting point.

But what about maintenance?

Working in a place where we have hundreds of small apps, the issue of maintaining umpteen servers is a great pain. Docker reduces the footprint of what has to be maintained substantially.

Stop me if I'm wrong, but there is technically nothing stopping AWS from offering the same management in EC2. Scaling docker containers is the same allocation problem as scaling VMs, so it could be done with EC2 too.

Maybe AWS spins up an actual EC2 instance for each container, it's an implementation detail.

EC2 is a legacy tool with different abstractions, but if you replace "VM" by "hypervisor" and "container" by "microkernel-app", you could do the same thing with one less layer and the same tech we always used

Yes I can see that. I was never in a situation of maintaining more than a couple of apps, so my perspective might be shaped through that.

Curious, what does "maintenance" mean for you?

Patching servers can be part of it. Also fixing minor issues when no major development is being paid for.

FWIW, there is tooling in place such that you can just recycle the hosts in your fleet and they'll be brought up with the latest image, so it can become just another scheduled and automated event.

Don't quite follow the latter.

I don't agree life sucked before PaaS. I've been responsible for responding to far too many production issues where the PaaS tooling got in the way of rapid response.

PaaS is a problem for me because so many developers don't understand (or own) the tradeoffs of choosing PaaS, so I constantly have to have the "no-PaaS" argument with people who have never dealt with the responsibility of uptime response and can't fathom why I'd want to "do more work when the PaaS does it for you".

I think if you start a new project, PaaS is definitely the way to go. You don't have that many infra people available, you don't have time to deal with building your own deployment details, etc. Once you have time, staff, and real users - move from PaaS to something you can prove is both a better fit and more reliable than (for example) Heroku.

But in the meantime? PaaS services solved a lot of issues for you for less money than it would take you to interview a sysops person. Use them as much as you can, whether it's complete app hosting, abstracted container deployment, or something else.

There's a third option: as you grow in your career as a developer, learn the ops stuff so you can make good choices from square one.

If your site is simple, sure, PaaS will work, but with anything advanced even simple things become complicated.

Anything whose main value proposition is about “ability to scale” will likely trade off your “ability to be agile & survive”. That’s rarely a good trade off.

Interesting that he sees micromanaging infra on AWS as more agile than using managed services.

Also, serverless systems don't cost you money when you(r customers) don't use them, so that's a huge plus on survival.

Finally, code you write is always a liability, if you got a good architecture (hexagonal, etc.) you can just swap out one service that saved you time in the past with another service that will save you time in the future.

>serverless systems don't cost you money when you(r customers) don't use them

All the horror stories I see surrounding huge surprise Lambda bills always bring me back to this point. If I have to pay $5/mo for a server I only use for two hours once every few months, that's something that should go on Lambda. If it's something that I'm using constantly, all day every day, a server will be cheaper.

If I only use my car once every few weeks, Uber makes a lot of sense. If I use it every day back and forth to work and the grocery store, Uber's gonna be a lot more expensive.

Lambda is for small tasks that don't execute often. And using it right can save startups gobsmackingly large amounts of money.
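If you want to see where the crossover sits for your own numbers, here is a minimal break-even sketch. The pricing model (a per-request fee plus a per-GB-second fee) and every number passed in below are placeholder assumptions, not quoted AWS prices; the $5/mo server is the one from my example above. It also ignores free tiers and whether a single small server could actually handle that load.

```python
def lambda_monthly_cost(requests, avg_seconds, memory_gb,
                        price_per_request, price_per_gb_second):
    """Monthly cost under a simple 'per request + per GB-second' pricing model."""
    compute = requests * avg_seconds * memory_gb * price_per_gb_second
    return requests * price_per_request + compute


def break_even_requests(server_monthly_cost, avg_seconds, memory_gb,
                        price_per_request, price_per_gb_second):
    """Requests per month at which Lambda costs as much as a flat-rate server."""
    cost_per_request = price_per_request + avg_seconds * memory_gb * price_per_gb_second
    return server_monthly_cost / cost_per_request


# Placeholder inputs: substitute the current prices and your own measurements.
crossover = break_even_requests(
    server_monthly_cost=5.0,        # the $5/mo server from the comment
    avg_seconds=0.2,                # assumed average duration per invocation
    memory_gb=0.5,                  # assumed memory allocation
    price_per_request=0.0000002,    # placeholder
    price_per_gb_second=0.0000167,  # placeholder
)
print(f"Lambda is cheaper below roughly {crossover:,.0f} requests/month")
```

Below the crossover the pay-per-use model wins; above it the flat-rate server does, which is the Uber vs. owning a car point in numbers.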

Yes, it comes all down to risk assessment.

Maybe using serverless technology is too hard for your corp, because you don't have the skills, so it could lead to problems in the future (surprise Lambda bills).

But it could also be that your competitors get a huge advantage by investing in the serverless paradigm and run you away in the future.

To me that Twitter thread sounded too much like a guy who invested in some tech over the last 11 years and now tries to convince his potential customers to use the tech he knows about.

He could be right, he could be wrong. I don't know. I started back-end development with serverless, so I'm biased in the other direction, haha.

Yeah, kind of made me cringe a bit (considering his experience).

Yeah for launching startup level traffic I totally agree, lambda feels like the right arch at the right cost. Also it will engrain setups that, like you said, you can swap out. You’ll need an alb / api gateway, you’ll need to get logs flowing through cloudwatch or some other non-host logging mechanisms. Transferring onto these kinds of architectures from not-these-architectures has been a royal pain and a time drain in my experience.

Here's the thread link more readably: https://tttthreads.com/thread/1154516910265884672.html

Or use Google Cloud, which has 90% good parts. Documentation can be a bit of a pain, but the services themselves are rock solid. There aren't 3 or 4 queuing services, just one. GKE rocks! Cloud Console is a breath of fresh air compared to AWS. Cloud Shell makes it easy to bypass firewalls for logging into instances, with no messing with public keys. It's all managed for you. Use firebase if you are looking specifically for Web and Mobile Apps. Scaling to millions of users or Petabytes of data is no big deal and you don't have to rearchitect every time your customer base grows by 10x.

I keep reading that Google Cloud is excellent. It is definitely enticing.

That being said, for me, the fact that it's Google is a big detractor. Non existent support, the worst documentation last I checked, and I read a "my account was closed unexpectedly" story every now and then.

Then there's the privacy angle, but the alternative companies are hardly bastions of user privacy themselves.

I worry that I will overcome my fears and start using it one day, only to years later try to undo that decision and wean off it like with Android, Search, Chrome, and (not yet) Gmail/GSuite.

Google's intro/free stuff is by far the best of the lot so def give it a try. e.g. Free VM 24/7 forever.

>Then there's the privacy angle

Don't think that's a big issue at dev level. They're interested in billions of users, not tracking thousands of esoteric devs doing work stuff that has limited ad resale value.

>the worst documentation last I checked

Seemed fine to me. Probably better than azure

The google services are just way easier to use but once you’ve got expert level proficiency at AWS it’s tough to let go.

I have to use both. I find google easier to do simple things.. but I find it lacks some of the flexibility AWS has for less simple things.

Well, that and Google's managed k8s solution was down for multiple days a while back when I was doing a comparison. Another reason I use EKS atm.... even though I think GKE is a bit better.

I don't know when this happened. Give GKE a try, it's really amazing. Blows EKS out of the water. As far as flexibility is concerned, once you learn how to use Google services, you can get the flexibility with simplicity. AWS services are too complex and even things like billing require a PhD in finance to optimize for anything non-trivial.

The lock-in with not just all your infrastructure with one provider but also all the skills and proficiencies with one provider's stack is a high risk that I ranted about recently: https://blog.flurdy.com/2019/06/all-eggs-in-one-cloud-basket...

Having your teams doing little POCs with other providers will ease your project/company cloud bus-factor significantly.

Can you provide examples of in what way they are easier?

Step one: don't use Twitter for this sort of thing

Thank you! I thought I was the only one that hated the format of these things. Why not write a proper article and link it on Twitter? It would be way better.

I tend to disagree with the EC2 sentiment. Any cost savings using VMs over a higher abstraction will likely be wiped out by the time wasted becoming a sysadmin unless you just like doing sysadmin stuff (patching, access management, config)

This is a very solid point and something to consider inside of the entire infrastructure planning stage. Opportunity cost could be huge, or net zero depending on your needs.

I find the characterization of RDS Aurora as not proven, or unstable, as odd.

I thought it looked like one of the more mature-seeming choices on the market.

Any thoughts?

From the point of view of someone who's used Aurora Postgres since it came out, though you sometimes are exposed to bugs, AWS support has always been great and we never faced anything super serious. It's been almost a year since we had any problems though. This on a somewhat large aurora instance at around 5TB, so it's already representative for a lot of people.

Right now if you use Aurora Postgresql 9.6 you don't have an easy way to migrate to 10, and 11 is not even available. They supposedly are working on a solution, but won't disclose when it will be available.

It's way less mature than vanilla MySQL or postgresql.

The CDK[0] will hopefully help with the cloudformation story.

[0] https://github.com/aws/aws-cdk

And if you want something even easier (albeit with more fixed/upfront costs) just use Heroku.

I agree. I have no direct experience with Heroku so I can’t comment on the experience/restrictions, but if a PaaS works well for what you’re doing, I’d go for it.

I was also thinking the same, especially considering that you start with this as your main reason:

"Anything whose main value proposition is about “ability to scale” will likely trade off your “ability to be agile & survive”."

I'm a core dev of stacker, it's worth taking a look at if you maintain Cloudformation templates. Makes life a lot better.

At Remind (remind.com) we have nearly 600 separate Cloudformation stacks which build and maintain our stage and prod environments. This would be insane without stacker.

As for this tweet, I'm fairly confident he is talking about side projects or early startups and I agree with most of what he said.

Personally, I don't use AWS for side projects; I use offerings from Linode, Digital Ocean, and vultr. They are cheaper and scale up vertically with the click of a button, which is really what you need for the first couple of years, unless you hit the growth/scaling lottery, which isn't typically the case.

For example, over the last two weekends I was able to use Digital Ocean Spaces (alternative to AWS S3) to build my wife a secure digital downloads store.

All uploads and downloads use presigned POST and GET urls created via Boto3! I was surprised how perfectly Digital Ocean implemented S3's API in their Spaces offering.


Lambda is absolutely essential for glueing together Amazon services and many Amazon articles recommend it (for instance, getting your ASG to drain ECS instances before scaling down). It's also helpful for building alerting/monitoring workflows since Cloudwatch/SNS is pretty simplistic on its own

I think the sentiment is more "don't build your entire app on lambda"

I think "don't have any customer facing Lambdas" is a good rule of thumb. It's great for glue code, if-this-then-that code and cron jobs.

> But Autoscaling is still useful. Think of it as a tool to help you spin up or replace instances according to a template. If you have a bad host, you can just terminate it and AS will replace it with an identical one (hopefully healthy) in a couple of minutes.

I really started to enjoy AWS when I got into the habit of deploying all EC2 instances using Autoscaling groups. Even single node instances and configurations that don't need any scaling, just everything. Autoscaling is free and it forces you into the immutable/disposable infrastructure paradigm, which just makes administration of the nodes just so much easier. And for all stateful stuff there is RDS, S3, dynamo, SQS, etc.

> Forget that all these things exist: ... Containers, Kubernetes, Docker.

Not sure how this guy can give this advice.

Containers make rapid development so much easier. There's no distinction between your dev environment and your prod environment except for a couple of properties/env variables and some scaling factors.

Also "no" to CloudFormation and "yes" to Terraform.

And if I were to write a simple application, I'd probably try with lambda next time around.

I don't think this is good advice in general. OP is obviously used to doing things in a certain way, and good for him for working the way he is most productive. But looking at things objectively, everything has a learning curve, and once you get past that, Lambda is way more efficient even when working alone, due to the agility it allows and the time saved on maintenance.

It all depends on what your goal is. Are you looking for large scale-out? Or are you iterating fast and don't want to mess with managing databases, CDNs, KV stores, etc.? Note: a version of premature optimization is premature scaling. So beware of offerings that require you to use a certain technology to enable AWS scale-out; they can slow you down.

Why not just use Linode or Digital Ocean at this point?

I wouldn’t recommend using single EC2 instances in production ever, but the rest seems solid.

> 1/25

Is there a more user-hostile webapp than Twitter? I think not.

Sort of related to this, but can someone explain to me why Zappa[1] isn't a bigger deal? It completely delivers, as far as I can tell, on the promise of deploying a WSGI app into a Lambda in a way that requires minimal configuration.

And yet, the project seems to have stagnated somewhat. Maybe it just did everything it set out to do? I'm not sure, but when people say "forget about Lambda because you'll have to roll your own shit" I wonder if they've heard of Zappa or not.

[1] https://github.com/Miserlou/Zappa
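For readers who haven't met the term, "a WSGI app" can be as small as the sketch below. As I understand it, Zappa's settings file points its `app_function` entry at a callable like this (something like `hello.app`, where the module name is hypothetical). The body text is made up, and nothing here is Zappa-specific code.

```python
def app(environ, start_response):
    """Smallest useful WSGI callable: one plain-text response for every path."""
    body = b"hello from a WSGI app\n"
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

The same callable can be served locally with the standard library's `wsgiref.simple_server.make_server("127.0.0.1", 8000, app)`, which is part of why local and deployed setups can stay close.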

I've tried to use Zappa for a non-trivial personal project to run a Django app and I found:

- Super nice, low config API

- easy to deploy

- great out-of-the-box support for async tasks (no Celery! woo!)


- It was a nightmare getting certain Python libraries (eg. Pillow, psycopg) to work in the AWS Lambda environment

- It really sucked having to deploy to AWS in order to debug issues which cannot be reproduced locally (eg. library)

- It seems hard to get away from using AWS tools to observe your code in prod (eg. CloudWatch for logs)

- I still needed a database to maintain state, DynamoDB didn't work with Django's ORM and was surprisingly expensive, and if I'm going to shell out $10/mo for a Postgres RDS instance then I may as well run the whole thing on EC2 anyway

I think Zappa is a really nice tool for some niche use cases, and I'd definitely turn to it if I needed to stand up some small, stateless serverless service, but I would hate to support and debug it as a web app in prod.

project: https://memories.ninja

zappa version: https://github.com/MattSegal/family-photos/tree/lambda-hosti...

ec2 version: https://github.com/MattSegal/family-photos


Zappa got me out of a massive hole several times. I'm not a webdev, and we aren't a web company, so we don't have a web deployment pipeline.

Having Zappa makes our life very simple and cheap. The best part is that local and "prod" are equally easy to spin up.

The simple integration with DNS, SSL, and API Gateway is a wonder.

The only downside is the permissions management is a massive pain, and assumes that you have admin permissions.

I tried to use Zappa and my experience was that it didn't support the types of callbacks I needed it to. After I patched it up to do what I wanted, I realized that handling calls directly was actually simpler than using Zappa, and Zappa made my application huge.

I understand that the primary use of Zappa is web applications, but I have reservations about that as well. "Serverless" applications aren't really serverless; they just have VMs spun up on demand, and it's only a good idea when you have a service that is idle most of the time.

Docker is bullshit. AWS images are alright IME. Spinning multiple large instances will give you more bang for the buck than dealing with tiny docker containers.
