The Segment Stack is a good showcase of best practices for AWS ... only one thing I didn't understand: why is a NAT instance needed in each subnet? https://github.com/segmentio/stack/blob/master/vpc/main.tf#L... It seems a little wasteful. Couldn't you just allow traffic between the subnets?
Played with an external tool called [terraforming] that creates a Terraform setup from your existing boxes, but that, too, was way more than I wanted.
Terraform is a very neat tool, but learning the AWS ecosystem _and_ a third-party tool with its own very strong set of best practices was too much for me. If you're experienced, and especially if you're already using CloudFormation heavily, you can probably get a lot of mileage out of Terraform.
Doing a quick search to make sure I'm not crazy, it looks like terraform 0.7 will start to support importing, so that's something: https://github.com/hashicorp/terraform/issues/581
[tfstate file]: https://www.terraform.io/docs/state/remote
- Terraform vs. Ansible:
/u/mitchellh (I think that's his HN name) started Terraform and explained his "Terraform vs. Alternatives" position in a Google Groups post.
In addition to that post, Terraform also has nice support of multiple providers, so you can mix AWS, Azure, and others in one set of templates.
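As a minimal sketch of mixing providers in one set of templates (provider names are real, but the region, AMI ID, and resource names here are illustrative, and credentials are assumed to be configured out of band):

```hcl
# Two providers in the same configuration: Terraform builds one graph across both.
provider "aws" {
  region = "us-east-1"
}

provider "azurerm" {
  # Credentials typically come from environment variables.
}

resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # placeholder
  instance_type = "t2.micro"
}

resource "azurerm_resource_group" "main" {
  name     = "example-rg"
  location = "East US"
}
```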
In general, Terraform is doing a lot of cutting-edge thinking with orchestration tools and represents IMO a best-of-breed approach.
That being said, there are a lot of bugs, especially around eventual consistency issues in AWS. They're getting better, and most of them are recoverable, though.
> In general, Terraform is doing a lot of cutting-edge thinking with orchestration tools and represents IMO a best-of-breed approach.
Terraform isn't really doing anything that snazzy compared to CloudFormation (an AWS tool) unless you're also orchestrating in concert with non-AWS services.
I'd disagree with that. Just take a look at the Terraform Changelog for some of the latest & greatest.
For example, the concept of "Data sources" is pretty cool. Basically, you can reference pre-existing data, potentially do rich queries against it, and get a read-only value back. For example, you can use a Data source to find the latest AMI for a given search string.
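A rough sketch of that AMI lookup (the name filter and owner ID follow the common Canonical-Ubuntu convention, but treat the specifics as illustrative):

```hcl
# Data source: query the most recent matching AMI instead of hard-coding an ID.
data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"]
  }

  owners = ["099720109477"] # Canonical
}

resource "aws_instance" "app" {
  ami           = "${data.aws_ami.ubuntu.id}" # read-only value from the query
  instance_type = "t2.micro"
}
```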
CloudFormation has a concept of Custom Resources which could achieve similar functionality, but not without a lot of hassle.
Terraform has also been building up a rich language of interpolation functions that can be used for string replacement, hash generation, and even arithmetic.
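For instance, all three of those show up in a snippet like this (0.7-era interpolation syntax; the variable and resource names are made up):

```hcl
variable "env" {
  default = "prod-us"
}

resource "aws_instance" "worker" {
  count         = "${2 * 3}" # arithmetic inside interpolation
  ami           = "ami-0123456789abcdef0" # placeholder
  instance_type = "t2.micro"

  tags {
    Name   = "${replace(var.env, "-", "_")}" # string replacement
    Digest = "${md5(var.env)}"               # hash generation
  }
}
```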
There's a lot more, too. I think it goes well beyond "cloud-agnostic."
Doing devops, I prefer boring and reliable over cool.
This excellent article does a nice job of talking about why it's important to keep your tfstate files small and isolated. Since we started doing that, working with Terraform has been much nicer (and safer!).
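One way to keep state small is to split stacks apart and read outputs across the boundary with the terraform_remote_state data source. A sketch, assuming an S3 backend and a network stack that exports a private_subnet_id output (bucket, key, and output names are illustrative):

```hcl
# Read another stack's state instead of merging everything into one tfstate.
data "terraform_remote_state" "network" {
  backend = "s3"

  config {
    bucket = "example-tfstate"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0" # placeholder
  instance_type = "t2.micro"
  subnet_id     = "${data.terraform_remote_state.network.private_subnet_id}"
}
```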
From the AWS documentation:
"Each NAT gateway is created in a specific Availability Zone and implemented with redundancy in that zone. You have a limit on the number of NAT gateways you can create in an Availability Zone. For more information, see Amazon VPC Limits.
If you have resources in multiple Availability Zones and they share one NAT gateway, in the event that the NAT gateway’s Availability Zone is down, resources in the other Availability Zones lose Internet access. To create an Availability Zone-independent architecture, create a NAT gateway in each Availability Zone and configure your routing to ensure that resources use the NAT gateway in the same Availability Zone."
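The per-AZ pattern the documentation describes looks roughly like this in Terraform (a sketch: the EIP, public subnet, and VPC resources are assumed to exist elsewhere, and the count of 2 is illustrative):

```hcl
# One NAT gateway per AZ, each private route table pointing at its own AZ's NAT.
resource "aws_nat_gateway" "nat" {
  count         = 2
  allocation_id = "${element(aws_eip.nat.*.id, count.index)}"
  subnet_id     = "${element(aws_subnet.public.*.id, count.index)}"
}

resource "aws_route_table" "private" {
  count  = 2
  vpc_id = "${aws_vpc.main.id}"

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = "${element(aws_nat_gateway.nat.*.id, count.index)}"
  }
}
```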
I don't think it's strictly needed, but it's a best practice because instances in each AZ remain independent of failures in other AZs. Were the AZ with the single NAT to go down, instances in the other AZs wouldn't be able to communicate outside the VPC (i.e., to the rest of the internet).
There's also a side benefit: much lower latency using a NAT in the same AZ versus going across AZs (an unscientific benchmark: ~0.1ms in the same AZ vs. ~0.3ms across AZs).
As you say, it's not strictly needed. It really depends on your use case. If your use case suffers when the NAT is down, then you need HA on it. If it can wait, then no. With the new managed NAT in AWS, you may as well go with that if you need HA: its cheapest tier is roughly twice the price of a micro instance anyway, and it's one less bit of clutter in your instance list.
Oh I see. Though assuming app servers are wired up behind an ELB, the service will only be partially degraded (no app server outbound connectivity, like you said).
One NAT per AZ is a more robust design, but at $30/mo each (for a NAT gateway) it seems expensive ;) Even at $10/mo (a small do-it-yourself NAT instance) it's not free.
So, as you change your config, terraform will come up with a plan to move from your present AWS state to the desired AWS state.
Terraform allows you to easily express the dependencies in your infrastructure, and it also knows about the dependencies that naturally come with AWS services, which lets it make better decisions in its planning than tools like Ansible, Chef, etc.
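You rarely declare those dependencies explicitly: referencing one resource's attribute from another creates the edge in Terraform's graph, and the plan orders operations accordingly. A minimal sketch (names and CIDRs are illustrative):

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "app" {
  vpc_id     = "${aws_vpc.main.id}" # implicit dependency: subnet after VPC
  cidr_block = "10.0.1.0/24"
}

resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0" # placeholder
  instance_type = "t2.micro"
  subnet_id     = "${aws_subnet.app.id}" # instance after subnet
}
```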
I've been working on an open source stack that is somewhat similar and serves the same purpose: allowing startups to bootstrap their stack on AWS.
Great work on open sourcing this. I will see what I can do to contribute.
I am developing everything "in the open" so feel free to contribute / ask questions.
I spent the last 3 months collecting usage information from startups using the-startup-stack and am about to make another effort to commit all that knowledge back into the project.
"Not Invented Here" is a big problem in this industry in general. We tend to be too comfortable cobbling a solution together from stone knives and bearskins, rather than using someone else's solution (and paying for it). If you are running a business, though, you shouldn't be building things that aren't what you sell, unless you really cannot otherwise buy them.
I'd add the above. Otherwise AWS would have never been built.
Back in the dot-com days, I worked on a project to build some functionality in-house that we could easily have bought off the shelf. The engineers argued that we'd save the company a million dollars. But frankly, we just wanted to do it because it was badass. And it turned out our solution would actually have cost us more per-system than the commercial solution we sneered at (hardware costs, not just development cost). Six man-months of engineering when everyone knew we were racing the clock before the money ran out? Absolute stupidity.
If I were CEO/CTO and caught wind of such a project, I'd tell people that if they lifted a finger on it, they'd be fired. But that's a very different perspective than I had back then. Risking the very existence of what could have been a very big company in order to someday save a million dollars? Feh. (Of course, no one stopped us, because the CTO was just head nerd, and the money execs were busy fundraising rather than supervising)
I understand your point and it's valid, but opportunities like this are juicy steaks for Oracle and IBM sales guys. I think the best possible outcome is to involve engineers, solicit a reasonable internal cost bid, then invite external contracts, then pick what makes sense.
"We can just buy it" is what funneled money that could better be spent elsewhere into the coffers of legacy infrastructure companies for decades.
I cannot begin to explain how much extra code we had to write to deal with the lack of transactions in MySQL! Sybase would have saved us a ton of time and risk.
Thanks to economies of scale, it's worthwhile for Amazon to develop new features as full-scale products, which leads to truly amazing things like Lambda. No one in their right mind would develop Lambda in-house, but for AWS, it makes a ton of sense.
The "AWS-specific" features that are always touted as justifying the expense carry a secondary "invisible" cost that many don't or won't recognise: you're tying your business directly to a single provider, and one with a history of predatory pricing to gain market control.
You don't need to buy physical servers to avoid AWS. There's a lot of middle ground that can be achieved at equal or lower cost, with less lock-in and more control for your business.
Also I wouldn't even consider DigitalOcean a competitor to AWS. Not even close.
If you had a perfectly clean install of your distro of choice, could you write a shell script that could build your server from scratch?
If you can answer yes to that (and you should be able to), then you can write an if/then-heavy shell script that works with each provider's API to bring up a perfectly clean install of your favorite distro.
One script to create the clean slate machine.
One script to build what you need.
We have 9 "pods" around the globe, each with API servers (Java/Tomcat/Apache), static servers (Varnish), a MySQL slave, and an HAProxy Maître D'. With close to 60 servers, our monthly bills are less than $1,000, and we haven't had downtime in years. Spinning up a new server is just: sh build.sh atl api 1.
Feel free to ping me if you want any more details: firstname.lastname@example.org.
Do you want the pain, or do you want the money? That's the basic proposition here. When you start looking at $25k, $50k, or $150k/month of savings by doing stuff yourself, the choice becomes much clearer. In many cases I've seen, you could theoretically hire an entire team to take care of the stuff that AWS does for you and still come out ahead.
The thing that's more difficult to do on physical hardware, obviously, is scaling down every day if your peak load is some insane multiple of your base load for a 24h cycle. That's where AWS makes a lot of sense.
I don't think there's an obvious 'best' answer -- it depends on what you need.
- We wrote our own terraform testing framework to validate that every change to our modules doesn't break functionality
- We actively update our modules based on feedback from new client engagements
- We provide commercial support for each module
- We combine our modules with consulting and training as needed
And of course, there are many similarities:
- We give 100% of the source code to our clients
- Everything runs in the client's AWS account
- Everything is self-documented, modularized, and can be combined/composed as the needs of different teams require
I didn't mean for this to be a shameless plug; more just that I found it interesting to compare the open source vs. commercial approach to solving this same problem. Props to the Segment team for sharing this.
I've been building an open source solution with a "pro"-level setup included. I had a very hard time quantifying how much companies would pay for this, or whether they would at all.
How I see it, you are either on Heroku (or other similar) and you don't care about anything except `git push` or you have a full blown stack.
I know the middle ground between the two is where the stack sits, but I just couldn't figure out how many companies are actually experiencing those difficulties and how much they're willing to pay for the help.
Since it's not a startup for me, just an open source project, I decided not to worry about it, but it would still be nice to get some input.
This seems custom to me, maybe Sketch or something.