
Change Sets for AWS CloudFormation - wallflower
https://aws.amazon.com/blogs/aws/new-change-sets-for-aws-cloudformation/
======
meddlepal
CloudFormation is awful. I used it extensively for about a year and will never
go back.

~~~
buremba
+1

1\. We can't test templates in a local environment; the only way to test a
template is to create a real cluster, which usually takes ~15 minutes.

2\. The error messages are not helpful enough. It usually fails with a
"SETUP_FAILED" error and no indication of the actual cause.

3\. The rollback may not work if you change the configuration. You then need
to delete the resources manually, or even get help from Amazon, because the
resources get stuck in some failed status.

4\. The JSON template is hard to write and read. We have a fat single-file
template of 2k lines and are scared to change it since we might break
something.

5\. Not all AWS actions are supported. For example, we need to add custom
resources (ELB and RDS instances) to an OpsWorks stack, but that's not
possible with CloudFormation.
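
On point 1, a partial workaround is a local pre-flight check that catches
structural mistakes before spending 15 minutes on a real stack. A minimal
sketch, assuming the template is plain JSON and that only `Ref` is checked
(not `Fn::GetAtt` or other intrinsic functions). The name `find_dangling_refs`
is hypothetical, and this does not replace validation by CloudFormation itself:

```python
import json

# Pseudo parameters that CloudFormation predefines for every template.
PSEUDO_PARAMETERS = {
    "AWS::AccountId", "AWS::NotificationARNs", "AWS::NoValue",
    "AWS::Partition", "AWS::Region", "AWS::StackId",
    "AWS::StackName", "AWS::URLSuffix",
}


def iter_refs(node):
    """Yield the target of every {"Ref": "..."} found anywhere in node."""
    if isinstance(node, dict):
        if set(node) == {"Ref"} and isinstance(node["Ref"], str):
            yield node["Ref"]
        else:
            for value in node.values():
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def find_dangling_refs(template_text):
    """Return sorted Ref targets that match no parameter, resource,
    or pseudo parameter (hypothetical helper, Ref-only check)."""
    template = json.loads(template_text)
    known = set(template.get("Parameters", {}))
    known |= set(template.get("Resources", {}))
    known |= PSEUDO_PARAMETERS
    return sorted(set(iter_refs(template)) - known)
```

Running this over the 2k-line template before every update would at least
rule out typos in logical IDs, which is one of the cheaper mistakes to make
and one of the slower ones to discover.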

~~~
leg100
I can't counter these points, other than to ask: what else is there for
automating AWS infrastructure? Terraform suffers from all these problems, too.

~~~
buremba
I don't have experience with Terraform, but it should be possible to find a
way to fix all of these issues except the first one.

------
fleitz
Call me when they get rid of UPDATE_ROLLBACK_FAILED. Until then it's
practically useless.

Whoever thought it was ever acceptable to leave your production stack in a
state that requires deleting it is an idiot.

Even if they fixed it I doubt I'd ever trust a service that ever acted like
that with my production infrastructure.

~~~
andrewguenther
It isn't gone, but you can at least recover from it now.

[https://blogs.aws.amazon.com/application-management/post/Tx1...](https://blogs.aws.amazon.com/application-management/post/Tx11YT4MHFDZMK6/Continue-Rolling-Back-an-Update-for-AWS-CloudFormation-stacks-in-the-UPDATE-ROLL)

~~~
justinsb
Is that just a retry though? It sounds like the onus is still on you to fix
the problem manually so that CloudFormation can complete its rollback, rather
than e.g. being able to request a different desired CloudFormation state that
might be reachable.

------
leg100
This is effectively the 'dry-run' feature that has been sought for a long,
long time [1]. Its absence led me to switch to Terraform (which does do this,
amongst other key differences). Why did it take so damn long to implement?

I do appreciate it has come about eventually. Terraform has its foibles (and
stupid half-declarative language) but it would take further improvements to
consider going back to CF.

[1]
[https://forums.aws.amazon.com/thread.jspa?messageID=563929&#...](https://forums.aws.amazon.com/thread.jspa?messageID=563929&#563929)

~~~
zwily
Actually, this isn't quite dry-run. What this does is calculate the diff
between two templates, and just provide the operations from that diff. It does
not inspect the current state of resources to see if there were any out-of-
band changes that would be unwound.

So it gets you closer to a dry-run, but is not as complete as Terraform's.
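
To make the distinction concrete, here is a rough sketch of what a
template-only diff amounts to. It is an illustration under assumptions, not
how CloudFormation actually computes change sets: `diff_resources` is a
hypothetical helper, it compares only the `Resources` sections of two parsed
templates, and it borrows the Add/Modify/Remove action names that change sets
report.

```python
def diff_resources(old_template, new_template):
    """Compare two parsed templates and list per-resource actions.

    Only the template documents are consulted. The live state of the
    resources never enters the calculation, so a change made out-of-band
    (for example, in the console) is invisible to this diff.
    """
    old = old_template.get("Resources", {})
    new = new_template.get("Resources", {})
    changes = []
    for logical_id in sorted(set(old) | set(new)):
        if logical_id not in old:
            changes.append(("Add", logical_id))
        elif logical_id not in new:
            changes.append(("Remove", logical_id))
        elif old[logical_id] != new[logical_id]:
            changes.append(("Modify", logical_id))
    return changes
```

A full dry-run would additionally have to read the real resources and compare
them against the old template, which is the step missing here.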

------
kenbreeman
This is a step in the right direction but the lack of consistent error
handling and having some resources created that can't be referenced are still
huge pain points (e.g. the default security group and route table).

I've had the most success using CloudFormation to create an entirely new
stack, test it, cut over, then delete the old one... definitely not optimal.

~~~
mryan
References to existing resources, such as the default SG, can be handled with
stack parameters.

You can pass in its ID as a parameter to the stack, and refer to this
parameter in your launch configs or ingress rules.

An example from one of my client's stacks:

    "Parameters": {
      "DefaultSG": {
        "Type": "AWS::EC2::SecurityGroup::Id",
        "Default": "sg-abc123"
      }
    }

Personally I prefer to create a new SG to replace the default one as it means
all of my infrastructure is part of a CF stack, but the parameter method can
be used to partially manage (some) non-CF resources.
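
For completeness, a sketch of how the `DefaultSG` parameter might then be
referenced from an ingress rule. This is illustrative only: the
`build_template` function, the `DefaultSGIngress` logical ID, and the port
443 rule are made up for the example, and the template is built as a Python
dict purely so it can be generated and checked locally before upload.

```python
import json


def build_template(default_sg_id="sg-abc123"):
    """Return a template dict that manages a rule on an existing SG.

    The security group itself lives outside the stack; only its ID is
    passed in, and resources refer to it through {"Ref": "DefaultSG"}.
    """
    return {
        "Parameters": {
            "DefaultSG": {
                "Type": "AWS::EC2::SecurityGroup::Id",
                "Default": default_sg_id,
            }
        },
        "Resources": {
            # Hypothetical rule: allow HTTPS between members of the group.
            "DefaultSGIngress": {
                "Type": "AWS::EC2::SecurityGroupIngress",
                "Properties": {
                    "GroupId": {"Ref": "DefaultSG"},
                    "IpProtocol": "tcp",
                    "FromPort": 443,
                    "ToPort": 443,
                    "SourceSecurityGroupId": {"Ref": "DefaultSG"},
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON document that would be handed to CloudFormation.
    print(json.dumps(build_template(), indent=2))
```

The rule is owned by the stack and is removed with it, while the group
itself stays untouched, which is the "partially managed" situation described
above.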

