You’re not wrong. You can achieve the same results using modules well. It’s comm...

xyzzy123 · on Aug 4, 2023

Agree, it's a complicated conversation because the tools support many different ways of working - and most of them can be made to work with different tradeoffs.

A team that, e.g., owns its own microservice IaC can absolutely maintain a setup long term with app & db in the same state; it just requires care and love (it's easier in some ways, it just doesn't scale along certain dimensions).

Maybe you have other controls / factors that can make it work for you.

But my experience is that as you split things across more teams or have more complicated IaC "supply chains" (e.g. teams supplying modules to each other, or lots of people working on different bits of the same thing), you need to look at ways to make things more foolproof and easier to support, and to give yourself as much "wiggle room" as possible for upgrades. At this point having state split out is very helpful (almost essential).

Because the terminology seems to be tripping up the conversation, I'd be inclined to phrase it like this: a single "terraform apply" should touch either a precious, stateful stack or a disposable, stateless stack, never both, and the two kinds should be clearly delineated. Ideally the stateful stacks should be as small as possible, with as much stuff as possible living in the stateless stacks.
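
To make the "one apply, one kind of stack" rule concrete, here is a rough sketch of a check that could run in CI against the output of `terraform show -json <planfile>`. It only relies on the `resource_changes` list and each entry's `type` and `change.actions` fields from that JSON. The `STATEFUL_TYPES` set and the `classify_plan` name are made up for illustration; which resource types count as "precious" is a call each team would make for itself.

```python
# Hypothetical list of resource types this team treats as "precious".
# Not from the thread; adjust to whatever holds data you cannot recreate.
STATEFUL_TYPES = {
    "aws_db_instance",
    "aws_s3_bucket",
    "google_sql_database_instance",
}


def classify_plan(plan: dict) -> str:
    """Classify a parsed `terraform show -json` plan.

    Returns "stateful", "stateless", "mixed", or "empty" depending on
    which kinds of resources the plan would actually change.
    """
    touched = set()
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # Entries that only read or do nothing do not modify anything.
        if actions in (["no-op"], ["read"]):
            continue
        kind = "stateful" if rc.get("type") in STATEFUL_TYPES else "stateless"
        touched.add(kind)

    if not touched:
        return "empty"
    if len(touched) == 2:
        return "mixed"
    return touched.pop()
```

A pipeline could refuse to apply any plan classified as "mixed", which keeps the delineation enforced rather than just agreed on.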

linuxdude314 · on Aug 5, 2023

The way I see it is we only want to use TF for setting up "base infrastructure". This is things like our VPC networks, cloud routers, firewall rules, and finally our Kubernetes clusters.

We still allow devs to use TF, but it's only for using cloud services they depend on, like say a SQL DB or something.

I think you and I are 100% on the same page, but the word "state" (in the way you are using it) is causing a bit of confusion, at least for me. A terraform stack will always have state, that's the point. In addition to the current "tfstate" there is a set of parameters that must be passed to whatever modules are in use in order to arrive at that tfstate. In my experience that's the state that causes problems, more so than the tfstate itself, because those parameters are often _not_ version controlled.

This is why it's critical to only allow terraform to be applied using automation. I mean don't just make it a company policy, make it an IAM policy so it's literally not possible any other way.
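
As a rough illustration of "make it an IAM policy", the sketch below builds an AWS-style policy document that denies a set of actions to every principal except one CI role. The thread doesn't say which cloud is in use, so AWS is an assumption here, and the role ARN, the action list, and the `deny_unless_ci` helper are all hypothetical. It relies on the `aws:PrincipalArn` condition key with the `ArnNotLike` operator.

```python
import json


def deny_unless_ci(ci_role_arn: str, actions: list) -> dict:
    """Build a policy document that denies `actions` unless the caller
    is the given CI role. Assumes AWS-style IAM JSON."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInfraChangesOutsideCI",
                "Effect": "Deny",
                "Action": list(actions),
                "Resource": "*",
                # The deny applies to every principal whose ARN does
                # not match the CI role.
                "Condition": {
                    "ArnNotLike": {"aws:PrincipalArn": ci_role_arn}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID and role name, for illustration only.
    policy = deny_unless_ci(
        "arn:aws:iam::123456789012:role/terraform-ci",
        ["rds:Delete*", "rds:Modify*"],
    )
    print(json.dumps(policy, indent=2))
```

Where a policy like this gets attached (to individual users, or organization-wide) depends on the account setup, and other clouds have their own equivalents.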