One of the things I really believe is that you can have the best of both worlds here. Pulumi uses imperative programming languages, but is still "declarative". The imperative programs are executed to build up the desired state, which can then be reliably diff'd and previewed, and can be used to enforce manual or automatic checks for correctness. So you get the expressiveness of imperative programs (loops, conditionals, components, packages, versioning, IDE tooling, testing, error checking, etc.), but still the safeguards and reliability of declarative infrastructure-as-code (preview, gated deployments, policy enforcement, etc.).
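To make the "imperative program, declarative result" idea concrete, here is a minimal sketch in plain Python. It does not use the Pulumi SDK: the resource dicts, `build_desired_state`, and `preview` are hypothetical names invented for illustration. The point is only that a loop builds a description of the goal state, and the diff against the current state happens before anything is changed.

```python
# Sketch only: plain dicts stand in for real resource state; no Pulumi API is used.

def build_desired_state(environments):
    """Imperative code (a loop) that only *describes* resources."""
    desired = {}
    for env in environments:
        # Hypothetical resource shape: versioning is enabled only in prod.
        desired[f"bucket-{env}"] = {"type": "bucket", "versioning": env == "prod"}
    return desired


def preview(current, desired):
    """Diff two states into create/update/delete lists without changing anything."""
    return {
        "create": sorted(desired.keys() - current.keys()),
        "delete": sorted(current.keys() - desired.keys()),
        "update": sorted(
            name for name in desired.keys() & current.keys()
            if desired[name] != current[name]
        ),
    }


current = {
    "bucket-dev": {"type": "bucket", "versioning": False},
    "bucket-old": {"type": "bucket", "versioning": False},
    "bucket-prod": {"type": "bucket", "versioning": False},
}
plan = preview(current, build_desired_state(["dev", "staging", "prod"]))
print(plan)
```

A gate (manual approval or a policy check) would sit between `preview` and whatever actually applies the plan.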
I also tend to view the perceived benefits of JSON/YAML/HCL "simplicity" as somewhat comparing apples to oranges on a complexity spectrum. If you are only managing a dozen resources, it may be that JSON/YAML/HCL are fundamentally simpler. But when you've copy/pasted tens of thousands of lines of YAML around all over your codebase to manage hundreds or thousands of resources, the value of abstraction, reuse, well-defined interfaces, and tooling to manage that complexity feels to me essential to the scale of the problem. And that degree of complexity is no longer just something large organizations are dealing with. Modern cloud technologies (serverless, containers, Kubernetes, etc.) are leading to significant increases in the number of cloud resources being managed, and the pace at which those resources are deployed and updated.
Assembly is a "simpler" way to think about programming, but it didn't scale as the complexity of application software increased. I believe the same is true of JSON/YAML/HCL and cloud infrastructure.
You're lumping in HCL (and languages like Dhall by extension) with static serialization formats and criticizing them for a characteristic only found in the latter.
HCL is programmable and has a fair model for code reusability through modules, state outputs, for-expressions and other kinds of expressions.
Add in a proper language with types like Dhall and you have a configuration language where you can apply all the transformations you could want with a much higher safety and robustness floor than a turing-complete language that allows you to make all sorts of messes.
It's especially dangerous to have a turing-complete language for configuration once you factor in that the reflex of an inexperienced developer, who is more likely to make these messes, is to use a tool they're already familiar with even when the tool is actively harmful to their goals, as Pulumi facilitates.
We've worked with a lot of end users to migrate from Terraform, and we honestly do see a lot of copy-and-paste. I agree that it's not as rampant as with YAML/JSON, however, in practice we find a lot of folks struggle to share and reuse their Terraform configs for a variety of reasons.
Even though HCL2 introduced some basic "programming" constructs, it's a far cry from the expressiveness of a language like Python. We frequently see not only better reuse but a significant reduction in lines of code when migrating. You can create a function or class to capture a frequent pattern, easily loop over some data structure (e.g., for every AZ in this region, create a subnet), or even use conditionals for specialization (e.g., maybe your production environment is slightly different from development, us-east-1 is different, etc.). And linters, test tools, IDEs, etc. just work.
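As a rough illustration of the "one subnet per AZ, with a production tweak" pattern, here is a sketch in plain Python using only the standard library. It is not the Pulumi API: `subnets_for` and the `flow_logs` field are hypothetical, and the output is just data describing what you would want created.

```python
import ipaddress


def subnets_for(vpc_cidr, availability_zones, production=False):
    """One subnet declaration per AZ, carved out of the VPC CIDR block."""
    # subnets() lazily yields consecutive /24 blocks inside the VPC range.
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=24)
    subnets = []
    for az, block in zip(availability_zones, blocks):
        subnets.append({
            "name": f"subnet-{az}",
            "availability_zone": az,
            "cidr_block": str(block),
            # Conditional specialization (hypothetical flag): prod only.
            "flow_logs": production,
        })
    return subnets


dev = subnets_for("10.0.0.0/16", ["us-east-1a", "us-east-1b"])
prod = subnets_for(
    "10.1.0.0/16",
    ["us-east-1a", "us-east-1b", "us-east-1c"],
    production=True,
)
for subnet in dev + prod:
    print(subnet)
```

The same function serves both environments; the difference between them is two arguments rather than a copied block.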
For comparison, this Amazon VPC example may be worth checking out:
- Terraform: https://github.com/terraform-aws-modules/terraform-aws-vpc/b...
- Pulumi (Python): https://github.com/joeduffy/pulumi-architectures/blob/master...
- CloudFormation: https://github.com/aws-quickstart/quickstart-aws-vpc/blob/ma...
It's common to see a 10x reduction in LOCs going from CloudFormation to Terraform and a 10x reduction further going from Terraform to Pulumi.
A key point in how Pulumi works is that everything centers around the declarative goal state. You are shown previews of this (graphically in the CLI), you can serialize that as a plan, and you always have full diffs of what the tool is doing and has done. This helps to avoid some of the "danger" of having a turing-complete language. Plus, I prefer having a familiar language with familiar control constructs, rather than learning a proprietary language that the industry generally isn't supporting or aware of (schools teach Python -- they don't teach HCL).
In any case, we appreciate the feedback and discussion -- all great and valid points to be thinking about -- HTH.
I don't see this as such a terrible problem. The configurations may have more LOC, but there are not as many surprises. The dependability of declarative configuration makes it rock solid and a favorite among operations teams who need to make these kinds of changes all the time.
> A key importance in how Pulumi works is that everything centers around the declarative goal state. You are shown previews of this (graphically in the CLI, you can serialize that as a plan, you always have full diffs of what the tool is doing and has done. This helps to avoid some of the "danger" of having a turing-complete language. Plus, I prefer having a familiar language with familiar control constructs, rather than learning a proprietary language that the industry generally isn't supporting or aware of (schools teach Python -- they don't teach HCL).
I understand the reason to want this. Having worked closely with developers, lack of familiarity with HCL makes it much less accessible. However, from an operations perspective, I am GLAD that HCL is a very limited language. No imports of libraries all over the place (in your infrastructure configurations, no less!).
The issue is that your static configs often have lots of boilerplate sections that have to be kept in sync. Further, you can use an imperative language like Python, JS, etc and still write in a completely declarative fashion (or you can use a functional language which tend to be declarative out of the box). Conversely, you can model an AST in YAML (which is what CloudFormation is trending toward) and get the worst of all worlds. Bottom line: don't conflate "reusability" with "imperative" or "static" with "declarative".
Yes, I agree with this. However, it's predictable. As an operations person, I value predictability and am willing to pay the price of keeping static configs in sync.
> Further, you can use an imperative language like Python, JS, etc and still write in a completely declarative fashion (or you can use a functional language which tend to be declarative out of the box). Conversely, you can model an AST in YAML (which is what CloudFormation is trending toward) and get the worst of all worlds. Bottom line: don't conflate "reusability" with "imperative" or "static" with "declarative".
Hold on, I'm not conflating anything. Saying that "you can write terrible things in any language" isn't anything new. We choose to use languages that provide certain guarantees that we need for the domain that we're working in. For infrastructure, declarative languages are a lot more suitable for the properties they provide (i.e., no surprises, limited functionality, etc.). It's "possible" to use static types in Python, but how many people do that?
I think there's wisdom in this at small scales, but as the volume and complexity of your boilerplate grows, I think you lose any advantages. I also think this threshold is quite low (as an ops person and a dev person) since it's not much harder to look at/read the YAML generated by a script vs that which is hand-rolled and committed to git.
> Hold on, I'm not conflating anything.
Are you sure? Because you just said "I ... am willing to pay the price of keeping static configs in sync" and then "For infrastructure, declarative languages are a lot more suitable for the properties they provide" and then you started to talk about "static types" in Python, which is different from "static" in the YAML sense (YAML isn't statically typed, but it is static in that it isn't evaluated or executed).
I'm not trying to be a jerk, it just sounds like a lot of concepts are being confused. I also wasn't making the argument "you can write terrible things in any language" (not sure if you were attributing that argument to me or if that was a point you were trying to make).
Consider this Python: https://github.com/weberc2/nimbus/blob/master/examples/src/n...
It's fully declarative, but it does evaluate, so it's not static in the YAML sense. It outputs a JSON CloudFormation template (but it could easily output in YAML) which you could inspect visually before passing onto CloudFormation.
It's also statically typed although that's not evident from this file since all types are inferred in this file (however there are annotations in the imported libraries), and while the static typing is a very useful property, it's not what I've been talking about in this thread.
In my opinion, this is no less readable than the equivalent YAML; however, it's capable of doing much more (albeit if your infrastructure is just one S3 bucket, then this is overkill--to really understand the power of dynamic configuration, you would want a more complex example).
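For readers who can't follow the link, here is a minimal sketch of the same idea in plain Python. It is not the linked library's API: `bucket` and `template` are hypothetical helpers, and only the CloudFormation keys themselves (`AWSTemplateFormatVersion`, `Resources`, `AWS::S3::Bucket`, `VersioningConfiguration`) are real. The code is evaluated, but it only declares data and emits JSON you can read before handing it to CloudFormation.

```python
import json


def bucket(versioned=False):
    """Return a CloudFormation S3 bucket resource as plain data."""
    resource = {"Type": "AWS::S3::Bucket"}
    if versioned:
        resource["Properties"] = {
            "VersioningConfiguration": {"Status": "Enabled"}
        }
    return resource


def template(resources):
    """Wrap a mapping of logical IDs to resources in a template skeleton."""
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


rendered = json.dumps(
    template({"LogsBucket": bucket(), "DataBucket": bucket(versioned=True)}),
    indent=2,
    sort_keys=True,
)
print(rendered)
```

With one bucket this buys nothing over YAML; the payoff only shows up once the same helper is called dozens of times.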
Getting them to write Python/JS the correct way, never mind relying on them to do so, is straight up out of the question.
At least I know in Terraform/HCL they can’t map a config change over the 1000 new instances they spun up because they happened to write their for loop wrong.
Then use Starlark (https://go.starlark.net) or Dhall or similar.
> At least I know in Terraform/HCL they can’t map a config change over the 1000 new instances they spun up because they happened to write their for loop wrong.
To be clear, the proposal is to use a programming language to generate your HCL-equivalent configs, not to imperatively modify infrastructure. Consequently, you can inspect the generated "HCL" (or whatever the output is) and make sure it looks like the code they would write manually. Further, you can even write automated tests.
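A sketch of what such an automated test might look like, in plain Python. Everything here is hypothetical (`generate_instances`, `check_config`, the limit of 10); the idea is only that the generated config is data, so a badly written loop shows up as a failed check before anything is applied.

```python
def generate_instances(count, instance_type="t3.micro"):
    """Hypothetical generator: returns instance declarations as plain data."""
    return [
        {"name": f"web-{i}", "instance_type": instance_type}
        for i in range(count)
    ]


def check_config(instances, max_instances=10):
    """Return a list of problems in the generated config (empty means OK)."""
    problems = []
    if len(instances) > max_instances:
        problems.append(
            f"{len(instances)} instances exceeds limit of {max_instances}"
        )
    names = [inst["name"] for inst in instances]
    if len(names) != len(set(names)):
        problems.append("duplicate instance names")
    return problems


good = check_config(generate_instances(3))
bad = check_config(generate_instances(1000))
print(good)
print(bad)
```

Checks like this can run in CI against the generated output, the same way you would review a hand-written config.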
Then again, code has to be run in order to analyze its output -- that or code has to be data you can analyze (like a Lisp), but that can be very difficult to reason about.
So my preference would be to have libraries for constructing configuration data. Then you can execute a program to generate the configuration, and that you can use without further ado. The output may not be easy for a human to understand, though it should be possible to write code to analyze it.
It might be better to compare how to use the module/stack:
- Terraform: https://github.com/terraform-aws-modules/terraform-aws-vpc
- Pulumi: https://github.com/joeduffy/pulumi-architectures/tree/master...
So as a user, can I configure this Pulumi VPC stack before it's instantiated? Or do I have to use the defaults first and then use the CLI to change things? Do these CLI changes then get placed into code, or just into state? Does that mean I'm now in a situation where the code doesn't match the state?
Personally I find the Terraform configuration much easier to reason about, I see exactly where resources are declared just by scanning the file. (But I've also used Terraform a lot).
Edit: Ah, maybe I have to configure it via this config.py file? I appreciate what Pulumi is trying to accomplish, but that is certainly not a config format I'd like to be using. Maybe you could use HCL or YAML for it? ;)
Edit 2: Another last thought, I think a lot of the mindset in Terraform comes from Go, where the proverb "A little copying is better than a little dependency" is pretty well adopted. Before I started writing Go as my main language I didn't appreciate that mindset, but after 5 years with Go I've found it more and more appropriate.
https://go-proverbs.github.io/ -- https://www.youtube.com/watch?v=PAAkCSZUG1c&t=9m28s
1) The project does support config. So if you want to change (e.g.) the number of AZs, you can say
$ pulumi config set numberOfAvailabilityZones 3
$ pulumi up
3) We offer some libraries of our own, like this one: https://github.com/pulumi/pulumi-awsx/tree/master/nodejs/aws.... That includes an abstraction that's a lot like the Terraform module you've shown, and cuts down even further on LOC to spin up a properly configured VPC.
I am a big Go fan too, so I very much know what you're saying. (In fact, we implemented Pulumi in Go.) Even with Go, though, you've got funcs, structs, loops, and solid basics. Simply having those goes a long way -- as well as great supporting tools -- and you definitely do not need to go overboard with abstraction to get a ton of benefit right out of the gate.
Again, I'm biased and YMMV :-)
Cool, is it possible to do that without having to use the CLI? Are you doing any sort of state locking here? I've seen ops teams get saved from potentially horrible situations by Terraform's dynamodb state locking.
"You can make this into a library using standard language techniques like classes, functions, and packages."
That's pretty nice and it seems like it'll get you the same functionality as a Terraform module. Do you have any plans of releasing something like the Terraform Registry to help with discoverability?
Also, do you have any docs on writing providers? I've had to do that a few times for Terraform and getting up and running with that was pretty easy as a Go developer. I wouldn't really want to do that for every supported language though (no offense C#).
I'm seeing that some of this is using codegen to read the equivalent Terraform provider and generate the Pulumi provider from that schema. Is that the preferred workflow here for providers that already exist in the Terraform ecosystem?
Yeah it's just a file if you prefer to edit it. By default, Pulumi uses our hosted service so you don't need to think about state or locking. That said, if you don't want to use that, you can manage state on your own. At this time, you also need to come up with a locking strategy. Most of our end users pick the hosted service -- it's just super easy to get going with.
> Do you have any plans of releasing something like the Terraform Registry to help with discoverability?
I expect us to do that eventually, absolutely. For us it'll be more of an "index" of other package managers since you already have NPM and PyPI, etc. But definitely get that it's helpful to find all of this in one place -- as well as knowing which ones we bless and support.
> Also, do you have any docs on writing providers?
We have boilerplate repos that help you get started:
1) Native providers: https://github.com/pulumi/pulumi-provider-boilerplate
2) Terraform-based providers: https://github.com/pulumi/pulumi-tf-provider-boilerplate
> Is that the preferred workflow here for providers that already exist in the Terraform ecosystem?
Yes. We already have a few dozen published (check the https://github.com/pulumi org when in question). In general, we will support any Terraform-backed provider, so if you have one that's missing that you'd like help with, just let us know. We have a Slack where the team hangs out if you want to chat with us or the community.
And still keep it terminating.
You can do it but it means doing more cognitive engineering than "just throw python at it".
Another point: you can have a declarative turing complete language. I would really like to see people bring Prolog-like languages to things like Pulumi and Terraform.
That would also make it possible to get convergent concurrent application, which means we could get proper collaboration. That would be a strong step forward for devops.
I would venture to say that it’s not Terraform that makes people copy / paste. It’s the people. Call it lack of knowledge, not enough time, laziness, tight schedules...
Once your customers are on their own, new people join - no knowledge of Pulumi, resources get added / moved / evolve, there will be copy / paste in their Pulumi code too.
Not defending Terraform here. Just adding a point to the discussion.
The reality is that all programming languages have significant copy paste codebases using them, but there are features which help reduce the amount of it. Terraform is missing some of those features, and many of the features it does have were introduced in tf 0.12, which is less than a year old.
It’s interesting that some people bring up sbt as an example of how to use a "programming language" for configuration. The reason why sbt became dominant was the weight of Lightbend (Typesafe). There was no way to get away from it. Frankly, sbt can be an awful mashup of copy / paste too. sbt is so much magic, I would not be surprised to discover that the majority of the folks who use sbt have no actual clue why stuff works the way it works.
I haven’t tried Pulumi yet, I will try when I get the chance. I am eagerly waiting for an opportunity to use it. Hopefully it will surprise me in a positive way. Surely, it can deliver on what it promises. I have very fond memories of Chef and cookbooks in Ruby, it can be done.
Edit: personally, Chef solo (with the right tooling to eliminate the server) was the best experience so far. If Pulumi can improve on that (no agent), I’m looking forward to taking it for a test drive.
Well, the problem is that a majority of people don't want to / don't have the time to learn HCL, because it's not the most effective use of their time / not worth the "investment" to do so.
Learning HCL is not very rewarding, unless you are an ops person.
Learning a general purpose language like Python, TypeScript or whatever language your company uses is rewarding both for ops and dev people (or devops people if you like that term) and typically can be used for a much wider set of use-cases.
When introducing a new language, the pros and cons of doing so should always be carefully considered. Unfortunately, for devops tools, new languages like HCL, Jsonnet, Starlark, zillions of YAML pseudo-programming DSLs, etc. are often introduced very lightly, mentioning a handful of use cases where the new language shines but ignoring the cons and intrinsic costs (learning curve, new tools, editor integrations, package manager, etc. to be built).
Terraform works great for teams where you have a strict separation between ops and dev people. The ops people will spend their time learning HCL, the dev people will learn Python, TypeScript or whatever that is.
However if you are trying to truly embrace a "DevOps" model Terraform shows its flaws. Developers will either still heavily rely on ops people to "help them" even for trivial infra changes or they will write sub-par copy pasta HCL code that tends to be verbose.
TF 0.12 may have a bunch of new constructs which make it easier to reduce duplication, but the boilerplate that is required to create an actual reusable module with variables and import it (and the overall awkwardness of the module system/syntax compared to any other language) vs the simplicity of creating a reusable function/file in Python/TS is like night and day.
Furthermore the subpar editor support for TF makes it actually hard to follow references between modules and safely refactor code, so there is a much lower threshold at which an abstraction appears "magic"/incomprehensible in HCL, compared to typed TS/Python where you can easily follow references.
Source: ~2 years' worth of Terraform (incl. 0.12) and ~1 year's worth of Pulumi use within multiple companies and teams.
It's ironic that you sell as a plus that Python allows you to easily loop over data structures and make resources conditional, because pretty much all your Terraform resources there are conditional (with a few looping over lists for DRY purposes), while few of the Python resources are.
Many of the lines saved for declaring identical resource types are just because either the Terraform resource is declared with unnecessary values or because the Python one has a default value, which can be provided as well in Terraform.
But yeah, the bulk of the difference is that the scripts are doing different things by declaring different sets of resources.
> Plus, I prefer having a familiar language with familiar control constructs, rather than learning a proprietary language that the industry generally isn't supporting or aware of (schools teach Python -- they don't teach HCL).
Which comes back to my point about inexperienced (or the "10x" ones that cut corners until the table is round and then leave) developers preferring familiarity over using a specialized tool that takes into account common pain points, further fragmenting the space through "worse is better". I am certain I will die employed cleaning up ORM messes left by developers that didn't want to learn SQL despite having a whole field of mathematics backing it; so if you're successful, odds are I will also end up some day fixing the "declarative output" a Pulumi script produced on a developer's computer that is not reproducible anywhere else because it makes a request to his home server and mutates an array of resources somewhere depending on that response, the current time, the system locale and the latest tweet by Donald Trump.
Yeah, it seems a bit silly to say that a benefit is saved lines of code, yet the Terraform example is set up to do quite a lot more than the Pulumi example. The resources are just there and turned off with the "count" configuration. The Pulumi example isn't doing any of the RDS, Redshift, Elasticache, Database ACL, VPN gateway, etc. things. This example is a pretty substantial module and I'd guess the LOC would be pretty similar between the two if the functionality were closer.
"Turing complete" is a red herring. You can write a program in Dhall that will continue to run long after we're all dead. But this doesn't happen in practice and/or when it does we notice something is wrong fairly quickly and correct the problem. And because these infra-as-code-and-not-configuration solutions generate configuration, if you do have a loop that doesn't terminate or similar, it's not a problem because your program never deploys any changes.
As for making messes, our experienced developers make more of a mess with static configuration because it's fundamentally impossible to manage large static configurations with their inherent repeatable segments that must be kept in sync. The static configuration players try to solve for this by introducing hacky mechanisms for reuse (macros and nested-stacks in CloudFormation, text templates via Helm for Kubernetes, etc), but these fall over very quickly as hacks do.
Avoiding the halting problem is not the reason these languages are better for the task. It's the limitations that come with being turing incomplete that prevent us from doing a lot of stupid stuff without realizing it and doing "hacky workarounds" without properly understanding the problem we face.
> As for making messes, our experienced developers make more of a mess with static configuration because it's fundamentally impossible to manage large static configurations with their inherent repeatable segments that must be kept in sync.
Or don't do static configuration and just use something like Terraform where you can just reference a resource and pass it around.
You'll have to articulate those benefits to be sure, but I would wager that the principal reason to be turing incomplete is to address the halting problem and that the benefits you're thinking about come from other properties of the language (functional purity, immutability, limitations on I/O, type safety where applicable, etc).
Notably, there are lots of hacky workarounds employed in HCL and YAML because people don't understand the problem properly. The problem requires that we can generate arbitrary static configuration from a fixed set of inputs. If your organization is so inept that they keep adding in infinite loops and/or I/O, then by all means, try something like Dhall or Starlark (unfamiliar vs not-type-safe, pick your poison); however, if this is a consistent problem in your organization you probably need to replace your humans because these programs aren't hard to write correctly.
> Or don't do static configuration and just use something like Terraform where you can just reference a resource and pass it around.
Because this only addresses reuse at the resource level. You can do the same thing in CloudFormation; it's not adequate. For example, not everything is a resource. You ultimately need the ability to generate arbitrary static configuration. Terraform probably has lots of other disparate features that collectively address a good portion of the solution space, but programming languages have a unified concept ("functions") that satisfies the whole solution space, and programmers are already familiar with them. Terraform's job should be taking static configs and applying them to infrastructure--let a real programming language generate those configs, or at least offer a dynamic configuration language that is designed with a proper understanding of the problem (to use your words).
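To show what "generate static configuration from a fixed set of inputs" can mean in practice, here is a small sketch in plain Python. The `render` and `digest` helpers and the config shape are hypothetical; the assumption being illustrated is that the generator is a pure function with no network calls, clock, or locale, so the same inputs always serialize to the same bytes and reproducibility can be checked mechanically.

```python
import hashlib
import json


def render(inputs):
    """Pure function: the output depends only on the explicit inputs."""
    config = {
        "service": inputs["service"],
        "replicas": inputs["replicas"],
        "regions": sorted(inputs["regions"]),
    }
    # sort_keys makes the serialized text independent of dict insertion order.
    return json.dumps(config, sort_keys=True)


def digest(text):
    """Fingerprint of the rendered config, handy for comparing runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


inputs = {"service": "api", "replicas": 3, "regions": ["us-west-2", "us-east-1"]}
first = digest(render(inputs))
# Same inputs supplied in a different key order must give the same output.
second = digest(render(dict(reversed(list(inputs.items())))))
print(first == second)
```

Comparing digests across machines is one cheap way to catch a generator that quietly depends on something outside its inputs.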
None of my Terraform projects are 10k lines long. I find it's reusable and at almost the right level of abstraction (Typed templates). I tend to go for a minimum expressivity necessary for DRY. So far I have not found Terraform lacking for a single project, but I have found it lacking for expressing higher order infrastructure (infra code intended for multiple projects).
I've never managed a project with thousands of heterogeneous resources though. I question whether that's really a thing that a single team would do.
I wonder if you considered Dhall (https://dhall-lang.org/) that's declarative but at the same time has functions and other convenience factors.
Granted, no one is going to really do that. And there are good reasons to want provably-terminating programs (e.g., in DTrace, eBPF, ..., because probe actions have to not just terminate, but also run very fast). But for infrastructure deployment? I think Turing complete is fine for that.
One idea I've entertained is to use jq as a configuration language and have its output be a JSON text describing a fully-constructed configuration. Yes, jq is Turing complete, but it's so damned convenient!
Guix and Nix are both innovative ways to build production-ready, reproducible, secure deployments and platforms without side effects, and they give you rollbacks and transactions for free.
With Pulumi, I was able to just sprinkle in a couple of console.log statements and my "extension" was done.
Using whatever language your project already uses to then dump to these formats is a way more sane pattern IMHO.