
Sorry to be off-topic, but I've been using Pulumi at work for the past 6 months and I'm really not impressed. It's basically just Terraform but worse, with a million ways to declare your infrastructure instead of just one. Infrastructure people tend not to write the best code, and from my observation the extra freedom of an imperative language just makes stuff even more complex and harder to maintain. It's also much harder to automate than Terraform; I'm not aware of any equivalent to Atlantis.

Also, Pulumi previews (equivalent to plans) are complete bullshit. If you don't write your code carefully, resources can be created and removed and you won't know it's going to happen until you start applying... it's an engineer's worst nightmare when a tool lies.




Engineering Manager for the Pulumi Cloud here.

We have an equivalent of Atlantis called Pulumi Deployments[1]. The benefit of the Deployments platform is that it is entirely API driven. In addition to defining CI/CD and deployment in configuration/code, we offer APIs that let you do this programmatically. Great for platform automation where you are setting up hundreds or thousands of stacks.

In addition to `git push` workflows, we also support other deployment triggers, such as a REST API. This is pretty unique, and lets you do things like build RESTful infrastructure APIs [2] on top of the deployments platform.

[1] https://www.pulumi.com/blog/pulumi-deployments-platform-auto...

[2] https://github.com/pulumi/deploy-demos/tree/main/deployment-...
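As a rough sketch of the REST API trigger mentioned above (simplified here; see [1] for the exact endpoint and payload shape), kicking off an update for a stack looks something like:

```
// Sketch only: trigger an update for a stack via the Pulumi Cloud REST API.
// The endpoint path and request body are simplified; consult [1] for details.
async function triggerDeployment(org: string, project: string, stack: string) {
    const res = await fetch(
        `https://api.pulumi.com/api/stacks/${org}/${project}/${stack}/deployments`,
        {
            method: "POST",
            headers: {
                "Content-Type": "application/json",
                Authorization: `token ${process.env.PULUMI_ACCESS_TOKEN}`,
            },
            // operationContext picks what to run; deployment settings
            // configured on the stack fill in the rest (source, commands, etc.).
            body: JSON.stringify({ operationContext: { operation: "update" } }),
        },
    );
    return res.json();
}

triggerDeployment("my-org", "my-project", "prod").then(console.log);
```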


I've also made the switch from managing a few thousand Terraform modules to handling most app-code things in Pulumi and have run into some of these limitations.

With Terraform + Terragrunt + Atlantis, we created https://github.com/transcend-io/terragrunt-atlantis-config and had an extremely robust and easy to use flow for updating all infra code.

We've since moved to an approach where more of our infra/security things are managed in Terraform (like Guardduty, SSO, Github repo settings, etc.) maintained by more devops folks, and our app code is mostly in Pulumi (lambdas, Fargate, CloudFront/CloudFlare CDNs, etc.). To accomplish this without something like Atlantis, we moved the app code infra deployments from being deployed continuously pre-merge via Atlantis to being deployed via `pulumi up` calls in our normal CI flows, so like right next to where we build the docker images and restart ECS services, as an example.

Overall I actually really love this flow. It is so, so much easier to create multi-regional infra in Pulumi with a quick for loop. It's also much easier to do things like run esbuild over our TypeScript, bundle the output, and ship it up to a Lambda function, all from pulumi/typescript, without separate build steps, terragrunt pre-hooks, or Docker build steps inside terraform provisioners, which I always found slow and clunky.
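A minimal sketch of that esbuild-to-Lambda idea (the file paths, runtime, and IAM role here are placeholders, not our actual setup):

```
import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";
import * as esbuild from "esbuild";

// Bundle the handler at deploy time; entry point and output paths are placeholders.
esbuild.buildSync({
    entryPoints: ["./src/handler.ts"],
    bundle: true,
    platform: "node",
    outfile: "./dist/handler.js",
});

// Minimal execution role for the function.
const role = new aws.iam.Role("fn-role", {
    assumeRolePolicy: JSON.stringify({
        Version: "2012-10-17",
        Statement: [{
            Action: "sts:AssumeRole",
            Effect: "Allow",
            Principal: { Service: "lambda.amazonaws.com" },
        }],
    }),
});

// Ship the esbuild output directly as the function code, no separate build step.
const fn = new aws.lambda.Function("bundled-fn", {
    runtime: "nodejs18.x",
    handler: "handler.handler",
    role: role.arn,
    code: new pulumi.asset.AssetArchive({
        "handler.js": new pulumi.asset.FileAsset("./dist/handler.js"),
    }),
});
```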

I would agree that Pulumi's plans are a disappointment though, exactly as you said.

Overall I've been happy with the change, and I think we've seen some improvements in the velocity with which developers can launch services that meet our requirements.


You can automate both Terraform and Pulumi in a similar way to Atlantis using Spacelift[0] (though generally with a lot of additional features and customizability thrown in vs Atlantis; but you can also accomplish basically the exact Atlantis flow).

It's a CI/CD system specialized in Infrastructure-as-Code.

[0]: https://spacelift.io

Disclaimer: Software Engineer at Spacelift, though I'm recommending it not just because I work there, but because I think it's legitimately a very good product.


Thanks for reminding me about this, we came across it a while ago. I should book a call. Do you guys support on-premise hosting?


Yes!

We support using the SaaS version with self-hosted workers and privately hosted VCS systems, as well as (released just last week[0]!) a fully self-hosted version of Spacelift (specifically, bring your own AWS).

[0]: https://spacelift.io/blog/introducing-spacelift-self-hosted


Atlantis is basically just a Terraform-oriented command runner. I believe you can create a custom Atlantis workflow that'll just run `pulumi preview` or `pulumi up`. That's how Atlantis supports terragrunt as well: https://www.runatlantis.io/docs/custom-workflows.html#terrag...

Our experience with Terraform is exactly the same as what you're describing with Pulumi. We have hundreds of stacks and they are all implemented differently because they are written by a hundred different people with varying knowledge of Terraform. Terraform is a goofy language and it's very hard to do even simple things like conditionals. There is also a massive amount of copy/paste because people have no idea how to set up their provider or remote states and terraform has no real way to make those DRY.


I think the fact that there are “infrastructure people” vending infra for devs is usually evidence of a mistake. If you have platform engineers, sure, that makes a ton of sense. But adding friction to the developer workflow of waiting for someone else to do something that’s an API call away is strange


It's very valuable to have someone on your team thinking about infra all the time.

We know that we are constantly pulled in many directions in our industry, and often we take shortcuts to get the work out the door. Infra is not a place you typically want to take shortcuts. Burdening devs with the infra | ops responsibilities is a sure way to security incidents and inflated costs.

It does provide a good market for consultants|contractors to come in and clean up afterwards.

If we look at this job separation in a different analogy, why do we typically separate FE & BE development? b/c people can only be expected to be proficient in so much of the stack? And you typically want someone around who is proficient for each part of the stack?


> If we look at this job separation in a different analogy, why do we typically separate FE & BE development? b/c people can only be expected to be proficient in so much of the stack? And you typically want someone around who is proficient for each part of the stack?

It's just been a convenient way to divide up the work? When FE & BE don't work together and aren't aware of each other, it's just another mess. And then people invented Backend For Frontend or GraphQL to deal with it. We're just adding more layers, abstractions, and complexity on top.

It's valuable to have someone dedicated to infrastructure. It's even more valuable for everyone to be aware of the whole ecosystem. No one lives in a silo.


Truth, I've built a lot of automation for a CI / ops system. The devs have not taken the time to learn it, because business needs them to deliver value to the user. This certainly adds pain at times, but overall the business is better off with specialization and experts.

In an ideal world, it would be great for everyone to be aware, but people have limited time and the systems naturally grow large enough that you would have to spend all of your time just keeping up with the changes.


> This certainly adds pain at times, but overall the business is better off with specialization and experts.

This is where we have architects, microservices and layers of complexity and abstraction.

> and the systems naturally grow large enough that you would have to spend all of your time just keeping up with the changes.

I've seen more cases where it doesn't actually need to be that large - just like you don't need infinite scaling or 1000 microservices and microfrontends. Instead of accepting that it has to grow, think about how it can work together more efficiently. In a lot of situations there's been more downtime caused by the redundancy, DR, and other setups than there would have been with a simple instance.

Everyone architects their own part and a build tool / process for each of it... when does it end?

I see these "systems" they mention more as a mess, and most people spend 80% of their time working around them or just procrastinating over how fragile they are.


Amazon seems to do it fine, as have many other places I’ve worked.

The truth is that infrastructure is not that complex when you have someone giving you building blocks and setting up guardrails. Having someone give you the pipelines and setting the boundaries is great. Choosing the pieces to get the product over the line is not that hard in a world where cloud is the norm.


You need a big team to maintain the platform and provide controlled access to do stuff (PaaS is the buzzword), and then a bunch of above-average devs that are motivated, trained and incentivized to do their own bits of Infra properly.

Possible at FAANG[1], but it gets a lot harder as companies get smaller and/or the dev teams are not set up to do that.

[1] I assume the SREs are not the only ones doing the Infra code


I see what you're saying but I also don't. Developers can't be expected to maintain the infrastructure too, there's an immense amount of work involved to keep it reliable and secure.


Sure, there is. Let your cloud provider do that work. They’ll do it better and it lets you keep a platform team focused on where they add value.


Agreed, I do not understand this backwards movement in the DevOps world. My hypothesis is that they are catering to a different group, i.e. enabling developers to do Ops who don't want to learn TF and want to use their preferred language. DevOps-first practitioners are in short supply, so it makes sense that there is a market for this.


Why is it backwards? Is making it more accessible and inclusive backwards? So these "DevOps practitioners" who are so different according to you have never used Python or any other programming language?

We should enable everyone to at least be aware of Ops and be able to contribute. Why does it need to be gated behind a special language, i.e. HCL?

A lot of times things go rogue exactly because developers don't understand and claim to not have a need to understand because it's not their job. Ultimately the code runs on the infrastructure provisioned just like how we live on Earth altogether.

Just like moving to recycling and clean energy the only way is to go at it together and not create more divide.


It's backwards because we used to do it that way and then upgraded to declarative IaC which gave us better reliability and confidence. It's backwards because tools like Pulumi use techniques from before we learned better. It adds more complexity and makes understanding harder. I still don't see how this is a win. The only exception is for developers who don't want to learn the best tools & techniques for IaC, pandering to their preferences over providing better systems. Sure there is a market, but that doesn't make it a good product, especially at scale.

The process of writing, reading, and understanding how the infrastructure comes to be is important. If you have never been on call for a production outage, you won't know how hard it can be to make the correct fix in a stressful situation. Being able to do that is more important than how easy it is for anyone to write the initial version. Take your time writing good IaC during development, make high-stress situations easier.

It's not gated behind a single language, there are multiple tools that provide declarative IaC. You would be surprised at how many folks in the Ops space have not written code beyond simple scripts to glue various tools together. It's something I require for our devops hires, but it is not necessary for all orgs. Most of the time you are writing TF, CF, or Yaml anyway.

It is interesting that Pulumi now supports a Yaml interface, but at that point why use that over TF directly? In the end, Pulumi is just a wrapper around TF. Personally, I use CUE -> tf.json for IaC. It's a much better wrapper with provable correctness.


Actually, I think it's not going backwards, but moving in circles, with some changes, hopefully improvements, each time through the loop - so possibly a bit like a corkscrew in 3D (circles in two dimensions, a straight line in the third).

Why are we going in circles?

Because there are conflicting requirements and our implicit assumptions prevent us from resolving the conflict.

The conflicting requirements: one is being able to describe infrastructure in its own terms ("declaratively"), the other is being able to abstract over infrastructure.

If the first requirement is prioritised, we get JSON files, yaml files or specialised IaC DSLs that tend to be limited, partly on purpose, but mostly accidentally, to the point of being crippled. We soon discover the limitations of this approach, which is essentially that for anything a bit more complex we need the abstraction capabilities of a general purpose programming language.

Martin Fowler makes a very similar case for integration here: https://martinfowler.com/articles/cant-buy-integration.html

So the pendulum swings towards general purpose languages with "infrastructure libraries". As far as I can tell, we've had two broad generations of these: the first was libraries to build the infrastructure, the second that we are on now (CDK, Pulumi) output "declarative" descriptions of the infrastructure.

Sounds good, except that there is now a layer of indirection, because essentially all our general purpose languages are actually not general purpose, they are domain specific languages for the domain of algorithms. So what you can express is an algorithm for creating the infrastructure(-description), not the infrastructure(-description) itself.

That doesn't sound like a big deal, but it actually is, not least because the indirection encourages all sorts of shenanigans and messes, including overdosing on indirection.

I think the only way to break out of this back-and-forth (or circle) is to create general purpose languages that can directly express and abstract over things other than algorithms.

Oh, and there's actually a third somewhat conflicting requirement: this description needs to be accessible programmatically (see Kubernetes and friends). If it's true, as I believe, that the other requirements point to a specific linguistic solution, this means that there also must be really great metaprogramming APIs.

Objective-S (http://objective.st) has those capabilities, and recently I've been starting to do some experiments with describing and acting on infrastructure, and the early results have been very promising. Having the concept of URI built into the language (Polymorphic Identifiers) is a good start for directly talking about infrastructure. Storage Combinators also help a lot and the software-architectural concept of a "system" composed of components connected via connectors also seems to map in a fairly straightforward manner.


Well said. The constraints of Terraform (HCL2) are a blessing.


I've been replacing my TF with CUE -> tf.json

I find it to be the best of both worlds


Maybe it shines if your infrastructure is really dynamic?


The project I'm on involves dynamic infrastructure. I don't see any benefit Pulumi provides over Terraform in this regard. In fact, Terraform's module system is more convenient if you want to do per-tenant infra versioning for example.


You've used Pulumi extensively and don't see a single benefit?

Here's setting up Guardduty in AWS in multiple regions using a terraform module: https://github.com/gruntwork-io/terraform-aws-security/blob/...

In pulumi, it would be:

```
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

[
    "us-east-1", "us-east-2", "us-west-1", "us-west-2",
    "ap-south-1", "ap-northeast-2", "ap-southeast-1", "ap-southeast-2",
    "ap-northeast-1", "ca-central-1", "eu-central-1", "eu-west-1",
    "eu-west-2", "eu-west-3", "sa-east-1",
].forEach((region) => {
    const provider = new aws.Provider(`aws-provider-${region}`, {
        region: region,
    });

    const guardDuty = new aws.guardduty.Detector(`detector-${region}`, {
        enable: true,
    }, { provider: provider });
});
```


The link is 404 but maybe you have a point there. I don't think this alone is enough to make me want to use Pulumi over Terraform though :)


If I use Python for my Pulumi, how could I reuse the work of a peer who uses JS for Pulumi?

Do we need to have multiple "mirrors" of our internal infra modules? Or do we need multiple language runtimes in our deployment runner in CI?

With TF, there is one language and one binary


You build a multi-language component: https://www.pulumi.com/blog/pulumiup-pulumi-packages-multi-l...

You can then publish a language SDK for your supported languages, so npm for node packages, pip for python.

The invoking peer just needs the runtime for the language they're using installed locally.

The Python boilerplate is here: https://github.com/pulumi/pulumi-component-provider-py-boile...

Here's an example component built in Go: https://github.com/jaxxstorm/pulumi-productionapp

Then in the examples you can see it being used: https://github.com/jaxxstorm/pulumi-productionapp/tree/main/...


Thanks for that blog link! I have been asking this question for a while and you are the first to provide an actual useful answer.

Can I pass arguments to these components in other languages? Still unclear

It looks like it limits reuse to specific points, though. For example, it still seems impossible to use a helper function from one runtime in another, so we would still need to maintain utilities in multiple languages.

Either way, it looks like a lot more complexity for anyone who goes down this path


You would have separate JS and Python stacks that could share outputs, similar to a `terraform_remote_state` data source in terraform-land. There's also yaml "language" support if you just want to use a config language: https://www.pulumi.com/docs/intro/languages/yaml/, but I have never messed with this, as we really just use typescript.
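For what it's worth, the output-sharing side looks roughly like this in TypeScript (the org/project/stack names and the `vpcId` output are placeholders for illustration):

```
import * as pulumi from "@pulumi/pulumi";

// Reference another stack (e.g. one written in Python) by org/project/stack.
const network = new pulumi.StackReference("my-org/network-py/prod");

// getOutput returns an Output you can wire into resources in this stack.
const vpcId = network.getOutput("vpcId");

export const reusedVpcId = vpcId;
```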

As somebody who maintains public terraform providers though, I do want to point out that "With TF, there is one language and one binary" is not totally accurate. Each provider must be installed separately and runs separately as its own process, and the core terraform binary calls out to the providers via gRPC. You can verify this by running `ps` during your next terraform run and seeing all the different providers in their own processes.

Pulumi actually uses the same gRPC setup, and can even communicate with Terraform providers. So the only real difference, binary-wise, is the "core" terraform binary being replaced by a different "core" pulumi binary based on the language you are using; the rest of the providers will all be the same, and you'll run multiple binaries either way.

If your company only supports one or a few languages in your normal production code, you can also just use those languages for Pulumi, and still get one language and one binary. There's nothing forcing you to use multiple different languages.


It's still an extra dimension of complexity to support N language runtimes. Won't each of them still need the same TF providers?

With TF, sure, there is a call to fetch those providers & modules, but still, I don't have to install N language runtimes, each of which needs its own IaC to build, adding another process to the mix.

I would assume most eng orgs try to limit their Pulumi to one language to skirt this issue


> If I use Python for my Pulumi, how could I reuse the work of a peer who uses JS for Pulumi?

Standardize on the language(s). It's important. Pick 1 for Pulumi and standardize on it. Even better pick 1 or 2 for the whole company and standardize on it. In a large company...

> Or do we need multiple language runtimes in our deployment runner in CI?

Already does happen.

But back to the original question: you can create Pulumi packages that generate SDKs for the different languages as required.


Just wondering - have you tried CDKTF? What do you think of that direction?



