Creating my personal cloud with HashiCorp (cgamesplay.com)
205 points by CGamesPlay 3 months ago | 86 comments



As a DevOps guy, I'm not a huge fan of Terraform.

I often hear from enterprises that Terraform is cloud agnostic, but that's usually wrong. Terraform modules are still specific to the cloud platform, and porting an app running on AWS to GCP requires a rewrite.

If you use AWS, you're probably better off using AWS CloudFormation, and for GCP, Google Cloud Deployment Manager.

A business reason is often that the engineers are already more familiar with Terraform, but learning CloudFormation is really not that hard. And if you can't work out how to use a basic tool, you shouldn't be running infrastructure anyway, because IT is a constant cycle of change.

I'm not saying Terraform is bad; I just think the platform's native solution is preferable to a 3rd-party tool.


(Creator of Terraform, Co-Founder of HashiCorp)

I'm quite late to respond here, but just wanted to clarify one thing: Terraform is WORKFLOW agnostic, not TECHNOLOGY agnostic. This is a key part of our product philosophy that we make the 1st element of our Tao: https://www.hashicorp.com/tao-of-hashicorp

I've talked about this more with more references in this tweet: https://twitter.com/mitchellh/status/1078682765963350016

I don't think we've ever claimed cloud portability through "write once run anywhere;" that isn't our marketing or sales pitch and if we ever did make that claim please let me know and I'll poke some teams to correct it. Our pitch is always to just learn one workflow/tool and use it everywhere, but you explicitly WILL rewrite cloud-specific modules/code/etc.

With Terraform, the big win for folks is learning how to write and use Terraform, then knowing a fully supported official tool (to some extent) for hundreds of API-driveable systems. Instead of educating an engineer on CloudFormation, Azure ARM, etc. they learn ONE tool and ONE syntax and then adapt that to their cloud-specific knowledge.

More details are in the tweet I mentioned above, but I hope that helps. Fully respect you not being a fan of Terraform, I don't mind that, I just wanted to make sure for yourself and others reading that it is clear that we also don't believe Terraform is cloud agnostic in the sense you described.


I actually much prefer this. When things try to be technology agnostic or too generic, you end up with the lowest common denominator of features.


I highly appreciate what Terraform does for me and the whole industry.

I also sometimes think about why I don't like it very much and how I would do it differently.

How the state is handled, including the potential secrets in it, is just frustrating. Having the root secrets for your whole setup exposed/insecure is bad. The state is relatively fragile and cumbersome to clean up or fix. I also can't grasp why tf even needs a state and why the cloud providers can't return the current state fast enough; then only a lightweight cache would be needed.

And, probably due to implementation details, plans sometimes show changes when no changes are actually necessary.

For me it's a good tradeoff to use terraform for setting up a k8s environment and then handle everything with ArgoCD.

Google's Config Connector is a great idea: you create a k8s resource and the cloud provider provisions it for you on their cloud. No terraform needed anymore at all.
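If memory serves, this works by installing CRDs so that cloud resources become ordinary Kubernetes manifests. A hedged sketch of what a GCS bucket might look like (bucket name and settings are made up, and the exact schema may differ from what's shown):

```yaml
# A GCS bucket declared as a Kubernetes resource; Config Connector
# watches this object and reconciles it against the GCP API.
apiVersion: storage.cnrm.cloud.google.com/v1beta1
kind: StorageBucket
metadata:
  name: my-team-logs   # hypothetical bucket name
spec:
  location: EU
  versioning:
    enabled: true
```

From there, `kubectl apply` is the whole deployment workflow, same as for any other k8s object.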


I also have to use Terraform, and I try to avoid looking at it as much as possible. HashiCorp's HCL is incredibly bad (not to use stronger words). I try to use an alternative like Pulumi whenever possible, but thankfully there is some hope with cdktf, since Terraform today is effectively technical debt. I can't see how anyone thought that HCL, a terrible language in every aspect, was a better choice than a general-purpose language like Python or JS. Everyone in our infra org utterly despises HCL. The workflow of Terraform is also very poor: it promotes bad practices and does not offer decent guarantees. I know I sound harsh, but I hope it is received as constructive criticism.


I'm curious to see what you think of the "hash stack" for self-hosted projects: consul, nomad, vault. Honestly, seems pretty ideal to me ;)

Why not provide a cloud host tier for startups akin to Cloudflare Pages?


I'm not the biggest fan of Terraform either, but it would be naive to deny that Terraform still offers a lot of advantages over CloudFormation even if you're not writing cloud-agnostic code (and I do actually agree with your point that Terraform doesn't make your infra cloud agnostic).

Terraform offers far more constructs than CloudFormation, and there is still a lot to be said for using the same language for describing your AWS infra as your GCP, GitHub, and on-prem infra (even if the resources differ between providers, requiring bespoke code for each). It's a bit like those who advocate node.js because it means the same developers can write frontend and backend code in the same language.

If you like the tooling around CloudFormation in AWS, then you're better off with Serverless (`sls`) or even Amazon's own CDK (https://aws.amazon.com/cdk/) than with YAML-based CloudFormation stacks, in my opinion.

That's not to say Terraform doesn't have its warts: properties are non-guessable, IDE integration is pretty mediocre, and calling modules is overly verbose (so bad that sometimes the calling code has as many lines as the module itself!), but I do still think it is the least-worst tool available at the moment. And one can always use a 3rd-party tool like Terragrunt to fix some of Terraform's shortcomings while still taking advantage of its benefits (though personally I'm on the fence about whether the industry really needs yet another transpiler that compiles to code that needs to be transpiled... it's starting to feel like abstractions all the way down....)


I agree with this. Terraform is definitely the least-bad tool, especially in that it integrates with so many more services than CloudFormation and has far fewer bizarre limitations than CloudFormation (e.g., you cannot pass objects in CF, and arrays can only be simulated as comma separated strings). Terraform isn’t great, but CF is awful, and even the AWS folks will point you at the CDK instead.


Agree CF is crap.

Each cloud's SDK, in the language the team is most familiar with, is by far the best option.

State can be stored in git. Any version of my infrastructure is a git checkout away.

I use Go, and the documentation for the AWS SDK includes copy-paste examples

Try checking out Terraform code from 6 months ago and running it? Frequently I cannot even get someone's tutorial example written a week prior to work without edits.

I checked out a year-old commit in my infra repo and rebuilt an entire ECS stack deprecated 18 months ago.

Just be programmers. The cloud ops scene is just reselling the same delusions as Unix grey beards and Windows server admins. It’s about making hardware do the right thing, not hand wavy semantics.

Most of the people I work with just regurgitate memes. Very few actually test them for truth.


I disagree with using the SDK directly because the SDK doesn’t have any reconciling capability, and that’s not something easily built correctly.

Creating resources with the SDK is easy—keeping them in the desired state is very hard.

Instead, we use a real language to generate a description of what we want (in YAML or HCL) and a reconciliation engine takes over from there.
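In case it helps, the core of "a reconciliation engine takes over" can be sketched in a few lines of Python. This is an illustrative toy, not any real tool's algorithm: given a desired and an actual `{resource_id: attributes}` map, diff them into a change plan.

```python
def plan(desired: dict, actual: dict) -> dict:
    """Diff two {resource_id: attributes} maps into a change plan."""
    create = {k: v for k, v in desired.items() if k not in actual}
    delete = {k: v for k, v in actual.items() if k not in desired}
    update = {
        k: desired[k]
        for k in desired.keys() & actual.keys()   # resources present in both
        if desired[k] != actual[k]                # ...whose attributes drifted
    }
    return {"create": create, "update": update, "delete": delete}

desired = {"bucket-a": {"versioning": True}, "bucket-b": {"versioning": False}}
actual = {"bucket-a": {"versioning": False}, "bucket-c": {"versioning": True}}

# bucket-b gets created, bucket-a updated, bucket-c deleted
p = plan(desired, actual)
```

The hard part in practice is, of course, fetching `actual` reliably and applying the plan in dependency order; that's the machinery Terraform/CloudFormation provide.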


The Terraform language server together with VSCode works quite well.


That's what I use and it is better than nothing but nowhere near as finely tuned as other language servers:

1. It's slow to return results

2. sometimes the only way you can generate a list of available properties is to type the first character, which is really annoying if you don't even know what the first character might be

3. it offers suggestions for properties that aren't even valid for that resource

4. there is no description against any of the properties so you're still left guessing (or having the docs open in another window) anyway


I'd point out that one of the biggest advantages of Terraform isn't managing cloud infrastructure (though I certainly like it for this), but for providing a common language for integrating vendors and other 3rd parties into my own cloud infrastructure.

I've worked in a couple organizations that had a lot of success managing things like cloud access security brokers, web application firewall appliances, ticketing systems, identity providers (with and without multi-IdP federation) and more.

Doing this without Terraform is incredibly manual and requires even more manual process to keep these systems in sync with your cloud. Having a common automation framework that can manage this is indispensable--and useful even if you're not using Terraform to manage your core infrastructure.


From my enterprise perspective Terraform is cloud agnostic, as you "only" have to configure the CI/CD pipeline, secret management, and state management once. Afterwards you are free to use any cloud you want and get all the benefits that come with Terraform (including infracost). Especially if you create templates, you can reuse Terraform across multiple projects quickly.

However, my biggest issue with Terraform is that the promise of dependency graphs is broken in practice. Providers will break when the underlying resources are not there yet. The hack to solve this is having several Terraform directories which you run one after the other.

Still, I think multiple clouds are the way forward. You can negotiate prices down and use the services best suited for your projects. Especially with other players such as Cloudflare and Backblaze, which already have Terraform providers available.


If you are saying that it's better to run native than Terraform, then to me it means that Terraform is not good enough, which is also my experience from poking at it. It covers all use cases on all clouds, but only ~80% of each use case. The rest you have to figure out by yourself, which often means diving into CloudFormation and such, so basically losing the advantage of Terraform.


Basically right. And every time the platform changes, Terraform has to implement it too! And more often than not, there's a lag before the Terraform provider supports what the platform already does.

I think Terraform was better before, but now CloudFormation is good enough for me, so I don't see the point of Terraform.

CloudFormation also has drift detection now, and I really dislike Terraform's state files.


I think it varies with the quality of the providers; at least in AWS I've yet to come across something missing.


This is only true recently, and largely only because the CF team got sufficiently embarrassed by TF supporting things literally months before they did.


But why not just Cloudformation then? What advantage does Terraform provide over Cloudformation in your opinion?


I find it easier to read Terraform over CloudFormation. I also find it easier to write.

Anecdotally: non-devops developers find it easier to edit Terraform than CloudFormation. They understand the resource and data blocks much more readily than YAML.

Terraform bridges the gap between purely YAML CloudFormation and purely JavaScript API calls. (JavaScript used as an example there, it could be any language).

Terraform makes tradeoffs to satisfy its problem domain and be easier to use than CloudFormation. Plus Terraform often gets features faster than CloudFormation, because CF seems to lag behind the AWS API.

I'd like to write some of my simpler Terraform stuff in AWS CDK, Pulumi, or something similar, to see if I can make use of more language features like inheritance or better decision logic. (currently I try to keep as much logic outside of Terraform as possible. It's supposed to be declarative and not have "real" coding language features shoved into it)


For one, it's closer to a proper programming language as opposed to a straight-up data interchange format. Sure, if you write it in YAML then you can take advantage of variables, but YAML's syntax for variables is pretty gross.

Comparing CloudFormation to Terraform is a little like comparing HTML and CSS to Javascript (though Terraform isn't nearly as nice to code in as Javascript -- and I'm not exactly a big fan of Javascript). You can cover most use cases with plain HTML and CSS but the moment you need to get a little more intelligent with your code you get stuck.


> For one, it's closer to a proper programming language as opposed to straight up data interchange format. Sure if you write it in YAML than you can take advantage of variables but YAML's syntax for variables is pretty gross.

So what is stopping you from using "a proper programming language" to generate the json/yaml cloudformation template?

This is what you see in GCP docs from day one. On AWS, they brainwashed everyone into this corner of "static template with parameters", so that you can "reuse" a template to build your custom stack. It's great for "look what I can do, mom" (but I have no idea what it's doing), but nobody sane would ever trust a 1-km-long YAML/JSON and deploy it. So if you have to inspect it anyway, why not make it easy to inspect? Split it into modules, add docs, etc. = code to run.

I have no idea how we went from random scripts to "reusable" random scripts (Ansible & co) to random static configuration and then, the cherry on top, "reusable" random static configuration. Insane. Abstractions on top of abstractions.

CDK is on the right track, but even there it's a mess, again for the sake of hiding complexity: constructs and deployment. Where did "do one thing well" and "keep it simple, stupid" go? :)
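To make the "generate the template from a proper language" idea concrete, here's a minimal sketch in plain Python (stdlib only; the resource names and bucket layout are hypothetical, but the emitted structure follows CloudFormation's standard template shape):

```python
import json

def s3_bucket(name: str, versioned: bool = False) -> dict:
    """Build one CloudFormation S3 bucket resource as a plain dict."""
    props = {"BucketName": name}
    if versioned:
        props["VersioningConfiguration"] = {"Status": "Enabled"}
    return {"Type": "AWS::S3::Bucket", "Properties": props}

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        # DRY: one function call per bucket instead of repeated YAML blocks
        f"Logs{env.title()}": s3_bucket(f"logs-{env}", versioned=True)
        for env in ("dev", "prod")
    },
}

cfn_json = json.dumps(template, indent=2)  # feed this to the deploy step
```

You'd then hand `cfn_json` to `aws cloudformation deploy` (or your CI) as the stack template; the "code" stays inspectable, modular, and documented, as the comment above argues.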


> So what is stopping you from using "a proper programming language" to generate the json/yaml cloudformation template?

Nothing, and there are already products out there that offer that. However, I think the issue will always fall back to the problem that you're compiling from a functional or procedural language down to a dumb data interchange format. That can cause a variety of issues, such as losing your carefully crafted order of execution.

> This is what you see in GCP docs from day one. On AWS, they brainwashed everyone in this corner of "static template with parameters", so that you can "reuse" a template to build your custom stack. It's great for "look what I can do, mom" (but I have no idea what it's doing) but nobody sane would ever trust a 1-km long yaml/json and deploy it. So if you anyway have to inspect it, why not make it easy to inspect? Split into modules, add docs, etc = code to run.

I don't think anyone was "brainwashed" by CloudFormation and the solution you describe is exactly the approach Terraform takes.

> I have no idea how we switched from random scripts to "reusable" random scripts (ansible &co) to random static configuration and then the cherry on top: "reusable" random static configuration. Insane. Abstractions on top of abstractions.

This I agree with. It's not just AWS though, you see YAML-based config all over the place from CI/CD pipelines to Kubernetes pods. And they all suffer from the same problems. It's easily my least favourite thing about the DevOps modernisation of what would have been random sysadmin shell scripts 10 years ago. Frankly I'm not convinced these YAML files are any more readable nor less brittle than the duct tape we wrote in #!/bin/sh before.

> CDK is on the right track, but even there it's a mess, again for the sake of hiding complexity: constructs and deployment. Where did One thing well and Keep it simple stupid go? :)

As I wrote elsewhere, I think CDK is aimed at a subtly different audience. CF, TF, etc. are more focused on the sysadmin side of DevOps, whereas CDK is more focused on the application-developer end of the DevOps toolchain. That's not to say sysadmin guys can't use CDK, but rather that CDK doesn't just aim to deploy infra; it embeds and deploys the serverless applications (like Lambda code) as well. It's more akin to the "full stack" style of developer, and not every team or individual whose job it is to manage cloud infra is going to be an application developer, particularly in larger organisations. So there definitely is a place for cloud infra stacks to be described in less sophisticated languages (even if that doesn't appeal to you and me personally).


> For one, it's closer to a proper programming language as opposed to straight up data interchange format. Sure if you write it in YAML than you can take advantage of variables but YAML's syntax for variables is pretty gross.

I think that's what AWS CDK[0] and Terraform's CDKTF[1] are trying to solve.

Given the context of your example, I'd liken Terraform to CSS, CloudFormation to HTML, and CDK/CDKTF to JavaScript. It's not a great analogy, but Terraform as it is right now is just close enough to a programming language to deceive you into treating it like one. But it really isn't, and those issues become glaringly clear the more you use it.

[0] https://aws.amazon.com/cdk/

[1] https://learn.hashicorp.com/tutorials/terraform/cdktf


Newer versions of Terraform are much better; I think they went v1.0 at the right time. But I do agree that there are still plenty of warts in TF compared to a "proper" programming language. However, TF 1.0 is still easily far more composable than CSS currently is. If anything, YAML (and thus CloudFormation) is more equivalent to CSS than TF is, given YAML's spec-level support for variables, templates, etc.

I've yet to try HashiCorp's CDKTF, but from what I've used of CDK it felt like the audience was a little different from those who would use TF. CDK feels more for orgs where the same team that writes the application code (e.g. lambdas) also writes the infra, a bit like Serverless (sls). Whereas Terraform tends to be more suited to orgs where different teams manage infra and application development. Generally speaking, of course.

Ultimately all the above tools work fine for production systems, so it's often just a question of preference.


Out of interest, do you find yourself writing actual software with CDK stacks integrated, or is it more accurate to say the CDK is just a stand-alone bit of code purely for deploying infrastructure?

I'm definitely in the latter camp, which is something I find frustrating. I get that for a developer the syntax familiarity might make CDK easier, but for me as a non-developer the pain of groping around the terrible documentation and learning how classes are supposed to be used far outweighs the annoyance of fixing YAML indentation.

Ultimately I worry people are jumping on the "true IaC" bandwagon without acknowledging that if their infrastructure is supposed to be somewhat static and immutable, a declarative language might actually be better.


I strongly suspect that these CDKs are not very well designed. In particular, what I want is something that lets me generate YAML/etc. in a type-safe fashion. That YAML is then the input for an engine which reconciles the desired state with the actual state (a la Terraform or CloudFormation). The idea is that the "real programming language" layer just lets us DRY our YAML. For a use case like this, we don't need inheritance or methods, just structs, maps, arrays, and functions; however, these CDKs typically lean pretty hard on inheritance and generally make things more complicated than necessary.
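The "structs, maps, arrays, and functions" layer the comment describes might look like this in Python (all type and field names here are hypothetical, sketched for a Kubernetes-style deployment document):

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Container:
    name: str
    image: str
    ports: list[int] = field(default_factory=list)

@dataclass
class Deployment:
    name: str
    replicas: int
    containers: list[Container]

def web(name: str, image: str) -> Deployment:
    # a plain function is enough to DRY repeated stanzas; no inheritance needed
    return Deployment(name, replicas=2,
                      containers=[Container(name, image, ports=[8080])])

# serialize the typed struct into the declarative document the engine consumes
doc = json.dumps(asdict(web("api", "api:1.2.3")), indent=2)
```

Typos in field names fail at construction time instead of at apply time, which is most of the type-safety win being asked for.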


It's important to note that Terraform doesn't compile down to CloudFormation stacks like CDK does; Terraform providers instead call AWS APIs directly. This should allow Terraform much more composability than CDK for the very reasons you've listed. Unfortunately Terraform does still compile down to a fucking JSON state -- which is easily my biggest complaint about Terraform.

My issue with anything that ultimately compiles to a JSON or YAML state/config is that you lose the dependency tree (or your dependency tree becomes rigidly defined by the way the transpiler converts your code into JSON). It causes so many problems on any large project that ultimately the only solution is to break the project up into smaller distinct projects within the same git repository -- which is basically the same solution as working with JSON/YAML CF stacks directly.

If someone created a language (or SDK for $PROGRAMMING_LANG) that worked with AWS APIs directly (like Terraform), didn't just transpile back to JSON, and wasn't as verbose as using boto3 directly -- well, I think that product might stand a real chance of displacing Terraform.

I've got a lot of strong thoughts on how this could be done right, so I did consider having an attempt at it myself using the shell scripting language I'd designed as a rough foundation. But with a full-time job, two young kids, and an increasingly popular open source project (namely my aforementioned $SHELL), I realised any Terraform competitor I created would be doomed to either never being maintained, or literally burning me out.


Neither system deals well with large projects, but my CF criticisms weren’t predicated on the project size. So at the end of the day, CF has all of the problems TF has and then some.

But I do think that it would be really interesting to build an IaC project, and you’re right that it probably would be doomed. :)


That is certainly my impression with CDK. An even bigger heartache is the fact that I am writing a Python "program" but then have to use a JS binary to execute the deployment. Having the CDK synth/deploy functionality exposed via actual execution of the Python script (or even as a built-in capability of the regular AWS CLI) would make much more sense to me.


Your suspicions are incorrect - I'd suggest trying the AWS CDK as it solves exactly the problem you want it to solve in the sense of providing strong type safety.


My only experience is with the AWS CDK.


For one of my projects I use CDK (JS) with a JavaScript project, and the CDK part does call limited parts of the project to look up how many DynamoDB tables to create and which kinds of indexes to generate.

Also, the API gateway configuration is generated based on registered endpoints within a larger application, but that's not done directly in the CDK; it's a separate explicit step in the Makefile, as a dependency of the CDK targets.

Disclaimer: I work for AWS, but not on anything related to this, other than using it.


Here are a few of my reasons, but I have many more: https://news.ycombinator.com/item?id=29048297


On GCP terraform works very well. But we try to keep everything in k8s.

So terraform is critical for our bootstrap procedure and for documenting our configuration.


One of the things I like about Terraform and Pulumi and the non-vendor-specific ones is the cross-cloud features. My very basic Terraform use from the article has a machine set up on Hetzner cloud and an S3 bucket on AWS, for example.


Terraform also has the advantage of being able to pull together multiple services outside of a single provider.

Want to use Cloudflare with AWS and another 3rd party provider that offers services in an AWS region? Simple with Terraform.


Also a DevOps guy: I like Terraform from the "user" experience side. It's a much more comfortable ecosystem to be in than AWS CF, and managing your resources is also a better experience. That is, until you hit an issue with a resource or situation not being correctly supported by TF. CF has the vast advantage of being native and fully supporting AWS resources. Unfortunately it gets complicated (not complex) very quickly and has a strong feel of a rushed MVP.


> I'm not saying Terraform is not good, but I just think a native solution of the platform is preferable over a 3rd party tool.

I would agree if you have your green-field approach and can commit to a single platform.

From my current experience: I use AWS, Azure, and various SaaS-hosted products like Elasticsearch, Instana, Opsgenie, Kubernetes, databases, Grafana, Prometheus. And with all these products you have a bunch of people in the company who specialize in their domain, can't know all the tools in full detail, but have to talk to each other.

So what makes terraform special in my case is that you can streamline the interaction between multiple teams by focusing on defining well-known interfaces between those teams. In the case of terraform, those interfaces are:

- variables (inputs)

- outputs

Or you can have specialized teams, which will offer terraform modules for other teams to use.

So for me terraform does not have to be agnostic, as that is not the point of it. The benefit of terraform is to streamline interactions (inputs, outputs) for your needs. Teams could automate their things with a Python script, for example, and use terraform just as a "hull" offering a way to pass inputs/args to the Python script and report back some outputs.

What a Dockerfile did was establish a well-known interface for defining what a container is, by giving a standard way to declare CMD, ENTRYPOINT, PORT, etc.

And terraform, in that sense, gives you a standard for defining your inputs and outputs when you build and configure infrastructure, SaaS, etc.
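Concretely, such a team interface is just the module's variables and outputs. A minimal hypothetical sketch in HCL (all names and the VPC resource are made up for illustration):

```hcl
# modules/network/variables.tf -- the module's input interface
variable "environment" {
  type        = string
  description = "Which environment this network belongs to"
}

variable "cidr_block" {
  type    = string
  default = "10.0.0.0/16"
}

# modules/network/outputs.tf -- what other teams may consume
output "vpc_id" {
  # assumes an aws_vpc.main resource defined in the module body
  value = aws_vpc.main.id
}
```

The consuming team only ever sees the variables and outputs; how the module satisfies them internally is the owning team's business.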


I used to be in the same camp, but would use the aws CLI for automation instead. I wasn't happy with how far behind many features in Terraform were at the time.

However, since then I gave Terraform another shot and dang am I glad that I did. It is fast and easy for me to get started with. I would much rather spend effort and time on Terraform, which is cloud agnostic, than spend a bunch of time learning something specific to AWS.


>> but I just think a native solution of the platform is preferable over a 3rd party tool

I had a job interview where someone asked me what I would prefer Terraform or Cloudformation for AWS.

I said Cloudformation because it's managed by AWS who writes the actual software as well. And they kind of smugly said Terraform is better because it's cloud agnostic.

I was thinking...have you ever USED Terraform.


I wouldn't consider myself super in love with terraform, but Cloudformation has been nearly 100% unpleasant experiences for me, though I will admit to not being an expert. Mostly it seems like it's harder to know your changes don't have any mistakes and will do exactly what you expect. Is there a CF equivalent to TF plan? We've also found TF seems to apply changes faster in many cases.


I agree CF kinda blows compared to a real programming language, and is probably legitimately better as an intermediate generated template than as hand-crafted code.

Even AWS has come up with the CDK to avoid hand-building CF and use real programming languages instead.

I've never used it though.

I just personally think that if you're going to be part of an ecosystem, things go much more smoothly when you stay in that ecosystem as much as possible.

Also, Amazon is a massive company compared to HashiCorp, so I feel more comfortable about my infrastructure's longevity with AWS tools.

Not saying hashicorp is going anywhere anytime soon and if it did the open-source community would probably take over, but it eliminates that tiny tiny risk.

I would probably use the AWS-CDK now instead of CF.


I'm not super familiar with Terraform, but I think change sets may compare to terraform plan?

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGui...


I’ve used CloudFormation at a previous employer. I hate, loathe, and despise the thing.

YAML is not a programming language, and any attempt to turn it into a programming language is fraught with pain, woe, and suffering — at least on the part of the poor suckers who are tricked into using it.

Give me a real programming language, with interfaces to the appropriate constructs necessary to do the job, via an SDK. In AWS, that means using CDK, not CloudFormation.

Feel free to argue over whatever aspects of Terraform you don't like, but please, for the love of ${DEITY}, please do not recommend CloudFormation as your suggested alternative.


> YAML is not a programming language

To be fair, while CF can be written in YAML for less visual noise, it's fundamentally JSON.

Also, it's not a programming language; it's a target-state description language.

> Give me a real programming language, with interfaces to the appropriate constructs necessary to do the job, via an SDK. In AWS, that means using CDK, not CloudFormation.

Technically, “CloudFormation via CDK”. You can't do “CDK, not CloudFormation”, because CDK is just a tool for generating CloudFormation.


You’re entitled to your opinion, but I want to point out that learning N different custom infra as code systems for N clouds is not really sustainable. If you’re mostly using just one cloud, it makes more sense. Terraform is cloud agnostic in that once you build a process around it… CI/CD of Infra as Code, runbooks around how to work with failures, break glass etc, you can basically use the same thing for any cloud, since all of them have terraform “providers”.

Also all the major clouds (even the minor ones) have pretty strong first class support for terraform.


What do people here think about Pulumi?


I like the concept of using different languages instead of terraform, because I consider the latter to provide a pretty terrible developer experience.

I don't like the idea of having a Turing complete language to do that. I would prefer things to be declarative. Honestly I would just be happy with a better terraform without yaml and better tooling for terraform.

I tried to use pulumi and they didn't support something I needed, or maybe I was too dumb to understand how to do it, so I insta quit.

They're also pretty pushy in selling whatever they're selling.


I really like Pulumi, and have done some really interesting stateful infrastructure work using it in the past. Because you get a full programming language, you can definitely code yourself into a knot that is difficult to debug, but the saving grace for Pulumi I think is that the output of the program is a simple declarative object describing the target state. So you can almost think of Pulumi as a generator for Terraform files.


As a software engineer, I find Terraform/HCL being a real declarative programming language to be a big advantage over the big ball of JSON/YAML from CloudFormation.


I seem to recall that years ago, Terraform was touted as an abstraction layer for cloud providers meaning that you could write some HCL and move between them seamlessly. A solution for vendor lock-in. Maybe that was nonsense but that's certainly not the case today.


This is precisely my sentiment, but it's drowned out by my peers with the cargo-cult buzzword of "terraform is cloud agnostic" :(


I was in a similar situation a few months ago. I had previous positive experience with k8s and some negative experience with nomad, so I was torn between k3s/k0s and just docker stack + swarm. I didn't want to install anything, as I started with a minuscule node (a $5 DigitalOcean node).

In the end I went with docker stack + machine + swarm + docker flow proxy and it's been pretty smooth sailing.

I don't need anything else installed on the machine: I can develop stacks locally with docker compose, deploy them to swarm, and even add replication on multiple machines for services that need it. I manage secrets with StackExchange's blackbox, and the secrets live encrypted in the repository. I can easily run commands on the local or remote stack via docker machine. I cheated with cron jobs: I just synchronise /etc/cron on all hosts with a directory in my repo, and those entries mostly run commands inside containers.

I have one stack for redis, postgres, and the docker registry, and multiple stacks for other applications. The flow for deploying stuff is basically: generate an id from the git hash, build the docker image, push it to the self-hosted registry, and stack deploy with the git hash as the version.
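For anyone unfamiliar with swarm, that flow maps onto an ordinary stack file plus one deploy command. A hedged sketch (registry address, names, and replica count are all made up):

```yaml
# docker-stack.yml -- one application stack; deployed with e.g.
#   TAG=$(git rev-parse --short HEAD) docker stack deploy -c docker-stack.yml app
version: "3.8"
services:
  web:
    # the image tag is the git-hash version id injected at deploy time
    image: registry.example.com/app/web:${TAG}
    deploy:
      replicas: 2
    secrets:
      - app_secret
secrets:
  app_secret:
    external: true   # created out-of-band with `docker secret create`
```

The same file works with `docker compose up` locally (minus the `deploy` section, which compose ignores), which is what makes the local-dev/prod-swarm split pleasant.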

A few caveats I found:

- Dependencies with docker health checks will take the duration of the health checks to become available to services needing them. Either you skip health checks or set things up so that infra is running before your services.

- In order to use docker-machine on multiple machines / with different contributors, you need to export the key used to set up the machine and import it on the other machines (there is an npm package that works very well for this, but it would be nice to have it natively).

- You have to clean up the docker registry manually every once in a while (I had to write a cron job for that)

- I'm unsure about the future of docker, albeit things work pretty well as they are.
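On the registry cleanup caveat: the official registry image ships a garbage collector, so the cron job can be as simple as the entry below. "registry" as the container name is an assumption, and manifests must already have been deleted (e.g. via the registry HTTP API) before garbage-collect can reclaim blob space.

```shell
# Hypothetical crontab entry: run registry garbage collection every Sunday at 03:00.
0 3 * * 0  docker exec registry bin/registry garbage-collect /etc/docker/registry/config.yml
```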


I had many of the same needs, so I wrote Harbormaster:

https://gitlab.com/stavros/harbormaster

All it does is manage Compose applications, with a sane directory structure. It's been working great, both for my personal use and for a few companies running production workloads on it.

I love that it's super simple and the workflow it has is fantastic, I just push to a repo and everything else happens automatically.


Nice! I may even have looked at that when I was getting started on this project. The two things that jump out at me as "not as easy" are private docker repos and cron jobs. I know that you can run cron images for Docker, but I didn't want the extra layer of indirection.


Hmm, what do you mean? You don't have to use either of those with Harbormaster.


I wanted to use both of them :)


Oh, well, you can :P

Can you go into a bit more detail about your setup?


Nomad has native support for periodic jobs that does not require running a cron docker image, and that was a big draw. As far as the private registry goes, I don't see why that would be particularly hard with a set of compose scripts, but I really liked the fact that Nomad has everything about them documented.

Honestly, if I had stuck with Compose files then Harbormaster looks like it's a reasonable way to manage them. But I felt like everything I wanted to do was just a little bit more difficult when I was using them, especially after comparing to Nomad.


I see, thanks! Maybe cron functionality is something I can add, actually... It seems useful.


I've been using nomad for smaller setups in AWS and it's been great.

The biggest issue I've encountered on it is when you move out of AWS (and you can't apply EC2-based IAM) but still have S3-hosted artifacts. Specifically it cannot receive Vault secrets for the artifact credentials because the nomad templates get applied at a much later stage.


Have you looked into levant? Seems like it would allow you to do this. Now, with levant the developer machine would be the thing retrieving the vault secrets, but it may be a useful stopgap.

https://github.com/hashicorp/levant


Hmm, would the stored job data still include the AWS credentials? That is, if I change the artifact S3 credentials and I run "nomad job plan" it will show the diff of the AWS keys. That means somewhere in the nomad raft logs the keys are exposed.


Yes, exactly. Definitely not ideal, but potentially a workaround depending on the security requirements.


Yeah, this pains me too. Here's a relevant issue to keep an eye on:

https://github.com/hashicorp/nomad/issues/3854

I've used an nginx-based S3 proxy in the past to get around this. Not ideal but it works.
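For a public-read bucket, that proxy can be a plain nginx reverse proxy (bucket name below is a placeholder; a private bucket would additionally need request signing, e.g. via a signing module or sidecar):

```nginx
# Hypothetical nginx S3 proxy: Nomad clients fetch artifacts from this
# endpoint instead of S3 directly, so no AWS keys appear in the job spec.
server {
    listen 8080;
    location /artifacts/ {
        proxy_pass https://my-artifacts.s3.amazonaws.com/;
        proxy_set_header Host my-artifacts.s3.amazonaws.com;
    }
}
```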


I'd be curious to know more about the average cost of this setup over a few months, and how scaling up/down affects the cost.


Great article. I’ve also just started using Nomad to manage personal services on my home LAN.

What I particularly like is that I have a mixture of Docker containers, VMs, and LXD containers all centrally managed.

Overall I found nomad to be fairly intuitive, and the single binary/single job paradigm of the Hashicorp stack is very appealing.


> they still feel usable after 6 months of use. By this time, most of the context that I learned to develop the project has faded away and I have to rely on the documentation that I left myself.

This was a much-needed reminder that the target audience for my documentation is my future-self.


I was looking into doing something similar with Nomad recently since lately I have been using systemd to launch containers and managing that config with some janky shell scripts. How are you configuring everything to run on a single server, including consul? Isn't Nomad designed to run on multiple servers, or are you running Nomad in multiple containers on your VPS? When I looked into this previously I got to https://stackoverflow.com/questions/56112422/nomad-configura... which has some suggestions for running Nomad on a single node but generally recommends against it.


For development purposes you very much can run Nomad and Consul on a single host. They recommend you don't, as you lose any HA, of course, but for those of us not seeking 5 nines of availability, that's quite acceptable.

In my lab I'm actually running a two-node cluster, but that's 'even worse' and engenders the occasional mildly surprising failure states.

Anyway, I can highly recommend setting up Nomad (and some friends) on a single host. It's going to be much more robust & interesting than 'some janky shell scripts'. : )
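A single-host agent that acts as both server and client is a small config; this is a sketch, and the data_dir path is an assumption:

```hcl
# Hypothetical /etc/nomad.d/nomad.hcl for a single-node setup.
datacenter = "dc1"
data_dir   = "/opt/nomad/data"

server {
  enabled          = true
  bootstrap_expect = 1  # single server: it elects itself leader
}

client {
  enabled = true
}
```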


What's the configuration here? 1 server-client and 1 client? Or are you actually running 2 server-clients?


It's not my ideal situation, but pandemic relocation etc means I'm not next to my normal lab (two ESX servers + NAS).

So, that said, yes, server (whitebox) plus desktop (xeon, 32gb) - both running Debian, with Consul, Nomad, Traefik, promtail etc running on both, but no shared NFS between the two, so I've got constraints on most of the important stuff to run on the server (prometheus, cortex, loki, nodered). In practice, desktop is running 24/7, just it occasionally gets a reboot.

Having consul/nomad running as server-client on two machines is undeniably weird, and requires some careful consideration around bootstrap_expect= settings.

Almost all of this is around having a useful facsimile of my work environment, rather than (say) running an SSG public site.


Hey, I'm the author of the Stack Overflow answer. You can absolutely do it and it's pretty cool! Just keep in mind that you lose the advantages of a 3-node setup: data replication across nodes, failure tolerance, new leader election. That's why it's discouraged for production environments where things are meant to be up 24/7.


the qemu driver for nomad seems pretty bare bones compared to kubevirt

https://www.nomadproject.io/docs/drivers/qemu

https://kubevirt.io/user-guide/virtual_machines/disks_and_vo...

Is terraform generally used to deploy workloads to nomad instead of writing tasks directly?


> Is terraform generally used to deploy workloads to nomad instead of writing tasks directly?

It can be, although it has some weird shortcomings. For example, if the job is already present in Nomad but not running ("dead"), I don't think you can use the terraform provider to start it again.

HashiCorp themselves suggest using terraform to provision the base nomad system (ACL, quotas, base system jobs), but perhaps not your actual applications:

> This can be used to initialize your cluster with system jobs, common services, and more. In day to day Nomad use it is common for developers to submit jobs to Nomad directly, such as for general app deployment. In addition to these apps, a Nomad cluster often runs core system services that are ideally setup during infrastructure creation. This resource is ideal for the latter type of job, but can be used to manage any job within Nomad.

https://registry.terraform.io/providers/hashicorp/nomad/late...
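A minimal use of that provider looks like the following; the address and jobspec path are assumptions:

```hcl
# Hypothetical Terraform config submitting a system job to Nomad.
provider "nomad" {
  address = "http://127.0.0.1:4646"
}

resource "nomad_job" "ingress" {
  jobspec = file("${path.module}/jobs/traefik.nomad")
}
```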

In my work helping companies with nomad, I've seen jobs run a few different ways:

* Write and submit HCL jobs directly to nomad

* Terraform templating and the nomad provider (as above)

* https://github.com/hashicorp/levant

* A DIY templating thing (e.g. python and jinja2 templates)

* A webapp that submits jobs as JSON directly to the nomad API, perhaps modifying it to match certain policies (kind of like k8s validating / mutating admission webhooks)

There's also a new thing similar to Helm: https://github.com/hashicorp/nomad-pack


I believe that it does have a provider that talks to Nomad, however I just directly write the Nomad jobs (or use levant, which is a lightweight template for Nomad jobs). The only use of Terraform is to set up the barebones VM and other infrastructure (S3 buckets, etc.)


With the Nomad qemu driver, any unsupported options (like bridge networking) can be passed as args.


What does it mean to host a personal cloud vs hosting a server? I thought the point of using cloud services was so you didn’t have to host it yourself.


Aren't a lot of people just using cloud instead of server now? It's similar to people saying I run on EC2, which is a server, and 95% of the time it's basically a VM.


I am using "personal cloud" to refer to the set of "cloud services" that I am hosting myself on my own server. The advantage of having the server is that my devices have a central syncing hub, and the advantage of hosting myself is no need to trust third parties. There are obviously disadvantages, like having to host it myself.


Thanks for sharing. I was looking for such examples


I am curious, why is cloud-init not enough?


cloud-init is for instance initialization, commonly called "provisioning". Nomad is an application (container/non-container) orchestration platform. The two couldn't be more different.


Good article!


Great stuff!!



