I always search for mentions of HashiCorp Nomad in the comments section of front-page Kubernetes articles like this. There are often few or no mentions, so I’d like to add a plug for the Hashistack.
For some reason Nomad seems to get noticeably less publicity than some of the other HashiCorp offerings like Consul, Vault, and Terraform. In my opinion Nomad is right up there with them. The documentation is excellent. I haven’t had to fix any upstream issues in about a year of development on two separate Nomad clusters. Upgrading versions live is straightforward, and I rarely find myself in a situation where I can’t accomplish something I envisioned because Nomad is missing a feature. It schedules batch jobs, cron jobs, long-running services, and system services that run on every node. It has a variety of job drivers besides Docker.
Nomad, Consul, Vault, and the Consul-aware Fabio load balancer run together to form most of what one might need for a deployment based on a cluster scheduler, somewhat reminiscent of the “do one thing well” Unix philosophy of composability.
Certainly it isn’t perfect, but I’d recommend it to anyone who is considering using a cluster scheduler but is apprehensive about the operational complexity of the more widely discussed options such as Kubernetes.
Being a bit of a HashiCorp fan, I tried Nomad for Transloadit, but at the time it did not support persistent volumes. K8s had that already. The more I started looking into k8s as an alternative, the more compelling features I discovered that Nomad did not have yet.
With the velocity of k8s it's hard to imagine how Nomad could catch up or keep up. K8s has operators, Helm, etc. That means you can add battle-tested components off the shelf with a single command. So, less wheel-inventing and boilerplate writing for us to do.
With the backing of a much larger community and much larger entities, it also feels like I’m less likely to be the first one to discover a new bug. Red Hat or Google or one of their customers will have hit and fixed it already, and my production platform keeps humming along nicely. K8s has just had more flight time and exposure to crazy environments and workloads, so more kinks are going to be ironed out.
I always did like the “do one thing right” unixy approach of HashiCorp’s toolset, and that you can pick the pieces you like. But (sadly for them) that means I can now pick Vault or Consul and run it on top of Kubernetes if I wanted to (re-using k8s' internal etcd is not recommended). I'm actually not overly sorry for them, seeing as how they're locking up more and more features behind enterprise products. I haven't checked in a while, but I wouldn't be surprised if they already had a Nomad Enterprise. Nothing wrong with HashiCorp wanting to make money, but if there is also k8s without those restrictions...
I have a few production Mesos clusters under my belt and one production Nomad cluster. I really like Nomad, and Mesos is not bad.
Kubernetes seems to be a lot of magic and NIH and tries to do everything itself, whereas Mesos and Nomad are nicely composable and easy to reason about.
Nomad's biggest benefit for me is its very nice integration with Vault (and Consul). I can have Nomad ask for a secret specific to a container instance, which Vault then generates and later revokes immediately once that container dies. Maybe this is possible with Kubernetes, but I have not seen anything that tight yet.
IAM instance profiles are nice, but they are instance-wide. Having each container get a unique, short-lived, and properly scoped set of secrets, injected at the last possible time and revoked immediately afterwards, makes me feel all warm and fuzzy inside.
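To make the lifecycle above concrete, here is a minimal Python sketch of the idea: one secret per container instance, issued at start and revoked the moment the container exits. This is a toy in-memory model, not Vault's or Nomad's actual API; `ToySecretBroker`, `Lease`, and `run_container` are hypothetical names made up for illustration.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class Lease:
    """A credential tied to exactly one container instance."""
    container_id: str
    token: str
    expires_at: float
    revoked: bool = False


@dataclass
class ToySecretBroker:
    """Hypothetical in-memory stand-in for a secret backend (not Vault's API)."""
    leases: dict = field(default_factory=dict)

    def issue(self, container_id: str, ttl_seconds: float) -> Lease:
        # Generate a fresh secret at container start; nothing is shared between containers.
        lease = Lease(container_id, secrets.token_hex(16), time.monotonic() + ttl_seconds)
        self.leases[container_id] = lease
        return lease

    def revoke(self, container_id: str) -> None:
        # Called when the scheduler sees the container exit.
        lease = self.leases.get(container_id)
        if lease is not None:
            lease.revoked = True

    def is_valid(self, container_id: str, token: str) -> bool:
        # A token is valid only if it matches, has not been revoked, and has not expired.
        lease = self.leases.get(container_id)
        return (
            lease is not None
            and not lease.revoked
            and secrets.compare_digest(lease.token, token)
            and time.monotonic() < lease.expires_at
        )


def run_container(broker, container_id, workload):
    """Issue a secret, run the workload, and revoke even if the workload fails."""
    lease = broker.issue(container_id, ttl_seconds=60)
    try:
        return workload(lease.token)
    finally:
        broker.revoke(container_id)
```

The `finally` block is the point of the sketch: the secret stops working when the container goes away, whether it exited cleanly or crashed, and the TTL acts as a backstop if the revoke never happens.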
I've not heard that criticism before. What are you referring to in particular? The NIH part seems incongruous to me, since Google were a major contributor in inventing warehouse-scale computing and cluster schedulers (cf. the Borg and Omega papers, etc.).
Catch 22: the lack of traction/adoption is the main point that stops me from exploring it more.
I would have to put so much effort into convincing customers and management not to go the (now almost default?) Kubernetes route that it's risky trying something else. A small hiccup in Nomad would be enough for the pitchforks to come out.
I would argue the biggest strength is maintainability. Managing and keeping up a distributed cluster with k8s is WORK. If you are not at the scale where you can dedicate full-time staff to managing only k8s, you shouldn't even be touching k8s. You need full-time staff to keep it alive.
Nomad is operationally simple. You can run it out of your normal devops roles; you don't need dedicated staff. That's mostly because you can pretty easily wrap your head around what it does and how it works.
Zero maintenance work implies you are not doing security patches or upgrades. So as soon as you have a problem, not only will you be left holding the now-broken pieces, but nobody will have any reason to help or support you unless you pay them $$$$$$ (and even then... maybe not).
I hope whatever you are running under k8s isn't crucial or important, and I really hope I'm not a customer of whatever you "operate".
Maintenance is real. That applies to everything if you want it to work reliably for any length of time. There are various ways to handle maintenance: do a little consistently and constantly (what most of us professionals do), or do large bulk replacements every X amount of time (like when stuff crashes and burns and nobody can remember how to fix it, so they just replace it with whatever is new and shiny).
Ah, sorry! I didn't realize Google had started offering hosted k8s. That definitely keeps maintenance down, since Google does it for you. It's been a while since I've dug into k8s in depth.
Cool. This definitely makes it easier to use k8s, but that's very different from running k8s. My comments are geared toward running k8s yourself. My systems are all on physical hardware we own, hence I don't really pay a lot of attention to the latest and spiffiest in hosted platforms.
Having a correct mental model of the Consul architecture, and realizing that the Raft cluster (consistency) and the Consul cluster (gossip) are two separate layers, does wonders.
Additionally, in the early days some tools were missing (like modifying the Raft peer members online), but they are all there now.