Stripe open-sources Skycfg, a configuration builder for Kubernetes (github.com)
97 points by jmillikin 8 days ago | 35 comments





I'm just waiting for someone to make a Kubernetes config generator based on Google's GCL (GCL itself isn't open source, but there is an open-source clone of it, although without k8s integration/examples: https://github.com/rix0rrr/gcl).

I feel it has almost limitless flexibility, without requiring either boilerplate or complexity where you don't need it. Nothing else comes close in that regard.


I wrote kubecfg as an unashamed imitation of borgcfg, because for all that we like to complain - borgcfg + GCL + git works extraordinarily well. Kubecfg uses jsonnet, which is similarly openly inspired by GCL (but with some important changes).

https://jsonnet.org/ https://github.com/ksonnet/kubecfg Example real-world usage: https://github.com/anguslees/k8s-home


Disclaimer: I maintain Borg's configuration tool, which was originally GCL.

GCL is terse and dead simple in the most common cases. To me it's the best general-purpose configuration language I've seen so far.


The main caveat of the above is that GCL is so powerful that it's easy to write crazily complex puzzles in it. Just because you can write a Mandelbrot fractal generator in your config language does not mean you should.

And you can bet the person who wrote the configs for that service you inherited did enjoy seeing just how far they could stretch the flexibility of GCL...


Well, we have good ways to solve these problems. Complexity management in programming languages is a well-understood problem, and I see plenty of low-hanging fruit here.

Also, terseness has nothing to do with complexity. GCL did not have much syntactic sugar intended for terseness; it's just naturally simple because it chose the right abstractions.


HashiCorp comes pretty close with the new edition of HCL, at least in terms of the language. It's still more flexible in its output (unlike GCL, where the output is protos).

If you are familiar with Google's piccolo, this is closer to that.


I am seeing the K8s ecosystem repeat Borg's own history of configuration churn. There seem to be so many configuration tools for K8s now, which looks like a much worse situation than Borg's (Borg started with only one configuration tool).


For those who are as confused as I was about why Kubecfg exists in the Ksonnet org on GitHub when they also have Ksonnet itself, I found this explanation: https://blog.heptio.com/the-next-chapter-for-ksonnet-1dcbbad....

Short summary: they're different but somewhat complementary tools. Kubecfg was made by Bitnami, Ksonnet by Heptio. Kubecfg uses Jsonnet and is less opinionated than Ksonnet, but (as far as I can see) can also use Ksonnet.


The fork is my fault. I agree it's confusing.

I wrote (and still maintain) kubecfg. Heptio joined the project and started adding lots of great stuff. Eventually it was clear they wanted to take it in a direction that was different to the original borgcfg-like vision. I suggested we split that new functionality into a different tool so we could keep exploring both directions without trying to mash both into the same cli flags. Hence ksonnet/kubecfg and ksonnet/ksonnet.

They both use jsonnet internally, generate the same k8s resources in the end, and have that common code heritage, so have many similarities.

ksonnet/ksonnet has a bunch of extra ("rails-like") tooling to hold your hand while you generate jsonnet, and assumes it uses the k.libsonnet library (https://github.com/ksonnet/ksonnet-lib).

ksonnet/kubecfg is much dumber and really just conceptually `jsonnet | kubectl apply`. In particular kubecfg avoids having any opinion about which jsonnet libraries you use. We use it extensively with https://github.com/bitnami-labs/kube-libsonnet, but you can also use it with https://github.com/ksonnet/ksonnet-lib or anything else that is valid jsonnet.


Thanks for the explanation. Is Kubecfg like Helm in that it will apply a "destructive" diff (i.e. delete resources belonging to a deploy that are not part of the new deploy)?

I took a look at Ksonnet a while back, and it seemed to have way too many bells and whistles. In particular, I didn't like the sheer number of files, in something like three separate folders, that you have to maintain for each project. I love the idea of declaring a schema that you're generating data for, but Ksonnet seems to have adopted a fairly complex structure to accomplish this.

We use Helm right now, though purely as a templating engine and destructive deployer of charts that we store in the same repo as the app itself. The package management isn't useful for our own apps, and really only gets in the way. We're looking at alternatives that don't use text-based templating and come with slightly higher-level concepts of releases.

Right now, we wrap Helm with our own little CLI so that we can, for example, automatically build all the template inputs from a set of (defaults, environment-specific stuff, local overrides). Our tool also presents a diff if you want, and records things like git revision info and deployer name in the annotations. Thus, when you ask our tool to deploy something, it can look at what the currently deployed revision is, then do a "git log" to find what's changed, display that history with nice colours on the terminal, etc. before deploying. All things that, in my opinion, a Kubernetes deploy tool should have.


(EDIT: the nick rang a bell: thanks for ktail!)

kubecfg has the --gc-tag flag, which you pass explicitly so that it knows which of the existing resources in the k8s API server were created by this set of kubecfg-maintained config (this "system", this "application"; it's up to you to decide the grain of your modelling), and can thus delete the resources that are no longer output by evaluating the current configs.

This catches the case where you delete resources, but also the cases where you rename resources or "move" them between namespaces.

It's implemented without any in-cluster component (no "tiller") by simply setting the GC tag as a label on the resource.
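
For concreteness, a typical invocation looks something like the line below (the --gc-tag flag is the one described above; the exact subcommand and file name are assumptions on my part, so check kubecfg's help for the precise form):

  kubecfg update --gc-tag=my-app my-app.jsonnet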

Kubecfg also implements diff between local config and deployed state.

As for the amount of files: it's up to you. Kubecfg is not opinionated on how you lay out your config.

Unlike Helm, it doesn't require you to know in advance which values you want to parameterize (and thus put in a values.yaml), since it's trivial to override any value with jsonnet.

The k8s API and its data model is the only thing you have to learn.

There are some helpers available to help you structure larger configs, e.g. https://github.com/bitnami-labs/kube-libsonnet/blob/master/k... . While this lib also provides some 'macros' to help you build common entities like services and deployments, IMHO the main benefit is the mapping of "foo_: {key: val, ...}" to "foo: [val, ...]". The former is much friendlier when you have to override values with jsonnet, rather than having to depend on array ordering.
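
A minimal jsonnet sketch of that pattern (illustrative only, not kube-libsonnet's actual code; the field names are hypothetical): the hidden, map-keyed field is rendered into the array the k8s API expects, so an overlay can override a single entry by key.

  // Hidden map-keyed field (env_) rendered into the list form the k8s API wants.
  local container = {
    env_:: { LOG_LEVEL: "info" },  // friendly to key-based overrides
    env: [{ name: k, value: self.env_[k] } for k in std.objectFields(self.env_)],
  };
  // An overlay only has to touch the key it cares about, with no array-index juggling:
  container { env_+:: { LOG_LEVEL: "debug" } }

With the plain array form, the overlay would instead have to replace the whole env list or rely on its ordering.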


Awesome explanation, thanks! (And I'm really glad you like ktail!)

This sounds like a much better foundation to build on than Helm. To me, Kubecfg sounds a lot more palatable than both Helm and Ksonnet.

As an aside, Kubecfg still isn't something you (or at least I) would expect developers to work directly with. We have devs who currently just do "tool app deploy foo" (using our homegrown CLI) to deploy an app; they don't need to understand much about Kubernetes, though they understand the basics about pods, kubectl and so on. With Helm, they'd need to know to run "helm install --upgrade --values-from k8s/production.yaml ./chart" or some other extremely long command line. In short, none of the existing tools are high-level enough.

There are Heroku-like PaaS abstractions on top of Kubernetes that give you a simplified entrypoint into deployment, but I feel like what's needed isn't a whole platform, just an opinionated top layer. Kubernetes deals with discrete objects; what you want is a higher-level tool that deals with atomic groups of objects, i.e. apps.

Long story short, are there any rumblings in the community about going in this direction? The lack of such a tool, at least as far as I've found, has led me to consider maybe creating one, based on the experience we've had with our in-house CLI tooling, and perhaps using Kubecfg as a foundation. Thoughts?


An "easier" app-level experience very quickly becomes (necessarily) opinionated, because you need to anticipate which k8s parameters need to be exposed and which can be derived/assumed. Narrowing the configuration space in this way is entirely the point of "easier".

The way I've been approaching this is that you need a local "power user" who produces a simple abstraction that captures local patterns and policies, and the rest of the company then reuses that abstraction (or abstractions). Helm sort of lets you build this, but in practice it requires re-packaging helm charts with local customisations - which rapidly becomes a lot of overhead. The alternative is to expose every possible k8s option through the original helm parameters, which in turn means the helm chart becomes bewilderingly complex, and we're back to our original problem statement.

Instead, I've been advocating an "overlay" approach with jsonnet and the design of kube.libsonnet. The idea is that each consumer can import some upstream k8s manifests (described in jsonnet), apply some further jsonnet-based translations, and then publish the result as newer/simpler "templates". Someone else can then consume that, add another layer, republish, rinse, repeat. Importantly, each "republished" layer is still as easy to consume as the original. Eventually you end up with a jsonnet-based template that becomes highly opinionated and specialised to your actual problem domain, and hopefully is terse enough for local devs to use without having to learn all about k8s.

Example strawman:

  local mycompany = import "mycompany.libsonnet";
  mycompany.PhpApp {
    repo: "webgroup/guestbook",
    url: "https://mycompany.com/guestbook",
    requires_mysql: true,
  }
This might (hypothetically) turn into:

- k8s Deployment that derived the docker image from the repo name (using knowledge of local build/publish conventions) and the command from the fact it was a php app

- k8s Service to point to the Deployment

- k8s Ingress from the provided URL (and local policy), pointing to the Service

- Bring in a mysql instance via any one of several approaches (eg: new standalone instance, or configure a new user/table on a centrally-managed DB server)

None of the above would be hard to do right now using kubecfg (or other approaches), but requires at least one person who understands both local policies and kubernetes - and for them to express that knowledge in "mycompany.libsonnet".

Importantly, whatever "mycompany.PhpApp" did would be quite different to "mycompany.PeriodicSparkJob" or "someothercompany.PhpApp" - so this isn't really something the _community_ can provide, without it rapidly becoming generic again and missing the whole point of the exercise. Coming back to your question, I think this is why you won't (and will never in the general case) find already-made tools that just happen to match your particular local needs.


Those are some great points. I agree with the premise that Jsonnet and schema-based config generation opens up the possibility of actually composable, "layerable" building blocks, something Helm doesn't do at all. I also see your point about the top layer being org-specific.

That said, I was actually thinking more about the CLI itself: wrapping the underlying config generation in something that, for example, knows how to tag the config (so that --gc-tag is automatically provided if it uses Kubecfg internally) and uses git as the basis for release versioning. As I mentioned earlier, one thing our internal tool does on deploy is present you with the commits that will be deployed, derived from running "git log HEAD..<currently deployed commit>". It's a nice UX for the person doing the deploy. It just uses Kubernetes annotations for that, but it ends up being pretty powerful. Something we were also thinking about was using a CRD to record each deploy, so that you can get a history of what commit was deployed, by whom, and so on.

Another thing we do is provide a real-time progress view of the Kubernetes resources that your deploy creates/updates/deletes. This lets the operator know when the new version is live, and also alerts them if the deploy failed. Again, it's about UX.

I think I'd want to extract what we have into a general-purpose tool, and use something like Kubecfg or Ksonnet to do the actual applying of configs. But I don't hear a lot about what people are using, and looking for, in terms of deployment tools. For me, creating an in-house tool like this was an obvious thing because we just can't run kubectl or Helm from the shell to do things; it would be way too many steps even for simple apps. Is everyone writing tools like this? Or are they actually writing out full "helm install" commands?


A point on flags: yes, it would be great if the user didn't have to remember to provide --gc-tag explicitly. Taking that even further, I'd like to be able to specify the cluster in the config (likely in the last "actualization" layer). Conceptually it's like the namespace: you can currently craft configs that are parametric over a namespace and then fix that value to a deployment-specific choice. IMHO clusters should work the same way, except that currently they are "outside the config", since the choice of cluster determines which API endpoint the tool has to talk to.

In my ideal scenario my colleagues would just need to know which file to "apply". The file itself (through its name, directory location, comments, or further documentation) would guide the user to what environment it actually targets (dev, staging, production, some well-known deployment X).


Looks interesting, but why is it a library and not a command line tool?

For Kubernetes you don't really want to write a program (one that calls this library) for every single Kubernetes deploy you want to do.

Seeing as Protobuf schemas are parseable, why not just have a command-line tool that parses the schema and then combines it with the Starlark declarations to generate Kubernetes manifests?

It could be an interesting competitor to Ksonnet if done right.


Internally we use Skycfg as a library linked into other tooling, the same way you might link in a YAML parser. The ideal adoption model would see "skycfg support" as a checkmark feature of user-facing tools like kubectl or Helm.

The specific case of a sky-to-yaml utility binary is one we want to write (and have started in the `_examples/repl/` directory). Unfortunately go-protobuf doesn't support dynamic protobuf schemas yet (https://github.com/golang/protobuf/issues/199). Until it does, every message type must be linked directly into the binary.


Thanks for clearing that up. The story title and the GitHub repo together make it unclear what this is intended for.

I'd love to see a Helm-like tool built on Skycfg.

I'm really surprised that the Go Protobuf implementation doesn't support dynamic schemas, which seems like such a core feature, but looks like work is underway.


Why is this downvoted?

It doesn't have functions, but I've also released a type-checked configuration language recently[0]. The main difference is it's designed to be easy to edit, even by non-technical users.

[0] https://github.com/gilbert/zaml


Note that Skycfg isn't a language itself; we're using Google's Starlark (a deterministic Python dialect) for syntax and evaluation. This means that we get IDE support and tooling for ~free, such as Python syntax highlighting and Starlark formatting with Buildifier.

A deterministic Python dialect is indeed very cool. I've dabbled in writing a deterministic language myself.

Syntax highlighting and formatting is nice, but I think error reporting is more important for the end user.

Zaml auto-validates configuration structure, removing the need to write most boilerplate on the programmer's side.

Can you help me understand Skycfg's type safety and validation? When I see `return [pb.StringValue(value = 123)]`, does this imply the schema is written inline, alongside the values?


The schema is defined as a Protocol Buffer (https://developers.google.com/protocol-buffers/) package. This lets us construct values for protobuf-based APIs without having to manually copy over the schema.

For example, if you had this schema:

  syntax = "proto3";
  package my.example.schema;
  message Person {
    string name = 1;
    int32 id = 2;
    string email = 3;
  }
Then you could build a value using Skycfg like this:

  pb = proto.package("my.example.schema")
  msg = pb.Person(
    id = 1234,
    name = "Jane Doe",
    email = "jdoe@example.com",
  )
  print(msg)
For Kubernetes, we're using the schema at https://github.com/kubernetes/api and for Envoy we use https://github.com/envoyproxy/data-plane-api/tree/master/env...

> The main difference is it's designed to be easy to edit, even by non-technical users.

YAML is beautiful and easy; this looks like a nightmare.


Many uses of YAML are beautiful and easy, but the spec and parsers are anything but.

5 notations for things, unclear whitespace delimiters, graph/labels, etc.

https://news.ycombinator.com/item?id=17358103


This looks great. We have a similar stack (Bazel, Terraform, Kubernetes) and have struggled with configuration.

I'd love to learn more detail about how you've used Skycfg with Terraform. Do Terraform module definitions drive the proto config (something we've considered)? Or does Skycfg generate JSON or HCL? Thanks!


Our use of Skycfg with Terraform is still very basic, using Terraform's JSON syntax[0] and the "external" data source[1] for particular configurations that are difficult to express in HCL. I personally would like to have most of our Terraform config in Skycfg format for testing/analysis purposes, but that's blocked on having some sort of schema for the providers we use.

[0] https://www.terraform.io/docs/configuration/syntax.html#json...

[1] https://www.terraform.io/docs/providers/external/data_source...


Dhall and its Kubernetes support are another potential option [1]. I am not using them yet, but I'm looking to switch from Helm templates to something that isn't just text mash-up but has modules and type safety. Skycfg looks like a nice option to have: more mainstream for those used to Python or Bazel, even if it is missing some of Dhall's interesting features (such as reduction).

[1]: https://github.com/dhall-lang/dhall-kubernetes


How does this compare to the Pulumi k8s module?

Skycfg[0] is hermetic and deterministic. Evaluating a Skycfg configuration file can't execute local processes or access system properties such as the current time, so it can be treated as inert data similar to a JSON or YAML file.

This is important for Stripe because we have lots of internal security boundaries. A rich configuration language that can't break out of its sandbox is useful when a highly trusted system needs to expose a subset of its power to less trusted agents.

[0] More specifically, the Starlark language that Skycfg extends.


If you can't edit a YAML (or HCL) file, you shouldn't be using Kubernetes / Terraform. All these "you too can be an artist, just draw inside the lines of this coloring book" configuration systems (i.e. Helm) are just distracting people from doing things the right way.

Plain yaml files have far too much repetition for common use cases.

If I have 100 microservices, and I want to run all of them in a production and a staging cluster, but I want all the staging-cluster jobs to have an extra environment variable/tag and lower CPU limits, there's no way to do that without 200 YAML files, which quickly get out of sync and inconsistent.
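
As an illustration (a jsonnet-style sketch with hypothetical names, not any particular tool's API), the alternative is to define each service once and derive the staging variants, rather than maintaining a second copy of every manifest:

  // One definition per service; environments are thin overlays.
  local services = {
    api:    { image: "registry/api:v1",    cpu: "500m", env: {} },
    worker: { image: "registry/worker:v1", cpu: "1",    env: {} },
    // ...and so on for the other services
  };
  {
    production: services,
    staging: {
      [name]: services[name] + {
        cpu: "100m",                       // lower limits everywhere in staging
        env+: { ENVIRONMENT: "staging" },  // the extra variable/tag
      }
      for name in std.objectFields(services)
    },
  }

Each environment-specific difference then lives in exactly one place instead of being repeated across hundreds of files.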


That's pretty extreme. What if you have a deployment that should have a 100GB PVC in production, but only 10GB in staging? Do you duplicate the entire config, or pass around patches, just to make a single line change? What if you have different HPAs, different replica counts, different resource requests/limits, etc.?

People use tools like Helm and Ksonnet because they allow you to write general-purpose manifests that can be parameterized. It's an extremely powerful tool. Our charts are full of things like this for that reason:

    resources:
      {{ if .Values.kubernetes.resources }}
      requests:
        {{ if .Values.kubernetes.resources.requests.cpu }}cpu: {{ .Values.kubernetes.resources.requests.cpu }}{{ end }}
        {{ if .Values.kubernetes.resources.requests.memory }}memory: {{ .Values.kubernetes.resources.requests.memory }}{{ end }}
      limits:
        {{ if .Values.kubernetes.resources.limits.cpu }}cpu: {{ .Values.kubernetes.resources.limits.cpu }}{{ end }}
        {{ if .Values.kubernetes.resources.limits.memory }}memory: {{ .Values.kubernetes.resources.limits.memory }}{{ end }}
      {{ end }}
Or stuff like dynamically built ingresses:

    - host: {{ .Values.host }}
      http:
        paths:
        {{ range .Values.apps }}
        - path: {{ .path }}
          backend:
            serviceName: {{ .name }}
            servicePort: 80
        {{ end }}
Admittedly, templating sucks, and a tool like Skycfg or Jsonnet would be significantly better here, but the fundamental problem to solve is the same.

I'm a big fan of configuration generators. I think it's wise to let apps be simple and rely on static, pre-baked configs (JSON, YAML, whatever), but a deployment system needs to be able to generate dynamic ones.


Have you used the Terraform Kubernetes provider? We (Cruise) basically gave up on it. The customer-support response to issues going unfixed for ~1 year or longer was "not a priority".


