Modern CI is too complex and misdirected (gregoryszorc.com)
207 points by zdw 10 days ago | 178 comments

I think that modern CI is actually too simple. They all boil down to "get me a Linux box and run a shell script". You can do anything with that, and there are a million different ways to do everything you could possibly want. But, it's easy to implement, and every feature request can be answered with "oh, well just apt-get install foobarbaz3 and run quuxblob to do that."

A "too complex" system, would deeply integrate with every part of your application, the build system, the test runner, the dependencies would all be aware of CI and integrate. That system is what people actually want ("run these browser tests against my Go backend and Postgres database, and if they pass, send the exact binaries that passed the tests to production"), but have to cobble together with shell-scripts, third-party addons, blood, sweat, and tears.

I think we're still in the dark ages, which is where the pain comes from.

Pretty much.

Docker in my experience is the same way, people see docker as the new hotness then treat it like a Linux box with a shell script (though at least with the benefit you can shoot it in the head).

One of the other teams had an issue with reproducibility on something they were doing, so I suggested that they use a multistage build in docker and export the result out as an artefact they could deploy. They looked at me like I’d grown a second head, yet they have been using docker twice as long as me, though I’ve been using Linux for longer than all of them combined.

It’s a strange way to solve problems all around when you think about what it’s actually doing.

Also feels like people adopt tools and cobble shit together from google/SO, what happened to RTFM.

If I’m going to use a technology I haven’t before the first thing I do is go read the actual documentation - I won’t understand it all on the first pass but it gives me an “index”/outline I can use when I do run into problems, if I’m looking at adopting a technology I google “the problem with foobar” not success stories, I want to know the warts not the gloss.

It’s the same with books, I’d say 3/4 of the devs I work with don’t buy programming books, like at all.

It’s all cobbled together knowledge from blog posts, that’s fine but a cohesive book with a good editor is nearly always going to give you a better understanding than piecemeal bits from around the net, that’s not to say specific blog posts aren’t useful but the return on investment on a book is higher (for me, for the youngsters, they might do better learning from tiktok I don’t know..).

I personally love it. RTFM is pretty much the basis of my career. I always at minimum skim the documentation (the entire doc) so I have an index of where to look. It's a great jumping off point for if you do need to google anything.

Books are the same. When learning a new language for example, I get a book that covers the language itself (best practices, common design patterns, etc), not how to write a for loop. It seems to be an incredibly effective way to learn. Most importantly, it cuts down on reading the same information regurgitated over and over across multiple blogs.

lol... yeah.... I've become "the expert" on so much shit just because of RTFM :D

It's amazing how much stuff is spelt out in manuals that nobody bothers to read.

The only issue is that so few people RTFM that some manuals are pure garbage if you try to glean anything useful from them. In those cases, the best route is often to just read the implementation (though that is tedious).

I have someone in my network, who is very active in PHP scene. Tutorials, tips&tricks, code reviews, you name it. He pretty much abandoned his very popular website and went all in on YouTube. Why? Apparently watching video is so much easier than reading 3000 word article.

>He pretty much abandoned his very popular website and went all in on YouTube. Why? Apparently watching video is so much easier than reading 3000 word article.

Learning from videos is lazy? Content you've learned is only valid if you read it? I'm not sure what exactly you're getting at, but I'm a visual learner and I much prefer (well made) videos over text. There's a visual and audio aspect to it enabling so much more information to be conveyed at once. For example, Dr. Sedgewick's video series on algorithms and data structures has animations as he steps through how algorithms work. He's overlaying his audio while displaying code and animating the "data" as it's processed by the algorithm. I have a physical copy of his book Algorithms but I go back to his videos when I need a refresher. https://www.youtube.com/watch?v=Wme8SDUaBx8

I do have a prejudice against people who tell me they learn something through a video. There are good video creators and good educational videos, but there’s just too much trash.

Besides that, I default to text content because of the monetary incentives on youtube, it’s absurd that a 10 min read, can be a more than half hour video that never gets to the point.

>I do have a prejudice against people who tell me they learn something through a video. There are good video creators and good educational videos, but there’s just too much trash.

Bad content has absolutely nothing to do with the format. What a ridiculous statement. As if there aren't thousands of even worse articles, blog posts and incomplete/outdated documentation for every bad video. Does W3 schools ring a bell?

>it’s absurd that a 10 min read, can be a more than half hour video that never gets to the point

It's absurd you're blaming bad content on the format, which is also ironic because you missed where I explicitly wrote in text that I prefer well made videos. Do you expect every single piece of text you come across to be a concise and up to date source of truth?

It depends on what you're learning. Try to learn to chop veggies and a video is superior to text. Try to learn all the inputs to a function, return values and thrown errors and a video is far inferior to text. The crossover point probably varies from person to person, and my guess is snobbery about videos is based around the personal differences in how a person learns and where what they already know or what they learn is on that spectrum.

There's also a wide (wider than text?) difference in video quality. How many videos out there are badly narrated sources available in text format?

There are pros and cons of each medium, I'm not arguing against it. What I mean is that there's a general tendency of migration from longer versions (books) to more compact, bite size content (videos). It's one of the reasons why there's countless videos on YouTube on how to create a dictionary in Python, even though the documentation has it covered wide and deep.

I agree that something like a Python dictionary is probably unnecessary to make a standalone video for, but simple concepts like that also wouldn't be a 3000 word article unless you're looking for literally every nuance and weird quirk the language has to offer.

like an explanation of what 'this' is in JS:)


Haha, I'm genuinely intrigued to see the contents of that book. It could certainly be copy/paste from the Mozilla docs, but who knows! I was pleasantly surprised by this video about the JS event loop, although in fairness it is a lecture at a conference rather than a made for youtube video. Regardless, the length of the video made me question how necessary it was but I ended up watching the whole thing and enjoyed it. https://www.youtube.com/watch?v=8aGhZQkoFbQ

>Also feels like people adopt tools and cobble shit together from google/SO, what happened to RTFM.

Sometimes it's easier to google because TFM is written by people who are intricately familiar with the tool and forget what it's like to be unfamiliar with it.

Look at git for example; the docs are atrocious. Here's the first line under DESCRIPTION for "man git-push":

>Updates remote refs using local refs, while sending objects necessary to complete the given refs.

Not a single bit of explanation as to what the fuck a "ref" is, much less the difference between remote and local refs. If you didn't have someone to explain it to you, this man page would be 100% useless.

I think that reading the man page for git-push is not reasonable if you don't understand git first.

That being the case, the first thing you need to read is the main git man page. At the bottom of it (sadly) you find references to gitrevisions and gitglossary man pages. Those should provide enough information and examples to understand what a ref is, yet probably even these could be better.

I'm in full agreement that this is terribly undiscoverable, but if you really want to RTFM, you mustn't stop at just the first page.

> treat it like a Linux box with a shell script (though at least with the benefit you can shoot it in the head)

To be fair, that by itself is a game-changer, even if it doesn't take full advantage of Docker.

Any CI books you can recommend? I have been completely treating it as a linux box with a shell script and have cobbled together all of my shit from google/SO.

> Docker in my experience is the same way, people see docker as the new hotness then treat it like a Linux box with a shell script (though at least with the benefit you can shoot it in the head).

"Adapting old programs to fit new machines usually means adapting new machines to behave like old ones." —Alan Perlis, Epigrams in Programming (1982)

Every programming book I've ever bought has been way outdated to the point of feeling kind of useless.

I mean yeah there's a lot of algorithms and other fundamental concepts you can learn from books, but specific tooling? I'm not sure.

>Also feels like people adopt tools and cobble shit together from google/SO, what happened to RTFM.

I find that documentation has gotten worse and harder to find. It used to be trivial to find and long to read. Now, between SO and SEO of pages designed to bring people in, it's 95% examples not enough "here's all the data about how x works". It's very easy to get a thing going, but very hard to understand all the things that can be done.

I think there is a difference between taking a technology, framing it in your own mindset and drawing parallels, and diving into a presented story full of unicorns. Unfortunately the second scenario is what the business often wants: the technology is mostly the same, the story is specific to the company.

I am very similar. I read through the documentation the first time to get a ‘lay of the land’, so I can deep-dive into the various sections as I require.

This is something that I love about using Bazel. It allows you to do this. Bazel is aware of application-level concepts: libraries, binaries, and everything that glues this together. It has a simple way to describe a "test" (something that's run whose exit code determines pass/fail) and how to link/build infinitely complex programs. Do you build a game engine and need to take huge multi-gb assets folders and compile them into an efficient format for your game to ship with? You can use a genrule to represent this and now you, your CI, and everyone on your team will always have up-to-date copies of this without needing to worry about "Bob, did you run the repack script again?"

It also provides a very simple contract to your CI runners. Everything has a "target" which is a name that identifies it.
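To make the "exit code determines pass/fail" contract concrete, here is a minimal Python sketch. It is not how Bazel implements tests; `run_test` and the two throwaway commands are hypothetical stand-ins that only show the idea that the runner needs nothing from a test except its exit status.

```python
import subprocess
import sys


def run_test(command):
    """Run a command and report pass/fail purely from its exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0


# Two hypothetical "tests": one exits with 0, the other with 1.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

results = {"passing": run_test(passing), "failing": run_test(failing)}
print(results)
```

Anything that can be launched as a process fits this contract, which is part of why it stays so simple for the CI runner.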

A great talk about some things that are possible: https://youtu.be/muvU1DYrY0w?t=459

At a previous company I got our entire build/test CI (without code coverage) from ~15 minutes to ~30 to ~60 seconds for ~40 _binary and ~50 _tests (~100 to ~500 unit tests).

I agree with you in principle, but I have learned to accept that this only works for 80% of the functionality. Maybe this works for a simple Diablo or NodeJS project, but in any large production system there is a gray area of “messy shit” you need, and a CI system being able to cater to these problems is a good thing.

Dockerizing things is a step in the right direction, at least from the perspective of reproducibility, but what if you are targeting many different OS’es / architectures? At QuasarDB we target Windows, Linux, FreeBSD, OSX and all that on ARM architecture as well. Then we need to be able to set up and tear down whole clusters of instances, reproduce certain scenarios, and whatnot.

You can make this stuff easier by writing a lot of supporting code to manage this, including shell scripts, but to make it an integrated part of CI? I think not.

I'm curious what's a Diablo Project ? I've never heard of such technology unless you're speaking of the game with the same name.

Did you possibly mean Django ?

Argh it was indeed Django, I was on mobile and it must have been autocorrected.

While it comes up, I think it's more of a rare problem. So much stuff is "x86 linux" or in rare cases "ARM linux" that it doesn't often make sense to have a cross platform CI system.

Obviously a db is a counter example. So is node or a compiler.

But at least from my experience, a huge number of apps are simply REST/CRUD targeting a homogeneous architecture.

Unless we're talking proprietary software deployed to only one environment, or something really trivial, it's still totally worth testing other environments / architectures.

You'll find dependency compilation issues, path case issues, reserved name usage, assumptions about filesystem layout, etc. which break the code outside of Linux x86.

You're vehemently agreeing with the author, as far as I can see. The example you described is exactly what you could do/automate with the "10 years skip ahead" part at the end. You can already do it today locally with Bazel if you're lucky to have all your dependencies usable there.

I dunno... it seems important to me that CI is as dumb as it can be, because that way it can do anything. If you want that test mechanism you just described, that shouldn't be a property of the CI service, that should be part of the test framework--or testing functionality of the app framework--you use. The tragedy of this super thick service is that CI services suddenly start using your thick integration into your entire development stack as lock-in: right now they are almost fully swappable commodity products... the way it should be.

Hard agree. I've been using gitlab CI/CD for a long time now. I almost want to say it's been around longer or as long as docker?

It has a weird duality of running as docker images, but also really doesn't understand how to use container images IN the process. Why don't volumes just map 1:1 to artifacts and caching? Why isn't it always caching image layers to make things super fast, etc.?

"I almost want to say it's been around longer or as long as docker?"

I had to look it up but GitLab CI has been around longer than Docker! Docker was released as open-source in March 2013. GitLab CI was first released in 2012.

I don't know if I'd say they're too simple. I think they're too simple in some ways and too complex in others. For me, I think a ton of unnecessary complexity comes from isolating per build step rather than per pipeline, especially when you're trying to build containers.

Compare a GitLab CI build with Gradle. In Gradle, you declare inputs and outputs for each task (step) and they chain together seamlessly. You can write a task that has a very specific role, and you don't find yourself fighting the build system to deal with the inputs / outputs you need. For containers, an image is the output of `docker build` and the input for `docker tag`, etc.. Replicating this should be the absolute minimum for a CI system to be considered usable IMO.

If you want a more concrete example, look at building a Docker container on your local machine vs a CI system. If you do it on your local machine using the Docker daemon, you'll do something like this:

- docker build (creates image as output)

- docker tag (uses image as input)

- docker push (uses image/tag as input)

What do you get when you try to put that into modern CI?

- build-tag-push

Everything gets dumped into a single step because the build systems are (IMO) designed wrong, at least for anyone that wants to build containers. They should be isolated, or at least give you the option to be isolated, per pipeline, not per build step.
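A rough Python sketch of the "declare inputs and outputs per step and let them chain" idea described above. This is not Gradle's or any CI product's API; `Task`, `run_tasks` and the fake build/tag/push actions are made up for illustration, and the actions only pass strings around instead of calling Docker.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """A pipeline step that declares what it consumes and what it produces."""
    name: str
    inputs: List[str]
    outputs: List[str]
    action: Callable[[Dict[str, str]], Dict[str, str]]


def run_tasks(tasks, artifacts=None):
    """Run tasks in dependency order, wiring declared outputs to declared inputs."""
    artifacts = dict(artifacts or {})
    pending = list(tasks)
    order = []
    while pending:
        # A task is ready once every input it declares has been produced.
        ready = [t for t in pending if all(i in artifacts for i in t.inputs)]
        if not ready:
            names = ", ".join(t.name for t in pending)
            raise RuntimeError("unsatisfied inputs for: " + names)
        for task in ready:
            produced = task.action({i: artifacts[i] for i in task.inputs})
            if set(produced) != set(task.outputs):
                raise RuntimeError(f"{task.name} did not produce its declared outputs")
            artifacts.update(produced)
            order.append(task.name)
            pending.remove(task)
    return order, artifacts


# Stand-ins for `docker build`, `docker tag` and `docker push` (hypothetical values).
build = Task("build", [], ["image"], lambda _: {"image": "sha256:demo"})
tag = Task("tag", ["image"], ["tagged"],
           lambda a: {"tagged": "registry.example/app:1.0@" + a["image"]})
push = Task("push", ["tagged"], ["pushed"], lambda a: {"pushed": a["tagged"]})

# Listed out of order on purpose: the declared inputs/outputs decide the order.
order, artifacts = run_tasks([push, tag, build])
print(order)
```

The point of the sketch is that each step stays separate and small, yet the image flows from one to the next without everything being crammed into a single build-tag-push step.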

For building containers it's much easier, at least for me, to work with the concept of having a dedicated Docker daemon for an entire pipeline. Drone is flexible enough to mock something like that out. I did it a while back [1] and really, really liked it compared to anything else I've seen.

The biggest appeal was that it allows much better local iteration. I had the option of:

- Use `docker build` like normal for quick iteration when updating a Dockerfile. This takes advantage of all local caching and is very simple to get started with.

- Use `drone exec --env .drone-local.env ...` to run the whole Drone pipeline, but bound (proxied actually) to the local Docker daemon. This also takes advantage of local Docker caches and is very quick while being a good approximation of the build server.

- Use `drone exec` to run the whole Drone pipeline locally, but using docker-in-docker. This is slower and has no caching, but is virtually identical to the build that will run on the CI runner.

That's not an officially supported method of building containers, so don't use it, but I like it more than trying to jam build-tag-push into a single step. Plus I don't have to push a bunch of broken Dockerfile changes to the CI runner as I'm developing / debugging.

I guess the biggest thing that shocks me with modern CI is people's willingness to push/pull images to/from registries during the build process. You can literally wait 5 minutes for a build that would take 15 seconds locally. It's crazy.

1. https://discourse.drone.io/t/use-buildx-for-native-docker-bu...

It's weird that people keep building DSLs or YAML based languages for build systems. It's not a new thing, either - I remember using whoops-we-made-it-turing complete ANT XML many years ago.

Build systems inevitably evolve into something turing complete. It makes much more sense to implement build functionality as a library or set of libraries and piggyback off a well designed scripting language.

> Build systems inevitably evolve into something turing complete.

CI systems are also generally distributed. You want to build and test on all target environments before landing a change or cutting a release!

What Turing complete language cleanly models some bits of code running on one environment and then transitions to other code running on an entirely different environment?

Folks tend to go declarative to force environment-portable configuration. Arguably that's impossible and/or inflexible, but the pain that drives them there is real.

If there is a framework or library in a popular scripting language that does this well, I haven't seen it yet. A lot of the hate for Jenkinsfile (allegedly a groovy-based framework!) is fallout from not abstracting the heterogeneous environment problem.

>What Turing complete language cleanly models some bits of code running on one environment and then transitions to other code running on an entirely different environment?

Any language that runs in both environments with an environment abstraction that spans both?

>Folks tend to go declarative to force environment-portable configuration.

Declarative is always better if you can get away with it. However, it inevitably hamstrings what you can do. In most declarative build systems some dirty turing complete hack will inevitably need to be shoehorned in to get the system to do what it's supposed to. A lot of build systems have tried to pretend that this won't happen but it always does eventually once a project grows complex enough.

> Any language that runs in both environments with an environment abstraction that spans both?

Do you have examples? This is harder to do than it would seem.

You would need an on demand environment setup (a virtualenv and a lockfile?) or a homogeneous environment and some sort of RPC mechanism (transmit a jar and execute). I expect either to be possible, though I expect the required verbosity and rigor to impede significant adoption.

Basically, I think folks are unrealistic about the ability to be pithy, readable, and robust at the same time.

>Do you have examples? This is harder to do than it would seem.

Examples of cross platform code? There are millions.

>You would need an on demand environment setup (a virtualenv and a lockfile?) or a homogeneous environment and some sort of RPC mechanism (transmit a jar and execute). I expect either to be possible, though I expect the required verbosity and rigor to impede significant adoption.

Why need it be verbose? High level requirements, a lock file and one or two code files ought to be sufficient for most purposes.

We're not talking about a program that could run on any given platform (cross platform). We're talking about one program that is distributed across several machines in one workflow. That's a form of distributed (usually) heterogeneous computation. And typically with a somewhat dynamic execution plan. Mix in dynamic discovery of (or configuration of) executing machines and you have a lot of complexity to manage.

This is why I wanted to see some specific examples. I haven't seen much success in this space that is general purpose. The closest I have seen is "each step in the workflow is a black box implemented by a container", which is often pretty good, though it isn't a procedural script written in a well known language. And it does make assumptions about environment (i.e. usually Linux).

I call this the fallacy of apparent simplicity. People think what they need to do is simple. They start cobbling together what they think will be a simple solution to a simple problem. They keep realizing they need more functionality, so they keep adding to their solution, until just "configuring" something requires an AI.

Scripting languages aren't used directly because people want a declarative format with runtime expansion and pattern matching. We still don't have a great language for that. We just end up embedding snippets in some data format.

Who are the "people" who really want that, are responsible for a CI build, and are not able to use a full programming language ?

I used jenkins pipeline for a while, with groovy scripts. I wish it had been a type checked language to avoid failing a build after 5minutes because of a typo, but, it was working.

Then, somehow, the powers that be decided we had to rewrite everything in a declarative pipeline. I still fail to see the improvement ; but doing "build X, build Y, then if Z build W" is now hard to do.
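For comparison, this is roughly what "build X, build Y, then if Z build W" looks like in a general-purpose language. It is a Python sketch rather than Groovy or a real Jenkins pipeline; `pipeline`, `fake_build` and the `z_enabled` flag are hypothetical names, and the build step is a stub that only records what it was asked to build.

```python
def pipeline(build, z_enabled):
    """Build X and Y unconditionally, then W only when condition Z holds."""
    built = []
    for target in ("X", "Y"):
        build(target)
        built.append(target)
    if z_enabled:
        build("W")
        built.append("W")
    return built


log = []


def fake_build(target):
    # Stand-in for a real build step; it only records the request.
    log.append("building " + target)


with_z = pipeline(fake_build, z_enabled=True)
without_z = pipeline(fake_build, z_enabled=False)
print(with_z, without_z)
```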

People used to hate on Gradle a lot, but it was way better than dealing with YAML IMO. Add in the ability to write build scripts in Kotlin and it was looking pretty good before I started doing less Java.

I think a CI system using JSON configured via TypeScript would be neat to see. Basically the same thing as Gradle via Kotlin, but for a modern container (ie: Docker) based CI system.

I can still go back to Gradle builds I wrote 7-8 years ago, check them out, run them, understand them, etc.. That's a good build system IMO. The only thing it could have done better was pull down an appropriate JDK, but I think that was more down to licensing / legal issues than technical and I bet they could do it today since the Intellij IDEs do that now.

Uhh, I don't know. All the groovy knobs on Jenkins (especially the cloudbees enterprise one) and nexus enabled a ridiculous amount of customisation which while it made me a load of consultancy money, I think taught me the lesson that most of the time it's better to adapt your apps to your CI and infra, than to try and adapt your CI and infra to your apps.

I much prefer GitLab + k8s to the nightmare of groovy I provided over the last decade anyway..

It's funny. If you stick around this business long enough you see the same cycles repeated over and over again. When I started in software engineering, builds were done with maven and specified using an XML config. If you had to do anything non-declarative you had to write a plugin (or write a shell script which called separate tasks and had some sort of procedural logic based on the outputs). Then it was gradle (or SBT for me when I started coding mostly in scala) which you could kind of use in a declarative way for simple stuff but also allowed you to just write code for anything custom you needed to do. And one level up you went from Jenkins jobs configured through the UI to Jenkinsfiles. Now I feel like I've come full circle with various GitOps based tools. The build pipeline is now declarative again and for any sort of dynamic behavior you need to either hack something or write a plugin of some sort for the build system which you can invoke in a declarative configuration.

It's so true. I used Ant > Maven > Gradle. The thing that I think is different about modern CI is there's no good, standard way of adding functionality. So it's almost never write a plugin and always hack something together. And none of it's portable between build systems which are (almost) all SaaS, so it's like getting the absolute worst of everything.

I'll be absolutely shocked if current CI builds still work in 10 years.

This is kind of why I like keeping them as dumb as possible. Let each of your repos contain a ./build ./test ./run and the ci does stuff based on those assumptions...

You're switching from rpms->k8s? Actually nothing has to change per repo for this.

Also it creates a nice standard that is easily enforced by your deployment pipelines: no ./run? Then it's undeployable. kthxbye etc..

This becomes important when you have >50 services.
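A small Python sketch of how a deployment pipeline might enforce that convention. The `build`/`test`/`run` names come from the comment above; `missing_entrypoints`, `is_deployable` and the temporary demo repo are hypothetical, and the executable-bit check assumes a POSIX-like filesystem.

```python
import os
import tempfile

REQUIRED = ("build", "test", "run")


def missing_entrypoints(repo_dir):
    """Return the required entrypoints that are absent or not executable."""
    missing = []
    for name in REQUIRED:
        path = os.path.join(repo_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing


def is_deployable(repo_dir):
    """A repo is deployable only if all three entrypoints exist."""
    return not missing_entrypoints(repo_dir)


# Demo: a throwaway repo that has ./build and ./test but no ./run.
with tempfile.TemporaryDirectory() as repo:
    for name in ("build", "test"):
        path = os.path.join(repo, name)
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\nexit 0\n")
        os.chmod(path, 0o755)
    demo_missing = missing_entrypoints(repo)
    demo_deployable = is_deployable(repo)

print(demo_missing, demo_deployable)
```

The CI side never needs to know what language or packaging a service uses, only whether the three entrypoints are there.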

Haha. I'd be surprised if they work NEXT year.

That's sweet. It's thursday morning here. They probably already don't work any more.

Is maven old now....oh uh...gotta get with the cool kids

I was waiting for jai to see how the build scripts are basically written in... Jai Itself.

It seems that zig [1] already does it. Hoping to try that someday...

[1] https://ziglearn.org/chapter-3/

You can activate typechecking in groovy with @CompileStatic. It's an all or nothing thing though (for the entire file).

Joe Beda (k8s/Heptio) made this same point in one of his TGI Kubernetes videos: https://youtu.be/M_rxPPLG8pU?t=2936

I agree 100%. Every time I see "nindent" in yaml code, a part of my soul turns to dust.

> Every time I see "nindent" in yaml code, a part of my soul turns to dust.

Yup. For this reason it's a real shame to me that Helm won and became the lingua franca of composable/configurable k8s manifests.

The one benefit of writing in static YAML instead of dynamic <insert-DSL / language>, is that regardless of primary programming language, everyone can contribute; more complex systems like KSonnet start exploding in first-use complexity.

I wouldn't say helm has won, honestly. The kubectl tool integrated Kustomize into it and it's sadly way too underutilized. I think it's just that the first wave of k8s tutorials that everyone has learned from were all written when helm was popular. But now with some years of real use people are less keen on helm. There are tons of other good options for config management and templating--I expect to see it keep changing and improving.

Are there good kustomize bases for things like Redis, Postgres, Mysql, etc? My impression is that most "cloud-native" projects ship raw manifests and Helm charts, or just helm charts. By "won" I just mean in terms of community mind-share, not that they built the best thing. But I could be out of date there.

I do like kustomize, but the programming model is pretty alien, and they only recently added Components to let you template a commonly-used block (say, you have a complex Container spec that you want to stamp onto a number of different Deployments).

Plus last I looked, kustomize was awkward when you actually do need dynamic templating, e.g. "put a dynamic annotation onto this Service to specify the DNS name for this review app". Ended up having to use another program to do templating which felt awkward. Maybe mutators have come along enough since I last looked into this though.

Some of us use Make + envsubst[0] (and more recently make + kustomize[1]) in defiance.

I haven't found time to take a look at Helm 3 yet though, it might be worth switching to.

[0]: https://www.vadosware.io/post/using-makefiles-and-envsubst-a...

[1]: https://www.vadosware.io/post/setting-up-mailtrain-on-k8s/#s...

Can just default to something like

  apiVersion: v1
  appName: "blah"
or something.

I wish more people who for some reason are otherwise forced to use a textual templating system to output YAML would remember that every json object is a valid yaml value, so instead of fiddling with indent you just ".toJson" or "| json" or whatever your syntax is and you'll get something less brittle.

(Or use a structural templating system like jsonnet or ytt)
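A quick Python sketch of the JSON-inside-YAML trick, using plain string formatting in place of a real templating engine. `render_manifest`, the Service fields and the annotation keys are hypothetical; the only thing relied on is that `json.dumps` output can sit on a single line as a YAML value, so no indentation helper is needed.

```python
import json


def render_manifest(name, annotations):
    """Textually template a manifest, inserting structured values as JSON."""
    # json.dumps keeps the value on one line and escapes quotes and newlines,
    # so the surrounding YAML indentation cannot be broken by the data.
    return (
        "apiVersion: v1\n"
        "kind: Service\n"
        "metadata:\n"
        f"  name: {json.dumps(name)}\n"
        f"  annotations: {json.dumps(annotations)}\n"
    )


manifest = render_manifest(
    "review-app",
    {"dns-name": "pr-123.example.test", "note": "line one\nline two"},
)
print(manifest)
```

Even the multi-line note stays on one line of the output, which is exactly the case where indent-based templating tends to fall over.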

Things like rake always made more sense to me - have your build process defined in a real programming language that you were actually using for your real product.

Then again, grunt/gulp was a horrible, horrible shitshow, so it's not a silver bullet either...

The way I would categorize build systems (and by extension, a lot of CI systems) is semi-declarative. That is to say, we can describe the steps needed to build as a declarative list of source files, the binaries they end up in, along with some special overrides (maybe this one file needs special compiler flags) and custom actions (including the need to generate files). To some degree, it's recursive: we need to build the tool to build the generated files we need to compile for the project. In essence, the build system boils down to computing some superset of Clang's compilation database format. However, the steps needed to produce this declarative list are effectively a Turing-complete combination of the machine's environment, user's requested configuration, package maintainers' whims, current position of Jupiter and Saturn in the sky, etc.

Now what makes this incredibly complex is that the configuration step itself is semi-declarative. I may be able to reduce the configuration to "I need these dependencies", but the list of dependencies may be platform-dependent (again with recursion!). Given that configuration is intertwined with the build system, it makes some amount of sense to combine the two concepts into one system, but they are two distinct steps and separating those steps is probably saner.

To me, it makes the most sense to have the core of the build system be an existing scripting language in a pure environment that computes the build database: the only accessible input is the result of the configuration step, no ability to run other programs or read files during this process, but the full control flow of the scripting language is available (Mozilla's take uses Python, which isn't a bad choice here). Instead, the arbitrary shell execution is shoved into the actual build actions and the configuration process (but don't actually use shell scripts here, just equivalent in power to shell scripts). Also, make the computed build database accessible both for other tools (like compilation-database.json is) and for build actions to use in their implementations.
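As a sketch of that "pure function from configuration to build database" idea, here is a toy version in Python. It is not Mozilla's build system or any real tool; `compute_build_db` and the config keys are invented for illustration. The entry fields (`directory`, `file`, `output`, `arguments`) follow Clang's JSON compilation database format, and the function deliberately reads no files and runs no programs.

```python
import json


def compute_build_db(config):
    """Pure step: configuration in, list of compile commands out."""
    flags = ["-O2"] if config["optimize"] else ["-O0", "-g"]
    sources = list(config["sources"])
    # Full control flow is available, e.g. platform-dependent source lists.
    if config["platform"] == "windows":
        sources += config.get("windows_sources", [])
    entries = []
    for src in sources:
        obj = src.rsplit(".", 1)[0] + ".o"
        # Per-file overrides: "maybe this one file needs special compiler flags".
        extra = config.get("per_file_flags", {}).get(src, [])
        entries.append({
            "directory": config["build_dir"],
            "file": src,
            "output": obj,
            "arguments": [config["compiler"], *flags, *extra, "-c", src, "-o", obj],
        })
    return entries


# Hypothetical result of the configuration step.
config = {
    "compiler": "clang",
    "build_dir": "/tmp/build",
    "platform": "linux",
    "optimize": True,
    "sources": ["main.c", "util.c"],
    "windows_sources": ["win32.c"],
    "per_file_flags": {"util.c": ["-fno-strict-aliasing"]},
}
build_db = compute_build_db(config)
print(json.dumps(build_db, indent=2))
```

Because the output is plain data, other tools and the build actions themselves can consume it without re-running any configuration logic.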

I think what you are getting at is a "staged execution model", and I agree.

GNU make actually has this, but it's done poorly. It has build STEPS in the shell language, but the build GRAPH can be done in the Make language, or even Guile scheme. [1]


I hope to add the "missing declarative part" to shell with https://www.oilshell.org.

So the build GRAPH should be described as you say. It's declarative, but you need metaprogramming. You can think of it like generating a Ninja file, but using reflection/metaprogramming rather than textual code generation.
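
A rough sketch of "graph as data", written in Python only for illustration. The rule names, file names and the `ninja_text` helper are hypothetical. The graph is built with ordinary list comprehensions, and rendering it to Ninja syntax is just one possible backend applied at the very end.

```python
def ninja_text(rules, edges):
    """Render a build graph held as plain data into Ninja syntax."""
    lines = []
    for name, command in rules.items():
        lines += [f"rule {name}", f"  command = {command}", ""]
    for outputs, rule, inputs in edges:
        lines.append(f"build {' '.join(outputs)}: {rule} {' '.join(inputs)}")
    return "\n".join(lines) + "\n"


# Hypothetical rules and sources.
rules = {
    "cc": "cc -c $in -o $out",
    "link": "cc $in -o $out",
}
sources = ["main.c", "util.c"]

# The graph itself is computed, not typed out by hand.
objects = [s.replace(".c", ".o") for s in sources]
edges = [([o], "cc", [s]) for s, o in zip(sources, objects)]
edges.append((["app"], "link", objects))

if __name__ == "__main__":
    print(ninja_text(rules, edges), end="")
```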

And then the build STEPS are literally shell. Shell is a lot better than Python for this use case! e.g. for invoking compilers and other tools.

I hinted at this a bit in a previous thread: https://news.ycombinator.com/item?id=25343716

And this current comment https://lobste.rs/s/k0qhfw/modern_ci_is_too_complex_misdirec...

Comments welcome!

[1] aside: Tensorflow has the same staged execution model. The (serial) Python language is used for metaprogramming the graph, while the highly parallel graph language is called "Tensorflow".

> Build systems inevitably evolve into something turing complete. It makes much more sense to implement build functionality as a library or set of libraries and piggyback off a well designed scripting language.

This is so true. That's why I hate and love Jenkins at the same time.

What is a well defined "scripting" language? Lua, Python, Ruby?

I do agree it'd be nice with a more general purpose language and a lib like you say, but should this lib be implemented in rust/c so that people can easily integrate it into their own language?

Many unknowns but great idea.

Literally any real language would be better. Even if I have to learn a bit of it to write pipelines, at least I'll end up with some transferable knowledge as a result.

In comparison, if I learned Github Actions syntax, the only thing I know is... Github actions syntax. Useless and isolated knowledge, which doesn't even transfer to other YAML-based systems because each has its own quirks.

On the theme of the posted article, “Literally any real language would be better” is how I feel every time I write cmake/autotools/etc.

I'd say it's not about the capabilities of the language, but the scope of the environment. You need a language to orchestrate your builds and tests (which usually means command execution, variable interpolation, conditional statements and looping constructs), and you need a language to interact with your build system (fetching code, storing and fetching build artifacts, metadata administration).

Lua would be a good candidate for the latter, but its standard library is minimal on purpose, and that means a lot of the functionality would have to be provided by the build system. Interaction with the shell from Python is needlessly cumbersome (especially capturing stdout/stderr), so of those options my preference would be Ruby. Heck, even standard shell with a system-specific binary to call back to the build system would work.
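
For reference, a sketch of what capturing stdout/stderr looks like with the standard library's `subprocess.run` (the `capture_output` argument exists since Python 3.7). The `run` wrapper is a made-up helper name. Whether this counts as cumbersome next to Ruby's backticks is a matter of taste.

```python
import subprocess
import sys


def run(argv):
    """Run a command and return (exit code, stdout, stderr) as text."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # The child process is just the Python interpreter, so the example is portable.
    child = "import sys; print('built'); print('warn', file=sys.stderr)"
    code, out, err = run([sys.executable, "-c", child])
    print(code, out.strip(), err.strip())
```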

People hate on it, but do you know what language would be perfect these days?

Easy shelling - check.

Easily embeddable - check

Easily sandboxable - check.

Reasonably rich standard library - check.

High level abstractions - check.

If you're still guessing what language it is, it's Tcl. Good old Tcl.

It's just that its syntax is moderately weird and the documentation available for it is so ancient and creaky that you can sometimes see mummies through the cracks.

Tcl would pretty much solve all these pipeline issues, but it's not a cool language.

I really wish someone with a ton of money and backing would create "TypeTcl" on top of Tcl (à la Typescript and Javascript) and market it to hell and back, create brand new documentation for it, etc.

> I really wish someone with a ton of money and backing would create "TypeTcl"

They did. Larry McVoy of BitMover created Little, a typed extension of Tcl [0]. Didn't do the marketing bit, though.

[0] https://wiki.tcl-lang.org/page/Little

I'd rather see a Lua successor reach widespread adoption. Wren pretty much solved all my gripes with Lua but there is no actively maintained Java implementation.

How is the Windows support? One of my big needs for any general-purpose build system is that I can get a single build that works on both Windows and POSIX. Without using WSL.

That said, you're right, at least at first blush, tcl is an attractive, if easy to forget, option.

Windows support? Great as far as I can tell. ActiveTcl, the Windows distribution, has been a thing for several decades now. I remember using it back in 2008.

The IDE/editor that ships with Python for Windows is built on Tkinter, which is a wrapper for Tcl/Tk. So it seems to work fine.

Tcl. We already have that language and it's been around for decades, but it's not a cool language. Its community is also ancient, and it feels like it.

Kotlin. It has good support for defining DSLs and can actually type check your pipeline.

They probably mean an interpreted language or at least something distributed as text. But honestly, you could come up with some JIT scheme for any language, most likely.

The problem with arbitrary operations is that they are not composable. I can’t depend on two different libraries if they do conflicting things during the build process. And I can’t have tooling to identify these conflicts without solving the halting problem.
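
One way to read this: if each library declared its build outputs as data instead of running arbitrary code, spotting a conflict would be a dictionary lookup rather than a halting-problem question. A small sketch under that assumption; the library names and paths are invented.

```python
from collections import defaultdict


def find_conflicts(declared_outputs):
    """Map each output path claimed by more than one library to its claimants."""
    owners = defaultdict(list)
    for library, outputs in declared_outputs.items():
        for path in outputs:
            owners[path].append(library)
    return {path: libs for path, libs in owners.items() if len(libs) > 1}


if __name__ == "__main__":
    # Hypothetical declarations from two dependencies.
    declared = {
        "libfoo": ["gen/config.h", "foo.o"],
        "libbar": ["gen/config.h", "bar.o"],
    }
    print(find_conflicts(declared))
```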

Pulumi and CDK come to mind, they look very interesting compared to yaml/dsl approaches

Fully agreed -- Pulumi was in this space correctly and right out of the gate while CDK is a relative newcomer (both in general and inside the walled garden of AWS). AWS has contributed to the bloodshed with CloudFormation for a long time.

Also, don't forget that CDK for Terraform now exists[0] as well.

[0]: https://www.hashicorp.com/blog/cdk-for-terraform-enabling-py...

That's kind of what Bazel does. Skylark is a Python dialect.

Why is an entirely new dialect necessary? Why couldn't it just have been a python library?

It started as that at Google and was a nightmare in the long run. People would sneak in dependencies on non-hermetic or non-reproducible behavior all the time. The classic "this just needs to work, screw it I'm pulling in this library to make it happen" problem. It just kept getting more and more complex to detect and stop those kinds of issues. Hence a new language with no ability to skirt around its hermetic and non-turing nature.

Mozilla uses Python, but executes the code in a very limited environment so you can't import anything not on an allowlist, can't write to files, etc. But it's just the regular Python interpreter executing stuff. It produces a set of data structures describing everything that are then used by the unrestricted calling Python code.

It seems to work pretty well, though it feels a little constraining when I'm trying to figure something out and I can't do the standard `import pdb; pdb.set_trace()` thing. There's probably a way around that, but I've never bothered to figure it out.
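
To make the idea concrete, here is a toy version of "regular interpreter, restricted environment". This is not Mozilla's code: the allowlist, the `SOURCES`/`DEFINES` names and the helper functions are assumptions for illustration, and swapping out `__builtins__` like this is a guard rail against accidents, not a security sandbox.

```python
import builtins

ALLOWED_MODULES = {"os.path", "re"}  # hypothetical allowlist


def _guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Stand-in for __import__ that refuses anything off the allowlist."""
    if name not in ALLOWED_MODULES:
        raise ImportError(f"import of {name!r} is not allowed in build files")
    return builtins.__import__(name, globals, locals, fromlist, level)


# Only these names are visible as builtins; open() and friends are absent.
SAFE_BUILTINS = {
    "__import__": _guarded_import,
    "len": len,
    "sorted": sorted,
    "range": range,
}


def evaluate_build_file(source):
    """Execute build-file source and return the data structures it filled in."""
    declared = {"SOURCES": [], "DEFINES": {}}
    env = {"__builtins__": SAFE_BUILTINS, **declared}
    exec(source, env)
    return {key: env[key] for key in declared}


if __name__ == "__main__":
    print(evaluate_build_file("SOURCES += ['a.c', 'b.c']\nDEFINES['N'] = len(SOURCES)"))
```

The unrestricted caller gets back plain lists and dicts, which matches the "produces a set of data structures" description above.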

Starlark is an extremely limited dialect of Python. It is intentionally not Turing complete, to prevent unbounded computation.

I think the rationale for making it syntactically similar to Python was: we want to add a macro processing language to our build system, and we want to support common features like integers, strings, arithmetic expressions, if-statements, for-loops, lists, tuples and dictionaries, so why not base our DSL off of Python that has all that stuff, so that people familiar with Python will have an easy time reading and writing it?

Then they implemented a very limited but very fast interpreter that supports just the non-Turing-complete DSL.

To limit what can be done, to make it easier to reason about: https://docs.bazel.build/versions/master/skylark/language.ht...

This is sorta Gradle's approach.

Gradle's approach is commendable, but it's too complicated and they built Gradle on top of Groovy. Groovy is not a good language (and it's also not a good implementation of that not good language).

It's good enough for me. It's unlikely to be "replaced" by the new hotness because it was created way before the JVM language craze where there were dozens of competing JVM languages. Sure that means it has warts but at least I don't have to learn a dozen sets of different warts.

I am so oppressed by YAML being chosen as the configuration language for mainstream CI systems. How do people manage to live with this? I always make mistakes, again and again. And I can never keep anything in my head. It's just not natural.

Why couldn't they choose a programming language? Restrict what can be done by all means, but something that has a proper parser, compiler errors and IDE/editor hinting support would be great.

One can even choose an embedded language like Lua for restricted execution environments. Anything but YAML!

YAML is not a language, it's a data format. Why does nobody in the entire tech industry know the difference? I didn't even go to school and I figured it out.

Most software today that uses YAML for a configuration file is taking a data format (YAML) applying a very shitty parser to create a data structure, and then feeding that data structure to a function, which then determines what other functions to call. There's no grammar, no semantics, no lexer, no operators, and no types, save those inherent to the data format it was encoded in (YAML). Sometimes they'll look like they include expressions, but really they're just function arguments.

The gp isn't talking about YAML in itself, they're talking about YAML as it's used by mainstream CI systems. Github Actions, for example, encodes conditionals, functions, and a whole lot of other language elements in its YAML format. To say it's "just a data format" is like saying XSLT is "just a markup language" because it's written in XML.
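
A toy illustration of the point. The pipeline below is what a YAML parser might hand back as plain data, yet the `if` key only means something because an evaluator gives it semantics. The keys and the `equals` operator are invented for this sketch and are not GitHub Actions syntax.

```python
# Hypothetical pipeline, as it might look after YAML has been parsed into data.
pipeline = [
    {"name": "test", "run": "make test"},
    {"name": "deploy", "run": "make deploy", "if": {"equals": ["branch", "main"]}},
]


def should_run(step, context):
    """Evaluate a step's condition; a missing condition means 'always'."""
    cond = step.get("if")
    if cond is None:
        return True
    variable, expected = cond["equals"]
    return context.get(variable) == expected


def plan(pipeline, context):
    """Return the commands that would run for the given context."""
    return [step["run"] for step in pipeline if should_run(step, context)]


if __name__ == "__main__":
    print(plan(pipeline, {"branch": "main"}))
    print(plan(pipeline, {"branch": "feature"}))
```

The data format stays the same; the language lives in `should_run`.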

All CI yaml configs look basically the same, so I believe this is missing the intended point.

But, TIL. Thanks

> YAML is not a language, it's a data format.

Yet Another Markup <<Language>> (which later supposedly became "YAML Ain't Markup Language", because every villain needs a better backstory).

VS Code with the Prettier extension gives you an editor with hinting and a parser, and it immediately shows when you have errors (not compiler errors, but still errors). If there is an extension for your CI system, try installing it too.

This is what chef did and I enjoyed it, but it seems that's not the way most systems went.

I guess YAML has the best solution to nested scopes, just indentation

"Bazel has remote execution and remote caching as built-in features... If I define a build... and then define a server-side Git push hook so the remote server triggers Bazel to build, run tests, and post the results somewhere, is that a CI system? I think it is! A crude one. But I think that qualifies as a CI system."



The advisability of rolling your own CI aside, treating CI as "just another user" has real benefits, and this was a pleasant surprise for me when using Bazel. When you run the same build command (say, `bazel test //...`) across development and CI, then:

- you get to debug your build pipeline locally like code

- the CI DSL/YAML files mostly contain publishing and other CI-specific information (this feels right)

- the ability of a new user to pull the repo, build, and have everything just work, is constantly being validated by the CI. With a bespoke CI environment defined in a Docker image or YAML file this is harder.

- tangentially: the remote execution API [2] is beautiful in its simplicity; it's doing a simple core job.

[1] OTOH: unless you have a vendor-everything monorepo like Google, integrating with external libraries/package managers is unnatural; hermetic toolchains are tricky; naively-written rules end up depending on system-provided utilities that differ by host, breaking reproducibility, etc. etc.

[2] https://github.com/bazelbuild/remote-apis/blob/master/build/...

How does Bazel deal with different platforms? For example, run tests on Windows, BSD, Android, Raspberry Pi, RISCv5, or even custom hardware?

Pretty well! You can set up a build cluster that provides workers for any of these different platforms. Each of these platforms is identified by a different set of label values. Then you can run Bazel on your personal system to 'access' any of those platforms to run your build actions or tests.

In other words: A 'bazel test' on a Linux box can trigger the execution of tests on a BSD box.

(Full transparency: I am the author of Buildbarn, one of the major build cluster implementations for Bazel.)

Bazel differentiates between the "host" environment (your dev box), the "execution" environment (where the compiler runs), and the "target" environment (e.g. RISCv5).

Edit: there's a confusing number of ways of specifying these things in your build, e.g. old crosstool files, platforms/constraints, toolchains. A stylized 20k foot view is:

Each build target specifies two different kinds of inputs: sources (code, libraries) and "tools" (compilers). A reproducible build requires fully-specifying not just the sources but all the tools you use to build them.

Obviously cross-compiling for RISCv5 requires different compiler flags than x86_64. So instead of depending on "gcc" you'd depend on an abstract "toolchain" target which defines ways to invoke different version(s) of gcc based on your host, execution, and target platforms.

In practice, you wouldn't write toolchains yourself, you'd depend on existing implementations provided by library code, e.g. many many third party language rules here: https://github.com/jin/awesome-bazel#rules

And you _probably_ wouldn't depend on a specific toolchain in every single rule, you'd define a global one for your project.

"platforms" and "constraints" together let you define more fine-grained ways different environments differ (os, cpu, etc) to avoid enumerating the combinatoric explosion of build flavors across different dimensions.
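
A stylized sketch of that idea, not Bazel's actual resolution algorithm: each toolchain lists the constraint values it needs on the execution and target side, and a platform is just a set of constraint values. All names below are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Toolchain:
    name: str
    exec_constraints: frozenset    # where the compiler must be able to run
    target_constraints: frozenset  # what it produces code for


# Hypothetical registered toolchains.
TOOLCHAINS = [
    Toolchain("gcc-native-x86_64", frozenset({"linux", "x86_64"}), frozenset({"linux", "x86_64"})),
    Toolchain("gcc-cross-riscv", frozenset({"linux", "arm64"}), frozenset({"linux", "riscv64"})),
]


def resolve(exec_platform, target_platform):
    """Return the first toolchain whose constraints both platforms satisfy."""
    for tc in TOOLCHAINS:
        if tc.exec_constraints <= exec_platform and tc.target_constraints <= target_platform:
            return tc.name
    raise LookupError("no toolchain matches the requested platforms")


if __name__ == "__main__":
    # An ARM worker cross-compiling for a RISC-V target.
    print(resolve({"linux", "arm64"}, {"linux", "riscv64"}))
```

Since matching is a subset check per dimension, adding a new OS or CPU value does not require listing every combination by hand.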

HTH, caveat, I have not done cross-compilation in anger. Someone hopefully will correct me if my understanding is flawed.

The reason this isn't a concern is because Bazel tries very hard to not let any system libraries or configurations interfere with the build, at all, ever. So it should rarely matter what platform you're running a build on, the goal should be the same output every time from every platform.

Linux is recommended, or a system that can run Docker and thus Linux. From there it depends on the test or build step. I haven't done much distributed Bazel building or test runs yet myself. I imagine you can speak to other OSes using qemu or network if speed isn't a concern. You can often build for other operating systems without natively using other operating systems using a cross-compiling toolchain.

That said Bazel is portable - it generally needs Java and Bash and is generally portable to platforms that have both, though I haven't checked recently. There are exceptions though, and it will run natively in Windows, just not as easily. https://docs.bazel.build/versions/master/windows.html It also works on Mac, but it's missing Linux disk sandboxing features and makes up for it using weird paths and so on.

> That said Bazel is portable - it generally needs Java and Bash and is generally portable to platforms that have both, though I haven't checked recently. There are exceptions though, and it will run natively in Windows, just not as easily. https://docs.bazel.build/versions/master/windows.html It also works on Mac, but it's missing Linux disk sandboxing features and makes up for it using weird paths and so on.

The good old: in theory it's portable, but in practice the target of that port better look 100% like Linux :-)

https://docs.bazel.build/versions/master/platforms.html is probably what you want.

So you can, conceivably, have bazel running on your local x86 machine run the build on an ARM (rpi) build farm, cross-compiling for RISCv5.

I presume that this specific toolchain isn't well supported today.

I had to integrate Azure Pipelines and wanted to shoot myself in the face. The idea that you are simply configuring a pipeline yaml is just one big lie; it's code, in the world's shittiest programming language using YML syntax - code that you have no local runtime for, so you have to submit to the cloud like a set of punch cards to see that 10 minutes later it didn't work and to try again. Pipelines are code, pure and simple. The sooner we stop pretending it isn't, the better off we'll be.

Yeah, agreed. Feels like programming a PHP/ASP website directly on the server through FTP, like we did in the 90's

We got tired of using external tools that were not well-aligned with our build/deployment use cases - non-public network environments. GitHub Actions, et. al. cannot touch the target environments that we deploy our software to. Our customers are also extremely wary of anything cloud-based, so we had to find an approach that would work for everyone.

As a result, we have incorporated build & deployment logic into our software as a first-class feature. Our applications know how to go out to source control, grab a specified commit hash, rebuild themselves in a temporary path, and then copy these artifacts back to the working directory. After all of this is completed, our application restarts itself. Effectively, once our application is installed to some customer environment, it is like a self-replicating organism that never needs to be reinstalled from external binary artifacts. This has very important security consequences - we build on the same machine the code will execute on, so there are far fewer middle men who can inject malicious code. Our clients can record all network traffic flowing to the server our software runs on and definitively know 100% of the information which constitutes the latest build of their application.

Our entire solution operates as a single binary executable, so we can get away with some really crazy bullshit that most developers cannot these days. Putting your entire app into a single self-contained binary distribution that runs as a single process on a single machine has extremely understated upsides these days.

Sounds like Chrome, minus the build-on-the-customer's-machine part. Or like Homebrew, sort of. Also sounds like a malware dropper. That said, it makes sense. I would decouple the build-on-the-customer's machine part from the rest, having a CI system that has to run the same way on every customer's machine sounds like a bit of a nightmare for reproducibility if a specific machine has issues. I'd imagine you'd need to ship your own dependencies and set standards on what version of Linux, CPU arch and so on you'd support. And even then I'd feel safer running inside overlays like Docker allows for, or how Bazel sandboxes on Linux.

Also reminds me a bit of Istio or Open Policy Agent in that both are really apps that distribute certificates or policy data and thus auto-update themselves?

We use .NET Core + Self-Contained Deployments on Windows Server 2016+ only. This vastly narrows the scope of weird bullshit we have to worry about between environments.

The CI system running the same way on everyone's computer is analogous to MSBuild working the same way on everyone's computer. This is typically the case due to our platform constraints.

> GitHub Actions, et. al. cannot touch the target environments that we deploy our software to.

It can with on-prem self-hosted Runners > https://docs.github.com/en/actions/hosting-your-own-runners/...

I just had this same complaint about using Actions and was pointed to this document.

Yeah and that's a total shitshow too. You want to run that in k8s?

    * First off it won't work with a musl image
    * You need to request a new token every 2 hours
    * It doesn't spawn pods per build like gitlab, it's literally a dumb runner, the jobs will execute *IN* the runner container, so no isolation, and you need all the tools under the sun installed in the container (our runner image clocked in at 2gb for a java/node stack)
    * Get prepared for a lot of .net errors on your linux boxes (yes, not exactly a showstopper but.. urgh).

I hated my time with GitHub actions and will not miss it.

Yeah. I really hope they fix the dumb runner part: https://github.com/actions/runner/pull/660 I resorted to this one instead https://github.com/summerwind/actions-runner-controller but it requires kubernetes...

Forgive me, but it sounds dangerous and insecure to give your software the kind of access that would be required to do what you described. Even with safety measures and auditing in place, I'm not sure if I would feel comfortable doing this.

How is this any less secure than handing the customer a zip file containing arbitrary binary files and asking them to execute them with admin privileges?

I worked for a place which simply ran antivirus/malware scans on the vendor-supplied binaries. Way easier to review an antivirus scan giving an approval or rejection, compared to allowing code to download source code from a vendor server (which you hope is not compromised), which does not pass human review before being compiled and run. The latter is far more likely to result in infection, unless the source code in the former is verified somehow (signed commits from a whitelist of signatures, at the very least).

I wouldn't do that either, but it's even less secure than that because the software would have credentials to the source control system. It also means your source control system has to be public.

Client environments have tokenized access to source control, which is private. Builds are pegged to specific commit hashes and triggered via an entirely separate authenticated portal, so there is no chance that someone pushes malicious code and it automatically gets built.

There is a certain mutual degree of trust with the environments we are operating in. We do not worry about the customer gaining access to our source code. Much like the customer doesn't worry too much about the mountains of their PII we are churning through on a regular basis.

Not insecure for the customer, insecure for you, the software vendor.

"Build Systems à la Carte" is not so much ringing a bell as shattering it with the force of its dong.


To expand, the OP ends with an "ideal world" that sounds to me an awful lot like someone's put the full expressive power of Build Systems à la Carte into a programmable platform, accessible by API.

Nitpick: I think you meant to write "gong".

Call me crazy, but I don't think they did...

It genuinely hadn't crossed my mind that a CI system and a build system were different things - maybe because I usually work in dynamic rather than compiled languages?

I've used Jenkins, Circle CI, GitLab and GitHub Actions and I've always considered them to be a "remote code execution in response to triggers relating to my coding workflow" systems, which I think covers both build and CI.

Did your dynamic languages not have some sort of build tool? Heck, Javascript has Gulp, Grunt, Webpack, a million others. Ruby has rake, Python pipenv/poetry, I guess. Python is actually kind of outlier, I guess you're expected to write Python scripts to manage your packaging and such.

Same boat. I am surprised that I had to scroll so far down to see this comment.

I've been wishing for one of my small projects (3 developers) for some kind of "proof of tests" tool that would allow a developer to run tests locally, and add some sort of token to the commit message assuring that they pass. I could honestly do without a ton of the remote-execution-as-a-service in my current GitLab CI setup, and be happy to run my deployment automation scripts on my own machine if I could have some assurance of code quality from the team without CI linters and tests running in the cloud (and failing for reasons unrelated to the tests themselves half of the time).

You are right. Small teams absolutely do not need to execute code remotely, especially if the cost is having an always on job server.

My team writes test output to our knowledge base:

    bugout trap --title "$REPO_NAME tests: $(date -u +%Y%m%d-%H%M)" --tags $REPO_NAME,test,zomglings,$(git rev-parse HEAD) -- ./test.sh

This runs test.sh and reports stdout and stderr to our team knowledge base with tags that we can use to find information later on.

For example, to find all failed tests for a given repo, we would perform a search query that looked like this: "#<repo> #test !#exit:0".

The knowledge base (and the link to the knowledge base entry) serve as proof of tests.

We also use this to keep track of production database migrations.

> You are right. Small teams absolutely do not need to execute code remotely, especially if the cost is having an always on job server.

Super debatable. You should have some builds and tests run in a clean environment. Maybe you could do it in a Docker container. But otherwise, you want a remote server.

Devs mess up their environment too much for a regular dev machine to be a reliable build machine.

Git pre-commit hooks can run your tests, but that's easy to skip.

I don't know about a "proof of test" token. Checking such a token would presumably require some computation involving the repo contents; but we already have such a thing, it's called 'running the test suite'. A token could contain information about branches taken, seeds for any random number generators, etc. but we usually want test suites to be deterministic (hence not requiring any token). We could use such a token in property-based tests, as a seed for the random number generator; but it would be easier to just use one fixed seed (or, we could use the parent's commit ID).
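
The last suggestion is easy to sketch. Assuming the commit ID is available as a hex string (for example from `git rev-parse HEAD^`), a stable seed can be derived from it without storing any token. The helper names are made up.

```python
import hashlib
import random


def seed_from_commit(commit_id):
    """Map a commit ID string to a stable integer seed."""
    digest = hashlib.sha256(commit_id.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def make_rng(commit_id):
    """Return a random generator that is reproducible for a given commit."""
    return random.Random(seed_from_commit(commit_id))


if __name__ == "__main__":
    rng = make_rng("abc123")  # hypothetical, shortened commit ID
    print([rng.randint(0, 99) for _ in range(5)])
```

Anyone re-running the property-based tests for the same commit gets the same generated inputs, while different commits still explore different inputs.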

There is a lot of valid criticism here, but the suggestion that "modern CI" is to blame is very much throwing the baby out with the bathwater. The GitHub Actions / GitLab CI feature list is immense, and you can configure all sorts of wild things with it, but you don't have to. At our company our `gitlab-ci.yml` is a few lines of YAML that ultimately calls a single script for each stage. We put all our own building/caching/incremental logic in that script, and just let the CI system call it for us. As a nice bonus that means we're not vendor locked, as our "CI system" is really just a "build system" that happens to run mostly on remote computers in response to new commits.
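
A sketch of the shape this takes, written in Python purely for illustration (ours is shell and Makefiles). The stage names and the placeholder commands are hypothetical; the point is that the CI YAML only needs one line per stage that calls the same entry point a developer would run locally.

```python
import subprocess
import sys

# Hypothetical stage table; the commands stand in for a project's real scripts.
STAGES = {
    "build": [sys.executable, "-c", "print('building')"],
    "test": [sys.executable, "-c", "print('testing')"],
}


def main(argv):
    """Run the stage named in argv[0] and return its exit code."""
    if not argv or argv[0] not in STAGES:
        print("usage: ci.py <" + "|".join(STAGES) + ">", file=sys.stderr)
        return 2
    return subprocess.run(STAGES[argv[0]], check=False).returncode
```

Wired up with `sys.exit(main(sys.argv[1:]))` under the usual `__main__` guard, a CI job would then call something like `python ci.py test`, which is exactly what you would type on a laptop.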

It's not exactly the same as the local build system, because development requirements and constraints are often distinct from staging/prod build requirements, and each CI platform has subtle differences with regards to caching, Docker registries, etc. But it uses a lot of the same underlying scripts. (In our case, we rely a lot on Makefiles, Docker BuildKit, and custom tar contexts for each image).

Regarding GitHub actions in particular, I've always found it annoyingly complex. I don't like having to develop new mental models around proprietary abstractions to learn how to do something I can do on my own machine with a few lines of bash. I always dread setting up a new GHA workflow because it means I need to go grok that documentation again.

Leaning heavily on GHA / GL CI can be advantageous for a small project that is using standardized, cookie-cutter approaches, e.g. your typical JS project that doesn't do anything weird and just uses basic npm build + test + publish. In that case, using GHA can save you time because there is likely a preconfigured workflow that works exactly for your use case. But as soon as you're doing something slightly different from the standard model, relying on the cookie-cutter workflows becomes inhibitive and you're better off shoving everything into a few scripts. Use the workflows where they integrate tightly with something on the platform (e.g. uploading an artifact to GitHub, or vendor-specific branch caching logic), but otherwise, prefer your own scripts that can run just as well on your laptop as in CI. To be honest, I've even started avoiding the vendor caching logic in favor of using a single layer docker image as an ad-hoc FS cache.

This makes no sense to me. Modern build systems have reproducible results based on strict inputs.

Modern CI/CD handles tasks that are not strictly reproducible. The continuous aspect also implies it's integrated to source control.

I guess I don't understand the post if it's not just semantic word games based on sufficient use of the word sufficient.

Maybe the point is to talk about how great Taskcluster is but the only thing mentioned is security and that is handled with user permissions in Gitlab and I assume Github. Secrets are associated with project and branch permissions, etc. No other advantage is mentioned in detail.

Can someone spell out the point of this article?

Your CI pipeline builds and tests your project, which is the same thing your build system does, except they are each using different specifications of how to do that. The author argues this is a waste.

I think by introducing continuous deployment you are changing the topic from what the author wrote (which strictly referred to CI).

This is indeed the key distinction - the author did strictly refer to CI, and for that their argument works, but that's essentially strawmanning as IMHO most people who are using CI don't care about "pure CI", they want (and get) CI/CD from "CI tools" so CI/CD is the actual thing that should be compared with build systems.

I would think the build system would only build the binaries or other compiled source. The CI pipeline would build and test.

Which build tools don’t also run tests? That seems very common with Java, Scala, Bazel users, npm, etc.

Many CI systems try to strictly enforce hermetic build semantics and disallow non-idempotent steps from being possible. For example, by associating build steps with an exact source code commit and categorically disallow a repeat of a successful step for that commit.
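
A minimal in-memory sketch of that rule, not any particular CI system's implementation: steps are keyed by (commit, step name), and a step that already succeeded for a commit is refused rather than repeated. The class and method names are invented, and a real system would persist the ledger.

```python
class StepLedger:
    """Records which (commit, step) pairs have already succeeded."""

    def __init__(self):
        self._succeeded = set()

    def run_once(self, commit, step, action):
        """Run `action` unless this step already succeeded for this commit."""
        key = (commit, step)
        if key in self._succeeded:
            raise RuntimeError(f"{step} already succeeded for {commit}")
        result = action()
        # Reached only when action() did not raise, so failed steps can be retried.
        self._succeeded.add(key)
        return result


if __name__ == "__main__":
    ledger = StepLedger()
    print(ledger.run_once("abc123", "publish", lambda: "published"))
```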

Re: correct language/abstraction

At the highest level you want a purely functional DSL with no side effects. Preferably one that catches dependency cycles so it halts provably.

On the lowest level, however, all your primitives are unix commands that are all about side effects. Yet, you want them to be reproducible, or at least idempotent so you can wrap them in the high level DSL.

So you really need to separate those two worlds, and create some sort of "runtime" for the low level 'actions' to curb the side effects.

* Even in the case of bazel, you have separate .bzl and BUILD files.
* In the case of nix, you have nix files and you have the final derivation (a giant S expression)
* In the case of CI systems and github actions, you have the "actions" and the "gui".
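
On the "catches dependency cycles" part of the high-level layer: once the graph is plain data, this is cheap to check before any side-effecting action runs. A sketch using Python's standard `graphlib` (3.9+); the target names and the `build_order` helper are made up.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(deps):
    """Return targets in dependency order, or raise ValueError on a cycle.

    `deps` maps each target to the set of targets it depends on.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


if __name__ == "__main__":
    deps = {"app": {"main.o", "util.o"}, "main.o": {"main.c"}, "util.o": {"util.c"}}
    print(build_order(deps))
```

Only after this pure step succeeds would the low-level "runtime" start executing the actual commands.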

Re: CI vs build system, I guess the difference is that build systems focus on artifacts, while CI systems also focus on side effects. That said, there are bazel packages to push docker images, so it's certainly a very blurry line.

> Re: CI vs build system, I guess the difference is that build systems focus on artifacts, while CI systems also focus on side effects. That said, there are bazel packages to push docker images, so it's certainly a very blurry line.

I think the CI and build system have basically the same goals, but they're approaching the problem from different directions, or perhaps it's more accurate to say that "CI" is more imperative while build systems are more declarative. I really want a world with a better Nix or Bazel. I think purely functional builds are always going to be more difficult than throwing everything in a big side-effect-y container, but I don't think they have to be Bazel/Nix-hard.

Out of curiosity, what's hard about bazel?

From my experience the main issue is interoperability with third party build systems. i.e. using a cmake library that was not manually bazeled by someone.

It’s been a few years since I tried to use it, but at the time the documentation was sparse and misleading. I couldn’t tell the right way to write my own rules and there were several concepts which were muddled. Some of the docs suggested I could extend it by implementing my own rules or macros and other docs suggested I had to delve into the extension interface which was a mess of convoluted Java (the kind of OOP that OOP advocates swear is Not True OOP). It certainly seemed like Bazel supports certain kinds of projects roughly aligned to languages, and if you needed to do anything interesting like codegen the answer was not at all obvious. The worst was that the docs all said that Bazel supported Python 3, but there were half a dozen different flags pertaining to Python 3, and after trying every permutation and asking for help, I discovered that the docs were simply wrong and Python 3 support had never actually been implemented due to various issues. It seems you need deep familiarity with Bazel in order to use it in even a basic capacity.

All fair points actually, a lot of Bazel rules were originally implemented internally in the Starlark interpreter itself (written in Java). I think nowadays a lot of the language support is implemented in Starlark, and "toolchains" are first class concepts.

I definitely had entire days occupied by bazel when I used it, but when I figured something out, it generally "just worked" for the rest of the team.

Yeah, I think my experience might have been different if I had access to someone who was familiar with Bazel or Blaze or perhaps even if I were trying to build Java instead of Python. Hopefully things are better now and next time I take a stab at it I’ll have more success.

What is the name for this phenomenon:

Observation: X is too complex/too time-consuming/too error-prone.

Reaction: Create X' to automate/simplify X.

Later: X' is too complex.

The alternative to this seems even worse. To deal with the overcomplexity we get things like Boot and Webpack, which aren't even really build tools, but tools for building build tools.

Software Development

I don't know the name of the fallacy/phenomenon, but it always reminds me of this xkcd: https://xkcd.com/927/


Ding ding ding. We have a winner.

All a CI pipeline is is an artifact generator.

All a CD pipeline is is an artifact shuffler that can kick off dependent CI jobs.

The rest is just as the author mentions. Remote Code Execution as a service.

But what we have is CI tools with integrations to a million things they shouldn't probably have integrations to.

Most of the build should be handled by your build scripts. Most of the deploy should be handled by deploy scripts. What's left for a CI that 'stays in its lane' is fetching, scheduling, and reporting, and auth. Most of them could stand to be doing a lot more scheduling and reporting, but all evidence points to them being too busy being distracted by integrating more addons. There are plenty of addons that one could write that relate to reporting (eg, linking commit messages to systems of record), without trying to get into orchestration that should ultimately be the domain of the scripts.

Otherwise, how do you expect people to debug them?

I've been thinking lately I could make a lot of good things happen with a Trac-like tool that also handled CI and stats as first class citizens.

I've spent a ton of time thinking/working on this problem myself. I came to roughly the same conclusions about a year ago except for a few things:

- Build system != CI. Both are in the set of task management. Centralized task management is in the set of decentralized tasks. Scheduling centralized tasks is easy, decentralized is very hard. Rather than equating one to the other, consider how a specific goal fits into a larger set relative to how tasks are run, where they run, what information they need, their byproducts, and what introspection your system needs into those tasks. It's quite different between builds and CI, especially when you need it decentralized.

- On the market: What's the TAM of git? That's what we're talking about when it comes to revamping the way we build/test/release software.

- There's a perverse incentive in CI today, which is that making your life easier costs CI platforms revenue. If you want to solve the business problem of CI, solve this one.

- There are a number of NP-hard problems in the way of a perfect solution. Cache invalidation and max cut of a graph come to mind.

- I don't know how you do any of this without a DAG. Yet, I don't know how you represent a DAG in a distributed system such that running tasks through it remain consistent and produce deterministic results.

- Failure is the common case. Naive implementations of task runners assume too much success, and recovery from failure is crucial to making something that doesn't suck.
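
To make the DAG and failure points above concrete, here is a minimal sketch in Python (not taken from any real CI system). It assumes tasks are plain zero-argument callables and that any exception counts as a retryable failure; the names run_dag, flaky_build, and broken_test are made up for illustration. It runs tasks in dependency order using the standard library's graphlib, retries failures, and skips anything downstream of a task that never succeeded.

```python
from graphlib import TopologicalSorter


def run_dag(tasks, deps, max_attempts=3):
    """Run callables in dependency order, retrying failures.

    tasks: dict of name -> zero-argument callable
    deps:  dict of name -> set of names that must succeed first
    Returns a dict of name -> "ok", "failed", or "skipped".
    """
    # Make sure every task is a node, even if it has no dependencies.
    graph = {name: set(deps.get(name, ())) for name in tasks}
    status = {}
    # static_order() yields each node after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        # Do not run a task whose upstream dependency did not succeed.
        if any(status.get(dep) != "ok" for dep in graph.get(name, ())):
            status[name] = "skipped"
            continue
        status[name] = "failed"
        for _ in range(max_attempts):
            try:
                tasks[name]()
            except Exception:
                continue  # assumption: every exception is retryable
            status[name] = "ok"
            break
    return status


# Hypothetical tasks: one flaky, one permanently broken.
attempts = {"count": 0}


def flaky_build():
    attempts["count"] += 1
    if attempts["count"] < 2:
        raise RuntimeError("transient failure")


def broken_test():
    raise RuntimeError("always fails")


result = run_dag(
    tasks={"build": flaky_build, "test": broken_test, "deploy": lambda: None},
    deps={"test": {"build"}, "deploy": {"test"}},
)
print(result)
```

This is the easy, centralized case: one process, one in-memory status table. The hard parts mentioned above (keeping that state consistent across machines, caching, deterministic results) are exactly what the sketch leaves out.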

Anything in CS can be generalized to its purest, most theoretical forms. The question is how usable is it and how much work does it take to get anything done.


- https://danstroot.com/2018/10/03/hammer-factories/

- https://web.archive.org/web/20120427101911/http://jacksonfis...

Bazel, for example, is tailored to the needs of reproducible builds and meets its audience where it is at, on the command line. People want fast iteration time and only occasionally need "everything" run.

GitHub Actions is tailored for completeness and meets its audience where it's at, the PR workflow (generally, a web UI). The web UI is also needed for visualizing the complexity of completeness.

I never find myself reproducing my build in my CI but do find I have a similar shape of needs in my CI but in a different way.

Some things more tailored to CI that wouldn't fit within the design of something like Bazel include

- Tracking differences in coverage, performance, binary bloat, and other "quality" metrics between a target branch and HEAD

- Post relevant CI feedback directly on the PR
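
As a rough illustration of those two items, here is a small Python sketch. It assumes the metrics for the target branch and for HEAD have already been collected into plain dicts; the function names and metric names are hypothetical, and actually posting the text to a PR is left out because it depends on the hosting platform.

```python
def diff_metrics(base, head, tolerance=0):
    """Return {metric: (base_value, head_value, delta)} for changed metrics.

    Only metrics present on both sides are compared.
    """
    changes = {}
    for name in sorted(base.keys() & head.keys()):
        delta = head[name] - base[name]
        if abs(delta) > tolerance:
            changes[name] = (base[name], head[name], delta)
    return changes


def format_comment(changes):
    """Render the changes as plain text suitable for a PR comment."""
    if not changes:
        return "No metric changes."
    lines = [
        f"{name}: {old} -> {new} ({delta:+})"
        for name, (old, new, delta) in changes.items()
    ]
    return "\n".join(lines)


# Hypothetical numbers for a target branch and for HEAD.
base = {"binary_size_bytes": 1000, "covered_lines": 800}
head = {"binary_size_bytes": 1200, "covered_lines": 800}

comment = format_comment(diff_metrics(base, head))
print(comment)
```

Note that this needs results from two different commits at once, which is part of why it sits more naturally in CI than in a build graph for a single checkout.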

I was really disappointed when GitHub Actions didn't more closely resemble Drone CI. Drone's configuration largely comes down to a Docker container and a script to run in said Docker container.

Not familiar with Drone but just pointing out Github Actions can be used for lots of stuff besides CICD. That's just the most popular and obvious usage.

I actually prefer being able to grab some existing actions plugins rather than having to write every single build step into a shell script like with e.g. AWS CodePipeline for every app. You don't have to use them, though. You could have every step be just shell commands with GitHub Actions.

Drone is the ultimate example of simplicity in CI. Automatic VCS-provider OAuth, automatic authorization for repos/builds, native secrets, plugins are containers, server configuration is just environment variables, SQL for stateful backend, S3 for logs, and a yaml file for the pipelines. There's really nothing else to it. And when you need a feature like dynamic variables, they support things you're already familiar with, like Bourne shell parameter expansion. Whoever created it deeply understands KISS.

I think the only reason it doesn't take over the entire world is the licensing. (It's not expensive at all considering what you're getting, but most companies would rather pay 10x to roll their own than admit that maybe it's worth paying for software sometimes)

Airflow is a pretty generic DAG execution platform. In fact, some people have built a CI (and CD) system with Airflow.




I cannot stop thinking this guy is describing what we do at https://garden.io.

He seems to go on describing the Stack Graph and the build/test/deploy/task primitives, the unified configuration between environments, build and test results caching, the platform agnosticism (even though we are heavily focused on Kubernetes) and the fact that CI can be just a feature, not a product in itself.

One thing I definitely don't agree with is: "The total addressable market for this idea seems too small for me to see any major player with the technical know-how to implement and offer such a service in the next few years."

We just published the results from an independent survey we commissioned last week and one of the things that came out is: it doesn't matter the size of the company, the number of hours teams spend maintaining these overly complex build systems, CI systems, Preview/Dev environments etc. is enormous and is often the object of the biggest complaints across teams in tech organizations.

So yeah, I agree with the complexity bit but I think the author is overly positive about the current state of the art, at least in the cloud native world.

> In my ideal world, there is a single DAG dictating all build, testing, and release tasks.

I've stewed on this as well, and I'll add my two cents:

I'd argue this problem -- building and managing a DAG of tasks -- isn't just critical to build systems and CI. It's everywhere: cloud architecture, Dev/MLOps pipelines... I'd argue most programming is just building and managing a DAG of tasks (e.g. functions, and purposefully limiting to "acyclic" for this argument). This is why we always regress to Turing complete languages; they're great at building DAGs.

So yes, I agree that some standard DAG language (probably Turing complete) would be great. But I'd extend its reach into source code itself. Pass your DAG to Python, and it schedules and runs your tasks in the interpreter (or perhaps many interpreters). Pass your DAG to Kubernetes and it schedules and runs your tasks in containers. etc.
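
For the "pass your DAG to Python" half, a minimal sketch of what that could look like with only the standard library. It assumes the DAG is a plain dict of task name to predecessor names and that each task is a zero-argument callable; run_parallel and the task names are invented for illustration, and this is not a proposal for the standard DAG language itself.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_parallel(graph, tasks, max_workers=4):
    """Run each task as soon as all of its predecessors have finished.

    graph: dict of name -> set of predecessor names
    tasks: dict of name -> zero-argument callable
    Returns the task names in the order they completed.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit everything whose predecessors are all done.
            for name in sorter.get_ready():
                running[pool.submit(tasks[name])] = name
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                future.result()  # re-raise any exception from the task
                sorter.done(name)  # unblocks the task's successors
                finished.append(name)
    return finished


# Hypothetical pipeline: test and package both depend on build.
graph = {
    "test": {"build"},
    "package": {"build"},
    "release": {"test", "package"},
}
tasks = {name: (lambda: None) for name in ("build", "test", "package", "release")}

order = run_parallel(graph, tasks)
print(order)
```

The same graph dict could in principle be handed to a different executor (containers instead of threads), which is the appeal of keeping the DAG as data rather than burying it in a script.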

inb4 Lisp

One big difference in my opinion is that a CI system can (and should) allow for guarantees about provenance. Ideally code signing can only be done for your app from the CI server. This allows users/services to guarantee that the code they are running has gone through whatever workflows are necessary to be built on the build server (most importantly that all code has been reviewed).

As a principle, I consider any codebase which can be compromised by corrupting a single person to be inherently vulnerable. Sometimes this is ok, but this is definitely not ok for critical systems. Obviously it is better to have a stronger safety factor and require more groups/individuals to be corrupted, but there are diminishing returns considering it is assumed said individuals are already relatively trustworthy. Additionally, it is really surprising to me how many code platforms make providing even basic guarantees like this one impossible.

It is something we're definitely concerned about: https://www.datadoghq.com/blog/engineering/secure-publicatio...

There is a new CI solution: https://kraken.ci. Workflows are defined in Starlark/Python. They can be arranged in a DAG. Besides providing base CI features, it also has features supporting large-scale testing that are missing in other systems.

Why even have a difference between production and CI in the first place? I see a future where it's Kubernetes all the way down. My CI system is just a different production-like environment. The same job to deploy code to production is used to deploy code to CI. The same deployment that gets a pod running in prod will get a pod running in test under CI. Your prod system handles events from end-users, your CI system handles events from your dev environment and source repo. Everything is consistent and the same. Why should I rely on some bespoke CI system to re-invent everything that my production system already has to do?

The biggest problem with any CI system is that you need an execution environment. Changing this environment should be the same as changing the code. Docker (or rather podman) has given us the tools to do this.

Now if CI systems would allow me to build that container image myself, I could pretty much guarantee that local build/tests and CI build/tests can run inside the same environment. I hacked something like this for gitlab but it's ugly and slow.

So in conclusion, I think that CI systems should expect container creation, some execution inside that container, and finally some reporting or artifact generation from the results.

Docker / containers are necessary but not sufficient. For example, in a machine learning CI / CD system, there could be a fundamental difference between executing the same step, with the same code, on CPU hardware vs GPU hardware.

I built a CI service many moons ago (was https://greenhouseci.com back then) on the premise that _most_ companies don't actually require a hugely complicated setup and don't actually need all the flexibility / complexity that most CI systems have.

After talking to tens and tens of companies with apparent complex CI requirements, I still stand by that assertion. When drilling down into _why_ they say they need all that configurability, it's sometimes as easy as "but FAANG is doing X so my 5 person team needs it as well".

Is this still working? The certificate is expired.

I parted ways with my child many years ago. The product itself is still around and has gone through 2 rebrands and has evolved a lot. It's now home under https://codemagic.io

The vision of this article is a similar vision to earthly.dev, a company I have founded to pursue the issues presented here.

We have built the build system part and are working on completing the vision with the CI part.

We use buildkit underneath as a constraint solver and our workflows are heavily DAG based.

Hundreds of CI pipelines run Earthly today.

I don't fully agree with all the assumptions in the article, including with the fact that the TAM is limited here. CI has been growing at 19% CAGR and also I think there are possibilities for expanding into other areas once you are a platform.

I always thought of CI to be a build server that pushes code directly to production automatically. I never liked how CI does away with the concept of a 'release' with a specific version. To me CI is synonymous with SaaS products, specifically products whose job is to serve individual client requests via a front end. I've never understood how CI is supposed to work with different types of software that isn't so transactional, or when the transactions are much longer running than a single HTTP request.

That's why I love Buddy (buddy.works), you build your CI pipelines with a UI, and all the config and logic is easily configurable without having to know the magic config key/value combination. Need to add a secrets file or private key; just add it to the "filesystem" and it'll be available during the run, no awkwardly base64ing contents into an environment string. Unfortunately I have to use GitHub Actions/CircleCI for my iOS deployments still, but I read macOS container support is coming soon.

I was thinking the other day that Jetpack Compose or some other reactive framework might be the perfect build system. It has declarativity by default, sane wrapping of IO, and excellent composability (just define a function!), plus of course a real language.

I've always had a pipe dream of building a CI system on top of something like Airflow. A modern CD pipeline is just a long-running workflow, so it would be great to treat it just like any other data pipeline.

He doesn't mention Concourse, but let me say the complexity there is a beast.

I've heard Concourse team members describe it by analogy to make, FWIW.

One thing Concourse does well that I haven't seen replicated is strictly separating what is stateful and what isn't. That makes it possible to understand the history of a given resource without needing a hundred different specialised instruments.

Yes, if make threw away all the context and everything you had done prior to a target completing, and only gave the outputs of the target to the next target.

Am I too old fashioned in thinking it’s good to define an acronym the first time it’s used? I think many well educated readers wouldn’t know CI

Too Many Acronyms (TMA)

  Continuous Integration (CI)
  Continuous Delivery or Deployment (CD)
  Domain-Specific Language (DSL)
  Structured Query Language (SQL)
  YAML Ain't Markup Language (YAML)
  Windows Subsystem for Linux (WSL)
  Portable Operating System Interface (POSIX)
  Berkeley Software Distribution (BSD)
  Reduced Instruction Set Computer (RISC)
  Central processing unit (CPU)
  Operating System (OS)
  Quick EMUlator (QEMU)
  Total Addressable Market (TAM)
  Directed Acyclic Graph (DAG)
  Computer Science (CS)
  User Interface (UI)
  PR? workflow (Pull Request, Purchase Request, Public Relations, Puerto Rico)
  GitHub Actions (GHA)
  GitLab? or Graphics Library (GL)
  Facebook Apple Amazon Netflix Google (FAANG)

The only acronym I care about is B:

  alias b='code build'

If you're in software development and haven't heard about Continuous Integration, you're definitely old-fashioned. Although fashion moves fast and breaks things here.

The complaint is invalid. CI pipelines are build systems.

There are a lot of solutions for CI.

I just create a small script and run it using make.

In other words: https://xkcd.com/927/

Isn’t Drone also an example of the author’s ideal solution?
