The End of CI (matt-rickard.com)
146 points by rckrd 86 days ago | 231 comments

> How often have you or a coworker pushed an empty commit to "re-trigger" the CI?

Zero, because unless my dog configured the CI, I’d just click a button to re-run it on the same PR/commit/branch and not “make an empty commit”.

End to end testing of large systems with hundreds or thousands of man-years of code in them will be slow because there is a large surface area to test and testing in individual modules doesn’t give the same kind of end to end coverage that testing everything together does. So while pre-commit would be good, I doubt it will replace server CI for all but the smallest repositories.

Agreed, but for instance, I am working on a project hosted on AWS where the trigger build action is not allowed for my user. Empty commit is my only option.

So a dog configured your CI then.


I'm sorry, but I don't understand the point of such a configuration. It's certainly not security related.

It can definitely be security related. The exact minimum set of permissions needed for users to successfully use the AWS Console is byzantine and not well documented. AWS managed (example) policies tend to just grant "* on *" type permissions.

To carve out minimal permissions, you have to start with nothing and repeatedly attempt to do the action in AWS console, and check CloudTrail to see what got denied. Increase role permissions, lather, rinse, repeat until it works and pray they don't update the console and break you again.

It's possible that either this process is too tedious to be worth doing, or produces a policy more complicated than they wish to use, or requires a policy that is more permissive than they wish to use.
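The "try it, check CloudTrail, widen the role" loop above can be partly mechanized. Here is a minimal sketch, assuming you have already exported the relevant CloudTrail events as a list of dicts. The field names (`eventSource`, `eventName`, `errorCode`) are real CloudTrail record fields, but the sample events are made up, and the mapping from an event to an IAM action name is only approximate: some API calls need actions whose names differ from the event name.

```python
from typing import Iterable

# Error codes CloudTrail commonly records for authorization failures.
DENIED_CODES = {"AccessDenied", "Client.UnauthorizedOperation"}


def denied_actions(events: Iterable[dict]) -> list[str]:
    """Return candidate IAM action names for the events that were denied."""
    actions = set()
    for event in events:
        if event.get("errorCode") not in DENIED_CODES:
            continue
        # "codebuild.amazonaws.com" -> "codebuild" (approximation, see lead-in)
        service = event["eventSource"].split(".")[0]
        actions.add(f"{service}:{event['eventName']}")
    return sorted(actions)


# Hypothetical events, for illustration only.
sample_events = [
    {"eventSource": "codebuild.amazonaws.com", "eventName": "StartBuild",
     "errorCode": "AccessDenied"},
    {"eventSource": "codebuild.amazonaws.com", "eventName": "BatchGetBuilds"},
    {"eventSource": "logs.amazonaws.com", "eventName": "GetLogEvents",
     "errorCode": "AccessDenied"},
]

print(denied_actions(sample_events))
```

With the sample data this prints `['codebuild:StartBuild', 'logs:GetLogEvents']`, which is the list you would then review by hand before adding anything to the role.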

IAM isn’t fun, but there are lots of options.

https://pypi.org/project/access-undenied-aws/ will allow you to start with least privilege and fix specific issues.

https://github.com/iann0036/iamlive allows an admin to perform the action via CLI and capture the policy.

Access advisor can inspect how you actually use the role and give suggestions on what to remove.

A more helpful suggestion is to experiment with these tools and then find gaps in IAM actions and submit those as feature requests via your TAM.

Sure, those tools make it moderately more convenient. An intricate policy like this might still run afoul of organizational reasons that make it a challenge to get a special nonstandard thing approved.

I've also experienced the AWS console being less than stellar at fault-tolerance when acting within very restrictive, targeted IAM roles. The only solution would be an overly broad permissions grant which is not always viable. Or well, if you spend enough money you can try to beg your TAM to get it fixed, but in the meantime between "now and never", your solution would still be pushing empty commits.

It's probably cost related. Don't want any extraneous jobs running on AWS, someone might automate that as an attack.

Maybe to click a button they’d have to sign up for AWS and log into their account. It's easier to just push an empty commit.

The workflow will be maintained by the CI tool of choice, which can (should) have its own credentials to deploy stuff. There is nothing but semantic differences between a commit trigger or a manual trigger for a given commit.

CI is RCE. You want to make sure you're keeping control over that.

Sounds like you need to talk to your SRE team to get that fixed.

It's not a security concern, it'll end up as WONTFIX

- I need to be able to re-run failed builds

- Wooof!

I've seen issues before, where Jenkins agents "go bad" (Gradle cache corruption, Docker filesystem corruption, etc) – pressing the "re-run" button in the UI usually runs the job again on the same Jenkins agent, and good chance it fails again in the exact same way. What I've discovered, is if I restart the job twice, in quick succession, the first re-run will go to the same agent (and will likely fail with the same problem), the second to a different one (hopefully one without that issue). Easiest way to restart it twice, is `git commit --allow-empty -m rerun && git push`, wait until the job starts, then repeat. Maybe there is some way to do the same thing in the UI, but pushing a couple of empty commits is quick and I know it works.

That's an inadequacy in Jenkins though. If you want to do the same in TeamCity you just click the re-run option in the UI and then pick a different agent from the dialog.

Sometimes to click buttons you need to 2FA log in. It's easier to push empty commits than to do that.

I always have to be careful responding to things like this because I get sucked into the trap of thinking “we” means myself, and it just needs to mean “most of us” for whatever they’re talking about to be worrisome or even true.

I do worry that a lot of people are missing the first principles where CI is concerned, but a few of us still remember. Whether that’s enough is always a question.

> First, CI rarely knows what has changed, so everything needs to be rebuilt and retested

CI is the impartial judge in your system. That is the ultimate goal and much of the value. Don’t believe me when I say that your check in broke something. Believe the robot. It does not think, it does not interpret errors (which is why intermittent failures poison the well). It just knows that whatever you did was not good.

We make it stupid simple because the more ways you try to interpret things, the higher the likelihood the system will either miss an error or report imaginary ones. And in this case I am very sorry Mr Lasso, but it is in fact the hope that kills. Optimism makes us try things, things others would give up on. That’s what makes a piece of software good. Better. But too much and you start ignoring very real problems and make optimism into a liability. I’ve seen it over and over again, take a build with intermittent failures and people will check in broken code and assume it’s the intermittent problem and not their change.

Ideally, when you start breaking builds you’ve found your limits. But the build has to break when I break the code, and it needs to do it repeatably, because if I can’t reproduce the error I risk telling myself stories, and foisting off debugging to other people, which is a big no no.

What if it was possible to have a local build system that would run tests etc and provide a cryptographically secure hash proving that the tests passed, the build was good, etc then a “centralized” CI system just verified that before permitting merge/deploy. There’s a lot of hand waving there but it doesn’t seem wildly implausible. Local use of containers or something like nixos could provide sufficient protection against the “works on my machine but not in production” problem.
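To make the hand waving a bit more concrete, here is a rough sketch of what such a signed "tests passed" record could look like. Everything here is an assumption for illustration: the function names, the claim fields, and the use of a shared-key HMAC in place of whatever signature scheme a real system would pick. Note what it does and does not show: the central check confirms that the key holder vouched for this commit and environment, not that the tests were actually run.

```python
import hashlib
import hmac
import json


def sign_attestation(key: bytes, commit: str, env_digest: str, passed: bool) -> dict:
    """Developer side: sign a claim about one commit in one environment."""
    claim = {"commit": commit, "env": env_digest, "tests_passed": passed}
    payload = json.dumps(claim, sort_keys=True).encode()
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify_attestation(key: bytes, attestation: dict, commit: str, env_digest: str) -> bool:
    """Central side: accept only an untampered, passing claim for this commit/env."""
    claim = attestation["claim"]
    payload = json.dumps(claim, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(expected, attestation["signature"])
        and claim["commit"] == commit
        and claim["env"] == env_digest
        and claim["tests_passed"] is True
    )


# Hypothetical values, for illustration only.
key = b"demo-key-not-for-real-use"
att = sign_attestation(key, "abc123", "sha256:container-image", True)
print(verify_attestation(key, att, "abc123", "sha256:container-image"))

# Re-pointing the claim at another commit invalidates the signature.
forged = {"claim": {**att["claim"], "commit": "deadbeef"}, "signature": att["signature"]}
print(verify_attestation(key, forged, "deadbeef", "sha256:container-image"))
```

The first check prints `True` and the forged one prints `False`.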

I always found it interesting how git was adopted by industry when a large part of its complexity and feature set is built around supporting fully distributed developers, who could lack internet access for months at a time but need to work collaboratively in a team. Then, industry took that distributed approach and strapped project and team management systems with web based front ends like Github, GitLab, etc. around git and created CI/CD pipelines that built on top of or in parallel to those environments--all of which were centralized. While it's of course still true that your underlying repo instances still give you a lot of flexibility away from those centralized environments, the centralized environments themselves are critical in regular daily workflows, negating a lot of the flexibility git as a VCS adds.

Industry could largely use a significantly simpler VCS model than git's, since many of its features are moot on a daily basis. Many teams could probably get away with SVN, for example (although its branching isn't nearly as good).

I think you raise a great point in that we need to look at how processes have evolved on top of version control and look at adapting those to a similar model. In some cases it's just not practical because of the way infrastructure testing and deployment work, but it's the direction I think we should be going. In fact, the first step would be to make issue systems and project management interfaces provided by efforts like GitLab and GitHub available in a local capable, distributed fashion. Clearly communications often require some degree of centralization or at least peer message propagation but there's no reason that information can't be separated from the infrastructure that displays and interacts with it.

> Then, industry took that distributed approach and strapped project and team management systems with web based front ends like Github, GitLab, etc. around git and created CI/CD pipelines that built on top of or in parallel to those environments--all of which were centralized.

A different view on this is that my CI system is a kind of colleague, responsible for a lot of what testers, ops, and build eng people toiled at before. It is another aspect of the decentralized system where a robot can also check out and work with the code even while humans continue to write more, which is considerably more painful under e.g. the SVN model. My human colleagues and I still share plenty of code directly, via git pulls and other tools.

It's a recurring and perhaps inevitable problem. We've collectively been building software for a long time but any given individual hasn't. No-one has seen all the good ideas that have gone before and sometimes been forgotten. Everyone is influenced to some degree by whatever ideas are currently popular within their group.

We developed distributed VCSes and the ability to do smart things with branches and combining contributions from people all over the world at different times. Then we somehow still ended up with centralised processes where everyone has to merge to trunk frequently and if GitHub goes down then the world stops turning. It turns out that just using git as git wasn't such a bad idea after all.

We were told that dynamic languages like JS and Python are so much more productive and that this was essential for fast-moving startups to be competitive. Today the industry is moving sharply from JS to TypeScript and even Python now has similar tools. It turns out that static typing was better for building robust software at scale than relying entirely on unit tests after all.

Just wait until lots of developers who have only ever worked with JS or Python learn what a compiled executable is. If you tell them that xcopy deployment even to a huge farm of web servers is no big deal when all you're copying is one executable file and maybe a .env with the secrets, it'll blow their minds...

> if GitHub goes down the world…

Well, technically it doesn’t have to. It needs to be „a“ CI server that the team agrees upon, not necessarily a GitHub one.

I think the most important factor is to keep in mind that this is something teams decide to do.

Sure. But anecdotally I don't think I have ever worked somewhere with a centralised VCS/CI/CD for more than a few weeks and not had at least one major outage that basically caused all development work to grind to a halt for everyone for at least several hours, all because a key service went down. I've seen it happen to GitHub, GitLab, some of the big dedicated CI providers...

You can certainly argue that there are other advantages to using those systems. However it's an inescapable fact that if those teams had been using git-as-git and had a good local environment for each developer then most of the developers would have been able to carry on with most of their work during all of those outages. Sometimes single points of failure fail.

Wow, that seems to be the other side of the lucky stream I am on :-D

Seriously, a well set up GitLab is pretty much undownable. And even if it does go down, what really is the impact? If the pipelines don't run you cannot integrate; sure, if the core service breaks down you will probably resort to exchanging patches the old school way. But the good part is, in order to deploy those decentrally sourced changes you still will go through centralized CI and gain its quality assurances by doing so. Where's the drawback?

Unfortunately a hosted GitLab is downable when GitLab itself goes down. Same for all the other hosted source control and CI systems.

Given that many companies now tie their entire deployment process into their source control and CI/CD systems that means you really are reduced to exchanging patch files. For any non-trivial change in a large system that quickly becomes impractical and so development slows to a crawl or everyone literally gives up and goes to the pub.

Industry uses both the centralized and decentralized features. Just because I want to configure team permissions doesn't mean I want to be blocked if the network is down.

I think a lot of the adoption of git in the industry at large is in _spite_ of its distributed nature - it just so happens that the decentralized design made it care much more about branching and merging, and it got adoption because even for a small local team those improvements over SVN were worth the pain of dealing with git's UX warts.

> industry took that distributed approach and strapped project and team management systems with web based front ends like Github

Indeed. Software forges centralized what was a Decentralized VCS.

For one reason: profit.

And of course the many other reasons: convenience, consistency, built-in integration with other development tools...

...none of which requires or justifies the centralization that has been created.

Apparently, many people disagree with you; otherwise, we would not be in this situation.

I agree that centralization is probably not a hard requirement for these properties, but currently no one has come up with decentralized equivalents that provide users what they actually want - and remember, you don't get to tell them what they want!

Why is this not the best of both worlds? You have centralized portals like GitHub that have a bunch of additional proprietary features, but if you don't need them or don't want a centralized solution you have the freedom to interface with it from, uh, anything.

github does not federate, it is not decentralized and you need an account to interact with the many proprietary features.

You just described https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis...

Running tests locally is pretty much a no-no for many codebases because they are so large and tests take a long time if they don't run on a beefy test server or in parallel.

Common examples of this are multi-million line C++ codebases (e.g. proprietary game engines) and monorepos in any language.

Running tests on my computer uses up valuable CPU cycles that I can use to work on something else while the CI servers are running my tests.

However I can see this working for small to medium sized codebases that really don't need CI for testing.

This is an oversight that lands in the circle of “the things I know that are difficult to sell because everyone else is just ignoring it.”

Capacity planning applies to tests. As your test count goes up your budget per test goes down. Every CI tool should plot build and test duration over time and only a couple do.

There’s a reason most test frameworks can mark slow tests. You need to not only use that but ratchet down over time. Especially when you get new hardware.
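The budget idea can be made concrete with a small sketch. This assumes the simplest possible model, a fixed wall-clock budget for the whole suite split evenly across tests that run serially; the function names, test names, and the 60 second budget are all made up for illustration.

```python
def per_test_budget(suite_budget_s: float, test_count: int) -> float:
    """Average time each test may take if the suite must fit the budget."""
    return suite_budget_s / test_count


def over_budget(durations_s: dict[str, float], suite_budget_s: float) -> list[str]:
    """Names of tests slower than the current per-test budget."""
    budget = per_test_budget(suite_budget_s, len(durations_s))
    return sorted(name for name, secs in durations_s.items() if secs > budget)


# Hypothetical timings in seconds.
durations = {
    "test_parse": 0.5,
    "test_render": 1.0,
    "test_login": 2.0,
    "test_checkout_flow": 40.0,
}

print(per_test_budget(60, len(durations)))
print(over_budget(durations, 60))
```

With four tests and a 60 second budget this prints `15.0` and `['test_checkout_flow']`. Adding tests without raising the budget shrinks the first number, which is the ratchet: the same 40 second test gets harder to justify as the suite grows.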

I haven’t run benchmarks lately but my old rule of thumb developed over many projects and with several people better at testing than I, was a factor of eight for each layer in the testing pyramid.

That certainly puts a lot of runtime pressure on the top of the pyramid, but that’s by design. You don’t want people racing to the top, because that’s how you get a cone.

I'm actually working on something to solve the slow test problem. We go down the massively parallel route. We run everything pre git commit by syncing your project directory to our servers and parallelizing the runs. Here is a demo https://brisktest.com/demos running against the react test suite.

Wow, that's really impressive! Do you plan on adding support for Golang?

It wasn't top of our list but it is definitely something we could support if there was interest (fairly quickly I'd imagine). My personal experience with go tests is that they tend to be very fast but perhaps I just haven't seen a large enough project. Is it something you'd be interested in? If you want to chat more about it, my email is in my profile.

Or run a partial suite...

Partial suite is certainly an option. You can build tools that only test code that has changed and whose changes can have cascading effects on other code (i.e. dependent code). Do you know if such tools exist, or would I (as in the software engineer working on the codebase) have to build them?
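For a sense of what the core of such a tool involves, here is a minimal sketch of the selection step: invert the dependency graph, then walk outward from the changed modules. It assumes you can already produce a module-level dependency map, which is the hard part in practice; the module names are hypothetical.

```python
from collections import deque


def affected_modules(deps: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Return the changed modules plus everything that transitively depends on them.

    `deps` maps each module to the modules it depends on.
    """
    # Invert the graph: module -> modules that depend on it.
    dependents: dict[str, set[str]] = {}
    for module, requires in deps.items():
        for dep in requires:
            dependents.setdefault(dep, set()).add(module)

    # Breadth-first walk from the changed modules through their dependents.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical module graph.
deps = {
    "api": {"core", "auth"},
    "auth": {"core"},
    "cli": {"api"},
    "docs": set(),
    "core": set(),
}

print(sorted(affected_modules(deps, {"auth"})))
```

A change to `auth` selects `['api', 'auth', 'cli']` and leaves `core` and `docs` alone, while a change to `core` would select everything except `docs`.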

This is actually the default result you get if you leave the defaults alone in some CI systems! My current company runs CI this way, in fact. A checkout is re-used between builds, and Gradle computes the minimal set of tasks that need to be run based on changed files just as it would locally. TeamCity will set things up this way by default. If you want a clean build on every run, you have to opt-in by checking a box.

How well does it work? Well .... it sorta mostly works. Sometimes a build will fail for inexplicable reasons because Gradle/Kotlin incremental builds don't seem fully reliable. You re-run with a clean checkout and the issue goes away.

How much time does it save? Also hard to say. Most of the time goes into integration tests that by their nature are invalidated by more or less any change in the codebase. That's not exactly incorrect. Some changes in some modules do avoid hitting the integration tests, though.

TC can also do test sharding in the newest versions when you use JUnit. We don't use this yet though.

Partial is the most valuable when you have red tests that are resisting simple fixes. I would definitely encourage running the entire tests every hour or two at the least.

We have a project where the team split the tests into chunks and eventually I figured out the reason why is because they had coupling between tests, and running them all together ran into problems. I worry about other people opening that door, because it’s damned hard to close again.

Containers alone don’t fix the “works on my machine” issue. CI should be isolated to provide proper gate keeping such that any change required for code to function on CI machine will also be automatically loaded into production and other developers machines.

In theory you could get something reasonably close to this working locally, but it’s a serial process so that’s pointless.

>will also be automatically loaded into production and other developers machines.

I’m assuming you are only submitting this as theoretical, because in the real world (where I’ve directly experienced this workflow) it’s a nightmare:

* Someone merges main which deletes working files in your feature branch. Automatic pushing/rebasing of main onto your feature branch creates needless work the moment you try to push upstream when git tries to ascertain state of local main.

* countless times I’ve hit bugs where if I try to merge “patch-a” from a common “feat-1” branch (meaning patch was cut from feat, not from main), but then main is updated by the auto-updater, I then have a messy working directory in which main’s new files are treated as unknown orphans and I have to spend time deleting these by hand.

I’m all for having my feat branch be up to date at merge time. But making it a rolling target hits (from my perspective) the boundary of what git can reasonably do, and creates more pain than any type of positive DX.

This is actual experience from multiple projects. Ideally starting from a blank machine/VM you should be able to run a single script that gets the latest, builds it locally, runs it successfully, and passes all tests.

Git may or may not be part of that process.

I'm going to be annoyingly pedantic here, apologies in advance. To me, what you're describing is an end state where the fetch/build + test is a state in which the developer is done with some task. But in my working reality, testing/building is a non-linear process that is iteratively run after some certain self-determined point-in-time.

>Ideally starting from a blank machine/VM you should be able to run a single script that gets the latest, builds it locally, runs it successfully, and passes all tests.

This isn't automatic, and is a practice that is fairly standard today (And one I agree with). You have placed a requirement/line in the sand that says, "Only when I have ideal state should this pipeline run in linear time and output the final result, which are the return codes from tests." Your example has the user initiating the update of the upstream main, not some other process that runs git fetch on the Developer's behalf.

Your original comment, as I understand it, is contradictory to this point:

>or code to function on CI machine will also be automatically loaded into production and other developers machines.

All of us (I think) agree that auto-deployment to production is a desirable goal. But we all (I think) know that broken commits are routinely delivered to production, where "production" represents the sum of all production environments in the world. So while we can have a reasonable assumption that "Production is, or should be deployable all the time," that doesn't mean the state that is represented by Production is safe to run locally in my environment, unless I *specifically* request it. Since git doesn't have file-locking, some other team/PM/developer can decide it's time for <MASSIVE REFACTOR> that blows away my work/branch mistakenly (Or maybe even intentionally, especially if I work in an org that is terrible with communication), creating unnecessary merge conflicts/mental load. This happens in short-lived and long-lived feature branches.

In no setup, do I think it's ever safe to take away the developer's agency and let some other process keep my local machine in "sync." There are so many variables to account for that some daemon/service can't be aware of, to allow for automatic updating (and again automatic updating != user running `git fetch`).

> But in my working reality, testing/building is a non-linear process that is iteratively run after some certain self-determined point-in-time.

That’s not actually involved here. The actual process of coding can take place on a separate standalone project or even a whiteboard. But somehow all that code and everything associated with it needs to be packaged up for the team or it’s never making it to production. Further your process needs to minimally interrupt other team members or team efficiency tanks.

How that’s done is up to the team but it needs to happen somehow and automation avoids headaches. I briefly worked on a project where we kept passing around updates to a VM, slow and bandwidth intensive but it did actually work.

> "Production is, or should be deployable all the time,"

That’s a separate question. I am saying the code should be bundled with any needed environment configuration required to run that code. CI is a direct test of the process.

> In no setup, do I think it's ever safe to take away the developer's agency and let some other process keep my local machine in "sync." There are so many variables to account for that some daemon/service can't be aware of, to allow for automatic updating (and again automatic updating != user running `git fetch`).

Capacity is not a requirement.

For developers it’s about being able to hit a big red button and get your local environment working rather than something you automatically do day to day. Onboarding, or coming back after 3 months on another project, etc shouldn’t involve someone trying to piece together all the little environment changes that people need to apply since the last time someone updated the onboarding document.

In practice you might be reading a diff of some script and just apply that manually. But at least the sharing process is automated.

It’s the same problem with tests in general. Developers think it’s all about getting a green build. It’s not. It’s about fixing red builds as fast as possible. I have a whole other rant about making zero build failures a metric.

A red local build probably means a red CI build, but a red CI build doesn’t necessarily mean a red local build. Now you’re fucked because there’s a build failure you can’t reproduce.

Reducing variance helps. Having the person who set up the CI system also copy edit the onboarding docs helps a lot with this.

It’s a matter of scale. As the team grows and particularly as you hit the steep part of the S curve of development, you’re going to have lots of builds and that 1:50 error is going to go from every two weeks to once a day.
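The arithmetic behind that, as a quick sketch. The build counts are assumptions chosen to match the numbers above: "every two weeks" is read as ten working days, which at a 1:50 spurious failure rate implies about 5 builds a day.

```python
def days_between_flaky_failures(one_in: int, builds_per_day: float) -> float:
    """Expected working days between spurious failures at a 1-in-N flake rate."""
    return one_in / builds_per_day


print(days_between_flaky_failures(50, 5))
print(days_between_flaky_failures(50, 50))
```

This prints `10.0` and then `1.0`: the failure rate never changed, only the number of builds did.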

Humans interpret every day as “all the time”. Some do this for every week, especially if it coincides with their most important commits. It’s not the ratio of failures that bothers people. It’s the frequency, and the clusters.

"What if it was possible to have a local build system that would run tests etc and provide a cryptographically secure hash proving that the tests passed"

On whose system? Yours? Mine? Someone else's? The purpose of CI is continuous integration done in a consistent manner, not relying on developer A, B or C's systems, which may vary greatly.

If your tests ran in a sandbox, for instance the Bazel sandbox or a Docker container, and this has the same configuration on CI and on the local machine, then it does not matter if the tests ran in CI or on a developer's machine.

This falls apart for things like integration tests, which may be too large/complex/interconnected to work on a local machine, but most of the time this would be more than sufficient.

This and the original article feel very much like solutions in search of problems. Beyond the issues with the original article, what's the point in a cryptographically secure hash from an untrusted environment? If someone were really determined to maliciously push broken code for some reason, they could just tamper with whatever generates the hash.

Local testing doesn't tend to work outside the small scale. Even if you use containers for everything you'll run into edges where a local development machine can't accurately test a production-like environment. Deploying to production without testing in that environment is just tragedy waiting to happen.

Distributed signing of artifacts is only effective if you've got fully reproducible builds. If you don't, because almost no one does because it is a huge effort, then all you have is attestation. If a broken/malicious artifact gets deployed the damage gets done and you only know who to blame afterwards.

Local tests are useful for not knowingly committing broken code insofar as your tests can determine. Outside of that a full test suite run by a beefy cluster with ready access to assets and network resources is better suited to test for deployment.

> provide a cryptographically secure hash proving that the tests passed

It’s unlikely that verifying such a hash is possible without rerunning the tests, and not at the same time enabling someone to trivially compute that hash without having run the tests in the first place.

There’s a lovely feature of TeamCity that I had to enforce on one exuberant but doubly sloppy programmer.

In trunk based development you can do a conditional commit, where TC runs your code and only pushes it to trunk if the build is green. This allows you to push something before lunch or a meeting for someone who is blocked without coming back to angry faces because you ding-dong-dashed.
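The shape of that gate is simple enough to sketch. This is not TeamCity's API, just the idea; `run_build` and `push_to_trunk` are hypothetical callables standing in for whatever the CI server actually does.

```python
from typing import Callable


def conditional_commit(run_build: Callable[[], bool],
                       push_to_trunk: Callable[[], None]) -> bool:
    """Push the change to trunk only if the build for it came back green."""
    if run_build():
        push_to_trunk()
        return True
    # Red build: trunk is untouched and the change stays with its author.
    return False


# Stand-ins for a real build and a real push.
pushed: list[str] = []
conditional_commit(lambda: True, lambda: pushed.append("green change"))
conditional_commit(lambda: False, lambda: pushed.append("red change"))

print(pushed)
```

Only `['green change']` ends up on the pretend trunk, so nobody is blocked by a red build while the author is away.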

Breaking trunk is fundamentally a problem of response time. If you’re not in the office you can’t fix a red build in a timely fashion. He did this regularly and got put under house arrest.

I started using it on myself so people didn’t have to wait two hours for me to get out of a meeting and fix their api bug.

What's the point? To reuse hardware for build workers? You can run a local Gitlab ci runner and get what you're saying, no?

That's just decentralized CI, aka CI with extra steps.

CI itself is just continually testing before continually merging. Doesn't matter where.

A previous employer used a passport analogy. Your code gets a visa (cryptographically signed) to run in a specific environment for a fixed time. You can renew your visa as long as all the assumptions remain true, but once they’re falsified you must create and submit a new build that makes the assumptions true again.

It's not enough to just validate the tests: you would also want a tamper-proof record of exactly what source code was actually built. Probably also the full environment. 99.99% of the time this will be irrelevant trivia, but that one time you need to know, there's no substitute.

In the Java world gradle can do this with the help of a shared build cache server. There is also Gradle Enterprise but I never looked deeper into it.

I like how you said using containers, similar to how CI servers tend to be configured.

Exactly this. CI gets rid of "works on my machine" syndrome.

If it doesn't build on CI, you broke it, fix it before merging.

Though, I'm still sour from my only experience in "tech" where "works on my machine" meant devs got to demand I fix the CI somehow.

If self delusion is our biggest sin, learned helplessness is probably number two.

A small but important part of my motivation to build better DevEx is that it puts a squeeze on the foot draggers. If you can’t blame the tools for being shit (and believe me, I know a lot of them are), then the only other explanations are that you are either too stupid to use them or just not a team player. An asshole.

Developers will pick being called an asshole over being seen as stupid any day, but the former eventually gets management involved. It’s the closest I’ll probably ever get to solving a social problem with tech. Make it easy for people to participate, make them feel silly for not, then ramp up the pressure if that doesn’t work, and when it’s clear they only care about themselves you have actionable evidence to involve HR.

I fail to see the point being made in the article. CI done right is the best thing that ever happened to me in terms of collaborative productivity.

Of course there are many pitfalls. Like with any tool it becomes trivial to use it wrong. Integrating less than once per day and deferring all testing and linting to the CI process is an anti-pattern and the issue is not the CI process but how the team chooses to use it.

For example I worked in a team of 10 and we were doing multiple production deploys per day. This was made possible by a great CI workflow. Everyone was running tests precommit and focusing on keeping the pipeline green. Yes I've seen the opposite as well, but that usually is a symptom of issues with the team not the general concept of CI.

Not saying that there isn't room for improvement but all the real-time collaboration features I used on low-code platforms feel like a step back.

> deferring all testing and linting to the CI process is an anti-pattern

I'm confused as to why this is an anti-pattern? My understanding is that the CI pipeline should run unit tests and linting for every commit. But at the same time, developers should run their tests before pushing code.

Emphasis on all :).

Of course the CI should always run them, but that should normally be as a confirmation/safeguard.

I've seen too many cases where the devs wouldn't even run the code locally. They would push it and expect the CI to do all the work. That's how you get shitty CI that is always broken.

That’s why you use branches though. You can break the CI on your own branch as much as you want, it’s nobody’s business. But a broken CI on a dev branch MUST prevent merging to a release branch.

If you allow devs to push directly on release branch, thus breaking the CI, you’re absolutely doing it wrong.

Well, I believe "absolutely doing it wrong" is a bit strong-worded.

Of course you can do it like you said but that means longer feedback loops in general. If the team wants to integrate more often and reduce feedback loops then that model evolves.

I'll give you an example. In the team I mentioned in my top comment we were initially using a branching model with master, releases/*, hotfixes/*, dev, features/*, which gradually evolved into master, dev, features/*, which finally ended up as master, features/*. With the important mention that for small changes/fixes that needed to get deployed quickly nobody would bother with a branch; they would just push to master.

This allowed us multiple production deploys per day per developer with no risk. That's why I said I don't get the point of the article: you can absolutely get those short feedback loops and continuous integration if you want them, you just need to set up the process that way.

> Of course you can do it like you said but that means longer feedback loops in general. If the team wants to integrate more often and reduce feedback loops then that model evolves.

Longer feedback loops =/= long feedback loops. You can definitely wait 5 to 10 minutes if it means doing it right.

> With the important mention that for small changes/fixes that needed to get deployed quickly nobody would bother with a branch they would just push to master.

From my experience, the 1-2 lines fixes are the ones that benefit the most from automated CI because you’re doing it in a rush. In my team just last week a junior dev asked us to review their PR quickly because it was just 2 lines, and it didn’t even compile. We told them to be more careful in the future, but in the end it didn’t impact anything. It couldn’t possibly have impacted anything thanks to CI, it just makes it impossible to fuck up too recklessly.

The TBD [0] crowd disagrees with you.

I agree with you though.

Not that I don't see the value proposed by TBD, but I think you can have >90% of said value and none of the downsides using a well thought out branching strategy.

[0] Trunk-Based Development: https://trunkbaseddevelopment.com/

TBD doesn't mean you have a red CI main branch. Of course it is always green on main. It means you have short-lived feature branches and rely on runtime checks for feature gating. A broken main will halt TBD. You are mischaracterizing what TBD is.

From the linked Website:

> Depending on the team size, and the rate of commits, short-lived feature branches are used for code-review and build checking (CI). [...] Very small teams may commit direct to the trunk.

Indeed. That seems to be a new-ish addition. I'm glad they now admit alternate approaches.

I'm in the "continuously improve the team's process to best suit its needs" crowd.

Sometimes TBD is the answer, sometimes you need something else.

What I did notice is that, with time, mature teams end up simplifying processes in order to reduce friction and increase output.

> I've seen too many cases where the devs wouldn't even run the code locally.

Lazy devs are going to be lazy no matter what processes their team uses.

From the perspective of the dev, if my local CI isn’t worth anything towards a merge and upstream CI is gospel why run locally? If I’m reasonably certain that the two jobs are duplicative in output it could be seen as wasted time, especially if I have a PM hounding for features. I don’t call that lazy, I call that a trade off. (coming from someone who is constantly running tests locally before pushing upstream)

Isn't that slower and less efficient? Usually the CI has to run a full build from scratch before it can run the tests, but locally for me it's going to be an incremental build that takes a second or two. I can also run the subset of tests that I know are affected by my changes to get fast and reasonably reliable feedback vs waiting for CI to run the whole test suite.

Linters and tests should pass before code is rebased onto master. YOLOing code onto master is an antipattern because if your code breaks the build then you are holding up everyone else from being able to make changes until yours gets reverted. If linters and tests are already passing there is a very good chance the rebase won't break anything.

You can go a step further. Instead of merging anything, you can tell the ci to merge it. Then ci can make a "merge-test" branch, run all the tests on it, and if they pass, then ff-merge it to master for real. No need for "good chance" or rebasing just to keep up with master.

It does take some extra work though, because GH and others don't really support this out of the box.

Gitlab does support this with “Merged Result pipelines”[0]. We use them extensively alongside their merge train functionality to sequentialize everything and it’s fantastic.

[0]: https://docs.gitlab.com/ee/ci/pipelines/merged_results_pipel...

Isn’t this the way everyone works? Write code, run some tests/linting locally, then create a PR to master and the CI server runs all the tests and reports pass/fail. Changes without a pass can’t be merged to master.

The extra step is that once the change is accepted the process of rebasing it or merging it into master is done by a bot that checks if it would break master before proceeding to do it.

My issue with this approach is that it becomes tricky to scale since you can only have one job running at a time. Allowing master to potentially break scales better because you can run a job for each commit on master which hasn't been evaluated. Technically you could make that approach work by instead of rebasing onto master rebasing onto the last commit that is being tested, but this adds extra complexity which I don't think standard tooling can easily handle.

> you can only have one job running at a time

https://zuul-ci.org/ and some other systems solve it by optimistic merges. If there's already a merge job running, the next one assumes that will succeed and tests on merged master + first change + itself. If anything breaks, the optimistic merges are dropped from the queue and everything starts from the second change only. Openstack uses it and it works pretty well if the merges typically don't fail.
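To make the optimistic queue idea concrete, here is a minimal sketch in Python. It is not Zuul's actual implementation: `process_queue`, `run_tests` and the change names are hypothetical, and the loop only models the final outcome. A real system runs the speculative jobs in parallel and restarts the ones queued behind a failure.

```python
from typing import Callable, List, Sequence, Tuple


def process_queue(
    master: Sequence[str],
    queue: Sequence[str],
    run_tests: Callable[[Sequence[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Return (new_master, rejected) after testing each queued change.

    Every change is tested on top of master plus all changes ahead of it
    that have not failed, which is the state an optimistic queue ends up
    testing once failed changes have been dropped.
    """
    accepted: List[str] = []
    rejected: List[str] = []
    for change in queue:
        candidate = list(master) + accepted + [change]
        if run_tests(candidate):
            accepted.append(change)
        else:
            # Drop the failing change; later changes are tested without it.
            rejected.append(change)
    return list(master) + accepted, rejected


if __name__ == "__main__":
    # Hypothetical test oracle: any state containing change "B" is broken.
    def broken_by_b(state: Sequence[str]) -> bool:
        return "B" not in state

    new_master, rejected = process_queue(["m0"], ["A", "B", "C"], broken_by_b)
    print(new_master)  # ['m0', 'A', 'C']
    print(rejected)    # ['B']
```

The payoff of the real thing is that A, A+B and A+B+C can all be tested at once, so the queue only costs extra runs when something actually fails.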

This can result in a broken master if there were new commits added to master since the pull request was submitted (but before it is merged).

The solution is to require that all PRs be rebased/synced to master before they can be merged. GitHub has an option for enforcing this. The downside is that this often results in lots of re-running of tests.

I think the point is about far shorter feedback loops than possible with today's pervasive tech.

> something even more continuous than continuous integration. It might look like real-time integration of your changes, collaborative editing, or smarter build systems that produce near-live feedback

e.g. https://medium.com/darklang/how-dark-deploys-code-in-50ms-77...

Well, how short the feedback loop needs to be is something that can be controlled by the process, and the team sets the process.

I can have the normal flow, which takes minutes and runs all the tests and other stuff I might have on my pipeline (security, performance, etc). I use this for typical day to day work on features. I don't care if the deploy takes 50ms or 5 minutes.

I can have a fast-track for critical production patches. I skip all the main CI steps and just get my code quickly in production. If done right this takes seconds.

I didn't know about dark but I've seen this type of promise too many times, so I'm pretty sure there are many tradeoffs hidden under the nice shiny exterior. I can't know until I try it, but that kind of complexity doesn't all just disappear, it always has a cost even if it's out of sight.

Here's the thing: we have hundreds of thousands of teams and organizations who currently each need to select and define those processes.

Perhaps some of those teams and individuals are experts at determining the correct, minimal continuous integration suite to run on a per-commit basis to minimize time and energy expenditure without compromising correctness.

But I can guarantee that not all (in fact, not many) are, and that they pay maintenance and mental overheads to adhere to those practices.

It feels to me like there is a potentially massive opportunity to design an integrated language-and-infrastructure environment that only re-runs necessary checks based on each code diff.

- "Altered a single-line docstring in a Python file? OK, the data-flow from that infers a rebuild of one of our documentation pages is required, let's do that... done in 2ms"

- "Refactored the variable names and formatting within a C++ class without affecting any of the logic or ABI? OK, binary compatibility and test compatibility verified as unchanged, that's a no-op... done in 0ms"

- "Renamed the API spec so that the 'Referer' header is renamed 'Referrer'? OK, that's going to invalidate approximately a million downstream server and client implementations, would you like to proceed?"

(examples are arbitrary and should imply no specific limitations or characteristics of languages or protocols)

Doing this effectively would require fairly tight coupling between the syntax of the language, ability to analyze dataflows relating to variables and function calls, cross-package dependency management, and perhaps other factors.

Those properties can be achieved during design of a programming language, or they can iteratively be retrofitted into existing languages (with varying levels of difficulty).

Bazel[1] attempts to achieve much of this, although to my understanding it offloads a lot of the work of determining (re)build requirements onto the developer - perhaps a necessary migration phase until more languages and environments provide formats that have self-evident out-of-date status and dependency graphs.
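As a rough illustration of "only re-run what the diff can affect", here is a minimal Python sketch. The `DEPS` graph and target names are made up, and this is not how Bazel is implemented; it only shows the core idea of walking reverse dependencies from the changed targets.

```python
from collections import deque
from typing import Dict, Iterable, Mapping, Set

# Hypothetical dependency graph: target -> targets it depends on.
DEPS: Dict[str, Set[str]] = {
    "lib": set(),
    "docs": {"lib"},
    "api": {"lib"},
    "api_test": {"api"},
    "cli": set(),
    "cli_test": {"cli"},
}


def affected_targets(deps: Mapping[str, Set[str]], changed: Iterable[str]) -> Set[str]:
    """Return the changed targets plus everything that transitively depends on them."""
    # Invert the edges so we can walk from a target to its dependants.
    dependants: Dict[str, Set[str]] = {target: set() for target in deps}
    for target, needs in deps.items():
        for dep in needs:
            dependants.setdefault(dep, set()).add(target)

    seen: Set[str] = set(changed)
    todo = deque(seen)
    while todo:
        current = todo.popleft()
        for dependant in dependants.get(current, ()):
            if dependant not in seen:
                seen.add(dependant)
                todo.append(dependant)
    return seen


if __name__ == "__main__":
    print(sorted(affected_targets(DEPS, ["cli"])))  # ['cli', 'cli_test']
    print(sorted(affected_targets(DEPS, ["lib"])))  # ['api', 'api_test', 'docs', 'lib']
```

The hard part in practice is not the traversal but getting a trustworthy graph in the first place, which is exactly the work that gets offloaded onto the developer today.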

We'll get there and the (sometimes uneasy) jokes will be about how often the software industry used to re-run huge continuous integration pipelines.

[1] - https://en.wikipedia.org/wiki/Bazel_(software)

Honestly diversity is a good thing. It's good that different people are trying to solve problems at the heart of software in different ways as it creates new perspectives and possibilities. Now, after reading these replies I'm really curious about stuff like darklang and Bazel.

What happens is that people don't understand that most of these higher level solutions are very leaky abstractions, they aren't the silver bullet marketed in medium articles. Yes, they can save you a lot of time and headaches in specific scenarios but when you encounter one of the leaks it can take you weeks to get to the bottom of it.

If the team doesn't understand what problems the tool is solving and whether they have that problem, then they might just be cargo-culting. An example of this is kubernetes. I know teams that used kubernetes just because everybody else is using it; they didn't actually need it for their monolithic java spring app. They think they avoided accidental complexity by using kubernetes but in fact they just added accidental complexity to their project. And then they add a new tool that makes it easier to manage the complexity of kubernetes, and so on.

Anyway, I'm probably just a rambling fool and I should appreciate that all this generating and shifting around of accidental complexity will actually mean future job safety for guys like us.

Yes exactly, I was also thinking about how darklang does it. The feedback loop can be much shorter, and I'm sure it's not just limited to how darklang does it either.

It seems like a core argument is the pre-commit tests that run as commit hooks on the developer's computer.

I have worked in a place where they did that, and I think the cons heavily outweighed the pros. I can not push incomplete work to a remote, I can not separate feature development and chores (eg. linting) because I _need_ to fix stuff in order to commit and push, etc.

> Continuous Integration is a model thats tightly tied to how we run version control.

I would say that a pre-commit testing world is much tighter. CI, as many know it, is a passive observer. When stuff fails, it will let you know. You can still merge.

One thing that would be nice, however, would be the ability to run the entire pipeline locally. For GitHub actions, it indeed seems like there are viable options to do that.

> One thing that would be nice, however, would be the ability to run the entire pipeline locally. For GitHub actions, it indeed seems like there are viable options to do that.

I prefer not having to replicate locally how GitHub runs GitHub actions, but rather just make my GitHub actions run something that I know I can run locally. So all the complicated stuff is contained in the script that I can run locally, and GitHub actions is just a dumb “runner” of this script.

For my local script I prefer using Nix, since it’s the best way I know to guarantee a consistent environment between different machines.

That's where I use Makefiles. For GitLab CI, for example, you can have the likes of `make $CI_JOB_NAME` so your CI config can be very short and dry, just a bunch of jobs named after Makefile targets.
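The same "dumb runner" pattern can be sketched as a small Python entry point instead of a Makefile. Everything here is an assumption for illustration: the file name `ci.py`, the `JOBS` table and the commands in it (`compileall` is only a stand-in for a real linter). `CI_JOB_NAME` is the variable GitLab sets for each job.

```python
import os
import subprocess
import sys
from typing import Dict, List, Mapping, Optional, Sequence

# Hypothetical job table: job name -> command to run. Swap in whatever
# your project actually uses for linting and testing.
JOBS: Dict[str, List[str]] = {
    "lint": [sys.executable, "-m", "compileall", "-q", "."],
    "test": [sys.executable, "-m", "unittest", "discover"],
}


def resolve_job(argv: Sequence[str], env: Mapping[str, str]) -> Optional[str]:
    """Prefer an explicit command-line argument, else fall back to CI_JOB_NAME."""
    if len(argv) > 1:
        return argv[1]
    return env.get("CI_JOB_NAME")


def main(argv: Sequence[str], env: Mapping[str, str]) -> int:
    job = resolve_job(argv, env)
    if job is None:
        print("available jobs:", ", ".join(sorted(JOBS)))
        return 0
    if job not in JOBS:
        print(f"unknown job: {job}", file=sys.stderr)
        return 2
    # Same command locally and in CI; the exit code decides pass or fail.
    return subprocess.call(JOBS[job])


if __name__ == "__main__":
    exit_code = main(sys.argv, os.environ)
    if exit_code:
        sys.exit(exit_code)
```

Locally you would run `python ci.py lint`; in CI every job runs plain `python ci.py` and the job name picks the command, so the CI config stays a list of job names.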

this is the way.

> One thing that would be nice, however, would be the ability to run the entire pipeline locally.

This cost me many hours of waiting for the Gitlab CI runner when debugging non-trivial pipelines, when the issue was something that did not have to do with the script steps inside of the jobs but rather how the Gitlab runner handled things.

I've found gitlab-ci-local [1] which actually does run the Gitlab pipeline locally, although I had to write some boilerplate scripts to set up all the necessary 'CI_FOO_SOMETHING' environment variables before running the tool. (Which sometimes came back to bite me because the issue was actually in the content of some of those environment variables). It's still a very good tool.

[1] https://github.com/firecow/gitlab-ci-local

Thanks for the link

Side note: makes me sad when even open-source tools for Gitlab are hosted... on Github...

git has cli flags that will happily let you bypass pre commit/push checks so it's not even a reliable way to know that tests actually got run. Central CI gives you a traceable test history of every change and keeps us all honest.

This might break the team's assumption that these things pass. After all, centralized CI has, in this narrative, been exchanged for the assumption that everything pushed to git passes all tests.

> pre-commit tests that run as commit hooks

Also hate those.

What I do instead is have the tests run as part of the build. If the tests fail, the build fails.

Definitely a good motivator for having fast tests, which is essential. The ones for my projects tend to run in about a second or two.

Also gives you way greater confidence in your tests.

Try it!



We use nektos act to run pipelines locally. Works OK, but every time something fails locally you are left wondering if it's act or your pipeline, and some features (like reusable workflows) are not implemented yet.

Maybe the right thing is to just write the workflows in something else, and have the github workflow file be a single call to your script. But:

- It would be nice to be able to use github actions others have made (or libraries/abstractions in general)

- I don't see how to easily get parallel execution.

- I love github environments, how do I pass down all my environments and their variables?

Does it work 100% offline if the docker images have been cached?

"One thing that would be nice, however, would be the ability to run the entire pipeline locally."

I'm working on something to do just this. Although I've redefined what "local" means in this context. I'm still using remote servers but everything is happening pre-commit from the terminal. If you are interested check out this demo and let me know if you have any feedback https://brisktest.com/demos

I do feel like if we’re going the precommit route we need to go much deeper into things like Bazel. Loads of companies are using CI to get a huge test suite to run with a lot of parallelism for a reason!

Though for me it’s also like… so many people do unspeakable horrors to their machine setups that I like there being a consistent runner in CI.

But “CI is to run tests” in the world of CD is a bit of a simplification anyways.

> Loads of companies are using CI to get a huge test suite to run with a lot of parallelism for a reason!

Tho technically you can probably have a local runner which either distributes the test suite across a bunch of available machines (à la pytest-xdist), or one which goes and creates jobs on the CI platform without needing to go through the pretest ceremony (e.g. creating commits, branches, CRs, ...)
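One simple way to split a suite across machines is static sharding by hash, sketched below in Python. This is a different and cruder technique than pytest-xdist's own scheduling; `shard_for`, `select_tests` and the test ids are hypothetical names for illustration.

```python
import hashlib
from typing import List, Sequence


def shard_for(test_id: str, total_shards: int) -> int:
    """Map a test id to a shard index that is stable across machines and runs."""
    # Use a real hash rather than built-in hash(), which is randomized per process.
    digest = hashlib.sha256(test_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % total_shards


def select_tests(test_ids: Sequence[str], shard_index: int, total_shards: int) -> List[str]:
    """Return the tests that the machine with this shard index should run."""
    if not 0 <= shard_index < total_shards:
        raise ValueError("shard_index must be in range(total_shards)")
    return [t for t in test_ids if shard_for(t, total_shards) == shard_index]


if __name__ == "__main__":
    tests = [f"tests/test_mod{i}.py::test_case" for i in range(10)]
    for index in range(3):
        print(index, select_tests(tests, index, 3))
```

Each machine only needs to know its own index and the shard count, so no coordinator is required. The trade-off is that shards are balanced by test count, not by test duration.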

Right, bazel lets you do that (it has no notion of version control state), and ships it off remotely if you so choose to have the system work like that.

I stand by this idea that if someone made Bazel, but without all the obtuseness, with strong integration to a hosted CI service, they would quickly overtake a lot of services.

It sounds like a problem with the policy, not the tool. Pre-commit hooks are a useful tool but there should be a way to skip them if you really need it.

Pre-commit hooks save me time by not allowing me to push code which is guaranteed to break later in CI.

I `export HUSKY=0` in all my shells because pre-commit hooks are stupid.

Thanks for that. I agree that pre-commit hooks enforce one way of working that just wastes a bunch of my time.

I think pre-commits are useful, but the prediction is basically "trust the client".

What if the client is compromised? Are we throwing away reviews? Should the reviewer re-run all the tests?

> But all technology evolves, and git is already showing its weaknesses. Fork-and-pull isn't the final form of development workflows.

I kind of get what this is saying, but technology evolution doesn't have to mean completely replacing said technology with something else.

I think that's one weird thing about the software field, whereby we keep moving to these shiny new things that we think are better than the tools of yesteryear, yet in the end there is only a marginal gain in productivity.

Fork and pull is an incredibly productive and powerful workflow. CI is incredibly, incredibly useful. If these things were not the case, then neither of these would be even discussed by this article. There is a reason for their success - and it's not because GitHub is the most ubiquitous code hosting service out there. Git is _actually_ pretty great. CI is _actually_ very useful and has secured codebases for decades at this point.

So if one were to proclaim the "End of CI" I really need to see a viable alternative that addresses the same problems as CI and significantly improves upon it. An incremental improvement is not enough to shift and rewrite everything - there needs to be a significant jump in ability, productivity, security, or something else in order for me (and I imagine many others) to consider it.

> I think that's one weird thing about the software field, whereby we keep moving to these shiny new things that we think are better than the tools of yesteryear, yet in the end there is only a marginal gain in productivity.

"Thought leaders" need to be constantly talking up the next new thing so that they can stay ahead of the herd on social media.

Developers get bored, or worry that their career is stagnating, if they're not using the new shiny. Particularly if they pay attention to the thought leaders, or they're stuck building unglamorous crud apps.

And the software industry generally has a poor collective memory of tools and practices and experience from even the recent past. Contrast with more mature engineering disciplines, or architecture, medicine, etc. I'm not sure why this is.

I do think there's a bit of a visibility bias here, though: plenty of us know that there's money and stability to be had with deeply competent knowledge of the "old" stuff. We just don't make weekly posts on $MEDIA about it.

> I think that's one weird thing about the software field, whereby we keep moving to these shiny new things that we think are better than the tools of yesteryear, yet in the end there is only a marginal gain in productivity.

Agreed. But I'd take it a step further and say:

...yet in the end no one is any happier. Not engineers. Not management. Not leadership. And most importantly not the users of the product.

We keep building and delivering more. But how often is that better? Either as an end to end experience or simply fewer bugs? Most of us - who are at some point users of products we didn't build - have resigned ourselves to the fact (read: it's been normalized) that an uncomfortable amount of friction is a given. That's sad.

There’s always problems and always friction. The new shiny doesn’t always help. But I do feel happier when I am able to move to the next level of problem. To spend my time wrestling with more meaningful challenges and less trivialities. Better tooling can help with that, although to make it a reality your org needs to see “increase quality” as the first step in “delivering more”

To restate my point with brevity: More efficient doesn't mean more effective.

FFS, look at GitHub. Shows your commits. They could be a high percentage of shite yet we drool over a saturated commit graph.

> Either as an end to end experience or simply fewer bugs?

I'd say we build bigger things with the same size/bug ratio. Therefore, "Small is beautiful".

Maybe. But users have no sense of size. Nor do they care.

I am forever fielding questions / issues from my (retirement age+) parents. They are my benchmark for usability. From being able to open jars of food, to a website bug, to ambiguous UXs.

Until these questions are reduced - and they've been steady for many years now - then I'll presume we, the makers, are failing.

> Nor do they care.

That's a misunderstanding about size. It is not about saving some kilobytes - indeed we have more than enough RAM/ROM/flash/mass storage in general, although it is a bit less true currently due to shortages - but it is a hint that the whole thing is better in other areas as well.

Except for the cases where a speed/size trade-off has been made of course. But even when it is a features/size trade-off, it is not always a bad deal, because software often has lots of features you don't care about, but could nevertheless introduce bugs in the features you do use.

What the author should have made more clear is that we’re starting to see technology that eliminates the “local developer environment” completely, and we’re also starting to see collaborative real-time code editors. Taken together, we can imagine a world where a dev points their browser at a codebase hosted somewhere and just starts editing. Each edit is a change event that the system reacts to by running compilers, then unit tests, etc. and provides near real-time feedback that the code works/doesn’t work, right in the editor. In this world the developer does not push their changes, the system does that for them, while they’re working. The CI server disappears from view completely.

That's an interesting take that I didn't think of. I think the burden of having a cloud-based development environment on ops teams will be non-zero though - some set of individuals (or a dedicated team) will have to maintain these very beefy hosts of your code that run the tests for you.

That said, I'm not against the idea in principle if access to the internet is guaranteed and constant. In some places that I'm in, internet access is shoddy or just too slow for this kind of thing, and my preference there would be to work easily on a local machine without access to the internet.

> I kind of get what this is saying, but technology evolution doesn't have to mean completely replacing said technology with something else.

Technology trends, and historical trends in general, are rarely broken: they change, but there are no gaps, in a sense. I don't remember who said it; someone in defense R&D, if I recall correctly.

"Fork-and-pull isn't the final form of development workflows."

Actually, like the wheel, I think it is the final form. When artificially sentient machines are writing code for us and someone or some thing pollutes or corrupts a critical model, a human will need to step in and fork where things were working to fix it.

What's that old saying? Don't remove a fence unless you really know why it's there.

CI isn't just a remote compute scheduler. It's also the place that contains tooling, tooling which repeated across a dev team creates n * jobs replicas of that tooling - which run on a local machine would be fairly wasteful. It's also centralized because CI is one of the biggest attack vectors an organization runs; it contains credentials for outside services, credentials for deployment (which, if you're SOX-bound cannot exist on a dev machine), and if you're cryptographically signing binaries and doing reproducible builds it's got those keys too. n * developer laptops makes that vector a planet-sized blot. A lot of the policies I imagine dealing with that would be limiting developer access to their machines, limiting tooling to only approved or internal tooling.

It'd be a nightmare.

Some of this is good, and some of it I feel makes some fundamental misunderstandings. The good: "tight feedback loops are really important". Absolutely. Getting feedback about a change you (as a developer) are making is key, and the longer that takes the more time is spent waiting which is annoying and concentration breaking. This is, in my experience, usually tied to the implementation of tests (read: not bloated and useless) and not really a core tenet of CI.

However, things like:

> [...] git is already showing its weaknesses. Fork-and-pull isn't the final form of development workflows.

show that the author is perhaps conflating GitHub with Git.

I disagree with the POV of the article. That said, I do work at Spacelift[0], a specialized CI/CD for Infra as Code (so you can either take this with a grain of salt, or more trust, due to the increased exposure).

> First, CI rarely knows what has changed, so everything needs to be rebuilt and retested.

Yes, this is an issue with some CI/CD systems, but one you can solve. See our Push Policies[1] based on Open Policy Agent. Test-case caching is also sometimes available (e.g. in Go).
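For readers who haven't seen test result caching, here is a minimal Python sketch of the idea: key the result on a hash of the inputs and skip the run when nothing changed. This is not how Go or any particular CI system implements it; `fingerprint`, `ResultCache` and the in-memory store are assumptions for illustration.

```python
import hashlib
from typing import Callable, Mapping, Set, Tuple


def fingerprint(files: Mapping[str, bytes]) -> str:
    """Hash file names and contents into one key; any edit changes the key."""
    outer = hashlib.sha256()
    for name in sorted(files):
        outer.update(hashlib.sha256(name.encode("utf-8")).digest())
        outer.update(hashlib.sha256(files[name]).digest())
    return outer.hexdigest()


class ResultCache:
    """Remembers which input fingerprints have already passed."""

    def __init__(self) -> None:
        self._passed: Set[str] = set()

    def check(
        self,
        files: Mapping[str, bytes],
        run_tests: Callable[[Mapping[str, bytes]], bool],
    ) -> Tuple[bool, bool]:
        """Return (passed, served_from_cache)."""
        key = fingerprint(files)
        if key in self._passed:
            return True, True
        passed = run_tests(files)
        if passed:
            # Only passing results are cached, so failures are always re-run.
            self._passed.add(key)
        return passed, False


if __name__ == "__main__":
    cache = ResultCache()
    sources = {"a.py": b"x = 1\n"}
    print(cache.check(sources, lambda files: True))  # (True, False)
    print(cache.check(sources, lambda files: True))  # (True, True)
```

This only works if the fingerprint covers everything the tests can observe (sources, dependencies, environment), which is where real implementations spend most of their effort.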

> Running tests pre-commit is one answer.

Running tests locally for a quick feedback loop - sure, that's fairly mainstream though (something you can use our local preview[2] for as well in the case of IaC). Running tests locally before directly deploying to prod - that would be a compliance and security nightmare.

The author presents what looks to me like a very "move fast, break things" attitude, that doesn't really work in a whole lot of cases.

If your CI/CD is slowing you down, make it faster. Easier said than done, I know, but a lot of people don't even think about optimizing their CI/CD, which you often can do, by being smarter about what you run and/or parallelizing extensively and/or caching[3].





This take doesn’t make any sense. What is the author suggesting as a replacement, hooking your IDE up to prod via FTP and editing PHP files directly on the live server? Sure I’ve done this, as a solo developer on smaller projects, but this doesn’t scale even to a team of two or three, and it goes without saying I’m not relying on tests in this scenario and I have basically no ability to roll back.

“CI is frustrating!!” I hear ya, but this article does nothing to clarify a viable alternative.

I think what the author is meaning is the "CI" tests run on your local machine before you can commit to the repo?

I don't understand why the distinction between pre and post commit really matters. You're building your feature on a branch, you can and should commit as early as you can. Personally I want to get my work off of just my hard drive asap. If it breaks it's no biggie. If you're worried about an ugly commit log you can squash.

Gitlab even has a great feature where it will run your CI on the post merged result of a MR. It really is the best of all worlds.

I don't either. The author is also missing that CI servers often have much better hardware/networking than the developer's laptop. Even if they don't, it's not crunching away running a CPU and IO intensive test suite on their device for minutes.

A good example of this are tests which hit the DB hard, if you are in a different location to the test database servers, the latency between the app server and the db can absolutely murder performance.

I can sort of see the problem if you are often seeing CI fails, which require tedious 'fix ci, fix CI 2, fix CI final, fix CI final FINAL' commits taking ages to test and see if it works. But really that's a different problem.

A poor title; I'd say "The current CI is inadequate, we need to begin with a new CI".

Why is the current CI slow and resource-heavy?

The language(s) are inadequate: every function can affect another function or data; every function / method may have arbitrary side effects. Even if not so, the language does not export the dependency graph.

The build system is inadequate: because in the source code everything potentially affects everything else, a lot more needs to be rebuilt to be sure than would be strictly necessary for the logic of the change. Even if not so, the build system does not export the dependency graph and the scope of the actual changes.

The tests end up inadequate: even if they are written competently, they cannot trust any info about the scope of the changes, so they need to test way more than logically required to be sure, or, worse, they actually need to re-test everything. Also, due to the way other systems (databases, etc) are represented in the test setup, they are hard to run in parallel.

Microservices were invented partly as a way to fight this: they make very clear bounds between independent parts of the system, and strive to keep the scope of every part small. Their small scale makes the problems with the CI tolerable.

What we need is better tools, better static analysis and compartmentalization of the code, more narrow waists [1] in our systems that allow us to isolate parts and test them separately.

[1]: https://www.swyx.io/narrow-waists

My ideal currently looks like:

- trunk-based workflow. Small commits. No feature branches as a rule to be occasionally broken

- unit tests move left - run pre-commit (not necessarily run in CI). Emphasis placed on them as a refactoring tool.

- a critical, small suite of e2e and integration tests that block the pipeline before publication (fast feedback)

- latest build publication being constantly redeployed to production, even if no changes have taken place to exercise the deployment routines

- a larger suite of e2e and integration tests being constantly run against production, to give feedback when something isn't quite right, but it's not a disaster (slow, non-blocking feedback).

In summary, emphasise getting code into production, minimise blocking tests to critical ones, test in production & notify when features are broken.


- Engineers spend too much time in test environments that give the illusion of the real thing. They lose touch with production as the Dev cycle increases in circumference.

- Enabling tighter feedback cycles by accepting that some features are important and some are not helps put the cost of change into perspective for the entire product team.

- Engineers get used to working in and around production on a daily basis. Production operations and observation of systems (& users) are emphasised - protective bubbles for writing code are minimised.

You're not trying to maximise code output, you're trying to maximise the velocity of safe change, and you do that by understanding the environment (production) through your intelligence (observability of systems and user behaviour), so that you can employ your force (changes, code, features) rapidly and effectively, whilst maintaining the ability to quickly deal with unexpected problems (defects) along the way.

Disclaimer: might not be possible for your specific theatre of war for any number of reasons.

Why bother running a test pipeline at all? Why not have two production environments running in red/green, and always just run your e2e tests, then flip?

That's a viable variation on the theme - but there may be some critical sanity tests you want to run before you let code near a production environment - depends on your appetite for risk of course.

I like CI tools that run locally on a developer machine and run end to end tests with all Microservices locally. The secret is integrating early and often.

That's why I wrote mazzle and platform-up.

Mazzle is a run server that is configured as a Graphviz file (.dot) and defines an end to end environment. Here's an example graph file of a pipeline:


It's largely a prototype of an idea. It's infrastructure and pipelines as code but incorporates every tool you use from terraform, chef, ansible, puppet, Kubernetes, packer and shell scripts. My example code spins up a Consul and Kubernetes cluster with hashicorp vault and a Debian repository server, configured SSH keys on every worker, bastion and Java application. And Prometheus exporter and grafana. I haven't got around to adding ELK yet. But it didn't take long to do all these things due to Mazzle meaning it's very easy to test complicated changes together.

https://devops-pipeline.com - Mazzle

Platform up is a pattern for local development that tests your Microservices locally all together. You use vagrant-lxc and ansible together to deploy everything locally. So you can test your secret management locally and deployment process. If your ELK stack is ansible driven you can even run your ELK stack locally as I did on a client project. https://GitHub.com/samsquire/platform-up

First of all, it should be a fundamental principle of practical software engineering to be able to execute every test on the developer's machine.

Software development that relies on "special" environments is prone to break down sooner rather than later. If you cannot execute your tests on a fresh machine after half an hour of setup or so, something is fundamentally broken in your toolset.

In turn, this requirement means that in a well-designed development environment your integration frequency is only limited by the product of integration points (e.g., your branch and "develop", or your branch and n other ongoing branches) and the time it takes to run the tests.

"Most" and "everybody" on HN often means "we at the few megacorps around".

Let me assume instead that most software projects happen in the long tail of small companies, low LOC numbers, low developer head counts (even one or zero - consultants working only when needed.) In that world I saw deployments run when tests pass on one PC, deployed from that PC. In the case of scripting languages even with a git pull on the server and a restart. That works surprisingly well until the team gets larger. Then customers often resist the extra costs and burden of CI. Then some of them cave in after something bad happens.

I think these tiny or zero teams benefit even more from CI. As people come and go the CI process remains and people can use it without any local setup. Use GitHub Actions to avoid maintaining yet another thing (assuming that’s where the source code lives).

> CI is a post-commit workflow. How often have you or a coworker pushed an empty commit to "re-trigger" the CI?

Build can be triggered manually in every CI system I know. Why would I push empty commits?

Many integrations between CIs and git frontends (as in showing green checkmarks under your pull request) only work properly if you push. Rebuilding manually in the CI will run the build, but it will no longer be recognized as related to a PR

The only integration I've used like that is GitHub workflows, and only because the project owner won't turn on the feature to allow manual re-runs. Fortunately transient failures are rare on that project, but I still find it rather odd.

Because there are plenty of people working with systems you don’t know then.

I've had caching problems when triggering manually resolve on a new commit.

Maybe this will happen, but I really doubt it will come anytime soon, especially since the original tweet mentions that local, laptop-based testing is enough, which is laughably ridiculous in terms of resources, time and coverage for anyone who's worked on large, real-life, many-year, bloated projects.

: which is pretty much every project that survived the POC phase

I feel as though this article comes from a place of very subjective personal experience, perhaps one which is painful. I've seen people avoid git because merge conflicts are hard to get their head around, afraid of the terminal because GUIs are what they're used to. CI/CD... CX, when done right, allows for automating a lot of work and allowing you to become much more efficient while documenting build, integration, and delivery workflows in code. If Mr Rickard doesn't wish to see Jenkins in 2025 (three years away, so not exactly far in the future), how does he expect to be building & deploying 100s of services?

Not seeing jenkins does not mean CI/CD does not exist. User experience of CI/CD might be something similar to what you get in vercel or netlify where build, deploy and hosting is taken care of without any complicated yaml based pipeline definition, configuration management and learning curve.

I’m hopeful that https://dagger.io will help CI “shift left”! Especially the ability/easiness to run it locally, and the built in caching/DAG

"I, for one, hope that we aren't using Jenkins in 2025"

Or at least, if we are, it's a far more robust and capable system that's actually really designed for dealing with all the different things that people need to do while building and testing software.

But surely most IDEs these days make it simple to run all unit tests locally? My usual M.O. is to manually run the tests locally that I believe will demonstrate my changes are "working" and rely on the full suite of CI unit tests to check I haven't broken anything else. What other options are there?

I've moved to develop most of the time on a (time-share) beefy machine on AWS, so my "locally" is always a pretty strong machine.

While I develop I start a `watch` process that keeps running the tests. The watch runs a container with the docker image of the CI, mounting only my development working directory.

With this setup there's no added value to the CI

Well there are plenty of possible projects for which CI has no value, I don't think that's a controversial statement. But anything of any complexity with a decent size team (where a full CI build is likely to take 20 min+), then I can't imagine being convinced it wasn't needed.

> I've moved to develop most of the time on a (time-share) beefy machine on AWS, so my "locally" is always a pretty strong machine.

We looked at doing this, but it seemed to be pretty expensive. Like, "it was cheaper to upgrade a developer's machine twice a year" expensive.

Is that what you see, or was it just set up wrong on our side?

Fundamentally, you can't trust what worked on a developer's machine and you need a centralized system (the CI) to validate your build. So if the author is right, this will likely come from development taking place closer to the CI rather than the CI taking place closer to development.

I suspect a lot of the author's frustration comes from working inside a particular bubble for a long time and perhaps genuinely not realising that plenty of software developers are working in other ways already and always have been.

Not everything has to be a massively scalable cloud-native web app with a dependency tree 27 levels deep and DevOps turned up to 11. Not everyone has jumped on the merge-to-trunk-and-deploy-every-five-seconds process train.

Plenty of developers still work with modular systems in teams that each focus on a particular part of those systems. They follow simple feature-based development processes. They have local development environments that provide near-instant feedback on their own workstations.

A lot of the problems mentioned in the article simply don't exist in that kind of environment. It would never even occur to those developers that they might need to push work in progress to trunk just to be able to test it properly or that their team might have significant merge conflicts so often during normal development that they'd need to change their entire process to reduce the frequency.

It's like Scrum. Some people like it and some people don't. But the people who don't like the rigid time boxes and ceremonial meetings probably aren't going to join organisations that have those things. Sometimes people in those organisations then start to think that their way is the only way because they've been doing it for so long and never seem to hire anyone with different ideas.

If there are 5 people framing a house, you can't have all 5 of them doing 5 different things on the same section of roof at the same time. You have to put down the beams, joists, eaves, rafters, battens, purlins, collar ties, wind braces, sheeting, flashing, etc, one on top of the other. So to save time, you have people prep their work before they begin, schedule deliveries to arrive at exactly the right time, work in pairs, etc.

The point of CI is the same: to batch your work into stages and apply it just when it is needed. Merging your work incrementally ensures putting in your pieces doesn't stop everybody else from putting in theirs or making them re-do stuff. So CI isn't going away.

What we should do is make the process more seamless. For example, IDEs should not be editing local files, they should be committing directly to a VCS branch, which should be immediately rebuilding an app, which should immediately redeploy in a cloud environment dedicated to an engineer. We can make it as fast as local development (it's not like your local computer is faster than a giant server in the cloud).

If you're thinking "but what about offline work?", we literally have 250Mbps satellite internet worldwide now. If you really can't get a stable internet connection, you can keep a local server to do development in. But the local server software must be 100% identical to what's in the cloud. Towards this end, we must build fully-integrated, fully-enclosed, fully-self-hosted CI environments. DVCS will seamlessly link the local and remote environments.

Pre commit tests assume everyone properly installs and runs them. Ci forces a standard onto everyone.

It's two different worlds.

It wouldn't need to be pre commit but pre push. Integrate vc with test runner and don't allow a push to certain remotes without an associated green test suite.
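A minimal sketch of that pre-push idea, assuming the test runner records the SHA of every commit whose suite passed. Git does feed a pre-push hook one line per ref on stdin in the form `<local ref> <local sha> <remote ref> <remote sha>`; the record file `.git/green-commits` and the protected ref list are hypothetical names chosen for illustration.

```python
import sys

# Hypothetical file the test runner appends to after a green run.
GREEN_FILE = ".git/green-commits"
# Assumed set of remote refs that require a green test suite.
PROTECTED = {"refs/heads/main"}


def load_green(path=GREEN_FILE):
    """Read the recorded green commit SHAs (empty set if no record exists)."""
    try:
        with open(path) as f:
            return {line.strip() for line in f if line.strip()}
    except FileNotFoundError:
        return set()


def blocked_pushes(stdin_lines, green_shas, protected=PROTECTED):
    """Return (remote_ref, local_sha) pairs that should be rejected."""
    blocked = []
    for line in stdin_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # ignore anything that is not a ref update line
        _local_ref, local_sha, remote_ref, _remote_sha = parts
        if set(local_sha) == {"0"}:
            continue  # an all-zero local SHA means a ref deletion
        if remote_ref in protected and local_sha not in green_shas:
            blocked.append((remote_ref, local_sha))
    return blocked


def main(stdin_lines, green_shas=None):
    """Return the hook's exit status: 0 allows the push, 1 rejects it."""
    if green_shas is None:
        green_shas = load_green()
    bad = blocked_pushes(stdin_lines, green_shas)
    for ref, sha in bad:
        print(f"refusing push to {ref}: no green test run for {sha}", file=sys.stderr)
    return 1 if bad else 0
```

Saved as `.git/hooks/pre-push`, the script would end with `sys.exit(main(sys.stdin))`. Since a local hook can be skipped with `git push --no-verify`, this only covers the client side; the server would still need its own check to enforce the rule on everyone.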

What is typically needed is an easily reproducible reference CI environment. That is often approximated with a centralized reference CI environment, that is not necessarily easy to reproduce, in which case it is only instantiated very few times.

Left or right of VCS should not be a big question given the VCS is supposed to be decentralized, but for that to happen the whole dev process (or at least a very substantial portion of it) should be day to day decentralized as well, or at least with a (well tested) ability to decentralize when needed, without friction.

If we go back to the more general "configuration management" concept (than merely source code version control), it is obvious the ideal situation is when the whole pipeline can be freely reproduced, in which case you can usefully do it before the integration on a reference branch/repo (and redo it after just to be sure because who knows if the whole configuration has been correctly captured: you better detect discrepancies early too - also because in some workflows you won't test exactly the same branch before merging to a reference as what is the result of the merge).

I'm not sure that CI is going to go away, especially so long as developers are building on machines that are not capable (sometimes, hardware constraints, sometimes differences in OSes) of running the exact software that is run in production. A CI server (or paas) makes doing builds safer for production, and saves the developer a lot of time.

I've become a huge proponent of where you can (sometimes you have to cross-compile, or the compute job is so big), using the same os to develop that you deploy in. This grossly simplifies tooling and testing and has the side effect of focusing developer mindshare (and development hours) away from simulating a foreign OS on their local machine. That time that would have been spent getting X working in Windows (and maintaining it) or fixing some deploy to container script because the new MacOS broke something can be spent, if there's nothing else better, fixing bugs in the actual product.

> I've become a huge proponent of where you can [...] using the same os to develop that you deploy in.

In my case that would be some sort of Linux targeting servers. Individual Linuxes are pretty different in important ways, so it probably would have to be the exact same flavor. I've seen my colleagues spend a lot of time fighting even the Linuxes that target laptops to get basic things like USB-C, sleep, battery life, docking stations and projectors to work well, which I've read on HN were all solved about five years ago, but I still see colleagues fighting Linux quirks, so I guess a laptop running RHEL will be a terrible hassle.

I would also have had to switch my OS every time I've changed employers (sometimes when changing teams), and I'd lose the accumulated muscle memory of over a decade on macOS, plus all of the platform specific tooling, plus the full Microsoft Office suite to interface with the non-tech part of the company.

I just try to deliver everything I build as a docker image, that way I get pretty decent isolation from the particularities of the host OS and it's pretty easy to test locally under conditions very close to prod. I've switched to a M1 Pro notebook earlier this year and apart from having to specify the arch when running some docker images, I've hardly noticed a change, though I don't develop Linux or Windows desktop software (hence, no X or the like needed).

I.. don't get the gist of the article.

It seems to show that the author likes to use the git commit hooks to run linters and unit tests. Which is ... debatable, but the real question is: why should this affect CI?

I understand that discovering a problem during the tests after the code landed on the main branch is ... discomforting, at best. I also understand the frustration of not being able to commit your changes in a precise way because the git hook required you to also touch other stuff.

What I don't know is why we can't run the CI on a merge-request flow, on the beefy servers, without blocking the local commits, sharing feature branches any time we want, and being checked only "when ready". FWIW I worked on a project set up this way, with a Jenkins job starting each time an MR was made, where the branch is applied "locally" to the main branch and the tests run, and an approval (needed for the actual merge) granted only if the tests passed.

I don’t like the hand-wavey “git is already showing its weakness”, followed by no examples.

This is a twitter tier take with no insight to back it up except “things will continue to get better and soon things will be different!” But doesn’t paint a picture of what different might be.

I mean, yeah things are going to change, duh

I cannot agree more. VCs are betting on companies like railway, render, netlify and vercel. If their vision comes true there won't be CI/CD. An interesting thing happens if CI/CD disappears from the process. The developer loop looks very different. Today we focus a lot on day 1 operations: building, provisioning, deployments. The focus will shift to the other side of the developer loop, i.e. day 2 operations: incidents, chaos, etc. The number of incidents will increase, which implies you need a different kind of automation for day 2, one that is asynchronous in nature, async to the developer loop. The automation we saw for CI/CD in the past few years gets propagated to day 2 operations, which in my opinion are still very nascent. Just some thoughts in the space.

Right, we should just edit and test directly in production.

Jokes aside, one of the reasons why we still need a CI is it proves we can do a full clean build. Optimizations that rely on what actually changed are good for local dev.

One way to truly get rid of CI is to get rid of the need to do a full clean build. If we structure our build to be constructed from a series of immutable input files (easy with version control) with a series of pure (as in no side effects) build steps and hash the output file by its inputs + build steps, then the concept of a clean build is meaningless.

Then you could even allow regular dev machines to push the build assets to a centralized build cache. However I would still want an independent "known good" oracle to rebuild and check the hashes. One could call this oracle a CI...
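A rough sketch of that scheme, assuming build steps are pure functions of their inputs. `cache_key`, `BuildCache` and the `run_build` callback are hypothetical names for illustration, not any real build tool's API.

```python
import hashlib


def cache_key(input_files, build_steps):
    """Derive a key from file contents plus the ordered build steps.

    input_files: dict mapping path -> bytes; build_steps: list of command strings.
    """
    h = hashlib.sha256()
    for path in sorted(input_files):  # sorted so dict order does not matter
        h.update(path.encode())
        h.update(b"\0")
        h.update(hashlib.sha256(input_files[path]).digest())
    for step in build_steps:  # step order is significant, so it is not sorted
        h.update(b"\1")
        h.update(step.encode())
    return h.hexdigest()


class BuildCache:
    """In-memory stand-in for a centralized build cache."""

    def __init__(self):
        self._store = {}

    def build(self, input_files, build_steps, run_build):
        """Return (key, output bytes), running the build only on a cache miss."""
        key = cache_key(input_files, build_steps)
        if key not in self._store:
            self._store[key] = run_build(input_files)
        return key, self._store[key]

    def verify(self, input_files, build_steps, run_build):
        """The "oracle": rebuild independently and compare output hashes."""
        key = cache_key(input_files, build_steps)
        if key not in self._store:
            return False
        rebuilt = run_build(input_files)
        stored = self._store[key]
        return hashlib.sha256(rebuilt).digest() == hashlib.sha256(stored).digest()
```

With identical inputs and steps, the second `build` call is a cache hit and never runs `run_build`, which is what makes a "clean build" indistinguishable from an incremental one. `verify` only proves anything if `run_build` really is deterministic and free of side effects.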

The complaint doesn't even make sense: the CI platform has all the change information (the diff tells you that), running the tests locally won't give you any more useful input, and so makes no difference to the change-detection logic.
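To illustrate how a diff can drive test selection on either side, here is a small sketch. The list of changed paths could come from something like `git diff --name-only`; the `owners` mapping from path prefixes to test targets is a hypothetical convention, not something from the article.

```python
# Sentinel meaning "scope unknown, fall back to the full suite".
RUN_EVERYTHING = None


def affected_tests(changed_files, owners):
    """Pick test targets for a change set.

    changed_files: iterable of repo-relative paths from the diff.
    owners: dict mapping a path prefix to the test targets covering it.
    Returns a sorted list of targets, or RUN_EVERYTHING if any changed
    file is not covered by a known prefix.
    """
    selected = set()
    for path in changed_files:
        matched = False
        for prefix, tests in owners.items():
            if path.startswith(prefix):
                selected.update(tests)
                matched = True
        if not matched:
            return RUN_EVERYTHING  # cannot trust the scope, so retest all
    return sorted(selected)
```

The function behaves the same whether it runs on a laptop or on the CI server, since the only input is the diff. The hard part is keeping a mapping like `owners` trustworthy, which is a build-tooling problem rather than a CI problem.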

> CI is often triggered by creating a pull request to merge a branch into main.

> Today, a developer would be crazy to suggest anything other than git and trunk-based development for a project.

My understanding is that trunk-based development is about shipping a feature via multiple merges of incremental changes that are often single commits; the main opposing model is feature-branch-based development, and it's with the latter model that the terms "pull request" and even "branch" are associated. Is my description of the terminology accurate and is the author of this article using the terminology confusingly?

Like others here, I read this article and largely disagree with nearly all of the points made, and for the same reasons.

However, most of this is a result of the author assuming that everyone and every organization echoes their experiences and point of view (which isn't uncommon in online writing).

If the article were titled something along the lines of "My views on the long-term viability of CI" and the article written accordingly, then readers would be more willing to ponder alternatives to, or even a redefinition of CI in the viewpoint of the author knowing that it's a personal opinion piece.

Did I miss the hacker news post where “git has already shown its weakness.” ? Someone help me out here.

OP has no idea what is possible in CI when you integrate intelligence about your compiler/linker into the build executor and run it massively parallel. CI isn’t going anywhere bc parallel cloud execution will always build faster than your laptop. Less time wasted and less heat on your lap. BigCos are already doing this (some for over ten years) and running laps around garden variety CI. They are investing more in CI today than ever because the assumptions the OP made are incorrect.

I have never been a huge fan of CI. I've always considered it a potential "Concrete Galosh"[0], and, in my case, the fox is usually not worth the chase; but I don't work in the type of environment that usually necessitates CI.

> As I wrote in Your Integration Tests are Too Long, developers become reliant on CI and won't run the tests until merge-time.

That's a big issue. I think that testing is very important, and integration testing should be done from the git-go. Anything that discourages early integration testing, is a problem.

Recently, I was invited to submit a PR to a fairly sizable project (I was the original author, but have not had much to do with it for the last three years or so).

I declined, because, in order to make the PR, I would have had to set up a Docker container, Composer, Jenkins, xdebug, PHPUnit, etc., on my computer, in order to run the full integration tests (I won't submit a PR without running the tests, as that's just rude).

For someone that is a regular backend engineer, like most of the team working on the system, that's no big deal. For me, it's a fairly big deal (I write frontend native Swift stuff, and don't have infrastructure on my machine for that kind of work).

That means that they will have to do without a fairly useful extension that I could have added.

[0] https://littlegreenviper.com/miscellany/concrete-galoshes/

>First, CI rarely knows what has changed, so everything needs to be rebuilt and retested.

That's a problem with your building and testing tools. Even without CI you would still need to build everything and test everything if you have no way to do it just for what changed.

>How often have you or a coworker pushed an empty commit to "re-trigger" the CI

Most CI solutions have a button to trigger a new version without a new commit.

>Running tests pre-commit is one answer

Even with CI developers are likely running at least a subset of the test suite while they are developing.

>Yet, there are roadblocks today that need to be fixed, like tests that don't fit on your laptop

Either develop on a server while using your laptop as just an editor or have a test runner on the server.

>Things are shifting left

The problem with this is that at large companies the changes you've already tested will always be rebased onto a newer version of the codebase which they haven't been tested with. Who runs tests for this newly rebased version? CI. Also, things like code review cannot be pushed left onto the dev's machine. You will want linters and tests to be run for code review.

> Today, a developer would be crazy to suggest anything other than git and trunk-based development for a project

Funny, I thought it would be crazy to suggest trunk-based development when everybody is so enamoured with feature branches (trunk-based means that the whole team commits directly into the main branch - so pretty much the opposite of using branches with merge requests).

Another counterpoint is that CI makes at least a lot of sense for cross-platform development. E.g. when you're working on Linux, you can't trivially check your code locally on macOS and Windows, or even just on different compiler toolchains unless you install all those things on your local dev machine.

That's ridiculous, committing to the same branch doesn't scale beyond 1 dev.

Trunk-based means merge requests get merged into master instead of feature branches.

It's not ridiculous, we used "actual" trunk-based development on a single branch in teams between a dozen and a hundred team members (coders, artists, level designers and QA), since the mid-90's in CVS and SVN. It works very well but requires a different kind of discipline (if anybody breaks the build, this must be fixed immediately before anything else).

Git doesn't lend itself very well to this kind of development model though, because it doesn't have a central repository like SVN.

Also see here: https://trunkbaseddevelopment.com/

The basic form of trunk-based doesn't use feature branches, the "scaled" version only uses short-lived feature branches. But I can assure you that the "scaled" version isn't needed for teams of up to around 100 contributors (with a centralized version control system like SVN that is).

Next logical step is for everybody to just ssh into common server and work there. But then you wouldn't dare to save your files before they are reasonably bug free.

Feature branches let you save intermediate work outside of your computer.

If you have trunk based development and your machine smokes, all unpushed work is lost.

People are looking for some philosophical advantages of one over the other. But in practice, it's just a trade-off of how much you save and where.

Just SSHing into a shared filesystem is really too simplistic. There's a lot of nuance between this extreme on one side, and feature branches on the other. Working exclusively with Git just doesn't make it obvious that there have been other working solutions before that don't rely so much on branching ;)

I think it really comes down to the differences between a distributed versioning system like Git, and a centralized versioning system like SVN. A lot of the "branching ceremony" that evolved around Git is about managing the different timelines that evolve because every dev has its own local repository with its own history which then needs to be synced with a remote repository, and this entire problem area simply doesn't exist if there's only a single shared repository.

In SVN there are no "merge commits" polluting the history when working on a single branch. If you update (in git lingo: pull), there are no "merges" or "rebases" happening, instead you get conflicts that you need to resolve locally. Next time you "commit" (in git lingo: push) those resolved conflicts are uploaded to the central repository as if they were regular changes (e.g. no "merge commits").

Somehow this central-repository-model of SVN still is much more logical to me. The distributed model of Git doesn't make a lot of sense unless you're a Linux dev with your own 'fork' sending patches to upstream from time to time via email (e.g. despite git's decentralized nature, Github or Gitlab are also just trying to emulate a simple central-repository-model).


> If you have trunk based development and your machine smokes

That's why you commit to trunk just as often as you would push to a feature branch. You just have to organize your work in a way that frequent commits don't break the build (e.g. by putting your work behind a runtime feature flag), the upside is that you never get into a state where you need to deal with complex merges, because your work never differs from the shared project state by more than a few hours.

I worked with SVN and it was just horrible. Merges were horrible. Even resolving conflicts on updates was arduous, error prone and hard. If two people accidentally worked on the same file, both were screwed. Git was the first thing that made merges feasible. (Not really the first I think, my initial bet was on mercurial, but git was the one that stuck).

Runtime feature flag sounds absolutely repulsive. And to avoid complex merges you can just rebase your feature branch onto current trunk often and either avoid working on two large tasks affecting the same area simultaneously, or merge or rebase one onto the current state of the other when the other is in a stable enough state.

You probably worked with SVN before 'merge tracking' was implemented; after that, merges were really not that different than in Git, especially with a UI frontend like TortoiseSVN. Both Git and SVN absolutely break down when it comes to merging binary data though, and in game projects binary data is the vast majority of all project data (in our projects about 95% binary vs 5% text data), and this is actually the main reason to avoid branching at all cost (or rather: branching is fine, merging of binary data that has diverged between branches is the problem, and Git doesn't solve any problems in that area over traditional version control systems).

But still, branches are simply not as important in a centralized version control system where everybody works on the same shared repository state.

Runtime feature flags also make a lot of sense outside the version control workflow. If you have a "live product" you often want to enable or disable features after a new version went live, sometimes only for a group of users.
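For anyone who hasn't used them, a runtime feature flag can be as small as the sketch below. The `FLAGS` table, the `new_checkout` flag and the group names are made up for illustration; in a live product the table would typically be loaded from configuration that can change without a redeploy.

```python
# Hypothetical flag table; "groups" of None would mean "everyone".
FLAGS = {
    "new_checkout": {"enabled": True, "groups": {"beta"}},
}


def is_enabled(flag, user_groups, flags=FLAGS):
    """True if the flag is on and the user belongs to an allowed group."""
    cfg = flags.get(flag)
    if cfg is None or not cfg["enabled"]:
        return False  # unknown or disabled flags default to off
    allowed = cfg.get("groups")
    return allowed is None or bool(allowed & set(user_groups))


def checkout(user_groups):
    """Unfinished work is committed to trunk but stays dark behind the flag."""
    if is_enabled("new_checkout", user_groups):
        return "new flow"
    return "old flow"
```

Here `checkout(["beta"])` takes the new path while everyone else keeps the old one, so half-finished work can sit on trunk without breaking the build or reaching all users.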

> You probably worked with SVN before 'merge tracking' was implemented

Yeah, most likely. Since SVN is not dead yet, that must have been the case, because I don't see how any modern dev could suffer with SVN the way I suffered.

I see how having binary data checked out into git and merging in any manner other than "use A" or "use B" might be a nightmare and how never branching might be the best (or even the only) way to even be able to use versioning at all.

I remember that when at some point I started using sqlite in my projects I wanted to have another database format I could check into the repo that would use multiline text representation for the data and things that must be binary like indexes would be just generated on deploy.

I can't imagine needing to have in my source tree some opaque binary formats that might be altered partially. I think I'd just keep those outside and use a meatware process to manage change in them. Or have some migration system where updates to binary files might be described textually and checked into the source tree.

Merges still break with false conflicts in SVN with "merge tracking" when directories are deleted or renamed.

Merge requests into a feature branch sounds like an over-engineered process, but maybe that works for large teams?

A new feature might require new/changed backend APIs, iOS and Android and WFE changes that might all be handled by different devs. Having them work from a shared feature branch can be a practical strategy, if there's a desire to keep things out of master until they're fully ready.

I find it funny how CI is not continuous and not integration.

It’s inherently a batch process, so not continuous.

It’s an automated build and check system. It also runs on merge commits but this “integration” part is really marginal to the concept as represented by e.g. GHA.

I guess it got its name from the contrast to older practices. Software projects used to divide parts up considerably as part of system design and give them to individuals. Separate compilation, like in Ada, was supposed to allow compiling to interfaces without having the implementation available. After the detailed design and implementation was finished, the parts could theoretically be integrated during a period called "integration hell".

I remember the MCSD docs were promoting a "Daily build and smoke test". This was a huge difference in that they promoted at least making sure that the whole system could be built in some state every day.

CI really appeared in the late 90s when someone had the idea that integration tests could be run for small gains in functionality. Then the software could be automatically tested on every incremental change. I credit C3/XP for popularizing the practice but I'm not a historian. Possibly someone was already at it before.

developers become reliant on CI and won't run the tests until merge-time

I think there's a whole lot of truth in this particular statement. I've worked with lots of devs who complete a story, commit it, and then find the tests aren't passing any more and have to do a little more work to fix what they broke. And then they complain either that tests slow them down or that the estimate for the story was too low. Having a test suite that can be run locally, and teaching the team to use it regularly, even if it's wholly optional rather than running on a hook, improves team velocity significantly.

Why only one commit for a story? Sounds like they should be committing more often.

I often fire off a commit, get notified of a test failure, and fix it in the next commit. Why wait for tests to run locally when some other computer can do it for you?

(Yes, it would be nice if all tests ran in zero time, but back in the real world...)

Sorry to disappoint but I think Jenkins will be here far beyond 2025.

However on the topic of real-time integration that is somewhat already here for no-code systems that are already divorced from traditional version control.

He hasn't been in the industry long enough to see that most of the pillars of software engineering are built on top of massive tech debt.

When I say tech debt I mean design flaws created by designers who simply could not anticipate the future. I'm not talking about bugs, mistakes or intentional shortcuts.

Whether jenkins is "tech debt" is a different topic. But whether Jenkins is here to stay past 2025 has nothing to do with how well it's designed.

The next evolution will be a shared build system.

If the builds done by the developer can be cached and shared with the CI, CI's role is lessened to just a gate-keeper. Most of the time, the cache is already warmed up by the developer's build and CI is a noop. Imagine a CI that takes 5 seconds to finish.

For that to become reality, the build environment needs to be truly hermetic and reproducible, so that the cache can be trusted. Remote builders also help establish trust.
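
A minimal sketch of why hermeticity matters for a shared cache, assuming a hypothetical `cache_key` helper and an in-memory dict standing in for the remote cache (neither comes from any real build tool). The key is a hash over every declared input, so a CI run with identical inputs hits the entry the developer's build already stored. Anything that affects the output but is missing from the key makes the cache untrustworthy.

```python
import hashlib

def cache_key(sources: dict[str, bytes], toolchain: str, command: str) -> str:
    """Hash every declared build input into one content-addressed key."""
    h = hashlib.sha256()
    h.update(toolchain.encode())
    h.update(b"\0")
    h.update(command.encode())
    h.update(b"\0")
    for path in sorted(sources):  # sort so dict order cannot change the key
        h.update(path.encode())
        h.update(b"\0")
        h.update(hashlib.sha256(sources[path]).digest())
    return h.hexdigest()

# Hypothetical stand-in for a remote cache shared by developers and CI.
shared_cache: dict[str, str] = {}

def build(sources: dict[str, bytes], toolchain: str, command: str) -> tuple[str, bool]:
    """Return (result, cache_hit); only 'runs' the build on a miss."""
    key = cache_key(sources, toolchain, command)
    if key in shared_cache:
        return shared_cache[key], True
    result = f"artifact-{key[:12]}"  # placeholder for real build output
    shared_cache[key] = result
    return result, False

srcs = {"main.c": b"int main(void) { return 0; }\n"}
dev_result, dev_hit = build(srcs, "gcc-12.2", "cc -O2 main.c")  # developer: miss
ci_result, ci_hit = build(srcs, "gcc-12.2", "cc -O2 main.c")    # CI: hit, no-op
```

Changing the toolchain string or any source byte yields a different key, which is the property that lets CI treat a hit as equivalent to having run the build itself.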

Have a look at https://dagger.io by the creators of Docker. It's a BuildKit + CUE based tool. You can get this shared cache by running a shared BuildKit setup. It also automatically figures out the build dependency DAG.

I can imagine that editing code in a fashion similar to Google docs collaboration is the way forward. Especially if tests run continuously, giving instant feedback.

This doesn't mean that daily tests will be a thing of the past, but potential conflicts can be found sooner.

I don't know how commits would work. I feel like they would become a hindrance to productivity, but having atomic changes spanning across multiple files seems like a must for proper rollback scenarios.

When you're doing a big refactor, this would be awful. Sometimes you need to rip out, add, or replace an entire layer or subsystem, and break everything until you're done.

I love the freedom having my own private branch gives me to go crazy and demolish whatever I want to see how bad it is to fix. It's like taking out supports in a building to watch how it falls down, then try again.

By the time I make a PR I've rebased and cleaned up my commits and made it look like a nice controlled demolition and careful surgical replacement. No one has to see I was ignoring hundreds of compile errors for a couple days.

This is something that would also be a problem with everybody working on the main branch, which I understand people are doing. I don't know how they deal with large refactoring though.

Continuous integration originally was shifted left of VCS. There was a shared integration workstation where people would pair program and sign off code as they go. CI shifted right and can shift left again too:

> It can be very tempting to substitute automated build software for an integration computer.


The article posits an academic theory of CI that biases towards very narrow models of software development: Namely kubernetes, and namely orgs that have the budget to experiment in these types of models where developer trust is a nice to have, rather than regulatory compliance.

So as a thought experiment it’s neat, but knowing that most orgs will never need Kube and will never develop software like a FAANG, it just strikes me as clickbait given the title.

There have been times where I would not know how to locally test code or did not want to install a bunch of dependencies on my system. I don't think precommit workflows will entirely replace CI, but their rise will surely ensure that commits and commit history will be higher quality. Testing locally used to require installing additional dependencies and testing suites but this could be simplified with containers.

I have the feeling there is a „we need to do x pre-commit“ season right now, opinions like „The end of CI“ seem to just pop up.

I‘m not convinced by pre-commit arguments. C in CI isn’t only for „continuous“, it also happens to introduce „centralised“ and „certain“ (as in, it’s going to run whether you like it or not).

Coupled with good git style there really is absolutely nothing that needs to change for teams of 1 and bigger.

Just my 2€


> There is such a build system, but I can't remember the name right now. It tracks system calls to see every file opened by the compiler to produce exact dependency graphs (assuming compiler is deterministic).

> The downside is that it's Linux only.

My ideal workflow is:

As I declare that I start development on a task, the branch is made automatically from current trunk and testing environment is set up with this branch deployed. As I commit and push that environment gets updated.

Last part of the development before handing it off to a tester is merging current trunk into my branch and resolving any issues. When the tester picks up this task to test then another merge from trunk to task branch is performed automatically and if there are conflicts task is automatically pushed back to development to resolve issues so the tester can immediately move onto testing another task. If there were no issues with pre-test merge the tester tests if the task was implemented satisfactorily. Then, if trunk didn't change in the meantime task branch is merged back into the trunk by the tester. If the trunk was changed then the testing part repeats. The issue is that only one development task can be tested at a time without the need to recheck all currently tested when one of them is merged into a trunk.

In the case of small tasks, the tester might gain enough familiarity with the task to recheck it quickly after trunk was changed and merged into the task. In the case of a large dev task, it should be split between a few testers.

Order of merging tasks into the trunk should be determined by testing team to minimize recheck needed after the merges.

If anything goes wrong at any stage of testing the task is immediately kicked back to developer.

In this setup developers do no manual branching and merge only current trunk to their branches and can pretty much work independently (unless two pick very overlapping tasks), they can work on multiple tasks at a time and react quickly to tasks that come back from testing with minor issues. And testers coordinate testing and merging tasks back to trunk. Testers are also in close contact with stakeholders so they can demo new features for them on testing environments and know the priorities so they can work on merging higher priority tasks first. They can even participate in designing the new features or improvements as they know the current state of built software best and interact with it the most. Once the design is starting to crystalize developers might be asked to provide input on feasibility and complexity of the feature implementation.

When I say merge it might be better to do a rebase. I'd have to actually work in that system to see what's better.

It seems to me like the article is advocating for a solution to a problem that isn’t there?

> But all technology evolves, and git is already showing its weaknesses.

Such as?

Hmm I'm unconvinced. It's already possible to have developers run tests locally and not run tests in CI, which is what this sounds like.

Nobody does it because developers can't be trusted to actually run tests.

Perhaps if there was some way to cryptographically prove that you've run tests? That doesn't sound possible though.

you could amend a commit with a flag if the git hook ran. if the flag is set one wouldn't have to execute the whole CI pipeline

There are still multiple issues with that.

1. I still wouldn't put it past developers to lie occasionally.

2. You don't control the test environment. What if the tests pass on the developers machine but not on the CI machines?

3. Difficult to test on multiple platforms. What if the developer uses Mac but you need to support Windows too?

4. Testing can take a long time. Who wants to wait an hour to submit a minor PR? Plus it ties up resources on your machine. CI can automatically scale.

5. There's no way to avoid races between testing and integration (i.e. merge queues).

CI isn't going anywhere. I think the biggest scope for improvement to CI is

1. Only testing things that have changed. Most people don't do that because it requires a build system that properly isolates everything. Basically only Bazel and its derivatives do this. You can't do it with CMake or NPM or Cargo or ...

2. Make it easier to run CI on your own machine. I don't think there's really any technical barrier to this, it's just people don't usually bother.
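
To make point 1 concrete, here is a rough sketch of "only test what changed" over an explicit dependency graph. It is not how Bazel itself is implemented; the target names and the `affected_targets` helper are made up for illustration, and it only works if the graph really lists every dependency, which is the isolation requirement mentioned above.

```python
from collections import defaultdict, deque

def affected_targets(deps: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Return the changed targets plus everything that transitively depends on them."""
    # Invert the graph: for each target, record who depends on it.
    dependents = defaultdict(set)
    for target, needs in deps.items():
        for dep in needs:
            dependents[dep].add(target)

    # Breadth-first walk up the reverse edges from the changed targets.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

# Hypothetical targets; each maps to its direct dependencies.
deps = {
    "app": {"lib_http", "lib_core"},
    "app_test": {"app"},
    "lib_http": {"lib_core"},
    "lib_http_test": {"lib_http"},
    "lib_core": set(),
    "docs": set(),
}
to_test = {t for t in affected_targets(deps, {"lib_http"}) if t.endswith("_test")}
```

With a change to `lib_http`, only `lib_http_test` and `app_test` are selected; `docs` and anything else outside the reverse closure is skipped.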

This reminds me of a brief discussion/feature-request on the nektos/act GitHub repository (a project that allows developers to run GitHub Actions workflows locally):


Yes I think this is the crux:

> Trust in the developer is required ... but it is likely the case that developers are incentivized to illustrate that they are reliable, careful and trustworthy -- and that should strongly encourage accurate test result signing

100% guaranteed you're going to see "oh I just made a stupid typo that broke one test but I can see what is wrong so I'll just fix it and use the previous test results".

Also since it's entirely based on trust anyway I'm not sure what additional benefit signing gets you. I don't think there's really a way to prove you ran the tests.

Either way, all my other points still stand. CI isn't going anywhere.

> Also since it's entirely based on trust anyway I'm not sure what additional benefit signing gets you. I don't think there's really a way to prove you ran the tests.

It's a reputation and trust-building exercise, essentially - and that's one of the reasons that continuous integration is particularly useful. "Entities X, Y, Z all say that commit <ID> looks good".

> Either way, all my other points still stand. CI isn't going anywhere.

Agreed :)

If the hook can set it, the developer can set it.

In $current_place, we have about 250 hours (!) of tests that are run in CI per each commit. By splitting this across many executors, the CI can finish in about 40-60 minutes.

The idea of "running the tests locally" may fit some projects but in a complex system it's definitely a no-go.
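
For a sense of scale, a back-of-the-envelope calculation on those numbers. The executor count is not stated above, so this is only a lower bound that assumes perfectly even splitting and zero scheduling overhead; `executors_needed` is a helper made up for the arithmetic.

```python
import math

def executors_needed(total_test_hours: float, wall_clock_minutes: float) -> int:
    """Lower bound on executors, assuming perfectly even splitting and no overhead."""
    return math.ceil(total_test_hours * 60 / wall_clock_minutes)

fast = executors_needed(250, 40)   # executors to finish in 40 minutes
slow = executors_needed(250, 60)   # executors to finish in 60 minutes
single_machine_days = 250 / 24     # one machine running the tests serially
```

That works out to somewhere between 250 and 375 executors, versus more than ten days on a single machine running everything serially.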

Wow, that sounds crazy. I'm curious what kind of system are you building and what types of tests are you running to get to those numbers.

I've heard of similar systems, and even relatively modest codebases (100K LOC) can easily accumulate many-CPU-hour build/test phases that rely on running across multiple build servers all running different OSes. None of which means that running tests locally isn't worthwhile, esp. if you rely on any form of TDD (personally I rarely actually write tests first, but certainly as I'm going to verify each new change. And I'm not waiting for our CI pipeline in that case, esp. because I often step-debug through the test execution to verify it's really doing what I intended)

It's a sort of an operating system for some specialized hardware. The code that runs AWS could be an example.

CI is a collaboration practice. It means to share your work with your team very often, to not develop little parts in silos for weeks or months and then put it all together at the very end.

I think the author means "the end of build servers". You can do CI fine without build servers.

Yeah, it's quite weird to see everyone discussing "CI" as some particular tool or workflow (and not the same one...) and not as a general ideal or practice.

When I read the title I automatically assumed this would be a "...because CI won and is everywhere now" sort of closer. Junior devs' eyes go wide when you tell them builds used to happen overnight, "feature branches" could last months or even years, and merging was someone's full-time job.

This blog post is riddled with issues. The key ones that come to mind are:

- it conflates a build system (like bazel) with CI execution platforms (like buildkite)

- it ignores the origins of CI - build everything from scratch so none of your workflow optimizations affect anything

Next.js has a marketing blurb about hot-reloading "at Facebook scale" on the website. I didn't know what this could mean, but apparently someone has been thinking about it!

The build is neither working nor not working. In fact, it isn't even built!

CI isn’t going anywhere, and is the best thing to happen to teams in terms of collaborative efficiency, even if you dislike the fact that you need to write tests. Write less, more important tests and write smaller, more composable scripts.

I wouldn't trust my computer to be the pre-commit judge if my code is good or not. It looks nothing like the target environment and has plenty of modifications. I'll trust CI to do that check for me.

The value for me was in the last two paragraphs. In short: What's next? For the same reason we're not still using COBOL (sans legacy applications), something else has to be next in the progression.

I can't take someone that's still using Jenkins very seriously.

Why not?

The world has moved on to builds that are coded in pipelines. No UI configuration should be necessary any more. I get that Java is different and needs waaaay too much coddling at this point, but the rest of the world has moved to pipelines.

`git commit --amend` will change the sha, even if you don’t change a single thing. I’d be upset if someone was pushing totally empty commits, wantonly polluting the history, just to trigger CI.
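
A small sketch of why the sha moves: a commit ID is the SHA-1 of the commit object, and that object includes the committer line with its timestamp. As far as I understand, `--amend` keeps the author date but refreshes the committer date by default, so amending in the very same second could in principle give the same ID. The identities, timestamps and parent hash below are placeholders.

```python
import hashlib

def commit_id(tree: str, parent: str, author: str, committer: str, message: str) -> str:
    """Compute a SHA-1 object ID the way Git does for a commit object."""
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {author}\n"
        f"committer {committer}\n"
        f"\n{message}\n"
    ).encode()
    # Git hashes a header of the form "commit <byte length>\0" followed by the body.
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

# Placeholder values for illustration only.
tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
parent = "1234567890abcdef1234567890abcdef12345678"
author = "Dev <dev@example.com> 1660000000 +0000"

original = commit_id(tree, parent, author,
                     "Dev <dev@example.com> 1660000000 +0000", "Fix flaky test")
# Same tree, parent, author and message; only the committer timestamp moved.
amended = commit_id(tree, parent, author,
                    "Dev <dev@example.com> 1660000060 +0000", "Fix flaky test")
```

The two IDs differ even though the tree and message are identical, which is why the amended commit looks new to CI (and why it needs a force push).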

Here lies the problem: "developers become reliant on CI and won't run the tests until merge time". As always, it is not a technology/approach problem but a developer problem.

The first point is not true at all. GitLab CI has a feature where it can run jobs conditionally based on what changed. We did the same with Jenkins, it's not hard.

That's cute. Too bad it won't work for use cases where the end product needs 100+ GB RAM to be run in its entirety and MUST be integration tested e2e.

The author could be right, but I'm skeptical. The model described is always-online, yet SCM like git is very deliberately distributed so you can work offline.

Is anyone doing collaborative cloud-based editing with the ability to run all tests at any time? That sounds pretty great, although you still need deployment gating.

Try renaming the title of this to "The End of Testing", does it still make sense? I didn't think so.

Thought experiment: taking real-time pre-commit workflows to the logical extreme, humans just become slaves to the machines, sitting at keyboards making near-instantaneous real-time changes to running systems in response to changing conditions and needs.

Resistance is, as they say, futile.

> First, CI rarely knows what has changed

If you need this, you can have this already. As far as I'm aware, anyone running a large monorepo (Google etc.) is doing some type of diff-aware test pruning, simply because you can't run all of the tests on every change.

And for us small guys, there are tools you can use to do change-aware tests, "Test impact analysis" is one keyword for this (there may be others, coverage.py calls it "who tests what": https://nedbatchelder.com/blog/201810/who_tests_what_is_here...).
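
Roughly what "who tests what" buys you, as a sketch: given a map from each test to the files it executed during an earlier full run, pick only the tests that touch the diff. The map below is hand-written and the `select_tests` helper is hypothetical; this does not use coverage.py's actual API or data format.

```python
def select_tests(coverage_map: dict[str, set[str]], changed_files: set[str]) -> list[str]:
    """Pick tests whose recorded coverage touches any changed file."""
    known = set().union(*coverage_map.values())
    if not changed_files <= known:
        # A file the map has never seen (e.g. a new module): run the full suite.
        return sorted(coverage_map)
    return sorted(
        test for test, files in coverage_map.items() if files & changed_files
    )

# Hypothetical per-test coverage data recorded during a previous full run.
coverage_map = {
    "test_login": {"auth.py", "session.py"},
    "test_logout": {"session.py"},
    "test_report": {"report.py", "db.py"},
}
selected = select_tests(coverage_map, {"session.py"})
fallback = select_tests(coverage_map, {"brand_new.py"})
```

The fallback to the full suite is a deliberate design choice here: stale or missing coverage data should make the selection more conservative, not less.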

> I, for one, hope that we aren't using Jenkins in 2025.

I certainly hope not. I had moved on from Jenkins in 2015. But I don't think there's much reason to think the author's "run the tests locally" speculation is particularly likely to supplant a centralized CI/CD system. As a simple argument against, for supply-chain integrity reasons, the tests and builds need to be run in a common, public environment. You're not going to push anything to production that has only been tested/built on your own machine. This is one of the core tenets of the modern software philosophy that lets you move extremely fast (deploying many times per day) and (perhaps I'm showing a lack of imagination) it's hard to see an alternative here.

Here's a counter-prediction:

1) In the short term, the pendulum will continue to swing back towards thin-clients (Github Workspaces etc.) which means we'll see more emphasis on cloud-based test runners per-dev-environment, and making these faster.

2) The push towards improved cycle-time (both for developers running tests on their WIP code, and for pre-merge tests on branches) will continue which will mean that the "change-aware test runner" tech will propagate down from high-complexity codebases so that every test runner is expected to offer conditional compilation/testing.

3) And finally, a bigger/more speculative one - the "build/test graph" (the potentially-per-file set of tasks required to create your binary artifacts and test them; the thing Bazel computes, or that pytest computes) and the "CI/CD job graph" (the thing that GitLab defines, which might include "run the build/test graph" and "deploy this artifact to $environment") will meet in the middle and become the same thing, so that you can invoke exactly the same graph processing logic locally (if you choose) as your CI/CD server would invoke on any given commit/tag. Earthly are working in this direction for example.

> or smarter build systems that produce near-live feedback

You can currently set up your IDE to re-run the tests in real-time. If you add "who tests what" to that, you can have your IDE re-running just the tests related to your diff in real-time, which might actually be fast enough.

It's not clear to me that the build system is the thing that needs to run this; it looks more like the IDE to me. But perhaps the author might agree with my general suggestion that we should make real-time IDE tests logically identical to the per-commit CI/CD tests if possible.
