Zero, because unless my dog configured the CI, I’d just click a button to re-run it on the same PR/commit/branch and not “make an empty commit”.
End to end testing of large systems with hundreds or thousands of man-years of code in them will be slow because there is a large surface area to test and testing in individual modules doesn’t give the same kind of end to end coverage that testing everything together does. So while pre-commit would be good, I doubt it will replace server CI for all but the smallest repositories.
To carve out minimal permissions, you have to start with nothing and repeatedly attempt to do the action in AWS console, and check CloudTrail to see what got denied. Increase role permissions, lather, rinse, repeat until it works and pray they don't update the console and break you again.
It's possible that either this process is too tedious to be worth doing, or produces a policy more complicated than they wish to use, or requires a policy that is more permissive than they wish to use.
https://pypi.org/project/access-undenied-aws/ will allow you to start with least privilege and fix specific issues.
https://github.com/iann0036/iamlive allows an admin to perform the action via CLI and capture the policy.
Access advisor can inspect how you actually use the role and give suggestions on what to remove.
A more helpful suggestion is to experiment with these tools and then find gaps in IAM actions and submit those as feature requests via your TAM.
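For the CloudTrail loop specifically, you can script the "what got denied" step instead of clicking around. A rough, untested sketch with boto3 (the role/user name and the eventSource-to-IAM-prefix mapping are assumptions, and the mapping isn't always exact):

    import json
    from datetime import datetime, timedelta

    import boto3

    def denied_actions(username, hours=1):
        """Collect IAM actions that came back as denied for `username` recently."""
        cloudtrail = boto3.client("cloudtrail")
        actions = set()
        pages = cloudtrail.get_paginator("lookup_events").paginate(
            LookupAttributes=[{"AttributeKey": "Username", "AttributeValue": username}],
            StartTime=datetime.utcnow() - timedelta(hours=hours),
        )
        for page in pages:
            for event in page["Events"]:
                detail = json.loads(event["CloudTrailEvent"])
                if "Denied" in (detail.get("errorCode") or ""):
                    service = detail["eventSource"].split(".")[0]      # e.g. "ec2"
                    actions.add(f"{service}:{detail['eventName']}")    # e.g. "ec2:RunInstances"
        return sorted(actions)

    # add these to the role's policy, retry in the console, repeat
    print(denied_actions("my-console-role"))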
I've also experienced the AWS console being less than stellar at fault-tolerance when acting within very restrictive, targeted IAM roles. The only solution would be an overly broad permissions grant which is not always viable. Or well, if you spend enough money you can try to beg your TAM to get it fixed, but in the meantime between "now and never", your solution would still be pushing empty commits.
I do worry that a lot of people are missing the first principles where CI is concerned, but a few of us still remember. Whether that’s enough is always a question.
> First, CI rarely knows what has changed, so everything needs to be rebuilt and retested
CI is the impartial judge in your system. That is the ultimate goal and much of the value. Don’t believe me when I say that your check in broke something. Believe the robot. It does not think, it does not interpret errors (which is why intermittent failures poison the well). It just knows that whatever you did was not good.
We make it stupid simple because the more ways you try to interpret things, the higher the likelihood the system will either miss an error or report imaginary ones. And in this case I am very sorry Mr Lasso, but it is in fact the hope that kills. Optimism makes us try things, things others would give up on. That’s what makes a piece of software good. Better. But too much and you start ignoring very real problems and make optimism into a liability. I’ve seen it over and over again, take a build with intermittent failures and people will check in broken code and assume it’s the intermittent problem and not their change.
Ideally, when you start breaking builds you’ve found your limits. But the build has to break when I break the code, and it needs to do it repeatably, because if I can’t reproduce the error I risk telling myself stories, and foisting off debugging to other people, which is a big no no.
Much of the industry could use a significantly simpler model than git for VCS, since many of its features are moot on a daily basis. Many teams could probably get away with SVN, for example (although branching isn't nearly as good).
I think you raise a great point in that we need to look at how processes have evolved on top of version control and look at adapting those to a similar model. In some cases it's just not practical because of the way infrastructure testing and deployment works, but it's the direction I think we should be going. In fact, the first step would be to make the issue systems and project management interfaces provided by efforts like GitLab and GitHub available in a locally capable, distributed fashion. Clearly communications often require some degree of centralization, or at least peer message propagation, but there's no reason that information can't be separated from the infrastructure that displays and interacts with it.
A different view on this is that my CI system is a kind of colleague, responsible for a lot of what testers, ops, and build eng people toiled at before. It is another aspect of the decentralized system where a robot can also check out and work with the code even while humans continue to write more, which is considerably more painful under e.g. the SVN model. My human colleagues and I still share plenty of code directly, via git pulls and other tools.
We developed distributed VCSes and the ability to do smart things with branches and combining contributions from people all over the world at different times. Then we somehow still ended up with centralised processes where everyone has to merge to trunk frequently and if GitHub goes down then the world stops turning. It turns out that just using git as git wasn't such a bad idea after all.
We were told that dynamic languages like JS and Python are so much more productive and that this was essential for fast-moving startups to be competitive. Today the industry is moving sharply from JS to TypeScript and even Python now has similar tools. It turns out that static typing was better for building robust software at scale than relying entirely on unit tests after all.
Just wait until lots of developers who have only ever worked with JS or Python learn what a compiled executable is. If you tell them that xcopy deployment even to a huge farm of web servers is no big deal when all you're copying is one executable file and maybe a .env with the secrets, it'll blow their minds...
Well, technically it doesn’t have to. It needs to be "a" CI server that the team agrees upon, not necessarily a GitHub one.
I think the most important factor is to keep in mind that this is something teams decide to do.
You can certainly argue that there are other advantages to using those systems. However it's an inescapable fact that if those teams had been using git-as-git and had a good local environment for each developer then most of the developers would have been able to carry on with most of their work during all of those outages. Sometimes single points of failure fail.
Seriously, a well set up GitLab is pretty much un-downable. And even if it does go down, what really is the impact? If the pipelines don't run you cannot integrate, sure, and if the core service breaks down you will probably resort to exchanging patches the old-school way. But the good part is, in order to deploy those decentrally sourced changes you still will go through centralized CI and gain its quality assurances by doing so. Where's the drawback?
Given that many companies now tie their entire deployment process into their source control and CI/CD systems that means you really are reduced to exchanging patch files. For any non-trivial change in a large system that quickly becomes impractical and so development slows to a crawl or everyone literally gives up and goes to the pub.
Indeed. Software forges centralized what was a Decentralized VCS.
For one reason: profit.
I agree that centralization is probably not a hard requirement for these properties, but currently no one has come up with decentralized equivalents that provide users what they actually want - and remember, you don't get to tell them what they want!
You just described https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis...
Common examples of this are multi-million line C++ codebases (e.g. proprietary game engines) and monorepos in any language.
Running tests on my computer uses up valuable CPU cycles that I can use to work on something else while the CI servers are running my tests.
However I can see this working for small to medium sized codebases that really don't need CI for testing.
Capacity planning applies to tests. As your test count goes up, your budget per test goes down. Every CI tool should plot build and test duration over time, yet only a few do.
There’s a reason most test frameworks can mark slow tests. You need to not only use that but ratchet down over time. Especially when you get new hardware.
I haven’t run benchmarks lately but my old rule of thumb developed over many projects and with several people better at testing than I, was a factor of eight for each layer in the testing pyramid.
That certainly puts a lot of runtime pressure on the top of the pyramid, but that’s by design. You don’t want people racing to the top, because that’s how you get a cone.
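To make the "mark slow tests and ratchet down" part concrete, here's a rough sketch of a per-test time budget in pytest (the 0.5s budget and the "slow" marker name are my own assumptions; tighten the number over time):

    # conftest.py
    import time
    import pytest

    BUDGET_SECONDS = 0.5   # ratchet this down as tests get faster / hardware improves

    @pytest.fixture(autouse=True)
    def time_budget(request):
        start = time.monotonic()
        yield
        elapsed = time.monotonic() - start
        if "slow" not in request.keywords and elapsed > BUDGET_SECONDS:
            pytest.fail(f"{request.node.nodeid} took {elapsed:.2f}s; "
                        f"mark it @pytest.mark.slow or speed it up")

    # register the marker in pytest.ini:  markers = slow: long-running test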
How well does it work? Well .... it sorta mostly works. Sometimes a build will fail for inexplicable reasons because Gradle/Kotlin incremental builds don't seem fully reliable. You re-run with a clean checkout and the issue goes away.
How much time does it save? Also hard to say. Most of the time goes into integration tests that by their nature are invalidated by more or less any change in the codebase. That's not exactly incorrect. Some changes in some modules do avoid hitting the integration tests, though.
TC can also do test sharding in the newest versions when you use JUnit. We don't use this yet though.
We have a project where the team split the tests into chunks and eventually I figured out the reason why is because they had coupling between tests, and running them all together ran into problems. I worry about other people opening that door, because it’s damned hard to close again.
In theory you could get something reasonably close to this working locally, but then it's a serial process, so that's pointless.
I’m assuming you are only submitting this as theoretical, because in the real world (where I’ve directly experienced this workflow) it’s a nightmare:
* Someone merges main which deletes working files in your feature branch. Automatic pushing/rebasing of main onto your feature branch creates needless work the moment you try to push upstream when git tries to ascertain state of local main.
* countless times I’ve hit bugs where if I try to merge “patch-a” from a common “feat-1” branch (meaning patch was cut from feat, not from main), but then main is updated by the auto-updater, I then have a messy working directory in which main’s new files are treated as unknown orphans and I have to spend time deleting these by hand.
I’m all for having my feature branch be up to date at merge time. But making it a rolling target is (from my perspective) where git hits the boundary of what it can reasonably do, and it creates more pain than any kind of positive DX.
Git may or may not be part of that process.
>Ideally starting from a blank machine/VM you should be able to run a single script that gets the latest, builds it locally, runs it successfully, and passes all tests.
This isn't automatic, and is a practice that is fairly standard today (And one I agree with). You have placed a requirement/line in the sand that says, "Only when I have ideal state should this pipeline run in linear time and output the final result, which are the return codes from tests." Your example has the user initiating the update of the upstream main, not some other process that runs git fetch on the Developer's behalf.
Your original comment, as I understand it, is contradictory to this point:
>or code to function on CI machine will also be automatically loaded into production and other developers machines.
All of us (I think) agree that auto-deployment to production is a desirable goal. But we all (I think) know that broken commits are routinely delivered to production, where "production" represents the sum of all production environments in the world. So while we can have a reasonable assumption that "Production is, or should be deployable all the time," that doesn't mean the state that is represented by Production is safe to run locally in my environment, unless I *specifically* request it. Since git doesn't have file-locking, some other team/PM/developer can decide it's time for <MASSIVE REFACTOR> that blows away my work/branch mistakenly (Or maybe even intentionally, especially if I work in an org that is terrible with communication), creating unnecessary merge conflicts/mental load. This happens in short-lived and long-lived feature branches.
In no setup, do I think it's ever safe to take away the developer's agency and let some other process keep my local machine in "sync." There are so many variables to account for that some daemon/service can't be aware of, to allow for automatic updating (and again automatic updating != user running `git fetch`).
That’s not actually involved here. The actual process of coding can take place on a separate standalone project or even a whiteboard. But somehow all that code and everything associated with it needs to be packaged up for the team or it’s never making it to production. Further your process needs to minimally interrupt other team members or team efficiency tanks.
How that’s done is up to the team but it needs to happen somehow and automation avoids headaches. I briefly worked on a project where we kept passing around updates to a VM, slow and bandwidth intensive but it did actually work.
> "Production is, or should be deployable all the time,"
That’s a separate question. I am saying the code should be bundled with any needed environment configuration required to run that code. CI is a direct test of the process.
> In no setup, do I think it's ever safe to take away the developer's agency and let some other process keep my local machine in "sync." There are so many variables to account for that some daemon/service can't be aware of, to allow for automatic updating (and again automatic updating != user running `git fetch`).
Capacity is not a requirement.
For developers it’s about being able to hit a big red button and get your local environment working rather than something you automatically do day to day. Onboarding, or coming back after 3 months on another project, etc shouldn’t involve someone trying to piece together all the little environment changes that people need to apply since the last time someone updated the onboarding document.
In practice you might be reading a diff of some script and just apply that manually. But at least the sharing process is automated.
A red local build probably means a red CI build, but a red CI build doesn’t necessarily mean a red local build. Now you’re fucked because there’s a build failure you can’t reproduce.
Reducing variance helps. Having the person who set up the CI system also copy edit the onboarding docs helps a lot with this.
It’s a matter of scale. As the team grows and particularly as you hit the steep part of the S curve of development, you’re going to have lots of builds and that 1:50 error is going to go from every two weeks to once a day.
Humans interpret every day as “all the time”. Some do this for every week, especially if it coincides with their most important commits. It’s not the ratio of failures that bothers people. It’s the frequency, and the clusters.
On whose system? Yours? Mine? Someone else's? The purpose of CI is consistent continuous integration in a like-for-like manner, not relying on developer A, B, or C's systems, which may vary greatly.
This falls apart for things like integration tests, which may be too large/complex/interconnected to work on a local machine, but most of the time this would be more than sufficient.
Distributed signing of artifacts is only effective if you've got fully reproducible builds. If you don't, because almost no one does because it is a huge effort, then all you have is attestation. If a broken/malicious artifact gets deployed the damage gets done and you only know who to blame afterwards.
Local tests are useful for not knowingly committing broken code insofar as your tests can determine. Outside of that a full test suite run by a beefy cluster with ready access to assets and network resources is better suited to test for deployment.
It’s unlikely that verifying such a hash is possible without rerunning the tests, and not at the same time enabling someone to trivially compute that hash without having run the tests in the first place.
In trunk based development you can do a conditional commit, where TC runs your code and only pushes it to trunk if the build is green. This allows you to push something before lunch or a meeting for someone who is blocked without coming back to angry faces because you ding-dong-dashed.
Breaking trunk is fundamentally a problem of response time. If you’re not in the office you can’t fix a red build in a timely fashion. He did this regularly and got put under house arrest.
I started using it on myself so people didn’t have to wait two hours for me to get out of a meeting and fix their api bug.
CI itself is just continually testing before continually merging. Doesn't matter where.
If it doesn't build on CI, you broke it; fix it before merging.
Though, I'm still sour from my only experience in "tech" where "works on my machine" meant devs got to demand I fix the CI somehow.
A small but important part of my motivation to build better DevEx is that it puts a squeeze on the foot draggers. If you can’t blame the tools for being shit (and believe me, I know a lot of them are), then the only other explanations are that you are either too stupid to use them or just not a team player. An asshole.
Developers will pick being called an asshole over being seen as stupid any day, but the former eventually gets management involved. It’s the closest I’ll probably ever get to solving a social problem with tech. Make it easy for people to participate, make them feel silly for not, then ramp up the pressure if that doesn’t work, and when it’s clear they only care about themselves you have actionable evidence to involve HR.
Of course there are many pitfalls. Like with any tool it becomes trivial to use it wrong. Integrating less than once per day and deferring all testing and linting to the CI process is an anti-pattern and the issue is not the CI process but how the team chooses to use it.
For example I worked in a team of 10 and we were doing multiple production deploys per day. This was made possible by a great CI workflow. Everyone was running tests precommit and focusing on keeping the pipeline green. Yes I've seen the opposite as well, but that usually is a symptom of issues with the team not the general concept of CI.
Not saying that there isn't room for improvement but all the real-time collaboration features I used on low-code platforms feel like a step back.
I'm confused as to why this is an anti-pattern? My understanding is that the CI pipeline should run unit tests and linting for every commit. But at the same time, developers should run their tests before pushing code.
Of course the CI should always run them, but that should normally be as a confirmation/safeguard.
I've seen too many cases where the devs wouldn't even run the code locally. They would push it and expect the CI to do all the work. That's how you get shitty CI that is always broken.
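One cheap guardrail for that: a pre-push hook, so the laziest possible path still runs the suite. A sketch (the test command is an assumption; adapt to your project):

    #!/usr/bin/env python3
    # .git/hooks/pre-push  (make it executable)
    import subprocess
    import sys

    result = subprocess.run(["python", "-m", "pytest", "-x", "-q"])
    if result.returncode != 0:
        print("Tests failed; push aborted (git push --no-verify to override).")
        sys.exit(1)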
If you allow devs to push directly on release branch, thus breaking the CI, you’re absolutely doing it wrong.
Of course you can do it like you said but that means longer feedback loops in general. If the team wants to integrate more often and reduce feedback loops then that model evolves.
I'll give you an example.
In the team I mentioned in my top comment we were initially using a branching model with master, releases/*, hotfixes/*, dev, features/*, which gradually evolved into master, dev, features/*, which finally ended up as master, features/*. With the important mention that for small changes/fixes that needed to get deployed quickly nobody would bother with a branch, they would just push to master.
This allowed us multiple production deploys per day per developer with no risk. That's why I said I don't get the point of the article, you can absolutely get those short feedback loops and continuous integration if you want it, just need to setup the process that way.
Longer feedback loops =/= long feedback loops. You can definitely wait 5 to 10 minutes if it means doing it right.
> With the important mention that for small changes/fixes that needed to get deployed quickly nobody would bother with a branch they would just push to master.
From my experience, the 1-2 lines fixes are the ones that benefit the most from automated CI because you’re doing it in a rush. In my team just last week a junior dev asked us to review their PR quickly because it was just 2 lines, and it didn’t even compile. We told them to be more careful in the future, but in the end it didn’t impact anything. It couldn’t possibly have impacted anything thanks to CI, it just makes it impossible to fuck up too recklessly.
I agree with you though.
Not that I don't see the value proposed by TBD, but I think you can have >90% of said value and none of the downsides using a well thought out branching strategy.
 Trunk-Based Development: https://trunkbaseddevelopment.com/
> Depending on the team size, and the rate of commits, short-lived feature branches are used for code-review and build checking (CI). [...] Very small teams may commit direct to the trunk.
Sometimes TBD is the answer, sometimes you need something else.
What I did notice is that, with time, mature teams end up simplifying processes in order to reduce friction and increase output.
Lazy devs are going to be lazy no matter what processes their team uses.
It does take some extra work though, because GH and others don't really support this out of the box.
My issue with this approach is that it becomes tricky to scale, since you can only have one job running at a time. Allowing master to potentially break scales better because you can run a job for each commit on master which hasn't been evaluated. Technically you could make that approach work by rebasing onto the last commit being tested instead of onto master, but this adds extra complexity which I don't think standard tooling can easily handle.
https://zuul-ci.org/ and some other systems solve it with optimistic merges. If there's already a merge job running, the next one assumes that it will succeed, and tests on merged master + first change + itself. If anything breaks, the optimistic merges are dropped from the queue and everything restarts from the second change onward. OpenStack uses it and it works pretty well if merges typically don't fail.
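The scheduling idea is roughly this (a toy model, nothing here is Zuul's actual API): test each queued change on top of main plus everything ahead of it, land the passing prefix, evict the first failure, and retry the rest.

    def process_queue(main, queue, merge, run_tests):
        """main: current trunk, queue: pending changes, merge/run_tests: callables."""
        while queue:
            speculative, results = main, []
            for change in queue:
                speculative = merge(speculative, change)
                results.append((change, run_tests(speculative)))
            passed = []
            for change, ok in results:
                if not ok:
                    # drop the failing change, keep whatever was queued behind it
                    queue = [c for c in queue if c is not change and c not in passed]
                    break
                passed.append(change)
            else:
                queue = []
            for change in passed:          # the passing prefix lands on trunk
                main = merge(main, change)
        return main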
The solution is to require that all PR must be rebased/synced to master before they can be merged. GitHub has an option for enforcing this. The downside is that this often results in lots of re-running of tests.
> something even more continuous than continuous integration. It might look like real-time integration of your changes, collaborative editing, or smarter build systems that produce near-live feedback
I can have the normal flow, which takes minutes and runs all the tests and other stuff I might have on my pipeline (security, performance, etc). I use this for typical day to day work on features. I don't care if the deploy takes 50ms or 5 minutes.
I can have a fast-track for critical production patches. I skip all the main CI steps and just get my code quickly in production. If done right this takes seconds.
I didn't know about Dark but I've seen this type of promise too many times, so I'm pretty sure there are many tradeoffs hidden under the nice shiny exterior. I can't know until I try it, but that kind of complexity doesn't all just disappear; it always has a cost even if it's out of sight.
Perhaps some of those teams and individuals are experts at determining the correct, minimal continuous integration suite to run on a per-commit basis to minimize time and energy expenditure without compromising correctness.
But I can guarantee that not all (in fact, not many) are, and that they pay maintenance and mental overheads to adhere to those practices.
It feels to me like there is a potentially massive opportunity to design an integrated language-and-infrastructure environment that only re-runs necessary checks based on each code diff.
- "Altered a single-line docstring in a Python file? OK, the data-flow from that infers a rebuild of one of our documentation pages is required, let's do that... done in 2ms"
- "Refactored the variable names and formatting within a C++ class without affecting any of the logic or ABI? OK, binary compatibility and test compatibility verified as unchanged, that's a no-op... done in 0ms"
- "Renamed the API spec so that the 'Referer' header is renamed 'Referrer'? OK, that's going to invalidate approximately a million downstream server and client implementations, would you like to proceed?"
(examples are arbitrary and should imply no specific limitations or characteristics of languages or protocols)
Doing this effectively would require fairly tight coupling between the syntax of the language, ability to analyze dataflows relating to variables and function calls, cross-package dependency management, and perhaps other factors.
Those properties can be achieved during design of a programming language, or they can iteratively be retrofitted into existing languages (with varying levels of difficulty).
Bazel attempts to achieve much of this, although to my understanding it offloads a lot of the work of determining (re)build requirements onto the developer - perhaps a necessary migration phase until more languages and environments provide formats that have self-evident out-of-date status and dependency graphs.
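The core of it is a reverse-dependency walk over whatever graph the build system has (this is roughly what Bazel's rdeps queries give you). A toy version, with a made-up graph:

    from collections import defaultdict, deque

    deps = {                      # target -> the inputs it depends on (invented example)
        "docs/site": ["lib/parser.py"],
        "tests/test_parser": ["lib/parser.py"],
        "tests/test_api": ["lib/api.py", "lib/parser.py"],
    }

    def affected(changed_files):
        """Return every target reachable from the changed files via reverse deps."""
        rdeps = defaultdict(set)
        for target, inputs in deps.items():
            for inp in inputs:
                rdeps[inp].add(target)
        todo, seen = deque(changed_files), set()
        while todo:
            node = todo.popleft()
            for dependant in rdeps[node]:
                if dependant not in seen:
                    seen.add(dependant)
                    todo.append(dependant)
        return seen

    print(affected({"lib/parser.py"}))   # -> docs/site, tests/test_parser, tests/test_api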
We'll get there and the (sometimes uneasy) jokes will be about how often the software industry used to re-run huge continuous integration pipelines.
 - https://en.wikipedia.org/wiki/Bazel_(software)
What happens is that people don't understand that most of these higher level solutions are very leaky abstractions, they aren't the silver bullet marketed in medium articles. Yes, they can save you a lot of time and headaches in specific scenarios but when you encounter one of the leaks it can take you weeks to get to the bottom of it.
If the team doesn't understand what problems the tool is solving, and whether they actually have that problem, then they might just be cargo-culting. An example of this is Kubernetes. I know teams that used Kubernetes just because everybody else is using it; they didn't actually need it for their monolithic Java Spring app. They think they avoided accidental complexity by using Kubernetes, but in fact they just added accidental complexity to their project. And then they add a new tool that makes it easier to manage the complexity of Kubernetes, and so on.
Anyway, I'm probably just a rambling fool and I should appreciate that all this generating and shifting around of accidental complexity will actually mean future job safety for guys like us.
I have worked in a place where they did that, and I think the cons heavily outweighed the pros. I can not push incomplete work to a remote, I can not separate feature development from chores (e.g. linting) because I _need_ to fix stuff in order to commit and push, etc.
> Continuous Integration is a model thats tightly tied to how we run version control.
I would say that a pre-commit testing world is much tighter. CI, as many know it, is a passive observer. When stuff fails, it will let you know. You can still merge.
One thing that would be nice, however, would be the ability to run the entire pipeline locally. For GitHub actions, it indeed seems like there are viable options to do that.
I prefer not having to replicate locally how GitHub runs GitHub actions, but rather just make my GitHub actions run something that I know I can run locally. So all the complicated stuff is contained in the script that I can run locally, and GitHub actions is just a dumb “runner” of this script.
For my local script I prefer using Nix, since it’s the best way I know to guarantee a consistent environment between different machines.
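In other words, the workflow file shrinks to a single step that calls the script. A sketch of such an entry point (the lint/test commands are placeholders, not a prescription; the Actions job, or `nix develop`, would just invoke `python ci.py`):

    #!/usr/bin/env python3
    # ci.py -- the one entry point both the laptop and the CI runner call
    import subprocess
    import sys

    STEPS = [
        ["ruff", "check", "."],            # lint (placeholder tool)
        ["python", "-m", "pytest", "-q"],  # tests
    ]

    for step in STEPS:
        print("+", " ".join(step))
        if subprocess.run(step).returncode != 0:
            sys.exit(1)
    print("all checks passed")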
This cost me many hours of waiting for the Gitlab CI runner when debugging non-trivial pipelines, when the issue was something that did not have to do with the script steps inside of the jobs but rather how the Gitlab runner handled things.
I've found gitlab-ci-local, which actually does run the GitLab pipeline locally, although I had to write some boilerplate scripts to set up all the necessary 'CI_FOO_SOMETHING' environment variables before running the tool. (Which sometimes came back to bite me, because the issue was actually in the content of some of those environment variables.) It's still a very good tool.
Side note: makes me sad when even open-source tools for Gitlab are hosted... on Github...
Also hate those.
What I do instead is have the tests run as part of the build. If the tests fail, the build fails.
Definitely a good motivator for having fast tests, which is essential. The ones for my projects tend to run in about a second or two.
Also gives you way greater confidence in your tests.
Maybe the right thing is to just write the workflows in something else, and have the GitHub workflow file be a single call to your script. But:
- It would be nice to be able to use GitHub actions others have made (or libraries/abstractions in general).
- I don't see how to easily get parallel execution.
- I love GitHub environments; how do I pass down all my environments and their variables?
I'm working on something to do just this. Although I've redefined what "local" means in this context. I'm still using remote servers but everything is happening pre-commit from the terminal. If you are interested check out this demo and let me know if you have any feedback https://brisktest.com/demos
Though for me it’s also like… so many people do unspeakable horrors to their machine setups that I like there being a consistent runner in CI.
But “CI is to run tests” in the world of CD is a bit of a simplification anyways.
Though technically you can probably have a local runner which either distributes the test suite across a bunch of available machines (à la pytest-xdist), or one which goes and creates jobs on the CI platform without needing to go through the pre-test ceremony (e.g. creating commits, branches, CRs, ...).
I stand by this idea that if someone made Bazel, but without all the obtuseness, with strong integration to a hosted CI service, they would quickly overtake a lot of services.
Per-commit hooks save my time by not letting me push code that is guaranteed to break later in CI.
What if the client is compromised? Are we throwing away reviews? Should the reviewer re-run all the tests?
I kind of get what this is saying, but technology evolution doesn't have to mean completely replacing said technology with something else.
I think that's one weird thing about the software field, whereby we keep moving to these shiny new things that we think are better than the tools of yesteryear, yet in the end there is only a marginal gain in productivity.
Fork and pull is an incredibly productive and powerful workflow. CI is incredibly, incredibly useful. If these things were not the case, then neither of these would be even discussed by this article. There is a reason for their success - and it's not because GitHub is the most ubiquitous code hosting service out there. Git is _actually_ pretty great. CI is _actually_ very useful and has secured codebases for decades at this point.
So if one were to proclaim the "End of CI" I really need to see a viable alternative that addresses the same problems as CI and significantly improves upon it. An incremental improvement is not enough to shift and rewrite everything - there needs to be a significant jump in ability, productivity, security, or something else in order for me (and I imagine many others) to consider it.
"Thought leaders" need to he constantly talking up the next new thing so that they can stay ahead of the herd on socia media.
Developers get bored, or worry that their career is stagnating, if they're not using the new shiny. Particularly if they pay attention to the thought leaders, or they're stuck building unglamorous crud apps.
And the software industry generally has a poor collective memory of tools and practices and experience from even the recent past. Contrast with more mature engineering disciplines, or architecture, medicine, etc. I'm not sure why this is.
Agreed. But I'd take it a step further and say:
...yet in the end no one is any happier. Not engineers. Not management. Not leadership. And most importantly not the users of the product.
We keep building and delivering more. But how often is it better? Either as an end-to-end experience or simply with fewer bugs? Most of us - who are at some point users of products we didn't build - have resigned ourselves to the fact (read: it's been normalized) that an uncomfortable amount of friction is a given. That's sad.
FFS, look at GitHub. Shows your commits. They could be a high percentage of shite yet we drool over a saturated commit graph.
I'd say we build bigger things with the same size/bug ratio. Therefore, "Small is beautiful".
I am forever fielding questions/issues from my (retirement age+) parents. They are my benchmark for usability. From being able to open jars of food, to a website bug, to ambiguous UXs.
Until these questions are reduced - and they've been steady for many years now - I'll presume we, the makers, are failing.
That's a misunderstanding about size. It is not about saving some kilobytes - indeed we have more than enough RAM/ROM/flash/mass storage in general, although it is a bit less true currently due to shortages - but it is a hint that the whole thing is better in other areas as well.
Except for the cases where a speed/size trade-off has been made of course. But even when it is a features/size trade-off, it is not always a bad deal, because software often has lots of features you don't care about, but could nevertheless introduce bugs in the features you do use.
That said, I'm not against the idea in principle if access to the internet is guaranteed and constant. In some places that I'm in, internet access is shoddy or just too slow for this kind of thing, and my preference there would be to work easily on a local machine without access to the internet.
Technology, and historical trends in general, are rarely broken; they change, but there are no gaps, in a sense. I don't remember who said it - someone in defense R&D, if I recall correctly.
Actually, like the wheel, I think it is the final form. When artificially sentient machines are writing code for us and someone or something pollutes or corrupts a critical model, a human will need to step in and fork from where things were working to fix it.
CI isn't just a remote compute scheduler. It's also the place that contains tooling, tooling which, repeated across a dev team, would create n * jobs replicas of that tooling - which, run on local machines, would be fairly wasteful. It's also centralized because CI is one of the biggest attack vectors an organization runs: it contains credentials for outside services, credentials for deployment (which, if you're SOX-bound, cannot exist on a dev machine), and if you're cryptographically signing binaries and doing reproducible builds it's got those keys too. n * developer laptops makes that vector a planet-sized blot. A lot of the policies I imagine dealing with that would be limiting developer access to their machines and limiting tooling to only approved or internal tooling.
It'd be a nightmare.
However, things like:
[...] git is already showing its weaknesses. Fork-and-pull isn't the final form of development workflows.
> First, CI rarely knows what has changed, so everything needs to be rebuilt and retested.
Yes, this is an issue with some CI/CD systems - an issue you can solve, however. See our Push Policies based on Open Policy Agent. Test-case caching is also sometimes available (e.g. in Go).
> Running tests pre-commit is one answer.
Running tests locally for a quick feedback loop - sure, that's fairly mainstream though (something you can use our local preview for as well in the case of IaC). Running tests locally before directly deploying to prod - that would be a compliance and security nightmare.
The author presents what looks to me like a very "move fast, break things" attitude, that doesn't really work in a whole lot of cases.
If your CI/CD is slowing you down, make it faster. Easier said than done, I know, but a lot of people don't even think about optimizing their CI/CD, which you often can do, by being smarter about what you run and/or parallelizing extensively and/or caching.
“CI is frustrating!!” I hear ya, but this article does nothing to clarify a viable alternative.
Gitlab even has a great feature where it will run your CI on the post merged result of a MR. It really is the best of all worlds.
A good example of this are tests which hit the DB hard, if you are in a different location to the test database servers, the latency between the app server and the db can absolutely murder performance.
I can sort of see the problem if you are often seeing CI failures, which require tedious 'fix ci', 'fix CI 2', 'fix CI final', 'fix CI final FINAL' commits taking ages to test and see if it works. But really that's a different problem.
Why is the current CI slow and resource-heavy?
The language(s) are inadequate: every function can affect another function or data; every function / method may have arbitrary side effects. Even if not so, the language does not export the dependency graph.
The build system is inadequate: because in the source code everything potentially affects everything else, a lot more needs to be rebuilt to be sure than would be strictly necessary for the logic of the change. Even if not so, the build system does not export the dependency graph and the scope of the actual changes.
The tests end up inadequate: even if they are written competently, they cannot trust any info about the scope of the changes, so they need to test way more than logically required to be sure, or, worse, they actually need to re-test everything. Also, due to the way other systems (databases, etc) are represented in the test setup, they are hard to run in parallel.
Microservices were invented partly as a way to fight this: they make very clear bounds between independent parts of the system, and strive to keep the scope of every part small. Their small scale makes the problems with the CI tolerable.
What we need is better tools, better static analysis and compartmentalization of the code, more narrow waists in our systems that allow us to isolate parts and test them separately.
- trunk-based workflow. Small commits. No feature branches, as a rule (occasionally broken)
- unit tests move left - run pre-commit (not necessarily run in CI). Emphasis placed on them as a refactoring tool.
- a critical, small suite of e2e and integration tests that block the pipeline before publication (fast feedback)
- latest build publication being constantly redeployed to production, even if no changes have taken place to exercise the deployment routines
- a larger suite of e2e and integration tests being constantly run against production, to give feedback when something isn't quite right, but it's not a disaster (slow, non-blocking feedback).
In summary, emphasise getting code into production, minimise blocking tests to critical ones, test in production & notify when features are broken.
- Engineers spend too much time in test environments that give the illusion of the real thing. They lose touch with production as the Dev cycle increases in circumference.
- Enabling tighter feedback cycles by accepting that some features are important and some are not helps put the cost of change into perspective for the entire product team.
- Engineers get used to working in and around production on a daily basis. Production operations and observation of systems (& users) are emphasised - protective bubbles for writing code are minimised.
You're not trying to maximise code output, you're trying to maximise the velocity of safe change, and you do that by understanding the environment (production) through your intelligence (observability of systems and user behaviour), so that you can employ your force (changes, code, features) rapidly and effectively, whilst maintaining the ability to quickly deal with unexpected problems (defects) along the way.
Disclaimer: might not be possible for your specific theatre of war for any number of reasons.
That's why I wrote mazzle and platform-up.
Mazzle is a run server that is configured as a Graphviz file (.dot) and defines an end-to-end environment; an example pipeline graph is on the project site linked below.
It's largely a prototype of an idea. It's infrastructure and pipelines as code, but it incorporates every tool you use, from Terraform, Chef, Ansible, Puppet, Kubernetes and Packer to shell scripts. My example code spins up a Consul and Kubernetes cluster with HashiCorp Vault and a Debian repository server, configured SSH keys on every worker, a bastion and a Java application. And a Prometheus exporter and Grafana. I haven't got around to adding ELK yet. But it didn't take long to do all these things, because Mazzle makes it very easy to test complicated changes together.
https://devops-pipeline.com - Mazzle
Platform up is a pattern for local development that tests your Microservices locally all together. You use vagrant-lxc and ansible together to deploy everything locally. So you can test your secret management locally and deployment process. If your ELK stack is ansible driven you can even run your ELK stack locally as I did on a client project.
Software development that relies on "special" environments is prone to break down sooner rather than later. If you cannot execute your tests on a fresh machine after half an hour of setup or so, something is fundamentally broken in your toolset.
In turn, this requirement means that in a well-designed development environment your integration frequency is only limited by the product of integration points (e.g., your branch and "develop", or your branch and n other ongoing branches) and the time it takes to run the tests.
Let me assume instead that most software projects happen in the long tail of small companies, low LOC numbers, low developer head counts (even one or zero - consultants working only when needed.) In that world I saw deployments run when tests pass on one PC, deployed from that PC. In the case of scripting languages even with a git pull on the server and a restart. That works surprisingly well until the team gets larger. Then customers often resist the extra costs and burden of CI. Then some of them cave in after something bad happens.
Build can be triggered manually in every CI system I know. Why would I push empty commits?
(Which is pretty much every project that survived the POC phase.)
Or at least, if we are, it's a far more robust and capable system that's actually really designed for dealing with all the different things that people need to do while building and testing software.
But surely most IDEs these days make it simple to run all unit tests locally? My usual M.O. is to manually run the tests locally that I believe will demonstrate my changes are "working" and rely on the full suite of CI unit tests to check I haven't broken anything else. What other options are there?
While I develop I start a `watch` process that keeps running the tests. The watch runs a container with the docker image of the CI, mounting only my development working directory.
With this setup there's no added value to the CI
We looked at doing this, but it seemed to be pretty expensive. Like, it was cheaper to upgrade a developer's machine twice a year expensive.
Is that what you see or was it just setup wrong on our side?
Not everything has to be a massively scalable cloud-native web app with a dependency tree 27 levels deep and DevOps turned up to 11. Not everyone has jumped on the merge-to-trunk-and-deploy-every-five-seconds process train.
Plenty of developers still work with modular systems in teams that each focus on a particular part of those systems. They follow simple feature-based development processes. They have local development environments that provide near-instant feedback on their own workstations.
A lot of the problems mentioned in the article simply don't exist in that kind of environment. It would never even occur to those developers that they might need to push work in progress to trunk just to be able to test it properly or that their team might have significant merge conflicts so often during normal development that they'd need to change their entire process to reduce the frequency.
It's like Scrum. Some people like it and some people don't. But the people who don't like the rigid time boxes and ceremonial meetings probably aren't going to join organisations that have those things. Sometimes people in those organisations then start to think that their way is the only way because they've been doing it for so long and never seem to hire anyone with different ideas.
The point of CI is the same: to batch your work into stages and apply it just when it is needed. Merging your work incrementally ensures putting in your pieces doesn't stop everybody else from putting in theirs or making them re-do stuff. So CI isn't going away.
What we should do is make the process more seamless. For example, IDEs should not be editing local files, they should be committing directly to a VCS branch, which should be immediately rebuilding an app, which should immediately redeploy in a cloud environment dedicated to an engineer. We can make it as fast as local development (it's not like your local computer is faster than a giant server in the cloud).
If you're thinking "but what about offline work?", we literally have 250Mbps satellite internet worldwide now. If you really can't get a stable internet connection, you can keep a local server to do development in. But the local server software must be 100% identical to what's in the cloud. Towards this end, we must build fully-integrated, fully-enclosed, fully-self-hosted CI environments. DVCS will seamlessly link the local and remote environments.
It's two different worlds.
Left or right of VCS should not be a big question given the VCS is supposed to be decentralized, but for that to happen the whole dev process (or at least a very substantial portion of it) should be day to day decentralized at well, or at least with a (well tested) ability to decentralize when needed, without friction.
If we go back to the more general "configuration management" concept (rather than merely source code version control), it is obvious the ideal situation is when the whole pipeline can be freely reproduced, in which case you can usefully run it before the integration onto a reference branch/repo (and redo it after, just to be sure, because who knows if the whole configuration has been correctly captured: you had better detect discrepancies early too
- also because in some workflows what you test before merging to a reference is not exactly the same branch as what results from the merge).
I've become a huge proponent of where you can (sometimes you have to cross-compile, or the compute job is so big), using the same os to develop that you deploy in. This grossly simplifies tooling and testing and has the side effect of focusing developer mindshare (and development hours) away from simulating a foreign OS on their local machine. That time that would have been spent getting X working in Windows (and maintaining it) or fixing some deploy to container script because the new MacOS broke something can be spent, if there's nothing else better, fixing bugs in the actual product.
In my case that would be some sort of Linux targeting servers. Individual Linuxes are pretty different in important ways, so it probably would have to be the exact same flavor. I've seen my colleagues spend a lot of time fighting even the Linuxes that target laptops to get basic things like USB-C, sleep, battery life, docking stations and projectors to work well, which I've read on HN were all solved about five years ago, but I still see colleagues fighting Linux quirks, so I guess a laptop running RHEL will be a terrible hassle.
I would also have had to switch my OS every time I've changed employers (sometimes when changing teams), and I'd lose the accumulated muscle memory of over a decade on macOS, plus all of the platform specific tooling, plus the full Microsoft Office suite to interface with the non-tech part of the company.
I just try to deliver everything I build as a docker image, that way I get pretty decent isolation from the particularities of the host OS and it's pretty easy to test locally under conditions very close to prod. I've switched to a M1 Pro notebook earlier this year and apart from having to specify the arch when running some docker images, I've hardly noticed a change, though I don't develop Linux or Windows desktop software (hence, no X or the like needed).
It seems to show that the author likes to use git commit hooks to run linters and unit tests. Which is ... debatable, but the real question is: why should this affect CI?
I understand that discovering a problem during the tests after the code has landed on the main branch is ... discomforting, at best.
I also understand the frustration of not being able to commit your changes in a precise way because the git hook required you to also touch other stuff.
What I don't know is why can't we run the CI on a merge-request flow, on the beefy servers, without blocking the local commits, sharing feature branches any time we want, and be checked only "when ready".
FWIW I worked on a project set up this way, with a Jenkins job starting each time an MR was made, where the branch is applied "locally" onto the main branch and the tests run, and the approval (needed for the actual merge) obtained only if the tests passed.
This is a twitter tier take with no insight to back it up except “things will continue to get better and soon things will be different!” But doesn’t paint a picture of what different might be.
I mean, yeah things are going to change, duh
Jokes aside, one of the reasons why we still need a CI is it proves we can do a full clean build. Optimizations that rely on what actually changed are good for local dev.
One way to truly get rid of CI is to get rid of the need to do a full clean build. If we structure our build to be constructed from a series of immutables input files (easy with version control) with a series of pure (as in no side effects) build steps and hash the output file by its input + builds steps, then the concept of a clean build is meaningless.
Then you could even allow regular dev machines to push the build assets to a centralized build cache. However I would still want an independent "known good" oracle to rebuild and check the hashes. One could call this oracle a CI...
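A sketch of what that hashing looks like (the paths and the step description are invented; a real system would also hash toolchain versions and transitive inputs):

    import hashlib
    import json
    import pathlib

    def cache_key(input_paths, step):
        """Key = hash of the step description plus the exact bytes of its inputs."""
        h = hashlib.sha256()
        h.update(json.dumps(step, sort_keys=True).encode())
        for path in sorted(input_paths):
            h.update(path.encode())
            h.update(pathlib.Path(path).read_bytes())
        return h.hexdigest()

    key = cache_key(["src/main.c"], {"tool": "cc", "flags": ["-O2", "-c"]})
    # look `key` up in the shared cache; build and upload only on a miss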
> Today, a developer would be crazy to suggest anything other than git and trunk-based development for a project.
My understanding is that trunk-based development is about shipping a feature via multiple merges of incremental changes that are often single commits; the main opposing model is feature-branch-based development, and it's with the latter model that the terms "pull request" and even "branch" are associated. Is my description of the terminology accurate and is the author of this article using the terminology confusingly?
However, most of this is a result of the author assuming that everyone and every organization echos their experiences and point-of-view (which isn't uncommon in online writings).
If the article were titled something along the lines of "My views on the long-term viability of CI" and the article written accordingly, then readers would be more willing to ponder alternatives to, or even a redefinition of CI in the viewpoint of the author knowing that it's a personal opinion piece.
> As I wrote in Your Integration Tests are Too Long, developers become reliant on CI and won't run the tests until merge-time.
That's a big issue. I think that testing is very important, and integration testing should be done from the git-go. Anything that discourages early integration testing, is a problem.
Recently, I was invited to submit a PR to a fairly sizable project (I was the original author, but have not had much to do with it for the last three years or so).
I declined, because, in order to make the PR, I would have had to set up a Docker container, Composer, Jenkins, xdebug, PHPUnit, etc., on my computer, in order to run the full integration tests (I won't submit a PR without running the tests, as that's just rude).
For someone that is a regular backend engineer, like most of the team working on the system, that's no big deal. For me, it's a fairly big deal (I write frontend native Swift stuff, and don't have infrastructure on my machine for that kind of work).
That means that they will have to do without a fairly useful extension that I could have added.
That's a problem with your building and testing tools. Even without CI you would still need to build everything and test everything if you have no way to do it just for what changed.
>How often have you or a coworker pushed an empty commit to "re-trigger" the CI
Most CI solutions have a button to trigger a new version without a new commit.
>Running tests pre-commit is one answer
Even with CI developers are likely running at least a subset of the test suite while they are developing.
>Yet, there are roadblocks today that need to be fixed, like tests that don't fit on your laptop
Either develop on a server while using your laptop as just an editor or have a test runner on the server.
>Things are shifting left
The problem with this is that at large companies the changes you've already tested will always be rebased onto a newer version of the codebase which they haven't been tested with. Who runs tests for this newly rebased version? CI. Also, things like code review cannot be pushed left onto the dev's machine. You will want linters and tests to be run for code review.
Funny, I thought it would be crazy to suggest trunk-based development when everybody is so enamoured with feature branches (trunk-based means that the whole team commits directly into the main branch - so pretty much the opposite of using branches with merge requests).
Another counterpoint is that CI makes at least a lot of sense for cross-platform development. E.g. when you're working on Linux, you can't trivially check your code locally on macOS and Windows, or even just on different compiler toolchains unless you install all those things on your local dev machine.
Trunk-based means merge requests get merged into master instead of feature branches.
Git doesn't lend itself very well to this kind of development model though, because it doesn't have a central repository like SVN.
Also see here: https://trunkbaseddevelopment.com/
The basic form of trunk-based doesn't use feature branches, the "scaled" version only uses short-lived feature branches. But I can assure you that the "scaled" version isn't needed for teams of up to around 100 contributors (with a centralized version control system like SVN that is).
Feature branches let you save intermediate work outside of your computer.
If you have trunk based development and your machine smokes, all unpushed work is lost.
People are looking for some philosophical advantages of one over the other. But in practice, it's just a trade-off of how much you save and where.
I think it really comes down to the differences between a distributed versioning system like Git, and a centralized versioning system like SVN. A lot of the "branching ceremony" that evolved around Git is about managing the different timelines that evolve because every dev has its own local repository with its own history which then needs to be synced with a remote repository, and this entire problem area simply doesn't exist if there's only a single shared repository.
In SVN there are no "merge commits" polluting the history when working on a single branch. If you update (in git lingo: pull), there are no "merges" or "rebases" happening, instead you get conflicts that you need to resolve locally. Next time you "commit" (in git lingo: push) those resolved conflicts are uploaded to the central repository as if they were regular changes (e.g. no "merge commits").
Somehow this central-repository-model of SVN still is much more logical to me. The distributed model of Git doesn't make a lot of sense unless you're a Linux dev with your own 'fork' sending patches to upstream from time to time via email (e.g. despite git's decentralized nature, Github or Gitlab are also just trying to emulate a simple central-repository-model).
> If you have trunk based development and your machine smokes
That's why you commit to trunk just as often as you would push to a feature branch. You just have to organize your work in a way that frequent commits don't break the build (e.g. by putting your work behind a runtime feature flag), the upside is that you never get into a state where you need to deal with complex merges, because your work never differs from the shared project state by more than a few hours.
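For what it's worth, the flag doesn't need to be heavy machinery. A minimal sketch in Python, assuming the flag is just read from the environment; the flag name and the checkout functions are made up for illustration (real setups usually read flags from a config service and can target specific user groups):

    import os

    def flag_enabled(name, default=False):
        # Read a boolean flag from the environment, e.g. FLAG_NEW_CHECKOUT=1.
        value = os.environ.get("FLAG_" + name.upper())
        if value is None:
            return default
        return value.strip().lower() in ("1", "true", "yes", "on")

    def old_checkout_flow(cart):
        return "old flow: %d items" % len(cart)

    def new_checkout_flow(cart):
        # half-finished work lives here, merged to trunk daily but dark by default
        return "new flow: %d items" % len(cart)

    def checkout(cart):
        # the unfinished path only runs when the flag is flipped at runtime
        if flag_enabled("new_checkout"):
            return new_checkout_flow(cart)
        return old_checkout_flow(cart)

    if __name__ == "__main__":
        print(checkout(["apple", "pear"]))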
A runtime feature flag sounds absolutely repulsive. And to avoid complex merges you can just rebase your feature branch onto the current trunk often, and either avoid working on two large tasks affecting the same area simultaneously, or merge or rebase one onto the current state of the other once the other is in a stable enough state.
But still, branches are simply not as important in a centralized version control system where everybody works on the same shared repository state.
Runtime feature flags also make a lot of sense outside the version control workflow. If you have a "live product" you often want to enable or disable features after a new version has gone live, sometimes only for a group of users.
Yeah, most likely. Since SVN is not dead yet, that must have been the case, because I don't see how any modern dev could suffer with the SVN I suffered with.
I see how having binary data checked into git, and merging it in any manner other than "use A" or "use B", might be a nightmare, and how never branching might be the best (or even the only) way to be able to use versioning at all.
I remember that when at some point I started using sqlite in my projects, I wanted to have another database format I could check into the repo that would use a multiline text representation for the data, with things that must be binary, like indexes, just generated on deploy.
I can't imagine needing to have in my source tree some opaque binary formats that might be altered partially.
I think I'd just keep those outside and use a meatware process to manage changes in them. Or have some migrations system where updates to binary files can be described textually and checked into the source tree.
It’s inherently a batch process, so not continuous.
It’s an automated build and check system. It also runs on merge commits but this “integration” part is really marginal to the concept as represented by e.g. GHA.
I remember the MCSD docs were promoting a "Daily build and smoke test". This was a huge difference in that they promoted at least making sure that the whole system could be built in some state every day.
CI really appeared in the late 90s when someone had the idea that integration tests could be run for each small increment in functionality. Then the software could be automatically tested on every incremental change. I credit C3/XP for popularizing the practice but I'm not a historian. Possibly someone was already at it before.
I think there's a whole lot of truth in this particular statement. I've worked with lots of devs who complete a story, commit it, and then find the tests aren't passing any more and have to do a little more work to fix what they broke. And then they complain that either tests slow them down or that the estimate for the story was too low. Having a test suite that can be run locally, and teaching the team to use it regularly, even if it's wholly optional rather than running on a hook, improves team velocity significantly.
I often fire off a commit, get notified of a test failure, and fix it in the next commit. Why wait for tests to run locally when some other computer can do it for you?
(Yes, it would be nice if all tests ran in zero time, but back in the real world...)
However, on the topic of real-time integration: that is somewhat already here for no-code systems, which are already divorced from traditional version control.
When I say tech debt I mean design flaws created by a designer who simply could not anticipate the future. I'm not talking about bugs, mistakes or intentional shortcuts.
Whether jenkins is "tech debt" is a different topic. But whether Jenkins is here to stay past 2025 has nothing to do with how well it's designed.
If the builds done by the developer can be cached and shared with the CI, CI's role is reduced to just gate-keeping. Most of the time, the cache is already warmed up by the developer's build and CI is a noop. Imagine a CI that takes 5 seconds to finish.
For that to become reality, the build environment needs to be truly hermetic and reproducible, so that the cache can be trusted. Remote builders also help establish trust.
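A minimal sketch of that trusted-cache idea, assuming the cache key is just a content hash over the source files and a toolchain identifier (file names, the toolchain string and the local cache directory are made up; real systems like Bazel's remote cache hash the full build action and share the cache over the network rather than a local directory):

    import hashlib, pathlib, subprocess, sys

    CACHE = pathlib.Path(".build-cache")

    def cache_key(sources, toolchain="cc-13.2"):
        # Hash every input that can influence the output; anything missed here
        # breaks "hermetic and reproducible" and the cache can't be trusted.
        h = hashlib.sha256()
        h.update(toolchain.encode())
        for src in sorted(sources):
            h.update(src.encode())
            h.update(pathlib.Path(src).read_bytes())
        return h.hexdigest()

    def build(sources, out="app"):
        cached = CACHE / cache_key(sources)
        if cached.exists():
            # the developer's build already warmed the cache: CI is effectively a noop
            pathlib.Path(out).write_bytes(cached.read_bytes())
            return "cache hit"
        subprocess.run(["cc", "-o", out, *sources], check=True)
        CACHE.mkdir(exist_ok=True)
        cached.write_bytes(pathlib.Path(out).read_bytes())
        return "built and cached"

    if __name__ == "__main__":
        print(build(sys.argv[1:] or ["main.c"]))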
This doesn't mean that daily tests will be a thing of the past, but potential conflicts can be found sooner.
I don't know how commits would work. I feel like they would become a hindrance to productivity, but having atomic changes spanning across multiple files seems like a must for proper rollback scenarios.
I love the freedom having my own private branch gives me to go crazy and demolish whatever I want to see how bad it is to fix. It's like taking out supports in a building to watch how it falls down, then try again.
By the time I make a PR I've rebased and cleaned up my commits and made it look like a nice controlled demolition and careful surgical replacement. No one has to see I was ignoring hundreds of compile errors for a couple of days.
> It can be very tempting to substitute automated build software for an integration computer.
So as a thought experiment it’s neat, but knowing that most orgs will never need Kube and will never develop software like a FAANG, it just strikes me as clickbait given the title.
I'm not convinced by the pre-commit arguments. The C in CI isn't only for "continuous"; it also happens to introduce "centralised" and "certain" (as in, it's going to run whether you like it or not).
Coupled with good git style there really is absolutely nothing that needs to change for teams of 1 and bigger.
Just my 2€
> There is such a build system, but I can't remember the name right now. It tracks system calls to see every file opened by the compiler to produce exact dependency graphs (assuming compiler is deterministic).
> The downside is that it's Linux only.
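Whatever that system is called, the core trick can be sketched in a few lines: run a build step under strace and record every file the compiler successfully opens, which becomes the exact dependency list for that output (Linux-only as noted, strace must be installed, and the command below is just an illustration):

    import re, subprocess, sys, tempfile

    def deps_of(cmd):
        # Run one build step under strace and collect every file it opened.
        with tempfile.NamedTemporaryFile(suffix=".trace") as trace:
            subprocess.run(
                ["strace", "-f", "-e", "trace=open,openat", "-o", trace.name, *cmd],
                check=True,
            )
            opened = set()
            for line in open(trace.name):
                # successful opens end in "= <fd>"; failed ones end in "= -1 ENOENT ..."
                m = re.search(r'open(?:at)?\(.*?"([^"]+)".*\)\s*=\s*\d', line)
                if m:
                    opened.add(m.group(1))
            return sorted(opened)

    if __name__ == "__main__":
        # e.g.  python trace_deps.py cc -c foo.c -o foo.o
        for path in deps_of(sys.argv[1:]):
            print(path)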
As I declare that I start development on a task, the branch is made automatically from current trunk and testing environment is set up with this branch deployed. As I commit and push that environment gets updated.
Last part of the development before handing it off to a tester is merging current trunk into my branch and resolving any issues. When the tester picks up this task to test, another merge from trunk to the task branch is performed automatically, and if there are conflicts the task is automatically pushed back to development to resolve them, so the tester can immediately move on to testing another task. If there were no issues with the pre-test merge, the tester tests whether the task was implemented satisfactorily. Then, if trunk didn't change in the meantime, the task branch is merged back into the trunk by the tester. If the trunk was changed then the testing part repeats. The issue is that only one development task can be tested at a time without needing to recheck all the tasks currently under test whenever one of them is merged into the trunk.
In case of small tasks, the tester might gain enough familiarity with the task to recheck it quickly after trunk has changed and been merged into the task branch. In case of a large dev task, it should be split among a few testers.
Order of merging tasks into the trunk should be determined by testing team to minimize recheck needed after the merges.
If anything goes wrong at any stage of testing the task is immediately kicked back to developer.
In this setup developers do no manual branching and merge only the current trunk into their branches, and can pretty much work independently (unless two pick very overlapping tasks); they can work on multiple tasks at a time and react quickly to tasks that come back from testing with minor issues. And testers coordinate testing and merging tasks back to trunk. Testers are also in close contact with stakeholders, so they can demo new features for them on testing environments and know the priorities, so they can work on merging higher priority tasks first. They can even participate in designing the new features or improvements, as they know the current state of the built software best and interact with it the most. Once the design starts to crystallize, developers might be asked to provide input on the feasibility and complexity of the feature implementation.
When I say merge it might be better to do a rebase. I'd have to actually work in that system to see what's better.
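The pre-test step in particular is easy to automate. A rough sketch, with made-up branch names and without the "move the ticket back" side of it (which would be a call into whatever tracker is in use):

    import subprocess

    def git(*args):
        return subprocess.run(["git", *args], capture_output=True, text=True)

    def prepare_for_testing(task_branch, trunk="origin/trunk"):
        # Merge current trunk into the task branch before testing starts.
        git("fetch", "origin")
        git("checkout", task_branch)
        merge = git("merge", "--no-edit", trunk)
        if merge.returncode != 0:
            git("merge", "--abort")
            return "conflicts: kick the task back to the developer"
        git("push", "origin", task_branch)
        return "ready to test"

    if __name__ == "__main__":
        print(prepare_for_testing("task/1234-new-report"))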
> But all technology evolves, and git is already showing its weaknesses.
Nobody does it because developers can't be trusted to actually run tests.
Perhaps if there was some way to cryptographically prove that you've run tests? That doesn't sound possible though.
1. I still wouldn't put it past developers to lie occasionally.
2. You don't control the test environment. What if the tests pass on the developer's machine but not on the CI machines?
3. Difficult to test on multiple platforms. What if the developer uses Mac but you need to support Windows too?
4. Testing can take a long time. Who wants to wait an hour to submit a minor PR? Plus it ties up resources on your machine. CI can automatically scale.
5. There's no way to avoid races between testing and integration (i.e. merge queues).
CI isn't going anywhere. I think the biggest scope for improvement to CI is
1. Only testing things that have changed. Most people don't do that because it requires a build system that properly isolates everything. Basically only Bazel and its derivatives do this. You can't do it with CMake or NPM or Cargo or ...
2. Make it easier to run CI on your own machine. I don't think there's really any technical barrier to this, it's just people don't usually bother.
> Trust in the developer is required ... but it is likely the case that developers are incentivized to illustrate that they are reliable, careful and trustworthy -- and that should strongly encourage accurate test result signing
100% guaranteed you're going to see "oh I just made a stupid typo that broke one test but I can see what is wrong so I'll just fix it and use the previous test results".
Also since it's entirely based on trust anyway I'm not sure what additional benefit signing gets you. I don't think there's really a way to prove you ran the tests.
Either way, all my other points still stand. CI isn't going anywhere.
It's a reputation and trust-building exercise, essentially - and that's one of the reasons that continuous integration is particularly useful. "Entities X, Y, Z all say that commit <ID> looks good".
> Either way, all my other points still stand. CI isn't going anywhere.
The idea of "running the tests locally" may fit some projects but in a complex system it's definitely a no-go.
I think the author means "the end of build servers". You can do CI fine without build servers.
When I read the title I automatically assumed this would be a "...because CI won and is everywhere now" sort of closer. Junior devs' eyes go wide when you tell them builds used to happen overnight, "feature branches" could last months or even years, and merging was someone's full-time job.
- it conflates a build system (like bazel) with CI execution platforms (like buildkite)
- it ignores the origins of CI - build everything from scratch so none of your workflow optimizations affect anything
The build is neither working nor not working. In fact, it isn't even built!
If you need this, you can have this already. As far as I'm aware, anyone running a large monorepo (Google etc.) is doing some type of diff-aware test pruning, simply because you can't run all of the tests on every change.
And for us small guys, there are tools you can use to do change-aware tests, "Test impact analysis" is one keyword for this (there may be others, coverage.py calls it "who tests what": https://nedbatchelder.com/blog/201810/who_tests_what_is_here...).
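The mechanics don't have to be fancy either. A sketch of the small-guy version, assuming a "who tests what" map was produced earlier (e.g. from coverage data) and stored as JSON; the file name, map layout and base branch are made up:

    import json, subprocess, sys

    def changed_files(base="origin/main"):
        out = subprocess.run(["git", "diff", "--name-only", base],
                             capture_output=True, text=True, check=True)
        return [line for line in out.stdout.splitlines() if line]

    def tests_for(changed, test_map):
        # test_map: {"src/foo.py": ["tests/test_foo.py::test_bar", ...], ...}
        selected = set()
        for path in changed:
            selected.update(test_map.get(path, []))
        return sorted(selected)

    if __name__ == "__main__":
        test_map = json.load(open("who_tests_what.json"))
        tests = tests_for(changed_files(), test_map)
        if not tests:
            print("nothing to run")  # a cautious setup would fall back to the full suite
            sys.exit(0)
        sys.exit(subprocess.call(["pytest", *tests]))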
> I, for one, hope that we aren't using Jenkins in 2025.
I certainly hope not. I had moved on from Jenkins in 2015. But I don't think there's much reason to think the author's "run the tests locally" speculation is particularly likely to supplant a centralized CI/CD system. As a simple argument against, for supply-chain integrity reasons, the tests and builds need to be run in a common, public environment. You're not going to push anything to production that has only been tested/built on your own machine. This is one of the core tenets of the modern software philosophy that lets you move extremely fast (deploying many times per day) and (perhaps I'm showing a lack of imagination) it's hard to see an alternative here.
Here's a counter-prediction:
1) In the short term, the pendulum will continue to swing back towards thin-clients (Github Workspaces etc.) which means we'll see more emphasis on cloud-based test runners per-dev-environment, and making these faster.
2) The push towards improved cycle-time (both for developers running tests on their WIP code, and for pre-merge tests on branches) will continue which will mean that the "change-aware test runner" tech will propagate down from high-complexity codebases so that every test runner is expected to offer conditional compilation/testing.
3) And finally, a bigger/more speculative one - the "build/test graph" (the potentially-per-file set of tasks required to create your binary artifacts and test them; the thing Bazel computes, or that pytest computes) and the "CI/CD job graph" (the thing that GitLab defines, which might include "run the build/test graph" and "deploy this artifact to $environment") will meet in the middle and become the same thing, so that you can invoke exactly the same graph processing logic locally (if you choose) as your CI/CD server would invoke on any given commit/tag. Earthly are working in this direction for example.
> or smarter build systems that produce near-live feedback
You can currently set up your IDE to re-run the tests in real-time. If you add "who tests what" to that, you can have your IDE re-running just the tests related to your diff in real-time, which might actually be fast enough.
It's not clear to me that the build system is the thing that needs to run this; it looks more like the IDE to me. But perhaps the author might agree with my general suggestion that we should make real-time IDE tests logically identical to the per-commit CI/CD tests if possible.
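Outside an IDE you can approximate the same loop with a file watcher. A rough sketch using the watchdog library; the watched directory is a placeholder and, to keep it short, it re-runs the whole suite rather than mapping the saved file to just its tests:

    import subprocess, time
    from watchdog.events import FileSystemEventHandler
    from watchdog.observers import Observer

    class RerunTests(FileSystemEventHandler):
        def on_modified(self, event):
            if event.src_path.endswith(".py"):
                # a smarter version would consult a "who tests what" map here
                subprocess.run(["pytest", "-q"])

    if __name__ == "__main__":
        observer = Observer()
        observer.schedule(RerunTests(), path="src", recursive=True)
        observer.start()
        try:
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            observer.stop()
        observer.join()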