And yet it is still half baked. We prepared for this with internally shared docs and the branch built in private for a while, but still had to roll back yesterday because the scheduler reverted to putting jobs wherever it pleased (including on ephemeral runners that already have a job) and randomly cancels large sets of jobs too.
I have been of the opinion that investing into GH Actions at this stage is purely sunk cost (at my org), and I'm not moving until the team behind this thing ships something that doesn't break half the time. These have been seriously frustrating months, because no amount of working around this messy code[1] made of 5 layers of MS style .NET (seriously, deleting a directory goes 5 layers deep in the call stack) will ever produce a stable product. They don't even know their own code base that well: when they first attempted ephemeral runners with `--once`, it turned out the thing they produced could never work (because the server-side scheduler loves pipelining jobs to machines and failing miserably when these disappear, of the "job times out after 20 minutes of waiting" type).
Has your team considered looking into buildkite? We love the flexibility it gives us. Being able to dynamically build pipelines is a very nice feature that not many others have (at least that I could tell when I was researching).
We would like to run other things (probably concourse or argo), but this decision was made way further up to justify picking GitHub as provider. There might also be a Microsoft volume discount involved.
If we hard reject actions, we’ll probably end up with the prior status quo: Jenkins.
Product manager on the GitHub Actions team reporting in, we're sorry to hear about this issue with the rollout of ephemeral runners. Our engineering team is aware of this issue and is heavily prioritizing the investigation and fix.
We'd love to look into your specific case if you want to shoot me an email: thejoebourneidentity@github.com
One, our githubcustomers/ contact is already forwarding anything / setting up a call with the team as needed, and two, that is not a twitter profile I'd ever send a DM to in a professional context, considering you retweet a lot of people diametrically opposed to my existence.
I will check in with our githubcustomers group to help accelerate. I can also directly inform our engineering team if you're open to sending me your information at thejoebourneidentity@github.com
Either way, we're looking into this issue and I'll post an update here once we've learned more.
Asking end users of your product to report issues via DM'ing your personal Twitter account, an account which is full of retweets of homophobic garbage, is really REALLY bad.
Quietly editing your comment after being called out to hide it is even worse.
Agreed. Thanks for pointing this out. I'm genuinely curious now what their Twitter profile is, but my guess is they'll delete the tweets or remove the Twitter account.
Microsoft notoriously hires a lot of people from the Federal Sector who unfortunately appear to be mostly right-wing religious zealots.
Yikes. This is the strangest thing I've seen on HN in a long time. How can someone responsible for product at Github (!) think posting your private twitter account in a context like this is acceptable, and even more so when it's filled with this garbage.
Joe Borne works at GitHub only because it was acquired by Microsoft. He's been in the industry for more than a decade but his GitHub account was created only in 2016, and this is his sole public repo:
I know multiple people who left GitHub after it was acquired by Microsoft because of unsatisfying experiences they previously had with Microsofties. It seems that they made the right call.
I didn't actually see anything overtly homophobic/transphobic, though there's some border control posts he liked/retweeted that go a bit into the racist side. Not someone I'd want to hang out with, but I don't see anything like what was being said.
Still, what could possibly make anyone think posting a personal twitter handle for this was a good idea. That alone makes me question that he should have the role he does at GitHub.
I disagree with their politics and think it's unprofessional to conduct business through a personal twitter account made up of partisan political retweets regardless of the politics, however I don't think it's necessary for you to repost this person's twitter account after they removed it from this thread, it will only lead to unnecessary harassment. I hope that you or the moderators remove it.
It's the first Google result for 'joe bourne github twitter'. He's not trying to keep it a secret, and I don't know why we should keep it secret for him. Is he supposed to be embarrassed about it?
This is the same company whose sales team celebrated their ICE contract with little American flag and eagle emoji (https://www.latimes.com/business/technology/story/2019-10-31...) and eventually decided to donate all their proceeds from the ICE contract (i.e., their motivation was helping ICE, not making money).
Everyone here lives under capitalism. Making money is fine. Saying "If you pay us, we'll take your money" is totally expected. But that's not what they're doing here.
But GitHub is now a MAGA-converged organization pushing a social agenda, and you can see it in the quality of their engineering.
We recently faced this; if you are using the docker/login action I'd give that a check as it turned out it was logging us out by default at the end of each job; resulting in some race conditions when running multiple runners on the same machine (sharing the same docker daemon).
Simple fix was to add `logout: false` to the action options.
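For anyone hitting the same thing, a minimal sketch of where that option goes, assuming the login step uses `docker/login-action`. The job name, the `@v1` tag, and the secret names are placeholders rather than anything from our actual setup:

```yaml
jobs:
  build:
    runs-on: self-hosted
    steps:
      - name: Log in to the registry
        uses: docker/login-action@v1   # version tag is an assumption
        with:
          username: ${{ secrets.REGISTRY_USERNAME }}   # placeholder secret name
          password: ${{ secrets.REGISTRY_PASSWORD }}   # placeholder secret name
          # Skip the automatic logout at the end of the job, so a job
          # finishing on one runner does not log out a job still running
          # on another runner on the same machine.
          logout: false
```

The trade-off is that the credentials stay on the machine after the job ends, so this probably only makes sense on runners you already trust.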
I really wish the runner agent was written in something more portable than .NET. That choice feels like something purely political because they’re owned by Microsoft. I doubt an independent organization would have chosen it over other excellent choices such as Go, Rust etc.
Currently hosting the runner on e.g. FreeBSD or custom embedded systems is not supported (or even possible).
It's not because they're owned by Microsoft, at least not in the way you think.
It's because GitHub Actions is rebranded Azure Pipelines.
That a team at GitHub has been given a pile of Microsoft authored code is honestly much more concerning. They don't seem to understand it in its entirety either.
You can also compare the source code structure to the Azure DevOps Pipelines agent's to tell pretty easily that the GitHub runner is a fork that has been edited to process the somewhat different format of the Actions YAML.
Most of the people who worked on Azure Pipelines got transferred to GitHub to work on Actions. You can even see the same names of developers contributing code to both the GitHub runner and the Azure DevOps agent.
There are other operating systems and CPU architectures. It's a boon for open source projects to be able to have CI on all the BSDs, and illumos, and Plan 9, and even weirder things.
Too little and too late. Meanwhile, I'm over here with gitlab self-hosted runners that "dispatch" ephemeral runners. I can tweak scaling limits and the whole contraption runs seamlessly on the AWS ec2 instances of my choosing.
My company just completed the migration from github to gitlab and, while it's not perfect, there's a lot to like on gitlab.
I think it's all just a matter of team preferences and for many this is less "too little too late" and more "yet another great release" when compared to the other tools they're using.
I personally find Actions to be a far better product than GitLab CI and we're moving all our CI from a mix of Circle/Jenkins to Actions.
It's funny because that's pretty much my experience with GitLab CI. Actions is certainly a younger product, I do feel that, but in terms of the design, how the pieces fit together, and what it feels like to develop for it, it all feels much more mature than GitLab CI to me. GitLab always felt like a Travis clone that was hacked to look more like CircleCI.
The absolutely most annoying issue with GitHub Runners is the fact that they run 1 job .. at a time ... per server.
You can only imagine our follow-up meetings about the fact that we had a fleet of 15 c5a.2xlarge instances and still half of the developers were waiting up to 20 minutes for an instance to go online.
The worst part? The jobs don't clean up -- probably to allow for caching. We ran into disk space issues regularly enough for it to force us to make the spot instances commit harakiri after 2 days.
GitHub Actions are a cool concept and we'll probably stick with them. But their quality is just bad. There's that .NET runner and it feels like it's so massively different from anything GitHub-like you could imagine .. almost as if it's a whitelabel program they licensed or like it's the result of a 4 week contract work. Simply bad.
Can't you run more than one runner per server? My understanding was you start one up, give it a directory to work in, and it'll register itself with the central server and start processing jobs. I thought you could just run more instances if you wanted parallelism.
> The jobs don't clean up -- probably to allow for caching.
Ya, this helps with a specific build cache scenario I use. A workaround if you want it to clean up is to put `rm -rf "${{ github.workspace }}"` at the end of your workflow.
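Roughly what that looks like as the last step of a job. This is only a sketch: the checkout version and the `make build` command are placeholders, and the `if: always()` condition is my own addition so the cleanup also runs when an earlier step fails:

```yaml
jobs:
  build:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v2   # version tag is an assumption
      - name: Build
        run: make build             # hypothetical build command
      - name: Clean up workspace
        # always() runs this step even if a previous step failed;
        # drop it if you want to keep the files around for debugging.
        if: always()
        run: rm -rf "${{ github.workspace }}"
```

Note that this wipes whatever on-disk build cache lives in the workspace, so it trades the caching benefit for predictable disk usage.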
If anyone has experience using self hosted GH Actions at scale, I’d love to buy you a virtual coffee and hear about pros/cons for a parallelized CI flow currently running in Circle. Main motivation for switching would be simplification of tooling and increasing performance with better cache reuse and running within AWS for faster network access to ECR.
We moved all our builds (e.g. 100+) to GH Actions. We've been using GH Actions since it was a daily tar ball drop in a private GH Slack channel in Q3 2019.
Happy to answer any questions.
The biggest challenge has been the many GH Actions service outages/impacts. We're working on moving to self hosted runners to mitigate this.
I wish there was an official Helm Chart for k8s, like GitLab CI/CD Runner has, and not the kind that sits there and does not scale, but the kind that spins up workers on demand without taking too many resources while idle.
I wish GitHub copied that feature from GitLab too!
I've been playing about with this and it seems to work quite well. Startup latency is quite high, and it's one pod-per-job (I think), but seems pretty flexible.
I've been eyeing this for a while. My biggest hangup is that CI/CD is a major attack (e.g. supply chain) vector. If you use CI/CD for deploys, then a lot of highly privileged creds are in play.
I'd really prefer if GH made and managed the K8s operator (e.g. the most popular infra provisioning tool) themselves.
The feature pull request has been there for over a year[1], it’s nice that it’s released!
Incoming shameless plug: if you don’t want to handle hosting the runners, but still want to reap the benefits of having proper hardware (close to the metal), check out BuildJet for GitHub Actions - 2x the speed for half the price. Easy to install and easy to revert.
And a more shameless second plug, I run SurplusCI which does the same thing for GitHub and GitLab with a few other platforms on the horizon.
I can say we're less than half the price, because we focus solely on dedicated hardware and dedicated compute. We're working on pay-for-what-you-use as we speak, and this issue finally getting resolved has generated work for me this weekend.
Yes, the job runs in a KVM VM. Nested KVM is supported on the hypervisor, but KVM is not enabled by default in the guest OS, because of the guest kernel we run for faster boot times. We will offer an option to enable the kvm kernel module in the future.
Wow that looks like exactly what I need. We recently moved to GHA and while it is nice in many ways, my main complaint is that unlike our previous (AWS CodeBuild/CodePipeline) setup, we can't just pay more to get more powerful instances to run CI.
Looking into setting up self-hosted runners has been on my todo list since the first day of using GHA; will definitely check out your service soon.
Ephemeral runner support has been highly anticipated for our organization - I'm excited to see it go live!
However, GitHub Enterprise admins may want to take caution - some users have reported that the changes are not currently compatible https://github.com/actions/runner/pull/660
But, these competing offerings between Azure and GitHub have been really confusing to follow. Especially since folks are pointing out that GitHub Actions is partly Azure DevOps under the hood. It just seems like a complicated branding play because some people will refuse to use an Azure service but will gladly use a GitHub service still owned by Microsoft?
Azure DevOps Pipelines is "stable"/"mature" and not seeing anywhere near as much active investment: most of the team supposedly moved directly over to Github Actions and that seems to be where all the new investment work is going.
Azure Codespaces was rebranded at the 11th hour before launch to Github Codespaces and moved almost entirely to the Github org, and Azure DevOps was never given access, unlike the originally announced plans under the Azure brand.
Rumors have been swirling for a while now (including when bharry, the VP whose kingdom was Azure DevOps, retired three years ago) that Azure DevOps is on the slow decline to some sort of chopping block and Microsoft will replace it entirely with Github eventually. There are rumors that even "deeply private" teams you wouldn't expect to move from Azure DevOps to Github internally at Microsoft have already migrated. (Certainly a lot of well known Windows Developers have much more active "Activity Indicators" on Github these days and it isn't necessarily entirely accountable by all the known public repos like Calculator, Terminal, etc and public facing samples projects nor that all of their documentation repos have obviously moved to Github.)
It would be wonderful to get an actual definitive and official statement from Microsoft, even if "eventually" when a migration will happen is still "years away" (which is presumably why they are afraid to give a statement yet, if it's still too far down the roadmap). That would make it easier today for some of us to start making cases to our teams that migrating voluntarily today to Github would be good for us. (Make the debate more than just "I want Codespaces" or "I want Github's dependency scanners" but also "Microsoft suggests it".)
You don't need to create an Azure account to use Github Actions. It's not really refusing to use the service as much as using the streamlined one right in front of you.
The biggest problem with GitHub Actions is that you can't restart just one job[1]; it always restarts all jobs in the workflow. And this bug has not been fixed for quite a while. Travis CI and Appveyor both allow that, of course.
It's a bit of an old drum to beat on but just want to note that GitLab has supported this (and provides docs for running on EC2, Fargate, k8s and other platforms like LXD[0][1][2][3]) for a very long time, and the CI system there is quite robust.
I've seen my fair share of CI systems (AppVeyor, CircleCI, GitLab, GitHub, TC, Jenkins, etc) and I'd argue that the GitLab CI is the best of all the ones I've seen:
- great syntax (it's YAML like most others but somewhat easy to organize well with great documentation)
- Fantastic documentation
- Unparalleled flexibility
- Unsurprising operation (things generally work as you'd expect)
- The ability to clear your build runner cache (Just ran into the inability to do this with CircleCI again today)
That said, competition is a good thing, so in general I'm glad to see this finally supported by GHA and will dig into it over the weekend. GHA is making a lot of really good sustainable moves in the space and keeping the field open (their marketplace is the best) so I'm all for it.
I run SurplusCI[4] which does what you'd think (runs these runners in VMs), so getting these on-demand runners working happens to be a bit top-of-mind. Right now I only offer dedicated runners, which are cheaper but of course aren't as cheap as on-demand (depending on usage).
Speaking of competition, just learned of a competitor here on HN in BuildJet[5], so if you don't want to manage your own runners check them out as well, unlike SurplusCI they actually offer to-the-minute on-demand runners, and the onboarding process looks way easier.
[EDIT] - Just to say, the list above is absolutely NOT the full list of platforms GitLab Runner supports -- it's pretty insane how many directions the community and GL have gone in. The Docker Machine integration (they maintain a fork) actually means you could run your single-use-machines on Scaleway or Hetzner easily as well, no need to muss or fuss with ASGs or k8s.
So this is a big step forward in terms of avoiding the race condition where CI runners would accept new jobs during scale-in operations. But how do you ensure you only spawn new ephemeral runners as jobs become available? The webhook provides part of the answer, but do we need to use something like redis to ensure exactly one runner per queued job is started?
Can someone tell me if GHA also supports non-ephemeral self hosted runners, and if so whether they work reliably? Any good resources for getting up and running with it quickly?
This is great news. The only part missing is official docker support for the runner (I'm using an unofficial solution right now) and/or Alpine support.
The autoscaling piece is cool! One of the things that impressed me most about Gitlab CI was how easily we could get runners autoscaling in our own AWS environment. We'd run tiny instances as the actual runner, and they'd spin up bulky instances for different jobs with none of those running when nobody was working. It sounds like this might give a building block to build that in Github Actions.
I wonder if/when GitHub is going to start offering a Heroku-like service or full IaaS. It seems like an incredible opportunity to slap GitHub's branding on top of a subset of Azure's infrastructure and try to beat Heroku or AWS.
My main gripe with Github is that it's pushing more tie-in features to own the development experience even further. Github started as a community development hub, and is now trying to swallow us all, owning each bit of the development process to then own the market.
I don't have any affection for aws or gcp either, their attempt to dominate as de facto infrastructure and software provider is scary.
We don't need github actions. Spinning up machines that run cook books that can do anything, even at scale is ultimately more flexible and platform agnostic. If that's time consuming to make it work at scale, providers dedicated to that are out there.
My main gripe with it is that it has the same effect as MS Teams: Execs see that a new product enters the market, with a vendor they already have agreements with, and it's either bundled in for free or relatively cheap. Being the right solution for the job has already lost at that point.
[1]: https://github.com/actions/runner