They have acknowledged this in their section "Keeping patched".
However, there is one thing I think they have omitted to consider: the more they rely on third-party software that doesn't come from the server distribution they are using, the more disparate and unreliable their sources for security fixes become.
Careful choice of production software dependencies is therefore a factor. Going outside the distribution is usually unavoidable for a small number of dependencies that are central to the mission, but in general I wonder if they have any kind of policy to favour distribution-supplied dependencies over any other type.
Another way of looking at this: we already have a community that comes together to provide integrated security updates that can be automatically installed, and you already have access to it. Not using this source compromises that ability. If some software isn't available through Debian, it is usually because there is some practical difficulty in packaging it, and I argue that security maintenance difficulty arises from the same root cause.
On a similar note, I'm curious about their choice to switch from cgit to GitLab. Both are packaged in Debian, but I believe even Debian doesn't use the packaged GitLab for its own GitLab instance. Assuming the packaged GitLab is therefore not practical to use, wouldn't cgit be better from a "receives timely security updates through the distribution" perspective?
In the (distant) past, we tended to prefer to roll our own builds for critical services (e.g. apache, linux kernel) rather than use distribution-maintained packages. The reason was pretty much one of being control freaks: wanting to be able to patch and tweak the config precisely as it came from the developers rather than having to work out how to coerce Debian's apache package to increase the hardcoded accept backlog limit or whatever today's drama might happen to be.
However, this clearly comes at the expense of ease of keeping things patched and up-to-date, and one of the things we got right (albeit probably not for the right reasons at the time) when we did the initial rushed build-out of the legacy infrastructure in 2017 was to switch to using Debian packages for the majority of things.
Interestingly, cgit was not handled by Debian (because we customised it a bunch), and so definitely was a security liability.
Gitlab is a different beast altogether, given it's effectively a distro in its own right, so we treat it like an OS which needs to be kept patched just like we do Debian.
For what it's worth, I think by far the hardest thing to do here is to maintain the discipline to go around keeping everything patched on a regular basis - especially for small teams who lack dedicated ops people. I don't know of a good solution here other than trying to instil the fear of God into everyone when it comes to keeping patched, and throwing more $ and people at it.
Or I guess you can do https://wiki.debian.org/UnattendedUpgrades and pray nothing breaks.
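For reference, enabling that on Debian/Ubuntu is roughly this (a sketch; the wiki page above has the details):

```
# install and switch on unattended security upgrades
sudo apt-get install unattended-upgrades
# writes /etc/apt/apt.conf.d/20auto-upgrades to enable the periodic run
sudo dpkg-reconfigure -plow unattended-upgrades
```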
Better than getting compromised! I used to have a very conservative approach to changing anything, including great caution with security updates and the desire to avoid automatic security updates with a plan to carefully gate everything.
In practice though, security fixes are cherry-picked and therefore limited in scope, and outages caused by other factors are orders of magnitude more common than outages caused by security updates. Better to remain patched, in my opinion, and risk a non-security outage, than to get compromised by not applying them immediately.
A better way to mitigate the risk is to apply the CI philosophy to deployments. Every deployment component should come with a test to make sure it works in production. Add CI for that. Then automate security updates in production gated on CI having passed. If your security update fails, then it's your test that needs fixing.
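A minimal sketch of what that gating could look like, assuming Debian's unattended-upgrade tool and a hypothetical run_smoke_tests.sh standing in for "the test that proves the component works in production":

```
#!/bin/sh
# Sketch only: apply security updates only if the deployment's own tests pass,
# and re-check afterwards.
set -e
./run_smoke_tests.sh                 # baseline: the service is healthy now
sudo unattended-upgrade --dry-run    # preview the pending security updates
sudo unattended-upgrade              # apply them
./run_smoke_tests.sh                 # if this fails, it's the test (or a rollback) that needs attention
```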
But there are still a few custom things running around which aren't covered by that (e.g. custom python builds with go-faster-stripe decals; security upgrades which require restarts, etc.), hence needing the manual discipline for checking too. But given we need manual discipline for running & checking vuln scans anyway, not to mention hunting security advisories for deps in synapse, riot, etc., I maintain one of the hardest things here is having the discipline to keep doing that, especially if you're in a small team and you're stressing about writing software rather than doing sysadmin.
Shouldn't ansible do all this for you? I heard it's the recommended way for automatic updates and service restarts.
Please let me know about this as I'm interested myself.
Debian can restart processes that depend on updated packages and issue alerts about the need to, and you can automate checking for new releases of things for which you've done package backports. That doesn't finesse reboots for kernel updates and whatever systemd forces on you now, but I assume you can at least have live kernel patching, as on the RHEL systems for which I never used to get scheduled maintenance time.
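Concretely, the restart-detection side is the needrestart/checkrestart tooling, e.g.:

```
# flag services still running code from packages that have since been upgraded
sudo apt-get install needrestart debian-goodies
sudo needrestart -b     # batch/report mode
sudo checkrestart       # older equivalent from debian-goodies
```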
In the past I used to install newer, or customised, versions of e.g. `git` than were available on my Ubuntu system into my home directory using e.g. `./configure --prefix=$HOME/opt`. That got me the features I wanted, but of course made me miss out on security updates, and I would have to remember each piece of software I installed this way.
With nix, I can update them all in one go with `nix-env --upgrade`.
Nix also allows you to declaratively apply custom patches to "whatever the latest version is".
That way I can have things like you mentioned (e.g. the hardcoded accept backlog for Apache, hardening compile flags) without the mentioned "expense of ease of keeping things patched and up-to-date". I found that very hard to do with .deb packages.
It's not as good as just using unattended-upgrades from your main distro, because you still have to run the one `nix-env --upgrade` command every now and then, but that can be easily automated.
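For example (a hypothetical crontab entry, not something from the thread):

```
# run the upgrade weekly and garbage-collect old generations
0 4 * * 1  nix-env --upgrade && nix-collect-garbage --delete-older-than 30d
```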
Then again, if you're bothered about security, it's not clear that having to keep track of two different packaging systems, and the possible interactions between them, is a win.
I'm particularly disappointed to hear that Google doesn't provide any way to rotate the signing key for an app.
Is there an issue for that filed with them anywhere, or more discussion?
Some day, I hope, reputable services will have migrated to The Update Framework, which has been pointing out and solving these and other software-update problems for several years now.
Actually, a quick search leads to this - is it indeed possible to rotate your key, at least as of Android Pie?
Edit: https://developer.android.com/studio/publish/app-signing#app... is the type of key rotation I was talking about here.
The way Jenkins works, with each plugin being able to implement arbitrary endpoints, it is almost inevitable that it would have many security vulnerabilities.
No Jenkins master should ever be exposed to the internet -- and if there is really no other way, then set up a proxy in front of it with a strict whitelist of allowed URLs.
TL;DR: keep your services patched; lock down SSH; partition your network; and there's almost never a good reason to use SSH agent forwarding.
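For the "lock down SSH" part, the usual starting point is something like this (a sketch, not the post's actual config; user names are illustrative):

```
# in /etc/ssh/sshd_config:
#   PasswordAuthentication no
#   PermitRootLogin no
#   AllowUsers deploy admin
# then validate and reload:
sudo sshd -t && sudo systemctl reload ssh   # the unit may be 'sshd' on other distros
```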
You could also fund/donate to/advocate for a better SSH agent.
I use both Pageant and ssh-agent in my home network for ease of ssh'ing into boxes, especially Unifi gear and some dev VMs. I don't think I will stop using agents, but I probably wouldn't use them at work.
Why couldn't there be an agent that required you to touch a Yubikey before it'd allow keys to be forwarded? Why couldn't you add prompting and timeouts to an agent?
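For what it's worth, stock ssh-add already gets part of the way there:

```
# -c: ask for confirmation (via an askpass helper) every time the key is used
# -t: drop the key from the agent after the given lifetime
ssh-add -c -t 1h ~/.ssh/id_ed25519
```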
The problem here was agent forwarding, which you should almost always replace with opening a new connection via ssh -J (or equivalent).
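i.e. something along these lines (host names are illustrative):

```
# jump through the bastion without exposing your agent to it
ssh -J bastion.example.com internal-host.example.com

# or persistently in ~/.ssh/config:
#   Host internal-host.example.com
#       ProxyJump bastion.example.com
```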
At least you only would leak a single access, and you would have a higher chance of noticing, but I can also see that if the hijack was done intermittently you might write it off as a glitch...
I've learned a lot from it and will be adding some of these practices to the infrastructure that I manage.
Also, https://news.ycombinator.com/item?id=19643227 is an excellent tl;dr.
>We then perform all releases from a dedicated isolated release terminal.
>We physically store the device securely.
Why didn't they go with a HSM?
> The signing keys (hardware or software) are kept exclusively on this device.
We still want to make very sure that the build environment itself hasn't been tampered with, hence keeping the build machine itself isolated too.
A much better approach would be to use reproducible builds and sign the hash of a build with a hardware key, but we didn't want to block an improved build setup on reproducibilizing everything.
Edit: we may be missing an HSM trick, though, in which case please elaborate :)
Since Android signing keys are just PKCS #8, and GPG keys are supported by most HSMs, a HSM would definitely be usable (even if you just used an addon HSM card that you added to your "release terminal"). Unfortunately in order to safely use the HSM you'd need to re-generate your keys again from within the HSM -- which obviously is a problem on Android. In addition, HSMs are quite expensive and might be prohibitively so in your case. But I would definitely recommend looking into it if you're really stuck on doing distribution yourselves.
Reproducible builds are a useful thing separately, but using a HSM doesn't require reproducible builds -- after all signing a hash of a binary is the same as just signing the binary. The main benefit of reproducible builds is that people can independently verify that the published source code is actually what was used to build the binary (which means it's an additional layer of verification over signatures).
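To illustrate the "signing a hash is the same as signing the binary" point, a generic sketch with GPG (not the Android-specific signing flow; file and key names are illustrative):

```
# publish the artifact, its hash, and a detached signature over the hash
sha256sum my-app-release.apk > my-app-release.apk.sha256
gpg --detach-sign --armor --local-user release@example.org my-app-release.apk.sha256
```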
One question I have is: how are you going to handle the case where the release terminal fails? Will you have to (painfully) rotate the keys again?
I.e. we are already using HSMs on the build server.
Can the system you log into via ssh just dump your forwarded PRIVATE key? That easily?
Or was this about the ssh client being patched on the jenkins box to add malicious keys wherever the devops ssh'd from the jenkins box?
When you log into a host with SSH agent forwarding turned on, the private key data itself isn't available to the host you're logging into. However, when you try to SSH onwards from that host, agent forwarding means that the authentication handshake is forwarded through to the agent running on your client, which of course has access to your private keys.
So, even though the private key data itself isn't directly available to the host, any code running which can inspect the SSH_AUTH_SOCK environment variable of the session that just logged in can use that var to silently authenticate with other remote systems on your behalf.
If you've already found a list of candidate hosts (e.g. by inspecting ~/.ssh/known_hosts) then your malware can simply loop over the list, trying to log in as root@ (or user@) and compromising them however you like. Which is what happened here: an authorized_keys2 file containing a malicious key was copied onto the target hosts. You don't need to patch the ssh client; it's just ssh agent forwarding doing its thing. :|
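In other words, roughly this from the attacker's point of view (paths and hosts illustrative):

```
# any root (or same-user) process on the intermediate host can do:
export SSH_AUTH_SOCK=/tmp/ssh-XXXXXXXX/agent.1234   # the victim's forwarded agent socket
ssh-add -l                                          # lists the victim's key fingerprints
ssh root@target-from-known-hosts 'cat >> ~/.ssh/authorized_keys' < attacker_key.pub
```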
Manually typing passwords in on an attacker-controlled machine doesn't sound very safe either.
What sort of thing are you using ssh -A for which couldn’t be replaced by ssh -J?
git checkouts from private repositories, for example. HTTPS requires username/password which may or may not be checked/monitored.
> If you need to regularly copy stuff from server to another (or use SSH to GitHub to check out something from a private repo), it might be better to have a specific SSH ‘deploy key’ created for this, stored server-side and only able to perform limited actions.
And this is the approach we're taking going forwards.
If the problem is that you only ever want to read from git when an admin is logged into the machine, i guess the safest bet would be to use a temporary deploy key (or temporarily copy the deploy key onto the machine until you've finished admining).
Forwarding all the keys from your agent is a recipe to end up pwned like we did, however.
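A sketch of the deploy-key approach, assuming GitHub-style read-only deploy keys (names and paths are illustrative):

```
# generate a key used only for this repo, kept on the server
ssh-keygen -t ed25519 -f ~/.ssh/deploy_myrepo -N '' -C 'deploy key for myrepo'
# register deploy_myrepo.pub as a read-only deploy key in the repo settings, then:
GIT_SSH_COMMAND='ssh -i ~/.ssh/deploy_myrepo -o IdentitiesOnly=yes' \
    git clone git@github.com:example/myrepo.git
```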
> SSH should not be exposed to the general internet.
> If you need to copy files between machines, use rsync rather than scp.
Great. Just great. I still remember when SSH was described as the solution to fix telnet and rcp. And now we can't use it any more. Fan-freaking-tastic.
But using SSH as a shell is fine. And rewiring your fingers to type rsync rather than scp isn't too bad either - plus you get resumption etc for free :) (And yes, I appreciate the parent is being slightly tongue in cheek).
Edit: of course, if we'd been using xrsh and xrcp from XNS rather than this newfangled TCP/IP stuff none of this would probably ever have happened...
SCP is a protocol layered on SSH, and has had a spate of security flaws recently:
* Incorrect directory name validation in the scp client (CVE-2018-20685)
* Missing validation of received object names in the scp client, allowing a malicious server to overwrite arbitrary files in the target directory (CVE-2019-6111)
* Spoofing of scp client output via the object name (CVE-2019-6109)
* Spoofing of scp client output via stderr (CVE-2019-6110)
And as of 8.0, OpenSSH recommends you no longer use SCP in favour of sftp or rsync, as per the security paragraph of https://www.openssh.com/txt/release-8.0:
> The scp protocol is outdated, inflexible and not readily fixed. We recommend the use of more modern protocols like sftp and rsync for file transfer instead.
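For reference, the like-for-like replacement is straightforward:

```
# instead of: scp -r ./local-dir user@host:/remote/dir
rsync -avz --partial --progress ./local-dir/ user@host:/remote/dir/
```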
The mosh devs and users think no.
>We think that Mosh's conservative design means that its attack surface compares favorably with more-complicated systems like OpenSSL and OpenSSH. Mosh's track record has so far borne this out. Ultimately, however, only time will tell when the first serious security vulnerability is discovered in Mosh—either because it was there all along or because it was added inadvertently in development. OpenSSH and OpenSSL have had more vulnerabilities, but they have also been released longer and are more prevalent.
> In one concrete respect, the Mosh protocol is more secure than SSH's: SSH relies on unauthenticated TCP to carry the contents of the secure stream. That means that an attacker can end an SSH connection with a single phony "RST" segment. By contrast, Mosh applies its security at a different layer (authenticating every datagram), so an attacker cannot end a Mosh session unless the attacker can continuously prevent packets from reaching the other side. A transient attacker can cause only a transient user-visible outage; once the attacker goes away, Mosh will resume the session.
> However, in typical usage, Mosh relies on SSH to exchange keys at the beginning of a session, so Mosh will inherit the weaknesses of SSH—at least insofar as they affect the brief SSH session that is used to set up a long-running Mosh session.
In particular, the rsync command they are talking about still uses SSH as the underlying transport.
> Mosh doesn't listen on network ports or authenticate users. The mosh client logs in to the server via SSH, and users present the same credentials (e.g., password, public key) as before.
Reading the blog post, I wonder how many security specialists this organisation really has, as they would never have allowed these fundamental errors to be made, even with the explanation that they set up their infra in a rush. A dedicated security team would surely have fixed these basic errors.
I would advise anybody looking for ‘secure’ applications to stay away from these organisations. Who knows how many possible flaws are deeply embedded in their systems: zero days, memory leaks and more. They did not even have a basic security policy in place... please don’t use the word ‘secure’.