The malicious commits:
Can't really picture anyone doing that as a "just trying to have fun" thing.
A chaotic evil character would be destroying out of greed or hate.
To me, that's not a "neutral" thing. It seems like it was only luck that the rm-ing didn't work, else there would be a bunch more unhappiness.
People do run Gentoo on production systems. Though they obviously shouldn't pull straight from upstream without some real testing before deployment, in the real world it does get done.
> He isn't doing it for a specific reason but because he can. That's neutral.
If the action had been something harmless (e.g. echo "Could have rm -rf'd you there!"), then sure, it could be seen as neutral. But it was clearly an attempt to destroy other people's stuff or work. That really doesn't seem very neutral to me. :)
It's just of the transient "I had fun from it" kind, instead of something with more tangible rewards (as you mention).
If it was just someone having a lark, there are plenty of ways to do that other than literally attempting to destroy everything they can access. ;)
Malicious is malicious.
It's the shell that expands /*, not rm.
It expands to /bin /etc /lib and so on, which is not covered by rm's --preserve-root sanity check.
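You can see what rm actually receives with a quick check (Python's glob mirrors the shell's expansion here; output assumes a typical Linux root):

    import glob

    # The shell expands /* before rm ever runs. rm is handed /bin /boot /etc ...
    # as separate arguments, none of which is the literal "/", so the
    # --preserve-root sanity check never triggers.
    print(sorted(glob.glob('/*')))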
Tbh it's cool, coming from a previous job at a startup where version control meant just zipping the project from time to time.
Since pull request review is a high-priority activity, there's no "bottleneck", and you double the bus factor for free while preventing bad things from happening.
We should all be talking about the worst things that can be done, so we can make sure we are protected from them.
I consider it worse, since it's too easy for people to become content with it.
It's not just that.
For a given vulnerability, there is an amount of time before the good guys discover it and fix it, and an amount of time before the bad guys discover it and exploit it. Obscurity makes both times longer.
In the case where the good guys discover the vulnerability first, there is no real difference. In theory it gives the good guys a little longer to devise a fix, but the time required to develop a patch is typically much shorter than the time required for someone else to discover the vulnerability, so this isn't buying you much of anything.
In the case where the bad guys discover the vulnerability first, it lengthens the time before the good guys discover it and gives the bad guys more time to exploit it. That is a serious drawback.
Where obscurity has the potential to redeem itself is where it makes the vulnerability sufficiently hard to discover that no one ever discovers it, which eliminates the window in which the bad guys have it and the good guys don't.
What this means is that obscurity is net-negative for systems that need to defend against strong attackers, i.e. anything in widespread use or protecting a valuable target, because attackers will find the vulnerability regardless and then have more time to exploit it.
In theory there is a point at which it may help to defend something that hardly anybody wants to attack, but then you quickly run into the other end of that range where you're so uninteresting that nobody bothers to attack you even if finding your vulnerabilities is relatively easy.
The range where obscurity isn't net-negative is sufficiently narrow that the general advice should be don't bother.
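As a back-of-the-envelope illustration of that trade-off (a toy model with made-up exponential discovery times and a fixed patch lag, not real-world rates):

    import random

    def avg_exploit_window(obscurity_factor, trials=100_000, patch_lag=14):
        """Toy model: obscurity slows discovery for attackers and defenders alike."""
        total = 0.0
        for _ in range(trials):
            good = random.expovariate(1 / 90) * obscurity_factor  # defender discovery (days)
            bad = random.expovariate(1 / 90) * obscurity_factor   # attacker discovery (days)
            patched = good + patch_lag
            if bad < patched:
                total += patched - bad  # window where only the bad guys have it
        return total / trials

    print("open design:", avg_exploit_window(1.0))
    print("obscured:   ", avg_exploit_window(3.0))  # longer average exploitation window

Scaling both discovery times up scales the exploitation window up with them, which is the net-negative case described above.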
If that's the case, why doesn't the NSA publish Suite A algorithms?
The math on whether you find the vulnerability before somebody else does is very different when you employ as many cryptographers as the NSA.
They also have concerns other than security vulnerabilities. It's not just that they don't want someone to break their ciphers, they also don't want others to use them. For example, some of their secret algorithms are probably very high performance, which encourages widespread use, which goes against their role in signals intelligence. Which was more of a concern when the Suite A / Suite B distinction was originally created, back when people were bucking use of SSL/TLS because it used too many cycles on their contemporary servers. That's basically dead now that modern servers have AES-NI and "encrypt all the things" is the new normal, but the decision was made before all that, and bureaucracies are slow to change.
None of which really generalizes to anyone who isn't the NSA.
A lot of the Suite A algorithms have also been used for decades to encrypt information which is still secret and for which adversaries still have copies of the ciphertext. Meanwhile AES is now approved for Top Secret information and most of everything is using that now. So publishing the old algorithms has little benefit, because increasingly less is being encrypted with them that could benefit from an improvement, but has potentially high cost because if anyone breaks it now they can decrypt decades of stored ciphertext. It's a bit of a catch 22 in that you want the algorithms you use going forward to be published so you find flaws early before you use them too much, while you would prefer what you used in the past to be secret because you can't do anything about it anymore, and the arrow of time inconveniently goes in the opposite direction. But in this case the algorithms were never published originally so the government has little incentive to publish them now. Especially because they weren't publicly vetted before being comprehensively deployed, making it more likely that there are undiscovered vulnerabilities in them.
Cryptography is the keeping of secrets. Obscurity is just another layer of a defense in depth strategy. Problems occur when security is expected to arise solely from obscurity.
But you've got to do exercises thinking through the worst cases (what an attacker could do if they made no "unforced errors") in order to think about defending against them (i.e., to think about security at all, which nearly every dev has to do).
Which is what the above was. We cannot avoid thinking through the best case for the attacker, in public, if we are to increase our security chops. It's not "advice for the attacker".
The attack was loud; removing all developers caused everyone to get emailed.
Given the credential taken, it's likely a quieter attack would have provided a longer opportunity window.
For any folks out there who've been using or promoting formula-based passwords, this is the potential impact: a leak on one site can be leveraged by an attacker towards other sites you use.
If the formula is to go “hacker$$$$news_” then you’re SOL.
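To make the risk concrete (a hypothetical formula, invented for illustration): once an attacker sees one plaintext password, re-applying the suspected formula to other sites is mechanical:

    # Hypothetical: "hacker$$$$news_" leaked, and the attacker guesses the formula
    # is "<first word of site name>$$$$<second word>_", then tries it elsewhere.
    def apply_formula(first: str, second: str) -> str:
        return f"{first}$$$${second}_"

    for first, second in [("face", "book"), ("you", "tube"), ("pay", "pal")]:
        print(apply_formula(first, second))  # candidate passwords for unrelated sites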
Maybe one day the web will get its shit together and let us login everywhere with a certificate that we'll be able to securely store on a HSM and easily revoke and update if necessary but in the meantime it's the best we can do.
* How does a user log in from a new device?
* Especially if they've lost / broken their original device?
* What happens if users want to log in from a shared computer (think public library)?
* Do all the OSes/browsers that users are using actually support certificate management and auth?
Users would find the process of moving their identity between devices tedious, but if that allows them to synchronize their whole digital lives, it would seem worth the effort.
Of course, they can be lost or stolen, which is why certificate revocation is necessary, and they would need broad adoption among OS and browser vendors.
Plus you still have the issue of proving your identity if it's lost/stolen.
Hardware tokens can allow individual power users to solve issues around multiple devices, lost devices, etc themselves, but unless you're suggesting porting 100% of users to hardware tokens, it doesn't change the workflows a site must support.
> use unique strong passwords per-website.
... use unique, truly random passwords per-website.
"Strong" is ambigous and hard to explain how to do right. Random means, in practice: "don't come up with one yourself, but let a tool generate one for you".
Of course computers are far better at coming up with unpredictable passwords than humans are.
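For example (a minimal sketch using Python's standard secrets module; the 20-character length and the alphabet are arbitrary choices):

    import secrets
    import string

    # Characters drawn from a CSPRNG: no human-guessable structure, no formula.
    alphabet = string.ascii_letters + string.digits + string.punctuation
    password = ''.join(secrets.choice(alphabet) for _ in range(20))
    print(password)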
> With respect to the RNG taxonomy discussed above, the DRNG follows the cascade construction RNG model, using a processor resident entropy source to repeatedly seed a hardware-implemented CSPRNG. Unlike software approaches, it includes a high-quality entropy source implementation that can be sampled quickly to repeatedly seed the CSPRNG with high-quality entropy.
So it's a cryptographically secure pseudo random number generator that takes entropy from the processor. It's not a True Random Number Generator. And again, if it does work well for cryptography, it's the unpredictability that matters, not the randomness itself.
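A rough sketch of that cascade construction (heavily simplified and hypothetical: the real DRNG uses thermal-noise hardware as the entropy source and an AES-based DRBG, for which os.urandom and SHA-256 stand in here):

    import hashlib
    import os

    class CascadeRNG:
        """Toy cascade model: an entropy source repeatedly reseeds a deterministic generator."""
        RESEED_INTERVAL = 512  # outputs between reseeds (arbitrary)

        def __init__(self):
            self._reseed()

        def _reseed(self):
            self._state = os.urandom(32)  # stand-in for the hardware entropy source
            self._count = 0

        def random_bytes(self) -> bytes:
            if self._count >= self.RESEED_INTERVAL:
                self._reseed()
            # Stand-in for the hardware CSPRNG: deterministic between reseeds.
            self._state = hashlib.sha256(self._state + b"next").digest()
            self._count += 1
            return hashlib.sha256(self._state + b"out").digest()

    rng = CascadeRNG()
    print(rng.random_bytes().hex())

The output between reseeds is deterministic given the state, which is why it's a CSPRNG cascade rather than a raw TRNG.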
> This method of digital random number generation is unique in its approach to true random number generation in that it is implemented in the processor’s hardware
> The all-digital Entropy Source (ES), also known as a non-deterministic random bit generator (NRBG), provides a serial stream of entropic data in the form of zeroes and ones.
> The ES runs asynchronously on a self-timed circuit and uses thermal noise within the silicon to output a random stream of bits at the rate of 3 GHz
What on earth would classify as a TRNG, if not the above?
Just FYI, I checked, and NIST SP 800-90B defines an NRBG as follows:
> Non-deterministic Random Bit Generator (NRBG):
> An RBG that always has access to an entropy source and (when working properly) produces outputs that have full entropy (see SP 800-90C). Also called a true random bit (or number) generator (Contrast with a DRBG).
A certificate means now Youporn, your employer, and your bank all share the same way to identify you. As does Facebook, and five thousand shady advertising companies.
Something like WebAuthn / U2F is better here. With this technology, sites don't get any meaningful identity, just confirmation that you still have the same token as before. This also means that if you find somebody's token you learn nothing from it; you'll have no idea where to return it and may as well just start using it yourself.
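A simplified model of why that works (hypothetical sketch; real U2F tokens wrap per-origin asymmetric key pairs with a device secret, so the details differ, but the correlation-resistance property is the same):

    import hashlib
    import hmac

    DEVICE_SECRET = b"burned into the token at manufacture"  # never leaves the device

    def per_origin_key(origin: str) -> bytes:
        # Each origin gets an unrelated-looking key, so sites cannot correlate a
        # user across services, and a found token reveals nothing about where it
        # has been registered.
        return hmac.new(DEVICE_SECRET, origin.encode(), hashlib.sha256).digest()

    print(per_origin_key("https://example.com").hex())
    print(per_origin_key("https://example.org").hex())  # completely unrelated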
Would U2F be a part of the answer here? It is by far the most user-friendly implementation I've seen, and it's already supported as a 2FA token on many sites.
In this case "evidence collected suggests a password scheme where disclosure on one site made it easy to guess passwords for unrelated webpages" (emphasis mine), however it is trivially easy to create formula-based passwords that are not immediately obviously related to the site for which they were created.
Take some combination of:
Number of characters in the URL
Number of syllables in the URL
Number of characters in the first / last syllable
Index in the alphabet of the Nth character of the first / last syllable
Nth character in the URL
Character offset in the URL according to the number of syllables
Nth symbol on the number-key row according to the number of characters in the URL
Alt-key symbol for the URL character in the URL
Etc, etc, etc.
It would take serious effort to even detect the presence of a formulaic password from a single leaked unhashed / unsalted password, never mind to determine what the formula might be.
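To make that concrete (a hypothetical formula of my own invention, combining a few of the ingredients above):

    # Hypothetical formula: domain length, alphabet indices of the site name's
    # first and last letters, and every other letter of the name. A single
    # leaked "10#7x15!gno" gives no visible hint that it came from gentoo.org.
    def formula_password(domain: str) -> str:
        name = domain.split('.')[0]
        first_idx = ord(name[0].lower()) - ord('a') + 1
        last_idx = ord(name[-1].lower()) - ord('a') + 1
        return f"{len(domain)}#{first_idx}x{last_idx}!{name[::2]}"

    print(formula_password("gentoo.org"))   # 10#7x15!gno
    print(formula_password("github.com"))   # 10#7x2!gtu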
These days it's becoming more common to have been compromised in multiple breaches. And unfortunately, storing passwords securely still seems to be too hard for a lot of companies. The more examples of your pattern an attacker has, the easier it is to work out your pattern.
With the frequency of new breach announcements I feel like you’re going to be fighting a losing battle. For now the best and safest solution is unique truly random passwords per site.
The title should probably clarify it's a "Gentoo mirror on GitHub compromise incident report"
Incident Report for a Gentoo GitHub contributor account being compromised
But GitHub could probably have helped by detecting logins from unusual IPs for that user - i.e. a login attempt from an IP they haven't logged in from before - and requiring something like email verification too. Although if they were using an easy-to-guess pattern, then the admin's email could likely have been compromised too.
Edit: GitHub could also warn people whose accounts have admin access to organisations if they don't have 2FA enabled.
For the most part, this appeared to be the work of a teenage skiddy (given the addition of a readme with a racial slur as the text), and not any actual sophisticated attack.
Suggesting that this is "merely" a mirror downplays how seriously a more sneaky attacker could have harmed users of the project. Would similar statements downplaying the attack be made if this had been https://mirrors.kernel.org/ ?
2FA, by its nature, is bound to one single app, tool, or piece of hardware.
This limits access to e.g. an administrator login to one person and her personal phone only. A bus factor that is unacceptable to many.
Services like AWS allow a more granular setup, but it's still complex. Linode, DigitalOcean and even docker.io, last time I looked, make it impossible to share the admin account by having multiple 2FA devices active on one account simultaneously. And if they allowed it, that would greatly lower the security of that account (still better than no 2FA, though).
TOTP-based 2FA, like Google Authenticator (or one of the much better open-source alternatives), makes it possible to share a 2FA secret across devices, but that is both insecure and hard.
I guess you could still use a software TOTP implementation on your workstation to fool GitHub, but then you are not getting the additional security from U2F, because the TOTP codes are a substitute for the U2F token.
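For reference, a minimal RFC 6238 TOTP sketch using only Python's standard library (the secret below is a placeholder; any device holding the same base32 secret produces the same codes, which is exactly why sharing it widens exposure):

    import base64
    import hashlib
    import hmac
    import struct
    import time

    def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
        # RFC 6238: HMAC-SHA1 over the current time-step counter, dynamically truncated.
        key = base64.b32decode(secret_b32, casefold=True)
        counter = struct.pack(">Q", int(time.time()) // period)
        mac = hmac.new(key, counter, hashlib.sha1).digest()
        offset = mac[-1] & 0x0F
        code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
        return str(code % 10 ** digits).zfill(digits)

    print(totp("JBSWY3DPEHPK3PXP"))  # placeholder secret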
> - done: Gentoo GitHub Organization currently requires 2FA to join.
Gotta say, a bit impressed by the response.
We don't have a plan for Gentoo. I work for Google, and I mostly used a plan vaguely similar to the Google incident plan.
1) Communicate early. For publicly visible stuff (the defacement was very obvious) you want to get a message out quickly, before a natural narrative forms.
2) Communicate often.
3) Mitigate the problem first (e.g. prevent the malicious stuff from being downloaded) then investigate second.
4) Assign roles to people and be clear who is responsible for what.
5) Collect lots of data.
1. Identify a security incident. This is the hardest one usually, unless they are blunt and noisy like in this case.
2. Shut down and firewall everything involved in the security incident, and shut down and firewall everything that has a high probability of being immediately involved or targeted.
- For example, if you have a cluster of several identical application servers and one of these application servers is compromised, all application servers of the same kind in the same cluster must be shut down and firewalled to the VPN.
- identical application servers in other clusters are handled as per 2.1, but with less hesitation to react strongly.
- If you have a database only available inside a private network and you have identified suspicious activity, it is reasonable to shut down and firewall everything with access to this database.
2.1 Increase attention on similar or connected systems in case of lateral movement.
- For example, our productive clusters have access to our monitoring setup. If a cluster is accessed in a creative way by a user, we'll revoke the credentials and certificates of this cluster. However, an attacker might have obtained information and utilize that to pivot his attack onto the monitoring cluster or the management infrastructure, so we need to pay attention to that.
3. Communicate to customer support and management about the incident.
- Yup. This is intentionally after axing productive systems. If senior persons are sure about a security incident, we don't want to wait for management to kill the malicious access. We want the senior guys to shut down the security incident and ask for forgiveness later. Fast isolation beats bureaucratic process.
4. Identify vector of entry and eliminate it.
- This is normal post-work. What did they do, how did they get in, patch or report. This tends to be split into immediate mitigation (in Gentoo's case, the change of passwords and implementation of 2FA, or disabling software features in an incident we had some time ago) and further mitigation down the road, such as the audit logs they are planning.
5. Resume operation of service coordinated with customers.
- This is something we learned some time ago. Some of our customers have their own dedicated systems, and for those they have a say in how to properly resume operation after an incident. In one case, one department of a customer ran an unannounced penetration test against a system without coordinating with the security department of the customer and without coordinating with us. All of that went to hell and back. We were forced to leave the system down for almost a month due to contractual bindings and catfights between team leads of that customer on our bug tracker. That was fun.
I'm pretty sure it didn't impact anybody.
Not very nice to be a victim of that but at least there is no doubt whether you're vulnerable or not.
The timeline also suggests that the malicious content was made after the break-in and not planned beforehand?
Does anybody know why this is the case?
Doesn't sound to me as if it's not a mirror.
Though I guess you could say that, as their CI depended on this mirror, it had a higher status than normal mirrors.
From their incident report (under https://wiki.gentoo.org/wiki/Github/2018-06-28#What_went_bad... ): "The systemd repo is not mirrored from Gentoo, but is stored directly on GitHub."
And from their Action Items: "mirror systemd repo on git.gentoo.org"
I'm curious why they seem to treat this repo differently from the others, using GitHub as authoritative and adding a mirror to git.gentoo.org rather than making git.gentoo.org authoritative and mirroring to GitHub.
Eh, that seems unwise: suggesting, based on what you think you're "sure" about regarding security protections, that users should not worry about having a copy with malicious code.
Also, the Microsoft acquisition of GitHub hasn't happened - they have only agreed to acquire GitHub.