- Iterating over all internal IPv4 addresses to try attacking them isn't that expensive: there are about 16 million IP addresses, and there are likely patterns in their allocation if your goal is an internal network.
- `HashKnownHosts yes` is not the default setting.
- Shell history likely leaks the hosts anyway if you enable this SSH setting.
- You could substitute `ssh` with a malicious version of it.
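As a rough check on the "about 16 million" figure in the first bullet, here is a minimal sketch that counts the addresses in the three RFC 1918 private blocks. It assumes "internal IPv4 addresses" means exactly those blocks; the names `PRIVATE_BLOCKS` and `private_address_counts` are made up for illustration.

```python
import ipaddress

# Assumption: "internal" means the three RFC 1918 private ranges.
PRIVATE_BLOCKS = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]


def private_address_counts():
    """Return a mapping of each private block to the number of addresses in it."""
    return {
        block: ipaddress.ip_network(block).num_addresses
        for block in PRIVATE_BLOCKS
    }


counts = private_address_counts()
total = sum(counts.values())

for block, count in counts.items():
    print(f"{block:>16}  {count:>12,}")
print(f"{'total':>16}  {total:>12,}")
```

The 10.0.0.0/8 block alone accounts for roughly 16.8 million addresses, which is presumably where the figure comes from; the other two blocks add about 1.1 million more.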
1) Iterating over the network:
The point of this type of attack is to stay under the radar of NIDS which should (if configured correctly) detect someone trying to knock on port 22 of every server in your private address space.
2) `HashKnownHosts yes` is not the default setting:
True. But it is an available setting, and since some people would enable it on the assumption that it gives them extra security, the strength of that extra security still needs to be proven. That is why this research was done.
3) Shell history likely leaks the hosts anyway if you enable this SSH setting:
Indeed. However, it's also not that uncommon for people to disable shell history on bastion servers. Plus, if that particular user hasn't SSH'ed in a little while, the history file may have rolled over and reveal fewer servers.
4) You could substitute `ssh` with a malicious version of it:
You could. That's probably the most likely attack to try first, but it's not without its problems either:
* You'd either need root access to replace the ssh client, or you'd have to be damn sure you updated the right shell profile so that $PATH puts your preferred ssh client first (i.e. adding `export PATH=~/boobytrapped:$PATH`), and hope the user notices neither the change to their profile nor the new folder in a user-writable directory. (It's worth recalling the earlier point that a network scan was dismissed because it is a detectable attack.)
* It's a longer term attack since you wouldn't get a list of servers until after a user has connected to them (on the plus side, you could glean more detail about the target).
export PATH="~/. ":$PATH
I'd say that it's HashKnownHosts that is impractical, not the attack. One of the reasons someone would publish a tool like this is to raise awareness of the brittle security HashKnownHosts offers vs. modern GPUs.
I don't think this tool was made for the use case of HashKnownHosts not being set.
Using shell history, known hosts, netstat, etc are all great ways to find hosts to pivot to.
Substituting ssh with a malicious version is extremely noisy and risky as well.
- The user might be connecting through (perhaps internal) DNS names rather than IP addresses. And probably is, because who wants to type in IP addresses all the time?
Like a rainbow table attack for passwords, but for IPs
But most problematic, I think, is that HashKnownHosts makes properly maintaining the known_hosts file tedious and error-prone. It's harder to remove hosts whose keys are known to have changed, and almost impossible to remove the unneeded, obsolete entries that have accumulated. Yet those old and obsolete keys could have been obtained by an attacker from recycled hardware or just by owning an old never-updated box. While this scenario might be unlikely, I would consider it just as unlikely that an attacker would find information only in known_hosts.
It's true though, known_hosts for pivoting is a basic network pentest trick.
So I don't think it's that unlikely that any given reader of this has it enabled, tbh.
Search for HashKnownHosts here: https://manpages.debian.org/jessie/openssh-client/ssh_config...
Teleport should hopefully make it easier to use certificates.
An alternative implementation is Netflix’s Bless.
Suppose there is a method x which creates a salt, but does not store it. Then, hash the IP a.b.c.d together with an output from method x. A user can perhaps specify an x of their choosing. Let's say the hash function is then of two variables g(x,a.b.c.d).
Would cracking g(x,a.b.c.d) necessarily expose the workings of x? (Note that one may want to think of this as two functions and write g(f(x),a.b.c.d) instead. In such a case we are cracking f as a first step.)
In the article, one relies on the fact that step 1 exposes the salt and step 2 then exposes a.b.c.d.
This would mean you can't detect whether a host changed their fingerprint, just that you've never seen this host-fingerprint combination. So if someone were to MitM your box, you would need to be sufficiently surprised by the 'This is an unknown connection' warning to investigate further.
To actually detect changed fingerprints, you need to keep a list of IPs for which you know the fingerprint. As the list of viable IPs is so small, there is no way to obfuscate it. The only possibility would be to encrypt it, but that requires keeping some secret from your attacker.
You propose a new hashing algorithm that does: IP -> h. If the salt depends on the IP, you can create a rainbow table offline.
If you can generate the salt on the machine (on the fly), you need an algorithm that must run on the machine - what do you base that on? Hostname? local IP addresses? Hardware?... And known_hosts(5) can't be moved to a new machine anymore, as it's tied to a machine.
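The offline-table point above can be sketched as follows. This is a made-up scheme, not anything OpenSSH does: the salt is derived from the IP itself (`derived_salt` is hypothetical), so the whole mapping IP -> h is deterministic and can be precomputed once without ever seeing the victim's file. Strictly speaking the sketch builds a plain lookup table rather than a rainbow table, but the consequence is the same.

```python
import hashlib
import hmac
import ipaddress


def derived_salt(ip: str) -> bytes:
    """Hypothetical salt that depends only on the IP (the weak design)."""
    return hashlib.sha1(b"salt:" + ip.encode("ascii")).digest()


def hash_ip(ip: str) -> str:
    """Deterministic IP -> h mapping: HMAC-SHA1 keyed with the derived salt."""
    return hmac.new(derived_salt(ip), ip.encode("ascii"), hashlib.sha1).hexdigest()


def build_table(network: str) -> dict:
    """Precompute h -> IP for every address in the network, fully offline."""
    return {hash_ip(str(addr)): str(addr) for addr in ipaddress.ip_network(network)}


# Built once by the attacker, reusable against every victim using this scheme.
table = build_table("10.0.0.0/24")

# Later, a stolen hash is reversed with a single dictionary lookup.
stolen = hash_ip("10.0.0.42")
print(table[stolen])
```

With a random per-entry salt stored next to the hash, the table cannot be built ahead of time, but the per-entry brute force is still cheap because the input space is so small.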
As a custom solution there are many ways to solve this if we introduce a third parameter (such as simply encrypting your files with a password). I think however the point that many people are making is that the debate is about what the default should be, and without introducing a third factor.
The two-step de-hash case (IP hash + salt) suggests interesting research topics on whether there are ways to have other combinations, e.g. (IP hash + x) or (IP hash + x + y), but with the assumption that we don't want any further a priori information. The point that you are making is that in fact we only have one variable (the IP) and the salt is simply an obfuscation step. Any other approach requires more parameters (hardware, fingerprints, etc.).
OpenSSH uses ~/.ssh/known_hosts to record the IPs, ports and public key fingerprints of, well, known SSH hosts. But it was argued many years ago that the IPs and ports in the known_hosts file of a compromised system can help attackers and viruses discover more hosts to compromise. As a defense, OpenSSH introduced HashKnownHosts. Instead of saving hostnames and addresses in plaintext, it saves HMAC-SHA1(host, salt). Some systems enable it by default, but most don't.
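To make the HMAC-SHA1(host, salt) description concrete, here is a minimal sketch of how a hashed host field can be produced and checked. It assumes the `|1|<base64 salt>|<base64 digest>` layout with a 20-byte random salt used as the HMAC key, which is how I understand OpenSSH writes these entries; the function names `hash_host` and `matches` are mine.

```python
import base64
import hashlib
import hmac
import os


def hash_host(host: str, salt: bytes = None) -> str:
    """Build a hashed host field: |1|base64(salt)|base64(HMAC-SHA1(salt, host))."""
    if salt is None:
        salt = os.urandom(20)  # assumption: 20-byte random salt per entry
    digest = hmac.new(salt, host.encode("utf-8"), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    )


def matches(hashed_field: str, candidate: str) -> bool:
    """Check whether a candidate host name or IP produces the stored digest."""
    _, magic, salt_b64, digest_b64 = hashed_field.split("|")
    if magic != "1":
        return False
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hmac.new(salt, candidate.encode("utf-8"), hashlib.sha1).digest()
    return hmac.compare_digest(actual, expected)


entry = hash_host("10.0.0.5")
print(entry)
print(matches(entry, "10.0.0.5"), matches(entry, "10.0.0.6"))
```

Note that the salt sits in the clear next to the digest, so anyone holding the file can test guesses exactly the way `matches` does.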
This research project showed that it's still vulnerable to brute-force attacks, especially from GPUs, just like every password storage scheme, and explained the issue with proof-of-concept tools. Finally, the authors themselves state how difficult and impractical a fix would be:
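The brute-force idea can be sketched on the CPU in a few lines. This is not the authors' tool, just an illustration under the assumption that entries use the `|1|<base64 salt>|<base64 digest>` layout; `make_entry` and `crack` are hypothetical helper names, and a real attack would run the inner loop on a GPU over a much larger range.

```python
import base64
import hashlib
import hmac
import ipaddress
import os


def make_entry(host: str) -> str:
    """Demo helper: produce a hashed field for a host with a random salt."""
    salt = os.urandom(20)
    digest = hmac.new(salt, host.encode("ascii"), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    )


def crack(hashed_fields, network: str) -> dict:
    """Try every address in `network` against every hashed field."""
    # Decode each entry's salt and digest once, up front.
    pending = {}
    for field in hashed_fields:
        _, _, salt_b64, digest_b64 = field.split("|")
        pending[field] = (base64.b64decode(salt_b64), base64.b64decode(digest_b64))

    found = {}
    for addr in ipaddress.ip_network(network):
        candidate = str(addr).encode("ascii")
        for field, (salt, digest) in list(pending.items()):
            if hmac.new(salt, candidate, hashlib.sha1).digest() == digest:
                found[field] = str(addr)
                del pending[field]  # stop testing entries already recovered
    return found


inside = make_entry("192.168.1.17")
outside = make_entry("10.9.9.9")
recovered = crack([inside, outside], "192.168.1.0/24")
print(recovered)
```

Because each entry has its own salt, the cost is roughly (number of entries) x (number of candidate addresses) HMAC computations, which stays small for a search space of a few million IPs.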
> It doesn't seem like there would be a clear solution. If they used a more expensive hashing algorythm like bcrypt, the GPUs could still crack the entire IPv4 address space. [...] Also, if bcrypt was used, this could cause slowness or performance issues potentially, especially for lower powered embedded devices.
But my personal opinion is, the entire thing just doesn't make much sense... Computing 10,000 rounds of PBKDF2, or a state-of-the-art KDF like Argon2 (which can consume ~4 GB of memory as the "Proof-of-Work" to stop GPUs), just to protect a humble IP address, seriously? Even if you guard your IP address like a private key, an attacker with a grid of GPUs can probably still use their resources to get the information from elsewhere, like capturing some packets, or scanning the entire IPv4 Internet with ZMap...
To me, if you seriously need to hide your hostname for your security, I would say the security is broken anyway... But in case it is really needed, to my mind there are two permanent solutions:
1. Use IPv6.
2. Introduce EncryptKnownHosts. You can implement it yourself using a shell script that calls gpg before spawning an SSH instance.
Unlike 10,000 rounds of PBKDF2, this solution is absolute.
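A do-it-yourself "EncryptKnownHosts" wrapper along those lines might look like the sketch below. There is no such OpenSSH option; this only combines `gpg` with the real `UserKnownHostsFile` setting. The file paths, the choice of symmetric encryption, and the function names are all assumptions, and the decrypted file does sit on disk while ssh is running.

```python
import os
import subprocess


def build_commands(encrypted_path, plaintext_path, ssh_args):
    """Return the (decrypt, ssh, re-encrypt) command lines without running them."""
    decrypt = ["gpg", "--quiet", "--yes",
               "--output", plaintext_path, "--decrypt", encrypted_path]
    # Point ssh at the temporary plaintext copy instead of ~/.ssh/known_hosts.
    ssh = ["ssh", "-o", "UserKnownHostsFile=" + plaintext_path] + list(ssh_args)
    # Assumption: passphrase-based (symmetric) encryption is good enough here.
    encrypt = ["gpg", "--quiet", "--yes", "--symmetric",
               "--output", encrypted_path, plaintext_path]
    return decrypt, ssh, encrypt


def run_wrapped_ssh(encrypted_path, plaintext_path, ssh_args):
    """Decrypt known_hosts, run ssh against it, then re-encrypt and clean up."""
    decrypt, ssh, encrypt = build_commands(encrypted_path, plaintext_path, ssh_args)
    subprocess.run(decrypt, check=True)
    try:
        return subprocess.run(ssh).returncode
    finally:
        # Re-encrypt so newly learned host keys are kept, then drop the plaintext.
        subprocess.run(encrypt, check=True)
        os.remove(plaintext_path)
```

Something like `run_wrapped_ssh("known_hosts.gpg", "/dev/shm/known_hosts", ["user@host"])` would be the intended use, with both paths being placeholders.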
This wasn't mentioned in the post, but imagine you compromised a server and found an unprotected ssh key. You don't know where it can be used, and the .bash_history has rolled over or has very few ssh commands in it. You see a lot of hosts in the known_hosts file, but it is hashed. That is where this would be helpful, and is why I went down this route.
Now what can the attackers do? Well, they still have hashes of public keys. The attacker can scan the entire IPv4 Internet with ZMap and record all SSH public keys. With some hashing, the host can be identified. With online services like Censys (https://censys.io/), the attackers don't even have to scan and compute, but can directly obtain the information from a public database...
Also, to make it clear, while I'm saying that the attack is too impractical to make sense, I have full respect for your research project. Thanks for analyzing this security issue for the community.
Probably don't want to admit publicly that you compromise servers, though I assume you mean with consent. :-)
This isn't really a vulnerability. It's a curiosity. If you're on the server (or NFS), and have access to this information already, then you're already in pretty damn good shape. This just allows you to be more thorough.