The attack that the researchers describe is very impressive, and using traffic analysis and error messages to find the details of an open TCP connection is extremely clever.
Unfortunately a similar approach can be used even more practically to target DNS on the VPN:
Encrypted DNS queries and replies can be profiled by traffic analysis, and the reply "paused", making it easier to ensure that a DNS spoofing attempt will succeed. This is a good reminder that cryptographic protections are best done end to end; DNSSEC does not help with this attack, because it does not protect traffic between the stub resolver and the resolver. It's also a good reminder that traffic analysis is still the most effective threat against network encryption.
This disclosure only deals with the specific threat against active TCP connections, but there are more coming.
It's taken a lot of focus and attention to make TLS reliable enough to be a default in browsers, and DNSSEC is not particularly close. DNSSEC supports out-of-date cryptography and has no negotiation mechanism to avoid it, which makes it very hard to use as a default e2e security protocol. It also doesn't encrypt anything.
It doesn't validate that the answer provided is the one published by the owner of the domain you queried. DNSSEC tries to provide this validation; roughly, it is similar to signing a message with PGP.
You can argue that DNSSEC is not up to today's crypto standards and no longer trustworthy for its intended purpose, but without some protocol that provides the guarantees DNSSEC aims for, e2e encryption is only half a solution to DNS security.
His point is that DNSSEC doesn't work the way you appear to think it does. Conceptually, it's meant to prove that a DNS record in a response is actually a record created by the owner of the zone. But in practice, the cryptographic signature isn't valuable to stub resolvers on the end system; instead, end systems trust their "DNS servers" (their resolving cache server) to perform DNSSEC validation for them. The success of that validation is conveyed in a single "AD" bit in the header of the DNS response.
Since this attack happens from a vantage point in between the end system and the resolving cache, DNSSEC isn't in play; the attacker will simply set the AD bit in their responses.
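You can see this for yourself with dig (example.com is a signed zone; output trimmed to the relevant header line, and the exact counts will vary by resolver):

    $ dig +dnssec example.com A
    [...]
    ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
    [...]

That one "ad" flag is everything the stub learns about validation, and an on-path attacker can set it just as easily as an honest resolver can.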
I ran it for a while. It was problematic from a UI perspective (nothing to do with DNSSEC per se — it was just immature), and it wasn’t fantastic with captive portals.
Also, it noticed when I used an internet connection whose ISP was hijacking google.com. So it worked!
DoH without DNSSEC means the results are modifiable by the authoritative servers, the recursive resolver, and any of the links between them. DNSSEC with a local recursive resolver prevents this. (Of course, if you want privacy from your ISP, you then have to tunnel the recursive lookups through VPN(s).)
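If anyone wants to try this, a minimal sketch with unbound as the local validating resolver (the trust-anchor path varies by distro):

    # /etc/unbound/unbound.conf (fragment)
    server:
        interface: 127.0.0.1
        # validate DNSSEC locally instead of trusting an upstream's AD bit
        auto-trust-anchor-file: "/var/lib/unbound/root.key"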
If DoH is the best one can do, then do DoH. But it's wrong to shout down DNSSEC until an alternative method for authentication comes along.
Regardless: for the attack that we're discussing on this thread, DNSSEC is not relevant; or rather, this particular attack is a good illustration of how irrelevant DNSSEC is to realistic attack scenarios.
1. DNSSEC (with clients doing full recursive lookups, ofc) allows zones to be signed offline, away from the authoritative server, meaning a compromise of the authoritative server isn't an attack point.
2. With the client performing full recursive lookups themselves, DoH provides partial query privacy.
3. With the client delegating to a trusted third party (e.g. Mozilla), DoH provides full query privacy modulo that trusted third party (the TTP can break both privacy and integrity)
Comparing (2) to (1) I do see your argument much better now.
However, DNSSEC still has the property of E2E validation that can be used for more than what current software does. One could write a resolver that shipped records laterally between peers to extend its privacy properties beyond either DoH setup. Adopting this wouldn't require all the authoritative servers to get on board, just a community of end-users. This is where my argument is coming from, especially with DoH meaning (3) to most users and the general course of what happens to trusted third parties.
DNSSEC does not have E2E validation. DNSSEC is validated between the recursive cache server and the authority server. The protocol explicitly delegates validation away from endpoints with the AD bit. You can run a full resolver on your laptop the same way you can run a full-feed defaultless BGP4 peer on your laptop; it'll "work fine", but that's simply not how it's deployed in reality.
Another dramatic difference between DNSSEC and DoH is that DoH works whether or not zones are signed (only a tiny fraction of the zones people actually query are signed). Nobody needs "permission" to protect their queries with DoH, but everyone has to cooperate to make DNSSEC work.
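To make the "no permission needed" point concrete: curl has been able to do DoH since 7.62, so protecting a query is a one-liner (the resolver URL is just one public example):

    curl --doh-url https://cloudflare-dns.com/dns-query https://example.com/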
Since the value DNSSEC provides is made more marginal with each passing quarter --- because of MTA-STS, because of multi-perspective CA DNS validation, because of DoH, because of LetsEncrypt making X.509 certificates free, because of certificate transparency --- the rationale for its continued deployment has become extremely thin. It's 1990s cryptography --- queries aren't even encrypted! --- that people are advocating we forklift into the Internet to solve... it's hard to see what problem?
A better plan would be to take DNSSEC back to the drawing board and come up with a modern alternative to it. DNSSEC itself is a failed protocol.
My comment compared the strengths of each protocol under the fairest interpretation of each. Your judgement here does not do this - it's obvious that a stub resolver relying on a third party to do verification is braindead, and clients doing a full recursive lookup is the correct answer. How clients are currently set up has little bearing on a discussion of a protocol's properties.
> Nobody needs "permission" to protect their queries with DoH
This is also false if you compare the protocols on equal footing - if the authoritative servers are not speaking DoH/DoT, then queries are only partially protected. In order to do "DoH across the path" as you said above, cooperation is needed.
> A better plan would be to take DNSSEC back to the drawing board and come up with a modern alternative to it
Sure, but this becomes harder when things like DoH are touted as being a sufficient replacement...
I'm not interested in a debate about a fictitious version of DNS that you make up as the discussion progresses. I think we can probably just wrap up here.
I would be interested in any stats showing that the DNS system actually "relies" on clients sharing caches. Firing out UDP packets is a heck of a lot easier than a TCP/TLS session, and modern websites take the latter for granted for every single user.
If clients sharing a cache is actually important, that's actually a negative point for DoH/DoT as increased resource utilization means that major authoritative servers will be tempted to form a clique with major recursive resolvers, rather than everyone being able to query the zones directly.
Things like DNSCurve do the same, but somehow nobody got excited about that.
Not that DNSSEC is useless, but we should worry about tree validation AFTER having encrypted every stage of the path.
The two are complementary.
So "Why is this supposed to be interesting?" is a bit rude for someone that was trying to answer a reasonable interpretation of your question.
But if I came off as personally rude, I apologize and will try harder not to do that.
It's better not to bring it up.
That comment there? I've never seen it quoted in context.
It’s being brought up because it’s very apropos. Someone, somewhere said x problem needs a better solution, and someone else replied that it can be done on one specific Linux distro with a specific kernel version or configuration.
What am I missing here?
Hang on, what is 'it' in this sentence?
Because the dropbox comment was skeptical of the need for a "better solution". In this interpretation, 'it' is an explanation of how to solve the problem the old-fashioned way.
But the comment we're replying to agrees that we need the "better solution" of DNSSEC, and is suggesting a way to deploy the "better solution". In this interpretation, 'it' is the "better solution".
Those two ways of interpreting 'it' are opposites. The two comments are doing very different things.
The overlap here is simple. The question "What end-user OS does this?" was answered with...well, gibberish...and tptacek's reply resonated with me and reminded me of the dropbox comment. I think that's about as well as I'll ever be able to explain it. The fact that the two agree that there's a better solution is one facet of the discussion taken out of context, that doesn't even factor into my response or this whole spiel.
What I'd really like to ask though is what your motivation is for mounting such a defense. I seriously doubt it has to do with it having "very legitimate points" or you would've brought them up by now. Also, re-reading that thread, the OP ends up agreeing with everything except that it shouldn't be marketed as a USB replacement.
I completely stand by my decision to reference that comment in jest and will bring it up again!
I'm saying that the comments are barely similar at all. Yes, they both suggest how to do something on linux. That's the only similarity.
> answered with...well, gibberish...and tptacek's reply resonated with me and reminded me of the dropbox comment
But the dropbox comment isn't gibberish...
> What I'd really like to ask though is what your motivation is for mounting such a defense.
Because it annoys me when people misrepresent the comment as a fool who couldn't see the value of Dropbox, too attached to some overly-complex system not applicable to normal users. He clearly did see the value of Dropbox. He said right there that it was "very good" for Windows users. And the mocked point was only one out of three.
> I seriously doubt it has to do with it having "very legitimate points" or you would've brought them up by now.
I didn't bring them up because I thought it was obvious, and it would be a waste of time to list them. But fine, I'll do it.
The post has three points:
The point about cobbling something yourself is a bad point. But it was very strictly limited in scope.
The point about not replacing USB drives is both correct and important.
The point about "not being viral" is agreed to be correct by dhouston, because the viral parts were secret at that time.
So that's two good points out of three.
Anyway, I think DNSSEC is stupid, so I'm not advocating for using this tool or enabling it in OS's as default policy.
"Encrypted" only in the sense of encrypted by the VPN which you've sidestepped. An increasing fraction of DNS traffic from real clients will have been encrypted under DPRIVE, and so sidestepping the VPN doesn't help you spoof that. In this case the connection to a "real" resolver has TLS guarantees about integrity / authenticity and so you can piggyback DNSSEC assurances on top of that if that's how you choose to do things.
By the way, the error side channel is very of-this-moment - something that, for example, very much concerned the developers of protocols like QUIC and TLS 1.3. The reaction reminds me of an anonymous criticism of Unix, using a car analogy, from the UNIX-HATERS handbook:
> ... If the driver makes a mistake, a giant “?” lights up in the center of the dashboard. “The experienced driver,” says Thompson, “will usually know what’s wrong.”
During TLS 1.3 development, contributors more than once expressed a desire for a feature - maybe an optional feature - to get more verbose or detailed errors than those the protocol already provides, in hopes that it would ease debugging. Old hands correctly urged caution: the giant ? is one less weapon for the bad guys.
If you send garbage to a WireGuard endpoint as I understand it nothing at all happens. QUIC is a little less tight-lipped, but still endeavours to ensure that a third party can't distinguish anything useful by injecting garbage and looking at what happens next.
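This is easy to check if you have a WireGuard peer to poke at (the address, port, and interface here are placeholders):

    # fire junk at the WireGuard UDP port and watch for any reply
    echo "not a handshake" | nc -u -w2 203.0.113.5 51820
    # meanwhile, tcpdump should show the packet go out and nothing come back
    tcpdump -ni eth0 'host 203.0.113.5 and udp port 51820'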
To associate ed's spartan error handling with security is a bit of a retcon, though. Interesting analogy, but that minimalism was not motivated by security concerns and had basically zero security benefits (unless you count poor usability as a form of security through obscurity).
I'm aware that the story isn't about security (maybe that didn't come through in my post); it's just that the story always comes to mind when talking about how verbose to make error messages. As you illustrate, there are real trade-offs; they're just different today than in Ken's PDP programming days.
As far as I understand, the idea is to send garbage not to the VPN endpoint, but to any interface on the machine that the VPN runs on, with the VPN tunnel's IP as the destination.
The fact that the machine would even consider accepting that leaves me speechless.
I was describing the behaviour of rather newer systems like QUIC, TLS 1.3 and WireGuard which have decided that maybe discretion is the best option.
It seems that so far I've confused everybody who read what I wrote (at least everybody who replied), so I apologise for that.
Anybody happen to know why systemd decided this was something they should be messing with?
The explanation in the commit message said this:
> sysctl.d: switch net.ipv4.conf.all.rp_filter from 1 to 2
>
> This switches the RFC3704 Reverse Path filtering from Strict mode to Loose mode. The Strict mode breaks some pretty common and reasonable use cases, such as keeping connections via one default route alive after another one appears (e.g. plugging an Ethernet cable when connected via Wi-Fi).
>
> The strict filter also makes it impossible for NetworkManager to do connectivity check on a newly arriving default route (it starts with a higher metric and is bumped lower if there's connectivity).
>
> Kernel's default is 0 (no filter), but a Loose filter is good enough. The few use cases where a Strict mode could make sense can easily override this.
>
> The distributions that don't care about the client use cases and prefer a strict filter could just ship a custom configuration in /usr/lib/sysctl.d/ to override this.
I do not know enough about NetworkManager or sysctl parameters to completely understand if this is a valid reason or not, but this sounds like an acceptable explanation for the change to me.
There was a NEWS entry for this here: https://github.com/systemd/systemd/blob/230450d4e4f1f5fc9fa4...
As to _why_ they would make the change: systemd tries to be a sane base system for distros to build on. They probably deemed this change useful in a base system, and especially in the context of laptops and mobile devices the new default seems sane.
Systemd tries to force a monolithic operating system to behave like a microkernel, and uses propaganda and manipulation to enforce whatever preference its designers have for how all systems should run onto all the Linux distributions it can. You can call that 'sane', I'll call it totalitarian empire-building.
I can verify that turning rp_filter on in my use case is an acceptable solution on Manjaro and Ubuntu.
EDIT: To clear up the confusion below: I was saying that setting the rp_filter variable to strict mode does prevent this attack from working against IPv4, and in my situation that is enough, since I am not using IPv6 or any complicated routing on my network.
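For anyone wanting to do the same, a sketch (IPv4 only; there is no IPv6 rp_filter sysctl, so IPv6 needs an equivalent netfilter rule instead):

    # RFC 3704 strict reverse-path filtering; to survive a reboot,
    # also drop these two settings into a file under /etc/sysctl.d/
    sysctl -w net.ipv4.conf.all.rp_filter=1
    sysctl -w net.ipv4.conf.default.rp_filter=1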
You write that setting rp_filter to strict mode prevents the attack? The report, though, says only that parts of the attack can't be accomplished.
Does it completely prevent the attack, or only parts of it? This isn't clear from this comment or from the initial oss-sec disclosure.
This is important, as Arch Linux is weighing whether or not to patch.
GCC 2.96, Pulse Audio, NetworkManager, systemd... I've assumed the death of (truly open) Linux would be at the hands of Redhat, this is just another stab wound.
As far as I understand, the attacker can:
- Detect an active VPN connection (and maybe close it/monitor it)
- Attempt to inject packets: here I am skeptical of the usefulness. The connection between the server and the client is normally encrypted, meaning the injected packets will be dropped, or at best can be used to forcefully close the connection, making it a DoS attack.
Keyword: normally. What about DNS? DNS without DNSSEC via an untrusted AP cannot be trusted, that's for certain. Hence you tell your users to activate the VPN first to avoid such attacks, because now it's encrypted. Suddenly that might change, and your trusted-page.internal resolves to a different address. This /should/ be preventable via HSTS and certs, for example with HTTP, but those are assumptions again. Or what about RDP? It's not encrypted, so you hide it in a VPN - mostly due to the cleartext password. But suddenly there's a vector that might be able to inject data into an RDP stream inside a VPN connection.
> - Detect an active VPN connection (and maybe close it/monitor it)
Not just that. They can detect TCP connections inside the VPN connection. Hence you can start tracking whether users of an access point visit anti-gov.com even if they go through a VPN. It might be detectable on the client side, and it should be possible to mitigate via configuration, but that's still plenty scary.
The vulnerability here is unencrypted TCP streams running (purportedly) over a VPN. Not TLS streams (HTTPS and HTTP/2.0+), unless you've also got a TLS 0-day.
(And maybe also unencrypted UDP sessions and unencrypted services, but that's less clear to me.)
(I also now understand why the kernel developers might be looking at it as a vulnerability. Neighbors in your subnet should not be able to inject packets into your computer's internal routes.)
ELI5 would be in order indeed.
> It should be noted, however, that the VPN technology used does not seem to matter
I was wondering which VPNs are affected, but after reading it seems that the VPN doesn't matter and it's instead a kernel bug? I wonder if this also affects a host that has two network interfaces, with no VPN involved whatsoever?
Also, it’s good to see Jason Donenfeld involved, whom many will likely recognize as the author of WireGuard.
I think this class of attack can best be mitigated at the kernel level, but unfortunately, side-channels exist anywhere that maintains state, and this might be a more general routing problem.
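# Reading the rule below: in the raw table, before routing, drop any
# packet that did NOT arrive on wg0 but is addressed to the tunnel IP
# (10.182.12.8 in this example) and whose source address is not local
# to this machine.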
iptables -t raw -I PREROUTING ! -i wg0 -d 10.182.12.8 -m addrtype ! --src-type LOCAL -j DROP
I don't like having to use iptables in wg-quick(8), and so Willy Tarreau and I have been discussing a deeper kernel-level fix, which should be posted to netdev@ sometime soon.
For example, suppose I have a private physical subnet 10.12.0.0/24 (perhaps in an AWS VPC).
I want to allow clients to access these private hosts over a WireGuard VPN, so I set up a VPN with all clients having IPs from 10.34.0.0/24. Because I want these clients to have access to the private physical subnet, each client's config has
AllowedIPs = 10.34.0.0/24, 10.12.0.0/24
I add a route in the VPC to send all packets destined for 10.34.0.0/24 to the central WireGuard "server"; thus the WireGuard server acts as a gateway between the virtual 10.34.0.0/24 subnet and the physical 10.12.0.0/24 network.
The packets originating from the 10.12.0.0/24 hosts are not local, but I definitely want to route them onto the virtual 10.34.0.0/24 network.
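Concretely, the client side looks something like this (keys, the endpoint name, and the client address are placeholders, not from a real deployment):

    [Interface]
    # this client's address on the virtual subnet
    Address = 10.34.0.2/24
    PrivateKey = <client-private-key>

    [Peer]
    # the central WireGuard "server" that gateways into the VPC
    PublicKey = <server-public-key>
    Endpoint = vpn.example.com:51820
    AllowedIPs = 10.34.0.0/24, 10.12.0.0/24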
1. Does encouraging ORCHID addresses reduce the impact of enumeration attacks?
2. Linux at least has controllable behavior for cross-interface IP reachability, in the arp_filter/arp_announce/arp_ignore per-interface sysctls and in IP address scope, as exposed by iproute/netlink. Perhaps it's more proper for VPN addresses to be scope 'link' addresses instead of scope 'host' addresses. Maybe a 'vpn' scope of some sort could be defined in future kernels, but I'm uncertain what that would do that a scope link address does not.
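For instance, with today's knobs (the sysctl and the address scope are existing mechanisms; the tunnel address is borrowed from the iptables rule upthread, and whether link scope actually helps here is the open question):

    # reply to ARP only when the target address is configured on the
    # interface the request arrived on
    sysctl -w net.ipv4.conf.all.arp_ignore=1
    # assign the tunnel address with link scope rather than the default
    ip address add 10.182.12.8/32 dev wg0 scope link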
It seems as though (correct me if I'm wrong) the CVE requires an attacker to know or guess the target's IP on the VPN, they'll find out if they're right but they don't get a hotter/colder type feedback. So that opens the possibility to play a randomisation game. On IPv4 this is a very marginal benefit. But if a WireGuard client has been given a random 64-bit suffix for some particular IPv6 subnet then unless I misunderstand the attacker needs to probe all such suffixes until they find the correct one, and they can't realistically do that even on a fast network. If I'm right that's a pretty good mitigation (on IPv6).
But for v4? Start your search with private networks (10./8, 192.168./16, etc) and just enumerate /24s from .1, .2, .3 and I expect more often than not you'll do much better than chance.
So, would a bit of address space randomization mitigate step one for ipv6? fd00::/8 is pretty big, right? Even just picking a random IP in a random /64 from that /8 should help? Or am I missing something?
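For a sense of the space involved, generating such an address is trivial (a sketch; Python just for the arithmetic):

    # random address somewhere in fd00::/8 -- 2^120 possibilities
    python3 -c "import secrets, ipaddress; print(ipaddress.IPv6Address((0xfd << 120) | secrets.randbits(120)))"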
Also, interesting comments in the reply on the list regarding "policy based" VPNs. I wonder if some subset of that infrastructure could be used by WireGuard without complicating it all the way out of its current nice, secure-by-simplicity design?
> Only route based VPNs are impacted. In comparison, policy based VPNs are not impacted (on Linux only implementable using XFRM, which is IPsec-on-Linux specific) unless the XFRM policy's level is set to "use" instead of "required" (the default), because any traffic received that matches a policy (IPsec security policy) and that is not protected is dropped.
I'm not sure that I understand exactly what "network adjacent attacker" means. But I'm guessing that it means an attacker on the same subnet. And that it doesn't involve actually hacking VPN encryption.
But isn't it well understood that sharing LANs with untrusted neighbors is hugely risky? At least, I always segregate critical machines in protected subnets.
Or am I missing something?
Edit: OK, I was missing that this focuses on using VPNs via WiFi APs. And depends on the AP being malicious. So yeah, this is a huge issue, for that use case.
I'm unfamiliar with exactly how to tell what's in the base system and what's in ports, but I can see openvpn* entries over at http://ftp.openbsd.org/pub/OpenBSD/6.6/packages/amd64/ - does that mean there's arguably a hole in the base distro?
In any case, nice work.
Question: besides randomizing packet lengths (preferably with per-packet-type minimum and maximum tunables that each site can change, to further add entropy), what else can be done to mitigate this family of attack?
Asking as someone interested in developing "ubiquitous secure container" type protocols that are application- and use-case-specific but (theoretically) high-stakes.
Packages are built from the ports tree.
No, any packages you see there are, by definition, not part of the base install. They are extra packages you can install later.
Why is Linux accepting packets coming from one interface into an IP address belonging to a different interface? It feels like it is "forwarding" the packets internally, but `ip_forward` is turned off.
Is there any case where this behavior is legitimately useful?
For the specific case of point to point VPNs, there's a rule that makes sense. But that's not part of the network stack per se and there's no way to enforce it generically.
> such as keeping connections via one default route alive after another one appears (e.g. plugging an Ethernet cable when connected via Wi-Fi).
I mean, suppose the computer has WiFi IP address 10.0.0.3 and Ethernet IP address 10.0.0.5; then after NAT the return packets will go to 10.0.0.3 and should therefore arrive on the WiFi interface, not the Ethernet interface (or, if they don't, how do they know which interface they should go to?).
I understand how cross-interface packets can be used maliciously. I'm just trying to figure out the non-malicious use cases for them.
The server also runs some service, say ssh, and you have a name for it in the DNS that resolves to one of its IP addresses. When you type "ssh vpn-server.example.com" it should work regardless of whether you're in New York or London, right?
If 192.168.0.42 can reach 192.168.1.42 by routing through the VPN server then it should generally also be able to reach 192.168.1.1 on the VPN server itself.
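For the curious, you can ask the kernel which way it would send such a packet (addresses as in the example above):

    # which route, interface, and source address would be used?
    ip route get 192.168.1.1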
The described attack utilized a malicious router.
I imagine, in theory, that any middle router (such as your ISP's) could then be used for such an attack. Imagine Comcast being able to inject their garbage into even VPN sessions. Or a government actor that Comcast is known to route for.
This isn't "How does the packet get fixed?", it's "How did a packet going to the WiFi IP get transmitted to the Ethernet port in the first place?"
If I just make sure that incoming packets destined for the VPN LAN are dropped, this attack does not work?
Of course there are such rules in our firewalls??
Is everyone walking around without any firewall filtering nowadays? How is this a bug? Maybe I am just stupid. Did I miss something?
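For reference, the kind of rule I mean (interface name and subnet are from my own setup; adjust to taste):

    # drop anything addressed to the VPN subnet that shows up on the
    # physical uplink instead of the tunnel interface
    iptables -t raw -I PREROUTING -i eth0 -d 10.8.0.0/24 -j DROP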
It's also, like, an escalation from a relatively high privilege level. You need both a passive network observer (compromised router or ISP) and a noisy, active LAN device (to inject IP addresses that a router would filter as bogons). That's not to say this is crazy hard; these are definitely within the reach of a motivated attacker. Routers were the original the-S-in-IoT-is-for-security, and if you've got an IoT type device on the LAN it's probably vulnerable once you've popped the router.
What about NetBSD?
This also appears to be wireless-only, i.e., the attacker needs control over the AP. Am I reading this wrong?
> The attacker can now inject arbitrary payloads into the ongoing encrypted connection using the inferred ACK and next sequence number.
First, it's hard to get to this point. Then, you're injecting garbage because you don't know the payload encryption keys. So it's just disruptive, even though yes technically it is 'hijacking'.
Unless I missed something, of course! I found the writeup to be vague on the payload encryption point. It should have explicitly stated the impact one way or the other.
If you are relying on the VPN encryption to protect unauthenticated communication, then you're SOL. The vulnerability isn't in the VPNs themselves, precisely, but in the way VPNs and packet routing interact.
For your other point, they don't need to know the keys (unless the traffic travelling /over/ the VPN is /also/ encrypted). That's the whole point. This is about being able to trick the machine into accepting traffic for a connection, from an interface that the connection's traffic isn't travelling over. If it's plaintext within the VPN, this sidesteps the VPN interface and you can indeed inject your own malicious plaintext traffic into that connection.
These facts reduce the impact because it's not just "be a guy on the internet", e.g. an open database of PII sitting there for the taking, needing only discovery. In no way am I claiming the attack isn't feasible. It's definitely a real risk, beyond the theoretical.
2. Thanks, got it. That makes more sense.
I think I actually like this vuln. It reinforces the need for defense in depth. It reduces a takeover to an annoyance (could be critical for some apps, yes) assuming you use TLS at the app layer.
1) Anyone who can compromise the residential gateway in your home
2) Anyone & anything connected to the same home network as you (incl. any IOT devices; and no, they don't need Internet access)
3) Anyone who can compromise the residential gateway in whatever coffee shop you happen to be in
4) Anyone & anything connected to the same coffee shop network as you
... and, like I said, easily scriptable, with the tools already available to carry it out.
It's not a doomsday scenario, but it is pretty bad. The one saving grace is that most apps these days use some kind of application-level authentication and/or encryption, e.g. TLS.