The Evolution of IPsec [pdf] (columbia.edu)
45 points by DyslexicAtheist 10 days ago | 19 comments





>Other technologies, especially NATs and firewalls, got in the way of IPsec

This is a bit understated. From my perspective this is 100% what killed IPsec. By the time workstations had the processing power to use IPsec for everything, we were rolling out NAT boxes everywhere to deal with the IPv4 crisis.

IPsec NAT traversal was either non-existent, or the connection tracking was keyed on the IP address alone, so only one person behind the NAT box could establish a session with a given IPsec endpoint on the other side.

I remember it was common to not be able to connect to the corporate ipsec VPN when at a conference with another employee who was already connected. Limitations like this made it super unreliable and pushed people to VPNs built on top of the standard NAT-friendly transport protocols.


What killed IPsec is that its architecture is just wrong. The IPsec architecture involves authenticating peers and authorizing them to IP addresses. Yes, that's the architecture! See RFC 4301 [0].

That's not good because IP addresses are way too variable. This approach results in configuration hell. No one wants to admin IPsec.

Instead IPsec should have bound authenticated IDs into packet flows (e.g., TCP connections) for the duration of those flows (i.e., until the sockets are closed), and these IDs should be exposed to apps (e.g., via socket options) so that applications can perform authorization (based on authenticated IDs) or channel binding into application-layer authentication protocols. IKEv2-authenticated IDs could be ephemeral / anonymous (e.g., raw ephemeral public keys). And these IDs can be bound into authentication at application layers.

Now, I am biased -- I wrote an RFC on this subject: RFC 5660 [1].

ESP is a fine protocol. So is IKEv2. What's not fine is the architecture. If instead you bind extant packet flows to authenticated IDs, then the architecture works.

Note that this alternative architecture can work with NATs and firewalls because IP addresses aren't relevant in it. And this is very similar to all the non-IPsec VPNs out there, only those others don't use ESP. But there's a benefit to using ESP: it's easier to design hardware to offload ESP than to offload anything else.

  [0] https://tools.ietf.org/html/rfc4301
  [1] https://tools.ietf.org/html/rfc5660
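
The flow-binding idea in the comment above (RFC 5660 calls it connection latching) can be sketched roughly as follows. This is only an illustration, not the RFC's mechanism or any real socket API: the `FlowLatchTable` class, its method names, and the string peer IDs are all hypothetical, and a real implementation would live in the kernel's IPsec path.

```python
class FlowLatchTable:
    """Hypothetical table binding a packet flow to the peer ID that authenticated it."""

    def __init__(self):
        self._latched = {}  # flow tuple -> authenticated peer ID

    def open_flow(self, flow, peer_id):
        # Latch the authenticated ID when the flow (e.g. a TCP connection) is created.
        self._latched[flow] = peer_id

    def accept_packet(self, flow, peer_id):
        # A packet is accepted only if it arrives under the same authenticated ID.
        latched = self._latched.get(flow)
        return latched is not None and latched == peer_id

    def peer_id_for(self, flow):
        # What an application might read (e.g. via a socket option) to authorize the peer.
        return self._latched.get(flow)

    def close_flow(self, flow):
        # The binding lasts only until the socket is closed.
        self._latched.pop(flow, None)


# Illustrative flow tuple and IDs (assumptions, using documentation address ranges).
flow = ("tcp", "198.51.100.7", 40000, "203.0.113.9", 443)
table = FlowLatchTable()
table.open_flow(flow, "alice-public-key")

authorized_ids = {"alice-public-key"}              # application-level policy on IDs
print(table.peer_id_for(flow) in authorized_ids)   # True
print(table.accept_packet(flow, "mallory-key"))    # False
```

Authorization here is a decision on the authenticated ID, made by the application, rather than a mapping from peers to IP addresses configured by an administrator.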

In almost 15 years I haven't seen either a hardware vendor or a software implementation of IPSEC that did not include NAT-Traversal. It's been a standard since 2005:

https://tools.ietf.org/html/rfc3947


It doesn't work if both ends have NAT.

Your statement is just plain wrong. It most certainly does work if both sides of the tunnel are behind NAT.

If that were true, nobody on a home cable router or DSL connection would be able to connect to their corporate office, which almost certainly uses NAT and RFC 1918 addresses. I have set this up dozens of times for both client-to-site and site-to-site IPSEC VPNs.

This will even work for site-to-site VPNs where there is overlapping RFC 1918 address space on both ends.

NAT-T negotiation happens during the first phase of IKE negotiation. See the section "IPsec NAT Transparency (NAT-T)" here:

https://www.networkworld.com/article/2288666/lan-wan/chapter...


but the IKE wants to go from udp port 500 to udp port 500, even if they later agree on nat-t over udp 4500, so you can't have two boxes behind the nat answering IKE connections, right? The flow can go outwards if one end is behind nat, but you can't have lots of ipsec-wanting clients behind same nat.

The responder sends the reply to whichever address/port the initiator appears to be using. NAT-T is no different in this regard than any other protocol like HTTP, QUIC, WireGuard, etc.

To be fair, it also suffers from the same issue as all of those: when a NAT gateway decides to remap the port (because it hasn't seen traffic in 30 seconds, because it got rebooted, whatever), traffic from the responder to the initiator is dropped. So the initiator has to constantly send keepalive messages so the responder has an up-to-date notion of where to send packets. The shorter the interval, the more wasteful traffic; the longer the interval, the more often you get frustrating delays.

I've personally run into an issue where a stateful UDP gateway (downstream from the NAT gateway) had a 30s timeout on UDP flows. The IPSec initiator also sent NAT-T keepalives at 30s intervals. Because of drift and jitter, every few hours the 30s periods would synchronize so that the gateway dropped the state immediately before the next keepalive passed through, resulting in a very annoying 30s pause in activity. Most traffic was going from responder to initiator as the VPN was effectively a reverse proxy onto a corporate LAN, so one result was SSH sessions becoming unresponsive for 30s.
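
A rough way to see why equal 30s values are fragile is the simulation below. It is a simplified model under stated assumptions, not a reproduction of the setup described above: it uses random per-packet jitter rather than slow clock drift, assumes only keepalives refresh the gateway's state, and treats a gap equal to the timeout as expired. The function names and the 0.05s jitter are made up for illustration.

```python
import random


def keepalive_gaps(interval, jitter, count, seed=0):
    """Return `count` gaps (seconds) between keepalives, each off by up to +/- jitter."""
    rng = random.Random(seed)
    return [interval + rng.uniform(-jitter, jitter) for _ in range(count)]


def count_expired(gaps, gateway_timeout):
    """Count gaps long enough that the gateway drops the flow before the next keepalive."""
    return sum(1 for gap in gaps if gap >= gateway_timeout)


# Keepalive interval equal to the gateway timeout: some keepalives arrive too late.
same = keepalive_gaps(interval=30.0, jitter=0.05, count=1000)
print(count_expired(same, gateway_timeout=30.0))

# Keepalive interval comfortably below the timeout: none arrive too late.
shorter = keepalive_gaps(interval=20.0, jitter=0.05, count=1000)
print(count_expired(shorter, gateway_timeout=30.0))  # prints 0
```

Keeping the keepalive interval well under the smallest timeout on the path avoids the race, at the cost of more keepalive traffic.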

To reiterate, there are no simple solutions to these sorts of problems.


>"but the IKE wants to go from udp port 500 to udp port 500, even if they later agree on nat-t over udp 4500, so you can't have two boxes behind the nat answering IKE connections, right? "

Not at all. The box handling the IKE connection is not behind NAT; it is in front of NAT, and port 500 is on a public IP.

Practically speaking it is usually the same box doing NAT and IPSEC.

>"The flow can go outwards if one end is behind nat, but you can't have lots of ipsec-wanting clients behind same nat."

No, that statement is completely false. The technique used is called "overloading"; Cisco calls this PAT. It assigns a unique source port for each UDP or TCP session on the device doing the IPSEC termination.
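
The overloading/PAT idea named above can be sketched as a small translation table. This is a minimal illustration, not how any particular vendor implements it: the `PortAddressTranslator` class, its method names, the starting port 20000, and the addresses are all assumptions.

```python
class PortAddressTranslator:
    """Toy PAT table: many inside (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=20000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._out = {}  # (inside_ip, inside_port) -> public_port
        self._in = {}   # public_port -> (inside_ip, inside_port)

    def translate_outbound(self, inside_ip, inside_port):
        # Allocate a unique public source port the first time a session is seen.
        key = (inside_ip, inside_port)
        if key not in self._out:
            port = self._next_port
            self._next_port += 1
            self._out[key] = port
            self._in[port] = key
        return (self.public_ip, self._out[key])

    def translate_inbound(self, public_port):
        # Map a reply back to the inside host, or None if no session exists.
        return self._in.get(public_port)


nat = PortAddressTranslator("203.0.113.1")
# Two inside clients both using UDP source port 4500 get distinct public ports.
a = nat.translate_outbound("192.168.1.10", 4500)
b = nat.translate_outbound("192.168.1.11", 4500)
print(a, b)  # ('203.0.113.1', 20000) ('203.0.113.1', 20001)
print(nat.translate_inbound(b[1]))  # ('192.168.1.11', 4500)
```

Because the remote end replies to whatever public address and port it observed, the two sessions stay distinguishable even though both clients used the same source port.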


Another serious problem was/is the lack of interoperability. The key management protocol is complex, with features for everyone and their cat. Lots of people expected operating systems to implement IPsec, which would make it easier to manage at large scale. That didn't work out all that well.

Since you can't count on interoperability, especially not in the future, everybody uses tested server/client combinations. At that point the whole standardization process isn't that useful anymore, which somewhat negates any gain over using a much simpler protocol (whether TLS or something else).


So rather than securing connections at the os level, we re-invent the network stack past layer 3. First you bring your own crypto, then create a whole new virtual network stack, then create connections. Or do what Google does and bring your own crypto, then bring your own layer 5-7 routing and transport on the server. These are "interoperable" if 1) your users download an extra app, or 2) your server implements custom app-specific solutions.

>"Another serious problem was/is the lack of interoperability."

What interoperability issues with keying? IKE, which is pretty much universal, is built on a standardized framework, ISAKMP, which specifies how the key exchange mechanics work. IKE is made up of 3 mature and standardized protocols: ISAKMP, OAKLEY (which defines the exchange modes) and SKEME (for public-key-based exchanges). In a past life I configured VPNs, often between sites where there were disparate vendors on both ends. Never once did I run into a case where a VPN just wasn't possible because of an interoperability issue.


This also killed many other things, things we don't even know we're missing. It basically froze the lower parts of the protocol stack in time.

IPsec has a tunnel mode over UDP, which is of course NAT friendly.

Sorry, but why would that help? I (naively) would expect udp to fare even worse over nat because the return path isn't automatically allowed... unless it is?

Most firewalls now track UDP state, either at the application layer, or simply by allowing return traffic until a timeout period with no traffic occurs.
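
The simpler of those two approaches, allowing return traffic until an idle timeout, can be sketched like this. It is a toy model, not any firewall's actual behaviour: the `UdpStateTable` class, its method names, the 30s default, and the choice to treat an idle time equal to the timeout as expired are all assumptions.

```python
class UdpStateTable:
    """Toy stateful UDP filter: inbound is allowed only as a reply to recent outbound."""

    def __init__(self, idle_timeout=30.0):
        self.idle_timeout = idle_timeout
        self._last_seen = {}  # (inside_endpoint, outside_endpoint) -> last activity time

    def note_outbound(self, src, dst, now):
        # Outbound traffic creates or refreshes the flow's state.
        self._last_seen[(src, dst)] = now

    def allow_inbound(self, src, dst, now):
        # An inbound packet matches state recorded in the reverse direction.
        key = (dst, src)
        last = self._last_seen.get(key)
        if last is None or now - last >= self.idle_timeout:
            self._last_seen.pop(key, None)  # no state, or state has expired
            return False
        self._last_seen[key] = now  # return traffic also refreshes the state
        return True


client = ("192.168.1.10", 4500)
server = ("198.51.100.7", 4500)
fw = UdpStateTable(idle_timeout=30.0)

print(fw.allow_inbound(server, client, now=0.0))   # False: unsolicited inbound
fw.note_outbound(client, server, now=0.0)
print(fw.allow_inbound(server, client, now=10.0))  # True: reply within the timeout
print(fw.allow_inbound(server, client, now=45.0))  # False: idle for 35s, state dropped
```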

You block outgoing udp connections? DNS? QUIC? Good luck with http/3 as it becomes standard

No, inbound.

It is.

The author of this presentation, Steve Bellovin, is a real heavyweight of networking and internet security, going back to the early '80s and Bell Labs. For those interested, there's a good USENIX interview with him here:

https://www.usenix.org/system/files/login/articles/07_bellov...



