This is a bit understated. From my perspective this is 100% what killed ipsec. By the time workstations had the processing power to use ipsec for everything, we were rolling out NAT boxes everywhere to deal with the ipv4 crisis.
NAT traversal for ipsec was either non-existent, or the connection tracking was limited to the IP, so only one person behind the NAT box could establish a session with a given ipsec endpoint on the other side.
I remember it was common to not be able to connect to the corporate ipsec VPN when at a conference with another employee who was already connected. Limitations like this made it super unreliable and pushed people to VPNs built on top of the standard NAT-friendly transport protocols.
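To illustrate the one-client-per-endpoint limitation described above, here is a minimal sketch of a NAT box that tracks raw ESP by remote IP only, since ESP has no port numbers to rewrite. The class name `IpOnlyEspTracker` and the addresses are made up for illustration; real NAT implementations differ in the details.

```python
class IpOnlyEspTracker:
    """Toy NAT tracker for raw ESP, keyed only by the remote endpoint IP."""

    def __init__(self):
        # remote endpoint IP -> inside host that currently owns the mapping
        self.owner = {}

    def outbound(self, inside_host, remote_ip):
        """Return True if inside_host can use ESP towards remote_ip."""
        current = self.owner.setdefault(remote_ip, inside_host)
        return current == inside_host

    def inbound(self, remote_ip):
        """Return the single inside host that receives ESP from remote_ip."""
        return self.owner.get(remote_ip)


nat = IpOnlyEspTracker()
first = nat.outbound("10.0.0.2", "203.0.113.10")   # first employee connects
second = nat.outbound("10.0.0.3", "203.0.113.10")  # colleague, same VPN endpoint
other = nat.outbound("10.0.0.3", "198.51.100.7")   # colleague, different endpoint
print(first, second, other)
```

In this model the second client is refused only because the mapping for that remote IP is already taken, which matches the conference scenario.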
That's not good because IP addresses are way too variable. This approach results in configuration hell. No one wants to admin IPsec.
Instead IPsec should have bound authenticated IDs into packet flows (e.g., TCP connections) for the duration of those flows (i.e., until the sockets are closed), and these IDs should be exposed to apps (e.g., via socket options) so that applications can perform authorization (based on authenticated IDs) or channel binding into application-layer authentication protocols. IKEv2-authenticated IDs could be ephemeral / anonymous (e.g., raw ephemeral public keys). And these IDs can be bound into authentication at application layers.
Now, I am biased -- I wrote an RFC on this subject: RFC 5660.
ESP is a fine protocol. So is IKEv2. What's not fine is the architecture. If instead you bind extant packet flows to authenticated IDs, then the architecture works.
Note that this alternative architecture can work with NATs and firewalls because IP addresses aren't relevant in it. And this is very similar to all the non-IPsec VPNs out there, only those others don't use ESP. But there's a benefit to using ESP: it's easier to design hardware to offload ESP than to offload anything else.
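The idea of binding an authenticated ID to a flow for the flow's lifetime can be modelled roughly as follows. This is only a conceptual sketch, not the API from RFC 5660 or any real IPsec stack: `FlowLatch`, its methods, and the ID strings are all hypothetical, and the flow is identified by an opaque handle rather than by IP addresses.

```python
class FlowLatch:
    """Toy model: bind an authenticated peer ID to a flow until it closes."""

    def __init__(self):
        self._peer_by_flow = {}

    def latch(self, flow_id, authenticated_id):
        """Bind the ID to the flow when the flow is established."""
        if flow_id in self._peer_by_flow:
            raise ValueError("flow is already latched")
        self._peer_by_flow[flow_id] = authenticated_id

    def accept_packet(self, flow_id, authenticated_id):
        """Accept a packet only if it arrived under the latched ID."""
        return (flow_id in self._peer_by_flow
                and self._peer_by_flow[flow_id] == authenticated_id)

    def peer_id(self, flow_id):
        """What an application might read, e.g. via a socket option."""
        return self._peer_by_flow[flow_id]

    def close(self, flow_id):
        """Drop the binding when the flow (socket) is closed."""
        self._peer_by_flow.pop(flow_id, None)


latches = FlowLatch()
latches.latch("conn-1", "pubkey:ab12")
same_peer = latches.accept_packet("conn-1", "pubkey:ab12")
other_peer = latches.accept_packet("conn-1", "pubkey:ffff")
app_view = latches.peer_id("conn-1")
latches.close("conn-1")
after_close = latches.accept_packet("conn-1", "pubkey:ab12")
```

Nothing in the model depends on the peer's address, so a NAT rewriting addresses underneath would not change which ID the application sees.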
If that were true, nobody on a home cable router or DSL connection would be able to connect to their corporate office, which almost certainly uses NAT and RFC 1918 addresses. I have set this up dozens of times for both client-to-site and site-to-site IPSEC VPNs.
This will even work for site-to-site VPNs where there is overlapping RFC 1918 address space on both ends.
NAT-T negotiation happens during the first phase of IKE negotiation. See the section "IPsec NAT Transparency (NAT-T)" here:
To be fair, it also suffers from the same issue as all of those: when a NAT gateway decides to remap the port (because it hasn't seen traffic in 30 seconds, because it got rebooted, whatever) then traffic from the responder to the initiator is dropped. So the initiator has to constantly send keepalive messages so the responder has an up-to-date notion of where to send packets. The shorter the interval, the more wasteful traffic you send; the longer the interval, the more often you get frustrating delays.
I've personally run into an issue where a stateful UDP gateway (downstream from the NAT gateway) had a 30s timeout on UDP flows. The IPSec initiator also sent NAT-T keepalives at 30s intervals. Because of drift and jitter, every few hours the 30s periods would synchronize so that the gateway dropped the state immediately before the next keepalive passed through, resulting in a very annoying 30s pause in activity. Most traffic was going from responder to initiator as the VPN was effectively a reverse proxy onto a corporate LAN, so one result was SSH sessions becoming unresponsive for 30s.
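The timing problem described above can be sketched with a few lines of arithmetic. This is a simplified model under stated assumptions: the gateway is assumed to drop its state once the gap between packets reaches the idle timeout, times are in milliseconds, and the send times are invented to show a little jitter around a 30s interval.

```python
def expired_gaps(send_times_ms, idle_timeout_ms):
    """Return (previous, current) send times whose gap let the state expire."""
    gaps = []
    for prev, cur in zip(send_times_ms, send_times_ms[1:]):
        # Assumption: state is gone once the gap reaches the timeout.
        if cur - prev >= idle_timeout_ms:
            gaps.append((prev, cur))
    return gaps


# Keepalives nominally every 30 s, with small jitter either way.
jittery_30s = [0, 29_900, 59_950, 90_000, 119_900]
# Keepalives nominally every 20 s, with similar jitter.
jittery_20s = [0, 20_100, 39_900, 60_200]

lost_at_30s = expired_gaps(jittery_30s, idle_timeout_ms=30_000)
lost_at_20s = expired_gaps(jittery_20s, idle_timeout_ms=30_000)
print(lost_at_30s, lost_at_20s)
```

With the keepalive interval equal to the timeout, any keepalive that arrives slightly late finds the state already gone; an interval comfortably below the timeout avoids that at the cost of more keepalive traffic.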
To reiterate, there are no simple solutions to these sorts of problems.
Not at all. The box handling the IKE connection is not behind NAT; it is in front of NAT, and port 500 is on a public IP.
Practically speaking it is usually the same box doing NAT and IPSEC.
>"The flow can go outwards if one end is behind nat, but you can't have lots of ipsec-wanting clients behind same nat."
No, that statement is completely false. The technique used is called "overloading", Cisco calls this PAT. It assigns a unique source port for each UDP or TCP session on the device doing the IPSEC termination.
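As a rough illustration of the port overloading described above, the sketch below gives each inside session its own public source port, so two clients using UDP port 4500 (the port NAT-T uses for UDP-encapsulated ESP) towards the same endpoint stay distinguishable. `PatTable`, the starting port, and the addresses are hypothetical; this is not how any particular vendor implements PAT.

```python
import itertools


class PatTable:
    """Toy port-address-translation table for UDP or TCP sessions."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._out = {}  # (inside_ip, inside_port, remote_ip, remote_port) -> public port
        self._in = {}   # (public_port, remote_ip, remote_port) -> (inside_ip, inside_port)

    def translate_out(self, inside_ip, inside_port, remote_ip, remote_port):
        """Return the (public_ip, public_port) used for this session."""
        key = (inside_ip, inside_port, remote_ip, remote_port)
        if key not in self._out:
            port = next(self._ports)  # unique public port per session
            self._out[key] = port
            self._in[(port, remote_ip, remote_port)] = (inside_ip, inside_port)
        return self.public_ip, self._out[key]

    def translate_in(self, public_port, remote_ip, remote_port):
        """Return the inside (ip, port) for a reply, or None if unknown."""
        return self._in.get((public_port, remote_ip, remote_port))


pat = PatTable("192.0.2.1")
a = pat.translate_out("10.0.0.2", 4500, "203.0.113.10", 4500)
b = pat.translate_out("10.0.0.3", 4500, "203.0.113.10", 4500)
reply_for_b = pat.translate_in(b[1], "203.0.113.10", 4500)
print(a, b, reply_for_b)
```

Both clients reach the same endpoint, and replies are routed back by public port rather than by remote IP alone.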
Since you can't count on interoperability, especially not in the future, everybody uses tested server/client combinations. At that point the whole standardization process isn't that useful anymore, which somewhat negates any gain over using a much simpler protocol (whether TLS or something else).
What interoperability issues with keying? IKE, which is pretty much universal, is built on a standardized framework, ISAKMP, which specifies how the key exchange mechanics work. IKE is made up of 3 mature and standardized protocols: ISAKMP, OAKLEY (which defines the exchange modes) and SKEME (for public-key-based exchanges). In a past life I configured VPNs, often between sites where there were disparate vendors on both ends. Never once did I run into an issue where a VPN just wasn't possible because of an interoperability issue.