Many years ago when I was a junior dev at Amazon, there was a massive project internally to split up every internal system into regional versions with limited gateways allowing calls between regions. The reason? We had run out of internal IPv4 addresses.
The Principal PM in charge of the "regionalization" effort was asked in a Q&A "why didn't we just switch to IPv6?".
Her answer was something along the lines of "The number of internal networking devices we currently have that cannot support IPv6 is so large that to replace them we would have needed to buy nearly the entire world's yearly output of those devices, and then install them all."[0]
It's easy to presume malicious intent on the IPv4 front from Amazon, but with so many AWS systems being on the scale they are at, I find it easy to believe that replacing all of the old network hardware may just be a project too large to do on a short timescale.
[0] - At least, that's my memory of it. I'm sure that's not an entirely accurate quotation.
I’ve got a slight suspicion you were given some bullshit or at least a creative treatment of facts e.g. everything had IPv6 support but FUD-filled network engineers didn’t want to turn it on.
Most network devices I’ve encountered were dual-stack way before anyone I knew seemed to care about actually using IPv6 — I always assumed it was added for US government/military requirements.
From memory, the regionalization project ran from approx 2014 to 2015 or 2016.
There were also other reasons given, like the amount of internal software that used e.g. IPv4 addresses. Also, AWS likes to have 'lots of small things' instead of one big thing (regions, AZs, cells, two pizza teams, no (official) monorepo) so regionalization was part of that.
Another big reason for regionalization, other than IPv4 exhaustion, was that AWS promises customers that AWS regions are completely separate, but with one big giant network, it turns out there were all sorts of services making calls between regions that nobody had realized. I have a couple of funny examples, but that might make me too identifiable :)
My favorite region isolation oversight was when someone realized that the perl cron job that iterated over every border router globally and applied ACL updates 2-3x per day didn't pay attention to isolation at all, and could easily have just started blackholing the entire network one device at a time if someone configured a bad rule.
The mitigation was to sort routers by hostname which began with the regional airport codes (iad, pdx, etc.), and pause for 15 minutes each time the first three letters changed to give folks on-call time to react.
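For anyone curious what that mitigation amounts to, here is a minimal sketch in Python. It is purely illustrative: the original was a Perl cron job whose internals I don't know, and the hostnames, the apply_acl callback and the injectable sleep hook are all made up for the example.

```python
import time
from typing import Callable, Iterable, List


def staged_rollout(
    hostnames: Iterable[str],
    apply_acl: Callable[[str], None],
    pause_seconds: int = 15 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Apply an ACL update router by router, sorted by hostname.

    Assumes each hostname starts with a three-letter region code
    (e.g. "iad", "pdx"). Whenever that prefix changes, pause so
    on-call has a chance to notice a bad rule before it spreads.
    """
    done: List[str] = []
    previous_region = None
    for host in sorted(hostnames):
        region = host[:3]
        if previous_region is not None and region != previous_region:
            sleep(pause_seconds)  # the reaction window between regions
        apply_acl(host)
        done.append(host)
        previous_region = region
    return done
```

Note that the pause only bounds how fast a bad rule crosses a region boundary; every router inside the first region still gets it before anyone is given time to react.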
Oh wonderful. 15 minutes to get the page, put down my beer, get on my computer, sign in to everything, get 2-factored 3 times AND figure out exactly what’s happening and fix it.
This really would not have been true for vendor network gear of the sort AWS had been buying for years by 2014. It's possible that their own switches or the weird fabric they have internally wouldn't have worked with v6, or there were Annapurna NIC ASIC issues, but their primary vendors all would have been fine.
I'm not saying there aren't v6 issues (for some vendors, resource exhaustion might have come into play) or bugs, but there's no way it's that massive a problem. There are huge and complex all v6 networks all over the planet that have more stringent requirements (by law) than AWS DCs.
Facebook started its transition to make everything* internally IPv6 slightly before then.
It was indeed a lot of work. But worth it.
* When I was there we still had a handful of weird things that couldn't be made IPv6. If you needed to access such things you could get a dual-stack dev server.
ssh’ing through bastions was such a pain! We used the JMX GUI to review some AMP details from time to time, and port forwarding through the bastions was frowned upon, but our workflow was broken, what were we to do?
IIRC, early on in that project the gateways would get overwhelmed by the volume of traffic they were handling between various VPCs and had to be rolled back several times.
Of all the transitions I dealt with at Amazon, snowfort may have been my least favorite (though the ACL/role migration was pretty frustrating as well).
Sure, everything supports IPv6 -- until you turn it on and rediscover the tickets that have been sitting at the bottom of the JIRA for the last decade.
As a matter of fact, Ron Broersma, who is affiliated with Space and Naval Warfare Systems Command (SPAWAR), has a list of equipment that should be fully IPv6-only compliant, including various management interfaces and more. The US Navy supposedly tests this in house in an IPv6-only network. 4 years later I imagine the situation only got better https://www.youtube.com/watch?v=9kQje5gSWw8
Also, AWS now have the majority of NICs and switches built in-house I imagine. The underlay network could be IPv6 or totally custom for what we know (but probably is IPv4).
Cool! I'm glad the military is pushing the internet forward, I guess some things never change :)
As for AWS, I tend to agree with the sibling post and your supposition about IPv4. Everything out of the Amazon organization is aggressively, err, "minimal."
I believe the issue wasn't of IPv6 support generally, but of issues with TCAM space and the increase in routing table size moving from v4 to v6. Overflowing TCAM would cause routing to hit the CPU which would immediately lead to outages.
Tables were relatively large internally because AWS was all in on clos networks at that point. And the devices used to build those clos networks were running Broadcom ASICs, not Cisco or other likely vendors.
Right, if you worked at Amazon and didn't have incentive, then, you didn't do it. It was part of your job to not do things which you were not incentivized to do.
If Amazon is your customer, you fix the bugs; if you're Amazon using your in-house kit, you fix your own bugs whenever you want to. There are plenty of real reasons not to do IPv6, but they are virtually all politics and possibly operational ("we'd have to train our people, and we don't spend money on that"). The idea it was a vendor issue is a BS trope that's been around for at least a decade if not 2.
I remember the regionalisation, that was "fun" to be on the sidelines for (I was in a newer service that was regionalised from the get-go). I don't remember who the PM was for that one, but I remember that being when I truly came to respect the value that a TPM can add.
You're right about the cost and need to replace network equipment being one of the strong reasons why they didn't. Amazon used its own in-house designed and built network gear for a variety of reasons (IIRC there's a re:invent talk about it), which I'm sure is probably still the case.
Every single one of those machines had fixed memory capacity and would need to be replaced to bump up the memory sufficiently large enough to handle IPv6 routing table needs etc. What they had wouldn't even be enough if they'd have chosen to go IPv6 Only (which you couldn't get through except via dual stack IPv4/IPv6 anyway).
Were they also by chance considered accelerators for encrypted traffic?
I'm not privy to details, but I recall once when a mandate was issued to a Java platform to remove an outdated encryption protocol (mandated by Amazon Infosec). The change was made and rolled out with little fanfare.
A few weeks later, a large outage of Amazon Video (which used said platform) occurred on a Friday evening. Root cause? The network hardware accelerators were only setup to use that outdated protocol, which in turn meant that encryption was happening in software instead. Under load, the video hosting eventually caved.
Might be specific to the hardware used for Amazon retail, but it reinforces the point of their home grown (and now aging) stack.
Maybe not the same story, but there was a sidecar service for encrypting traffic and doing access control and other things in a way that was transparent to the app (like Envoy, but without the mesh and much earlier). The original version was written by (maybe) a single engineer in Erlang. Version two was given to another team and rewritten in Java because. They had never tested at scale and every team I know who went to production with it fell over. There was some company wide deadline, but it was unusable at that point, and the teams I was working with were gun shy to try it again since it was obvious that the owning team had no idea what the performance characteristics or system requirements were for it.
I think I switched teams before that was resolved and moved to some greenfield work where we didn’t have to worry about scale for a while, but I do believe they eventually figured it out.
I believe the PM was Laura Grit, who was actually a TPM I believe. Laura is a Distinguished Engineer now. She seems to constantly do massive scale projects. IPv4 being a smaller one now. Sadly I can't share some of the big projects she's doing now. I've gotten some sage advice from her on a few occasions that she had time and appreciate it.
> replacing all of the old network hardware may just be a project too large to do on a short timescale.
If that is the case, then Amazon should hold off on charging for IPv4 on a short timescale until they have replaced all the old hardware and can support IPv6 internally everywhere.
No one is ignoring it, and the US Government has done everyone another favour on this score. Years ago in the late Bush / early Obama administration, NIST required that all federal government agencies have IPv6 at the border. Federal government money is not to be sniffed at, and that had the effect of forcing a number of vendors to add IPv6 support.
A few years after that, it became that the federal agencies needed to have dual-stack IPv4/IPv6.
About 18 months ago, the requirement came that federal agencies are required to be IPv6 Only, dropping the dual stack. IIRC they have until 2025 to do that. This has the neat effect of forcing all vendors to make IPv6 a first class citizen. The extra little fun from this is that it applies to the military JWCC contract that all the major clouds have been trying to land. The timescales of JWCC meant that initial offerings are pretty bare, but that won't be allowed to last.
I work at a federal entity tied to DoE and that's the biggest workstream cut out for us. 90% of our environment is either dual stacked or IPv6 native. We would love to kick IPv4 out from under us and go full IPv6. Problem is that the vendors, who are largely private, don't have the same mandate, so there are varying degrees of "we support IPv6", which makes planning a bit more difficult (especially at the discovery stage).
I can believe that, but also, places like google and facebook saw the problem of having >1million devices and the lack of IP addresses and moved to ipv6.
There is no reason any company of any size should run out of IPv4 addresses internally, IF they are doing proper IP management. If I were to wager a guess I'd say there was a lot of waste going on, issuing /24s or larger to teams when all they need are /29s etc. It adds up over time. Once they exhaust private IP space they can always buy more at auction. They are Amazon after all, there's no shortage of money. This is just mismanagement of resources.
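To put rough numbers on the /24 vs /29 point, here is a small back-of-the-envelope sketch using Python's standard ipaddress module. The function names are mine, and it says nothing about how Amazon actually carved up its address space; it only shows the arithmetic.

```python
import ipaddress

# The three RFC 1918 private IPv4 ranges.
RFC1918 = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]


def total_private_addresses() -> int:
    """Total number of addresses across all RFC 1918 ranges."""
    return sum(ipaddress.ip_network(n).num_addresses for n in RFC1918)


def subnets_in(parent: str, new_prefix: int) -> int:
    """How many subnets of length new_prefix fit inside parent."""
    net = ipaddress.ip_network(parent)
    return 2 ** (new_prefix - net.prefixlen)


def usable_hosts(prefix: int) -> int:
    """Usable hosts in a conventional IPv4 subnet (minus network and broadcast)."""
    return 2 ** (32 - prefix) - 2
```

With these, 10.0.0.0/8 holds 65,536 /24s or 2,097,152 /29s, and a /24 has 254 usable hosts against 6 for a /29. All of RFC 1918 together is about 17.9 million addresses, so handing a /24 to every team that needs a handful of hosts burns through it roughly 32 times faster than /29s would.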
My one issue with this is if it’s such a large lift, why burn the effort to just kick the can down the road? IPv6 has to happen at some point (and for AWS that point is sooner than most).
The better reason is the regionalization was probably a way to decrease blast radius in case of a service failure.
Also, AWS definitely did not regionalize all their services in 2016. Not IAM, and certainly not DNS/Rte53 (part of the reason why they had their massive failure in US East 1 2-3 years ago).
I upgraded a P2P networking library recently to add support for IPv6. That was a pure software solution and it required a lot of work. When you have to upgrade hardware as well, I can imagine it would present a massive challenge (especially logistically). You'd have to upgrade ALL the hardware before you even start thinking about the software side of the equation.
Even cheap consumer hardware supports ipv6. There are significant financial incentives to continue the capitalism of ipv4 addresses. Like NFT's - an artificially limited capital. To create more addresses means more competition, loss of capital. Therefore they will spend billions on continually reworking internal IPV4 than going for the proper solution.
I worked in a company where we had network equipment all over the world.
Often IPv6 and IPv4 paths were entirely different and latency on IPv6 was much bigger, so we had to measure latency between nodes on both. Also, sometimes IPv4 was symmetrical, but IPv6 wasn't. As a result, we had to buy tons of IPv4 addresses.
Our control plane was on IPv6, but data-plane had to be on both.
It seems obviously against AWS incentives to offer working v6 - all their influencing tools ("well architected" criteria, certificates) strongly herd you towards building mazes of ambiguously addressed 10.x RFC1918 networks, and not internet style architectures with end-to-end addressing.
In the world of their recommendations, even the concept of a "public ip address" is a red flag, and AWS even recommends (for an added cost of course) tooling to flag and "mitigate" them. These provide a strong lock-in effect when customers spend effort to build the complex infrastructure for them in the name of security, even though in reality they hurt security through unnecessary complexity, addressing ambiguity, etc.
I've lost track of all of the "Private Endpoints", "Private Links", "Service Endpoints", "Private Resolvers" and "Virtual WAN" products they've introduced... all to make IPv4 work at scale.
Literally none of those products would be required if they had just made IPv6 work properly.
Instead, they NAT IPv6, so you can't even use it to avoid the NAT forced upon you by IPv4. They also release new products -- in 2023 -- that don't support IPv6 and likely never will.
Think about how insane it would be if this was the IPX -> IPv4 transition. Imagine if this page said "IPv4 limitations: All VMs must include at least one IPX address, etc..."
Sounds nuts, right? I was migrating customers to IPv4 from IPX in 1999, and IPv6 support materialised in about the 2001-2003 timeframe. It's been decades, but it still feels like 1999 and migrating Novell NetWare where we had to have IPX+IPv4 because "not everything supports IPv4 yet".
[1] This is definitely not all of the limitations. Most of their PaaS products don't support IPv6. Hence, any IaaS+PaaS solutions must use (mostly) IPv4.
A lot of IT folks are still fearful of IPv6. I've been on calls where people disable IPv6 as a matter of "best practice." It's sad. People will gladly learn the latest flavor of the month web framework but won't take time to gain experience with a fundamental protocol.
I probably fall into this bucket but I’ve seen disabling it fix very odd issues entirely too many times to write this off as just “sad”.
Learning a new framework doesn’t (often) break things for the end user in a way I can’t diagnose/reproduce. IPv6? Totally different ball of wax.
Also my ISP, a fiber gigabit provider and the best available in my region, doesn’t support IPv6. Until they, and others like them, get on board fully I don’t see customers having a snowball’s chance in hell of working cleanly with IPv6. I can only imagine how long it will take the shitty ISPs to fully support it.
Disclaimer: I have never worked for Amazon but I can add some input from the assorted medium to large companies I have been employed with. I won't mention any names. This is for someone I know reads my comments wink wink
I am not justifying it, just adding some of the bits I experienced. There are many security devices that do not have the same capabilities on IPv6 as IPv4 yet. Some enterprise IoT devices only support IPv4. Adding to this some network engineers don't want to step outside of their comfort zone and tooling/scripts to generate configurations automagically do not yet support IPv6. As the company grows they hire less Sr. Network Engineers and some of the new people depend on but do not understand the automation. e.g. someone wrote some API and they retired or changed companies. Some tooling may remain stagnant for some time. It's also a heavy lift to retrofit some enterprise environments for smaller changes so people fear the outages they will induce implementing IPv6. In some companies it is a major change just to migrate customers to a new load balancer endpoint. There may also be hundreds of undocumented things due to employee churn and lack of change control running that when broken will cause extended outages. And then there is internal politics and finger pointing...
Again, not justifying it, rather I think there are too many moving bits and complexity that people have added over the decades and they are paralyzed by fear and risking the loss of their paycheck. And then there is the embarrassment that comes from having to acknowledge that nobody knows the current state of an environment and that embarrassment can go all the way up the organizational chain.
That is based on my experience of being brought into companies with the specific task of, "Hey, make this simpler, reduce outages." It's rarely strictly a technical challenge but rather having to navigate politics, personality types and individual sub-org leaders that have had independent control of their environment for a long time. The more I think about it this could be a topic in and of itself: how companies induce self inflicted bloat as they grow.
The biggest issue is that IPv6 is a wide open wild west for privacy: there is no privacy on IPv6. Every device's IP is literally public, on the public Internet, 24/7. All so called privacy extensions or improvements do not change the lack of privacy of IPv6 and one more thing, the address structure sucks.
This is wrong. Every IPv6 interface has a link-local address (which is not routable outside the LAN) as well as a global address. Global addresses are just addresses that come from a block allocated by the IANA. It has no bearing on whether the interface is reachable from the public internet. Just like with IPv4 networks, a stateful firewall will prevent unsolicited inbound connections.
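The scope distinction is easy to poke at with Python's standard ipaddress module. A small sketch follows; the describe helper and the sample addresses are just for illustration, and the classification tells you only which block an address comes from, not whether a firewall lets anything reach it.

```python
import ipaddress

# Unique local addresses (RFC 4193) live in fc00::/7; fd00::/8 is the
# half that is actually used in practice.
ULA = ipaddress.ip_network("fc00::/7")


def describe(addr: str) -> str:
    """Roughly classify an address by the block it was allocated from."""
    ip = ipaddress.ip_address(addr)
    if ip.is_link_local:
        return "link-local"      # fe80::/10 or 169.254.0.0/16, never routed
    if ip.version == 6 and ip in ULA:
        return "unique-local"    # the IPv6 analogue of RFC 1918 space
    if ip.is_private:
        return "private"         # e.g. 10.0.0.0/8
    if ip.is_global:
        return "global"          # globally allocated, reachability not implied
    return "other"
```

For example, describe("fe80::1") gives "link-local", describe("fd12:3456:789a::1") gives "unique-local", and describe("10.1.2.3") gives "private".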
Why is it wrong? You don't seem to be engaging the parent's point at all, which is that IPv6 addresses are a more specific identifier than current IPv4 addresses. The existence of link-local addressing has no bearing on that argument, because those addresses are non-routable by definition. Nor does stateful firewalling prevent your unique device address from being broadcast over the Internet, you'd need 6-to-6 NAT to achieve that.
Every IPv4 device has a loopback (well, most do). That's not what they are on about.
Having NAT and a firewall gives you a better illusion of privacy. Sure you can track devices from the outside world, but it's pretty hard.
If you have v6 configured in a certain way, then your IP address is basically a UUID for your machine. Plus you can't really just stop ICMP anymore so you can trivially ping it (caveats apply)
FWIW this does not have to be true for companies that do not wish to expose internal nodes. I'm not even talking about the privacy extensions. I realize that people beat the drums that one must not NAT IPv6 but it can absolutely be a NAT just like IPv4. I would actually expect in most companies that they don't even add IPv6 inside their datacenters, rather they just put a block of IPv6 addresses on some load balancers at the edge and then point their edge devices L4 and L7 mapping to IPv4 nodes in their datacenter. The same could apply to VPN/WAN configurations in some cases much like how mobile networks are configured nowadays. There would need to be some of this regardless due to ACL's required between companies that don't use network VPN's for restricted end-points. e.g. PCI to PCI environments.
In the early days of IPv4 many big companies did not NAT IPv4. I was at a company that did this. Our workstations all had routable public IPv4 addresses. There are still some companies that do this. One of these was a company that had 20 managers and VP's on a call when I just wanted to give their network engineer a CIDR block and a pre-shared secret for a network VPN. I suspect they will be using public IPv4 addresses internally forever. And I doubt they will ever sell their /8's unless people comment on their Vogon poetry.
Amazon is probably one of the exceptions as they have so many geographically disperse configurations it would be harder to continue using RFC1918 address space. Not impossible, just difficult. I can think of a few other big companies that do manufacturing that are spread out around the world that would probably run into walls with RFC1918 at some point. I've seen some of them take over public address space for internal routing which then breaks access to some things on the internet thus requiring double/triple NATs. 1/8 assorted, 25/8 MoD, 26/8 DISA are a few I've seen.
> In the early days of IPv4 many big companies did not NAT IPv4. I was at a company that did this. Our workstations all had routable public IPv4 addresses.
A lot of big universities did this and even still do this to a large degree. They got huge IPv4 allocations early and there was no scarcity.
All of the early Internet companies I worked at were like that, up until roughly 2000: public IPs direct on the desktop. My 90's home network was also like that. I had a /24 block from the old class C "swamp" space. I still have it, actually. It's legacy space, no ARIN fees.
In my current company (related to academic institution that introduced internet to my country) every office workstation has FOUR public (but firewalled of course) IPv4 addresses. And every user has an unique VPN IPv4 address on top of that.
I remember the days of non-NAT IPv4, though I'd forgotten until you mentioned it. I'd be OK with NAT IPv6, though the addresses are still ugly and difficult to reason about.
CIDR IP addresses are based on binary number prefixes. If you think it's easier to reason about them in decimal than hex, then you probably don't really understand binary.
IP-based "privacy" is an illusion. With IPv4, your public IP (NAT router IP address) may not change for months, years, and possibly not until you change your router/MAC address. With IPv6 privacy extensions, your address changes regularly. This seems like an improvement.
> With IPv6 privacy extensions, your address changes regularly. This seems like an improvement.
Eh… If I was a company that wanted to use IP addresses to fingerprint users, IPv4 vs IPv6+privacy extensions both seem identical to me. Multiple requests from the same IPv4 address mean “someone, perhaps more than one person, from the same household/wifi”. Whereas multiple IPv6+privacy requests from the same /64 prefix means the same thing.
ie. You just consider the first 64 bits of the IP and can assume the same amount of information you already would assume from the IPv4 address. Just ignore the trailing 64 bits because it’s expected that they’ll be randomized/shuffled even from the same client.
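In code, that "ignore the trailing 64 bits" idea is a one-liner with Python's standard ipaddress module. This is a sketch of the idea only, with a function name I made up; I have no knowledge of what any particular tracker actually does.

```python
import ipaddress


def tracking_key(addr: str) -> str:
    """Collapse an address to the granularity a tracker would likely use.

    IPv4: the whole address (typically the NAT router).
    IPv6: the /64 prefix, dropping the interface identifier that
    privacy extensions randomize.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return str(ip)
    # strict=False lets us pass a host address and get its enclosing network
    return str(ipaddress.ip_network(f"{ip}/64", strict=False))
```

Two addresses from the same LAN, say 2001:db8:1:2:aaaa:bbbb:cccc:dddd and 2001:db8:1:2::1, both collapse to 2001:db8:1:2::/64, so rotating the low 64 bits changes nothing from the tracker's point of view.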
The IPv4 at my router yes. That's where tracking ends. IPv6 privacy is an illusion, try the test I described, remove IPv6 from your router at home, wait a few hours or few days, the family will complain search results are odd or messed up and that's only the beginning of it.
I don't know how companies are doing it, but they are able to track you over IPv6 whether your address changes daily or not.
They probably just track the IPv6 /64. With prefix delegation, the /64 would rarely change, unless your provider delegated a new block. This is similar to your IPv4 router changing its address w/DHCP: it happens, but is relatively rare.
It's definitely "best practice" to turn off features you don't need. Who knows, maybe in five years someone will find a bad bug in the implementation of the IPv6 stack, then you'll be glad you decreased your attack surface.
Not saying this is what you should do, just a common rationalization.
It's also "best practice" to learn a new, foundational technology (like IPv6) sooner than later, perhaps less than 20 years after it was first available.
I know IPv6, just don't feel the need to use it. I prefer to have one firewall and public network interface to worry about. I can't disable IPv4 yet, so the logical solution is to disable IPv6.
Did I say they had to deploy it to production? If they deployed it in dev and test environments even a few years ago, they might have some experience deploying it in prod today.
It's individuals responding to the incentives before them. If something goes wrong because they're using IPv6, it's their fault. If you never upgrade anything until you're forced to, then you can never break anything by upgrading, and you're never seen as a "stuff breaker".
This is it. Having broken stuff by upgrading myself, I can say it does not make you popular. This is especially the case if what the upgrade did was tighten up some bug that it turns out hid a bug in your team's code...
I have a reason: we do per IP rate limiting. It's easy enough for IPv4 when the number of IPs is necessarily not too big to fit in a small redis for example, but for IPv6 everyone has at least a /64.
I'm curious how people do it btw, if you have tips to share, I'm all ears. Do you simply rate limit IP ranges? Even limiting per /64, it's still potentially quite a lot of /64s to track.
Given that the only routable IPv6 address space is in the 2002::/16 range (is 2003:: in use yet?), and the standing recommendation for ISP CPE endpoints is to allocate a /48 per customer (a customer can't do any local subnetting if only allocated a /64), the effective address space for rate-limiting is the exact same size as the current IPv4 address space: you only need to track bits 16-47.
It's possible that cloud providers assign smaller ranges to their customers, so you may need to allocate more bits for granularity in that case; on the other hand, one might naively assume that cloud providers are more responsive to abuse reports than ISP's.
We put limits high enough that it's far enough for any expected usage, including a bunch of users on a single IP. If we see rate limiting happening in practice and it doesn't seem to be an attack, we revisit.
Well it sounds like you'd do fine tracking the IPv6 blocks that are currently very active, without needing any significant amount of resources.
If you go the extra mile and simultaneously track /64, /56, and /48 with moderately increasing thresholds, you'll probably end up causing less collateral damage when you block someone than with IPv4.
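A minimal in-memory sketch of that multi-level idea, in Python. The thresholds are made up, the class name is mine, and a real deployment would want a shared store (redis or similar) with expiring windows rather than a process-local Counter.

```python
import ipaddress
from collections import Counter

# Hypothetical per-window limits, loosening as the prefix gets wider.
DEFAULT_LIMITS = {64: 100, 56: 400, 48: 1600}


class PrefixRateLimiter:
    """Count IPv6 requests per /64, /56 and /48 at the same time."""

    def __init__(self, limits=None):
        self.limits = dict(limits or DEFAULT_LIMITS)
        self.counts = Counter()

    def _networks(self, addr: str):
        ip = ipaddress.IPv6Address(addr)
        return [
            (plen, ipaddress.IPv6Network(f"{ip}/{plen}", strict=False))
            for plen in self.limits
        ]

    def allow(self, addr: str) -> bool:
        nets = self._networks(addr)
        # Reject if any enclosing prefix has already hit its limit.
        if any(self.counts[net] >= self.limits[plen] for plen, net in nets):
            return False
        for _, net in nets:
            self.counts[net] += 1
        return True

    def reset(self) -> None:
        """Call at the start of each rate-limit window."""
        self.counts.clear()
```

Only prefixes that have actually sent traffic in the current window take up memory, so the size of the IPv6 address space itself is not the cost driver; the number of active prefixes is.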
People see a 10.x and instantly know it can't be reached from the public internet. IPv6 is much harder. For internal-only stuff there is the fd00::/8 block, which AWS actually does use, but there is no equivalent range for outgoing-only connections.
Because there's a lot of shit that still doesn't work well with IPv6. Logging is one good example - a lot of software that uses its database for event logs has the database column for remote_ip defined as VARCHAR(15), you can guess the rest of what happens when deploying that with IPv6 enabled.
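The arithmetic behind the VARCHAR(15) problem, as a quick Python sketch (the constant and helper names are mine): 15 characters is exactly the longest dotted-quad IPv4 address, while a fully written out IPv6 address is 39 characters.

```python
import ipaddress

IPV4_COLUMN_WIDTH = 15  # VARCHAR(15): just enough for "255.255.255.255"

LONGEST_V4 = "255.255.255.255"
# An address with no zero groups cannot be compressed, so str() yields
# the full 8 groups of 4 hex digits plus 7 colons.
LONGEST_V6 = str(ipaddress.IPv6Address("ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"))


def fits(addr: str, width: int = IPV4_COLUMN_WIDTH) -> bool:
    """Would this textual address fit in a column of the given width?"""
    return len(addr) <= width
```

The nasty part is that short IPv6 addresses such as 2001:db8::1 do fit in 15 characters, so the failure (or silent truncation, depending on the database) only shows up for some clients. A width of 45 also covers the IPv4-mapped textual form, which is why C's INET6_ADDRSTRLEN is 46 including the terminating NUL.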
Comcast at some point stopped letting you administrate your own router. You can log in to it, but port forwarding is no longer available through the administration interface.
If you want port forwarding, they recommend that you do... something. It's not clear what; what you can find on the internet is mostly just people complaining that they insisted to customer support that they needed port forwarding, customer support said they'd do that, and port forwarding still doesn't work.
But, intriguingly, it turns out that a Comcast router will also assign every device on your network at least one public ipv6 address. They also firewall all incoming ipv6 traffic, but unlike the situation with port forwarding, you can disable that firewall on the router admin page. (You can't put up your own firewall.)
> port forwarding is no longer available through the administration interface.
Sounds like the CGNAT experience, possibly blanket applied even if you have a globally routable public IPv4 in order to have consistent behaviour and reduce network management / support case complexity.
All the modems I see support bridge mode. When enabled the Comcast device doesn’t do any of the routing at all. Your own device gets to do that instead.
Yeah, even at my father's place (I've had different providers than Comcast for personal reasons, but FWIW same story) I've never had a problem just plugging in a router of my choice and using that instead. Makes it a lot easier to handle quirky setups between modem swaps anyway.
Wait a second. A private link means that a service endpoint is public so a part of your traffic goes through the Internet, which is supposed to be insecure (even if you have encryption in transit?), so you don't want to do that and they will happily route your traffic internally so it is not exposed to the bad Internet - for a fee. All these VPC Endpoints etc. cost money and you are charged by the hour. I don't think IPv6 would change anything here, they would find a way to charge customers for "security" anyway.
That's what I thought at first, but that's not quite right.
Service Endpoint: Allows a PaaS service (that itself uses public addresses) to have firewall rules for overlapping private vnet addresses. E.g.: You can have two VMs both on 10.0.0.123 addresses (in separate VNets) using individual "Allow" rules to the target service. Essentially Azure tags the traffic at the VXLAN level with the source network ID on top of the IP address, making it a "fat IP address" that is unique within Azure and can be used in firewall rules.
Private Endpoint: Makes a PaaS service appear on a private network address range instead of the default public range. This allows your on-prem firewalls to isolate your specific PaaS instance from other customers -- otherwise the traffic gets "blended in" with everyone else in the same public service tag ranges. This also allows you to use your ExpressRoute fibre links to route traffic from on-prem to the public service.
In all scenarios, the traffic goes over Azure networks and/or Microsoft's private backbone. You have to go out of your way to route traffic "via the Internet". Remember: Network addresses are just numbers! Routing rules determine how they flow, and public addresses can be used on private networks.
Fundamentally, all this exists just to enable the ability to firewall things. With overlapping IPv4 addresses and small shared blocks of IPv4 addresses with NAT behind them, it would be impossible otherwise.
With IPv6, using firewalls would be much simpler because overlapping addresses aren't needed any more. Similarly, PaaS services could trivially allocate IPv6 addresses per customer instance, so that customers could apply selective firewall rules.
> In all scenarios, the traffic goes over Azure networks and/or Microsoft's private backbone. You have to go out of your way to route traffic "via the Internet". Remember: Network addresses are just numbers! Routing rules determine how they flow, and public addresses can be used on private networks.
My understanding is that if you don't have a private endpoint, your traffic to an Azure cloud service won't be routed out to the "big bad internet" per se, but it will be routed within the Azure AS as mere IP traffic.
If you have a private endpoint to an Azure service in your virtual network, that means Azure has provisioned you a virtual NIC with some private IP address, and presumably alters DNS resolution within your network for that Azure service to resolve to the IP address of the NIC. The NIC provides (presumably encrypted) link layer transport out to the Azure service.
Compliance for some customers may dictate that there aren't any routes out to the public IP address space from within a network. If you still need access to cloud services, private endpoints are a necessity.
All that to say, I think Private Endpoints provide more than just a means of firewalling traffic/changing the IP address associated with a service; the actual transport from client->cloud service is fundamentally different.
You’re exactly right that the main point of private endpoints is to allow customers who aren’t allowed to open firewalls to public internet to still connect to their public services like Azure Storage, Key Vault, or SQL
> In all scenarios, the traffic goes over Azure networks and/or Microsoft's private backbone.
That's the whole selling point of private links. How could you possibly have missed that? That's exactly why companies onboard onto the service. They say exactly this on the marketing brochure. That's why customers line up to pay for it: to get their traffic flow only through private networks instead of through the wild.
What kind of confusion reigns in your mind to come to the conclusion this was some obscure conspiratorial gotcha?
It boggles the mind how you felt the need to come up with absurd conspiracies involving IPv4 to arrive at a claim that the marketing pamphlets show front and center as their whole reason for existence: keep traffic off the internet, and instead pay extra to go through private pipes they own.
> That's why customers line up to pay for it: to get their traffic flow only through private networks instead of through the wild.
I believe their point is that it doesn't flow through "the wild" in any case, it is probably routing within the Azure AS (aka Microsoft-controlled networks). However, as you say, people line up to pay for it, likely for reasons having to do with their architecture/security model/compliance requirements.
What a shit show. Seriously I can never rant enough about how awful Azure networking and their bullshit concepts are. Like they don’t know how to do networking, so they’re gonna introduce a bunch of shit and pretend that nonsense makes perfect sense because of their own ineptitude.
Some of the complexity seems to be the security theater of the lowest common denominator of customer demands. Companies invest too much money into incredibly expensive Palo Alto firewalls then demand Azure route through those too so that their cloud operations are as theatrically "secure" as their main network because look at all those amazing sunk costs invested in it.
> I've lost track of all of the "Private Endpoints", "Private Links", "Service Endpoints", "Private Resolvers" and "Virtual WAN" products they've introduced... all to make IPv4 work at scale.
I'm not sure what you've been reading, but the concept of a private link has absolutely nothing to do with IPv4 vs IPv6. In fact, practically all your remarks don't involve the issue at all.
The most charitable interpretation of your comment is that you're making the mistake of treating any application involving a virtual network as something caused by or involving IPv4.
I think your comment shows a high dose of ignorance and complete lack of research on the topic. The whole point of private link is to not route traffic over the internet, and instead flow traffic between private networks through private pipes.
One of the primary usecases and design requirements for this service is regulatory compliance. They say right on the tin that the service is designed to send traffic over private networks, including AWS's own global network. The whole point of private link is to ship data through the pipes you own, instead of routing it through the wild. I don't know how you could have missed that.
More importantly, you really need to want to use private link connections. This is a value-added service. You have to go out of your way to keep your traffic off the internet by onboarding both ends of your services to private link.
Not only are your conspiratorial hypotheses completely off base, even your baseless assumptions have absolutely no relation to which version of the IP protocol is in place.
I'm the first to join in on any good old fashioned AWS/big cloud provider bashing, but these should be grounded in reality.
If you think that either Azure or AWS loop terabits of customer traffic between two of their own services "out to the Internet" and back just because the IPv4 octets don't start with a "10", then you're the one who's missing the big picture.
Yep. Pick your poison. You're either nickel-and-dimed for a "NAT gateway" or overpaying for "VPC endpoints". Often, it's both. I preferred the EC2-classic days, honestly.
Private link is a clever way to implement real network segmentation
That is, when you have a customer in some network and a provider in another network, you had to implement full connectivity between the customer and the provider
With private link, you can remove all that connectivity, and instead expose the provider's service to the customer
The service, nothing more, so just one endpoint
This is really good from a security point of view, but also for managing your stuff (especially if there are multiple teams in the company):
because you now have a resource, you can easily list the services you expose to other people, and who your customers are
It's not just AWS. Microsoft, security auditors, penetration testers, cyber insurance companies, etc. also largely insist on not having publicly addressable endpoints.
I don't understand why, but until some large tech company starts pushing for end to end addressability as best practice, I have no choice but to follow the conventional wisdom to avoid throwing up red flags.
> Microsoft, security auditors, penetration testers, cyber insurance companies, etc. also largely insist on not having publicly addressable endpoints.
> I don't understand why […]
Excluding Microsoft, all the others like having it as a checkbox, because it makes it easier to confirm that "internal" hosts are actually (theoretically) internal, since RFC 1918 addresses aren't routed on the public Internet.
Of course most companies' firewall and NAT rules are probably all sorts of complicated once you get to a certain size (never mind stale open-rules which were never cleaned up), so a bunch of stuff is probably accidentally exposed. Also, most attacks are probably from compromised clients nowadays, so even internal hosts need to be locked down as the castle-and-moat security model isn't (as) valid.
But having "internal-only" hosts is low-hanging fruit on the security checklist.
I will resist the urge to be snarky at your expense and politely point out that exposing your LAN to public routing tables is madness, from all perspectives.
Is IPv6 Unique Local Addressing still a thing (or again)? Just because a machine has an IPv6 address does not mean it is automatically routable over the entire Internet.
>exposing your LAN to public routing tables is madness
And I don't understand why people think that.
You are exposing a /64 network. That's 2^64 addresses, no one can scan your LAN if that's what you fear, nor can anyone reach your hosts if you build a stateful firewall that denies incoming connections - you know, just like NAT. But minus the packet modifications.
Using global addresses is not, of course, "exposing your LAN to public routing tables", or any charitable interpretation thereof. Reachability != addressing.
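To put a rough number on the "no one can scan your LAN" claim, here is a back-of-the-envelope sketch. The probe rates are assumptions picked for illustration, and the calculation only covers a blind exhaustive sweep; an attacker who learns addresses from DNS, logs, or predictable low-numbered hosts doesn't need one.

```python
# Rough arithmetic: how long would an exhaustive sweep of a single /64 take?
ADDRESSES_IN_SLASH_64 = 2 ** 64
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_scan(probes_per_second: int) -> float:
    """Years needed to probe every address in one /64 exactly once."""
    return ADDRESSES_IN_SLASH_64 / probes_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Assumed probe rates: one million and one billion probes per second.
    for rate in (1_000_000, 1_000_000_000):
        print(f"{rate:>13,} probes/s -> {years_to_scan(rate):,.0f} years")
```

Even at a billion probes per second the sweep takes centuries, which is why blind scanning isn't the realistic threat here; the stateful firewall is still what actually blocks inbound connections.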
I work in Azure, but my experience is that customers want this - and for good reason. Customers want their own private network to prevent intrusions and exfiltrations, just on machines they don’t own. Or even better, put the nice fancy batteries included PaaS services in these networks too.
I see this at work, being within such a customer. People driving these mandates barely understand IPv4, let alone know what IPv6 is. They're software developers after all, not CCIEs.
There's nothing preventing you from having a private network using unique address space that's either blocked from accessing the internet via a firewall on a router or just plain not even routed. You could even use ULA networks with stateless prefix translation to avoid using GUA addressing for your private network.
The sad part is that IPv6 support is abysmal on every cloud so just migrating to it imposes serious limitations as addressed by the blog author.
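For anyone who hasn't used ULA: a prefix is just `fd` followed by a 40-bit Global ID, giving you a /48 to subnet. A minimal sketch using Python's standard `ipaddress` module follows. Note the assumption: RFC 4193 suggests deriving the Global ID from a hash of a timestamp and an interface identifier, while this sketch simply draws 40 random bits. The function names are made up for the example.

```python
import ipaddress
import secrets

# RFC 4193 unique local range; fd00::/8 is the locally assigned half.
ULA_RANGE = ipaddress.ip_network("fc00::/7")


def random_ula_prefix() -> ipaddress.IPv6Network:
    """Return a /48 of the form fdXX:XXXX:XXXX::/48 with a random Global ID."""
    global_id = secrets.randbits(40)
    # 0xfd occupies the top 8 bits, the Global ID the next 40, the rest is zero.
    prefix_int = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))


def is_ula(address: str) -> bool:
    """True if the address falls inside fc00::/7 (always False for IPv4)."""
    return ipaddress.ip_address(address) in ULA_RANGE


if __name__ == "__main__":
    prefix = random_ula_prefix()
    print("site prefix:", prefix)
    print("first /64:  ", next(prefix.subnets(new_prefix=64)))
```

Because the Global ID is random, two organizations that later merge or peer are very unlikely to have picked the same /48, which is the whole advantage over everyone choosing 10.0.0.0/8.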
> Customers want their own private network to prevent intrusions and exfiltrations, just on machines they don’t own.
If your (default) gateway from one network segment to another network segment only has one rule, default-deny, then it's not a problem. If you think that's not enough, then use IPv6 ULA (fd00::/8).
But why should the incompetence of some customers limit what all customers can do?
Well, I wouldn't put _any_ service on a public network, unless it is explicitly required. Firewall is all well and good, but security in depth is even better.
Private networking is good. IPv6 doesn't help here at all.
The easiest firewall in the world is one that is set up to deny all traffic from all sources. Which is how any decent firewall is configured by default anyway.
I'm not saying that running a private network doesn't provide genuine security value, only that it drastically complicates your networking architecture for very little security benefit. Organizations can decide whether that trade-off is worth it, for organizations with deep threat models like militaries and banks, it's probably worth it. For 99% of the private sector, it's folly.
"Private networking" as defined by assigning private-range IP addresses are only private as long as there is no route to your network, or as long as it's isolated on a dedicated vlan (even then, there could be some rogue machines).
In the first case, you need a firewall for IPv4 anyway. In the second case, that would also work with IPv6.
Disclaimer: I know nothing about Azure/AWS internals.
This phrasing is really problematic. Using internet addressing (vs ambiguous addresses) does not make your network "public". Just like using unique MAC addresses doesn't. Confusing global addressing with public reachability is exactly the rhetoric used by AWS, Azure etc to scare people into building mazes of ambiguously addressed 10.x networks.
Private address ranges don’t make a network private. A firewall does.
If I know the external, publicly addressable IP address of your router (e.g. 135.77.9.106), and no firewall whatsoever, there’s nothing at all preventing me from doing `ip route add 10.0.0.0/8 135.77.9.106`, and voila, I’d have a route to your “private” network.
Using private addresses vs globally unique offers no security benefit whatsoever.
> If I know the external, publicly addressable IP address of your router (e.g. 135.77.9.106), and no firewall whatsoever, there’s nothing at all preventing me from doing `ip route add 10.0.0.0/8 135.77.9.106`, and voila, I’d have a route to your “private” network.
This only works if you are on the same L2 segment as 135.77.9.106, or control and install this route on every router between you and it. Otherwise, 10/8 will get routed to the next hop for 135.77.9.106, i.e. your local gateway, which won't know anything about the intended 135.77.9.106 destination and will route it normally (which likely means dropping it).
It's true that firewall rules should be in place to prevent this attack from your direct neighbors, but it's not possible to perform it over multiple hops that you don't control.
It only takes one, but most likely all the routers in between your network and the remote private network already drop the Martian packets, and you don’t have an interface directly connected to the remote private network, so the route you have configured would not work.
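The "drop the Martian packets" step amounts to a destination check on the router's public-facing side. A simplified sketch in Python, assuming only the three RFC 1918 ranges count (real bogon filters cover more special-use ranges) and with a made-up function name:

```python
import ipaddress

# RFC 1918 private ranges only; production bogon lists are longer.
RFC1918 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def should_drop_on_public_link(dst: str) -> bool:
    """True if a packet to this destination should not cross a public link."""
    addr = ipaddress.ip_address(dst)
    return any(addr in net for net in RFC1918)


if __name__ == "__main__":
    for dst in ("10.1.2.3", "172.32.0.1", "135.77.9.106"):
        verdict = "drop" if should_drop_on_public_link(dst) else "forward"
        print(f"{dst:>15} -> {verdict}")
```

Any single router on the path applying a check like this is enough to stop a packet addressed to 10.x from reaching the remote network.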
(Though that WP page seems also to have self-coined the "private network" phrase and I don't think it's an established term in this meaning. The first and second references off the leading paragraph talk about "private internets" and "unique local addresses" respectively).
"Public network" can mean many things, but in context of IP addresses it usually means a network, that uses a globally addressable IP range.
Now, that doesn't mean that the network is globally accessible. It can be tightly firewalled.
Therefore job security of old school network administrators is the main factor against IPv6 coverage.
Hopefully one of the big cloud providers figures it is in their best interest to have a much bigger address space and make all this busywork sinecure obsolete.
I’m one typo away from accidentally allowing IPv6 access to every machine in my network with my pf config on my home router. (I know this because I’ve done it one time, and didn’t notice for about a week.)
There is no such typo i could make with my single shared public ipv4 address because it’s just one address. Saying “allow” by accident isn’t enough, I’d have to somehow accidentally configure the particular ingress port to NAT to a particular internal machine, and even then it would only affect that machine and no other.
(Full disclosure, i actually like IPv6 and am in full favor of everything moving to it. This is in spite of the above, but i at least recognize that the above is the case.)
I work with Fortune 50s in cloud, and they can barely manage ipv4. If you're in a digital native it's different, but in my experience most behemoths do not inspire confidence with how on top of their network infrastructure they are.
This is a bit like saying “customers can barely manage driving a stick shift with a manual choke — we shouldn’t let them drive automatics!”
IPv6 isn’t amazing, but it makes many of these problems simply disappear. Of course [0] networks should be isolated, but this should be achieved with a firewall that, by default, disallows connections between the public Internet and private networks. And that’s about it — every VM has a globally unique address, routing just works, one company (if permitted) can connect to another company’s endpoints, firewalls can be deployed where they make sense instead of being forced to exist exactly where inconsistently-addressed networks meet, etc.
The entire mess of designing and negotiating allocation of extremely limited IPv4 addresses for private systems simply disappears!
I'm not convinced these 2 are related. If AWS wanted, they could start promoting IPv6 as a panacea to all internal network problems (if they had it working, that is), and the competition would have a hard time catching up. My guess is the reason is more mundane: supporting IPv4 is just easier, especially if you take into account the number of their services.
While this is probably exactly the spin the AWS sales team had in mind, it conveniently glosses over all the IPv4-related products customers won’t need anymore if they use IPv6, and thus pay a lot less for their simplified network infrastructure.
AWS has no incentive to support IPv6 from a sales perspective.
In addition, IPv4 addresses are an asset that has monetary value; and there’s no reason AWS would want to drive down the value of the asset by helping customers migrate to IPv6.
Have you heard of the Jevons paradox https://en.wikipedia.org/wiki/Jevons_paradox? It is really insightful. If you make something more efficient/ easier to use, people will use more of it. That is exactly what I think about the switch from IPv4 -> IPv6. Networking is really complicated now and if you happen to make it just a bit easier and cheaper, people will use it and the related services more.
Stateful NAT is a real burden on bigger networks and is at least a chore on smaller networks. It at least doubles the complexity of managing a network especially when you have a DMZ that should be used from some "private" and some "public" endpoints.
Sure there is. AWS has a vested interest in the value of IPv4 going down as much as possible.
Owning IPv4 addresses is a requirement of AWS’s core business. Unless people stop using IPv4, then AWS cannot sell those addresses. There is no incentive for the addresses to increase in value.
Further, if people continue to use IPv4, then AWS has to continue to acquire even more IPv4, and AWS wants the price of those to go down so that acquiring them is cheaper (or wants people to stop using IPv4 so that they can stop spending money on them altogether).
> Unless people stop using IPv4, then AWS cannot sell those addresses. There is no incentive for the addresses to increase in value.
But if people stop using IPv4, the asset (billions of dollars by some accounts) becomes worthless... AWS are passing on the cost for public IPv4 addresses now, so there's even less incentive.
AWS' business model is not speculating on IPv4 addresses. But the fact that smaller providers can't get IPv4 allocations coincidentally works in big cloud providers' favor (and incumbent ISPs). The slower you deploy IPv6 the longer you defer that cost and the longer you enjoy your advantage in address space capacity.
> These provide a strong lock-in effect when customers spend effort to build the complex infrastructure for them in the name of security, even though in reality they hurt security through unnecessary complexity, addressing ambiguity, etc.
How do you hurt security by preventing external access to your internal services?
In case not: it's about doing it the wrong way (excess complexity and ambiguity -> hard to understand/analyze/monitor). Using global addressing is orthogonal to controlling access to your internal services - you can do it using firewalling or various other means. E.g. on AWS you get a default-deny firewall.
By making it necessary to use automated tooling and even obtain certifications just to set up everything required for private networking. The complexity of a properly configured service mesh in AWS is staggering, extremely hard for newbies to get right, and easy to fuck up big time down the road. That hurts security.
> just to set up everything required for private networking.
Frankly, no.
I'm tempted to agree only in one aspect, which is that you are apparently completely unfamiliar with the topic, and the degree of confusion you are showing in your comments suggests you would benefit from an introductory course on the topic, or at the very least a 5-minute read through the service's documentation.
Certification would only help because you would need to learn the basics to pass, and learning the basics would be enough to keep you from filling the gaps in your understanding with fabricated nonsense.
> The complexity of a properly configured service mesh in AWS is staggering, extremely hard for newbies to get right, and easy to fuck up big time down the road. That hurts security.
People being way over their head because they can't even grasp a FAQ will definitely hurt security, but the root cause of this failure mode is sheer incompetence.
As the saying goes, a poor craftsman blames his tools, and here you are with a tool-blaming fest.
Yes, because publicly addressable does not mean publicly accessible. Set the ACLs to deny by default.
In the v4 world, one can easily accidentally allow access by inadvertently sending traffic toward the wrong group of colliding addresses or otherwise messing up any of a number of things that ought not to be necessary in the first place.
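"Deny by default" with globally unique addresses can be sketched as an allow-list where anything unmatched is refused. This is a toy model in Python, not any cloud provider's security-group API; the `Rule` type and the 2001:db8::/32 documentation prefixes are assumptions for the example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rule:
    source: str  # CIDR that is allowed to connect
    port: int    # destination port the rule applies to


def is_allowed(rules: Iterable[Rule], src: str, port: int) -> bool:
    """Allow only traffic that matches an explicit rule; deny everything else."""
    addr = ipaddress.ip_address(src)
    for rule in rules:
        if rule.port == port and addr in ipaddress.ip_network(rule.source):
            return True
    return False  # default deny: no matching rule means no access


if __name__ == "__main__":
    rules = [Rule(source="2001:db8:1::/64", port=443)]
    print(is_allowed(rules, "2001:db8:1::10", 443))  # partner subnet, HTTPS
    print(is_allowed(rules, "2001:db8:2::10", 443))  # some other subnet
    print(is_allowed(rules, "2001:db8:1::10", 22))   # right subnet, wrong port
```

Since each source prefix is unique, a rule means the same thing everywhere it is evaluated; there is no second 10.0.0.0/16 somewhere else that it might accidentally match.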
Okay, in real life I need private addresses because I connect to things that are only available over IPv4. So there’s some negotiation to make sure that my private network does not have an addressing conflict with the other network, there are NATs in the way, and traceroute gives output that is every bit as bad as you would expect. The ACLs that everyone (arguably quite reasonably) sets up are fiddly because the clients don’t have well-defined address ranges. When people allocate /24 subsets out of IPv4 private space, the probability of collision is annoyingly high. Amateur hour indeed.
I would take globally unique but “private” IPv6 addresses, over private links, with private routes (dynamic or static), and ACLs that actually make sense any day. Heck, I would happily go IPv6 only!
Private addresses offer no security benefit whatsoever. If you have no firewall, nothing at all prevents me from doing `ip route add 10.0.0.0/8 your.routers.ip.here`
IP is silly and refers to next hops by IP address, which fundamentally makes very little sense, because IP routing actually works by sending packets toward either whatever is on the other end of a point-to-point link irrespective of its address or toward a certain destination on a certain link, where that destination is addressed by a link-specific address (generally a MAC address). In common usage, the sole purposes of a next hop IP address are to identify the link (implicitly, while configuring the route) and to tell the router what IP address to ask for via ARP / neighbor discovery so it can actually route there.
With that in mind you are (on Linux, anyway) very much prevented from this particular mucking around:
$ sudo ip route add 1.2.3.4/32 via 5.6.7.8
Error: Nexthop has invalid gateway.
Because it's not actually possible to route a packet via a host that isn't locally reachable.
You can try to send packets using various encapsulation schemes to try to convince an intermediate router to decapsulate the packet and forward it to an attacker-controlled address, and someone manages to pull this off every now and then. Actually getting the evil packets in question to traverse the public Internet can be challenging but is not necessarily impossible. So the actual point stands -- relying on a private IPv4 address range to be unreachable by the general public merely by virtue of being private and without using an ACL is a mistake.
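The "Nexthop has invalid gateway" rejection above boils down to one question: is the gateway inside a network this host is directly attached to? A simplified model in Python, which is not the kernel's actual logic; the function name and example networks are made up:

```python
from __future__ import annotations

import ipaddress


def gateway_is_on_link(gateway: str, connected_networks: list[str]) -> bool:
    """True if the gateway address sits inside a directly connected network."""
    gw = ipaddress.ip_address(gateway)
    return any(gw in ipaddress.ip_network(net) for net in connected_networks)


if __name__ == "__main__":
    # Assumed local setup: a single interface on 192.168.1.0/24.
    connected = ["192.168.1.0/24"]
    print(gateway_is_on_link("192.168.1.1", connected))   # local router
    print(gateway_is_on_link("135.77.9.106", connected))  # remote router
```

A remote router fails this check, so it cannot be named as a next hop; the packet would have to go through the local gateway, which makes its own forwarding decision.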
(a) addresses that are globally unique but not globally routable. (These are extremely common in IPv6. These are not so common in IPv4 because IPv4 addresses are expensive, so people try to minimize usage, so people will try to avoid using paid-for globally unique addresses for non-routable purposes.)
(b) addresses that are in ranges that are, per spec and actual usage, only even defined within an organization and are not globally unique. For example, 192.168.0.1.
(a) and (b) are not the same by any useful definition. Sorry.
The difference becomes apparent when you connect or combine organizations using the same private range. It's a lot simpler to route networks (not globally, privately) when the ranges are unique. There's no NATs, double NATs, and other nonsense to deal with.
Even without this, it's a lot simpler administratively. I have about 20 AWS accounts, all with their own VPCs, all using the same 10.0.0.0 block because... well.. nobody thought about this. What do I do if they need to communicate? (They probably won't, but...)
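Finding out which VPCs would clash before you try to peer them is easy to script. A small sketch with Python's `ipaddress` module; the account names and CIDRs are hypothetical:

```python
from __future__ import annotations

import ipaddress
from itertools import combinations


def overlapping_pairs(cidrs: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of names whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(nets, 2)
        if nets[a].overlaps(nets[b])
    ]


if __name__ == "__main__":
    # Hypothetical accounts, two of which picked the same default block.
    vpcs = {
        "acct-a": "10.0.0.0/16",
        "acct-b": "10.0.0.0/16",
        "acct-c": "10.1.0.0/16",
    }
    for a, b in overlapping_pairs(vpcs):
        print(f"{a} and {b} overlap; they cannot be routed to each other without NAT")
```

With 20 accounts all on the same block, every one of the 190 pairs would show up in that list.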
This “human usable” argument gets trotted out so much on here and elsewhere, but the same people would be surprised to know about HTTP/2, TLS and the like, which by that definition, isn’t human usable either because of binary formats and encryption.
People never interact with these protocols directly and use a layer of indirection such as a HTTP/2 client for HTTP, and the same applies for IPv6: use DNS (or your hosts file).
There are a number of relevant issues here, including the general problem that DNS is not trustable and is not reliable or not reliable enough to use for configuring routers and firewalls. It is not even necessarily accessible or usable for reverse lookup at all. DNS wasn't really designed for common cases where network administrators enter IP address prefixes. That could probably be fixed to some degree using a name system that was designed for security use, including operation when the network is partitioned or wildly malfunctioning.
And of course the need to maintain two sets of IP addresses and two sets of IP address prefixes - even and especially in DNS itself - is probably the number one factor slowing down the deployment of IPv6. That and far too many places, far too many interfaces, far too many protocols, and far too many APIs (notably Berkeley sockets) that are not transparent to which network layer protocol is being used or what the address format is. The wire format, transfer format, configuration format, and administration of DNS address records is a case in point.
Adding another DNS record or changing a socket listener is hardly the issue though. Most sysadmins are unfamiliar with IPv6 networking concepts such as NDP, DHCPv6 and so on, and having to learn a new system is what hinders its adoption.
Unfortunately, such changes are quite common in networking; Linux networking has many moving parts these days, there was the move to iproute2 and nftables, and the like, so one can only try to best keep up with the changes.
If you're setting up your private DNS resolvers, you can add PTR records to it. There's nothing special about PTR records in IPv6, they're just DNS records for "ip6.arpa" instead of "in-addr.arpa".
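If you'd rather not reverse nibbles by hand, Python's standard `ipaddress` module will produce the PTR owner name for either address family. The addresses below are documentation-range examples, and `ptr_name` is just a wrapper invented for this sketch:

```python
import ipaddress


def ptr_name(address: str) -> str:
    """Return the reverse-DNS owner name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer


if __name__ == "__main__":
    print(ptr_name("192.0.2.1"))    # octets reversed, under in-addr.arpa
    print(ptr_name("2001:db8::1"))  # 32 nibbles reversed, under ip6.arpa
```

The IPv6 name is long (one label per hex digit), but it is generated mechanically, so zone files are usually built by tooling rather than typed.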
Some of these "not human usable" complaints about typing/memorizing/pattern matching IPv6 addresses remind me of how long the distributed version control industry struggled with content-addressed storage and how "human usable" it was or was not. As the legends go Monotone spent years of engineering and lots of complicated code trying to build nice human usable sequence numbers in a distributed fashion, and then git just said "do the simple, stupid thing: show the (prefix of the) hash, people will adapt" and people did.
IPv6 doesn't seem "human usable" sometimes in large part because you aren't actually using it. People adapt. The human skills in pattern matching are robust: there are new tricks to learn, but there were always tricks to learn. (IPv4 addresses aren't "human usable" either if you sit down to truly assess absolutely how many RFCs are involved to build the patterns "everyone" has internalized that seem "easy". They are easy because they are familiar, because you use them often, because you've already adapted to them.)
If you remember the early Internet (the 90's, before NAT took off), you'd realize end-to-end connectivity with globally unique addresses was actually the norm. IPv6 is bringing that back. I remember having public IPv4 on my desktop!
>not internet style architectures with end-to-end addressing.
The inside of my service is not the internet, even though my service may be exposed on the internet; why would I want internal implementation details externalized?
I'm not seeing how security is harmed by making things inaccessible to the public Internet if you don't intend for services you do not control to ever access them directly.
As an AWS customer I want to escape IP entirely. It's a waste of time managing these complex networking systems with their archaic protocols (IP, BGP, DNS, etc)
Just let me strongly associate identities with my workloads and apply policy indicating which workloads should be able to send data with which other workloads.
How data gets from one workload to another should not even be my concern, just make it happen.
I agree completely. FWIW, this sentiment is why we're seeing a lot of cloud "platforms" crop up that do exactly what you're talking about. Rather than get mired in the component-zoo of virtualized datacenter (read: pretend) networking, just abstract all of it away.
While your sentiment is valid, this is the type of argument people make on low code solutions. Which has never worked in reality and never will. There's just too much nuance and detail that needs to be considered when you have to do and optimize real workloads.
Of course but it pushes the abstraction forward to what we really want. It turns out that most applications don't actually want to mess with IP except as an implementation detail and optimization. Which is why most of the time you don't and you just get some application layer HTTP payload, RPC thing or WSGIish type call on the incoming side and let the ops people deal with the networking bits and it works well enough that people mostly don't complain. The request is that more outgoing services adopt this model where you hand it off to your application server and it does the work and just gives you the data you want back.
I don't think it's that crazy, it's just formally standardizing where we're already going.
In my experience the biggest issue for being IPv6 only in AWS is that github still can't IPv6! Tons of software expects to be able to reach out to github for something.
One can use some public NAT64 services (https://nat64.xyz/), but that's not very reliable for anything serious. AWS charges an arm and a leg for NAT gateway traffic, and I don't think it's possible to configure them so that they only intercept traffic to ipv4-only hosts (please let me know if i'm wrong).
Other than this, being IPv6-only in AWS works flawlessly and is cheaper (free egress gateways for private networks). As long as you don't care about IPv4 of course - that's a given.
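For context on what NAT64/DNS64 does with an IPv4-only host: the IPv4 address gets embedded in the low 32 bits of an IPv6 prefix, and traffic to that prefix is routed to the translator. A minimal sketch assuming the RFC 6052 well-known prefix 64:ff9b::/96 (a given provider may use its own prefix) and handling only /96 prefixes:

```python
import ipaddress

# RFC 6052 well-known prefix; deployments may use a network-specific one.
WELL_KNOWN_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def synthesize_nat64(
    ipv4: str, prefix: ipaddress.IPv6Network = WELL_KNOWN_PREFIX
) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the last 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


if __name__ == "__main__":
    # 192.0.2.33 is a documentation address, used here as a stand-in.
    print(synthesize_nat64("192.0.2.33"))
```

Only destinations without AAAA records need to be synthesized this way, so an IPv6-only host sends native IPv6 everywhere else and only IPv4-only services go through the translator.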
If you have a dual stack network in AWS, wouldn’t it prefer IPv6 from DNS but IPv4 would still work? This is how I have most of my things that need general internet egress configured
IMO services like Lambda and S3 not supporting IPv6 is the real issue, and AWS shouldn’t have made the pricing change without first making them dual-stacked, accessible over IPv6.
(Technically S3 does have a separate dual stack endpoint, however it doesn’t really help as I have to change application configuration anyway to deal with this change.)
Half the reason AWS has leading IPv6 support in the first place is due to mandates from the US government to start migrating. Author is correct that, from a cost perspective, the new costs are immaterial to large customers, but I wouldn't discount the power of policy mandates from the largest customers, where the threat of building an in-house alternative to comply with policy might be sufficient to force AWS to finally prioritize support.
Indeed. I really don't like the thought, but I more and more believe that there is no other way to incentivize IPv6 at the "server side". The client (end user) side seems to do well, considering that Google reports IPv6 end user traffic of almost 50% these days.
Worst for me is CloudFront not supporting IPv6 for custom origins. If you happen to run a lot of separate Fargate containers as origins, you have to enable public IPv4 addresses for them, and that will soon double the price of small containers. Amazon needs to make their infrastructure actually support IPv6 before starting to charge extra for legacy IPv4 usage.
Yeah, somehow this one hurts the most. I know Amazon has built a giant beast here and rolling out IPv6 across all of their million services is a huge undertaking, but I can't see any reason for CloudFront not supporting IPv6 origins, like yesterday. It doesn't seem like it should be that hard relatively speaking, and would provide a good tool for working around other limitations. Honestly I've always felt that Amazon's decision-making ultimately had the best interests of their customers in mind, until now. I think this is a bad sign for things to come.
It would really help if there were real ISP competition in the USA. There's only one actual broadband ISP where I rent, which is in the suburbs near Seattle. It's NOT a rural area by any definition, and yet Comcast is my only option. Their price and service reflect that reality...
And Comcast were the biggest proponents of building IPv6 support into the DOCSIS specs, because they exhausted 10/8 in the management network for their modem fleet.
Starlink doesn’t have capacity for a large urban area. There’s only so many satellites available per square mile, and if you have a sizable fraction of a major city using starlink, they’d all be bottlenecked on a handful of satellites at best, even with the huge number of satellites. Starlink only makes sense for low-geographic-density deployments, where the number of customers on the same satellite is (relatively) low.
IPv6 adoption is only going to further the consolidation of customers onto the big monopoly providers. They will be the only ones who can afford to add the dedicated network engineering staff to make it work reliably.
Most people don't realize there are two IPv6 internets right now, the Cogent side and the Hurricane Electric side. Both are equally sized and refuse to connect to each other, so you need to know that and either buy transit from both or buy transit from a network that buys from both. At least one major provider I know of is still running v6 over tunnels. In many places your v6 traffic is taking suboptimal routes, whereas an enterprise network may have v4 connectivity at each datacenter, v6 all gets sent to that one box under Dave's desk.
But we continue to measure v6 adoption at places like Google and Cloudflare where dedicated teams make sure packets arrive and pat ourselves on the back.
> two IPv6 internets right now, the Cogent side and the Hurricane Electric side
Cogent engages in peering spats on IPv4 too; this dynamic is not new with or unique to IPv6, or limited to Cogent/HE. The lesson here is to not go singlehomed under Cogent, not to reject IPv6.
The takeaway here was that you need to be aware of it to make IPv6 work. Again, your average operator of a small regional WISP may try to deploy v6 because they lack v4 space and face customer complaints because they single home behind HE and can't reach the other half of the internet.
Currently v6 is like connecting to the late 90s internet. It doesn't work as well as people think.
You need to be aware of it to make v4 work, too. I once saw an issue where a network similar to your example WISP (singlehomed behind HE) was unable to reach a network that was advertising its v4 prefixes NO_EXPORT to HE at a distant IX. Adding a second upstream would have fixed (and indeed, did later fix) the issue. It's been advisable since the early days to have multiple upstreams because of routing table holes created by situations like this. Again, not unique to v6, and the HE/Cogent schism is not the only one (though admittedly it is likely the largest).
The "cannot escape IPv4" is apt because while you can set up an IPv6-only VPC, so many things break, from various AWS services to package repositories [0]. So then you're stuck either enabling IPv4 or running a NAT64 gateway (or trusting someone to run one for you [1]).
IPv6 is such a massive headache it’s kind of mind boggling. I used to be super enthused - but it is absolutely less useful and more annoying than it’s worth.
This is one example where it's clear IPv6 isn't the problem, actually. A lot of problems with AWS would disappear if they would just support IPv6 like your average budget ISP does.
IPv6 just works. Amazon, Github, and Azure don't. That's not really a problem in most cases (very few people go IPv6 only because it's just not necessary with CGNAT, and even then network translation tricks can put up IPv6<->IPv4 bridges easily). In Amazon's case, they don't even need to bother setting up a real network, they could abuse an fd00::/8 network to mimic their 10.0.0.0/8 network if they wanted to.
Amazon is terrible at implementing modern standards. Just look at how long it took them to support DNSSEC on their domains, and even that didn't exactly roll out great the first time.
Only via the herculean efforts of a bunch of people having to literally reinvent the world to deal with it. Everything needs IPv6 support specifically. It's such a mess; if IPv6 had just been identical to IPv4 but with larger addresses, we would be on it by now. But no, they had to make it their religious crusade to eliminate NAT (and now we have NAT66, so clearly a winner), put IPsec in there, which is hilarious in the era of WireGuard, and eliminate DHCP, which is actually insane, makes a stupid number of assumptions about hosts being able to communicate with one another, and actually complicates DNS registration.
Can you imagine how trivial it would have been if you could support both v4 and v6 by just supporting v6 and having 0::v4addr be literally equivalent to ipv4? It would be more difficult to not support v6.
> Can you imagine how trivial it would have been if you could support both v4 and v6 by just supporting v6 and having 0::v4addr be literally equivalent to ipv4? It would be more difficult to not support v6.
And how are you supposed to get the packets back when the client has an address outside that range? You still need to add support everywhere, or have NAT gateways into the areas that lack support.
Automatic mapping of IPv4 addresses exists but it requires support infrastructure just as much as any other method of allowing access to IPv4 devices.
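For anyone who hasn't seen the mapping mentioned above: the IPv4-mapped form (::ffff:a.b.c.d) is a real address format, and Python's standard `ipaddress` module understands it. A minimal sketch, where the helper names `to_mapped` and `from_mapped` are made up for illustration and 192.0.2.1 is just a documentation address:

```python
import ipaddress

def to_mapped(v4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the IPv4-mapped range ::ffff:0:0/96."""
    v4_int = int(ipaddress.IPv4Address(v4))
    # The upper 80 bits are zero, the next 16 bits are all ones (0xffff),
    # and the low 32 bits carry the IPv4 address.
    return ipaddress.IPv6Address((0xFFFF << 32) | v4_int)

def from_mapped(v6: str):
    """Return the embedded IPv4 address, or None if v6 is not IPv4-mapped."""
    return ipaddress.IPv6Address(v6).ipv4_mapped

mapped = to_mapped("192.0.2.1")
print(mapped, "->", from_mapped(str(mapped)))
print(from_mapped("2001:db8::1"))  # a native IPv6 address has no IPv4 equivalent
```

The second print is the catch: every IPv4 address fits into IPv6 notation, but a native IPv6 address has nothing to map back to, so an IPv4-only host still cannot reply to it without a translator somewhere.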
Yes, I'm not qualified to really argue the point, but naively it never made sense to me that IPv6 was not backwards compatible with IPv4 addressing. You'd think the people on the committees would have foreseen the trouble that would avoid. Telephone companies didn't make you dial the area code for local numbers. Microsoft bent over backwards to make sure that old DOS software still worked on Windows. Linux has a mandate to never break userland. This is not an unfamiliar concept.
IP is a bidirectional protocol, so the correct analogy would be if Microsoft had to make DOS software run on Windows, and Windows software run on DOS.
Remember the error "This program cannot be run in DOS mode"?
IPv4 and IPv6 are unidirectionally compatible using NAT64. The fact that you can make an IPv6 to IPv4 TLS connection using a packet-level translator without breaking the endpoints is quite remarkable, and it wouldn't have been possible if the protocols were too dissimilar.
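To make the NAT64 addressing concrete, here is a rough sketch of how an IPv6-only client ends up with an address for an IPv4 host, assuming the well-known prefix 64:ff9b::/96 from RFC 6052 (operators can also use their own prefix). The function names `synthesize` and `extract` are hypothetical; a real DNS64/NAT64 setup does this in the resolver and the gateway, not in application code.

```python
import ipaddress

# Well-known NAT64 prefix (RFC 6052); assumed here, a network may use its own.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")

def synthesize(v4: str) -> ipaddress.IPv6Address:
    """Build the IPv6 address that stands in for an IPv4 destination."""
    v4_int = int(ipaddress.IPv4Address(v4))
    return ipaddress.IPv6Address(int(NAT64_PREFIX.network_address) | v4_int)

def extract(v6: str) -> ipaddress.IPv4Address:
    """Recover the IPv4 destination from the low 32 bits, as a translator would."""
    addr = ipaddress.IPv6Address(v6)
    if addr not in NAT64_PREFIX:
        raise ValueError(f"{addr} is not inside {NAT64_PREFIX}")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)

print(synthesize("1.1.1.1"))        # 64:ff9b::101:101
print(extract("64:ff9b::101:101"))  # 1.1.1.1
```

This only works in one direction because the IPv4 address fits inside the IPv6 one; there is no room to go the other way, which is why the translation is stateful on the gateway.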
WAN failover is not fun with ipv6 - npt doesn’t solve things because prefix lengths are variable. You end up back with NAT on basic networks again, but with a ridiculously large address space. Firewall rules are trickier - you need to let ICMP through, but be careful because some of it can drive network reconfig. DHCP is second class, and the network can do weird things when port isolations are on. The number of (rotating) ipv6 addresses per host gets silly and makes logging / accountability / trace back systems more convoluted. Then you’ve got neighbor discovery threats and header extension manipulation stuff. And multicast may not be working because of a security configuration that breaks assumptions, yet you also have multicast amplification stuff.
There is a reason well resourced companies like Google Cloud have been slow with IPv6 - and it can be even more hair pulling in smaller settings.
> npt doesn’t solve things because prefix lengths are variable
Then you pick your shortest prefix length and use that for your network configuration, no? Nothing in NPT is forcing you to use a /48 or a /56, if your failover uplink only provides you with a /80 for some stupid reason you'll still be able to do translation.
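Roughly what I mean by translating at whatever prefix length the worst uplink gives you, as a sketch only: the prefixes are documentation/ULA examples I made up, `translate_prefix` is a hypothetical helper, and real NPTv6 (RFC 6296) additionally adjusts part of the address to stay checksum-neutral, which this deliberately skips.

```python
import ipaddress

def translate_prefix(addr: str, inside: str, outside: str) -> ipaddress.IPv6Address:
    """Swap the network prefix of addr from `inside` to `outside`, keeping host bits."""
    inside_net = ipaddress.IPv6Network(inside)
    outside_net = ipaddress.IPv6Network(outside)
    if inside_net.prefixlen != outside_net.prefixlen:
        raise ValueError("inside and outside prefixes must have the same length")
    address = ipaddress.IPv6Address(addr)
    if address not in inside_net:
        raise ValueError(f"{address} is not inside {inside_net}")
    # Keep only the bits to the right of the prefix, then graft on the new prefix.
    host_bits = int(address) & int(inside_net.hostmask)
    return ipaddress.IPv6Address(int(outside_net.network_address) | host_bits)

# Internal /80 carved out of a ULA, mapped onto a (hypothetical) /80 from the uplink.
print(translate_prefix("fd00:1:2:3:4::10", "fd00:1:2:3:4::/80", "2001:db8:a:b:c::/80"))
```

With two uplinks you would keep one such mapping per uplink and switch which one is active on failover; the internal addresses never change.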
DHCPv6 is supported just fine by everything but Android (for some annoying reason). Even with SLAAC, IP addresses shouldn't rotate, unless you enable Privacy Extensions on your server.
If this were the bakery just around the corner we're talking about, I would've accepted these problems as illogical to even try to overcome, but these are billion dollar companies selling network access. When networking is one of your major streams of revenue, I expect better.
Amazon doesn't support DNSSEC on their domains. If you mean that it's possible to DNSSEC-sign a Route53 domain, yes, but AMAZON.COM isn't signed. Very few tech companies have signed their domains; it's a net negative.
I've been listening to a very good IPv6 related podcast with knowledgeable hosts (IPv6 Buzz) and all it's done is convince me that IPv6 is a poorly thought out mistake.
Every other episode seems to be about a different new RFC that's replacing another RFC because the original ended up having a bunch of holes and edge cases. That's somewhat understandable for a new protocol, but the protocol has been around for almost 30 years and is just so overly complex that it's rife with these situations.
As an example the most recent such episode was on rfc6724[0] which describes these convoluted algorithms systems are supposed to follow to determine which of their many assigned IPv6 addresses to use for a particular connection and also which of many possible destination addresses to use. Just reading the introduction makes your eyes water with how overly complex and prone to nasty failure cases (what if the source address isn't what you expect and somehow the connection routes around your firewall?) the whole situation they've created is.
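To give a flavour of what those algorithms involve, here is a sketch of just one ingredient, the longest-matching-prefix comparison, which as far as I understand the RFC only applies as a late tie-breaker after several other rules (scope, deprecated addresses, labels, temporary addresses and so on). The function names and addresses are mine, not the RFC's, and this is nowhere near a full implementation.

```python
import ipaddress

def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv6 addresses have in common (0 to 128)."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    # The highest set bit of the XOR marks the first position where they differ.
    return 128 - diff.bit_length()

def pick_source(candidates: list[str], destination: str) -> str:
    """Choose the candidate source sharing the longest prefix with the destination."""
    return max(candidates, key=lambda src: common_prefix_len(src, destination))

sources = ["fd00::1", "2001:db8:1::1"]
print(pick_source(sources, "2001:db8:2::5"))  # the 2001:db8 address wins
print(pick_source(sources, "fd00::99"))       # the ULA address wins
```

Even this single rule already means the source address depends on where you are connecting to, which is where the firewall surprises come from.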
This somehow reminds me of some anti EV (Electric Vehicle) people. They accept ICEVs (Internal Combustion Engine Vehicles) as given and normal (ICEVs just exist, the fuel falls from the sky) but dig really deep into an anti EV mindset. They follow anti EV blogs and podcasts. They will tell you how bad EVs are for the environment, how much water and cobalt and what not is used for the production without acknowledging that the same is true for ICEVs.
I have used IPv6 since 2006 and I just can't see how it can give you "a massive headache". I read a blog post about how overly complex HTTP/3 is. Better ignore it forever and never implement it then. ;-) Also, which successful RFC protocol doesn't have a sea of follow-up RFCs?
There's a nugget of truth in anti-EV comments, namely 1) EVs are in many ways a mediocre solution (EVs will increase batteries etc needed), and 2) pursuing EVs carries an opportunity cost (public transport will reduce emissions far more than EVs and be substantially cheaper than cars as well, and a $1000 EV subsidy could instead be a free ebike).
That said, EVs are mostly better than ICEVs - "mostly" because performance-heavy applications like long-haul trucking and tractors still benefit heavily from fossil fuels, and in most other applications EVs are still more expensive than ICEVs.
It’s really not a headache in and of itself. The technology is beautiful and enables lots of cool things - just look at all the awesome stuff the fly.io team makes it do.
The headache is vendors that still refuse to properly implement IPv6, in 2023.
It's not that people dislike IPv6 or like IPv4, it's that network people are comfortable with IPv4 and all the extra tech surrounding it. They know it works, so there's no technological risk. There's nothing new to learn. It's cheap. There's nothing your average business wants to do that can't be done on IPv4 that can on IPv6. The ROI of just paying for IPv4 addresses and associated tech/services is undeniable.
Yes, for some who understand and have tested it, we do not like or want IPv6, it has no privacy when the device's IP is public on the Internet 24/7. No privacy extensions fix this. Test it, it's not difficult, disable IPv6 in your home router, wait a few hours, the kids will be complaining their search results are messed up, that's just the start of the indication that the advertisers now have a veil where with IPv6 they had clear fully trackable results.
> The first pattern is having multiple Load Balancers (per VPC); this is often the result of using several readily available Cloudformation templates / Terraform modules, or somehow using Kubernetes ingress controllers that create a Load Balancer for every service. This is fixed by not doing that! A single Load Balancer can handle many URLs and services.
This is the definition of cloud bloat. The fact there are tons of systems abusing that kind of architecture probably justifies charging for IPv4.
I think it's an impedance mismatch between the feature people want -- "logical load balancers" and the feature they're offered "physical load balancers."
How nice it would be if you could just create a bunch of load balancers and all that actually meant was that it was just adding config profiles to a single physical load balancer and kept them truly isolated? Right now it's really annoying because load balancer config is global state and everyone has to either be kind neighbors when adding themselves to it or manage them top-down.
I do believe the AWS Load Balancer Controller on Kubernetes allows for sharing a single "physical" load balancer.
You have to set a load-balancer-name annotation https://kubernetes-sigs.github.io/aws-load-balancer-controll... to tie everything together to one load balancer. There is a downside where you have to have a few other annotations be the same value across your ingresses, but once you work around that, you're good to go.
I couldn't find it easily specified in docs. This is a common use-case and part of why I avoid EKS for HTTP workloads is that I have tiny services I want to just make available and I don't want to have another full ELB sitting there. It isn't a cost thing primarily. It's that now I have another significant resource. I want all of these things on a misc LB.
I use DigitalOcean... almost all their products support IPv6. Only floating IPs are IPv4, but can work around that by not destroying droplets so the IPv6 address doesn't change.
One annoyance I have is that IPv6 is disabled on VMs (droplets) by default. So every time you upgrade your Kubernetes cluster and it recreates every node they all have no IPv6 again with no way to change this. In the end I couldn't be bothered and just stopped rebooting every node after every update.
Honestly IPv6 is a clusterfuck. From the horrible addresses (why are they impossible to memorize? Who thought that was a smart idea? At least I can wrap my brain around IPv4) to the need for specific support in literally every layer of the network stack.
If you are going to mention gateways or other methods to make it work, please just stop. No end-user is going to do that, or rather no appreciable amount of end users are going to do it. If your fix starts with “why don’t you just…” then please stop living in a fantasy world.
I was excited for IPv6 when it was announced, I was excited years later, I was excited a decade later, now I’m just tired of it. 2024, year of IPv6 and the Linux desktop, ok sure. My ISP, literally the best available in my area and fairly cutting edge in every other aspect, has zero IPv6 support.
While the idea of every device having its own public IP address was attractive to a younger me, I look at it with a bit of horror now. The privacy/security aspects alone are staggering and you rarely want your device to be publicly available by default. I’m not going to exceed the 16M+ limit of 10.0.0.0/8 so I don’t see why I would ever want to use anything but IPv4 internally for my sanity. Are STUN/TURN servers fun? Is needing some central server ideal? No but the alternative (everyone can talk to everyone directly) makes my head hurt with the implications and footguns.
At the end of the day I’ve started disabling IPv6 as a matter of course. Leaving it on is a landmine I’m laying for my future self. I’ve dealt with too many issues directly myself or for clients/customers which end with “let’s try disabling IPv6, oh it’s working now?” (on my end or theirs) that I’m done. Something drastic would have to happen to get me to change that thinking, and seeing how it’s been over 2 decades and major websites I use daily still don’t support IPv6, I’m not holding my breath.
>From the horrible addresses (why are they impossible to memorize? Who thought that was a smart idea? At least I can wrap my brain around IPv4)
Every time someone brings up this point, I have to assume that they know nothing about IPv6 but the superficial things.
If you work with IPv6 long enough you will remember the addresses. We all remember 192.168.0.* through years of typing it repeatedly and looking at it, not because it is easy to remember. I can already recall 2606:4700:4700::1111 or 64:ff9b::101:101 from memory.
>My ISP, literally the best available in my area and fairly cutting edge in every other aspect, has zero IPv6 support.
This is almost exclusively a Euro-American phenomenon. I am not sure why you are lashing out at IPv6 when it's the ISPs' fault. In most East or Southeast Asian countries we are looking at double-digit % of IPv6 deployment; the moment you click on the IPv6 checkbox you get IPv6 connectivity here.
>The privacy/security aspects alone are staggering and you rarely want your device to be publicly available by default.
Another one who mistakes "having a globally unique address" with "public accessibility". Boo.
>“let’s try disabling IPv6, oh it’s working now?” (on my end or theirs) that I’m done
Just say that you are lazy in fixing IPv6 problems. I have found that lots of old networking guys would say "it's defo my fault somewhere" when IPv4 fails, but when it comes to IPv6 it's always IPv6's fault somehow. Protip: most of the time it isn't.
I feel like there's now a perverse incentive here for Amazon to drag their feet at implementing full feature parity with IPv6. I wish these new charges only applied for IPv4 addresses used with services that already do have that.
There's one viable solution to be able to run IPv6 only subnets in AWS, their (or your own) NAT gateways support v6->v4 NAT. So it allows you to create large IPv6 only subnets for your compute services (ec2, ecs, k8s, elb, all supports that), allowing your containers to scale without worrying about IP addresses. Then you use dual stack subnets for other AWS services that may not support IPv6 and your compute services can access them through the NAT gateway.
> RDS (nine in ten customers have public IP on RDS by accident)
AWS does make it easy to fuck up with its default settings. Subnets that auto-assign EIPs for every instance attached to them should not exist, period. And neither should RDS instances or anything else be reachable from the public Internet by default.
There needs to be a body of law relating to technical matters like this (and interoperability etc) that is adjacent to competition law. Some things we just need everyone to be on the same page about. It is manifestly the case that ipv6 is never going to be that, because the incentives to invest simply don't exist for companies like AWS.
This distorts the market in eyeball networks and hosting - the former are under little pressure to offer v6, and new entrants to the latter can only offer v6. Competition law in the EU works (I think?) on the principles of consumer benefit and market fairness. On that basis, I'm left wondering why this has never been pursued by the EU's competition authorities.
The EU did have a mandate for government services to use IPv6, but the programme it was part of got replaced by another that didn't include IPv6.
The European Commission did advocate for IPv6 use, but, the EU being the EU, motivated their recommendation by complaining that law enforcement had issues tracking down people behind CGNAT, and made clear that they wanted every IP address to point to a specific person for law enforcement reasons.
So, yeah, I don't think we should let the EU deal with the specifics of network infrastructure just yet.
I think it's hard to make an economic argument for IPv6. Yes, it's obviously a superior technology, but ISPs can CGNAT for cheap, consumers can still access every server, and the €40 per year a business needs to pay for an IPv4 address isn't exactly breaking the bank either.
Perhaps the EU should force the issue, but I think countries like Lithuania, where there is practically no IPv6 available (0.58%, according to https://stats.labs.apnic.net/ipv6-zoom, but who knows how accurate that is), will protest any mandate that will force their ISPs to buy new networking equipment.
Not really that cheap. While CAPEX is CAPEX, OPEX is still a thing and operating CGNAT requires efforts. Also some (most?) CGNAT implementations are buggy and is not a good user experience, even for users who don't understand the concept of IP at all.
> Also some (most?) CGNAT implementations are buggy and is not a good user experience, even for users who don't understand the concept of IP at all.
They're a pain, especially when you're visiting a website with CAPTCHAs, but the money they save on buying IP space seems to be worth the bad experience from an ISP point of view.
Even here in the Netherlands, with its relatively high wages, a fiber ISP decided to use CGNAT on their new fiber networks as a cost-cutting measure. Luckily, customers can disable CGNAT in their online control panel, but the cost cutting measure seems to be worth the annoyed customers from that company's perspective at least. Of course they also didn't roll out IPv6.
Assuming you’re referring to Delta/Caiway… I think they’re expanding quite quickly, considering both started out as smaller local ISPs; so it’s probably between CGNAT and having to acquire IP space for them.
The fact that they’re owned by an investment fund also makes them probably very focused on profitability.
As a point of comparison, the other players aggressively rolling out fiber (KPN, ODF/Odido) have been nationwide ISPs since the 90s, and they aren’t doing CGNAT AFAIK (so they probably aren’t hurting for IP space).
I fully understand their choice to default to CGNAT because of their rapid expansion and the lack of available IPv4 space. However, if they have the money to invest in ISP grade CGNAT equipment, adding IPv6 shouldn't be a big problem.
Ziggo's DS-Lite, which also CGNATs IPv4 traffic, is annoying but at least you get a normal IPv6 subnet. This would've been a much better solution looking forward.
Dutch ISPs in general have plenty of space. Dutch ISPs have 53 million IPv4 addresses for a country of 18 million according to the first result on Google. Every person in every household can have a home connection and two servers without anyone lacking IPv4 addressing if these addresses were all pooled together.
However, there's no guarantee that things will stay this way. Like I said, Ziggo already does a form of CGNAT, and as the price of IPv4 addresses keeps rising, I expect more cheap providers to start selling off address space. KPN will stick to normal IPv4 for a while, but I don't trust super cheap companies like Odido to have the benefit of the consumer in mind, especially after trying to route all traffic through their affiliated German exchange instead of AMS-IX a while back. Odido is owned by an American fund as well (which is why they had to change their name), as is VodafoneZiggo.
The problem is customers don't like CGNAT. You can't run Animal Crossing on Nintendo Switch in network mode as a host if you don't place the Switch as a catch-all in the DMZ.
Wish I were joking here - especially due to the security risk involved in running something in all-ports-open on the Internet - but Nintendo doesn't seem to (want to) run STUN/TURN servers.
Nintendo's hilariously bad Switch networking guides ("to make games work, forward ports 1-65535 to your switch") are more of a Nintendo problem than a CGNAT problem. Normally I'm all for blaming CGNAT for shitty internet issues, but Nintendo is at fault this time, and ISPs should rightly tell their customers to ask Nintendo to get its shit together. Even without CGNAT, STUN/TURN is important to get peer to peer connections working.
CGNAT brings tons of issues, but following Amazon's pricing model, I don't think consumers would be willing to pay $4 a month to rent an IP address. Better to sigh and shrug at the two or three games and programs that don't work than to spend $48 a year, especially with the current cost of living being on the rise.
> Normally I'm all for blaming CGNAT for shitty internet issues, but Nintendo is at fault this time, and ISPs should rightly tell their customers to ask Nintendo to get its shit together. Even without CGNAT, STUN/TURN is important to get peer to peer connections working.
I agree with you, but it doesn't change reality... Nintendo doesn't give a fuck and (from hearsay) people with Nintendo Switches make up a huge proportion of service calls from customers that want CGNAT disabled and pay for a legitimate IP address.
I don't have a complete overview of the industry, but there were at least one or two Ubiquiti gateways that didn't support hardware accelerated IPv6 routing. I also read about a lineup of MikroTik switches that got updates to enable IPv6 hardware offloading this year.
Perhaps the enterprise side of networking is better about this stuff, but I doubt it if my experiences with other enterprise products are anything to go by.
The packets routed by these devices will end up at their destination, but at very low speeds.
Lots of AWS customers want IPv4 because that's what they know, and that's what they benefit from.
To me, the question is: what stops me today from spinning up an IPv6-only website and having 99% of the world's browsers use it? If the answer is "nothing", then AWS shouldn't be forced to offer IPv6 (or only IPv6) - IPv4 is just part of what they offer customers. If the answer is "these 7 things" then those 7 things need to be fixed[0] before we pay civil servants to try and force companies to do things that they barely understand.
It's impossible for an IPv4 endpoint to accept an IPv6 connection. Perhaps you have a dual-stack CDN with an unpublished IPv6 address that some users have found? Or your service is accepting third-party 'Forwarded' headers, which would allow HTTP clients to spoof their IP address.
Sorry for the aside, but I hope the neveragain.de author will make a blog post about their site theme. I _really_ like it, and of course I would like to mostly copy it for my own personal site.
That said, until the cost of IPv4 becomes really huge, few organizations are going to suffer the effort-cost of embracing IPv6.
I would argue that the IPv6 sales story is unmemorable|unclear|weak. Also, it is arguable that most IPv4 addresses are wasted.
The moment you have a customer with crappy legacy infrastructure who refuses to allowlist anything but static IPV4 addresses, you have to support IPV4.
My ISP, Fidium, does not have the word IPv6 on its entire website. And definitely has no support of it on my WAN. They should. I want to connect to IPv6 services using their connection.
I actually keep a cheap Comcast connection as a second WAN just to get IPv6 enabled on my home network. (And also because I live in the woods in New Hampshire and having two ISPs means I have fairly ok uptime)
Off topic: Does anyone know if this page is generated from a Static-Site generator starting from Markdown?
I currently use Hugo and my blog is in Markdown in git, but the theme is pretty heavy-weight, and I like this look of the page in OP; Looking at the source, it's so minimal!
As mentioned in the footnote, this can be done by using PrivateLink; it costs a few bucks too, but it is the way to go if your VPC does not (or must not, for Compliance™ reasons) have internet connectivity.
If your target VPC has neither PrivateLink nor public IPv4 connectivity somewhere, I'm not sure how that would work; I'd love to learn how that was built.
Yeah, sure, we use PrivateLink. In my opinion, it's clickbait to say "almost no AWS API can be used from a VPC without public IPv4 addresses" with a footnote "actually most can if you use the service that enables that".
If you get a /48 you can probably evade the problem by assigning a /64 for your proxy at a time. You will have another ~65,500 such blocks for use.
Yes, some might just block the /56 (you would still have another ~250 chances) or /48 but nothing is perfect.
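The arithmetic behind those numbers, as a quick sketch with Python's `ipaddress` module: each extra prefix bit doubles the number of subnets, so a /48 holds 2^16 /64s and a /56 holds 2^8. The parent prefix below is a documentation range and the helper names are made up for illustration.

```python
import ipaddress

def subnets_within(parent_len: int, child_len: int) -> int:
    """How many child-length subnets fit inside one parent-length prefix."""
    if child_len < parent_len:
        raise ValueError("child prefix must be at least as long as the parent")
    return 2 ** (child_len - parent_len)

def nth_subnet(parent: str, child_len: int, n: int) -> ipaddress.IPv6Network:
    """Return the n-th child subnet (0-based) of the parent network."""
    net = ipaddress.IPv6Network(parent)
    if not 0 <= n < subnets_within(net.prefixlen, child_len):
        raise IndexError("subnet index out of range")
    step = 1 << (128 - child_len)  # number of addresses in one child subnet
    return ipaddress.IPv6Network((int(net.network_address) + n * step, child_len))

print(subnets_within(48, 64))                   # 65536 /64s in a /48
print(subnets_within(56, 64))                   # 256 /64s in a /56
print(nth_subnet("2001:db8:1::/48", 64, 255))   # 2001:db8:1:ff::/64
```

So rotating the proxy through /64s only helps until the other side starts blocking at /56 or /48, at which point you are down to 256 or 1 "chances" respectively.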
At the ISP level, you have better chances of having IPv6 connectivity if you’re based out of a developing country, whose ISPs don’t have the means to pay for too many IPv4 ranges.
For servers, there are plenty; AWS Lightsail, Hetzner and Vultr all provide IPv6 out of the box. If you don’t have an ISP which provides IPv6, you could use a server and set up a WireGuard tunnel for IPv6 connectivity.
> you have better chances of having IPv6 connectivity if you’re based out of a developing country, whose ISPs don’t have the means to pay for too many IPv4 ranges
Thank you for the link. I was talking about India for the most part, but it seems France, Germany, India, and Saudi Arabia are the leaders in IPv6 deployment, a weird mix of countries that I honestly didn’t expect.