> This is the same IP address: 3232271615. You get that by interpreting the 4 bytes of the IP address as a big-endian unsigned 32-bit integer, and print that. This leads to a classic parlor trick: if you try to visit http://3232271615 , Chrome will load http://192.168.140.255.
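A minimal Go sketch of that interpretation (just the standard encoding/binary and net packages), in case anyone wants to poke at it:

package main

import (
    "encoding/binary"
    "fmt"
    "net"
)

func main() {
    // Interpret the 4 bytes of 192.168.140.255 as a big-endian uint32.
    ip := net.ParseIP("192.168.140.255").To4()
    n := binary.BigEndian.Uint32(ip)
    fmt.Println(n) // 3232271615

    // And back again: the "integer URL" decodes to the dotted quad.
    back := make(net.IP, 4)
    binary.BigEndian.PutUint32(back, 3232271615)
    fmt.Println(back) // 192.168.140.255
}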
This was the source of one of my favorite “bugs” ever. I was working on multiple mobile apps for a company, and they had a deep link setup that was incredibly basic: <scheme>://<integer>, which would take you to an article with a simple incrementing ID. This deep link system “just worked” on iOS and Android; take the URL, grab the host, parse it as an int, grab that story ID. Windows Phone, however… the integers we were parsing out were totally wrong, returning incredibly old stories!
Turned out that the host we were given by the frameworks from the URL was auto-converted to an IP in dotted-quad format, and then the int parser was just grabbing the last segment… which meant that we were always getting stories <256, instead of the ~40000 range we were expecting.
Curiously, this appears to be a bug in Windows Phone. In URIs, the part following `//` is called the authority, which is essentially a host plus some optional extras (like a port number).
According to RFC 1123, a hostname can legally be entirely numeric, and a web browser shouldn't attempt to "correct" it (it is a valid URI) for schemes it doesn't know anything about, since it doesn't know the hostname rules for that protocol. A bare integer is also not a valid IP address according to RFC 3986 (which specifies URI syntax), since that grammar requires the #.#.#.# format with three dots.
That said, using authority for something that isn't technically a hostname is misusing the field. I think using `<scheme>:<integer>` would have been a better idea.
Funny bug :) I often prefix integers with a character, e.g. maybe "u12345" here, in places where I'm using integers as IDs, to force a string conversion and keep any code from accidentally doing math on them.
I love those kinds of stories. It shows how high up the abstraction stack we work on a daily basis, when we have no clue what's going on inside the stuff we touch constantly.
I bet it was not “I love it” sentiment when you had to debug this kind of issue though, haha :)
Any bug you can learn something from is better than the alternative. :)
Thankfully, I caught this one while building the feature in the first place; I don’t imagine I’d have such fond memories of it if I’d had to recreate it from user reports!
Well, I read how the foobar2000 dev (Peter) described developing his apps for Windows Phone. He said it was so annoying compared to the other two major mobile platforms that he considered refunding the crowdfunded money that had been pledged for the Windows Phone version.
These different representations also lead to frequent server side request forgery (SSRF) bypasses - someone might be blocking local IPv4 but you can still access their AWS metadata endpoint at ::ffff:169.254.169.254, etc.
For anyone using Ruby, I'm the author of a gem [1] that comprehensively protects against SSRF bugs. For anyone using Golang I recommend this [2] blog post.
I've found, and reported, a whole bunch of services which take user-supplied URLs and don't filter out access to localhost:8080/server-status, and similar local resources.
A common route to attacking these is to access the AWS metadata URL endpoint. That's something at least Google Cloud prevents, by requiring the `Metadata-Flavor: Google` header.
I wonder how many of these bugs are the result of people thinking "Well I've read the spec but most of it is 'cursed' so I'll just implement this subset which fits my idea of 'acceptable'".
Unfortunately the blacklisting approach that works for IPv4 is completely broken for IPv6, since you can't really know where your own services are. I still haven't found a good generic way to protect IPv6, and so far I've ended up just disallowing it everywhere.
IPv6 has internal ranges defined just like IPv4 does - anything in an internal range should be blocked, and anything in an external range is safe to pass through.
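For what it's worth, here's a rough Go sketch of that kind of check, leaning on the standard library's classification helpers (the set of categories below is illustrative, not exhaustive). Because net.ParseIP normalizes IPv4-mapped forms, it also catches ::ffff:169.254.169.254:

package main

import (
    "fmt"
    "net"
)

// isInternal is a sketch of an SSRF-style "don't fetch this" check.
// The categories below are illustrative; review which ranges matter
// for your own network before relying on something like this.
func isInternal(addr string) bool {
    ip := net.ParseIP(addr)
    if ip == nil {
        return true // refuse anything we can't parse
    }
    return ip.IsLoopback() ||
        ip.IsPrivate() || // RFC 1918 and IPv6 ULA fc00::/7 (Go 1.17+)
        ip.IsLinkLocalUnicast() || // 169.254.0.0/16 (metadata endpoints) and fe80::/10
        ip.IsLinkLocalMulticast() ||
        ip.IsUnspecified()
}

func main() {
    for _, a := range []string{"8.8.8.8", "169.254.169.254", "::ffff:169.254.169.254", "fd00::1"} {
        fmt.Println(a, isInternal(a))
    }
}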
Since there is no NAT in an IPv6 deployment, the unsafe services you typically want to block access to look like they're in non-internal ranges. Whereas in a normal IPv4 deployment you might have your protected service on 192.168.4.111, with IPv6 it will (potentially) just share the same prefix as your host.
NAT was meant to work around IP exhaustion, not act as a security layer. By accident, it often happens to provide some additional security. But keep in mind those "internal" IP addresses may be routable in some cases, due to either accidental or deliberate misconfiguration.
IPv6, with end-to-end connectivity, is how the Internet is supposed to work. It's how it did work in the early 90's, even with IPv4.
If you want to secure your servers, use a firewall. Maybe it's a host-based firewall.
What are the practical reasons we should rearchitect our systems to remove NAT?
(I know the weaknesses of NAT but the cat’s out of the bag at this point...the question isn’t really “why should we use NAT”, it’s “why should we go through the pain of breaking it”.)
IPv6 does nothing to break NAT, you could deploy the exact same kind of NAT and it would have the exact same behaviour, if you really want to make your router use a bunch more memory/CPU and make it a pain for users to do anything that needs a direct connection. But you gain nothing from doing that.
It's nice that this is how the internet is "supposed to work". In practice not having a NAT makes "automatic" internal protection of web services hard to impossible.
> If you want to secure your servers, use a firewall. Maybe it's a host-based firewall.
Firewalls do not solve this problem, because you do want service-to-service communication. What you do not want is code that crawls user-supplied URLs reaching your internal services. So you need application-level protections. With IPv6 you're basically forced to declare your CIDRs explicitly, whereas with IPv4 you could easily achieve a secure-by-default system.
> IPv6, ..., is how the Internet is supposed to work
No, the way the Internet is supposed to work is that you have one routable address space. If you need to expand it, the previous address space is imported as a subset of the new one.
I will never forgive IPv6 for not making the 32-bit IPv4 space a subrange of the 128-bit IPv6 space. Years after winning the IPng wars they admitted their mistake and standardized NAT64, but it was too late. NAT64 should have been part of IPv6 from day one, and every IPv6 router acting as a default-route gateway should have been required to offer it.
Mandating that every router has to do stateful connection tracking would have been an enormous, wasteful burden. NAT64 is there for those who need it; 464XLAT setups with IPv6-only clients are quietly the reality on networks that don't have too much legacy infrastructure (mostly mobile).
And having two completely separate internets (IPv6 and IPv4) forever isn't a wasteful burden?
Not requiring backwards compatibility in IPv6 guaranteed that IPv4 would be around forever. IPv4 is never, ever going away because of this.
The only people who couldn't see this coming were from Bell System backgrounds where you could use centralized schemes like "Ma Bell says tomorrow is the Flag Day, flip the switch". In a decentralized system people don't stop using the old system until you give them a new system that is backwards compatible. Then you drop the backwards compatibility in a second, separate upgrade much later, on a timetable dictated by adoption, not flag days.
> And having two completely separate internets (IPv6 and IPv4) forever isn't a wasteful burden?
Your proposal would force maintaining IPv4 for longer and in more networks: every IPv6 router would have to have IPv4 connectivity and probably a routeable IPv4 address, so it wouldn't even solve the address exhaustion problem for long (perhaps not even at all).
> Not requiring backwards compatibility in IPv6 guaranteed that IPv4 would be around forever. IPv4 is never, ever going away because of this.
IPv4 has already been eliminated from newer edge networks, and for those networks the vast majority of upstream traffic is IPv6. No doubt those networks will have to maintain 464XLAT for a long time as the long tail of upstream sites that are only v4-accessible, but they'll be able to have a smaller and smaller pool of 464XLAT servers and outsource the v4 connectivity support further and further upstream (just as with Usenet), until eventually v4 connectivity becomes a paid add-on and then goes away entirely. Home routers for use with PCs will probably have to offer 4over6 for a long time, because it's hard for an ISP to be confident all their users are up to date, but that doesn't actually reduce the benefits that much (all your internal network management can still be v6, only the little home user LANs are v4), and organisations that manage all their endpoint devices don't even need that much.
Maybe I'm misunderstanding you, but one of the points of the article in that you can represent IPv4 addresses in IPv6. In other words, IPv4 is a subset of IPv6.
If I'm on an ipv6-only host and blast UDP at ::ffff:1.2.3.4, they should get delivered to 1.2.3.4, no?
The actual, real-world problem with 4 being a subrange of 6 is that 4-only hosts are blissfully unaware that the super-range exists, so have no mechanism to send packets there. This is of course where you're right about NAT64 and the state requirements.
> If I'm on an ipv6-only host and blast UDP at ::ffff:1.2.3.4, they should get delivered to 1.2.3.4, no?
No, it won't necessarily! That's precisely the problem. Until NAT64 was introduced it was in fact impossible for an IPv6 router to deliver your packet to the IPv4 host 1.2.3.4. NAT64 still isn't mandatory (and likely never will be), so if you're writing software you can't assume those packets will get through even if you have an IPv6 network connection with a default route.
NAT64 didn't come about until long after IPv6 was finalized, and NAT64 support from default-route IPv6 routers is still not mandatory. That's why we have this mess with dual-stack hosts: you cannot safely assume that your IPv6 router is willing to deal with the IPv4 world on your behalf.
The ::ffff:1.2.3.4 address space does, in fact, date back to the early days of IPv6 (it came from RFC 2765, about one year after IPv6 was finalized), but it was not meant for letting IPv6 clients share a single IPv4 address -- it was only for servers which for some reason had their own IPv4 address but couldn't speak IPv4. Yeah, back in the early 2000s people thought this problem might happen.
The IPv6 committee was viciously hostile to NATs. The way they saw it, NATs were the problem that made IPv6 necessary, so no way were they going to allow any NATs to pollute their precious IPv6. If that meant that the whole world had to run two separate internets (IPv4 and IPv6) for the rest of eternity just to keep the IPv6 network puritanically NAT-free, then so be it!
It took them more than a decade to realize how stupid this mindset was.
Ah! I see what you mean. Thanks for clarifying and correcting me.
I possibly have some sympathy with the anti-NAT view taken at the time, even if it ended up being the wrong thing to do in hindsight. Adding more mandatory complexity for implementors would have harmed adoption rates, and I've seen some weird edge cases with NAT64 - it's not necessarily a trivial thing to implement correctly.
Yes, but having two entirely separate internets, like we do today, is much more complex than any amount of NATting!
I fault the IPv6 proponents for not foreseeing our current situation. DJB saw it with crystal clarity in 2001. Lots of people warned them that this would happen.
::ffff:0:0/96 is for representing v4 addresses in v6 APIs. If you tell the kernel you want to send a packet to ::ffff:1.2.3.4, you're actually telling it you want to send a (v4!) packet to 1.2.3.4, you're just doing it using an AF_INET6 socket rather than an AF_INET one.
Since packets aren't APIs, you should never see ::ffff:0:0/96 in packets on the wire. A v6-only host can't use this prefix to send v4 packets to v4 hosts.
(What would the source address of those packets even be?)
0 is 0.0.0.0, which is not a valid address for most purposes. Some programs, like iputils ping, have special handling for that case (i.e. using it as an alias for the unroutable host address); some programs, like FreeBSD's ping, do not [1]. Unlike most of these address tricks, it's not standardized, except that treating it as a normal address is technically disallowed.
Interesting. That doesn't seem to be specified in the RFC1122 standard or the Linux ip(7) docs, but it's an explicit special case in the kernel (ip_route_output_key_hash_rcu):
if (!fl4->daddr) {
    fl4->daddr = fl4->saddr;
    if (!fl4->daddr)
        fl4->daddr = fl4->saddr = htonl(INADDR_LOOPBACK);
    ...
Would you be further humbled if the iPad accepted http://CXXVII.I also?
I'm never writing anything that positively accepts 127.1, or 0127.000.000.0001 as a valid address no matter what garbage implementations do.
The issue we have with this is situations where we have to accept only inputs that are domain names, and we need to be sure they won't be treated as an IP address by some software downstream of us.
I'm now going to change my LAN to use 10.0.0.1 instead of 192.168.0.1 so that I can just type 10.1. This will help not only when testing stuff on mobiles (only to have to retype the whole address because you forgot http://), but also when telling the kids what IP to connect to when setting up LAN games, or telling coworkers some LAN/router IP. The time server is on 10.36.
But we'll see how well that works... I just fed the first 4 google results for "ip address converter" with 10.1: Three converters gave an error message and one came up with 0.0.0.10.
> I’m on the fence about that last one, the “IPv6 with an embedded dotted decimal” form. My reference parser (Go’s net.ParseIP) understands it, but it’s not really that useful any more in the real world. At the dawn of IPv6, the idea was that you could upgrade an address to IPv6 by prepending a pair of colons, as in ::1.2.3.4, but modern transition mechanisms no longer offer anything as clear-cut as this, so the notation doesn’t really show up in the wild.
I have to disagree with this conclusion. I see it very frequently on Linux. It turns out that programs can bind their listen address to just ::, and the kernel will still allow connections from IPv4, with the address mapped into ::ffff:0.0.0.0/96 -- outbound connections use the same notation.
> It turns out that programs can bind their listen address to just ::, and the kernel will still allow connections from IPv4, with the address mapped into ::ffff:0.0.0.0/96 -- outbound connections use the same notation.
This is only true if the sysctl bindv6only or socket option IPV6_V6ONLY is 0, and is defined by RFC3493.
I definitely frequently used this in code I had written and ran. It is very nice to not have to worry about both stacks and IPv6 is the future anyways. It’s nice to make this configurable for your daemons but I think the default should be true. And also this allows you to not have two separate bind address config lines and all the confusion that comes with that.
Also, some applications have built-in filtering of allowed IP addresses that doesn't take IPv4-mapped IPv6 addresses into account, so rules may be bypassed without the admin knowing, because they dutifully entered their filters in IPv4 only and forgot to tell the application to bind to IPv4 only.
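If you want the strict behaviour regardless of the platform default, you can set the option yourself at bind time. A rough Go sketch (Linux constants from the syscall package; other platforms differ):

package main

import (
    "context"
    "fmt"
    "net"
    "syscall"
)

func main() {
    // Force IPV6_V6ONLY=1 on the listening socket before bind, so the
    // listener never sees IPv4 clients as ::ffff:a.b.c.d peers,
    // whatever net.ipv6.bindv6only happens to be set to.
    lc := net.ListenConfig{
        Control: func(network, address string, c syscall.RawConn) error {
            var serr error
            if err := c.Control(func(fd uintptr) {
                serr = syscall.SetsockoptInt(int(fd),
                    syscall.IPPROTO_IPV6, syscall.IPV6_V6ONLY, 1)
            }); err != nil {
                return err
            }
            return serr
        },
    }
    ln, err := lc.Listen(context.Background(), "tcp", "[::]:8080")
    if err != nil {
        panic(err)
    }
    defer ln.Close()
    fmt.Println("listening (IPv6 only) on", ln.Addr())
}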
> At the dawn of IPv6, the idea was that you could upgrade an address to IPv6 by prepending a pair of colons, as in ::1.2.3.4
No, IPv6 explicitly rejected that idea at first. Most of the other IPng proposals did have a backwards compatibility mechanism like that. I'm still sore that the least backwards-compatible proposal was the one that won.
Later the IPv6 cabal admitted their mistake and published NAT64, but at that point it was too late to make it a mandatory service offered by every default-route router. So now we have all of this crap about dual-stack hosts instead of simply being able to upgrade to IPv6 and trust that you will not lose any connectivity.
This is basically why, twenty years after it was standardized, IPv6 is still merely the "internet of cellphones" and no closer to replacing IPv4.
As usual, DJB saw all of this decades ahead of time:
> It does not process Class A/B notation, or hex or octal notation.
I got to find that notation useful once, to make a shorter one-liner... without even knowing that there were different classes of IPv4 address, and that I was looking at one of them.
It's a tiny function that gives me the IP address of my machine in the LAN, for either Linux and Mac:
# Get main local IP address from the default external route (Internet gateway)
iplan() {
    # Note: "1" is shorthand for "1.0.0.0"
    case "$OSTYPE" in
        linux*)  ip -4 -oneline route get 1 | grep -Po 'src \K([\d.]+)' ;;
        darwin*) ipconfig getifaddr "$(route -n get 1 | sed -n 's/.*interface: //p')" ;;
    esac
}
(sorry to people reading on small screens)
Full disclosure, I got the "1 is shorthand for 1.0.0.0" from here (which didn't get into explaining why it is a shorthand): https://stackoverflow.com/a/25851186
Oh no, that's another shorthand that's different from all the others. A single number should be interpreted as a big-endian uint32, and so "1" should be "0.0.0.1". However, I can confirm that `ip` interprets it as "1.0.0.0", even though you should have to write "1.0" for that.
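For reference, the classic inet_aton-style grouping rules expand the short forms like this (a quick illustrative sketch, decimal parts only):

package main

import (
    "fmt"
    "strconv"
    "strings"
)

// expand applies the classic inet_aton grouping rules to decimal parts:
// 1 part: a (a is the whole 32-bit value); 2 parts: a.b (b fills the last
// 24 bits); 3 parts: a.b.c (c fills the last 16 bits); 4 parts: a.b.c.d.
// Per-part octal/hex handling is deliberately left out of this sketch.
func expand(s string) string {
    parts := strings.Split(s, ".")
    var v uint32
    for i, p := range parts {
        n, _ := strconv.ParseUint(p, 10, 32)
        if i == len(parts)-1 {
            v |= uint32(n) // the last part fills all remaining bytes
        } else {
            v |= uint32(n) << (8 * uint(3-i))
        }
    }
    return fmt.Sprintf("%d.%d.%d.%d", v>>24, (v>>16)&0xff, (v>>8)&0xff, v&0xff)
}

func main() {
    fmt.Println(expand("1"))     // 0.0.0.1
    fmt.Println(expand("10.1"))  // 10.0.0.1
    fmt.Println(expand("127.1")) // 127.0.0.1
}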
Well I did think of that, it technically is not a Class-A because it should have 2 parts. My conclusion was that maybe what happens is that "1", while incorrect, is flexibly parsed as "1.0" and thus it would become "1.0.0.0". But you're right that, given the uint32 representation does exist, the most correct thing to do seems to interpret it as "0.0.0.1"...
unless an exception to the rule exists somewhere, and 'ip' is actually doing it right!
Conclusion: explicit is better than implicit, and what's more, in this case the implicit alternative was depending on a non-standard choice made in the specific tool for obscure, legacy reasons.
This fits almost any situation (explicit vs. implicit) and I'm a big fan - when mentoring I tend to say "yes that was the default when you looked today, how do you know it won't change tomorrow? If you want specific behaviour, be explicit don't trust defaults." (more or less, depends on subject - commandline switches to code loops, same advice)
What I wanted to express here (and did badly) is that crossing paths with this arcane Class-A style IP address is something so strange nowadays... in my case in more than 10 years professionally working as a developer, I had seen it exactly once and even then, didn't recognize it for what it was.
The code snippet was just an extra curiosity in case anyone found it useful.
> So, it’s a de-facto standard that boils down to mostly “what did 4.2BSD understand?“
By the way 4.2BSD was being compatible with older or contemporary implementations, like ITS which was running TCP before any Unix was.
For example, plenty of machines back then used octal as the preferred human representation. In fact that's why a leading 0 makes a numeric constant octal in C: Unix and C's ancestor B were initially developed on the 18-bit (six octal digits) PDP-7, and the 16-bit PDP-11 versions came later.
It was a surprising amount of work to figure out all the different formats an IP address can be shown in and convert a given IP into all those formats.
Bigger nitpick: as per RFC 5952, canonically :: is ::. 0000:0000:0000:0000:0000:0000:0000:0000 is a valid way of writing the same address, but it's not the canonical way.
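For what it's worth, round-tripping the long spelling through a parser/formatter that does RFC 5952-style zero compression (Go's net package, for instance) gives you back the short canonical form:

package main

import (
    "fmt"
    "net"
)

func main() {
    // Parse the fully expanded spellings and print the canonical forms.
    fmt.Println(net.ParseIP("0000:0000:0000:0000:0000:0000:0000:0000")) // ::
    fmt.Println(net.ParseIP("2001:0db8:0000:0000:0000:0000:0000:0001")) // 2001:db8::1
}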
As Go’s net package IP parsing was mentioned, here’s a fun fact: under its API it is impossible to distinguish between an IPv4-mapped IPv6 address and the equivalent plain IPv4 address.
I find this to be a great feature. net.IPNet.Contains takes this into account, so you don’t have to worry about or deal with shenanigans like IPv4 mapped addresses. It makes implementing SSRF protection much easier.
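A quick demonstration of that behaviour with the standard net package:

package main

import (
    "fmt"
    "net"
)

func main() {
    mapped := net.ParseIP("::ffff:192.0.2.1")
    plain := net.ParseIP("192.0.2.1")

    // Both parse to the same value, so they compare equal and
    // To4() succeeds for either one.
    fmt.Println(mapped.Equal(plain))       // true
    fmt.Println(mapped.To4(), plain.To4()) // 192.0.2.1 192.0.2.1

    // And IPNet.Contains sees through the mapping too.
    _, block, _ := net.ParseCIDR("192.0.2.0/24")
    fmt.Println(block.Contains(mapped)) // true
}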
Since I write a Lua-parsed DNS server which works with IPv6, even when compiled for an ancient version of MINGW on Windows XP (which has IPv6 support but no built-in IPv6 parser), I had to write an IPv6 address parser (no inet_pton(), which is what most programs use for IPv6 parsing, on that system).
No, I did not add dotted quad notation to the parser. No, you can not have more than four hex digits in a single quad; 00000001:2::3 is a syntax error. It supports “normal” stuff like ::, ::1, 2001:db8::1, and even non-normal stuff like “2001-0db8-1234-5678 0000-0000-0000-0005” (to be compatible with the really basic IPv6 parser I put in MaraDNS’s recursive resolver nearly two years ago), but does not support any of the IPv6 corner cases in the linked article.
Love it! No conversation about SUS is complete without Theo bashing the absurdity of some historic bugs being documented as features. :-)
---
I do like the hex notation, though. Especially in the age of /29s and the like, it's way easier to reason about address space in hex than in decimal, which makes little sense at network boundaries like that. It looks like ping supports most of these (try `ping 0x08080808`, or `ping 0x08.0x080808`, but note that 0x0808.0x0808 is not valid, only 0x08.0x08.0x0808 would be), but `dig @` doesn't.
BTW, I guess this finally explains why the netmask is often shown as `inet 127.0.0.1 netmask 0xff000000` on the BSDs, which is actually a valid IP address notation, as it turns out!
I'm not convinced these are "cursed". They may be the result of bygone networking conventions, implementation ideas that never came to mainstream fruition, flexibility for use-cases etc. Just because we don't understand something that looks strange, doesn't mean it's cursed, nor that one can simply turn one's nose up and say "I don't understand why these exist so I'll just ignore them when I implement x".
I can help here: these definitely aren't cursed, because curses aren't real. I was exaggerating for comic effect, because this was just a twitter rant that got out of control :)
That said, many of those representations no longer make sense in the modern world, and I'm actively choosing to not support them. That doesn't mean I don't understand why they came about in the first place, au contraire! I'm explicitly deciding that their historical reason for existing no longer applies.
Thank you for the clarification, it does sound like you've done more background research than the linked blog entry may explain. Was this the result of simply reading the RFCs or did you come across other resources that expand on the obsolete IP address representations?
I do think "curse" is a valid technical term, but to me a cursed IP address (or number, or edge case, etc.) is one that behaves significantly differently from other addresses for no self-evident reason. None of these examples are cursed, but 127.0.0.1 is definitely cursed.
The bygone implementers clearly cursed us with their peculiar decision making. Just as we occasionally curse the implementers of the future (either knowingly or unknowingly) with our peculiar decision making.
I think they've got Class A/B/C wrong? Or at least they're using it in a way that I never learnt
> The familiar 192.168.140.255 notation is technically the “Class C” notation. You can also write that address in “class B” notation as 192.168.36095, or in “Class A” notation as 192.11046143. What we’re doing is coalescing the final bytes of the address into either a 16-bit or a 24-bit integer field.
> Traditionally, each of the regular classes (A-C) divided the networking and host portions of the address differently to accommodate different sized networks. Class A addresses used the remainder of the first octet to represent the network and the rest of the address to define hosts. This was good for defining a few networks with a lot of hosts each.
There you go, thanks! Should have properly read the article I linked. So it's been repurposed to be as OP's linked article states? Not so much ranges but the amount of bits in the netmask?
It is the other way around: in the original classful internet, the numerical range of the first octet directly implied what CIDR would later call the netmask length. The original IPv4 implementations probably did not even have a concept of a netmask; this was instead hardcoded. Implementing the routing decision as a netmask is a nice optimization which then probably inspired the CIDR concept, because at a sufficiently high level the only thing you need for that to work is making the netmask (or at least its length) freely configurable.
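The classful rule really was that simple; here's a sketch of the prefix length implied by the first octet (before CIDR made it configurable):

package main

import "fmt"

// impliedPrefixLen returns the pre-CIDR (classful) network length implied
// by the first octet of an IPv4 address. Class D/E (multicast/reserved)
// addresses have no network/host split, so they're reported as -1 here.
func impliedPrefixLen(firstOctet byte) int {
    switch {
    case firstOctet < 128: // Class A: 0xxxxxxx
        return 8
    case firstOctet < 192: // Class B: 10xxxxxx
        return 16
    case firstOctet < 224: // Class C: 110xxxxx
        return 24
    default:
        return -1
    }
}

func main() {
    fmt.Println(impliedPrefixLen(10))  // 8  (10.x.x.x was a /8)
    fmt.Println(impliedPrefixLen(172)) // 16 (172.16.x.x was a /16)
    fmt.Println(impliedPrefixLen(192)) // 24 (192.168.0.x was a /24)
}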
But basically it's a weird anachronism. I'm not sure if NTP will actually bind to those addresses using the TCP/IP stack, or if someone just got lazy and co-opted the IP address parser for off-label use.
What is the use-case of a decimal representation of a v6 address or a 32-bit int representation of an ipv4 address?
I’ve never had someone tell me, “see if you can ping 143267841”. I’ve worked in networking for coming up on 30 years now and just haven’t found the use.
I suspect it's actually the other way around. On the wire, a v4 address is four bytes. uint32 is the natural type for this. So when we start looking at CIDR scopes, /24 means the first 24 bits of those 32. "The first 24 bits of 4 bytes" sounds wrong to me, "the first 24 bits of 32 bits" sounds logical.
So as I see it - 143267841 (or 0x88A1801) is the address, and quad-dotted decimal is a (slightly more) human-readable representation of it.
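A tiny sketch of that view: a /24 membership test is literally a mask over the top 24 of those 32 bits (the second address below is a made-up neighbour, just for illustration):

package main

import "fmt"

// sameSubnet reports whether two addresses, as big-endian uint32 values,
// share their first `bits` bits, i.e. fall in the same /bits network.
func sameSubnet(a, b uint32, bits uint) bool {
    mask := ^uint32(0) << (32 - bits)
    return a&mask == b&mask
}

func main() {
    a := uint32(0x088A1801) // 143267841, the address mentioned above
    b := uint32(0x088A18FE) // hypothetical neighbour in the same /24
    fmt.Println(sameSubnet(a, b, 24)) // true: only the last octet differs
    fmt.Println(sameSubnet(a, b, 32)) // false
}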
Internally, I would imagine that almost every IPv4 stack uses 32-bit ints to represent an address. It's not that crazy to think this would leak out somewhere.
I've written (un)parsers where we would just treat IPv4 addresses as integers because A) that is how they were treated in the binary data and B) given what we were doing with the data, we didn't actually care about the IPv4 field.
Or worse, in one case I've had to deal with both (plus another surprise twist):
public class IP {
    byte[] value;
    // if true, value is a variable-length ASCII dotted octet,
    // if false it is length 4 - with LSB in value[0].
    bool isString;
}
IPC at least. If you want to pass an IP address (whose natural native representation is a uint32) from program to program as text, having to format it as dotted decimal would be just unnecessary and inconvenient.
Same here, pity that many things I've learned through that jewel of a site have been rendered useless by the same very constant updates Fravia himself warned us against.
Wow, this.
One thing I didn’t see mentioned was “0”. You mentioned it, but it didn’t click for me that it maps to something I know works in some implementations: “ping 0” behaves like “ping 127.0.0.1”.
Maybe ping is treating 0 like 0.0.0.0 aka INADDR_ANY ( https://en.wikipedia.org/wiki/0.0.0.0 ). And interpreting it as all the IPv4 addrs mapped to the local machine (including localhost).
That's why things like the textual representation of IP addresses need to be rigorously and formally specified using unambiguous syntax notation. Implementations can then be formally verified to comply with that syntax spec. In the end I would love to have a formally verified IP address parsing library for every major mainstream programming language that everybody could rely upon, instead of everyone trying to write their own parser. That's a dream.
I wrote a little applet where you can put in a class A decimal IP address, and it gives you the 3×4 representations mentioned in the article: https://jtvjan.nl/tools/cursed_ipv4.html
If you count mixed representations, there would be 120 possibilities, but the tool doesn't generate those.
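For anyone who'd rather script it, here's a rough Go sketch printing a handful of the unmixed formats (the flat decimal/hex/octal spellings plus the two- and three-part groupings from the article):

package main

import (
    "encoding/binary"
    "fmt"
    "net"
)

func main() {
    ip := net.ParseIP("192.168.140.255").To4()
    n := binary.BigEndian.Uint32(ip)

    fmt.Println(ip)                                  // 192.168.140.255 (dotted quad)
    fmt.Printf("%d\n", n)                            // 3232271615 (flat decimal)
    fmt.Printf("0x%X\n", n)                          // 0xC0A88CFF (flat hex)
    fmt.Printf("0%o\n", n)                           // 030052106377 (flat octal)
    fmt.Printf("%d.%d\n", ip[0], n&0xFFFFFF)         // 192.11046143 ("Class A" two-part)
    fmt.Printf("%d.%d.%d\n", ip[0], ip[1], n&0xFFFF) // 192.168.36095 ("Class B" three-part)
}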
I maintain a JavaScript library that does exactly this (called ip-address). Unit tests are very important for handling the esoteric formats, though there are a couple that were new to me in David's post.
One of my motivations for writing the library was being able to grep for IPv6 addresses in text files; it's surprisingly difficult to match all valid representations of a simple IPv6 address as seen in the example here:
I spent hours debugging an issue that boiled down to an IPv4 parser that treated leading zeroes as octal.
Connections to 192.168.123.100 worked as expected. Connections to 192.168.123.034 went to 192.168.123.28. I thought sure it was an issue in my TCP client code, which was handling connections to hundreds of different devices.
Guilty party was Poco::Net library if I recall correctly. I can maybe see this making sense if you provide four octal digits (0377), but not three, and I have a hard time believing anybody has ever used this on purpose.
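A minimal sketch of how that class of bug creeps in when each octet goes through a base-detecting integer parse (base 0 in Go; strtol and friends with base 0 behave the same way):

package main

import (
    "fmt"
    "strconv"
)

func main() {
    // Base 0 means "infer the base from the prefix": a leading 0 selects
    // octal, so the "034" from 192.168.123.034 parses as 28, not 34.
    n, _ := strconv.ParseInt("034", 0, 64)
    fmt.Println(n) // 28

    // The same string with an explicit base 10 gives the expected value.
    n, _ = strconv.ParseInt("034", 10, 64)
    fmt.Println(n) // 34
}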
It’s how any reasonable software represents IPv4 addresses. Dotted decimal is only for human convenience (and honestly, I’d argue that 0xDEADBEEF would be just as convenient, after all people turned out to handle HTML/CSS hex colors just fine!)
This is great! If I'm honest with myself, one thing keeping me from configuring IPv6 as an option locally was the intimidating addresses. This is a great explainer, I finally feel like I "get it".
I don't know if it counts as in practice, but I use the notation he chose not to parse quite a lot on internal networks..
ssh 10.0.0.123 is already a nice quick address to type out, but ssh 10.123 or ping 10.123 is even quicker.
Works in all kinds of random things. Web browsers of course, but games work just fine too usually, if they hand it off to the system to look up.
I write 127.1 all the time when I'm too lazy to type 127.0.0.1. Then I'm sad when it doesn't work because the nearest ip address parser wasn't written in the previous millennium.
Oh, yeah, and 1.1 is the only DNS server address I memorized.
I use mtr 1.1 all the time. (Like, literally all the time, I normally have it running in the background so I can see whether it’s my computer’s wi-fi adapter, the wi-fi router or the local ISP that’s playing up this time.)
I remember it was a few days after they came out with 1.1.1.1 and 1.0.0.1 that it dawned on me that I could drop the zeroes. I’d been wondering why they hadn’t chosen 1.2.3.4, but once I realised 1.0.0.1 was just 1.1, it became fairly obvious why they had chosen it.
(P.S. mtr’s stripchart with latency information is super great for this sort of thing; I have MTR_OPTIONS=--displaymode=2 set in my environment.)
I once used some of these weird notations to pack config data (mainly IP addresses) for remote installations into a product-key-like string that field techs could receive over the phone
Writing a parser and saying "I'm dropping support for all these old ways of doing things" seems like poor form.
Unless there is a big reason, never drop backwards compatibility. In this case, supporting all those forms would be very do-able. The best way to support them would be to find some old BSD parsing code and port it, then you can be sure every corner case is handled the exact same way. Handling corner cases differently is a great way to introduce security vulnerabilities and crash/DoS bugs that every user of your library will have to be aware of.
Maintaining such code isn't really a good excuse here either - the code is only going to be a few thousand lines, is self contained with no dependencies, is easy to test, not going to change much with time, etc.
Basically, there is no benefit to removing this feature, so don't break what isn't broken.
There is a good reason: many of the unusual forms are unused except as tricks and exploits. The whole internet uses IPv4 classless routing. There is no value in keeping pre-CIDR forms. Graybeards might object because they have been typing "127.1" for forty years. It's merely an old habit. Who is to say how big a reason is required to "never drop backwards compatibility"?
The way to handle security problems with corner cases is to just return a parse error if something unusual is seen. With security, the rule is to be conservative with what you accept; anything unusual should be rejected.
In cases where backwards compatibility is needed, just use inet_pton() and let the libc maintainers deal with the bug reports (I believe inet_pton() dropped octal and hex support for ipv4 addresses)
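As a point of comparison, Go's net.ParseIP already takes the conservative route here; only the full dotted quad parses, and the shorthand and hex forms are rejected outright:

package main

import (
    "fmt"
    "net"
)

func main() {
    for _, s := range []string{"127.0.0.1", "127.1", "0x7f000001"} {
        fmt.Printf("%-14q -> %v\n", s, net.ParseIP(s))
    }
    // "127.0.0.1" -> 127.0.0.1; the other two -> <nil>
}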