
I've identified several technical problems with this domain, and this isn't an example of how to properly operate DNS. 37signals is setting an absurdly low TTL on these records (10 minutes; the answers never change, I absolutely do not understand the logic behind this TTL), which means every 10 minutes you're re-resolving a local address, through a CNAME (so two DNS round trips, and in my case this resolution took between 115ms and 230ms, not small change):

    [~]$ dig foo.
    foo.	600	IN	CNAME	foo.daze1.xip.io.
    foo.daze1.xip.io.		600	IN	A
Concerningly, ns-1.xip.io is also broken; it does not serve NS records for its own zone, instead relying upon the SOA record and the upstream glue, which I'm shocked works:

    [~]$ dig +short NS xip.io
The nameserver delegation from nic.io is also broken:

    xip.io.			86400	IN	NS	ns-1.xip.io.
    xip.io.			86400	IN	NS	ns6.gandi.net.
    ;; Received 86 bytes from 2001:678:5::1#53(b.nic.io) in 60 ms
Oh, well that's interesting, Gandi is a backup for their custom daemon, eh? So did they implement AXFR, IXFR, and notify and such to Gandi? Well, let's ask Gandi:

    [~]$ dig @ns6.gandi.net. SOA xip.io
    ;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 3222
Oh, guess not. The long and short of this is that, for DNS purposes, a custom daemon is almost never the answer. This could have been accomplished with BIND fairly easily, and the zone would be functional as well.

It's a cool idea, but there are some other problems too. I just want to list them to help the developers; I'm not trying to rain on a parade.

As in the parent comment, a CNAME is returned for arbitrary names;

  % dig foo.     
  foo.	600	IN	CNAME	foo.a2eo0.xip.io.
  foo.a2eo0.xip.io.	600	IN	A
but only if the request is of type A. Requests of other types return invalid NXDOMAIN responses - invalid because they contain no SOA in the authority section. A CNAME is supposed to be returned for queries of any type for a given name; not doing so is dangerous, as it can poison caches. Not returning the CNAME even for a query of type "CNAME" is particularly harmful.

Responding with no name would be bad on its own, but saying that no name exists is clearly wrong and can be used to poison caches (the NXDOMAIN is cacheable). Note that most browsers and clients will now perform an AAAA lookup prior to the A lookup - poisoning their own cache if they happen to have a copy of the SOA for xip.io in cache (the SOA record hints to the negative cache lifetime).

It's not clear that using an intermediate CNAME does anything useful - why not just return an A record with a billion-second TTL value? As-is, it merely adds a round-trip (the CNAME and A are not returned in one pass by ns-1.xip.io).

Additionally, ns-1.xip.io does not mark the "authoritative answer" bit in any responses - which will cause issues with some resolvers.
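
For anyone who wants to check the AA bit themselves, here is a minimal sketch of decoding the fixed 12-byte DNS header (RFC 1035 layout). The sample packet is a hypothetical hand-built header, not a capture from ns-1.xip.io, and the function name is my own.

```python
import struct

# RCODE values from RFC 1035 that come up in this thread.
RCODES = {0: "NOERROR", 3: "NXDOMAIN", 5: "REFUSED"}


def parse_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte header at the start of a DNS message."""
    if len(packet) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    # Six big-endian 16-bit fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    msg_id, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", packet[:12]
    )
    rcode = flags & 0x000F
    return {
        "id": msg_id,
        "is_response": bool(flags & 0x8000),    # QR bit
        "authoritative": bool(flags & 0x0400),  # AA bit
        "rcode": RCODES.get(rcode, str(rcode)),
        "answers": ancount,
        "authority": nscount,
    }


# Hypothetical header: a response (QR=1) with AA clear and RCODE=5 (REFUSED).
sample = struct.pack("!6H", 3222, 0x8005, 1, 0, 0, 0)
info = parse_header(sample)
```

A resolver that insists on authoritative data would look at `info["authoritative"]` before trusting the answer, which is where a server that never sets AA runs into trouble.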

But, still a neat idea. Question for the developers;

It's clear that the intermediate CNAME represents an encoding of the IP address, e.g.;

  foo.	600	IN	CNAME	foo.a2eo0.xip.io.
here "a2eo0" is an encoding of , but then;

  foo.	600	IN	CNAME	foo.k201s.xip.io.
are you using some kind of cipher?

PS. Everybody please use for IP addresses in examples and documentation, and 2001:db8::/32 for IPv6. See RFC3330/5735 and RFC3849. It's good karma ;-)
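
A small sketch of how you might lint examples for this, assuming the documentation ranges as I understand them from the RFCs: 192.0.2.0/24, 198.51.100.0/24 and 203.0.113.0/24 for IPv4 (RFC 5735), and 2001:db8::/32 for IPv6 (RFC 3849). The helper name is my own.

```python
import ipaddress

# Prefixes reserved for documentation (RFC 5735 for IPv4, RFC 3849 for IPv6).
DOC_NETS = [
    ipaddress.ip_network(prefix)
    for prefix in (
        "192.0.2.0/24",
        "198.51.100.0/24",
        "203.0.113.0/24",
        "2001:db8::/32",
    )
]


def is_documentation_address(addr: str) -> bool:
    """Return True if addr falls inside one of the documentation prefixes."""
    ip = ipaddress.ip_address(addr)
    # Compare only against networks of the same IP version.
    return any(ip.version == net.version and ip in net for net in DOC_NETS)
```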

Source code for encode/decode is found here:


Thanks! it reads like a 36-ary encoding of an IP address in host byte order, rather than network byte order, which is why it seems to jump around so much.
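
To illustrate what I mean, here is a rough sketch of that reading of the code, not the actual xip.io implementation. It assumes "host byte order" means little-endian (first octet in the least significant byte), and the function names are mine.

```python
import ipaddress

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def encode_ip(ip: str) -> str:
    """Base-36 encode an IPv4 address read as a little-endian 32-bit integer."""
    octets = ipaddress.IPv4Address(ip).packed
    value = int.from_bytes(octets, "little")  # assumption: little-endian host
    if value == 0:
        return "0"
    out = []
    while value:
        value, rem = divmod(value, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def decode_label(label: str) -> str:
    """Inverse of encode_ip: base-36 label back to dotted-quad notation."""
    value = int(label, 36)
    if value > 0xFFFFFFFF:
        raise ValueError("label does not fit in 32 bits")
    return str(ipaddress.IPv4Address(value.to_bytes(4, "little")))
```

With this byte order the last octet lands in the most significant byte, so two neighbouring addresses such as 192.0.2.1 and 192.0.2.2 differ by 2^24 before the base-36 step. That would explain why the labels look unrelated.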

Interestingly, it encodes as 0.xip.io , but then refuses to answer for 0.xip.io. Why isn't obvious to me from reading the code, perhaps some kind of overflow condition is triggered by the right shift.

Hold on - who cares? This isn't meant for use in production, right? Or am I missing something... from what I can tell the purpose here is that I can set up a domain that will resolve to an address on my LAN without having to modify, say, /etc/hosts on my android device (which I wouldn't even know how to do) or set up a DNS server on my LAN (which is of course possible but a lot more long-winded than the solution proposed here).

I can't see how it matters whether or not there are problems with this from the perspective of being a "correct" DNS server so long as it works for its intended purpose (testing things on your local network from a bunch of different devices).

Edge cases and niggling "works ok for me" problems really do matter for something that's intended to be used with testing.

Otherwise it's easy to rat-hole for a long time trying to determine why your test isn't working, when it turns out it was a problem between your DNS resolver and an upstream domain.

Problems between DNS resolvers and DNS authoritative servers are classically intermittent; they usually depend on the ordering of a chain of steps to occur. For example, I might get a resolution failure for a xip.io record if one of the following sequences occurs;

  Client asks resolver for AAAA - gets NXDOMAIN from ns-1.xip.io, caches it
  Client asks resolver for A - responds with NXDOMAIN
but if the queries happen in the reverse order, things are fine.
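
Here is a toy model of that ordering problem. It is only a sketch under stated assumptions: the cache stores NXDOMAIN per name rather than per type (NXDOMAIN means the name does not exist at all), it ignores TTLs, and `broken_server` and `foo.example` are made-up stand-ins for the behaviour described above.

```python
class ToyCache:
    """Toy resolver cache: NXDOMAIN is remembered per name, not per type."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.positive = {}     # (name, qtype) -> cached answer
        self.nxdomain = set()  # names believed not to exist

    def resolve(self, name, qtype):
        if (name, qtype) in self.positive:
            return self.positive[(name, qtype)]
        if name in self.nxdomain:
            # Served from the negative cache; the upstream is never asked.
            return "NXDOMAIN"
        answer = self.upstream(name, qtype)
        if answer == "NXDOMAIN":
            self.nxdomain.add(name)
        else:
            self.positive[(name, qtype)] = answer
        return answer


def broken_server(name, qtype):
    # Models the reported behaviour: only A queries get a real answer.
    return "CNAME+A" if qtype == "A" else "NXDOMAIN"


aaaa_first = ToyCache(broken_server)
order_1 = [
    aaaa_first.resolve("foo.example", "AAAA"),
    aaaa_first.resolve("foo.example", "A"),
]

a_first = ToyCache(broken_server)
order_2 = [
    a_first.resolve("foo.example", "A"),
    a_first.resolve("foo.example", "AAAA"),
]
```

In this model `order_1` ends with the A lookup failing, while `order_2` gets its A answer because it was cached before the NXDOMAIN arrived.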

or, another example;

  Client asks an AD DNS server to perform resolution, server chokes on lack of the AA bit in authoritative answer.
And so on.

But then in either case if a tester fires up nslookup or dig, everything works on the command line, and so they may spend quite a while trying to figure out why their library routine for connecting to their service isn't working.

What I took away from this, though, is that they just want to be able to load a web application they're developing on their iPad/iPhone or otherwise "restrictive" device that doesn't allow you to easily make local DNS modifications (such as /etc/hosts files).

I honestly don't understand what you've written above (although I re-read it a couple of times, I guess I'm just not knowledgeable enough about DNS for it to make sense) but can you see those issues impacting the ability of someone to load an application on their iPad in order to test it out?

I guess the problem might arise that people start to use this "not as originally intended" and get into all sorts of strife but for the particular scenario they were originally intending it for it seems perfectly adequate, no?

The issues reported so far could absolutely cause errors across any device, including an iPad. The problems that the DNS setup will cause affect resolvers - which can be a combination of software in your browser, your c library, local caching daemon, on your cable/dsl modem, in your ISP, and a public provider.

Many crufty resolvers - on things like wifi routers in particular - don't deal well with the lack of an AA bit, or a REFUSED answer. So a tester could easily end up with "works for me" and "not for me" reports that are really just down to the particulars of their network and resolver software, whether they have IPv6 enabled, and so on.

Edited to add: Again, I don't mean to rain on the developers' parade. It's a great idea.

Writing DNS implementations is hard, and requires a certain kind of technical archeology to get to grips with the detail. DNS is a tricky protocol, chaotically and ambiguously documented. I've helped write 3 different ones - and I still get things wrong. And that said; anyone interested in writing hardcore DNS implementations that have to operate on the scale of microseconds per query should drop me a line.

The state that xip.io is in right now could, theoretically, result in DNS failures 50% of the time on any device.

Actually, a low TTL is ideal for testing purposes. While you're correct that a query will never give a different answer, a low TTL ensures that the name won't linger in any resolver's cache for very long, which makes it less easy to discover. This also makes the arbitrary string chosen as a subdomain particularly ephemeral, which is important when testing name-based virtual hosts. Why leave a testing domain stuck indefinitely in the cache of a resolver I don't control? I'd rather have it disappear when I'm not using it.

It depends on what you're working on. If you're developing a SaaS application where an individual instance should be providing group features based on the hostname, suddenly host names become a development detail. Though I agree that in most web applications this isn't the case.

I'm sorry, I edited my comment (the broken zone is more troubling to me than the impetus for using it) and made yours look out-of-place. You responded to something accurate the first go, and I agree with you.

I find it interesting that we are required to do a second round-trip for the CNAME when it is available in the same bailiwick and could be sent as part of either the ADDITIONAL SECTION or as part of the ANSWER SECTION. Would you happen to know what the RFC has to say on this? I understand that CNAMEs are not allowed to exist with other records, but returning it in the ADDITIONAL SECTION shouldn't be a cause for concern.

It may not be serving NS records for itself because, judging by what is available on GitHub, the server is started on port 5300. So I am assuming that there is some sort of DNS resolver/cache sitting in front of it that may be stripping them out. Same thing with it not responding with the "Authoritative answer" bit set...

Although that is simply speculation, maybe they did put the node service directly on the internet.

Once a CNAME exists for a name, no record of any other type may exist for that same name (it's an override for all types).

But for a query like this, a server is allowed to return both a CNAME and its relevant target(s) ... as long as they are within-bailiwick. It can go right into the answer section, e.g.;

  % dig example.allcosts.net @ns-22.awsdns-02.com.            
  example.allcosts.net. 300     IN      CNAME   at.allcosts.net.
  at.allcosts.net.      3613    IN      A
this is permitted because the original type of the query was "A", so we can include it as an answer, and it will avoid a round-trip on behalf of the recursor. That's all regular RFC1035 behaviour.
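
From the recursor's side, the saving looks roughly like this. It's a simplified sketch: records are plain `(owner, type, data)` tuples rather than real wire-format data, the function name is mine, and 192.0.2.10 is a placeholder documentation address, not the real answer.

```python
def resolve_from_answer(qname, answer):
    """Follow CNAMEs inside a single answer section.

    Returns (addresses, pending_name). pending_name is None when the chain
    ended in A records, otherwise it is the name that still needs a query.
    """
    cnames = {owner: data for owner, rtype, data in answer if rtype == "CNAME"}
    name = qname
    seen = set()
    # Walk the CNAME chain, guarding against loops.
    while name in cnames and name not in seen:
        seen.add(name)
        name = cnames[name]
    addresses = [
        data for owner, rtype, data in answer if rtype == "A" and owner == name
    ]
    return addresses, (None if addresses else name)


# CNAME and target A together in one answer: no further query needed.
one_pass = resolve_from_answer(
    "example.allcosts.net.",
    [
        ("example.allcosts.net.", "CNAME", "at.allcosts.net."),
        ("at.allcosts.net.", "A", "192.0.2.10"),  # placeholder address
    ],
)

# CNAME only: the recursor has to go back and ask for the target.
two_pass = resolve_from_answer(
    "example.allcosts.net.",
    [("example.allcosts.net.", "CNAME", "at.allcosts.net.")],
)
```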

It's more common to use the additional section to include details about the target(s) of MX, SRV and NS records. That's more of a "I know you asked for an MX record, but you're going to need this A / AAAA record too pretty soon, so here it is in the additional section" kind-of thing. The additional sections in the responses to the following queries should be illustrative;

  dig NS  ns_example.allcosts.net @ns-22.awsdns-02.com
  dig MX  mx_example.allcosts.net @ns-22.awsdns-02.com
  dig SRV srv_example.allcosts.net @ns-22.awsdns-02.com
Something I forgot to mention; Being that DNS is the chaotically documented protocol that it is, I'm glad they launched early with a minimally viable product. It's the best way to get feedback like this for free! I think the real scope for real-world error is something like 1 in 100 users experiencing a problem as-is. Most resolvers are hyper tolerant of any amount of DNS crud, because they've been beaten on so much by poor implementations over the years. But the 1% of the time it breaks will cause you hours of pain in debugging.

In fact, returning both the CNAME and the A in the initial response is required. Returning just the CNAME and setting NOERROR tells a recursor 'the target name exists but I do not have an A record for it'. Luckily, all recursors I am aware of are stubborn and will then ask for the A anyway.

This is a good example of where things get tricky in DNS. A resolver could never really infer non-existence of the A record from mere non-presence in an answer like that.

Although RFC1034 outlines that a server typically would do that, it also says that it shouldn't include data that it's not authoritative for.

So a conflict arises when you CNAME to a sub-delegated child zone. E.g.

  foo.example.com IN CNAME baz.example.com
That response may come from a server that's authoritative for example.com - and so "baz.example.com" is technically in-bailiwick from the point of view of a resolver who has made only this query.

However baz.example.com may itself be delegated to other nameservers, and so is "really" out of bailiwick. But the response won't signal this to resolvers at this stage (though in theory could via the additional section).

The simplest reason why resolvers ignore it though is that there's no SOA in the response from which to derive the negative caching time - so it wouldn't know how long to cache that non-existence - and almost all resolvers are caches.
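
As a sketch of that rule (as I read RFC 2308), the negative caching time is the smaller of the SOA record's own TTL and its MINIMUM field, and with no SOA there is nothing to derive it from. The dict layout and function name here are just illustrative.

```python
from typing import Optional


def negative_cache_ttl(authority: list) -> Optional[int]:
    """Return how long a negative answer may be cached, or None if unknown.

    authority is a list of dicts describing the authority section, e.g.
    {"type": "SOA", "ttl": 3600, "minimum": 300}.
    """
    for rr in authority:
        if rr["type"] == "SOA":
            # RFC 2308: use the minimum of the SOA TTL and SOA.MINIMUM.
            return min(rr["ttl"], rr["minimum"])
    # No SOA in the response: the resolver has no negative TTL to use.
    return None


with_soa = negative_cache_ttl([{"type": "SOA", "ttl": 3600, "minimum": 300}])
without_soa = negative_cache_ttl([])
```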

Even if it is required, some authoritative DNS server implementations don't do it. So far I have found that the BIND that came with FreeBSD 8.0 doesn't, and neither does tinydns.

So recursors would have to account for possibly broken implementations and try the query anyway.

Ah, here is where implementation becomes important... not all of the authoritative DNS servers I have tested actually have this behaviour. So far I have found that BIND and tinydns don't send the A record even though it is in bailiwick for the CNAME.

Hopefully they will solve this problem at some point, but if you want an alternative fast, couldn't you consider using services like dyndns, noip, etc?

I use the following script which I found somewhere (can't remember where) and modified: https://gist.github.com/2894514

You can run it:

    avahi_publish.py service1 service2

    curl http://service1.local:8000/bla
It's linux only of course :)

Patches welcome.

I can't patch your registrar misconfiguration, but, thanks for the response.


(The author of the software wrote a comment here: "So you're just here to shit on things?", which he has since deleted.)

I genuinely and honestly cannot log into your Gandi account and fix your nameserver delegation, so that means I'm just here to shit on things? That's a logical leap for you? You are delegating xip.io to a nameserver that is refusing queries for your zone; that's seriously broken and can result in resolution failures, making your clever hack worthless.

I don't know why I bother providing feedback, since people from your school of thought (I'm looking at the 37signals community as a whole, here, which you're being a shining steward of) just get defensive and take your software being broken personally. You wrote a poor DNS server. Read the spec, study BIND's or NSD's source to understand the years of work that went into this before you, and understand the problems I've pointed out. I just get annoyed when people flagrantly misimplement DNS, because that starts trends, like Heroku suggesting for a long time that you use a CNAME for @ (don't do that).

I'm not making this up: http://i.imgur.com/zFNkV.png

> You wrote a poor DNS server. Read the spec, study BIND's or NSD's source to understand the years of work that went into this before you, and understand the problems I've pointed out.

Why? xip.io exists and works. If he had to read hundreds of pages of technical specs or thousands of lines of source to implement it, it wouldn't exist.

Feel free to make your own spec-compliant or better-working version though; Sam's done the same for RVM, and I'm sure he wouldn't have any problem with better software existing.

Bravo. I sometimes wish HN had an option to filter out 37 Signals items...

sscheper, I love your average on HN (-.2). In response to jsprinkles: I do believe this was just a hack and not exactly intended for public consumption but someone decided "hey that's pretty cool let's chuck it on the web". Which has the obvious results of the opinions of hundreds :P

This was an announcement; "someone" in this case is a co-worker of the author. The author of xip.io (sstephenson) came on thread to discuss this and a related announcement about Pow.
