I'm always interested to know how long these infrastructure changes took from project initiation to launch, and whether it was a dedicated project team or something completed alongside BAU tasks.
These details are almost never included in these write-ups. Anyone have any guesses?
This was a roughly six month project for a single engineer spending around 75% of their time on it, with help from other folks along the way for code reviews and the like. The first three months were research, planning, and implementation; the latter three were a very careful rollout, migration from the old system to the new, and finally decommissioning of the old system.
Thanks very much for the reply. Very useful. I think these kinds of details really help people in other organizations who might want to undertake similar projects.
Do queries to github.net stay internal or do you also sync github.net zones to Route53/Dynect ... just in case?
We have a similar setup with unbound and nsd (no need for powerdns in our case). Even then it took a while to get right, because JVM apps in particular love to hang for no apparent reason on DNS lookups. You also need to set networkaddress.cache.ttl and friends, since the JVM doesn't honor DNS TTLs by default.
Running unbound on every single machine has saved us a lot of downtime.
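For reference, networkaddress.cache.ttl is a Java security property rather than a regular system property, so a sketch of where it usually goes (values are just examples):

    # $JAVA_HOME/conf/security/java.security (JDK 9+), or
    # $JAVA_HOME/jre/lib/security/java.security on JDK 8
    networkaddress.cache.ttl=60
    networkaddress.cache.negative.ttl=10

    // or programmatically, early in startup before any lookups get cached:
    java.security.Security.setProperty("networkaddress.cache.ttl", "60");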
Nearly all of our internal zones stay internal and aren't synced to an external provider. In a few cases we need to be able to look up internal zones from outside our network, and those zones live both internally and externally.
We use the MySQL backend and the HTTP API; a few small nits, but for our purposes it has worked very well thus far. Note that our authorities never see production traffic other than AXFRs from our "edge" hosts, so I can't say how well it works for other use cases.
What's the reason you've chosen MySQL over the bind backend when you're using the API anyway? I have to make a similar decision soon and I'm not really sure yet; any insight would be appreciated.
Full access (read and write) to the PowerDNS HTTP API requires one of their generic SQL backends (see https://docs.powerdns.com/md/httpapi/README/), such as MySQL. The bind backend only supports reading from the API; changes to zones would need to be made on the file system and/or with pdns_control. Beyond that, having all our records queryable via SQL has been nice for debugging and digging into our own DNS records, record types, and so on. Lastly, backends like the MySQL one allow for things like auto-generating serials and attaching comments to the DNS data.
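To give a feel for it, a rough sketch (the API key, port, and names are placeholders, and the URL prefix differs between PowerDNS 3.x and 4.x):

    # list zones / dump one zone over the HTTP API (4.x style path shown)
    curl -s -H 'X-API-Key: changeme' http://127.0.0.1:8081/api/v1/servers/localhost/zones
    curl -s -H 'X-API-Key: changeme' http://127.0.0.1:8081/api/v1/servers/localhost/zones/example.net.

    -- and with the gmysql backend the same data is just rows you can query:
    SELECT name, type, content, ttl FROM records WHERE name = 'www.example.net';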
I'm curious whether they're using DNSSEC at all. I notice they're using Dynect for this, and in my experience DNSSEC and Dyn do not get along (unless you're not using any of their special features like geotargeting), so I'm interested in hearing how they've managed to get all of that working.
I'm curious why people ask about DNSSEC support. None of the major browsers support validating it.
Even to validate DNSSEC records yourself, there is only a single website available[1] (which doesn't even have TLS). I want DNSSEC to catch up, but the level of adoption is a joke.
You're not limited to just browsers; a perfect use case for DNSSEC is in combination with SSHFP records for SSH, which is incidentally something GitHub relies on heavily, and where support is much better.
Adoption is slow, nobody argues there, but when you've set it up and have routines for rolling keys it's more or less self-maintained.
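For anyone who hasn't seen it, the SSHFP side is pretty small; a rough sketch (hostname is made up, and it only buys you anything if the client's resolver validates DNSSEC):

    # on the host: print SSHFP records for its host keys, then paste them into the zone
    ssh-keygen -r host.example.net

    # on the client (~/.ssh/config): trust the records only when the resolver
    # reports them as DNSSEC-validated (AD bit); otherwise ssh still prompts
    Host *.example.net
        VerifyHostKeyDNS yes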
Google Public DNS will return SERVFAIL if validation fails, which is a step in the right direction.
There are plenty of tools to validate DNSSEC, even with TLS[0]. But I'm not sure why you would need a webpage to do it. You can easily grab the root keys and validate the whole chain using dig on your own computer.
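For example (any signed name works here; the 'ad' flag in the reply means the resolver validated the answer):

    # ask a validating resolver for the signatures and check for the 'ad' flag
    dig +dnssec A example.com @8.8.8.8

    # or validate locally from the root trust anchor with delv (ships with BIND 9.10+)
    delv +vtrace A example.com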
I don't see any mention of HTTPS support for custom domains. I wonder if this helps move the needle on that. I had moved a lot of project hosting to my paid GitHub account, but SSL has become a necessity (SEO and privacy), so I'm launching sites on DigitalOcean again. I'd love to have less server config to do, though.
It means that for those zones they explicitly put the IPs of the edge servers in their resolver (Unbound) configuration, so that lookups of names in those zones don't have to go to the root servers and then the TLD (e.g. .com) servers, only to find out that the authority (the edge servers in their design) is in the next rack. Instead they go directly to those edges. This is what gives them the property described in the post: "Additionally, public zones are completely resolvable within our network without needing to communicate with our external providers. This means any service that needs to look up api.github.com can do so without needing to rely on external network connectivity."
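In Unbound terms that's just a stub-zone pointing at the edge hosts; a minimal sketch with placeholder addresses:

    # unbound.conf on the internal resolvers: send lookups for this zone straight
    # to the in-network edge authorities instead of walking out through the roots
    stub-zone:
        name: "github.com"
        stub-addr: 192.0.2.10
        stub-addr: 192.0.2.11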