
Spotify’s Love/Hate Relationship with DNS - yarapavan
https://labs.spotify.com/2017/03/31/spotifys-lovehate-relationship-with-dns/
======
jlgaddis
What Spotify calls a "stealth primary" has typically been referred to as a
"hidden master" for 15 years or so. Googling that term will turn up more
relevant results, for anyone looking to do something similar.

~~~
inopinatus
To be clear, a strict implementation of your DNS means that this host appears
in the MNAME field of your SOA record. Not so stealthy or hidden, despite the
moniker. You might ask "who on earth looks at the MNAME field of SOA records",
I answer "well I do".

However there is no requirement that the MNAME host be willing to answer
queries. There _is_ an expectation that the host in the MNAME field accepts
dynamic DNS updates, if one is using RFC2136-style dynamic DNS, although I
didn't get the sense Spotify were doing so.
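
For anyone who wants to check what their own zone advertises, here is a minimal sketch of pulling the MNAME out of an SOA record written in presentation format. The sample record and the `parse_soa` helper are made up for illustration, and it assumes comments have already been stripped; a real check would fetch the record with dig or a resolver library.

```python
from typing import NamedTuple


class SOA(NamedTuple):
    mname: str    # primary name server for the zone
    rname: str    # responsible mailbox, with "@" written as "."
    serial: int
    refresh: int
    retry: int
    expire: int
    minimum: int


def parse_soa(rdata: str) -> SOA:
    """Parse SOA RDATA in presentation format (the part after 'SOA')."""
    # Parentheses only group a multi-line record, so they can be dropped.
    fields = rdata.replace("(", " ").replace(")", " ").split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 SOA fields, got {len(fields)}")
    mname, rname = fields[0], fields[1]
    return SOA(mname, rname, *(int(f) for f in fields[2:]))


# Hypothetical record, not taken from any real zone.
soa = parse_soa(
    "ns-hidden.example.com. hostmaster.example.com. "
    "2017033101 3600 600 604800 300"
)
print(soa.mname)
```

If the name printed here is your "hidden" primary, it is visible to anyone who asks for the SOA.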

------
guitarbill
> We run our own DNS infrastructure on-premise which might seem a bit unusual
> lately.

I don't think running your own DNS is too uncommon, especially if you have a
lot of on-premise hardware that changes somewhat frequently. However, if you
do this don't run BIND. We found PowerDNS to be much better in terms of
features, user-friendliness, and documentation. Having backends that aren't
zonefiles is a huge win. I've heard good things about Unbound, but haven't used
it in a big environment yet (>1000 machines).

~~~
jlgaddis
I'm a "BIND lover" and have been since the 90s and still use it for public-
facing authoritative name servers. In their case, though, it definitely sounds
like they should consider PowerDNS. It allows for various backends, including
SQL and custom ones, which might fit in well with the "data store" they
mentioned. Instead of all the cronjobs and pushing and pulling, they might be
able to point the authoritative nameservers directly at their "data store" and
cut out a lot of that "plumbing" (it's impossible to know without more
details, of course).

Also, unless there's a huge amount of DNS data changing every 15 mins, they
might gain some speed-ups from sending dynamic DNS updates to the
authoritative nameservers and/or using IXFRs instead of AXFRs.
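
To illustrate why that helps, here is a rough sketch of the idea behind an incremental transfer: ship only the records that changed between two versions of a zone rather than the whole zone. The record tuples and the `zone_delta` helper are assumptions for illustration; a real IXFR (RFC 1995) also brackets the changes with the old and new SOA records, which is left out here.

```python
from typing import Set, Tuple

Record = Tuple[str, str, str]  # (owner name, record type, rdata)


def zone_delta(old: Set[Record], new: Set[Record]) -> Tuple[Set[Record], Set[Record]]:
    """Return (deleted, added) records between two versions of a zone."""
    return old - new, new - old


# Hypothetical zone contents before and after one push.
old_zone = {
    ("web1.example.com.", "A", "10.0.0.1"),
    ("web2.example.com.", "A", "10.0.0.2"),
    ("db1.example.com.", "A", "10.0.1.1"),
}
new_zone = {
    ("web1.example.com.", "A", "10.0.0.1"),
    ("web2.example.com.", "A", "10.0.0.22"),  # address changed
    ("db1.example.com.", "A", "10.0.1.1"),
    ("web3.example.com.", "A", "10.0.0.3"),   # new host
}

deleted, added = zone_delta(old_zone, new_zone)
print("full transfer would carry", len(new_zone), "records")
print("incremental transfer would carry", len(deleted) + len(added), "changes")
```

With a small zone the saving is trivial, but when only a handful of records change in a large zone the gap gets wide.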

(n.b.: unbound only handles recursive DNS, not authoritative.)

~~~
guitarbill
> (n.b.: unbound only handles recursive DNS, not authoritative.)

Yeah, but as you probably also know, the PowerDNS Recursor is separate, so
there's no reason PowerDNS + Unbound couldn't also be a great combination.
Heck, I might even choose that combo so resolvers only have unbound installed
and can never act as authoritative servers.

------
sigil
> _It simply pulls from our DNS data repository, then compiles all the zone
> data via named. With every compile time – which takes about 4 minutes..._

4 minutes seems like an awfully long time for what I'm assuming is a fairly
simple transformation. Any insight as to why? Is named just slow?

~~~
inopinatus
It's not intrinsically slow, no. I've built & run BIND-based infrastructure
spanning >20 sites, >50 servers and >10,000 zones, and changes were compiled
and propagated in seconds. Only a total rebuild & recompile & reload of all
zones & services required time on the order of minutes.

------
adrianratnapala
I got side-tracked by the talk of version control. Do they use a single repo
for everything? It seems like it from the way they talk.

In that case, I am surprised git is a good fit. SVN might have been better,
though some commercial solutions like ClearCase or Perforce would actually be
right for that sort of work-load.

~~~
raverbashing
I assume you never used CC but it's a gigantic pile of crap. It is completely
worthless. Sold on golf courses to clueless higher ups, because most
developers hate it.

But in general I've never seen any commercial source control system beat an
open source one.

~~~
daxelrod
Perforce is mostly a better Subversion. The two have an extremely similar
model, but Perforce handles merges much better and gives more flexibility in
slicing and dicing your local view of the repo (extremely useful for huge
monorepos).

Disclaimer: it's been several years since I've used Subversion, this may have
changed.

------
vinay_ys
DNS for service discovery is very old school stuff that has been known to be
unreliable and requires an unnecessary amount of work. You are better off
using an actual dedicated HTTP-based service for doing service discovery.
Keeping things largely TCP/HTTP-based makes everyone's life simpler.

------
peterwwillis
They didn't even look at their firewall tables before deploying a completely
different OS to production? Wow.

~~~
nailer
This:

> Upon the migration of the final nameserver – you guessed it – DNS died
> everywhere. The culprit turned out to be a difference in firewall
> configuration between the two OSes: the default generated ruleset on Trusty
> did not allow for port 53 on the public interface.

Deploy a new service on a Linux box in the last decade? Poke it through
whatever the distro uses to manage iptables.

It's like a webdev saying "it turns out DROP deletes tables".

------
alex_duf
So DNS as a way to orchestrate sharding? Is this common, or am I right to find
that odd?

~~~
daenney
What makes you think this is about sharding?

~~~
ec109685
The article talks about their use of dns service records to encode shard
locations. I know other companies that do similar things. DNS has to be up, so
rather than adding another dependency, encoding it into that infrastructure is
a valid approach.
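
As a rough sketch of what that can look like, the snippet below parses a few SRV records and returns the endpoints for one shard. The `_shardN._tcp...` naming scheme, hostnames, port, and helper names are all invented for illustration and are not Spotify's actual layout. It also just sorts by weight, whereas a real client would do the weighted random selection described in RFC 2782.

```python
from typing import List, NamedTuple, Tuple


class SRV(NamedTuple):
    name: str
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(line: str) -> SRV:
    """Parse '<name> SRV <priority> <weight> <port> <target>'."""
    name, rtype, priority, weight, port, target = line.split()
    if rtype != "SRV":
        raise ValueError(f"not an SRV record: {line!r}")
    return SRV(name, int(priority), int(weight), int(port), target)


def shard_endpoints(records: List[SRV], name: str) -> List[Tuple[str, int]]:
    """Return (target, port) pairs for the preferred priority of `name`."""
    matching = [r for r in records if r.name == name]
    if not matching:
        return []
    best = min(r.priority for r in matching)  # lower priority value wins
    preferred = sorted(
        (r for r in matching if r.priority == best),
        key=lambda r: -r.weight,  # heaviest weight first (simplification)
    )
    return [(r.target, r.port) for r in preferred]


# Hypothetical zone data; the shard naming scheme is invented here.
ZONE = [parse_srv(line) for line in (
    "_shard0._tcp.users.example.com. SRV 10 50 4070 host-a.example.com.",
    "_shard0._tcp.users.example.com. SRV 20 50 4070 host-b.example.com.",
    "_shard1._tcp.users.example.com. SRV 10 50 4070 host-c.example.com.",
)]

print(shard_endpoints(ZONE, "_shard1._tcp.users.example.com."))
```

Moving a shard then becomes a record change rather than a config deploy to every client.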

------
blorgle
Spotify should consider implementing Consul :P

------
Something1234
What's with the weird font rendering? It looks blurry.

~~~
Twirrim
Comparing Firefox with Chrome, it kind of seems like it is doing more anti-
aliasing on Firefox.

If I split the two windows side by side, the difference is marginal. If I do
them fully maximised, the difference is more pronounced (at least on Windows).

edit: downvotes? really? Do I need to post screenshots or something?

~~~
kuschku
They also set the font-weight on every single line of text, separately, via
html style attribute. To 400 (normal is 300).

That’s very weird.

~~~
notyourwork
> That’s very weird.

They are probably using a WYSIWYG editor that auto-generates the markup, so
when you press return it creates a new paragraph. I agree the output is weird
but I could see how this would happen.

------
leesalminen
Maybe instead of worrying about internally built DNS servers they could allow
me to undo an artist ban from a daily mix. Seriously frustrating.

~~~
daenney
Maybe their engineering blog is about engineering things and customer support
should go through
[https://twitter.com/spotifycares](https://twitter.com/spotifycares) and
[https://support.spotify.com](https://support.spotify.com) instead of an HN
thread.

