
Disclaimer: I'm a CS PhD student who works on distributed service architectures.

It's worth pointing out that this design pattern only makes sense when the entire system lives under one administrative domain. Google owns all of the servers that make up GoogleFS; a cloud provider owns all of the Hadoop nodes in its datacenters; a PaaS provider owns all of its NoSQL datastore nodes; etc. We see a similar pattern at work in Puppet, Chef, Ansible, Func, certmanager, and the like.

Under these circumstances, it's desirable to maintain the authoritative state in a logically centralized place for two reasons. First, doing so makes it easy for the rest of the system to discover and query it. Second, it makes it easier to keep authoritative state consistent with updates. Centralizing control and distributing data lets you address control-plane concerns separately and independently of data-plane concerns.
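To make the split concrete, here's a minimal sketch (illustrative only, not any real system's API) of the centralized-control / distributed-data design described above: one controller holds the authoritative placement state, while data nodes only store chunks and serve reads.

```python
class Controller:
    """Logically centralized control plane: the single source of
    truth for which node holds which chunk."""
    def __init__(self):
        self.placement = {}  # chunk_id -> node_id (authoritative state)

    def assign(self, chunk_id, node_id):
        # All updates go through one place, so state stays consistent.
        self.placement[chunk_id] = node_id

    def lookup(self, chunk_id):
        # Discovery is a single query against the controller.
        return self.placement[chunk_id]


class DataNode:
    """Data plane: stores bytes, knows nothing about global placement."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.chunks = {}

    def put(self, chunk_id, data):
        self.chunks[chunk_id] = data

    def get(self, chunk_id):
        return self.chunks[chunk_id]


# Clients ask the controller *where*, then ask the data node for *what*.
controller = Controller()
nodes = {n: DataNode(n) for n in ("a", "b")}

controller.assign("chunk-1", "a")
nodes["a"].put("chunk-1", b"hello")

holder = controller.lookup("chunk-1")
print(nodes[holder].get("chunk-1"))  # b'hello'
```

Note that the controller never touches the data itself, which is exactly what lets control-plane and data-plane concerns evolve independently.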

However, it stops making sense to centralize the authoritative state (control) once you build a system that spans multiple administrative domains. Which domain gets to host the authoritative state? How do you get the other domains to act on it? Centralization won't work here, unless you can first get the domains to agree on who's the controller (sacrificing their autonomy to decide the state of the system).

We have instead addressed these concerns by distributing responsibility for the authoritative state across domains, and devising ways for them to reach consensus on it. DNS does this by delegating authority for name bindings hierarchically. The Internet maintains routing state by having each AS learn and advertise routes to each other AS via BGP. Bitcoin maintains the blockchain (its authoritative state) by having a majority of nodes agree on the sequence of blocks added to it. DHTs work by sharding the key space AND routing state across their participants.
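The DHT case can be sketched in a few lines. This toy version (node names are made up; real DHTs like Chord or Kademlia add routing tables and replication) shows how hashing shards the key space across participants so that no single domain holds all of the authoritative state:

```python
import hashlib
from bisect import bisect_right


def ring_position(s):
    # Map a node name or key onto a 2^32-slot hash ring.
    return int(hashlib.sha256(s.encode()).hexdigest(), 16) % (2 ** 32)


class ToyDHT:
    def __init__(self, node_names):
        # Each node's position on the ring defines the shard it owns.
        self.ring = sorted((ring_position(n), n) for n in node_names)

    def owner(self, key):
        # A key belongs to the first node at or clockwise of its position;
        # wrap around the ring with the modulo.
        pos = ring_position(key)
        idx = bisect_right(self.ring, (pos, "")) % len(self.ring)
        return self.ring[idx][1]


dht = ToyDHT(["alice.example", "bob.example", "carol.example"])
print(dht.owner("some-key"))  # deterministically one of the three nodes
```

Every participant can compute the same owner for a key independently, so "who is authoritative for what" is an emergent property of the hash function rather than a decision made by any one domain.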

It's harder to achieve consensus (and react to changes) in these multi-domain settings than in the single-domain setting, since you can't force every domain's replicas to agree. However, this is a feature--no one but the computer's owner should have the final say on the state it hosts. Naturally, multi-domain systems must account for this in their design--something that Google's internal systems can safely ignore.




I think you're going to see multiple centralized-control systems each exert partial influence (as opposed to complete control, like Puppet/Chef) over a given system. DNS is a pretty good example: nobody thinks of configuring a DNS server on a system as setting up a "controller", but in some sense it controls an important aspect of the system's behavior. So does NTP.

Another example is the SDN-inspired "Wi-Fi sharing" platform Anyfi.net that I'm working on. It allows you to configure an "anyfi controller", e.g. on your home Wi-Fi router, but that "controller" only has a say in how the spare bandwidth and "extra SSIDs" on your router are used. It can set up Wi-Fi networks and tunnel out the raw 802.11 frames to an endpoint anywhere on the Internet, but it can't do anything that impacts your security or steals significant portions of your bandwidth. In that sense it's somewhat like DNS, but with even fewer security implications for the "controlled" system.



