
If you are using clustered storage (Ceph, for example) instead of a single RAID5 array, ideally the loss of one node or one rack or one site doesn't lose your data; it only loses some of the replicas. When you spin up new storage nodes, the data replicates from the other nodes in the cluster. If you need 'the university storage server', that's a pet. Google aren't keeping pet webservers and pet mailbox servers for GMail - whichever loadbalanced webserver you get connected to will work with any storage cluster node it talks to. Microsoft aren't keeping pet webservers and mailbox servers for Office365, either. If they lose one storage array, or rack, or one DC, your data isn't gone.

'Cattle' is the idea that if you need more storage, you spin up more identikit storage servers and they merge in seamlessly and provide more replicated redundant storage space. If some break, you replace them with identikit ones which seamlessly take over. If you need data, any of them will provide it.
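To make the "identikit nodes" idea concrete, here is a toy sketch in Python. It is only an illustration under stated assumptions: the `Cluster` class, the node names and the replica count of 3 are all made up for this example, and it does not model how Ceph or any real system actually places data. It just shows that losing a node loses replicas rather than data, and that a fresh node fills itself from the survivors.

```python
import random


class Cluster:
    """Toy replicated store. Hypothetical; not any real system's placement logic."""

    def __init__(self, node_names, replicas=3):
        self.replicas = replicas
        # node name -> {key: value}; every node is interchangeable
        self.nodes = {name: {} for name in node_names}

    def _holders(self, key):
        # Names of the nodes that currently hold a replica of `key`.
        return [n for n, store in self.nodes.items() if key in store]

    def put(self, key, value):
        # Update existing replicas, or pick nodes at random for a new key.
        count = min(self.replicas, len(self.nodes))
        targets = self._holders(key) or random.sample(sorted(self.nodes), count)
        for name in targets:
            self.nodes[name][key] = value

    def get(self, key):
        # Any node holding a replica can answer.
        holders = self._holders(key)
        if not holders:
            raise KeyError(key)
        return self.nodes[holders[0]][key]

    def fail(self, name):
        # Losing a node only loses the replicas it held.
        del self.nodes[name]

    def add_node(self, name):
        # A new identikit node joins empty, then the cluster re-replicates.
        self.nodes[name] = {}
        self.heal()

    def heal(self):
        # Copy under-replicated keys onto nodes that do not have them yet.
        keys = {k for store in self.nodes.values() for k in store}
        for key in keys:
            holders = self._holders(key)
            value = self.nodes[holders[0]][key]
            spare = [n for n in self.nodes if n not in holders]
            missing = max(self.replicas - len(holders), 0)
            for name in spare[:missing]:
                self.nodes[name][key] = value


cluster = Cluster(["node-a", "node-b", "node-c", "node-d"], replicas=3)
cluster.put("mail/alice", "hello")
cluster.fail("node-a")       # one box dies
cluster.add_node("node-e")   # an identical replacement joins
print(cluster.get("mail/alice"))
```

Nothing in the calling code cares which node answered or which one died, and that's the whole point of the cattle model.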

'Pets' is the idea that you need the email storage server, which is that HP box in the corner with the big RAID5 array. If you need more storage, it needs to be expansion shelves compatible with that RAID controller and its specific firmware versions, which need space and power in the same rack, and that's different from your newer Engineering storage server, and different from your Backup storage server. If the HP fails, the service is down until you get parts for that specific HP server, or restore that specific server's data onto a new pet.

And yes, it's a model, not a reality. It's easier to think about scaling your services if you have "two large storage clusters" than if you have a dozen different specialist storage servers, each with individual quirks and individual support contracts, which can only be worked on by individual engineers who know what's weird and unique about them. And if you can reorganise from pets to cattle, it can free up time and attention, make things more scalable and more flexible, and let you make trade-offs about maintenance time and effort.



