A Server Naming Scheme (mnx.io)
465 points by mstolisov on July 9, 2014 | 199 comments

Use something like a MAC address and map them logically (i.e., use Chef envs/roles/tags). This forces you to stop remembering hostnames and set up automation to ssh/run commands. If you try to encode information into hostnames you head into the deep end pretty quickly, and it's probably going to change in six months anyway.

I believe that's why the information is all encoded with CNAMEs. That way you can change purposes easily without ever touching the original A record, which refers to the hardware (not its function).

You should be very careful with this. Even a relatively low TTL of 300 (5 minutes) can be too long to wait during a major outage. Make sure changing CNAMEs is a very rare thing, and be prepared to tolerate inconsistency.

It's also worth noting that some widely-used pieces of software cache DNS lookups in-process (MongoDB), so changing a DNS name is no guarantee that all connected processes will automatically fail-over to the new machine unless restarted.

Moral of the story: distributed systems is hard

I have run into this at least twice this week alone with both Nginx and HAProxy pointing at Amazon Elastic Load Balancers. Amazon occasionally rotates out IPs for ELBs, so everything's working fine for weeks and then boom, now we're having a bad day.

As a rule, NEVER EVER use CNAMEs within AWS on Route53. You should always use their A Alias function. This provides near-instant changes when you need to adjust a record. This only works on AWS resources, but it is a great feature they built in.

I totally agree. In these two specific cases, R53 + ELB wasn't going to work for us though, because of reasons. (I don't think I can go into specifics, but they're actual reasons.) Our workaround was reloading the configs (which triggers a DNS cache flush on both Nginx[1] and HAProxy[2]) once every couple of minutes.

I know, I know, I hate it too. We're working on it. Just wanted to share the workaround.

[1] http://wiki.nginx.org/CommandLine#Loading_a_New_Configuratio...

[2] http://www.mgoff.in/2010/04/18/haproxy-reloading-your-config...

The JVM is another one of those things - DNS lookups are, by default, cached for the life of the VM instance.

I remember when Java added this feature. It was because lookups in the JVM were atrociously slow, requiring a native layer to get out of the Java sandbox.

Speculating here, I suspect that an important Sun or IBM customer had an app that did lots of DNS lookups and the performance stunk. So, an engineer did a quick 'fix' to cache DNS lookups. Customer was happy, everyone moved on. Some time later this quick fix got ported into the mainline code base. But it appears that nobody did a proper analysis of this quick fix, i.e., whether it respects the TTL on DNS records. Maybe supporting TTL wasn't important because this was back in the early days of Java, when it was trying to win the desktop war and desktop apps weren't really expected to be long-lived processes.

God, how awful. That's the point of TTL!

Beyond that, my system has a perfectly functional resolver (and a daemon to manage name service lookups, using policies that our organization has chosen and assumes are being used).

I understand Java doing 'its own thing', because the goal is to provide consistent behaviour on all platforms, but it shouldn't be stupid behaviour.
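For contrast, here is a minimal sketch of an in-process cache that honours a TTL instead of caching forever. The `TtlCache` class, the injected `resolve` callable, and its `(address, ttl_seconds)` return shape are all hypothetical names chosen for illustration; this is not how the JVM or any particular resolver implements it.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Caches lookups only until the TTL reported by the resolver expires."""

    def __init__(self, resolve: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        # resolve(name) is a hypothetical callable returning (address, ttl_seconds).
        self._resolve = resolve
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (address, expiry)

    def lookup(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # entry is still fresh, no new lookup
        address, ttl = self._resolve(name)
        self._entries[name] = (address, now + ttl)
        return address
```

The clock is injectable so the expiry logic can be exercised without sleeping; a long-lived process built this way would pick up a changed record once the old entry's TTL runs out.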

Personally, I don't think you should use DNS at all for intra-network service communication. I think the naming scheme described in the article is useful for humans, but for services trying to connect to other services, I would stick with hardcoding or Zookeeper.

CNAMEs are really quite poor for this unless the CNAME information is being driven from some other tool that you can easily do lookups on.

Otherwise you won't be able to get that classifying information on the end host.

Using a deterministic ID for machines is definitely the way to go.

We actually studied this for a while and MAC address is a pretty universal ID that works virtually anywhere. We use the smallest physical non-zero MAC address in case of multiple NICs. We considered using chassis or baseboard serial numbers but it gets too vendor specific.
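A rough sketch of the "smallest non-zero MAC" rule, assuming the list of MAC strings has already been collected from the machine's physical NICs (filtering out virtual interfaces is not shown). The function name `machine_id` is made up for illustration.

```python
from typing import Iterable


def machine_id(macs: Iterable[str]) -> str:
    """Return the smallest non-zero MAC as a 12-digit lowercase hex string."""
    values = []
    for mac in macs:
        digits = mac.replace(":", "").replace("-", "").lower()
        value = int(digits, 16)
        if value != 0:  # skip all-zero addresses
            values.append(value)
    if not values:
        raise ValueError("no non-zero MAC address found")
    return format(min(values), "012x")
```

Comparing the addresses as integers rather than strings keeps the ordering independent of case and separator style.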

The problem you run into with using the MAC address is what happens when you swap out the NIC?

You use the new one? Either that or don't. It doesn't really matter; that should all be automated, shouldn't it?

I'm not following. The whole purpose of naming is to make it easy. I don't want to remember 000a43dbac72.example.com, it's easier to remember "pine" and "oak" and trust me, after working in the environment a while, everyone knows what every keyword is.

Over a certain size you want your tooling to take care of that knowledge not you. You don't want your sysadmins or developers deriving information from records that can change.

If your developer or sysadmin assumes that the server named castle will always be something special, instead of looking in the CMDB or other ENC for that information, you will have less fun in the long run.

tl;dr random names with a central ENC forces you to only get meaning or facts from one central repo.

The benefit here is mostly just with larger infrastructures where you're gonna run out of wood names and where agreeing on and enforcing naming standards is a futile and painful process.

With logical mappings, you just use chef/puppet/cmdb to do something like sshnode prod app 1 to connect to the first production app node or sshnode prod app to connect to all the production app nodes. CNAMES can do this too but then you run into potential DNS consistency issues. With this you get the human friendly abstraction and keep the machine friendly determinism.

Of course when your chef server goes down this can be a PITA (I recommend building in some knife search caching!)
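As a sketch of what an `sshnode prod app 1` style lookup might do under the hood: the inventory below is a hypothetical literal standing in for a chef/puppet/CMDB query, and `resolve_nodes` is an illustrative name, not part of any of those tools.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical inventory: (environment, role) -> ordered list of hostnames.
# In practice this would come from a chef/puppet/CMDB query, not a literal.
INVENTORY: Dict[Tuple[str, str], List[str]] = {
    ("prod", "app"): ["000a43dbac72.example.com", "000a43dbac9f.example.com"],
    ("prod", "db"): ["000a43dbad01.example.com"],
}


def resolve_nodes(env: str, role: str, index: Optional[int] = None) -> List[str]:
    """Return all hosts for (env, role), or only the index-th one (1-based)."""
    hosts = INVENTORY.get((env, role), [])
    if index is None:
        return list(hosts)
    if not 1 <= index <= len(hosts):
        raise LookupError(f"no node {index} for {env}/{role}")
    return [hosts[index - 1]]
```

Humans type the logical triple; the opaque hostnames only ever appear in the tool's output.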

This seems reasonable. I think the 'themed' naming works well for up to perhaps a 100 or so servers/devices managed by a single team, but with tons of automated provisioning, I suppose unfun naming is a requirement.

I am still trying to understand their reasoning myself. Unless you have hundreds of servers, I can see no reason to resort to numeric naming methods; that might be fine for arrays/sorts/computers but not people. It looks good on racks, but for representation to humans, argh.

Ours are named for location and function. There are some large servers which have six-character names, as they are so unique/important to the organization that there is never a question. Usually, though, many fall into <owner+location+environment>.

However, I can see justifications made in either direction; it's no worse than coding conventions that people come up with.

That sounds like premature optimization for a lot of people.

Part of the problem is that hostnames are so quickly ingrained into our workflows that when you only have a few servers there's no point, and when you have a bunch it's too late – unless you want to spend a lot of time changing every reference.

If you start building a small cluster which you know is going to grow into a big cluster, then these aren't really premature optimizations so much as laying a proper foundation. It's the difference between rewriting some of your Python in C and making sure all your Python code is packaged into sensible objects and grouped into libraries.

Thanks for this link: http://namingschemes.com

It's one of the two hard problems, you know. Help is always appreciated.


Wait a minute, I always thought the two hard things were cache invalidation, naming things and off by one errors.

I got burned in a tech interview with off-by-one errors. I had signed into a shared editor to do some simple practice stuff for the interviewer.

He noted the off-by-one errors as a concern.

I was thinking: this is untested, unreviewed, 2-minute code that shows I can at least do some sort of problem solving. I'm nervous, don't know you, kinda worried about the tone of this interview and you're worried about off-by-one errors.

Kinda glad I didn't get that one. Wish I'd just hung up on the guy.

I feel for both you and him.

The problem with getting praised a lot for being smart is that you then feel obliged to say smart-sounding things often, and being critical is an easy way to sound smart. It's even worse if you're hired to be smart. So the guy, if he wasn't a total tool, was probably thinking: I wish I could just be coding, but instead I'm supposed to interview a bunch of people in a way that guarantees it will be awkward. And shit, I just got nervous and said something that sounds insulting. I wish I could just hang up.

In the "expanded" set, there are technically three hard things - you forgot dealing with February 29th.

Daylight savings time... I mean unless you do things like a human, that is in UTC. Otherwise, be prepared for having at least one hour a year when your software does not work.

No, those are the four hard things.

condition. I'm pretty sure the fifth is a race

Naw, it's probably graceful expansion. Or maybe that is 6th?

Wow, that's interesting. I had a similar thought not too long ago about applying a naming scheme to identifiers in code. Not like this or Hungarian Notation, but more like a controlled vocabulary: variables with the same meaning use the same name across your codebase. So no more using "price" here and "amount" there. That would directly address the second hard thing.

In the aerospace industry there is something called "Simplified Technical English"[1]. Maybe something like that already exists for your domain as well?

From the linked Wikipedia article:

    The approved words can only be used according to their specified meaning. For 
    example, the word "close" can only be used in one of two meanings:

    - To move together, or to move to a position that stops or prevents materials 
      from going in or out
    - To operate a circuit breaker to make an electrical circuit

    The verb can express close a door or close a circuit, but cannot be used in 
    other senses (for example to close the meeting or to close a business). 

[1] http://en.wikipedia.org/wiki/Simplified_Technical_English

U.S. submarines do a similar thing. There is a specific doctrine for how interior communications are performed, down to the level of always using "shut" instead of "close" (which can be confused with "blow"), referring to compartments and equipment using precise wording and pronunciation, etc.

Similarly, I heard recently that air traffic controllers are never allowed to use the word "takeoff" unless it's in the context of clearing an aircraft to do so.

I do consulting, so I don't work in a particular domain. There's usually a mix of corporate lingo and industry lingo, but the developers don't come in with familiarity with either, so it ends up being all over the place.

Given that thought, you'll likely enjoy Eric Evans' "Domain Driven Design". I love that book: https://www.google.com/search?q=domain-driven+design

I like the suggestion on the first link that these naming schemes should be used for kids. An amusing site :)

service discovery via DNS on top of etcd: https://github.com/skynetservices/skydns

I personally believe that servers should not have names, as naming them just creates the incentive to try to fix broken infra instead of just killing and respawning. I've considered using some kind of short UUID for hostnames instead.

"short UUID" is an oxymoron. If it's short, then it's not universally unique.

Assuming you meant "short ID", how is that any different than the OP's suggestion of randomly choosing from a list of meaningless names?

"Short" is relative. 4 billion IDs is likely Unique within an organization, even one the size of IBM, Google or Microsoft.

At the risk of pedantry, it depends on whether their allocation is centrally coordinated. 4 billion IDs with central coordination isn't too bad, but you need a much, much bigger space to make collision-free, uncoordinated ID assignment practical.

Corollary: in a room of 23 randomly selected individuals, there's a 1/2 chance that two individuals share a birthday. 365 days in a year, but only 23 samples gets you to 1/2 chance of collision. Look up the "birthday paradox" for more info on this surprising result.

4 billion IDs = 32 bit = Math.pow(2,32).

For a roughly 50% chance of collision you need on the order of Math.pow(2,32/2) calls to your ID generating function, which is 65536 calls.

And that's just a 50% chance. Even a 1% chance is too much (for which you would need far fewer than 65536 calls).
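To put numbers on this, a small sketch using the standard birthday-bound approximation P ≈ 1 - exp(-n(n-1)/(2N)) for n uniformly random draws from a space of N IDs. The function names are illustrative, and the figures assume IDs really are drawn uniformly and independently.

```python
import math


def collision_probability(draws: int, space: int) -> float:
    """Approximate P(at least one collision): 1 - exp(-n(n-1) / (2N))."""
    return 1.0 - math.exp(-draws * (draws - 1) / (2.0 * space))


def draws_for_probability(p: float, space: int) -> int:
    """Approximate number of draws needed to reach collision probability p."""
    return math.ceil(math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p))))
```

Under this approximation, 65536 draws from a 32-bit space give a collision probability of roughly 39%, the 50% mark sits closer to 77,000 draws, and a 1% chance is reached after a little over 9,000. The same formula reproduces the 23-people birthday figure with `collision_probability(23, 365)`.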

I doubt "IBM, Google or Microsoft" are using 32bit UUIDs.

"4 billion IDs is likely Unique within an organization"

I don't think it's unique within any organization.

I think it would matter what the Universe is for their environment. A UUID may be fine as "server123" if there will never be a collision in their environment (i.e. server124, server125, server126, etc).

In a past life as a sysadmin, I found it very useful to encode a letter with a number, so that servers with similar specs share the same letter.

a1, a2, ... aN for servers with 2 cores, 8GB RAM

b1, b2, ... bN for servers with 4 cores, 8GB RAM

c1, c2, ... cN for servers with 4 cores, 16GB RAM


It only has to be "universally" unique within the DNS servers you query, no?

"Hey Al, could you go take a look at the drive on the secondary mysql server? I think it might be throwing a fault condition."

"Sure Frank. What's its host tag name?"


Not sure if your comment is intended to demonstrate a con of naming servers as random strings, but honestly I would consider that a positive. If hostnames are actual words, then people are able (and sometimes incentivized) to transmit them verbally. This can lead to communication issues.

If the hostname is randomized, then the logical thing would be to copy and paste the hostname and send it off over the wire, which would theoretically cut down on human error.

The human then has to write down a long string of unrelated numbers and letters and track it down somewhere in the server room. Both are time-consuming and error prone.

It's annoying enough to try to find "snowwhite" amongst 40 servers in each of 40 racks; trying to scan the list of names for "ABC123XYZ987" when no part of the name correlates with any useful information would be absolutely maddening. Jesus Christ, I can just imagine, for every server, looking at the name on a sticky note, looking at a tag name, and then looking again at the sticky note and back to the tag name, possibly a third time, just to make sure it wasn't off by one number or letter or something. Repeat for thousands of servers? Euuggh!

Besides the poor cage monkey having to deal with that crap, when you're fighting fires in the ops room you need to be able to refer to a handful of server names quickly. You can't be copying-and-pasting names to people, looking information up every 2 seconds because you can't possibly memorize it. We are human beings and so we need human interfaces to information.

The fix for that is don't have the cage monkey care about server "names", but instead physical servers.

I've done this, and it works very, very well with third parties. You simply give them their work-order, and it consists of working on particular physical devices.


Row 10, Rack 11, RU-15 - Serial Number 103527382 - Replace Hard Drive.

Row 13, Rack 9, RU-6 - Serial Number 103528942 - Replace Power Supply.


The work orders are generated from your CMDB, which tracks things like serial numbers and physical locations.
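A minimal sketch of generating such a line from a CMDB record. The `Asset` record and its field names are hypothetical; a real CMDB will have its own schema.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    # Hypothetical CMDB record: physical location plus serial number.
    row: int
    rack: int
    rack_unit: int
    serial: str


def work_order_line(asset: Asset, task: str) -> str:
    """Render one work-order line keyed on physical location, not hostname."""
    return (f"Row {asset.row}, Rack {asset.rack}, RU-{asset.rack_unit} - "
            f"Serial Number {asset.serial} - {task}.")
```

Note that no hostname appears anywhere in the record handed to the technician.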

As organizations grow, this is eventually where they all end up (after multiple iterations of other less scalable systems)

Having worked with datacenter technicians, I can confirm that providing step-by-step handholding instructions such as these is way more effective than telling the tech to "look for the server with the sticker that says unicorns.thenextfacebook.tk".

The human interface problem is solved by physical directions, not type of data provided.

That helps you track it down but it doesn't help you relate it to another human in a non-location context. Hence why dp-t75x-013 is nice ("oh hey, it's that dell precision T7500 that keeps segfaulting, maybe the RAM's bad..." versus "oh hey, it's that Serial 103528942 that keeps segfaulting"). Good to name it something relatively simple to pronounce, too ["deepeetee seventyfiveecks thirteen"]. A less common problem is serials/macs/tag numbers become duplicated when you buy enough servers...

Depends on the size of the organization and how many servers they have and who you are working with.

In the above case, the question was how to direct technicians to gear - and, specifying a physical location and a faceplate label (Serial Number or what have you) does the job.

In the case of Humans - they usually have cnames for the function they are interested in anyways.

I do work on about 4500 servers - and I couldn't tell you the name of a single one of them (though I note they have some long, convoluted DNS PTR, even harder to understand than a serial number). But I have a ton of CNAMEs for when I want to log in to a particular customer's server (based on customer, production/test/dev/fste, function).

Teams that need to do maintenance on servers have tools that group them based on role, location, data center, etc... They never actually "ssh" into a server the way I might.

A lot depends on whether you are talking about scales of 100+, 1000+, or 10,000+ servers (or network devices, for that matter). At each stage you start to lose more and more of the human naming convention, and move everything into a Configuration Management Database (CMDB).

If we're naming racks to be serviceable, it seems easier to have numeric rows, dash, alphanumeric cabinets, dash, numeric racks. 3-K-7 seems far better than disney-snowwhite-dopey. And your labels should hold the breadcrumb: [3] (row), [3-K] (cabinet), [3-K-7] (rack). The other benefit is not having to guess where these things are (or having to dictate this very same information to land someone near "dopey").
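The breadcrumb idea is easy to derive mechanically from the full label. A tiny sketch, assuming labels always have exactly the three dash-separated parts described (the function name is illustrative):

```python
from typing import List


def breadcrumb(label: str) -> List[str]:
    """Expand a row-cabinet-rack label into its successive prefixes."""
    parts = label.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected row-cabinet-rack, got {label!r}")
    return ["-".join(parts[:i]) for i in range(1, len(parts) + 1)]
```

For "3-K-7" this yields the row label, the cabinet label, and the rack label, in walking order.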

Can't start with a number or you won't be able to use it as a host DNS record. But yes, numbers and letters that correlate to significant information would totally be useful compared to a random string, with the caveat that putting rack/row information in a hostname would mean changing all the records relating to that host every time you moved it around in a rack. (which might be the best way to mandate that the damn NOC engineers record where they moved the goddamn host to...)

I'm curious - what's wrong with starting DNS A records off with numbers?

   dig +short 4test.shephard.org

Illegal according to the RFC syntax for a DNS "label": http://tools.ietf.org/html/rfc1035#section-2.3.1

One aspect of host name syntax is hereby changed: the restriction on the first character is relaxed to allow either a letter or a digit. Host software MUST support this more liberal syntax.

-- http://tools.ietf.org/html/rfc1123#section-2
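The difference between the two RFCs can be sketched as a pair of regular expressions for a single DNS label. This is a simplified check of label syntax only (it ignores total name length and anything beyond the letter/digit/hyphen rule), and the helper name is made up.

```python
import re

# RFC 1035 section 2.3.1: start with a letter, end with a letter or digit,
# interior characters are letters, digits, or hyphens; at most 63 characters.
RFC1035_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

# RFC 1123 section 2.1 relaxes only the first character to letter-or-digit.
RFC1123_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_label(label: str, strict_rfc1035: bool = False) -> bool:
    """Check one label against the relaxed (default) or original syntax."""
    pattern = RFC1035_LABEL if strict_rfc1035 else RFC1123_LABEL
    return pattern.fullmatch(label) is not None
```

With these patterns, `4test` passes the relaxed check and fails the strict one, which is the whole disagreement above.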

thank goodness, how else would 4chan have been allowed to exist on teh interwebz?

> This can lead to communication issues.

That was part of the reason for that curated list of words that they were recommending in the article.

> If the hostname is randomized, then the logical thing would be to copy and paste the hostname and send it off over the wire

Eventually that chain of communication will need to end in an action. Not all actions are just "copy-paste it into a box." What if I know that the server is in "Rack #3", but now I need to look at labels on the actual servers to find server "SDFssdfa4324tdfgfg"? It's a lot easier to grok server "crimson".

It's all fun and games until the actual server is on fire.

Bob: "Frank? Which server did you say was on fire?"

Frank: "It's the email server. The name is ..."

Bob: "Frank - you know the rule. Send it, don't say it."

Frank: "Uh ... I can ... maybe text it to you?"

Bob: "Insecure."

Frank: "Uh ... pigeon?"

Bob: "The intern ate the last one."

Frank: "Uh, carve it on the intern's back?"

Bob: "Tattoo it. Safer that way."

Frank: "OK, give me an hour."

Bob: "Hey, server's burning, you know. Ain't gonna last forever ..."

uh what?

Keeping your hostnames secret, like dropping ICMP, is useless.

That's taking the "pets" vs "cattle" meme a bit literally. Memorable hostnames can be given to both pets and cattle.

However, I personally question the wisdom of giving a network intruder a roadmap to your internal network via easily predicted DNS names, but that's probably just me being paranoid.

We used something very similar to this where I work (Crittercism). I designed most of it.

I used five-part DNS CNAMEs exactly how they're specified in the article (a bit shocking, actually) except that mine switch the "group name" (prod, staging, CI, etc.) and the geography, with the rationale that "groups" can span several geographic areas (example: a multi-regional production environment), so they're logically the "containers" of geographic regions.

Example: rtr01.us-west-2.prod.crittercism.com, NOT rtr01.prod.us-west-2.crittercism.com
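A short sketch of building and parsing names in this host.geography.group.domain order. It assumes exactly five labels with a two-label domain, as in the example; the `HostName` type and `parse_fqdn` helper are illustrative names only.

```python
from typing import NamedTuple


class HostName(NamedTuple):
    host: str       # e.g. "rtr01"
    geography: str  # e.g. "us-west-2"
    group: str      # e.g. "prod"
    domain: str     # e.g. "crittercism.com"

    def fqdn(self) -> str:
        return ".".join([self.host, self.geography, self.group, self.domain])


def parse_fqdn(fqdn: str) -> HostName:
    """Split host.geography.group.domain.tld, assuming a two-label domain."""
    labels = fqdn.split(".")
    if len(labels) != 5:
        raise ValueError(f"expected five labels, got {len(labels)}: {fqdn!r}")
    host, geography, group = labels[:3]
    return HostName(host, geography, group, ".".join(labels[3:]))
```

Because the group sits to the right of the geography, everything under `prod.crittercism.com` can be delegated or searched as one zone regardless of region.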

Great article.

I like your way better.

It all depends on your environment.

Do you manage 12 machines in a closet, 200 EC2 instances, or 1,500 systems in 5 datacenters?

Do you have more than one datacenter in a given city? in a given campus? in a given building?

Do you build systems that serve a single purpose such as a database, or do you run multiple daemons on a single system? How do you take docker/containerization into account?

How often are your sysadmins logging into these systems? Once a week (spell out "production"), or 20 times a day (abbreviate it as "prod")?

Are your systems all owned and operated by you? Do you manage systems for multiple clients in the same datacenter?

Does your team consist entirely of your college roommates you started your company with? Or are you a team of 20 spread out across different time zones and different countries? Will your funny/memorable names be offensive?

This all seems like an exercise in trying to build the The Server Naming Tower of Babel Bikeshed.

Yeah, but most of those are good ideas for a moderately large business that's not managing other people's servers. Especially mnemonic names. Would you rather reference ruby.web or W1L1Z (which is often the kind of crap that people come up with by default)?

Presumably you'll need to iterate over hosts anyway, so why not move directly towards your automation tools' (ansible, chef, puppet, etc) host inventory and not fuss with DNS...

Just started out with Ansible. How do you efficiently SSH into all these hosts by name instead of IP?

At the low level, Ansible has the ability to execute ad-hoc [0] modules [1] against any combination of hosts defined. In my host(s) inventory above, you can execute ad-hoc modules against just nyc-web, lax-web, web (against nyc and lax), or whatever suits your needs or you've defined a host group for in your inventory.

Report, using the "command" module, free memory on all "nyc-web" hosts in our "prod-hosts":

  ansible -i prod-hosts nyc-web -m command -a 'free -m'
More often, we'll want to have Ansible playbooks [2] that take our hosts and assign roles. Briefly on the topic, we might have a "common" role that deals with updating our package manager, checking/upgrading packages that every machine should have (e.g. ssh, ntp, etc), and ensuring certain processes are running. Second, we might have a "web" role that is similar to "common" but deals with web-server specific packages, configs, etc. To keep it brief, we'll skip how the roles are set up, but here's our "web" playbook (web.yml); note that we've defined the hosts [group] to execute against -- all hosts under our "web" group in the hosts inventory.

  - hosts: web
    remote_user: web-user
    roles:
      - common
      - web
Which we'll execute with:

  ansible-playbook -i prod-hosts web.yml
Maybe you only want to hit nyc-web and not all web hosts? The following will take the intersection of the playbook's hosts "web" and the "limit" hosts "nyc".

  ansible-playbook -i prod-hosts web.yml --limit nyc
Or maybe you know the one IP you want to hit. The ":&" syntax is a logical AND, used here to match only the one in NYC, because we also have one in LAX. If your IP layout has no such overlap, then simply a "--limit" would do.

  ansible-playbook -i prod-hosts web.yml --limit 'nyc:&'
[0] http://docs.ansible.com/intro_adhoc.html

[1] http://docs.ansible.com/modules.html

[2] http://docs.ansible.com/playbooks_intro.html
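The `--limit` behaviour described above is just set intersection. A sketch with a made-up inventory (the group names follow the comment, but every hostname here is hypothetical, and this models the semantics rather than calling Ansible):

```python
from typing import Dict, Set

# Hypothetical inventory; the host names are made up for illustration.
GROUPS: Dict[str, Set[str]] = {
    "nyc-web": {"web1.nyc.example.com", "web2.nyc.example.com"},
    "lax-web": {"web1.lax.example.com"},
    "nyc-db": {"db1.nyc.example.com"},
}
# Parent groups are unions of their children.
GROUPS["web"] = GROUPS["nyc-web"] | GROUPS["lax-web"]
GROUPS["nyc"] = GROUPS["nyc-web"] | GROUPS["nyc-db"]


def targeted_hosts(play_hosts: str, limit: str) -> Set[str]:
    """Hosts a play runs against: the play's group intersected with the limit."""
    return GROUPS[play_hosts] & GROUPS[limit]
```

Here `targeted_hosts("web", "nyc")` leaves only the NYC web hosts, mirroring `web.yml --limit nyc`.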

Ah, cool. That clarifies things. I take it there is no way of smoothly combining Ansible host lists with being able to use pure, old SSH?

Ansible will use the native OpenSSH on your box, which does enable you to use whatever is in your .ssh/config: jump hosts, ControlPersist, Kerberos, etc.

OK, I suppose a list of common, easy-to-pronounce words is ok...

I use Pokemon. What do you use?

To this day, my favorite naming scheme I ever encountered was names of elements, everything CNAMEd to the elemental abbreviation, with the last octet of the IP address as the atomic number. It was for a geophysics group, which probably made it a little more discoverable/easily spellable.

That might work for a private network, but what if you have a /24 and need to use the entire range?

Maybe find molecules with the corresponding molar mass, e.g. glucose = C6H12O6 = 180?

I had that idea ages ago and I'm glad someone actually used it.

I used to run a dual-stack app: Windows was Roman mythology, Linux was Greek. That allowed us to keep roles straight across environments - apollo/hermes, athena/minerva, etc.

You...do know that Apollo and Hermes are two entirely different entities, right? Hermes is the counterpart of the Roman Mercury. Apollo has the same name in both pantheons.

Yeah, I remembered that wasn't the right pairing after I wrote that (and thankfully ours weren't paired like that, we had the right names); but that also goes to show you the downside of that particular naming scheme - it requires additional knowledge in a specific domain.

I tend to use names from different mythologies, so I usually end up with names like "osiris," "lorien," "cernunnos," etc. Makes for easy to remember names, and I never expect to run out of them.

At 280 North we used exit names along 280... woodside.280n.com, pagemill.280n.com, sandhill.280n.com, etc


I use a bunch of anime characters when naming the machines. Source of my inspiration is: http://www.phdcomics.com/comics/archive.php?comicid=1467

Excellent. Somehow I've managed to call a side project server here "Toothless".

I once used Cthulhu mythos monsters. It turned out to be a bad idea: those things are horribly hard to spell.

I'd be afraid to wake them from hibernation.

They are all named for the mountains of New York State in decreasing height order.

Nearby restaurants. Admittedly a shorter list than Pokemon :)

I use famous computer scientists. My Macbook is named Woz, and I also have Kernighan, Ritchie, Torvalds, etc.

Now all I have to do is become a famous computer scientist and accuse you of stealing my hardware!

Do you also match the names with the technologies they're related with? It would be awkward to call a Windows server "Torvalds".

Yeah, I try to. For instance, my Magic Mouse is called Doug, after Doug Engelbart, the inventor of the computer mouse.

Star Trek: TNG characters. Excluding Data to avoid confusion, of course.

My fairly modest home network uses characters from Madoka Magica.

We use the names of Final Fantasy characters.

Bananaman characters here. In hindsight, they're nice and clear and short, and as a bonus, I get to refer to our staging servers as 'blight' (after General Blight)...

I second that one :)

During my final internship, the sysadmins gave every intern a Dell named after defunct air carriers. (Mine was PANAM.)

Characters/references from Arrested Development: Gob, Buster, StairCar…

Porn star names will usually do the "trick".

I found naming my servers after characters that legendary badass Mexican actor Danny Trejo has played to be cathartic and awesome!

Machete, Vega, Jack, Oscar, Angel, Tattoo, Tigre, Carlos, Vic, Reynaldo, Guerrero, etc...

There was a good debate about this on the devops-toolchain mailing list.


I guess there is no actual need to use security by obscurity to protect the servers?

On the other hand, choosing a name for what is running on that server makes it just one step easier for an attacker?

Only the web load balancers are likely exposed to the public internet, along with some sort of bastion host. The rest are probably on your internal network and can't be accessed/queried anyways, so the names mean nothing to them.

If attackers ARE in your DC already, you're already hosed, and the few minutes that it would take them to determine that some obscure name like "host-a831f1" is the DB won't matter.

So in general, I believe optimizing for maintainability here (easier names) is more worth it than falsely believing that obscuring the names provides some level of security.

Though, if all your servers are able to be accessed from the public internet, it might be a different story. But that really isn't recommended.

> If attackers ARE in your DC already, you're already hosed,

That's a defeatist attitude, and the reason why security companies get away with only selling perimeter defense products. "Well if they get in they can do whatever they want anyways." If servers are properly insulated from one another, violating a single server won't give them complete access to your infrastructure.

An example of this: Valve was infiltrated by a hacker who managed to exploit an ASP server for a random webpage, and was able to get all the way to Valve's Perforce servers and steal a copy of the tree for Half-Life 2. There's no reason in hell a random web-exposed server should be on the same network as their Perforce server, but that's what having poor internal network security does for you.

I don't think he's advocating that you should have unprotected internal networks; just that the naming scheme shouldn't be confidential, since it is one of the first things that will be exposed in a compromise.

Knowing the name "main.prd.example.com" doesn't help if it's got a bastion host, thorough firewall rules, key-only SSH login, et cetera.

Thank you! I totally forgot about not exposing sensitive machines to the outside world.

Yes, I am in total agreement with you. Much appreciated.

Even if you're in our network, you'll still have a lot of fun. There are more ports open, yes, but most ports will just refuse you with an access denied and report to a secure log, which is centrally collected and triggers the admin spawner if necessary.

By that argument, you should be naming servers after Aztec gods (Tlahuixcalpantecuhtli, anyone?)

Even when you pick a naming scheme, it'll eventually cause grief if you get big enough.

Name your machines XXYYN[N] where XX is the building name, YY is the rack name and NN is the position in the rack? You'll eventually have more than 676 buildings, 676 racks, or 99 machines in a rack. Multiple people will have written regexes to split that into building, rack, position and they will break when you grow one of the fields.
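A sketch of the kind of regex this comment warns about, assuming two lowercase letters per field and one or two digits, per the XXYYN[N] layout. The pattern and function name are illustrative.

```python
import re
from typing import Optional, Tuple

# Assumes exactly two letters per field and one or two digits, per XXYYN[N].
HOSTNAME = re.compile(r"([a-z]{2})([a-z]{2})(\d{1,2})")


def split_hostname(name: str) -> Optional[Tuple[str, str, int]]:
    """Return (building, rack, position), or None if the name does not fit."""
    match = HOSTNAME.fullmatch(name)
    if match is None:
        return None
    building, rack, position = match.groups()
    return building, rack, int(position)
```

A name like `nyab07` splits cleanly, but the hundredth machine in a rack (`nyab100`) or a three-letter building code (`nycab07`) no longer matches, and every copy of this regex scattered across scripts has to be found and fixed.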

That's where the subdomain-based CNAME system comes in handy.

And I'd argue that building numbers probably don't want to be part of your scheme.

is about as deep as I'd like to get with my hostnames (though there are other options for shortening them, with DNS search directives and such).


... might keep you from losing boxen. You will hate yourself typing that. Store the metadata elsewhere, perhaps a txt record for the host.

My naming system is pretty similar. Infrastructure consists of a handful of big beefy machines that then each host anywhere from 4 to 12 single purpose guest vms.

For the beefy host machines, I just go with core-a, core-b, core-c etc plus a geo tag. So: 'core-a.de.domain.local', 'core-b.de.domain.local' etc.

For the vms, they get functional names like 'n0-dbc1.de.domain.local' (for node0, database cluster 1) or 'git.de.domain.local' etc.

Only the last two digits of each guest VM's MAC address (or addresses, if it has multiple eth interfaces) change. The rest is tied to the underlying host machine. So e.g. 00:50:56:01:01:06 means a mac for a guest on 'core-a', 00:50:56:02:02:03 means a mac for a guest on 'core-b' etc.
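A rough sketch of how that allocation could be scripted, assuming the layout suggested by the two examples: the fixed prefix 00:50:56, the host index repeated in the fourth and fifth octets, and the guest index in the last octet. The function names `guest_mac` and `host_of` are hypothetical, and the real rule may differ from this guess.

```python
# Assumed layout, inferred only from the two example addresses above.
PREFIX = "00:50:56"

def guest_mac(host_letter, guest_index):
    """Build a guest MAC for core-<host_letter> and a per-host guest index."""
    host_index = ord(host_letter) - ord("a") + 1  # core-a -> 1, core-b -> 2
    if not 1 <= host_index <= 255 or not 0 <= guest_index <= 255:
        raise ValueError("index does not fit in one octet")
    return f"{PREFIX}:{host_index:02x}:{host_index:02x}:{guest_index:02x}"

def host_of(mac):
    """Recover the core host name from the fourth octet of a guest MAC."""
    octets = mac.split(":")
    return "core-" + chr(int(octets[3], 16) + ord("a") - 1)

print(guest_mac("a", 6))             # 00:50:56:01:01:06
print(host_of("00:50:56:02:02:03"))  # core-b
```

Since the host part is fixed, the only thing to track per host is which guest indexes are already taken, which is what prevents handing the same MAC to two VMs.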

All internal non-publicly routable IPs use our domain and the '.local' TLD (with bind serving our internal network). All external, publicly routable IPs use our domain and the .net TLD. Finally, our frontend website IPs use our domain and the .com TLD.

The above won't scale well beyond one or two hundred vms per data centre location but if we ever need more than that, it will be a nice problem to have :)

I'm guessing you don't use live migration of VMs very often?

Nope, never. If a vm needs permanently moving from one core to another (which would be unusual for us), we just spin it down and dd the lvm volumes for that guest via ssh and then change the mac.

It helps that we don't filter on mac. The main reason for tying the addresses to the underlying core is to avoid accidentally allocating the same mac to two different vms (since at the moment vm creation is still a relatively manual process for us - hoping to improve this in the future though!).

Why tst instead of test, prd instead of production?

standard width of 3 characters maybe? makes visual scanning of a list of names easy to do to spot name/purpose, environment, geo, etc. you don't have to keep moving your eyes left or right with each row.

If you're regularly typing the names, it's nicer to type something short like

  ssh prd01
rather than

  ssh production-01
On the other hand if you're managing clusters of servers using something like ansible, you may not spend as much time typing in individual server names. I think it may depend on how your workflow is set up.

Typing test and prod is likely faster than typing tst and prd because you're not accustomed to leaving out vowels. Short doesn't have to mean no vowels. Short real words are better than abbreviations.

Because 3 letter grouping is visually easier if you use that for every environment. dev, tst, prd, stg, ppd, bkp. Combines well with geographic locations by using airport codes also.

Actual words are better for all environments; abbreviations are ridiculous. They don't speed anything up, they slow down comprehension and they make jargon where none is warranted.

The words you speak should be the words you use, not abbreviations of them. You don't say prd, you say prod, you don't say tst, you say test, so prod and test are what you should use. One ubiquitous language everyone can share without the burden of useless jargon. Dropping out vowels isn't saving you anything.

> Dropping out vowels isn't saving you anything.

Except the consistency thing I was talking about. One of my previous jobs, at a Fortune 100 company, had to have a consistent naming scheme for everything in the DC for whatever reason that predated me: a 3-letter production environment, followed by a 4-digit number which described all kinds of things about what apps / environments ran on the host.

Also, having fixed-width fields for hosts is very valuable when you are writing scripts to parse data on hostnames, especially when you are scaling up to tens / hundreds of thousands of hosts and you begin to appreciate consistency.

I think the question is about what is good, not what you do when you're stuck working with bad standards, and those are bad standards. Relying on fixed length host names is as bad as relying on fixed length data or fixed length file names, i.e. it's absurdly bad practice. These are things developers learned sucked long ago. Fixed width is not valuable, it's brittle.

I wouldn't use fixed values myself, but I've worked at places with 100k+ hosts, and naming schemes with consistent abbreviations of the same length, plus other fields describing consistent data, are much better than fun names when you need to fix things quickly.

Different story for small networks, have fun all you want. With hundreds of thousands of hosts scattered all over the world, this isn't used.

I haven't said a word about fun names, I think you rather missed my point entirely.

I get it, you like full hostnames such as production-mail-server-and-occasional-nas-file-server.

I'm just pointing out the fact that billion dollar real world orgs use abbreviated hostnames more often than not, and they have good reasons to do so.

If you got it, you wouldn't be putting up a straw man of long server names like "production-mail-server-and-occasional-nas-file-server" so no, you really don't get it because you aren't listening.

You can have short server names that are easy to type AND still use actual words. Typing prd instead of prod is not saving you anything worth saving nor making server names easier to type or remember no matter how many there are.


is an absurd name. It was your example, so I assume it's part of the scheme in this big network, but you could just as well have crammed all the same info into...

Without abbreviations and without losing the meaning.

> and they have good reasons to do so.

Yea, inertia.

The real world companies I've worked for in the past have good reason to use abbreviated host names, sorry this is so offensive, but that's...real life?

That's a good point about dropping vowels. Typing 'prod' feels more natural than typing 'prd'. But even 'ssh' and 'prod' are abbreviations. It's not the case that you never want to abbreviate, even for host names.

Prod is not an abbreviation, it's a nickname; big difference. SSH is, but Linux is full of badly named commands that you just have to suck up and learn and that's been its Achilles heel for a great many years, it's the ultimate jargon.

Partially yes. Partially meh. For servers I use a lot, my SSH-config looks like this:

   Host f1
       Hostname foo.bar.baz.qux
So... just use a readable name. I'll shorten that to something like 'a' locally anyway.

Because prd means fart in Czech/Slovak (at least).

Somebody's nostalgic over Hungarian Notation?

This article, while interesting to repeat occasionally, was previously discussed here: https://news.ycombinator.com/item?id=6796318 And here: https://news.ycombinator.com/item?id=6800935

The subject itself was also discussed here: https://news.ycombinator.com/item?id=6540044

I usually suggest that people read RFC 1178, Choosing a Name for Your Computer: http://tools.ietf.org/html/rfc1178

At least then they will avoid the usual pitfalls worked out over the many years in which networked computers have existed.

I kind of like the approach Elasticsearch takes to this, name an instance after a random Marvel superhero. Keeps things interesting.

This is cute, but devolves into a maintenance nightmare very quickly. It is much easier to look at a server list and see "db001" than it is to see "spiderman" and remember if that is a DB or not. Extrapolate that out to hundreds of hosts and you kind of get the picture.

In a modern world where the machines might be homogeneous and VMs/containers above define the actual role, the machine might just be "host123" whereas the higher level services have specific names like "db001."

With service cataloging and discovery tools like Consul (disclaimer: I wrote it), there is an easy way to see a mapping of service name back down to the host it is on. So even if you're yelling to an ops person "hey db001 is having problems," the ops person can quickly map db001 down to host 319.

And with slightly more complex (but worth it, imo) naming, you can determine the rack, datacenter, etc. of a server just by the name.

> This is cute, but devolves into a maintenance nightmare very quickly.

Especially given that the name changes at every startup. It's absolutely not recommended to stick with the auto-assigned names, but a rather good idea to name the node after the machine or the location.

When this came up before I liked the comment from an ex-dropbox employee on how dropbox found positional hostnames to be the best solution: https://news.ycombinator.com/item?id=6540044 (top comment by gmjosack)

I worked in the field of infrastructure documentation for a while, applying configuration management principles to it.

Naming is very important and often done poorly. The positional scheme you link to works very well.

You need to label things, and bear in mind that you might be asking some outsourced third party "pair of hands" who has no knowledge of your environment to reboot a box - you want to be sure it's the right one!

Active equipment like servers often have naming conventions around function etc, but consider what happens if someone remotely renames the server and its physical label is not updated...

For passive equipment, e.g. patch panels, we recommended a suffix indicating U position (being consistent about whether Us count from the bottom, and which edge defines the U position). Most places start with U1 as the bottom U position, and equipment that might take up Us 1&2 is defined as being in U1 (bottom edge). Alternatives OK, but be clear and consistent.

Many organisations are not definitive on location ids, down to city, name of building within the city, even the names of rooms within a building - multiple names often used!

Examples might be:





(without U suffix is fine for such kit often)

Lots of fun in this area!

I use pokémon names to name computers. I just pick the last 9 bits from the computer's MAC address, add 1, and check which pokémon has this number in the national pokedex. This scheme can be adapted depending on the network. Perhaps you could use the names of legendary pokémon to name servers and more important computers. It's easy to find images, ASCII art and perhaps even toys to put on the computers to help identification. And even if your network gains 40 new computers each year, the number of existing pokémon grows faster than this and you won't run out of names. :-)
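The arithmetic is easy to script. A minimal sketch, assuming the MAC is written as colon- or dash-separated hex; the function name `pokedex_number` is made up, and the lookup from number to name is left out on purpose.

```python
def pokedex_number(mac):
    """Take the last 9 bits of a MAC address and add 1 (result is 1..512)."""
    value = int(mac.replace(":", "").replace("-", ""), 16)
    return (value & 0x1FF) + 1

# 0x0106 & 0x1FF = 0x106 = 262, plus 1 gives 263
print(pokedex_number("00:50:56:01:01:06"))
```

Nine bits only give 512 distinct values, so two machines can land on the same number; a real setup would need a rule for collisions.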

I suppose it makes sense to get rigorous and logical about hostnames from an admin point of view, but I do miss the romance and personal expression of naming machines after.. well, scifi entities, fairy-tale dwarves, whatever.

One of the many things I like about my ISP is that they apparently feel the same:

  traceroute to (, 30 hops max, 60 byte packets
  1  X.X.X.X (X.X.X.X)  7.057 ms  8.826 ms  8.859 ms
  2  c.gormless.thn.aa.net.uk (  26.691 ms  28.136 ms  27.243 ms
  3  c.aimless.thn.aa.net.uk (  29.673 ms  30.793 ms  30.273 ms

I used to go the multiple subdomain approach until I discovered some wildcard SSL certificates do not support more than one level of subdomains. Therefore, I now do something like staging-nyc.example.com

All wildcard SSL certs only support one level of subdomain.
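A simplified sketch of why that is: in certificate name matching, a `*` stands in for exactly one label, so the label counts have to agree. The function `wildcard_matches` is hypothetical and deliberately naive; real validators apply further rules that are not modelled here.

```python
def wildcard_matches(pattern, hostname):
    """Match a certificate name where '*' stands for exactly one whole label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False  # a wildcard never spans more (or fewer) than one label
    return all(p == "*" or p == h for p, h in zip(p_labels, h_labels))

print(wildcard_matches("*.example.com", "staging.example.com"))      # True
print(wildcard_matches("*.example.com", "web.staging.example.com"))  # False
print(wildcard_matches("*.example.com", "example.com"))              # False
```

That is the reason a flat name like staging-nyc.example.com stays covered by a plain `*.example.com` cert while nyc.staging.example.com does not.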

You can use a SAN cert to create a multi-level wildcard cert. The cert would be issued for:

    example.com, *.example.com, *.*.example.com, *.*.*.example.com [...]

That's technically possible, but you'll have a very hard time finding a CA that's willing to issue a crazy cert like that.

It's most likely internal, you run your own org ca.

It's easy to do once you have a good relationship with your CA.

You can also self-sign for internal machines, and leave your public ones one wildcard deep.

In what world is `tst` a better part of a hostname than `testing`? All this article seems to discuss is a naming scheme for lazy people who don't want to type long and descriptive hostnames.

That was my first thought, the author seems to like three-letters, I don't know why `sql` is better than `db` for database, considering not all databases are based on the SQL standard, and `mta` for mail server? Really?

Author here...the important thing isn't the abbreviations, but having a standard that you can agree upon and use consistently. The listed items are just an example scheme that tries to keep the length consistent to make scanning through a list visually a little easier.

I think you should make that more clear. It doesn't say anywhere in the article "for example, you could use something like this".

It is easy for the hacker news crowd to forget this in an age of google apps, but lots of organizations run their own -- possibly substantial -- email services, and in a well-scaled email setup the mta may be a completely separate machine from the mda/msa/mra.

Having a "mail server" might be fine for lots of places, but having disparate machines with specific functions is not invalid.

Just want to highlight one thing:

Essentially, the hostname should not have any indication of the host’s purpose or function, but instead acts as a permanent, unique identifier to reference a particular piece of hardware throughout its lifecycle

It follows that if you are using virtual machines and configuration management, where the lifecycle of a VM or amazon instance is trivial compared to an actual piece of hardware, there is less benefit to separating hostname from functional name.

Actually, it's still quite nice if the hostname has no relation to the server's purpose – the functional name can then outlive the hostname if you throw away and recreate VMs regularly.

Our primary server names identify the location (for us), and we try to keep it very short:

<data center><rack><unit>

So for example: sea1r001u01. Our team in the data centers provides the initial name based on where they place the server. The unit value starts at 01 at the top of the rack and increases sequentially.

This means that if we have a hardware issue, the data center teams only need the hostname to know exactly where the server is located and get to it.
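For tooling, a name like that splits cleanly. A minimal sketch, with field widths guessed from the single example sea1r001u01 (letters plus a number for the data center, three digits for the rack, two for the unit); the name `parse_location` and those widths are assumptions, not the poster's actual rules.

```python
import re

# Assumed layout: <data center>r<rack, 3 digits>u<unit, 2 digits>
LOCATION_RE = re.compile(r"^(?P<dc>[a-z]+\d+)r(?P<rack>\d{3})u(?P<unit>\d{2})$")

def parse_location(hostname):
    """Split a positional hostname into data center, rack and unit."""
    m = LOCATION_RE.match(hostname)
    if m is None:
        raise ValueError(f"not a positional hostname: {hostname!r}")
    return {"dc": m["dc"], "rack": int(m["rack"]), "unit": int(m["unit"])}

print(parse_location("sea1r001u01"))  # {'dc': 'sea1', 'rack': 1, 'unit': 1}
```

Keeping the `r` and `u` separators in the name means the parser does not have to rely on fixed offsets for the data center part.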

I've been through a number of naming schemes for a few different purposes.

Firstly, all VMs are named after their function (AD01, named01, etc.); we never re-purpose VMs.

Workstations went through a few conventions dependent on size:

o sysXXX

o french revolutionaries

o generals from WW2

o Fellows of the royal society

o Towns of the UK

My favorite is fellows of the RS, as a lot of them have wiki pages.

The key thing is that you don't recycle names. When a machine dies, so does the hostname

Why do they do this? Why do infrastructure people suffer so much pain learning the same lessons software developers learned many years ago?

1) Join the vowel generation. If your tools cannot handle unlimited length names, use different tools.

2) Use descriptive names. Don't name them after your pets.

3) Rigid rules like Hungarian Notation produce cryptic and irrelevant noise.

The idea proposed by OP is precisely NOT to use a descriptive name as the base A record.

Why? Because the functions served by this machine may change over time -- so any descriptive name you use may become confusingly inaccurate.

But you need a way to refer to the _machine_ itself, whose functions may change over time. The best way to do this is to pick a _non-descriptive_ unique identifier that will always refer to that machine. As the OP mentions, that non-descriptive unique identifier "will mostly be useful to operations engineers, remote hands, and for record keeping."

Then you make a descriptive name as a CNAME. They don't mention it, but what that CNAME points to might change over time.

I have found this general principle to be a very good one even in my much smaller shop. When we used descriptive names as the 'main' canonical machine name, this led to huge confusion when roles changed -- you either had descriptive names that were no longer accurate or confusing, or you changed them but still had documentation or notes or tickets referring to the old name, etc.

The rest of the OP goes into a suggested scheme for making the descriptive names (the CNAME's), in a way that will actually _be_ descriptive of what developers and ops will need to know. Their scheme seems pretty decent to me, but definitely depends on the particular context and domain of the shop, including how many machines you have, if they are geographically dispersed, etc.

But the basic concept of using a _non-descriptive_ name as the basic machine name A record, with descriptive names being CNAMES to it -- is I think pretty widely applicable and wise. (And that mnemonic project list is pretty useful for creating non-descriptive unique identifiers that are still easy to remember and record).

He didn't say to use a functional name, he said descriptive. If the machine is an hp bladecenter 7000 series, reflect that in the name. The functions may change but the silicon will not. If you're paranoid about security, create class names that refer to classes of hardware and number identical/similar ones.

Also it's useful to put the dc and rack number in subdomains after the hostname (hpbl7x0001.r55.la.example.com). The next time you're running around the cage reading every single 1U's tag name going "AARRRGHFHH WHERE THE FUCK IS SLIMSHADY.EXAMPLE.COM??", you'll thank me.

(i have no idea why, but it's much easier to get a dns admin to create a new A record than it is to get a NOC employee to update the network inventory database)

(also: good luck using your network inventory db to look up a rack location when the rack/switch that's hosting the network inventory db is down...)

Just tell the real server to please stand up.

> (i have no idea why, but it's much easier to get a dns admin to create a new A record than it is to get a NOC employee to update the network inventory database)

Well yeah, if parts of your organization are dysfunctional then you find ways to hack their functions into the more functional parts of your organization. But that doesn't make it the right approach in general.

So do blades have internal speakers?

    apt-get install beep
then play some marco polo

> Why do they do this?

The short form is "for historical reasons". That's not "they're too lazy to fix it" reasons; rather, there is a very large amount of hardware deployed elsewhere that you might have to interact with, and it makes assumptions. You have to fit within the lowest common denominator of all those assumptions.

> If your tools cannot handle unlimited length names, use different tools.

Because that's not possible. According to the RFC[1], FQDNs are limited to 255 characters. Individual components (i.e. between the dots) are limited to 63 characters. Having to account for e.g. IDN means space limits are an active concern.

[1] http://www.ietf.org/rfc/rfc1035.txt
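For anyone generating long descriptive names, a quick length check is cheap. This is a minimal sketch based on the RFC 1035 limits of 63 octets per label and 255 octets for the whole name in wire format; it assumes plain ASCII input (an IDN would have to be converted to its ASCII form first), and the function name `length_problems` is made up.

```python
MAX_LABEL = 63   # octets per label (RFC 1035)
MAX_NAME = 255   # octets for the whole name in wire format (RFC 1035)

def length_problems(fqdn):
    """Return a list of length problems; an empty list means the name fits."""
    labels = fqdn.rstrip(".").split(".")
    problems = [
        f"label {label!r} is {len(label)} octets"
        for label in labels
        if not 1 <= len(label) <= MAX_LABEL
    ]
    # Wire format: one length octet per label, plus the terminating root label.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    if wire_length > MAX_NAME:
        problems.append(f"name is {wire_length} octets on the wire")
    return problems

print(length_problems("web01.prd.nyc.example.com"))  # []
```

Because of the per-label length octets, the longest name you can actually type is a little shorter than 255 characters.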

Get off your high horse and actually read the article - descriptive names should be temporary CNAMEs. There's nothing rigid about an easily changeable pointer. And 'just use different tools' is a naive and reactionary approach.

Their "descriptive" names are Hungarian Notation for servers.

We decided to be democratic in our office and arrange a vote for the name of the new server. Unfortunately this resulted in the server being called "Ron's Electric Love Machine". (Thankfully this abbreviates to RELM which is a bit easier to deal with)

I name my servers with names fitting ships so that I may have a proper fleet of boxen.

I actually had this problem a while ago but now I use a script I wrote:


To generate one during the setup of the server automatically

This list of hostnames comes to mind from an older thread: http://seriss.com/people/erco/unixtools/hostnames.html

I generally use hyphens instead of subdomains for SSL certificate purpose and go general-to-specific in the order:




For clouds I use the cloud and zone/region:

  rs-ord-web-prod-01.domain.com
  aws-east1-app-stg-01.domain.com

One thing to consider: .'s cooperate better with graphite, salt-stack and possibly ssh-configs, since in those tools, * matches everything but .; Thus, something like * .foo.* .* .domain.com.metric selects the metric for all datacenters & environments for the application foo. Using hyphens, this doesn't work.
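A small sketch of that matching rule, taking the comment's description at face value: `*` matches anything except a dot. This is not the code of any of the tools mentioned, the helper `dot_glob` is hypothetical, and the hostnames are invented.

```python
import re

def dot_glob(pattern):
    """Compile a glob in which '*' matches anything except a dot."""
    parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + "[^.]*".join(parts) + "$")

dotted = dot_glob("*.foo.*.*.domain.com.metric")
print(bool(dotted.match("web01.foo.prd.nyc.domain.com.metric")))  # True
print(bool(dotted.match("web01.bar.prd.nyc.domain.com.metric")))  # False

# With hyphens as separators, '*' can swallow the separators themselves,
# so the pattern no longer pins down one field per wildcard.
hyphened = dot_glob("*-foo-*-*.domain.com.metric")
print(bool(hyphened.match("web01-foo-prd-nyc-spare.domain.com.metric")))  # True
```

With dots, each wildcard is confined to one field; with hyphens, the last wildcard happily matches "nyc-spare", which is the ambiguity being described.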

(Why can't I escape * 's? ....)

I really dislike the suggestion of labeling devices by what they are.

fwl, ups, pdu, rtr, swt, etc... all really give away too much info imho.

Yes, someone may discover what the device is on their own via nmap -O or something, but telling someone up front this is a PDU and if you mess with it, it may crash an entire cabinet... is just... silly.

I tend to follow the parent comment's suggestion more, labeling by location and environment (prod, dev, etc).

Assuming you're doing split-horizon DNS, those records should be hidden from the outside. And the only way to detect the CNAMES other than brute force scanning of a DNS zone is to do a zone transfer. And you only have zone transfers allowed from other relevant DNS servers, right? And your monitoring software will catch a brute-force scan, right?

Remember that the reverse dns always resolves to something like orange.example.com, which gives away no information at all.

Not if you don't control the DNS, and/or don't notice a crawl in the background network noise.

I've never seen MNX before. I noticed in their pricing and features that they offer "additional IPs at no extra cost". Has anyone had experience with this? Are there limits?

Thanks for checking us out!

We ended up chatting -- but for further clarification for anyone else -- yes, you can have additional IPs if you can justify them. There are practical limits, but we haven't come across a situation where we've needed to enforce a hard limit at this point.

Another interesting naming scheme: http://www.reddit.com/r/nameaserver

2 hours and no reference to the relevant XKCD?


Because it's in the top of the page of the story you're commenting on? I'm sure we can keep hn from turning into /. for a couple of more years yet... ;-)

Well, it was posted in the article itself, so I'm guessing that's why no one had reposted it here.

Where do you put the Star Wars references?

I wish I had this about 5 years ago when I was knee deep in server naming. Good article anyways!

Big choice of names? Memorable? Fun? Just name them all after Muppets.

This scheme has no way of denoting if a machine is virtualised or not.

How would this naming scheme work if you're using heroku?

Good post. Does anyone know this company? To the company: VERY BAD call requiring users to sign up to check custom instances, especially at the beginning; you should prefer new customers to new signups.

Thanks for the feedback! We're still optimizing, and this was a mistake not to put the pricing calculator on the main website.. we're working on that and will move it to the main page soon.

Don't forget to encode the physical location if that's a thing. It might be important to know that web02.prd.nyc.com is in Row D, Cage 4, Slot 5, Front.

was the mnx.io article talking about the hosts file? or what you put in your dns manager....or both?

"and yet you settled on Caroline as a name in a few seconds" ... that's a pretty good cartoon

stg or str is better than sto (storage).

dev, stg, prd... omg, what's so bad about actually spelling it out?! It's not like you require those precious bytes for your business to succeed.

> "The actual purpose abbreviations you use aren’t important, just pick a scheme, make sure it’s documented, and stick to it."

Given that in some cases you might be typing them very frequently, some people would want to shorten them. I assume it's more to do with typing speed than saving bytes somewhere.

Thanks for making it so clear what services your machine provides on the public internets. I like advertising on the public internets too and sometimes I even run SSH on port 22 with root access enabled for convenience sake. Also, since I have so many servers, I make sure to run fingerd and include hints to what the root password is in my plan (everybody knows what it's like to forget a pw).

Another good thing to do, I was told that according to RFC1413, always make sure you are using identd service, it's a very helpful protocol.

I just don't get why people are so paranoid on the internet. I've been a sys admin since the 90's and never once had a virus or been hacked.

I guess with everything now being a "devops" world, we should just focus entirely on convenience and forget about the old-timers' annoying speeches on "SPOF" and "Security/Risk Management" that block me from pushing my awesome new codes to production as quickly as possible. We use Chef and CI so I don't ever have to even think about the server itself, other than that it is running Ubuntu, which is super fast and a very secure OS.

I love when new software comes out, I just grab the recipe and ship it! Even if it is geared to server farms with hundreds of machines, I probably need to use it for my 5 server cluster.

Anyway, sorry to get off topic.

I'm an infosec guy, and I think sometimes we can make things so difficult for attackers that it becomes too difficult for our sysadmins to do their jobs. Using obscure, pseudorandom server names is a good example of this. If I get an alert that says, "uxeprdweb05 is down", I know to immediately check the Mason, OH, production hypervisor cluster/web server rack. If the alert says, "I-2A51DDBCE621 is down", I have to look up that host in my CMDB before starting any troubleshooting. Sure, "uxeprdweb05" leaks information to attackers, but it's nothing they couldn't find out with a port scan. In the meantime, I and my colleagues don't have to go through a lot of unnecessary indirection to do our jobs.

As to the security benefits of configuration management tools, again, speaking as an infosec guy, I love them. I can push a locked-down, default-deny base config out to all of the computers under my care, and I don't have to worry about making a mistake or missing a step or forgetting to document something. I can work with the devops team to set up automated functional and security testing using a continuous integration tool, so that config changes (including security updates!) get vetted automatically in a development or test environment before being pushed to production. I can put the whole config under revision control, so if there is a service or security incident due to a config change, we can figure out how our development and testing processes failed us - and how we can improve them.

And finally, speaking as an ardent FreeBSD user, I wholly agree that Ubuntu is utter rubbish.


tl;dr: Security through obscurity never is the best choice =)

[0] https://en.wikipedia.org/wiki/Security_through_obscurity

I am not talking about insane cryptic obfuscation, i am simply stating that advertising services by publicly naming your server what it runs is flat out n3wb stupid.

And my point is that public-facing services already advertise all kinds of information about themselves. Obscure codenames won't make attacks harder to run, while they will significantly increase the level of effort on the part of your sysadmins. Infosec is all about cost-benefit analysis and risk reduction, right? Well, in my view the administrative overhead costs of a naming convention like "Star Trek characters" outweigh the supposed security benefits.

I will grant you that it may come down to differences in our threat models. In my case attacks are impersonal - it's the malware or botnet /du jour/ that I have to deal with on a regular basis, versus the kind of APT facing journalists, civil rights organizations, or militaries. Even then I'm not sure that naming conventions that leak less information than a basic port scan will slow APT down. Too, the administrative overhead caused by forcing sysadmins to constantly go to a CMDB just to do basic troubleshooting might be a cost targeted organizations would be willing to pay. In my case, we run really, really lean, so we do what we can to make our I.T. services self-documenting (naming and numbering conventions that have meaning across multiple network layers).

There are different extremes. I've dealt with plenty of security admins whose religious adherence to the cults of compliance and network segmentation makes it impossible to get anything done.
