Hacker News
A couple neat things about using chemical elements as hostnames (greer.fm)
165 points by arundelo on July 11, 2010 | 67 comments

Naming is a classic bikeshed. Sure, a good naming system is a good thing, and helps you recognize what box is where. But once you hit between 20 and 200 hosts (depending on your memory) you are going to need to stick 'em in a database and sort 'em into groups to manage them sanely, and at that point, naming schemes won't mean much anyhow.

The problem is that as a bikeshed, /everyone/ has an opinion and everyone wants to have that opinion heard. I've been places where we spent more effort arguing about naming schemes than it would have taken to write the database and tools for retrieving groups of hosts.

So generally, when this comes up in situations where I'm in charge, I loudly declare it a bikeshed, and we choose the scheme essentially at random.

as a person who deals with tens of thousands of machines/network devices, i disagree that it's a bikeshed. yes you have to refer to a database for specific information. but it can be a huge time saver if the information that is most generally needed is just there, not needing to be looked up.

for us, we basically have 4 types of information in a fqdn: datacenter, lan, type of host (web server, database, filer, router, switch, etc) and an additional datacenter identifier. after the root domain comes the datacenter, and if there's an additional subdomain that's the lan. after that is the host name which is a number prefixed by the host type. the number is a randomly-selected number within a certain range. the range indicates what datacenter it's in, so 4000-5999 are in one place and 6000-7999 are in another place (in case whatever you're looking at did not include the subdomains). now we can refer to 'ws1234' and know what it is and where it is. it still has much more specific information, but that can be looked up as necessary. since it's less random it's also easier to read hostnames to people and grep through logs.
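A minimal sketch of decoding that kind of hostname. The type prefixes come from the comment; the range boundaries match the example, but the datacenter names and the exact prefix list are hypothetical:

```python
# Decode hostnames like 'ws4234' into type + datacenter.
# Ranges are from the example above; datacenter names are made up.
import re

DATACENTER_RANGES = {
    range(4000, 6000): "dc-east",
    range(6000, 8000): "dc-west",
}

HOST_TYPES = {"ws": "web server", "db": "database", "fl": "filer",
              "rt": "router", "sw": "switch"}

def describe(hostname: str) -> str:
    """Split a hostname into its type prefix and number, then look both up."""
    m = re.fullmatch(r"([a-z]+)(\d+)", hostname)
    if not m:
        raise ValueError(f"unrecognized hostname: {hostname}")
    prefix, number = m.group(1), int(m.group(2))
    host_type = HOST_TYPES.get(prefix, "unknown type")
    dc = next((name for r, name in DATACENTER_RANGES.items() if number in r),
              "unknown datacenter")
    return f"{hostname}: {host_type} in {dc}"

print(describe("ws4234"))  # ws4234: web server in dc-east
```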

after having worked with much more confusing naming conventions, i like this simple straightforward method because it's easy to read, talk about, and gives the basic information i need before i look up the specifics in the inventory db. can you do without it? of course. but having it makes my job easier.

Declaring bikeshed should be like calling shotgun--whoever declares it should make the decision.

But... what if they make the wrong decision!? I think we need more time to consider our options here.

"choose the scheme essentially at random."

Given that, as you say, it all ends up in a database anyhow, at work where we have ~100 machines or so with various names, there simply is no scheme. Why should there be? The real key is that it's in the database and we enforce that (via saying that if you grab something and it's taken, the database wins), but beyond that it's all just artificial anyhow. We have romanized Japanese numbers and characters from Zork, the Hitchhiker's Guide, and the Legend of Zelda, places where one person took their favorite photos (used as logos for the machines), several other vague schemes I've only caught wind of since I'm not part of the relevant groups, and straight-up utilitarian naming ("printer.subdomain") and so on, and it doesn't cause a problem. (IT itself does have its own naming scheme; we're engineering. It makes good sense for IT, I think.)

I wish I could upvote you a thousand times. Naming discussions are classic bikeshedding. Everyone has an opinion, they'll argue endlessly and vigorously for their preferred naming scheme, and in the end it has very little real impact.

IMHO, almost any attempt to create some kind of meaning in the naming convention ends up sub-optimal. For example, I've seen people include information about the server hardware or OS in the name (i.e. something like dellwinnt01), or locations (atl-row1-01), etc. All of which fail miserably when there are server hardware/software updates, data center moves, etc.

In the end, a server name should be easy to say, easy to spell, easy to remember, and shouldn't be anything that's likely to be controversial or offensive. No other criteria really matter.

Yeah. I remember one place that did the thing where they named the box based on the rack and the 'slot' in the rack. The problem was that this was the /primary key/ in the database, so as troublesome hosts got swapped out for diagnostics, we lost their history. We had no idea if this was the second time the box was sent in for diagnostics, or the 50th.

Easy solution to that: put a label on the box and put a line there when it comes in for diagnostics.

This place had something like five thousand servers over several data centers. Like most places that size or larger, the admin never actually sees the servers. You (the admin) get a serial console and at best tell the monkey to replace part X or Y. At worst, it gets shipped back to dell, and reported as a new box when dell doesn't fix it (dell never fixes intermittent problems, unless you can tell them /exactly/ what the problem is. )

We have several very very large clusters - as you say, the names become problematic.

Currently we are using a "code" to ID servers.

- 2 letter geographical location code

- 2 letter building code

- 4 number computer ID

(e.g. LIMB0001)

This was after someone had to spend nearly a whole day working out where computer CU2043 was located :)
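The two-letter/two-letter/four-digit code above could be pulled apart with a few lines. This is a hypothetical decoder, assuming the fixed widths described in the list:

```python
# Decode a code like 'LIMB0001': 2-letter geographical location,
# 2-letter building, 4-digit computer ID (widths from the list above).
def parse_code(code: str) -> dict:
    if len(code) != 8 or not code[:4].isalpha() or not code[4:].isdigit():
        raise ValueError(f"bad code: {code}")
    return {"location": code[:2], "building": code[2:4], "id": int(code[4:])}

print(parse_code("LIMB0001"))  # {'location': 'LI', 'building': 'MB', 'id': 1}
```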

yeah. every time i've worked on multi-thousand box clusters you always did operations on sets of servers... and you'd grab those sets using a tool that would pull from a database. if you got an alert for a particular server, you'd go back into the database and you could then pull up all details you have for that server, including how to access it physically or remotely.

My current setup is approaching the 'need a database' scale... (we're at around 60 physical servers I've gotta deal with, if you count co-lo customer boxes, and maybe another 20 virtuals we manage. We've got around 1300 customer virtuals that I don't name or manage. so dealing with that database, either by massaging freeside until it can do what we need to do, or writing something else is one of our next projects.)

Whenever the list of systems gets too large I find I need to come up with a very boring scheme like that. At the university where I worked I set them up as:

- 6-letter department abbreviation

- 3-letter group abbreviation

- 1-letter role/type identifier (e.g. w for workstation, s for server, m for mobile)

- 3-number sequence number, with the highest order digit often grouping related systems together.

It worked out pretty well and it made the systems easy to figure out from the hostname alone.
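A quick sketch of generating names under that scheme; the department and group abbreviations here are invented for illustration:

```python
# Build a hostname from the four fixed-width fields described above:
# 6-letter department, 3-letter group, 1-letter role, 3-digit sequence.
def make_hostname(dept: str, group: str, role: str, seq: int) -> str:
    assert len(dept) == 6 and len(group) == 3 and len(role) == 1
    assert 0 <= seq <= 999
    return f"{dept}{group}{role}{seq:03d}"

print(make_hostname("physic", "lab", "w", 42))  # physiclabw042
```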

I like the idea of geographical names, but I would get a little annoyed when I moved a system from one building to another.

Couldn't they have just used the IP address of the computer to figure out what subnet the computer was on?

many people have big, flat layer two networks... e.g. one subnet might span locations. Now, personally, I think this is a bad idea in nearly all cases, but many other people disagree with me.

The thing is, vendors of expensive MPLS gear like to sell this idea of 'location independent subnets' to upper management as a way to 'simplify' the network.

Now maybe it's just 'cause I don't know layer two well, but god damn it is so much easier to troubleshoot layer three problems than to troubleshoot layer two problems, so I strongly favor making the subnets location and/or rack dependent.

The very idea makes the network engineer in me cringe. There are so many good reasons to establish layer 3 boundaries before crossing any relatively low speed or high latency connections.

No, we don't control the network infrastructure at most of the locations - so two computers could easily have the same IP address on the local internal network as a computer several hundred miles away.

Unless you're NATing between those locations, that's not possible.

It's a distributed cluster - I was trying to avoid being too specific for that reason :) my post was just about our practical example of naming many many computers

Naming schemes by people who haven't had to deal with at least three or four large environments are typically poorly done and result in confusion more than helping you. People think they understand naming, but inevitably they have no clue. Find the sysadmin/network engineer with 15+ years experience in real world environments and just let them come up with a naming scheme. Speaking as one of those people - server A-records = m1, m2, ... m9999, etc..

Routers=rt1.datacenter, rt2... (sw1 for switches, fw1 for firewalls, ...)

Yes, everyone has an opinion and everyone wants their opinion heard. Some people get much too passionate about naming. But in my experience, it's often been the people with the least vested interest who have the strongest opinions, and rather than 'calling bikeshed' the smart thing to do would be to identify the real stakeholders and get their opinions.

If you have hundreds of servers, the team of sys admins using them every single day, who stand a good chance of connecting to all of them at least once in a year, should take priority over a manager who might refer to a small portion of them in conversation at a high level a few times per month or year.

Someone who has to use server names in conversations with customers, investors, partners, and the like should take priority over a guy who wants to name production servers after his favorite porn stars.

One solution is to simply declare bikeshed and pick an arbitrary scheme. I think in most cases that wouldn't be necessary if you muzzled the people who really should never have gotten involved in the discussion to begin with. It may be that once you get consensus on a few criteria (e.g. no porn-star names) you may stall. That's when you declare bikeshed and pick something arbitrary that fits the criteria.

"I admit this naming convention is more for the massive nerd factor than practicality."

I like to use Pokemon names, as the numbers of those first 151 have stuck in my head for many years for some reason. You can subdivide them in many memorable ways too, e.g. the author could say: "On my network, pikachu is my router, rack-mount servers are fire-type pokemon, embedded devices are water, gaming consoles are dragon, and laptops are ground type."

It would also have the advantage that you could use readily available pokemon stickers to attach to each device, and is a nice way to personify devices if you are so inclined - "Ivysaur is playing up again; time for a trip to the pokecentre"

Laptops should be flying type. Then embedded devices can be ground type.

The last sentence should be the first: “If you have more than about 20 machines, just give them numbered hostnames and be done with it.”

The problem with numbered hostnames is that it's very difficult to remember which host is which, even in the course of very short projects (was it ser0483 or ser0438 that needed rebooting?) etc. Although I'll grant you that numbered hostnames are probably the only workable alternative for a large farm of otherwise identical servers (VMWare server farm, rendering farm, etc.)

I've never found numeric names to be a problem. And there is _never_ a problem with how to spell the name. I will take "ssh m435.local.domain" over (real world example, ARGH) "ssh Vanaheimr.local.domain." and "ssh Muspellheim.local.domain" - I want to shoot people who think that's clever.

I like that there is consistency in naming them when considering the periodic table. However, most people don't know the table well enough to know an element's atomic number or classification (noble gas, halogen, etc.), nor would there be any connection between name and purpose. Rather, you must know the meta-information about the element to know the purpose, which makes this a well-suited convention for chemists but few others.

Usually the hostname should be disconnected from the machine's purpose. Instead, use a generic name and then use aliases (either cnames or additional A records) for mapping services to the machine.
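A hedged sketch of what that looks like in a zone file; `example.com`, the addresses, and the machine names are placeholders:

```
; Generic machine names carry A records; services are CNAMEs onto them.
m435    IN  A      192.0.2.35
m436    IN  A      192.0.2.36
www     IN  CNAME  m435
mail    IN  CNAME  m436
```

Moving a service to another box then means editing one CNAME rather than renaming anything.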

I don't think nnutter meant naming machines by their purpose like "mail.localdomain", "www.localdomain", etc. Rather, for example, I once named a QA build-testing machine "yaeger" after the test pilot that broke the sound barrier. Hostnames whose namesake has similar traits (even vaguely) to the machine's purpose can help you remember which machine performs what task on your network, or recall what a machine's purpose is by its hostname. Careful selection of such names provides these benefits without causing any more problems when you move a service to another machine than using any other naming scheme.

this! i like it when the hostname is the physical location of the box, and then aliases to map services to the box, like you suggest.

A periodic table will make the atomic number and element type easy to look up. You can get it on coffee mugs. It'll be fun.

This would be a good way to learn.

It's gonna be a disaster if they ever need to re-address for any reason.

Numbered hostnames are fine if you've got a cluster of identical machines such that there's not much value in differentiation.

'xyz241' is a test router

'xyz211' is a second test router

'xyz262' is a production webserver

'xyz546' is a production webserver

'xyz33' is a development database server

'xyz411' is a development database server for a temporary contract dev team.

'xyz101' is the corporate file server

If you have hundreds of devices to track, the weaknesses of using numbers should be apparent. The only advantage of that naming scheme is organization of the entire monolithic collection. It's easy to select new names and it is easy to enumerate the entire collection at once. Those are definitely advantages, but there are some obvious drawbacks, too. There's no obvious way to tell, short of committing it to memory, that "546" is a webserver.

To track any other pattern, you'll have to encode some sort of information into the names. For example, you could organize the numbers, like most colleges do for classes: <subj> <level-number> * 100 + <course-number>, e.g. MUSC 211.
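The course-number analogy above boils down to one multiplication. A minimal sketch (the meaning of each "level" is whatever your site assigns to it):

```python
# Pack a level (e.g. "webservers are the 400s") and an index into one number,
# college-course style, and unpack it again.
def encode(level: int, index: int) -> int:
    return level * 100 + index

def decode(number: int) -> tuple:
    return divmod(number, 100)  # (level, index)

assert encode(2, 11) == 211      # MUSC 211-style
assert decode(546) == (5, 46)    # e.g. "the 500s are webservers"
```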

I always figured that names are good if you treat your servers like pets -- unique, long-lived individuals with personalities and quirks.

Otherwise, you use numbers, and you treat them like farm animals -- anonymous, members of a species, with shorter lives.

You can get attached to pets, but you shouldn't bond to livestock.

I wouldn't say it's good or bad, it just is. Sometimes being attached to a server can be helpful, but it can also be a hindrance. There's no right or wrong answer.

You also should never have hundreds of pets at once.

Lest you become a crazy cat lady. I like this metaphor.

I would find it easier to remember that the 400's are webservers than that the pnictogens are webservers. You can also make the even/oddness of the numbers significant, like interstate highways.

For practicality, you would combine meaningful names (for the groupings) with numbers (web3, dev5, router2) rather than bijecting the entirety of your company's assets onto a set of integers.

Meaningful names are fine if you have enough servers that every single server can do one thing. But in smaller circumstances the 'web' server may need to do decidedly non-web tasks. And it's difficult to know this in advance, so trying to name things meaningfully will only cause confusion when someone discovers that the database server is running apache or the web server is hosting the local office nfs mounts.

I like the use of something like elements because it lets you use the character of the element as a sort of metaphor for the function of the server - but since it's a metaphor and not a clear name it's understood that you can't rely on the name to know what the server does, and you should keep it documented.

This gives purpose to all the years of Chemistry I learned at school.. Thank you!

At my current job, we've switched from Star Wars characters to colors. I can understand that having a testing machine called "jarjar" is a bit, well, jarring to those with less geeky proclivities. But mentally associating colors with certain servers is a bit too bland. Maybe that's just the way my brain works…

(Worst sin I've committed, name-wise: Naming my desktop PC dunsinane and my laptop birnham)

Just refer to the RFC: http://www.faqs.org/rfcs/rfc1178.html

Yes, this is the PERFECT plan.

But how do I name my router at

I believe that would be "bipentquadium". (http://en.wikipedia.org/wiki/Systematic_element_name)
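The systematic naming linked above is mechanical enough to generate: one root per digit, a trailing "ium", and two elision rules. A small sketch:

```python
# IUPAC systematic element names: nil/un/bi/tri/quad/pent/hex/sept/oct/enn,
# one root per digit of the atomic number, plus "-ium", with two elisions:
# "enn" before "nil" drops an n, and "bi"/"tri" before "ium" drop an i.
ROOTS = ["nil", "un", "bi", "tri", "quad", "pent", "hex", "sept", "oct", "enn"]

def systematic_name(z: int) -> str:
    name = "".join(ROOTS[int(d)] for d in str(z)) + "ium"
    name = name.replace("nnn", "nn")   # ...enn + nil... -> ...ennil...
    name = name.replace("ii", "i")     # ...bi/tri + ium -> ...bium / ...trium
    return name

print(systematic_name(254))  # bipentquadium
print(systematic_name(112))  # ununbium
```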

Very good. Now I can safely append bipentquadium bpq


You name the new element of course. perhaps tahunium? ;-)

unobtanium, adamantium, mithrilium, ...?

Edit: imodium?

we do this as well; the inspiration came from the engineering machine names at umich.edu.

Edit: umich had a lot of funny machine names. The login servers were named for console video games (galaga, moonpatrol, galaxian), while the kerberos servers were (inexplicably) named for schwarzenegger movies: terminator, redheat, lastactionhero, etc.

Overloading the names? Classifications as functional groups?

Um. Ok. That's, um, interesting.

If you start with chemical names and hit the limits of the complexity that permits, are you going to move onward to molecules? Wait for Ununbium to be renamed Copernicium? What's next, molecular host names within the hydroxyl group are assigned to management?

If you're going to use a "corporate-style" naming scheme for your hosts, then just encode whatever it is you want to convey into the host name. Location or whatever.

Or just select random names.

Having the IP address encoded in the name strikes me as particularly problematic, too. Hosts and subnets and VLANs tend to come and go, and names and VLANs and address assignments tend to shift.

And naming based on physical locality is an increasingly quaint concept in the era of virtualization and client mobility.

After a while, most any naming scheme will prove insufficient or will unravel, so why not just have an eye toward the likelihood of distributed management and (increasing) complexity from the onset?

Seems better to spend your time getting (for instance) SNMP and DNS records configured and working for all your boxes, and less time making (more) work for yourself with explaining and maintaining complex host name schemes.

As the article states:

"I admit this naming convention is more for the massive nerd factor than practicality. If you have more than about 20 machines, just give them numbered hostnames and be done with it."

Seems like this extends to domain names too. So why even have an address bar at the top of the browser? What will replace it?

Many browsers have two text input boxes.

In more than a few places, a Google search string (or the browser's search input box) is already the preferred contact path; not the URL.

What might replace URL-based navigation more widely, I don't know. We went from ARPA not all that long ago to the Big Seven to however many dozens of TLDs we have now. We'll certainly find out, as the URLs are increasingly hideous, and we're just a few years into this current naming structure.

There are certainly potential business opportunities here, too. Approaches to better solve a distributed planetary-scale address book than "just" based on a domain name, and better than a brute-force search engine.

Yes, I've wanted an "authoritative" directory of web pages for some time. Every since Yahoo gave up on it, I've had to learn clever ways to tease the right pages out of google.

How about a search engine that ALWAYS returns the canonical page for any technical/commercial/copyright/patented term FIRST! Instead of somebody's dog named "rfq" or a car for sale by a gal from Concordance, OR.

In my household, we've found that place names can be useful for memorable and informative hostnames. Servers in the garage get Japanese cities; upstairs, northern places (being English, we use Scottish cities); the basement, Australian cities; and the central computer is Rome. After all, all roads lead to Rome!

We've solved it by having the actual hostname be the MAC address from the machine's first NIC preceded by a 2 character code for what type of machine it is (DT for desktop, VM for virtual etc) and then allowing for the assignment of an "administrative" name that goes in the internal DNS and is shown in green on the command line when logged in.

This lets the suits pick goofy names for their boxes based on whatever they've seen on tv recently and allows us to automatically generate unique hostnames based on physical properties of the actual machines.
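A sketch of generating names under that scheme. DT and VM are the codes mentioned above; the SV code and the normalization choices are assumptions:

```python
# Build a hostname from a type code plus the first NIC's MAC address,
# stripped of separators and lowercased (normalization is assumed).
def machine_hostname(machine_type: str, mac: str) -> str:
    codes = {"desktop": "DT", "virtual": "VM", "server": "SV"}  # SV is assumed
    clean_mac = mac.replace(":", "").replace("-", "").lower()
    if len(clean_mac) != 12:
        raise ValueError(f"bad MAC: {mac}")
    return codes[machine_type] + clean_mac

print(machine_hostname("desktop", "00:1A:2B:3C:4D:5E"))  # DT001a2b3c4d5e
```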

The sweet lack of multiple "Gimlis" and "Gandalfs" on my network warms my geek heart.

This becomes somewhat more problematic (but not impossible) when you need to start replacing NICs, and moving MAC addresses over to new servers. Some linux IPv6 stacks (RHEL) tend to work better in autoconf mode (EUI-64 promotion of their MAC address) - particularly across multiple versions of the kernel (static addressing has given us problems) - so we tend to do the migration of IPv6 addressing by changing the MAC address on the NIC, which, ironically, offers more straightforward and predictable behavior across multiple RHEL 5 releases than trying to force a particular static IPv6 address onto the host.

If you end up having to play games with your MAC addresses, basing hostnames on them is a suboptimal strategy.

Asset tags, on the other hand.....

This is great for highly connected devices like routers and switches because each element has an unambiguous abbreviation which makes labeling cables much easier.

naming servers is so BAWS (before AWS). :)

I did this back in 1995. I still remember the first three hosts: mercury, carbon, helium ...

Cute hostnames are pretty obnoxious. My preferred method of building out datacenters is more self-documenting, and I've done it on 4 rack colo buildouts and 100 rack colo build-outs.

I just use a naming format of Loc-Cabinet-Rackspace.

In my Seattle datacenter I know that sea-c01-s1 is a server that's in my Seattle datacenter, cabinet 1, slot 1. I know its PDUs are plugged into port 1 of PDU-LEFT and PDU-RIGHT in that cabinet. I know its two ethernet ports are plugged into port 1 of sea-c01-sw-left and sea-c01-sw-right. I know its LOM cable is plugged into port 1 of sea-c01-sw-mgmt.

(I handle multi-U spaces by naming the server based on the bottom-most rackspace it uses, and I stay consistent. If sea-c01-s1 takes 10 rackspaces, then the server above it is sea-c01-s11. Beyond that, my configuration management tools tag spare resources when needed, and release them when unneeded. Cute hostnames are of no use to me, or anybody building out a datacenter that they'd like somebody else to be able to manage without months of knowledge dump.)
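A hypothetical parser for the Loc-Cabinet-Rackspace format, assuming the sea-c01-s1 shape described above:

```python
# Parse 'loc-cNN-sNN' names into their location / cabinet / slot fields.
import re

def parse(name: str) -> dict:
    m = re.fullmatch(r"([a-z]+)-c(\d+)-s(\d+)", name)
    if not m:
        raise ValueError(f"bad name: {name}")
    return {"location": m.group(1),
            "cabinet": int(m.group(2)),
            "slot": int(m.group(3))}

info = parse("sea-c01-s1")
# By the convention above, switch/PDU/LOM ports match the slot number:
print(f"LOM port: {info['slot']} on sea-c{info['cabinet']:02d}-sw-mgmt")
```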

What happens when you have to move a server for some reason? Has that never happened?

My guess: logical names (like www or db or ns) are CNAMES, so that when the server moves and gets a new named based on location, the CNAME is adjusted as well.

You shouldn't need to rename a server when you move it, esp. since you might not know where its hostname is used (e.g. at least one major database vendor at software installation time embeds the hostname in various generated config files).

Machines should have sequential names, services should be VIPs (or CNAMEs if you must) and everything else should be in a configuration management database.

Yep — exactly. To use software parlance, if using this convention would halt or inconvenience your production, it's a smell and a likely indication you've got a spaghetti setup on your hands.

At least the way I like to build infrastructure, physical servers don't really matter. If for whatever reason servers had to get moved, their MAC address mapping would be updated to reflect its location. I wrote some code a while back which grabs all the MAC addresses from a switch and uses that to map their physical location. It's quite simple: it looks at a port to see if it has multiple MAC addresses behind it; if it does, it ignores it; if not, it maps that port out and labels it appropriately. There's also a little bit of logic to figure out what kind of a device is on a port (whether it's another switch, a server, a pdu, etc), which also helps with my own personal continuing de-emphasis on the physicality of computing.
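The port-classification heuristic described above can be sketched in a few lines. The input shape (a port-to-MAC-set table) and the labels are assumptions, not the commenter's actual code:

```python
# Classify switch ports from a MAC table: a port with multiple MACs behind
# it is probably an uplink to another switch, so skip it; a port with
# exactly one MAC is an end device we can map to a physical location.
def classify_ports(port_macs: dict) -> dict:
    result = {}
    for port, macs in port_macs.items():
        if len(macs) > 1:
            result[port] = "uplink (ignored)"
        elif len(macs) == 1:
            result[port] = "end device"
        else:
            result[port] = "empty"
    return result

table = {1: {"aa:bb:cc:00:00:01"},
         2: {"aa:bb:cc:00:00:02", "aa:bb:cc:00:00:03"},
         3: set()}
print(classify_ports(table))
# {1: 'end device', 2: 'uplink (ignored)', 3: 'empty'}
```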

"I admit this naming convention is more for the massive nerd factor than practicality."

