
> In what context would a primary key change, even when sharding?

The only time I've seen that happen is when a DBA evangelized for just using 'natural' keys, which in this case were slugs based on the item name. The problem came when users decided they wanted to rename things and then wondered why the IDs they were seeing in URLs didn't match the new names.

I'm not a believer in natural keys anymore.

> If you need sorting in your table, provide some kind of indexed timestamp.

A v4 UUID contains a timestamp. So it seems a bit wasteful to add the separate timestamp column. Why not just use the timestamp built into the ID? This is what MongoDB's internally-generated IDs do. (Though they're not quite UUIDs.)

There's a catch with v4 UUIDs though: the timestamp portion of the ID is not the most significant digits, so a simple ORDER BY doesn't do what you want.

I like the solution developed by Instagram, which Rob Conery discusses at the link below. It uses a pretty simple plpgsql function to generate a 64-bit integer based on the current time, a shard ID, and a per-shard sequence. This lets you generate 1024 IDs per millisecond per shard. It also puts the timestamp portion of the ID first, so if all the shards' data ends up in Hadoop or something you can order consistently across the whole set.

http://rob.conery.io/2014/05/29/a-better-id-generator-for-po...
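
For anyone who wants to see the bit layout without reading plpgsql, here is a rough Python sketch of the same idea. The 41/13/10 bit split and the custom epoch are assumptions taken from my memory of Instagram's public write-up, and `make_id` / `split_id` are names I made up for illustration. In Postgres the sequence value would come from a per-shard sequence rather than being passed in.

```python
import time
from typing import Optional, Tuple

# Assumed layout: 41 bits of milliseconds, 13 bits of shard ID, 10 bits of sequence.
EPOCH_MS = 1314220021721  # assumed custom epoch (ms since Unix epoch); pick your own
SHARD_BITS = 13
SEQ_BITS = 10


def make_id(shard_id: int, seq: int, now_ms: Optional[int] = None) -> int:
    """Pack time, shard and sequence into one integer, time in the high bits."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    elapsed = now_ms - EPOCH_MS
    shard_part = (shard_id % (1 << SHARD_BITS)) << SEQ_BITS
    seq_part = seq % (1 << SEQ_BITS)  # wraps at 1024, hence 1024 IDs per ms per shard
    return (elapsed << (SHARD_BITS + SEQ_BITS)) | shard_part | seq_part


def split_id(value: int) -> Tuple[int, int, int]:
    """Recover (timestamp in ms, shard ID, sequence) from a packed ID."""
    seq = value & ((1 << SEQ_BITS) - 1)
    shard_id = (value >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    now_ms = (value >> (SHARD_BITS + SEQ_BITS)) + EPOCH_MS
    return now_ms, shard_id, seq
```

Because the time bits are the most significant ones, a plain integer sort (or ORDER BY on the ID column) orders rows by creation time first, regardless of which shard produced them.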




Item name was never suitable as a key: it's not stable. Natural keys do exist, they are just very rare. And even when you think you've got one, it is often better to err on the conservative side. A good anti-example is the SSN, often used in textbooks: SSNs do change. Other than DNA sequences I can't think of a good person key. Just use an internal surrogate and be done with it.


> Other than DNA sequences I can't think of a good person key.

In theory, one could use 3D geoposition at time of birth, time of birth, and sibling order (for twins / triplets / etc delivered surgically), where said order is dictated by the parent(s) or ob/gyn present. Of course, the main problem with this in practice is that not everybody has this information.

I would have said you could do something via retinal imagery, but not everybody has eyes. If we had non-invasive neural imagery, would it maybe be possible to derive a key from a simplification of a person's physical brain topology?



Not to mention all the identical twins out there.


In some countries national ID numbers are good primary keys, but that's only when they're designed (and used internally) as such by governments, e.g. Finnish, French, or Israeli ID numbers. Even then, though, using them limits you to people who are in the system: no dealing with international customers, for example.


At least in Finland, a national ID can change in some cases, e.g. when you change your legal gender or if it's repeatedly misused by someone else (as in an identity theft scenario).


I mixed up UUID versions above. v1 UUIDs are the timestamp-based ones, and v4 UUIDs are purely random.



