I'm curious, did you ever do testing on whether converting a UUID to a pair of...

netcraft · on Oct 27, 2017

The two biggest things it gets us is an identifier that is unique across the database and also something that the application layer can generate and then provide to the database - instead of inserting a record and finding out what the id is after the fact. Secondarily, it means that someone cannot guess what the next record is by trying to increment or decrement an integer key, but like you say, there are ways around that.

We do plan to eventually convert the text versions into real uuids, but our performance/size isn't an issue at the moment and it's a lot of risky engineering work to do.
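
To illustrate the point about application-generated ids, here is a minimal sketch using Python's standard `uuid` and `sqlite3` modules. The `users` table and the `insert_user` helper are hypothetical, and storing the id as text is an assumption that mirrors the "text versions" mentioned above, not a description of the actual schema.

```python
import sqlite3
import uuid


def insert_user(conn: sqlite3.Connection, name: str) -> uuid.UUID:
    """Insert a row whose id was generated by the application, not the database."""
    record_id = uuid.uuid4()  # known before the INSERT is even sent
    conn.execute(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        (str(record_id), name),  # stored as 36-character text (assumed schema)
    )
    return record_id


# Hypothetical in-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT NOT NULL)")

new_id = insert_user(conn, "alice")
row = conn.execute("SELECT name FROM users WHERE id = ?", (str(new_id),)).fetchone()
```

The size concern is visible here too: the text form of a UUID is 36 characters, while `new_id.bytes` is 16 bytes, which is what a native uuid column would typically store.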

kbenson · on Oct 27, 2017

> it gets us is an identifier that is unique across the database and also something that the application layer can generate and then provide to the database - instead of inserting a record and finding out what the id is after the fact

That's a good point, and makes sense. Also, as grzm noted in another comment, you can make use of the different parts of the UUID to encode specific information, which may be helpful (but at the cost of some uniqueness bits).

grzm · on Oct 27, 2017

With respect to uniqueness, you can avoid collisions using, for example, some kind of sharding scheme, similar to what Instagram did:

https://engineering.instagram.com/sharding-ids-at-instagram-...

kbenson · on Oct 27, 2017

It's been quite a while since I've needed to, but when I was setting up HA databases, I just set the auto increment amount to some number more than the number of databases, and set the starting offset for each database to sequential ints (1,2,3...). This results in databases that can't generate duplicate ids. As a bonus, you know what database was used to create each record.
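
The offset-and-increment scheme described above can be sketched in a few lines of Python. This only simulates the arithmetic; the names `id_sequence`, `origin_database`, and the value of `STEP` are assumptions for illustration, not settings from any particular database.

```python
from itertools import count, islice

STEP = 10  # assumed increment: larger than the number of databases (3 here)


def id_sequence(offset: int, step: int = STEP):
    """Yield offset, offset + step, offset + 2 * step, ..."""
    return count(start=offset, step=step)


def origin_database(record_id: int, step: int = STEP) -> int:
    """Recover the starting offset of the database that generated this id."""
    return record_id % step


# Three databases with starting offsets 1, 2, 3.
db1_ids = list(islice(id_sequence(1), 3))
db2_ids = list(islice(id_sequence(2), 3))
db3_ids = list(islice(id_sequence(3), 3))
```

Because every offset is smaller than the step, two databases can never produce the same id, and `origin_database` gives back the offset, which is the "bonus" of knowing where a record was created.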

grzm · on Oct 27, 2017

That's one way if you're using sequences to generate ids. If you're encoding information in UUIDs, you also have to take into account how many bits you want to allocate to the database (shard) id. One method I've used is to hard-code the shard id as the return value in a function: effectively a constant. (Edit to add: This serves the same purpose as setting the starting offset of the sequence.) You do have a bit of custom code on each database instance in this case, which is its own tradeoff.
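
A rough sketch of that idea in Python, assuming a made-up layout of 16 bits for the shard id and 112 random bits. The bit split, the function names, and the shard value are all hypothetical, and the result is a 128-bit value wrapped in `uuid.UUID`, not a standard versioned UUID.

```python
import secrets
import uuid

SHARD_BITS = 16                    # assumed allocation for the shard id
RANDOM_BITS = 128 - SHARD_BITS     # remaining bits carry the uniqueness


def shard_id() -> int:
    """Hard-coded per database instance: effectively a constant."""
    return 7  # hypothetical value; each instance would return its own


def new_id() -> uuid.UUID:
    """Pack the shard id into the top bits and fill the rest randomly."""
    value = (shard_id() << RANDOM_BITS) | secrets.randbits(RANDOM_BITS)
    return uuid.UUID(int=value)


def shard_of(identifier: uuid.UUID) -> int:
    """Read the shard id back out of the top bits."""
    return identifier.int >> RANDOM_BITS
```

Every bit given to the shard id is one fewer bit of randomness, which is the tradeoff mentioned earlier in the thread; the upside is that ids from different shards cannot collide as long as each instance returns a distinct `shard_id()`.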