

Ask HN: How should I create a unique id for entries that aren't incremental? - tim_nuwin

For example, right now when I'm creating boards (agile), it will create a new board and its id will be n + 1.

What is an efficient way of creating an ID where there won't be any collision even if there are 1 billion+ entries?

This ID will be used in a URL.

Thanks,
Tim
======
smt88
_" where there won't be any collision even if there are 1 billion+ entries"_

This is a really complicated topic, and there are multiple ways to handle what
you're doing. It really depends on your read/write ratios, typical volume,
growth rate, and the underlying DB software you're using.

Because there are so many considerations that require knowing real-world use
cases, it's a premature optimization. Are you going to have more than 1
billion records in the next few years? If not, don't worry about this.

However, there are other reasons to use non-incremental IDs (security, for
one).

To answer your question as asked though, check this out:
[http://www.postgresql.org/docs/8.3/static/datatype-uuid.html](http://www.postgresql.org/docs/8.3/static/datatype-uuid.html)
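If you would rather generate the UUID in application code than in the database, here is a minimal sketch using Python's standard `uuid` module. The function name `new_board_id` is made up for illustration, and it assumes a random (version 4) UUID is acceptable for your use case.

```python
import uuid


def new_board_id() -> str:
    """Return a random version-4 UUID as 32 lowercase hex characters.

    Hypothetical helper: the name and the choice of hex output are
    assumptions for this sketch, not something the thread prescribes.
    """
    return uuid.uuid4().hex


if __name__ == "__main__":
    # Each call draws 122 random bits, so no lookup of existing IDs is needed.
    print(new_board_id())
```

The hex form is 32 characters long, which is safe in a URL but not short.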

~~~
iancarroll
> However, there are other reasons to use non-incremental IDs (security, for
> one).

That's just security by obscurity; with proper authorization checking it
doesn't matter.

~~~
smt88
Security doesn't always mean "seeing something you're not supposed to see".
He's saying that the boards are public, so people are able to just change the
number at the end of the URL to find them all.

You can have the same issue with scrapers. It's much easier for scrapers to
get all your pages if you use sequential numbers for unique IDs.

Yes, a search engine could index the pages, but the big engines will obey your
robots.txt, and the small engines most likely will never know that you exist.

So s/he's not trying to "secure" anything as much as just hide it.

------
iancarroll
Incremental IDs work best, but if you want you can hash a UUID which will work
for your use case:

% uuidgen

B14818B6-4219-43BD-82EF-8421EC1AFBCF

% echo "B14818B6-4219-43BD-82EF-8421EC1AFBCF" | shasum -a 256

00ea501d47789ac5eb559f10d631b3f6df8f82b5cba9c1f9d234b705d89f1704

~~~
tim_nuwin
Those URLs are kind of ugly. If it helps at all, maybe I could create a
public URL ID based on the incremental PK IDs.
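One way to derive a public ID from the incremental PK, sketched under some assumptions: multiply the PK by an odd constant modulo 2**32, which is a reversible scrambling of the 32-bit range. The constant, the 32-bit limit, and the names `to_public` and `to_pk` are all illustrative choices. This only makes the sequence non-obvious; it is obfuscation, not security, since anyone who learns the constant can invert it.

```python
MODULUS = 2**32
# Any odd multiplier is invertible modulo a power of two; this particular
# value is an arbitrary illustrative choice.
MULTIPLIER = 2654435761
# Modular inverse via three-argument pow (requires Python 3.8+).
INVERSE = pow(MULTIPLIER, -1, MODULUS)


def to_public(pk: int) -> int:
    """Map an incremental primary key to a scattered public ID."""
    if not 0 <= pk < MODULUS:
        raise ValueError("pk must fit in 32 bits for this sketch")
    return (pk * MULTIPLIER) % MODULUS


def to_pk(public_id: int) -> int:
    """Recover the primary key from a public ID."""
    if not 0 <= public_id < MODULUS:
        raise ValueError("public_id must fit in 32 bits for this sketch")
    return (public_id * INVERSE) % MODULUS


if __name__ == "__main__":
    for pk in (1, 2, 3):
        print(pk, "->", to_public(pk), "->", to_pk(to_public(pk)))
```

Because the mapping is a bijection, two different PKs can never produce the same public ID, so no collision check is needed.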

------
sjs382
[http://hashids.org/](http://hashids.org/)

------
Rainb
How about hashing the incremental ID? Now I wonder how IDs like Imgur's or
YouTube's work.

~~~
Jeremy1026
Base62 encoded incremental IDs
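For reference, a minimal sketch of what Base62 encoding an integer ID could look like. The alphabet order (digits, then uppercase, then lowercase) is an assumption; real sites may use a different order or a shuffled alphabet, and nothing here claims to be how any particular site does it.

```python
import string

# Assumed alphabet order: 0-9, A-Z, a-z (62 symbols in total).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)


def encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


def decode(s: str) -> int:
    """Decode a Base62 string back to the integer it represents."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n


if __name__ == "__main__":
    print(encode(1_000_000_000), decode(encode(1_000_000_000)))
```

With this alphabet, an ID of one billion fits in six characters.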

~~~
smt88
That's still incremental though. It just looks different.

~~~
Jeremy1026
A bigint gives you 9.2 quintillion values before you run out, which is
obviously not forever-proof, but certainly future-proof.

~~~
smt88
OP isn't asking for a data type that can hold more than a billion records.

OP needs a way to generate a random ID without checking that the ID has
already been used -- s/he wants a UUID. Bigint is FAR too small to do that a
billion times without a collision.
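To put a rough number on that, here is a sketch using the standard birthday-bound approximation p ≈ 1 - exp(-n(n-1)/(2N)), where N is the size of the ID space. It assumes IDs are drawn uniformly at random; the function name `collision_probability` is made up for this example.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximate chance of at least one collision among n random IDs.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)),
    assuming each ID is drawn uniformly from a space of 2**bits values.
    """
    space = 2.0 ** bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-n * (n - 1) / (2.0 * space))


if __name__ == "__main__":
    n = 1_000_000_000
    print("64-bit random IDs: ", collision_probability(n, 64))
    print("122-bit (UUID v4): ", collision_probability(n, 122))
```

Under those assumptions, a billion random 64-bit IDs have a chance of a few percent of colliding at least once, while the 122 random bits of a version-4 UUID bring that down to a vanishingly small number.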

