
Show HN: Decentralized, k-ordered unique IDs in Clojure - llambda
https://github.com/maxcountryman/flake
======
dantiberian
Very nice. This is based on Boundary's Erlang Flake ID service, more details
of which are at
[http://boundary.com/blog/2012/01/12/flake-a-decentralized-k-ordered-unique-id-generator-in-erlang/](http://boundary.com/blog/2012/01/12/flake-a-decentralized-k-ordered-unique-id-generator-in-erlang/).

One thing I had to look up again was the definition of k-ordering: k-ordering
gives _roughly_ time-ordered IDs, down to the millisecond, when sorted
lexicographically. This is fine for most purposes, but keep in mind it
doesn't allow you to reason about causality between timestamps from different
machines.

This also relies on your servers' clocks being in sync (something that should
be happening anyway).
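
To make the "roughly ordered" part concrete, here is a minimal sketch in Python. The ID layout (timestamp in the high bits, then a worker number, then a sequence), the 3 ms skew, and the names `make_id` and `SKEW_B_MS` are all assumptions for illustration, not taken from the library:

```python
# Hypothetical model: pack (timestamp_ms, worker, sequence) into 128 bits and
# render as zero-padded hex, so lexicographic order matches numeric order.
def make_id(timestamp_ms: int, worker: int, sequence: int) -> str:
    value = (timestamp_ms << 64) | (worker << 16) | sequence
    return format(value, "032x")  # 128 bits -> 32 hex digits

# Assumed scenario: machine B's clock runs 3 ms behind machine A's.
SKEW_B_MS = -3
workers = {"A": 0xA, "B": 0xB}
events = [(1000, "A"), (1001, "B"), (1002, "A"), (1005, "B")]  # (true ms, machine)

ids = []
for true_ms, machine in events:
    local_ms = true_ms + (SKEW_B_MS if machine == "B" else 0)
    ids.append((make_id(local_ms, workers[machine], 0), true_ms))

# Sort by ID string, then look at the true times in that order.
sorted_true_times = [true_ms for _, true_ms in sorted(ids)]
print(sorted_true_times)
```

In this run the first two events come out swapped relative to true time, while everything further apart than the skew stays in order. That is all k-ordering promises.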

~~~
swah
So, one would only use this when the powerful server that is responsible for
just generating IDs is saturated, right?

~~~
dantiberian
That's one reason you'd want this. A few others that I can think of:

* No single point of failure if the id server goes down
* Avoiding network round trip latency
* You have distributed servers across multiple data centres (though clock sync may be an issue)

If your service can tolerate k-ordered ids then I would pick this over a
central server every time.

~~~
swah
Another alternative is using local UUIDs, right? If you're just interested in
generating a unique ID to shard and insert into one of your DB nodes.

~~~
bagels
The disadvantage of a UUID is that it's larger: Snowflake was something
like 63 bits, a UUID is 128, 2x as long.

~~~
dantiberian
Snowflake was 63 bits due to some cleverness with resetting the epoch and
coordinating unique worker IDs through ZooKeeper.

These Flakes relax those restrictions and are 128 bits. The benefit is that
they don't require a central server to distribute worker IDs, and the epoch is
the standard Unix epoch. This holds as long as the MAC addresses on your
machines are unique. As another commenter mentioned, this doesn't always
happen.
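
For anyone who wants to see the shape of such an ID, here is a rough Python sketch of a 128-bit generator. It assumes the layout used by Boundary's Flake (64-bit millisecond timestamp, 48-bit worker ID taken from the MAC, 16-bit sequence); the class name `FlakeGenerator` and its parameters are made up for illustration and are not the Clojure library's API. Note that `uuid.getnode()` can fall back to a random 48-bit number when no MAC is found, and this sketch does nothing about a clock that moves backwards.

```python
import time
import uuid

class FlakeGenerator:
    """Illustrative 128-bit ID generator: timestamp | worker | sequence."""

    def __init__(self, worker_id=None, clock=None):
        # Assumption: the worker ID is the machine's 48-bit MAC address.
        raw = uuid.getnode() if worker_id is None else worker_id
        self.worker_id = raw & ((1 << 48) - 1)
        # Milliseconds since the standard Unix epoch.
        self.clock = clock or (lambda: int(time.time() * 1000))
        self.last_ms = -1
        self.sequence = 0

    def next_id(self) -> int:
        now = self.clock()
        if now == self.last_ms:
            # Same millisecond: bump the 16-bit sequence.
            self.sequence += 1
            if self.sequence >= 1 << 16:
                raise OverflowError("sequence exhausted for this millisecond")
        else:
            self.last_ms = now
            self.sequence = 0
        return (now << 64) | (self.worker_id << 16) | self.sequence

gen = FlakeGenerator()
print(format(gen.next_id(), "032x"))
```

Two machines with the same MAC would share a `worker_id` here, which is exactly the collision risk mentioned above.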

------
VLM
"So this allows for 2^16-1 unique IDs per millisecond per machine."

Technically that's per MAC, not per machine. I've run into multiple NICs with
duplicate MACs as has everyone else who's been in the game long enough.

~~~
ivenhov
This is interesting. I always thought MACs were unique (manufacturer's ID + some
sequence). Unless you override the factory MAC, of course. I may be wrong here, but
if MACs were not unique, basic networking, ARP etc. would not work? Could anyone
elaborate?

~~~
VLM
Nobody else responded? It's not supposed to happen and yet it does. Not often.
And yes it really confuses ARP.

------
mumrah
Neat! I made one of these in Java a while back after reading the snowflake
Twitter post (which has since been removed). Looks like Twitter has retired
Snowflake:
[https://github.com/twitter/snowflake](https://github.com/twitter/snowflake)

[https://github.com/mumrah/flake-java](https://github.com/mumrah/flake-java)

------
dtauzell
What happens if your system clock goes backwards? I think that NTP will not do
this by default, but I wonder if it happens for other reasons.

~~~
michaelmior
Lots of bad things happen if your clock goes backwards, which is why NTP
doesn't do this. Depending on what your system is doing, setting the clock
backwards can have serious security implications, such as making tokens which
should be expired valid again. I'm not sure it's worth considering this
scenario. (Other systems I've seen which rely on time for security purposes
don't seem to bother thinking about this either.)
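
That said, if you did want to guard against it, one possible approach is to refuse to hand out a timestamp earlier than the last one seen. A minimal Python sketch follows; `MonotonicMillis` and `ClockWentBackwards` are hypothetical names, and this is not a claim about what any of the Flake implementations actually do:

```python
import time

class ClockWentBackwards(Exception):
    """Raised when the wrapped clock reports a time earlier than the last one."""

class MonotonicMillis:
    """Wraps a millisecond clock and rejects readings that go backwards."""

    def __init__(self, clock=None):
        self.clock = clock or (lambda: int(time.time() * 1000))
        self.last_ms = None

    def now(self) -> int:
        current = self.clock()
        if self.last_ms is not None and current < self.last_ms:
            # Failing loudly avoids reissuing a timestamp range already used.
            raise ClockWentBackwards(
                f"clock moved back {self.last_ms - current} ms"
            )
        self.last_ms = current
        return current
```

This only protects a single running process. Surviving a restart would mean persisting the last timestamp somewhere, which the sketch leaves out.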

------
espeed
Simpleflake is a 64-bit alternative:
[http://engineering.custommade.com/simpleflake-distributed-id-generation-for-the-lazy/](http://engineering.custommade.com/simpleflake-distributed-id-generation-for-the-lazy/)

