
A brief history of the UUID (2017) - tosh
https://segment.com/blog/a-brief-history-of-the-uuid/
======
ponytech
Comments from the first post in 2017:
[https://news.ycombinator.com/item?id=14508413](https://news.ycombinator.com/item?id=14508413)

------
thanatos_dem
Reading through this, I kept thinking that ULIDs[1] give the same benefits
described, with wider adoption/support.

Luckily it looks like the author has already written up his thoughts on the
differences[2].

[1] [https://github.com/ulid/spec](https://github.com/ulid/spec)

[2] [https://github.com/segmentio/ksuid/issues/8](https://github.com/segmentio/ksuid/issues/8)

~~~
masklinn
ULID is pretty much lying though:

> UUID v1/v2 is impractical in many environments, as it requires access to a
> unique, stable MAC address

RFC 4122 Section 4.1.6 "Node"

> For systems with no IEEE address, a randomly or pseudo-randomly generated
> value may be used; see Section 4.5. The multicast bit must be set in such
> addresses, in order that they will never conflict with addresses obtained
> from network cards.

There is no requirement of "a unique, stable MAC address" in UUIDv1, and most
UUID APIs should allow overriding the node (and probably clock_seq) fields.
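
As a rough illustration of that override, here is a minimal sketch using Python's standard `uuid` module, whose `uuid1()` accepts `node` and `clock_seq` arguments. The helper name `random_node` is made up for this example; it sets the multicast bit on a random 48-bit value as the RFC text quoted above asks.

```python
import secrets
import uuid


def random_node() -> int:
    """Return a random 48-bit node ID with the multicast bit set."""
    node = secrets.randbits(48)
    # The multicast bit is the least significant bit of the first octet,
    # which is bit 40 when the node is viewed as a 48-bit integer.
    return node | (1 << 40)


node = random_node()
clock_seq = secrets.randbits(14)  # clock_seq is a 14-bit field

# No MAC address is consulted when node is supplied explicitly.
u = uuid.uuid1(node=node, clock_seq=clock_seq)
print(u, hex(u.node), u.clock_seq)
```

Other languages' UUID libraries may or may not expose the same knobs, so treat this as one example rather than a general guarantee.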

> Canonically encoded as a 26 character string, as opposed to the 36 character
> UUID

> Uses Crockford's base32 for better efficiency and readability (5 bits per
> character)

> Case insensitive

> No special characters (URL safe)

You could just encode your UUID in base32…
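
A minimal sketch of that idea in Python, assuming the standard RFC 4648 base32 alphabet from the `base64` module rather than Crockford's (so the output is not the same string a ULID library would produce). The function names are hypothetical. Sixteen bytes encode to 26 characters once the `=` padding is stripped.

```python
import base64
import uuid


def uuid_to_base32(u: uuid.UUID) -> str:
    """Encode the 16 raw bytes of a UUID as 26 unpadded base32 characters."""
    # b32encode yields 32 characters for 16 bytes; the last 6 are '=' padding.
    return base64.b32encode(u.bytes).decode("ascii").rstrip("=")


def base32_to_uuid(s: str) -> uuid.UUID:
    """Decode a string produced by uuid_to_base32 back into a UUID."""
    # Restore the padding so the length is a multiple of 8 again.
    padded = s.upper() + "=" * (-len(s) % 8)
    return uuid.UUID(bytes=base64.b32decode(padded))


u = uuid.uuid4()
encoded = uuid_to_base32(u)
print(len(encoded), encoded)
print(base32_to_uuid(encoded) == u)
```

Calling `.upper()` before decoding keeps the round trip case insensitive, and the alphabet (A-Z, 2-7) has no URL-special characters.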

> correctly detects and handles the same millisecond

I mean, millisecond resolution is worse than UUIDv1's 100-nanosecond
resolution by 4 orders of magnitude.

The lexical ordering is not a lie at least, so there's that.

------
dabber
Here's the Google cache until the server regains its bearings:

[https://webcache.googleusercontent.com/search?q=cache:xWcDCg...](https://webcache.googleusercontent.com/search?q=cache:xWcDCgDGYKwJ:https://segment.com/blog/a-brief-history-of-the-uuid/+&cd=1&hl=en&ct=clnk&gl=us)

------
jph
We changed from UUID-4 to ZID
([https://github.com/zidplan/zid](https://github.com/zidplan/zid)) because
it's faster and easier for many of our typical projects, including ones with
distributed computing and concurrent computing.

ZID is a secure random number represented as lowercase hex. No embedded
timestamp, no MAC address, no reserved character, etc. ZID-64 uses 64 bits,
ZID-128 uses 128 bits, same as a UUID, etc.
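
For readers who want to picture this, here is a small Python sketch of the general idea (a secure random number rendered as lowercase hex with a chosen bit count). It is not the zid project's own code or API; the function name `random_hex_id` is invented for the example.

```python
import secrets


def random_hex_id(bits: int = 128) -> str:
    """Return `bits` bits from the OS CSPRNG as a lowercase hex string."""
    if bits <= 0 or bits % 8 != 0:
        raise ValueError("bits must be a positive multiple of 8")
    # token_hex takes a byte count and returns two lowercase hex digits per byte.
    return secrets.token_hex(bits // 8)


print(random_hex_id(64))   # 16 hex characters
print(random_hex_id(128))  # 32 hex characters, the same size as a UUID
```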

KSUID describes a hybrid ID approach i.e. the ID is a hybrid of a timestamp as
a string and random bits as a string. Our projects use a similar approach,
creating a timestamp and ZID (which is more flexible than a KSUID) or if we
want embedded time sortability then we use a ULID.

~~~
masklinn
> ZID is a secure random number represented as lowercase hex. […] ZID-128 uses
> 128 bits, same as a UUID, etc.

So… A UUIDv4?

~~~
jph
ZID comparison with UUIDv4:

1\. ZID specifies secure random number generation. UUIDv4 does not. Thus ZID
is useful in higher-security areas such as creating a unique ID that functions
as a password, or bearer token, or proof of knowledge, etc.

2\. ZID specifies that it can be as many bits as you want in multiples of 8,
and a notation suffix that says the bit count e.g. "ZID-128" means ZID with
128 bits. UUID can only be 128 bits. Thus ZID is more flexible e.g. ZID-64 is
a good fit for 64-bit systems, ZID-256 is good for fulfilling requirements for
256 bits of randomness, etc. This notation suffix is akin to the SHA-2 family,
which has SHA-224, SHA-256, SHA-384, SHA-512, etc.

3\. ZID specifies lowercase for hexadecimal string representation. UUID does
not specify lowercase or uppercase. Thus ZID is more-specific; ZID parsing is
one step easier/faster/clearer; ZID string comparison uses exact character
matching rather than case-insensitive matching. Thus ZID skips entire areas of
UUID bugs that we see in practice, such as one UUID system that emits
lowercase, one UUID system that emits uppercase, and an integration system
that needs to do string comparisons.

4\. ZID is always random. UUID has multiple algorithms, as you point out. In
practice we have seen the UUID multiple algorithms cause confusion and bugs
e.g. when a spec says "UUID" and the implementation uses a UUIDv4 yet the
spec's intent was a UUIDv1, or vice versa. Thus ZID makes it easier to write a
better spec.

5\. ZID subsections all satisfy proof of randomness e.g. computational
statistical analysis. UUIDv4 does not, because UUIDv4 uses 6 fixed bits to
indicate the algorithm. Thus ZID is easier and faster to prove as random, both
as a whole and also as any subsection such as by subsampling.
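
The 6 fixed bits mentioned in point 5 can be seen directly with Python's standard `uuid` module. This is only an illustrative sketch; the helper name `fixed_bits` is made up here. It pulls out the 4 version bits and the 2 variant bits, which leaves 122 random bits in a UUIDv4.

```python
import uuid


def fixed_bits(u: uuid.UUID) -> tuple[int, int]:
    """Return the (version, variant) bit fields of a UUID as integers."""
    version = (u.int >> 76) & 0xF  # 4 bits, 0b0100 for UUIDv4
    variant = (u.int >> 62) & 0x3  # 2 bits, 0b10 for the RFC 4122 variant
    return version, variant


sample = [uuid.uuid4() for _ in range(1000)]

# Every UUIDv4 carries the same 6 bits in the same positions.
print(all(fixed_bits(u) == (4, 0b10) for u in sample))

# 128 total bits minus 4 version bits minus 2 variant bits.
RANDOM_BITS = 128 - 4 - 2
print(RANDOM_BITS)
```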

------
classichasclass
It's remarkable how much influence Domain/OS and Apollo had on later computing
and how few people actually remember them. I have an HP 425t here with a
Domain keyboard port, but after someone upgraded it to a PA-RISC 715, the
keyboard port is no longer connected to anything internally. Somehow this
seems metaphorical.

I also remember their computer graphics division. "Fair Play" made the rounds
at a lot of CGI festivals around that time.

------
amaccuish
I guess the NCA/NCS rpc stuff explains why UUIDs are so pervasive on Windows,
since DCE/RPC was based on NCA, and MSRPC is based on DCE/RPC.

------
ch33zer
I don't understand the desire to store timestamp information into a UUID. Why
not just add an extra timestamp field to your data? That seems like such a
simpler solution than embedding it into your UUID. I would go further and
argue that embedding anything but randomness into your UUID is a bad idea that
you will pay for in the future.

~~~
grzm
> _"I don't understand the desire to store timestamp information into a
> UUID"_

One reason is to be able to provide sortability with respect to what is often a
surrogate key attribute, as listed in the introduction:

> _"It borrows core ideas from the ubiquitous UUID standard, adding
> time-based ordering and more friendly representation formats."_

You can find additional motivations in the "Time is on our side" section:

[https://segment.com/blog/a-brief-history-of-the-uuid/#time-i...](https://segment.com/blog/a-brief-history-of-the-uuid/#time-is-on-our-side)

> _"In Cassandra, TimeUUIDs are sortable by timestamp, quite useful when
> needing to roughly order by time."_

While you may not agree with the reasons, I think they are understandable.

------
the_arun
A timestamp in the UUID makes sense if the IDs are generated by a single
computing node. If the nodes in a cluster are off by even a nanosecond, we lose
the accuracy.

~~~
grzm
Timestamps in UUID values shouldn't be (and generally aren't) used for
coordination between nodes (where such precision and accuracy would be
important): they're used for rough sorting and partitioning of values.

Indeed, node-generated timestamps should never be used for coordination
regardless of whether they're encoded in UUIDs or not.
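
To make "rough sorting" concrete, here is a small Python sketch using the standard `uuid` module. A version 1 UUID exposes its 60-bit timestamp (100-nanosecond intervals since 1582-10-15) as `.time`; the helper name `uuid1_datetime` is made up for this example. Note that comparing UUIDv1 values directly does not give time order, because the low bits of the timestamp come first in the layout, so the sort key here is `.time`.

```python
import uuid
from datetime import datetime, timedelta, timezone

# UUIDv1 timestamps count from the start of the Gregorian calendar.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_datetime(u: uuid.UUID) -> datetime:
    """Return the timestamp embedded in a version 1 UUID as a UTC datetime."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    # u.time is in 100 ns units; divide by 10 to get whole microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


ids = [uuid.uuid1() for _ in range(5)]

# Rough chronological ordering: good enough for bucketing or partitioning,
# not a basis for coordination between nodes with unsynchronised clocks.
ordered = sorted(ids, key=lambda u: u.time)
for u in ordered:
    print(uuid1_datetime(u).isoformat(), u)
```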

------
OliverJones
Credit where credit is due: Apollo Computer founder Paul Leach dreamed up and
implemented the UID concept, and later took it to Microsoft.

------
gumby
What a strange article. No, networked computing was not invented by Apollo and
indeed, I like how the author describes the first UUID as having been based on
prior UUIDs. I feel dumber after reading this.

~~~
contrast
Did you read it, though?

It absolutely does not say that Apollo invented network computing, it just
says it was one of the companies at that time working in that field.

Of course there were unique identifiers before the first UUID standard was
defined, and the author gives examples.

Acknowledging precursors, following the threads of how a particular
implementation or standard developed, is the only intelligent way to read up
on its history. The dumb thing would be to read into this things the author
simply never said or implied.

~~~
cfmcdonald
I think the clearly wrong statement here is "Workstations were really the
first networked computers."

