That's true, but there are a lot of places where ids live, but where I can't add `user-select: all`. For example, in terminal (logs), Studio3t (db client) etc.
The whole point of this article is that double-click to select a word doesn't work with a UUID... though, I think this should just be fixed: I have XTerm set up where double clicking selects a word, triple clicking selects a filename (which can include a hyphen, but not a slash), quadruple clicking selects a URL or path, and quintuple clicking selects the rest of the line.
The workaround I've landed on is double-click and hold, then drag in the correct direction. Precise enough to just grab the UUID, imprecise enough to be quick and not annoying
This is what I use, because it works almost everywhere almost exactly the same way, even when quite a lot else doesn’t have that quality. I similarly use long-press+slide for very similar behavior on iOS (although that has much more variance in behavior across apps, and sometimes even within the same app).
Adding clicks is not ideal: many people already have trouble with double clicks (because they can't click fast enough, or it hurts), and a triple click is harder still.
Also, more than 3 clicks starts taking a considerable amount of time.
Now replace the double click, desktop, interaction, with press and hold, touch screen. I don't know of any triple click interaction that maps to a simplistic touch screen interaction.
The example used is a URL. I don't believe there is an equivalent for the address bar. Plenty of other examples exist, but that one is pretty easily reachable by users.
I like that some tools also provide a copy button next to fields, it looks like two overlapping boxes (kind of like: ⎘). If I were building a tool that exposed user copyable keys, I'd be sure to add these to my forms and fields. A brief contextual "copied!" modal that quickly fades away is a nice touch too.
Yep, double-click+drag is how I do it. On Linux double-click selects a chunk, and then dragging selects more chunks (fully). I double-click on the first or last chunk and go towards the other end.
Not as good as double-click selecting the whole id, but at least it doesn't take too much time and I don't have to precisely go to one of the ends.
There is danger in that as well, you could be copying transparent/hidden text as part of the copy, and if you paste and <enter> without reading what you've entered, it can be dangerous.
That being said, most people copy and run bash script straight off the internet, so clearly not worried about copying stuff they haven't read!
> That being said, most people copy and run bash script straight off the internet, so clearly not worried about copying stuff they haven't read!
The most common complaint about "pipe to bash" I've seen is the possibility for the server's response to detect it's being piped to bash, and then execute malicious code. The suggested remedy is to first download the install script (and check it) then run it. -- This seems overblown to me, since if you think the server may be malicious, then downloading programs from that server also seems risky.
Criticising people for not reading bash scripts from install pages is weirder to me. -- It's possible that some software author would hide malware in the install script; but, then why wouldn't they just hide malware in the installed program itself.
> This seems overblown to me, since if you think the server may be malicious, then downloading programs from that server also seems risky.
I heed the risk with the reasoning that even a benevolent server may be compromised, and that detecting pipe to bash is a potential way for that to go unnoticed.
> One way to enhance the usability of unique identifiers is by making them easily copyable. This can be achieved by removing the hyphens from the UUIDs,
No! That's throwing the baby out with the bathwater! Removing all separators means rare-but-important manual tasks of transcription or comparison become terrible, since there are no clear chunks.
Instead use a different character which doesn't have the same problem, one that most software considers part of the same "word"... such as the classic underscore.
For most people, double-clicking on this 123_456_789 will select all 9 important numbers. (And maybe a trailing space, but that's a separate problem.)
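A rough way to see why the separator matters: many text widgets treat letters, digits, and underscores as word characters and stop at hyphens. The sketch below approximates that with Python's `\w` regex class. Real double-click behavior varies by app and platform, so treat this as an illustration only; `word_at` is a hypothetical helper, not anything from the article.

```python
import re

def word_at(text: str, index: int) -> str:
    """Return the run of word characters around `index`.

    This only approximates what a double-click selects in many
    (not all) desktop text widgets.
    """
    for match in re.finditer(r"\w+", text):
        if match.start() <= index < match.end():
            return match.group()
    return ""

hyphenated = "id: 123e4567-e89b-12d3-a456-426614174000"
underscored = "id: 123_456_789"

print(word_at(hyphenated, 6))   # only the first hyphen-delimited chunk
print(word_at(underscored, 6))  # the whole underscore-joined identifier
```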
Speaking of which, why are we stuck with this terrible "trailing space selected" behaviour? It's not the case on all platforms, macos/ios perform fine and only select the actual word but Windows still includes the trailing space. There are posts online complaining about this going back 15 years at this point, it's super low hanging UX fruit.
Alas, no... however you might not need a sound if you can use tonal inflections and pauses to express the boundary instead. Particularly when chunks are short and when the receiver (or the software they're typing into) knows the format already... Although with a tech-illiterate relative you'll have bigger problems, like explaining what an underscore even looks like and where it is on their keyboard.
Obviously I can't fully express it in text here, but try to imagine this as a coworker speaking to you: "Hey, write down this IP address. It's ten, seventy, one twentyyyyyTWO, five."
They didn't actually say "period" or even "dot", but I bet you'd type 10.70.122.5 .
I've worked in IT (support, network mgmt and development roles) for 20 years, with colleagues, customers and clients from dozens of countries.
I've never once heard anyone drop out the dots in an IP. Non technical users aren't confident enough to do anything but read it exactly as it appears (one zero dot seven zero dot...) and technical users who are generally experienced enough to know what an IP address is, know that the dots are meaningful.
If it's something like 56.7.23.231, I'm definitely going to disambiguate it by deliberately saying each one of those three dots.
But if it's more like 192.168.0.1, I'm probably not going to bother with speaking any delimiters in conversation with another person who has at least reasonable familiarity with common IP networking layouts.
Bringing it back to the topic: UUIDs should not ever follow familiar content patterns (if they do, then that's an issue in and of itself), so I'm always going to speak the delimiters of a UUID -- whatever they consist of.
(If nothing else, doing so breaks up the pattern into human-digestible chunks -- which is probably the sole reason we have those delimiters in UUIDs to begin with.)
> double-clicking on this 123_456_789 will select all 9 important numbers
True, but alas, on iOS I found that a double-tap selected only one of the digit groups.
In my view the touch interface UX is just as significant - perhaps even more so in recent years - given that the backdrop to many of these identifier format decisions is ensuring nontechnical end-user support, under time pressure, over possibly quite unreliable channels, goes as well as it can.
But look on the bright side, at least it didn't try to call the number
BitLocker does this and it's nice UX for walking someone through a recovery key over the phone.
Another VERY nice feature is it hashes each set of 6 digits as you type, so if you transpose one, you immediately get feedback instead of "invalid key!" after typing the whole thing out.
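The per-group feedback idea can be sketched like this. The exact check BitLocker performs isn't spelled out in this thread, so the rule below (each 6-digit group must be a multiple of 11) is an assumption used purely for illustration, and `first_bad_group` is a hypothetical helper.

```python
def group_is_valid(group: str) -> bool:
    # Assumed rule for illustration: six digits forming a multiple of 11.
    return len(group) == 6 and group.isdigit() and int(group) % 11 == 0

def first_bad_group(key: str):
    """Return the 1-based position of the first invalid group, or None."""
    for position, group in enumerate(key.split("-"), start=1):
        if not group_is_valid(group):
            return position
    return None

good = "000011-000022-000121"
typo = "000011-000202-000121"  # two adjacent digits of group 2 transposed

print(first_bad_group(good))
print(first_bad_group(typo))
```

The point of checking per group is that the person typing learns which chunk to re-read, instead of retyping the whole key.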
I don't think we're talking about the same problem here.
Regardless of how many dashes you have or how (ir)regularly they are spaced, to select the whole ID you must carefully click-drag-release around its boundaries, you can't just double-click anywhere in it to select.
The bech32 format is a favorite of mine because it uses an alphabet that's designed to be unambiguous and its checksum is designed specifically to guarantee catching few character mistakes and make it possible to suggest where the mistake likely is. It also has a builtin human-readable purpose prefix at the front. Since it's all lowercase it also fits into the QR alphanumeric mode, which doesn't support mixed case so QR codes of bech32 IDs are more efficient.
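For a feel of how the prefix and checksum fit together, here is a minimal sketch of the bech32 checksum. The constants and structure follow my recollection of the BIP-173 reference code and should be checked against the spec before any real use. The `id` prefix is hypothetical, only lowercase input is handled, and length limits and error location are left out.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]

def polymod(values):
    """BCH checksum over a sequence of 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk

def hrp_expand(hrp):
    # The human-readable prefix is mixed into the checksum too.
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]

def encode(hrp, data):
    """Encode a list of 5-bit integers with a prefix and 6-symbol checksum."""
    mod = polymod(hrp_expand(hrp) + data + [0] * 6) ^ 1
    checksum = [(mod >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)

def is_valid(text):
    hrp, _, payload = text.rpartition("1")  # "1" never appears in CHARSET
    if not hrp or len(payload) < 6 or any(c not in CHARSET for c in payload):
        return False
    data = [CHARSET.index(c) for c in payload]
    return polymod(hrp_expand(hrp) + data) == 1

token = encode("id", [1, 2, 3, 4, 5, 6, 7, 8])
print(token, is_valid(token))
```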
UUIDs should not be used as database primary keys unless the DBMS recommends it or you have a well-studied special reason for it. Postgres and MySQL are meant to use bigserial by default, even Citus. Some special sharded DBMSes like Spanner need non-sequential pkeys, but even Spanner explicitly tells you to use uuid4 because k-sortable keys cause hotspotting: https://cloud.google.com/spanner/docs/schema-design#uuid_pri...
I understand the performance implications of using a UUID for a primary key. And if performance is your primary concern, then this is good advice for large tables.
But if I could go back 25 years and only give myself one bit of advice, it would be to use UUIDs as the primary key. Because in a different context to raw performance, it offers a lot of advantages.
While there are advantages in numerous areas, I'll focus on one for this post. The area of distributed data.
We started by running a database on prem. Each branch or store got their own db. 15 years later always-on networking happened. 15 years after that, all businesses have fibre.
So now all the branches use a giant shared online database. With merged data. Uuid based this task would be trivial. Bigint based, yeah, it's not.
Along the same timeline data started escaping from our database. It would go to a phone, travel around a bit, change, get new records, then come home. Think lots of sales folk, in places without reception, doing stuff.
So you're right in the context of a single database (cluster) which encompasses all the data all the time.
But in the context where data lives beyond the database, using uuids solves a lot of problems.
There are other places as well where uuids shine.
So as with most advice when it comes to SQL, I'd add "context matters".
When data lives beyond the database, you need a uuid, but it doesn't need to be your pkey. Even your typical backend-frontend app with a single DB will often send uuids over the API.
If you're copying a DB, mutating, then merging back in, you just have to reset the bigint pkeys. I can see how in some contexts that might be less convenient (or if merges are very frequent and reads are not, less performant), but that's a special case and not something to assume from the start. For example I've done merges like this before pretty easily with bigints, and I've also been in places where they start out with uuids pkeys then never benefit.
Bearing in mind that primary key, and clustered key are not necessarily the same thing, your point stands that the uuid does not need to be the clustered key.
Renumbering bigint primary keys to effect a one-time merge becomes substantially less trivial when you want minimal downtime and there are hundreds of related tables and tens of sites in play.
I can't speak for PG but MySQL at least has a built in function to resolve the time ordering issue when storing v1 UUIDs (and a corresponding function to restore them to a valid UUID).
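In MySQL 8.0 this is `UUID_TO_BIN(uuid, 1)` and `BIN_TO_UUID(bin, 1)`. As I understand the documented behavior, the swap flag moves the time-high group in front of time-low so that v1 UUIDs created close together end up close together in the index. A sketch of the same rearrangement in Python follows; the function names are my own.

```python
import uuid

def uuid1_to_sortable_bytes(u: uuid.UUID) -> bytes:
    """Put the most significant timestamp bits of a v1 UUID first."""
    # Groups: time_low, time_mid, time_hi_and_version, clock_seq, node
    g = str(u).split("-")
    return bytes.fromhex(g[2] + g[1] + g[0] + g[3] + g[4])

def sortable_bytes_to_uuid(raw: bytes) -> uuid.UUID:
    """Undo the swap and recover the original, valid UUID."""
    h = raw.hex()
    # Stored order: time_hi (4 hex), time_mid (4), time_low (8), rest (16)
    return uuid.UUID(h[8:16] + h[4:8] + h[0:4] + h[16:])

original = uuid.UUID("6ccd780c-baba-1026-9564-5b8c656024db")
stored = uuid1_to_sortable_bytes(original)
print(stored.hex())
print(sortable_bytes_to_uuid(stored))
```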
The CUID readme is wrong. You can safely ignore anyone who says "cloud-native" while discussing performance unless they're explaining why "cloud-native" architectures are often the worst of all possible designs for performance.
In postgres for example, full_page_writes (default on, generally not safe to turn off unless you can be sure your filesystem can guarantee it) means you have to write the entire page to WAL if you write one record. This will make your WAL grow way faster if you're doing random IOs. So right off the bat that's going to be a huge write impact.
Is this particularly widely used? I don't think I'm aware of TypeID either. I don't see why the author's pretty light solution is inferior to this library.
Depending on the font, it can collide with underlining (as in hyperlinks). I also once had a case where the dashed line a document viewer displayed for a page break hid underscores that happened to be on the last line of the page, causing the recipient to misinterpret the documentation.
In proportional fonts, underscores are generally wider than spaces, creating larger gaps between the underscore-separated parts than between the surrounding space-separated words. E.g. in "AAA BBB_CCC DDD", "AAA"/"BBB" and "CCC"/"DDD" are closer together than "BBB"/"CCC". In some fonts the difference is quite substantial. This makes for incorrect/unintuitive visual grouping.
You have to press Shift to type them. On mobile keyboards, underscore is usually one extra layer removed. For voice dictation, it's also longer than "dash" or "minus".
So your IDs are now tightly bound to whatever "types" you've currently decided you have, forcing a narrow view of what an entity is and making the entire system extremely brittle to change? What is a "type" even supposed to be? This is forcing a doubling-down on an already problematic design principle: that every entity is exactly one type of thing and these types of things are completely different than those types of things and obviously you can just make the perfect set of types that will never change if you think really hard and everyone will agree on what each type means and they'll never change and that will never be a problem.
A quick glance of the repo shows that the "type" is just a prefix. You can do whatever you want with it. Basically the same thing as what the article suggests, no?
You face the same issues with table names though. So do you not name your tables?
The solution to your entity problem should be the same. You do the reasonable, practical thing, and rename/refactor if they drift away from the original mental concept.
> You face the same issues with table names though.
Exactly right. You've succinctly stated the biggest problem with almost all modern database design.
> So do you not name your tables?
I do, but that name does not represent a Type of Entity, where all entities therein are Exactly Thus, and all entities everywhere else are Absolutely Not At All Thus. Instead, it represents a statement I want to make about entities. Any "natural" meaning you put into your identifiers about what they are is defeating the point of the identifier.
> You do the reasonable, practical thing, and rename/refactor if they drift away from the original mental concept.
And now all of your IDs that are "in the wild" have expired. Can I still submit a request using the old ID, before you renamed Employee to WorkPerson and then to MobileLivingBeing and then to PossiblyMobilePossiblyLivingBeing? And it's not "if" they drift away, it's "when". And it's not just that they change over time, it's that they change from one perspective to the next. You can never have two distinct disciplines of the business ever referring to the same entity, because they don't agree on what the types mean. That bears repeating: you can never have two different disciplines both referring to the same entity unless they agree on what the types are, and they don't, because their terms have different meanings. Do your accountants and your maintenance people and your capital planning people and your corporate leadership all agree on exactly what a "facility" is? Because if they don't, they literally cannot even refer to the same entity. Good luck with your microservices.
All I'm getting here is that you're against naming and labeling things. But you don't provide any solutions, and since a heap of untagged, unlabeled, or otherwise unannotated data seems so obviously strictly worse, I doubt that's what you're actually suggesting?
As someone who maintains a UUID library, this is definitely something that has been thought about, especially in the UUIDv6-v8 updates. But it was moved to be considered later as an extension after v6-v8 get approved fully.
Either way, I am in favor of prefixing and using alternative encodings, but it will need some time to figure out the best route. In the mean time, there are so many alternatives. TypeID, NanoID, ULID, etc. I even made my own quick one just for giggles: https://github.com/daegalus/snowflakes
If you can’t model the table with a natural key (or it would be so large as to inhibit performance), then a simple, normal monotonic integer is best. MySQL even lets you use unsigned ints, so if you use a bigint, you can go all the way up to 2^64-1.
For those who think this doesn’t work in distributed systems, it absolutely does – PlanetScale uses them internally [0]. If what is likely the largest MySQL (under Vitess) cluster in the world can manage, yours can too.
If this is still untenable, then anything k-sortable (like UUIDv7, as the sibling comment mentioned) is a vast improvement over randomness. Don’t cause B+tree page splits, especially in an RDBMS with a clustering index like MySQL.
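To make "k-sortable" concrete, here is a minimal sketch of a UUIDv7-style value: a 48-bit millisecond timestamp in the top bits, then the version and variant fields, with random bits everywhere else. It is an illustration under my reading of the layout, not a complete implementation; in particular it makes no ordering guarantee between IDs generated within the same millisecond.

```python
import os
import time
import uuid

def uuid7_sketch(unix_ms=None) -> uuid.UUID:
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    value = (unix_ms & ((1 << 48) - 1)) << 80        # 48-bit timestamp on top
    value |= int.from_bytes(os.urandom(10), "big")   # 80 random bits below it
    value &= ~(0xF << 76)                            # clear the version nibble
    value |= 0x7 << 76                               # version = 7
    value &= ~(0x3 << 62)                            # clear the variant bits
    value |= 0x2 << 62                               # variant = binary 10
    return uuid.UUID(int=value)

earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier, later)
```

Because the timestamp leads, both the 16-byte form and the usual hex text sort by creation time, so new rows land near the end of a B+tree instead of at random pages.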
Even if there's a natural key, serial bigint is usually the best. You can use unique indexes beside that without being stuck with them, and joins are faster on integer pkeys.
For performance, generally yes. Designing a purely relational model is satisfying, but theory often falls apart in prod.
It’s still a good idea, IMO, to think about table design starting as though you have a natural key (composite or singular). It helps develop the schema; you can then drop in a serial/identity/autoincrement column, and use the other relationships as FKs.
I agree. It also makes queries simpler. It's easier to handle
WHERE id = 123
compared to
WHERE site = 'HN' and username = 'hot_grill'
The last query is easier to write when you are querying the database manually, but I find the first easier to handle programmatically. It's easier to pass an argument from a URL or a message queue in this case.
As systems evolve, you may find that you need a third component to the natural key. If you don't use a simple id, you need to update every query that references the natural key
Resistant might be a strong word. I can see it only has E, A and Y as vowels which maybe helps a little for English as long as you're not the SATAN himself.
Collecting the dictionary of all swear words for all languages and their dialects might be less trivial. Keeping it up to date would probably take an institute worth of researchers.
More importantly, Base58 is orders of magnitude slower than Base64/Base32/Base16 due to O(N^2) algorithms required to encode/decode it. Blockchain software is already trying to get rid of it. Adopting Base58 now would be shortsighted.
Nobody does this. Normal users don't even know this is a thing. I worked on an app that did something like this for placeholders in generated text, and in all our extensive testing and high-touch rollouts, we never saw anyone use it.
It's nice that you took the time to think about it, but it's not that important.
Without getting too specific, it helped automate report writing for caseworkers by generating blocks of boilerplate text appropriate for each case, with placeholders for the particulars of the case.
Sort of like:
The applicant is fully employed as a _PROFESSION_ at _NAME_OF_COMPANY_
Our designer excitedly told us how he specifically used underscores so you could double-click the placeholder and just type over it.
Exactly. It seems the way they use "users" there is much more accurately said as "Junior Devs", which takes your applicable pool from ~7 billion max to "almost no one".
ULIDs have a short representation that UUIDv7 could use but doesn't define. Also, has UUIDv7 been standardized already? I thought it was still in the pipeline.
As a Microsoft SQL user (not everyone can choose their DB...), UUIDv7 has the issue that people will (understandably, but ignorantly) store it in "uniqueidentifier", which shuffles the bytes around so the values are no longer sorted by time...
There is even a specific Microsoft SQL time-ordered UUID format which is sorted after byte shuffling...
We store ULID in binary(16). Works nicely. Only difference from UUIDv7 is the version bits..
Yes, one could. The problem is most programmers who are not aware of this (probably most), will see the UUID and make assumptions that uniqueidentifier type is the most suited one.
They are less likely to take a ULID and store it in binary(16). (Might store it in char(26), which is less storage efficient but is still sorted.)
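A small sketch of why both storage choices stay sorted: a ULID is a 48-bit millisecond timestamp followed by 80 random bits, and its 26-character text form uses Crockford base32, whose alphabet is in ascending ASCII order. The helper names are my own, and monotonicity within a single millisecond is not handled.

```python
import os
import time

# Crockford base32: digits, then letters without I, L, O, U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def ulid_bytes(unix_ms=None) -> bytes:
    """16 raw bytes, suitable for a binary(16) column."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    return unix_ms.to_bytes(6, "big") + os.urandom(10)

def ulid_text(raw: bytes) -> str:
    """26-character text form, suitable for a char(26) column."""
    number = int.from_bytes(raw, "big")
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[number & 31])
        number >>= 5
    return "".join(reversed(chars))

first = ulid_bytes(1_700_000_000_000)
second = ulid_bytes(1_700_000_000_001)
print(ulid_text(first), ulid_text(second))
```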
Apart from that I happen to LIKE the fact that I can very quickly see the difference between a random identifier and a sorted identifier, they have quite different characteristics after all.. Although I guess people might eventually get used to the initial bytes being 0 on UUIDv7, or just learn to recognize the version bytes..
On that topic, why are the 8 fixed bits of a UUID not concentrated on the same byte. Perhaps the first one. Huge mistake..
I built cybertoken [1] for API keys and passwords, not (only) IDs. It is basically the format that GitHub uses for their api keys. Underscores, a prefix, so we can get a better debugging experience and automated secret scanning. It also has a CRC32, so you can check offline if the token candidate is a cybertoken while doing secret scanning.
I'd rather focus on optimizing the size of uuids than their text representation. Shipping uuids around as utf-8 text is silly. They're at most 128bits, so we shouldn't use more than that many.
In the case where it's necessary to have a text representation (e.g. in some user interface) I guess it's fine to choose whatever (stable) transformation you like, but the standard ways specified in RFC-4122 (hexadecimal, with or without hyphens) seem like the most foolproof. Regarding the logs search use case, the first "chunk" of a uuid is usually well more than enough for a unique match, IME.
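For scale, a quick comparison of the binary form against the two hexadecimal text forms, using Python's standard `uuid` module. The UUID value is just an example.

```python
import uuid

u = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")

binary = u.bytes                    # raw 128-bit value
compact = u.hex.encode("utf-8")     # hex without hyphens
hyphenated = str(u).encode("utf-8") # hex with hyphens

print(len(binary), len(compact), len(hyphenated))

# The first hyphen-delimited chunk, as used for quick log searches.
first_chunk = str(u).split("-")[0]
print(first_chunk)
```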
Also, I've been burned before in cases where some clever transformation was used to make a uuid look different in text form, because in order to synthesize the actual binary uuid I first have to reverse engineer the transformation--e.g. to find the database record corresponding to some http request log message. That's just annoying, and the polar opposite of "user friendly" for the user story of an engineer trying to figure out what's wrong with the system.
"Let’s not pretend like we are Google or AWS who have special needs around this. Any securely generated UUID with 128 bits is more than enough for us."
Thank you. This is overthought so much, including with partially-random things like uuid3, 5, 7.
An even better UUID UX would cycle through colors when you clicked one and then would overlay the assigned color when you see that same UUID elsewhere. Better to find a needle in a haystack if it's the only one with a pink background.
We're not looking for duplicates because we're afraid that probability has failed us. We're looking for duplicates because we have a thing of interest, with a corresponding UUID, and we want to notice where else that thing is involved.
base58 is case sensitive which hinders readability. When devs work with uuids they typically remember the first few letters ("this is guid abc", "that one is guid 1ac"). Hard to do that in base58.
The coarsest encoding to have this property is Base32 where it remains easy to memorize first few letters without needing to memorize case.
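A sketch of a case-insensitive Base32 form of a UUID using the standard library's RFC 4648 alphabet (A-Z and 2-7). The function names are my own, and stripping the padding is a choice made here for shorter text, not something the thread specifies.

```python
import base64
import uuid

def to_base32(u: uuid.UUID) -> str:
    """26 lowercase characters; padding removed."""
    return base64.b32encode(u.bytes).decode("ascii").rstrip("=").lower()

def from_base32(text: str) -> uuid.UUID:
    """Accepts either case, restoring the padding b32decode expects."""
    padded = text.upper() + "=" * (-len(text) % 8)
    return uuid.UUID(bytes=base64.b32decode(padded))

u = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")
text = to_base32(u)
print(text, from_base32(text.upper()) == u)
```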
But like... why? This article literally does not explain the benefits beyond copying, they are just assumed. I'm not immediately sold on shorter === better, especially when the updated UUIDs are only marginally shorter and you have now introduced the overhead of a translation layer for one of the most basic building blocks in your application.
I thought it was fairly well reasoned in the article.
Let's say you have a customer with that UUID as their ID. Do you expect them to recite their UUID perfectly to you every time? What if they made 5 transactions, each with their own UUIDs and you need to look them up, do you now expect them to read out 6 fairly unwieldy IDs?
The article is about the UX of UUIDs. Yes, there's a translation layer and more dev work to implement, but the shorter size and use of non-ambiguous characters is a massive improvement in the usability for the end users.
You've added overhead where the cost is trivial and borne only by professionals, in order to reduce overhead where it annoys many more people, including the poor customers.
Also, you don't explain why copying isn't good enough, you just reject the article's reasons
Why would readability of a UUID matter? At most, users should be copy-pasting them, not reading them or trying to memorize them, so why should I and l and 1 looking similar matter?
Here is a thing I wish would Just Work, everywhere.
Given:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
I'd like to, say, double click on consectetur to select it (which works), but then, while holding Shift, I would like to double click on elit so that the selection of consectetur is preserved, and extended over elit, including adipiscing.
The behavior I typically see is that, while holding Shift, the first click will extend the selection to the exact character of elit that I'm pointing at. The second click will then cancel the selection and select all of elit.
Like it doesn't mean a damn thing that I'm holding down Shift!
Ironically, I can make multiple selections that way using Ctrl in Firefox; Ctrl does modify the semantics of the second click while there is a selection.
We need to separate the storage format (Postgres and MySQL both have a bigint serial for primary key and it should stay that way)
for display, you have various ways to encode that number into something easier for humans, what I prefer:
- short word
- no offensive words
- with a checksum (so we easily spot any copy paste mistake)
- not sequential
- can be put in a url without extra encoding
this is our implementation of that (base32 and luhn code)
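The linked implementation isn't reproduced here, so the following is only a sketch of the general idea under my own assumptions: encode the bigint in Crockford base32 and append one Luhn mod N check character. It covers the checksum and URL-safety points from the list; making IDs non-sequential and filtering offensive words would need extra steps that are not shown.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford base32, URL-safe
BASE = len(ALPHABET)

def to_base32(number: int) -> str:
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

def _luhn_sum(text: str, factor: int) -> int:
    """Luhn mod N sum, walking right to left and alternating the factor."""
    total = 0
    for char in reversed(text):
        addend = factor * ALPHABET.index(char)
        total += addend // BASE + addend % BASE
        factor = 1 if factor == 2 else 2
    return total

def encode_id(number: int) -> str:
    body = to_base32(number)
    check = (-_luhn_sum(body, 2)) % BASE
    return body + ALPHABET[check]

def is_valid(text: str) -> bool:
    text = text.upper()
    if len(text) < 2 or any(c not in ALPHABET for c in text):
        return False
    return _luhn_sum(text, 1) % BASE == 0

token = encode_id(123456789)
print(token, is_valid(token))
```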
Q: What's special about the format of UUIDs compared to, say, an equivalent entropy 128-bit number? For many use cases, the hyphens appear to be utterly irrelevant.
This is a great point. It's unfortunate that few engines and standard libraries implement base58 encoding/decoding. This is probably because it is a more complex format, since it excludes certain ambiguous characters like l and O. Base64 is relatively straightforward to implement as it maps easily from a number to a character.
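To show where the extra complexity comes from, here is a minimal base58 sketch using the commonly used Bitcoin alphabet. Since 58 is not a power of two, the input can't be cut into fixed-size bit groups the way base64 does; the whole value is treated as one big integer and repeatedly divided. The function names are my own.

```python
# Alphabet without 0, O, I and l.
BASE58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58encode(data: bytes) -> str:
    number = int.from_bytes(data, "big")
    out = []
    while number:
        number, remainder = divmod(number, 58)
        out.append(BASE58[remainder])
    # Leading zero bytes vanish in the integer, so keep them explicitly.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return BASE58[0] * leading_zeros + "".join(reversed(out))

def b58decode(text: str) -> bytes:
    number = 0
    for char in text:
        number = number * 58 + BASE58.index(char)
    leading_zeros = len(text) - len(text.lstrip(BASE58[0]))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return b"\x00" * leading_zeros + body

sample = bytes.fromhex("123e4567e89b12d3a456426614174000")
encoded = b58encode(sample)
print(encoded, b58decode(encoded) == sample)
```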
Do they really have to be that long? And why can't they just be 4-4-4-4 (I'm sure someone knows). In hobby apps I often .split('-')[0] lol the first 8 is fine (so far)
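How far "the first 8 is fine" stretches can be estimated with the birthday approximation. The sketch assumes the first 8 hex characters are 32 uniformly random bits, which holds for UUIDv4 but not for time-based versions, where that chunk is part of a timestamp.

```python
import math

def collision_probability(count: int, bits: int) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    return -math.expm1(-count * (count - 1) / (2 * 2 ** bits))

def count_for_half(bits: int) -> float:
    """Number of random IDs at which a collision becomes about 50% likely."""
    return math.sqrt(2 * math.log(2) * 2 ** bits)

print(collision_probability(1_000, 32))    # small hobby app
print(collision_probability(100_000, 32))  # already likely to collide
print(count_for_half(32))                  # tens of thousands of IDs
```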
The UX of UUIDs... should not exist. There are so many great ways to improve UX that don't involve overloading or mangling an internal primary identifier that otherwise follows a standard structure.
Use immutable human-readable identifiers like "slugs" and/or "natural keys" in addition to robust primary keys.
User-friendly URLs with slugs are useful for blogs and other things commonly linked to. But for many other entities, links will be rare, and the central coordination required would just be a waste of time.
I at first misread "TLDR; Please don't do this:" as applying to the entire article. I read it expecting a kind of McSweeney's parody where it gives you terrible advice. Really confused because the advice was good and not funny.
Except that UUIDs by themselves don't do this at all:
> They provide a reliable way to ensure that each item, user, or piece of data has a unique identity.
It is the registration of a UUID in a database which prohibits reuse that does that. If you aren't doing that, ensuring that each use of a UUID is not a reuse of an already assigned UUID, they are not UNIQUE.
In the real world, the likelihood of a bug in your database engine or ID-hander-outer (a race condition, storage edge case or the like) is a lot higher than the risk of collision on a sufficiently large and random key.
The whole point of using UUIDs is that you can generate them locally without central coordination--if you want to coordinate your identifiers, you can use a much friendlier ID length (which is explained in the article).
But you still need to be wary of malicious collisions. I have seen security vulnerabilities where the client generates a UUID ID and that was inserted into the database. However by picking an ID that corresponded to objects from other users it was possible to gain some access to those objects.
So any UUID coming from an untrusted source (like a client application) should be checked for uniqueness. However your client apps can be written assuming that their randomly generated UUIDs never have collisions.
It can be very useful for offline work and latency hiding. For example the client can generate data structures with the final IDs then sync them to the server. This can also be useful for implementing idempotent updates.
The alternative of having some sort of "placeholder ID" until the server gets back to you (or you get back online) adds a lot of client complexity.
Can't you just give the client a list of generated IDs on the first request and check if the IDs are from the pregenerated IDs afterwards? That should be a lot cheaper than checking all IDs you ever generated.
You can do that I suppose. But I have found that in most cases I have a unique index on the ID anyways, as they are often table primary keys. So really I just have to ensure that it is either an INSERT or an UPDATE to a record that is owned by the user.
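That approach might look something like the following sketch, using SQLite from the standard library as a stand-in for whatever database is actually in use. The `notes` table and `save_note` helper are hypothetical. The key point is that an existing ID is only ever updated when the requesting user owns the row.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE notes (id TEXT PRIMARY KEY, owner TEXT NOT NULL, body TEXT)"
)

def save_note(conn, note_id: str, owner: str, body: str) -> bool:
    """Insert a client-supplied ID, or update it only if `owner` owns it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO notes (id, owner, body) VALUES (?, ?, ?)",
                (note_id, owner, body),
            )
        return True
    except sqlite3.IntegrityError:
        # The ID already exists: the ownership filter decides the outcome.
        with conn:
            cursor = conn.execute(
                "UPDATE notes SET body = ? WHERE id = ? AND owner = ?",
                (body, note_id, owner),
            )
        return cursor.rowcount == 1

print(save_note(db, "client-made-id", "alice", "first draft"))
print(save_note(db, "client-made-id", "mallory", "overwrite attempt"))
```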
Then you don't really have the problem this is about though, and which UUIDs are supposed to solve. The real pro of UUIDs is that you can issue them in distributed situations where you can't look up the already used IDs easily.
If you ensure each UUID is generated to spec, using a good source for the random bits, they are astronomically unlikely to collide without any coordination. Choosing the same 122 bits at random just doesn't happen by chance.
Of course you have to deal with the implications of people intentionally colliding UUIDs, so maybe don't generate them client-side.
One of the major motivations for UUIDs is that you can generate them in a decentralized fashion, without central registration, with a very high degree of confidence that they will not be repeated, a very important feature in distributed systems where you don’t want to rely on either a central point of failure or the need for distributed consensus just to generate an ID for data elements.
"Reducing the length of your IDs can be nice, but you need to be careful and ensure your system is protected against ID collissions. Fortunately, this is pretty easy to do in your database layer. In our MySQL database we use IDs mostly as primary key and the database protects us from collisions. In case an ID exists already, we just generate a new one and try again. If our collision rate would go up significantly, we could simply increase the length of all future IDs and we’d be fine."
Strictly speaking you can't be sure that your UUIDv4 isn't (by pure luck) also in someone else's database, so it's not guaranteed to be universally unique. It's just very, very likely to be so.
For some value of "strictly speaking", this is true; but it's not a very relevant value. You can't be "sure" your counter never produces a duplicate, either--strictly speaking--in reality. The bug-free program or the computer that isn't affected by external factors is like the friction-free surface: sometimes useful to think about but not something that exists in reality. And the likelihood of a cosmic ray causing a bit flip or a race condition in the way your counter updates is a lot higher (as in, we see it happen all the time) than the theoretical likelihood of a collision on a sufficiently large and random key.
No, even that is not true. Take all digital (and non-digital) storage media ever manufactured by humans: all hard drives, tape drives, CDs, DVDs, BluRays etc. ever manufactured to date, and every book and word ever printed or written down. If those ALL were filled only with UUIDv4s generated from a good random source... you would still not see even one single collision!
UUID collisions are only possible with currently known human technology if your randomness source is not good enough. And it will remain so unless there are some astronomical leaps in digital information storage technology - at least 10 orders of magnitude more storage than currently exists.
EDIT: I thought of a way for programmers to mentally visualize how unlikely UUID collisions really are. Let's imagine that in some not-too-distant future, there are 10 billion people on Earth. Each of them are given one thousand CPUs. These CPUs have 1024 cores each, and they run at 10 GHz (clock cycles per second). The CPUs implement a hypothetical instruction that can generate a totally random UUID in one clock cycle.
As an experiment, all people on Earth one day decide to program all their thousand CPUs each to run a tight loop that will indefinitely generate UUIDs on all 1024 cores and then immediately discard them.
After continuing to run this experiment (whose electricity bill will make Bitcoin look like Earth Hour) all day, 24/7, for about 800 years, the likelihood of one UUID ever having been generated twice will have exceeded 50%.
I'm sorry to say that your analysis is wildly incorrect.
- 10 billion people =~ 2^33
- 1000 CPUs =~ 2^10
- 1024 cores =~ 2^10
- 10 GHz =~ 2^33
So: one second's computation by all of these people is 2^86 UUIDs generated. UUIDs are 128 bits. With probability essentially 1, there will be a collision within one second.
The reason is known as the birthday paradox. If you sample random values from a set of size k, after you've chosen about sqrt(k) values you will have chosen the same value twice with probability very close to 1/2. By 10*sqrt(k) samples you'll have found a collision with probability well over 90%.
In this case, after sampling 2^64 values you'll have a collision with probability 1/2. That happens in roughly 250 nanoseconds (2^-22 seconds) in your thought experiment.
2^64 sounds like a lot, but in many contexts it's not all that much. Every bitcoin block mined takes well in excess of 2^70 SHA evaluations. Obviously the miners are not dedicated to generating UUID collisions, but if they were they'd easily find thousands of them in the time it takes to mine one block (this neglects the fact that it is much easier to sample a UUID than to evaluate double-SHA256).
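The arithmetic above can be checked with a few lines of Python, using the thought experiment's own numbers and the usual birthday approximation. The 122-bit line is an extra data point, since a v4 UUID carries 122 random bits rather than the full 128.

```python
import math

people = 10 ** 10
cpus_per_person = 1000
cores_per_cpu = 1024
uuids_per_core_per_second = 10 ** 10  # 10 GHz, one UUID per clock cycle

rate = people * cpus_per_person * cores_per_cpu * uuids_per_core_per_second

def seconds_to_half(bits: int) -> float:
    """Time until a collision is about 50% likely for `bits` random bits."""
    samples = math.sqrt(2 * math.log(2) * 2 ** bits)
    return samples / rate

print(math.log2(rate))       # a little over 86, matching the 2^86 estimate
print(seconds_to_half(128))  # a couple of hundred nanoseconds
print(seconds_to_half(122))  # shorter still with 122 random bits
```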
What part of "universally unique" in Universally Unique Identifier (UUID) don't you understand? They are specifically designed so that you can generate them locally without fearing collisions. That's like... their entire point.
No matter what your identifiers look like, if you want them to be easily copyable you should add `user-select: all` to the element containing them.
If you do this, all of the text will be selected automatically when you click on the element.
https://developer.mozilla.org/en-US/docs/Web/CSS/user-select