User IDs probably shouldn't be passed around as ints

peterkelly · on April 28, 2018

I see this as an issue of typing. Programs use integers in many different ways, and if you have a function that accepts an integer as an argument, neither static nor dynamic type checking will catch the case where you pass the wrong "type" of integer. The same goes for UUIDs, strings, or whatever other opaque values you use to reference entities in your system.

You can avoid these problems by wrapping the integers in objects that you use solely for referencing entities of the relevant type. For example:

    class UserRef {
        id: int;
    }

Then you define functions like this:

    function ban_account(user: UserRef) {
        // ...
    }

And a static type checker will pick up incorrect uses. For dynamically typed languages, you could instead use a unique field name such as user_id to achieve the same thing (though you get a runtime error instead of a compile-time error).

Obviously you'll still be using ints or strings as an external representation, but as long as you do the conversion at the point where the identifier enters the program, the type system will take care of the rest for you.
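
As a rough illustration of the wrapper idea in a dynamically typed language, here is a minimal Python sketch. The names `MessageRef` and `parse_user_ref` and the explicit `isinstance` guard are assumptions added for the example, not something from the comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserRef:
    """Opaque reference to a user; wraps the raw integer ID."""
    id: int


@dataclass(frozen=True)
class MessageRef:
    """Opaque reference to a message (hypothetical second entity type)."""
    id: int


def parse_user_ref(raw: str) -> UserRef:
    """Convert the external representation once, at the program boundary."""
    return UserRef(int(raw))


def ban_account(user: UserRef) -> str:
    # Runtime guard for callers that are not covered by a static checker.
    if not isinstance(user, UserRef):
        raise TypeError(f"expected UserRef, got {type(user).__name__}")
    return f"banned user {user.id}"
```

With this in place, `ban_account(7)` or `ban_account(MessageRef(7))` raises a `TypeError` instead of silently banning user 7, and `UserRef(1) == MessageRef(1)` is false even though the wrapped integers match.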

chriswarbo · on April 28, 2018

> I see this as an issue of typing

Yes, this reminds me of "stringly typed programming", i.e. where the language may offer strong types, but the program just uses `String` everywhere. String injection attacks are examples of this: SQL injection can only occur if it's possible to concatenate "SQL statement" with "user input". If these are both represented as `String` then it's easy to run into such problems; if they're represented as different types, then the only way to combine them is with a designated conversion function, which is exactly where we can put the necessary escaping.

See also: http://blog.moertel.com/posts/2006-10-18-a-type-based-soluti...
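
The "different types plus one conversion function" idea can be sketched in Python roughly as follows. `SqlFragment` and `quote_literal` are hypothetical names, and the quote-doubling escape is deliberately simplistic; real code would normally rely on the database driver's parameterized queries.

```python
class SqlFragment:
    """Text trusted to be SQL. A plain str is treated as untrusted input."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __add__(self, other: "SqlFragment") -> "SqlFragment":
        # Refuse to concatenate with anything that is not already SQL.
        if not isinstance(other, SqlFragment):
            raise TypeError("can only concatenate SqlFragment with SqlFragment")
        return SqlFragment(self.text + other.text)


def quote_literal(user_input: str) -> SqlFragment:
    """The designated conversion: the single place where escaping happens."""
    return SqlFragment("'" + user_input.replace("'", "''") + "'")


query = SqlFragment("SELECT * FROM users WHERE name = ") + quote_literal("O'Brien")
```

Trying `SqlFragment("SELECT ") + "raw user text"` fails with a `TypeError`, so unescaped input cannot reach the query by accident.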

weberc2 · on April 29, 2018

CMake is stringly typed, and that’s only part of the reason it’s such an atrocity.

tigershark · on April 28, 2018

It’s what pretty much everyone in the F# or DDD community is advocating. Avoiding the first 10000 integers, as she suggested, is not a solution. And using contiguous integers as user ids is the silliest thing that you can do given that it leaves you also open to enumeration attacks.

naasking · on April 28, 2018

> And using contiguous integers as user ids is the silliest thing that you can do given that it leaves you also open to enumeration attacks.

I don't think it's silly, you just need to protect your parameters. There's a way to do this as part of the basic programming framework. See my other post: https://news.ycombinator.com/item?id=16947546

weberc2 · on April 29, 2018

You’re right that it’s an issue of typing, but lots of type systems handle this just fine. Even Go’s type system, which everyone loves to hate, lets you create new types from existing ones (such as a UserID type based on int). This feature is frequently referred to as newtype, and it's even present in Python via typing.NewType! This is a bit nicer than the wrapper approach you describe for performance reasons (in practice, wrapping an int in a class will create unnecessary overhead for many languages).
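
For reference, a minimal sketch of the `typing.NewType` approach in Python. The `MessageId` type and `ban_account` function are made up for the example. Note that the distinction is enforced only by a static checker such as mypy; at runtime the values are plain ints.

```python
from typing import NewType

UserId = NewType("UserId", int)
MessageId = NewType("MessageId", int)


def ban_account(user_id: UserId) -> str:
    return f"banned user {user_id}"


uid = UserId(42)
mid = MessageId(42)

ban_account(uid)      # accepted by a static type checker
# ban_account(mid)    # a static checker would reject this: MessageId is not UserId
# ban_account(42)     # likewise rejected: a plain int is not a UserId
```

Because `UserId(42)` simply returns `42`, there is no wrapper object and no allocation overhead, which is the performance point made above.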

dominicr · on April 28, 2018

Basically it's saying: don't use numbers because somebody might write crap code and it'll get run on your deployment database?!

If getting such dangerously awful code deployed to production is likely then sequential IDs are just one of your many problems!

Sequential IDs for key data can be good to avoid for a few good reasons but awful code isn't one of them. Testing, code reviews and not having bad programmers should be in place to fix that.

danpalmer · on April 28, 2018

All programmers make mistakes. I think the mark of good software design is using practices that make it impossible or much more difficult to make the mistakes that we know are going to be made at some point.

Much of our code review is not about "is this code correct" - that's obviously important, but rather "will this be easy to review changes to in the future" or "if I came back here in a year, would I get it wrong". I think that's just as valuable.

There's always a trade-off with complexity, and I'm not sure whether this one pays off, but designing _for_ programming errors is important for any team/company/product that is growing, and will introduce new developers who weren't around when the decisions were made.

Chyzwar · on April 28, 2018

I would reject a PR that introduces code like this. I disagree with the article. You should not introduce "safeguards" as described in the article. In a few years' time, someone will introduce a bug by assuming that user IDs start from 1.

Instead, you can write this as (js):

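  // Iterate over the messages directly; no index variable to confuse with an ID.
  function banSendersOfMessages(messages) {
    messages.forEach(({senders}) =>
      senders.forEach((sender) => banAccount(sender)));
  }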

In the Ruby community, there is the principle of least surprise. Your design should not introduce astonishing things.

tigershark · on April 28, 2018

You have to assume that a bug is the norm, not an exception. The more safeguards you put in place to make illegal state unrepresentable and illegal code uncompilable, the better it is.

andrewflnr · on April 28, 2018

The author was an SRE at Google IIRC. Google has a pretty decent code review and testing culture. However, even if you have a process that eliminates 99.9% of the moronic mistakes like this, that kind of scale still makes it a certainty that stuff like this will get through occasionally. She still has to try to keep the service running. That's the perspective from which she's writing.

dominicr · on April 30, 2018

OK, some fair comments there.

For me the article's negative was that it focused on one edge case reason to not use IDs, when there are others.

While everyone makes mistakes I'd hope that using an array's size instead of its value is nigh on impossible to end up committed, let alone deployed to production.

Since everyone makes mistakes, for me the main reason for not using numeric user ids is that you're much more likely to accidentally expose an API or query string param that can be used to look up other user ids. When that happens, being able to enumerate ids makes for a massive data breach, whereas a GUID stops that.

(There may be some reason why you'd be forced to use numeric IDs in a data store for performance at scale reasons but I imagine that's relatively rare.)

eesmith · on April 28, 2018

How do you draw the conclusion that the author is saying "don't use numbers because somebody might write crap code" when one of the suggested improvements was "to use a large key space, like the 64 bit integers (or perhaps the subset which can be represented by Javascript, sigh)"?

int64 are numbers, so the author cannot be saying "don't use numbers".

Instead, it's specifically numbers generated by something like an auto-incrementing primary key in the database.

ris · on April 28, 2018

Problem is that randomly assigned ids tend to perform sub-optimally in relational databases due to the sparse distribution screwing up row estimates and causing fragmented writes to the underlying table. For this reason many people have a policy of having (nice sequential) "internal" ids and then (pseudorandom) "external" ids.
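
A rough sketch of the internal/external split, using Python's built-in `sqlite3` module. The table layout and the `create_user` / `find_user` helpers are assumptions for illustration only.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY,   -- internal, sequential
        public_id TEXT NOT NULL UNIQUE,  -- external, random
        name      TEXT NOT NULL
    )
    """
)


def create_user(name: str) -> str:
    """Insert a user and hand back only the external ID."""
    public_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO users (public_id, name) VALUES (?, ?)", (public_id, name)
    )
    return public_id


def find_user(public_id: str):
    """Look up by external ID; the internal key stays inside the data layer."""
    return conn.execute(
        "SELECT id, name FROM users WHERE public_id = ?", (public_id,)
    ).fetchone()
```

Joins and foreign keys can keep using the compact sequential `id`, while URLs and API responses carry only `public_id`.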

kyrra · on April 29, 2018

Depends on your database. For example, with Spanner it is better to use random numbers for the primary index, as monotonically increasing ints will cause hotspots on the database.

https://cloud.google.com/spanner/docs/schema-design#choosing...

_em6m · on April 28, 2018

Hi, I'm mister UUID and I have been living in your favourite framework since 2001. Use me.

drawkbox · on April 28, 2018

UUID are the way, GUID for those in Microsoft's lands.

Not only will you never run out, but you also don't need a round trip to the keymaster. Horizontal scaling, clustering and even vertical sharding are all much easier because of that. Once the keymaster is no longer needed, a single point of failure is removed entirely.

Using ints as keys for profiles/users and many other things was necessary way back when processing, DB, disk and memory performance were a problem. That is no longer the case.

Do your part, join the UUID revolution. Also, if you were a piece of data, would you not want to be unique across all the databases, storage and services? You've heard of Roko's Basilisk right? Do not disappoint.

kasey_junk · on April 28, 2018

I’ve often wondered where the tipping point is where you do start having to worry about collisions. Is frantic “fix all the uuid breaking code” the jobs boost for programmers 20 years from now? 100?

It seems unthinkable now but who knows...
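
For a sense of scale, the usual birthday-bound approximation can be evaluated directly. This is a back-of-the-envelope sketch that assumes version 4 UUIDs with 122 truly random bits and a sound random number generator; the function names are made up.

```python
import math

RANDOM_BITS = 122  # random bits in a version 4 UUID


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Birthday approximation: p is roughly 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))


def ids_for_probability(p: float, bits: int = RANDOM_BITS) -> float:
    """Approximate number of IDs needed to reach collision probability p."""
    return math.sqrt(2 ** (bits + 1) * math.log(1 / (1 - p)))


print(collision_probability(10**12))  # a trillion IDs
print(ids_for_probability(0.5))       # IDs needed for a 50% chance
```

Under these assumptions a 50% chance of any collision needs on the order of 10^18 identifiers, so the practical risk comes from a weak generator rather than from the size of the key space.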

drawkbox · on April 28, 2018

We'll be good until at least sentient AI takes over and they can solve it for us, or them.

sethammons · on April 28, 2018

We've had multiple collisions with uuid v4. I'm not overly familiar with the examples, but in one case, the collision was found because the collision happened in the same account! Customer: "Why do these two events have the same id?" Us: "Wow." I think we determined that it was a limitation in the random number generator in the Perl lib we were using. To avoid that, I believe the solution was to go with uuid v1 plus a nonce of some sort.

cesarb · on April 28, 2018

If you have collisions with UUID v4, something is really broken with your UUID generator. It's not a "limitation", it's a critical bug.

partycoder · on April 29, 2018

GUIDs are expensive to compute. They are based on your MAC address, that is read every time from your registry.

NathanOsullivan · on April 29, 2018

You are describing a particular way of creating a UUID ("Version 1").

Another method, "Version 4", is simply a 122-bit random number.
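
Python's standard `uuid` module exposes both variants, which makes the difference easy to see:

```python
import uuid

v1 = uuid.uuid1()  # timestamp + clock sequence + node ID (often the MAC address)
v4 = uuid.uuid4()  # random bits, apart from the fixed version and variant fields

print(v1, v1.version)
print(v4, v4.version)
```

Of the 128 bits in a version 4 UUID, 4 encode the version and 2 the variant, which leaves the 122 random bits mentioned above.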

megaman22 · on April 28, 2018

Holy schmolies, yes. And if you use them as your access keys for, say, a webservice, people aren't really going to be able to fish around the same way they can if you use auto-increment integers.

If your url looks like https://awesomeunicorn.com/userProfile/123, it's pretty obvious that I could start poking around and trying to get user 122 or 124. If that url is https://awesomeunicorn.com/userProfile/123456-dead-beef-abba..., I don't really have any idea what the ID of the next user might be.

pletnes · on April 28, 2018

The downside is that your urls are long, ugly and incomprehensible.

zxcmx · on April 28, 2018

The upside is that your business is not totally transparent to outsiders who can figure out how many users, carts, products and messages you are adding each month.

pletnes · on April 28, 2018

There must be some other way. I can often figure out the direct url to a software project in bitbucket, but office365 urls are pages long. I somehow doubt atlassian is as transparent as you say.

Disclaimer: not a web dev

kuschku · on April 28, 2018

This is a massive issue with pastes on GitLab.

Unlisted, private, public snippets all get links such as https://gitlab.com/snippets/1712835

If you have an instance with few users, you can just create a new snippet and then auto-decrement.

e.g. I just found https://gitlab.com/snippets/17128 this way.

llao · on April 28, 2018

If trying more or less random URLs of your service is an issue for you, you have an issue.

megaman22 · on April 28, 2018

True, you absolutely should have security and authorization set up. But if the server replies to the request with a 401 or 403, rather than a 404, you at least know that there is something there.
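
One common way to avoid that leak is to answer "not found" in both cases. A minimal sketch, with a made-up `status_for` helper and a dict standing in for the stored resource:

```python
from typing import Optional


def status_for(resource: Optional[dict], requester: str) -> int:
    """Return 404 both when the resource is missing and when access is denied."""
    if resource is None or resource["owner"] != requester:
        return 404
    return 200
```

An outsider probing IDs then cannot distinguish "does not exist" from "exists but is not yours".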

edejong · on April 28, 2018

Other advantages of UUIDs:

- client can generate the UUID, allowing eventual consistency and retrying using distributed DBs (other DB is on phone for example)

- merging of DBs is easy

- allows capability based access control (you can’t guess the UUID, you don’t have access)

- can use a bit or two to encode production vs development IDs, so cross contamination of systems is less likely
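
The last point in the list above could look something like this in Python. Using the lowest bit as the environment flag is an arbitrary choice for the sketch, and `new_id` / `is_production_id` are hypothetical helpers; the flag costs one bit of randomness.

```python
import uuid


def new_id(production: bool) -> uuid.UUID:
    """Random UUID whose lowest bit marks the environment (1 = production)."""
    raw = uuid.uuid4().int
    raw = (raw & ~1) | int(production)  # overwrite only the lowest bit
    return uuid.UUID(int=raw)


def is_production_id(value: uuid.UUID) -> bool:
    return bool(value.int & 1)
```

A development system can then reject any ID for which `is_production_id` is true, and vice versa, before touching the database.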

malingo · on April 28, 2018

This post discusses using UUIDs as primary keys (but never exposing them), and an additional integer key on each table as the public facing ID.

https://begriffs.com/posts/2018-01-01-sql-keys-in-depth.html

HN discussion: https://news.ycombinator.com/item?id=16050047

mike-cardwell · on April 28, 2018

Looking at the provided example. Nothing like that has ever happened in any code I've ever written, or I've ever seen written by anybody else, and I have absolutely no fear that I will ever see anything like that happen in any code I write for the rest of my career/life.

So, not a great example, and not a very convincing argument to me to stop using integers.

tom_ · on April 28, 2018

That's interesting. Every project I've worked on has had at least one bug of this form, and typically more than one, where the integer-type indexes into a table of integer-type IDs have got mixed up with the integer-type IDs.

You might be surprised how much of a pain in the arse it is to even realise these bugs exist, because during development and initial tests these tables have a nasty habit in many situations of containing IDs that are the same as the index. And when everything is an int, or similar, fixing them can be quite painful too. It just takes one bug in one function for a set of subtly broken workarounds and/or misunderstandings to spread throughout the code. Lots of places where functions take an "id" and then pass it into a function that takes an "index", or vice versa... just what was the intention here? :(

This shit is the worst kind of bug.

My usual solution:

    struct ThingID {uint64_t id;};
    typedef struct ThingID ThingID;

    struct OtherThingID {uint64_t id;};
    typedef struct OtherThingID OtherThingID;

And that's it. When this turns out to be at all syntactically inconvenient, it's a sign you're possibly doing the wrong thing.
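
How small test fixtures hide index/ID mixups can be illustrated with a short Python sketch. The message layout and function names are invented for the example.

```python
def sender_ids_buggy(messages):
    # Bug: returns each message's position in the list, not its sender's ID.
    return [index for index, _ in enumerate(messages)]


def sender_ids(messages):
    return [message["sender_id"] for message in messages]


# Typical fixture: IDs were handed out in order, so they equal the indexes.
test_messages = [{"sender_id": 0}, {"sender_id": 1}, {"sender_id": 2}]

# Realistic data: IDs are unrelated to list positions.
prod_messages = [{"sender_id": 5012}, {"sender_id": 77}, {"sender_id": 5012}]
```

Both functions return the same result for `test_messages`, so a test built on that fixture passes, while on `prod_messages` the buggy version returns `[0, 1, 2]`.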

JoachimSchipper · on April 28, 2018

Agree. I've used 'uint64_t thing_id' in somewhat-similar situations, to make it visually obvious which type you're dereferencing.

(That is, thing_do(foo.other_thing_id) still compiles, but at least it looks wrong.)

mike-cardwell · on April 28, 2018

"Every project I've worked on has had at least one bug of this form, and typically more than one"

Then we have had completely different experiences in software development and are unlikely to agree on the importance of the content in this post.

tom_ · on April 28, 2018

Quite! It's supposed to be an alternative viewpoint, nothing more.

Thinking about it, maybe I shouldn't have opened with "That's interesting", which here means exactly what it says, but is a phrase sometimes deployed with malicious intent.

matt_the_bass · on April 28, 2018

I like your solution.

mrob · on April 28, 2018

What arithmetic operations do you realistically expect to do with user ids? You might not make that specific error, but it seems to me that any arithmetic operation is likely to be a bug, and giving user ids their own type will catch all of them.

__s · on April 28, 2018

This can be addressed by using 'opaque ids' which are typed to be incomparable and non-interchangeable. Then have explicit to/from int conversion.

ajnin · on April 28, 2018

There are several points in this article :

1/ stuffing random ints into your user functions : that problem can be solved with typing and tests. I'd hope that any piece of code which could have a drastic effect on a user would get extensive testing before going into production.

2/ ID canary : seems a rather good idea, like stack canaries commonly used when you don't have much stack space and you might get a collision with your heap. It's only a problem for languages for which point 1/ couldn't be a solution.

3/ Using UUIDs to avoid disclosing information about your user count : I think that's a separate problem. You should avoid disclosing unneeded information in general. If you use int IDs, always have an opaque public_id field that you use publicly and for interoperability with third-parties. But it does not mean you have to use UUIDs internally. However they do have a number of advantages, mainly that you don't need a central authority to distribute new sequence numbers, you can just generate UUIDs where you need them which will save you DB round-trips and make sharding of your DB easier. Also will help avoid issues such as this one : https://blog.travis-ci.com/2018-04-03-incident-post-mortem

oftenwrong · on April 28, 2018

Types, as mentioned, make this easy to avoid. For example:

A User has an ID of type UserId.

A Message has an ID of type MessageId and a sender of type UserId.

ban_senders_of_messages would have a parameter of type [Message]

ban_account would have a parameter of type UserId

peterburkimsher · on April 28, 2018

I think the real problem is the terse coding style. Why does the for loop always use "i"?

When I write code, I don't use "i" or other single-letter variable names. I write long names, like "currentItem", "currentMessage", "currentUser", etc. If I reference an object, I usually name it "thisItem", "thisMessage", "thisUser".

Compilers shrink executables, so it's not a size issue. I don't know why people want code to be shorter; I prefer it to be easy to read and debug.

mic47 · on April 28, 2018

Naming won't help; sooner or later someone will make a similar mistake that will slip through code review.

The real problem is that it is possible to accidentally make this type of mistake. If you can avoid it by leveraging the type system, you should. Relying on humans to never make mistakes is futile.

naasking · on April 28, 2018

Since I largely program in C#, I typically define the unique ids on my classes as enums. Enums in C# are open, not closed, so you can define something like a UserId like so:

    public enum UserId { Unset = 0 }

    public class User
    {
        public UserId Id {get;set;}
    }

This works with various ORM and other mapping tools since enums are ints underneath, but you get static checking.

Then there's the issue of protection when passing around ids in web apps as parameters, in cookies, etc. I devised Clavis [1] as an experiment for protecting URL parameters via an HMAC. The idea works pretty well in practice, but its current incarnation is a little too cumbersome to use.

[1] A url http://foo.com?userId=1234 becomes http://foo.com?-userId=1234&clavis=asdbwef67t34rfbs, where the 'clavis' parameter is an HMAC of the URL's protected parameters, and changing any of them causes the request to fail. Unprotected parameters are also supported, so GET form submissions are still possible. See: http://higherlogics.blogspot.ca/2014/01/clavis-rebooted-secu...
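
The general idea of signing URL parameters with an HMAC can be sketched with Python's standard library. This is not how Clavis itself is implemented; the `sig` parameter name, the key, and the helper functions are assumptions for the example.

```python
import hashlib
import hmac
from urllib.parse import urlencode

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical server-side key


def sign_params(params: dict) -> str:
    """Hex HMAC-SHA256 over the sorted, URL-encoded parameters."""
    canonical = urlencode(sorted(params.items())).encode()
    return hmac.new(SECRET_KEY, canonical, hashlib.sha256).hexdigest()


def protected_query(params: dict) -> str:
    """Query string with the signature appended as an extra parameter."""
    return urlencode({**params, "sig": sign_params(params)})


def verify(params: dict, sig: str) -> bool:
    """Constant-time check that the parameters match the signature."""
    return hmac.compare_digest(sign_params(params), sig)
```

If a client changes `userId=1234` to `userId=1235` without knowing the key, `verify` returns `False` and the request can be rejected before any lookup happens.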

stu_douglas · on April 29, 2018

Last week we caught a bug caused by a security check on the user's id (a Java Long) that was accidentally using == instead of .equals(). Since Java caches Longs between -128 and 127, == will pass for any ids <128, including those from our tests. Our QA stage only caught it because the tester happened to be a later stage employee, so had a user ID > 128.

paulbjensen · on April 28, 2018

Another risk is that numerical ids for users (as well as for other database tables) can be used to infer the growth rate of a business. If you record the time that a user is created at, and then repeat the process over a time period, you can work out how many users are joining the app over that time period, and potentially work out the total number of users too.
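
The arithmetic behind this is trivial, which is the point. A sketch with made-up observations (sign up on two dates and note the ID you were given):

```python
from datetime import date


def signups_per_day(first, second) -> float:
    """Estimate daily signups from two (date, observed_user_id) samples."""
    (day_a, id_a), (day_b, id_b) = first, second
    return (id_b - id_a) / (day_b - day_a).days


# Hypothetical observations, not real data.
rate = signups_per_day((date(2018, 4, 1), 10_000), (date(2018, 4, 29), 12_800))
print(rate)  # 100.0
```

The most recent ID also gives an upper bound on the total number of accounts ever created.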

pornel · on April 28, 2018

Strongly typed languages help somewhat, but you still often need to store or serialize the data in something less clever (e.g. JSON).

To reduce risk of mixup of different kinds of IDs in the system I used different increment values in Postgresql sequences (e.g. 13 for users, 7 for categories), so the IDs quickly went out of sync and had little overlap.
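
How much overlap is left with those increments can be checked quickly. The sketch below assumes each sequence starts at its own increment value, which the comment does not specify.

```python
def ids(step: int, limit: int) -> set:
    """IDs a sequence with this increment would hand out, up to `limit`."""
    return set(range(step, limit + 1, step))


users = ids(13, 10_000)
categories = ids(7, 10_000)
shared = users & categories  # values that are valid as both kinds of ID

print(len(users), len(categories), len(shared))
```

Under that assumption only multiples of 91 are valid in both sequences, so roughly one user ID in seven is also a category ID. A mixup usually fails a lookup right away, though not always.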

floatboth · on April 28, 2018

Another reason: prevent users from incrementing their ID in the URL bar and discovering whatever pages...

IgorPartola · on April 28, 2018

Whenever I deal with an external API, even if it gives me what looks like an integer for any object it has, I treat it as a string. I never do math on it, I don’t care about saving space, and I never know when they’ll run out of space and need to switch. Remember when Twitter famously stopped working because their IDs were BigInt but in JavaScript you only have 53 bits of space? Yeah, I don’t want that.

Also a stupid number of APIs I deal with like to use one or more leading zeros in their identifiers. The meaning isn’t different, but it gets annoying when trying to do search, because of course the end user typed those in and wants to be able to look up whatever as 021.
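
Both problems are easy to reproduce. The sketch below uses Python floats, which are the same IEEE 754 doubles that JavaScript numbers are; the JSON payload is invented for the example.

```python
import json

big_id = 2**53 + 1  # 9007199254740993
# A double cannot distinguish this from 2**53.
print(float(big_id) == float(2**53))  # True

payload = '{"id": 9007199254740993, "code": "021"}'

# parse_int=str keeps integer-looking values as the exact text that was sent.
record = json.loads(payload, parse_int=str)
print(record["id"])    # 9007199254740993
print(record["code"])  # 021
```

Keeping the identifier as a string also preserves leading zeros, which `int("021")` would silently drop.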

based2 · on April 28, 2018

https://github.com/jOOQ/jOOQ/issues/5589

teddyh · on April 28, 2018

By the same argument, Unix file descriptors should not be ints, either.

This argument is, in essence, an argument for strongly typed languages. There are many arguments against this historical argument, and it is by no means a settled issue – on the contrary, it is very slowly, as the years go by, looking more and more like the strongly typed languages are on the way out.

megaman22 · on April 28, 2018

> on the contrary, it is very slowly, as the years go by, looking more and more like the strongly typed languages are on the way out.

What? I'd argue the complete opposite. Above a certain level of complexity, lack of type checking becomes so onerous and bug-inducing that dynamic languages start introducing stronger typing. Typescript is paradise compared to Javascript, and even Python has added an optional type-checker.

dredmorbius · on April 28, 2018

What you more likely want is a sense of privileged, sensitive, and/or high-consequence users who aren't trivially compromised or blocked.

E.g., https://en.wikipedia.org/wiki/Politically_exposed_person

That and some sanity in your account-handling ops.

gwbas1c · on April 28, 2018

It's standard security practice not to expose integer user IDs. Anyone who goes through standard security training knows this.

Granted, one thing languages could do is provide easy type containers so it's hard to misconstrue an ID as referring to the wrong type. I once tried to do this with generics in C# but it wasn't worth the effort.

tigershark · on April 28, 2018

C# is ill suited for this sadly. On the other hand, in F# you have single-case discriminated unions, or units of measure if you care about performance.

hungerstrike · on April 28, 2018

What is the standard called and where can I read about that?

osrec · on April 28, 2018

I think user IDs should be passed around as ints. It's efficient and simple. If you've got critical code that might lock someone out (or worse), you're much better off spending time testing it to ensure it works, rather than building in inefficient paradigms into your data model. For argument's sake, imagine if your data model also stores the sender's current manager per message. You could by mistake do this as well (a little slip of the finger when using code completion):

  ban_senders_of_messages(messages) {
    for (i = 0; i < messages.size(); ++i) {
      // Deliberate mistake: bans the sender's manager instead of the sender.
      ban_account(messages[i].sendersManager);
    }
  }

The only way to catch that is to test, and because the manager and sender will have similar data types, it will compile/execute just fine. The point is, this is probably a more likely error than the one mentioned in the article, and needs manual inspection and testing to correct. If you're carrying out that process anyway, the added inefficiency of a more elaborate data type just for user IDs seems redundant.

tom_ · on April 28, 2018

Why not define a new data type for each type of ID? Simple and effective. And there's no inefficiency; these things aren't created and destroyed all the time! They're just retrieved from other objects and then passed around.

In languages that support value types, you'll typically make them integer size, so the cost is likely to be the same as passing an integer. In languages that support reference types only, you're just passing around a pointer anyway.

osrec · on April 28, 2018

But in my example, surely the data type for a manager's ID will still be the same as that of the user, so I'm not sure it really solves the problem entirely.

mic47 · on April 28, 2018

Then maybe you should have manager's IDs done as separate type too?

osrec · on April 28, 2018

But they're all users, where one user can be set as a manager for another. Why would you have a separate type for each?!

mic47 · on April 28, 2018

If you need to mix them up, there are a bunch of options. It all depends on what you need to do and what language you are using, but usually there should be a simple solution to this.

For example, if you simply want to have functions that work for both of them (so that you don't duplicate code), you can either create a function from Manager to User (so that you can reuse functions for users), or use whatever polymorphism your language supports (polymorphic functions, OOP, ...).

If you want to mix User/Manager in the same collection (or have a function that returns either of them), OOP can help too (Manager is a "child" of User). If your language has sum types, you can use those (have an additional type "User or Manager", and an accompanying matching/extraction function).

In some languages you can do this with no runtime overhead (i.e. the additional type will be erased during runtime, as it's already type checked).

tom_ · on April 28, 2018

But your example implies you shouldn't be allowed to get them mixed up?

If manager IDs and user IDs truly are the same type of thing, then there's a limited amount the type system can do for you in this respect. Maybe you'll have to stop at this point and just accept that you'll have to exercise a certain degree of care.

But there's a big gap between stopping there, in my view, and what you appear to be advocating: deciding that since manager IDs and user IDs are the same thing then you might as well give up entirely and just decide that they may as well be the same thing as ints while you're at it.

osrec · on April 28, 2018

Not quite. If I was to truly critique the example, the problem lies mainly with the ban function, which appears to allow ints to be passed in as arguments. If we have static typing available to us (the example in the article doesn't seem to), then yes, I agree we should ensure that only the relevant type should be allowed as the argument. But still, problems can lurk in the shadows, for example, you could initialise a User type with a message ID by mistake, or even an iterating int, much like the example in the article. I'm not trying to be difficult or unnecessarily contrarian, but my experience tells me that you can put a whole bunch of safeguards into your code, but nothing beats testing at catching bugs, and sometimes, the safeguards are not worth the efficiency hit. Worse still, the safeguards can at times provide a false sense of security.

tom_ · on April 28, 2018

It shouldn't be an excuse to dispense with testing altogether, but static typing is certainly superior to testing for certain classes of bug. The compile-time checks prove certain types of defect simply don't exist, which is the kind of guarantee no amount of testing can give you in that respect for any useful program.

As for the initialisation problem, it's true that it can't be structs all the way down, and at some point you will have to create one of these objects, probably from a primitive with a non-meaningful type such as int, or string. But my experience is that IDs and the like tend to be created in a small number of places, and then reused, copied and passed around. Far easier to find and check all the places where one is created than all the places where one is used!

stavros · on April 28, 2018

I can recommend the ShortUUID[1] Python package I wrote, I use it for all my IDs nowadays. The good thing about it is that it makes nice and short human-readable/typable IDs that you can use for all your objects, so you don't care even if you expose them to the user.

lazyjones · on April 28, 2018

This has nothing to do with user IDs. It's a general problem with antiquated languages that quietly convert/promote compatible integer types and can happen to any other integer types.

Solution: use a modern language like Go or use pointers to structs like everyone else.

olingern · on April 28, 2018

I hope that it's obvious that you shouldn't expose part of your implementation to your end users a la auto incrementing primary keys.

Obfuscation of how data is queried and stored is pretty low hanging fruit, security-wise.

fma · on April 28, 2018

Besides all the other arguments made in the comments about other solutions for user id's...

This code is so easy to write a unit test for. I hope it didn't even get committed, let alone deployed to prod.

alt_ · on April 28, 2018

Doesn't pretty much every language nowadays have foreach loops that don't require keeping an integer index around when iterating over elements? That seems like a way better idea.
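
In Python, for instance, the loop from the article could be written without any index at all. This is a sketch; the message layout and the injected `ban_account` callback are assumptions.

```python
def ban_senders_of_messages(messages, ban_account):
    """Ban the sender of each message, iterating over the messages directly."""
    for message in messages:  # no index variable that could be mistaken for an ID
        ban_account(message["sender"])


banned = []
ban_senders_of_messages([{"sender": 501}, {"sender": 77}], banned.append)
print(banned)  # [501, 77]
```

With no `i` in scope, there is simply nothing integer-shaped to pass to `ban_account` by mistake.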

all_blue_chucks · on April 28, 2018

If you need a way to uniquely identify something in the universe, there is no reason to get clever. Just use Universally Unique IDentifiers. UUIDs. Done.

foxhop · on April 28, 2018

I use a UUID for the primary key, it protects against this as well as other issues. Does have some drawbacks though.

partycoder · on April 29, 2018

And also you should not recompute the size on each iteration, or ban the same user more than once.

originalsimba · on April 28, 2018

Why would you use anything but a string for User IDs?

My understanding of numerical types is that they exist to perform math. User IDs are not used for math, they're a completely arbitrary vanity system to assist with identification, so they should be strings, equally arbitrary.

Personally, I think E-mail addresses are the best user identifiers these days. Back in the day when there were like 5 websites everyone used, having your username was a cool thing. These days there's a billion websites and nobody uses the same ones and there's zero inter-user interaction on most sites. From the perspective of user friendliness, E-mail addresses are the easiest because you kill two birds with one stone (contact method + username + password recovery).

If you want a numerical ID, what about using a hash of the E-mail address? Or perhaps a combination of things, email, full name, sign-up date.

stavros · on April 28, 2018

> E-mail addresses are the best user identifiers these days

Oh god no. You don't want all your IDs changing when a user changes their email address.

You probably want your ID (e.g. a UUID) and your user-friendly lookup method (e.g. an email) to be separate.

originalsimba · on April 28, 2018

> Oh god no. You don't want all your IDs changing when a user changes their email address.

That's a pretty passionate response, can you explain your logic? What are you doing with your usernames that you can't afford to let users change them?

detaro · on April 28, 2018

The article doesn't use "User ID" in the sense of "username" (externally visible identifier, that likely is used for log in), but as in "mostly internal id that's used to reference a user across database tables, services, ...". If you use something that can change in there, you need to do the change across all those things consistently, which is a lot of potential for error.

Or am I misunderstanding your perspective?

stavros · on April 28, 2018

That's exactly it, and the article doesn't talk about user-facing usernames, since they're automatically incrementing ints.

originalsimba · on April 29, 2018

I got that but I can't imagine why you would even use the User ID for anything if we're talking about the row ID from the database. If you're doing tests against a user's profile why not use their username? There must be some case-examples that I'm not thinking of...

I know that some services have a public-facing "username" and a behind the scenes unique identifier (which is a great UX model), I'm just focusing on the unique identifier. Which I would think should always be its own column, whether it's also used for the public "username" or not.

> Because if you change that, all your relations between tables will break.

Okay, that is not a response to my question, which is why would you ever use the row ID for anything in your program. If you never use it, then it cannot ever be changed. Also SQL allows relationships based on more than one field, so it seems such a disaster could be easily avoided.

stavros · on April 29, 2018

Because if you change that, all your relations between tables will break.

fasj82 · on April 28, 2018

[flagged]

jstanley · on April 28, 2018

I've done an UPDATE without a WHERE on a large database before. Fortunately the table was so large that the UPDATE took a significant amount of time, and I was able to type ^C before it completed, which meant the result of the update never got committed and no damage was done!

protomyth · on April 28, 2018

If the database you are using understands transactions then it is best to begin a transaction before any update to make sure you did it correctly.
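
A minimal sketch of that habit using Python's built-in `sqlite3` module. The table, the row-count check, and the expected count of 1 are assumptions for the example.

```python
import sqlite3

# isolation_level=None leaves transaction control to explicit BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, banned INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,), (3,)])

conn.execute("BEGIN")
cursor = conn.execute("UPDATE users SET banned = 1")  # oops: no WHERE clause
if cursor.rowcount != 1:
    conn.execute("ROLLBACK")  # more rows touched than intended, undo everything
else:
    conn.execute("COMMIT")

banned_count = conn.execute(
    "SELECT COUNT(*) FROM users WHERE banned = 1"
).fetchone()[0]
print(banned_count)  # 0
```

The same pattern works interactively: run `BEGIN`, run the update, look at the reported row count, and only then decide between `COMMIT` and `ROLLBACK`.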

danpalmer · on April 28, 2018

Please try to refrain from using language such as "retarded". It's really quite offensive, and there are less offensive words that would likely make your argument stronger.

I recommend reading the comment guidelines.

https://news.ycombinator.com/newsguidelines.html