
> What if the social security number changes, because it was wrong or because it actually changes?

What if the user doesn't have an SSN? What happens if they have one but lawfully refuse to provide it? What happens when you ask for an SSN from a US citizen who is also a European citizen? What happens when your database leaks?

In general, relying only on natural keys is a nightmare. Double nightmare if it's PII. Natural keys only work if you are flawlessly omniscient about the domain. And you aren't.




In my experience, there are ways to use natural keys and still handle domain changes; I've seen some systems like this work quite successfully. The cost of using synthetic IDs (auto-increment, UUIDs) is a lack of reproducibility and slower importing of data, especially across multiple tables/entities, which limits scalability. This can be very problematic for certain classes of applications I've seen, but not most. While I agree with your comment for many classes of apps, as always there is no general "silver bullet" answer: it depends on your problem space.

Some cases I've seen in previous roles where a natural key is required include reconciling third-party data sources, or processing events from a topic or stream and being able to replay the event log, knowing that a different ID may break third parties you don't control because they've already imported the ID. Being able to replay your data sets from scratch and get exactly the same data can have some real advantages for some apps.
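To make the replay point concrete, here is a minimal sketch in Python. It is one possible way to get reproducible IDs, not necessarily how the systems described above did it: the ID is derived deterministically from the natural key with `uuid.uuid5`, so replaying the same log always yields the same IDs. The namespace, the event shape, and the names `stable_id` and `replay` are assumptions made up for illustration.

```python
import uuid

# Hypothetical namespace for this illustration; any fixed UUID would do.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def stable_id(source: str, external_key: str) -> uuid.UUID:
    """Derive the same ID every time from (source, external_key)."""
    return uuid.uuid5(NAMESPACE, f"{source}:{external_key}")


def replay(events):
    """Rebuild state from an event log; IDs depend only on event contents."""
    state = {}
    for event in events:
        entity_id = stable_id(event["source"], event["key"])
        state[entity_id] = event["payload"]  # later events overwrite earlier ones
    return state


# Assumed event shape: source system, its natural key, and a payload.
log = [
    {"source": "vendor_a", "key": "PO-1001", "payload": {"qty": 5}},
    {"source": "vendor_a", "key": "PO-1002", "payload": {"qty": 2}},
    {"source": "vendor_a", "key": "PO-1001", "payload": {"qty": 7}},
]

first = replay(log)
second = replay(log)  # replaying from scratch gives identical IDs and data
```

With auto-increment or random UUIDs, the second replay would hand out different IDs unless the original assignments were stored and shipped alongside the log.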

Of course you need to be aware of the domain, assume that the key can change over time, and have strategies to deal with that (e.g. entity version tables bound by time, data migration to add key attributes, etc.), and the data structures/processes need to be designed for this. There's more work in it for sure to get right, so it shouldn't be the default. But in some cases I've seen it work really well, which frankly surprised me at the time.
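As a rough illustration of what "entity version tables bound by time" could look like, here is a small sketch using Python's built-in `sqlite3`. This is one reading of the idea, not a description of any particular system; the table `customer_version`, its columns, and the helper names are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_version (
        natural_key TEXT NOT NULL,
        valid_from  TEXT NOT NULL,   -- ISO-8601 date, inclusive
        valid_to    TEXT,            -- exclusive; NULL means still current
        name        TEXT NOT NULL,
        PRIMARY KEY (natural_key, valid_from)
    )
""")


def record_version(conn, natural_key, name, as_of):
    """Close the current version (if any) and open a new one at as_of."""
    conn.execute(
        "UPDATE customer_version SET valid_to = ? "
        "WHERE natural_key = ? AND valid_to IS NULL",
        (as_of, natural_key),
    )
    conn.execute(
        "INSERT INTO customer_version (natural_key, valid_from, valid_to, name) "
        "VALUES (?, ?, NULL, ?)",
        (natural_key, as_of, name),
    )


def lookup(conn, natural_key, as_of):
    """Return the name that was valid on the given date, or None."""
    row = conn.execute(
        "SELECT name FROM customer_version "
        "WHERE natural_key = ? AND valid_from <= ? "
        "AND (valid_to IS NULL OR valid_to > ?)",
        (natural_key, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


record_version(conn, "ACME-1", "Acme Ltd", "2020-01-01")
record_version(conn, "ACME-1", "Acme Inc", "2022-06-01")

before_change = lookup(conn, "ACME-1", "2021-01-01")
after_change = lookup(conn, "ACME-1", "2023-01-01")
```

ISO-8601 date strings compare correctly as plain text, which is why the sketch gets away with `TEXT` columns; a real schema would also need to decide how to record a change to the natural key itself.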


A Social Security number is far from unique. https://www.computerworld.com/article/2552992/not-so-unique....

Identity crisis: how Social Security numbers became our insecure national ID https://www.theverge.com/2012/9/26/3384416/social-security-n...

Back in the 1990s I was trying to convince my colleagues not to use SSNs as unique IDs. I've since noted that quite a few organizations that had gravitated to SSNs as IDs had to go through expensive and chaotic migrations to real unique IDs.


How do you look for a person, if not based on his/her SSN?

SSN alone is not sufficient, of course. But it _is_ definitely part of the natural key that you use _implicitly_ ANYWAY, whether recognizing it or not.

> Natural keys only work if you are flawlessly omniscient about the domain

I would call that BS. Nobody is "flawlessly omniscient" about anything, not even in mathematics, yet we design and build systems that work.

On the other hand, yes, it is a very good requirement to have someone on the team during database modeling who understands the domain model thoroughly. No UUID columns will save you from that.


> How do you look for a person, if not based on his/her SSN?

These are different concepts. "Looking for a person" means search. You can look for people lots of ways. In medicine for example, they often look for first name + last name + birthday. Is that a unique ID? No, but it's close enough for search usually.

On the other hand, for any kind of indexing or foreign keys, you want an actually unique, immutable ID, which means you don't want a natural key, you want an artificial ID created just for your database.
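A minimal sketch of that split, using Python's built-in `sqlite3`: search goes through a non-unique index on name and birthday, while foreign keys point at an artificial ID. The schema, the sample data, and the helper names are hypothetical, and the SSN is treated as an optional, correctable attribute rather than a key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (
        id         INTEGER PRIMARY KEY,  -- artificial ID, meaningless outside this database
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL,
        birthday   TEXT NOT NULL,
        ssn        TEXT                  -- optional and correctable, not a key
    );
    -- Non-unique index: good enough for search, not an identity.
    CREATE INDEX person_search ON person (last_name, first_name, birthday);
    CREATE TABLE visit (
        id        INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(id),
        note      TEXT NOT NULL
    );
""")


def add_person(conn, first_name, last_name, birthday, ssn=None):
    cur = conn.execute(
        "INSERT INTO person (first_name, last_name, birthday, ssn) "
        "VALUES (?, ?, ?, ?)",
        (first_name, last_name, birthday, ssn),
    )
    return cur.lastrowid


def search(conn, first_name, last_name, birthday):
    """Search may return several candidates; the caller picks one."""
    rows = conn.execute(
        "SELECT id FROM person "
        "WHERE last_name = ? AND first_name = ? AND birthday = ? ORDER BY id",
        (last_name, first_name, birthday),
    ).fetchall()
    return [row[0] for row in rows]


# Two different people with the same name and birthday; one has no SSN on file.
a = add_person(conn, "Ana", "Silva", "1980-02-29")
b = add_person(conn, "Ana", "Silva", "1980-02-29", ssn="000-00-0000")
conn.execute("INSERT INTO visit (person_id, note) VALUES (?, ?)", (b, "checkup"))

# Correcting the SSN touches one column; no foreign key has to change.
conn.execute("UPDATE person SET ssn = ? WHERE id = ?", ("000-00-0001", b))
matches = search(conn, "Ana", "Silva", "1980-02-29")
```

Had `ssn` been the primary key, the first person could not be stored at all, and the correction would have had to cascade into every table that references it.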


> I would call that BS. Nobody is "flawlessly omniscient" about anything, not even in mathematics, yet we design and build systems that work.

And such systems typically use synthetic keys to completely dodge the kind of problems I outlined.

The problem with natural keys is that you, the programmer, don't know as much as you think you know. You muddle the problem domain with the solution domain, and when something comes along in the problem domain that you didn't think of, it's now much harder to fix.

> On the other hand, yes, it is a very good requirement to have someone on the team during database modeling who understands the domain model thoroughly. No UUID columns will save you from that.

They save you from having to work out how to store a record when you chose SSN as primary key and discover that, uh, no, you can't do that. The same goes for purchase order numbers, waybill numbers, student IDs, payroll IDs, bank account numbers, license plates ... anything whatsoever that is visible in the problem domain will have exceptions you didn't know about and didn't foresee, and for which legislation or policy grants you no exemption just because you didn't use a UUID column.


I want to print this comment and frame it.



