
Basic incremental counters work under most real-world production constraints. Most people are not going to create tables with 4.2 billion rows, even counting failed inserts. If you are, it's one of two extremes: either you very much know what you are doing, or you very much do not. I have seen both in production.


What if I have multiple partitions? Replication? What if I don't want business data to be exposed due to strictly incremental counters? What if I want unique IDs across different tables?


Partitions should not impact the use of an INT PK, except that you’ll need to include the partition key in the PK, e.g. (id, created_at) if partitioning by datetime. The displayed ordering without an explicit ORDER BY may not make sense, but to be fair, there are never any guarantees about implicit order.

Replication should be fine, unless you mean active-active, in which case I suggest (a) not doing that, or (b) using interleaved chunks or a coordinator node that hands them out.
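
One way to read "interleaved chunks" is that each writer starts at its own offset and steps by the number of writers, so no two writers can ever produce the same ID. The sketch below is only an illustration of that idea: `interleaved_ids`, `node_index`, and `node_count` are hypothetical names, and a real deployment would more likely configure this in the database itself than in application code.

```python
from itertools import count, islice


def interleaved_ids(node_index: int, node_count: int, start: int = 1):
    """Return an endless ID generator for one writer (hypothetical helper).

    Writer k of n yields start + k, start + k + n, start + k + 2n, ...
    so the sequences of different writers never overlap.
    """
    if not 0 <= node_index < node_count:
        raise ValueError("node_index must be in [0, node_count)")
    return count(start + node_index, node_count)


# Two active writers: one gets the odd IDs, the other the even IDs.
node_a = list(islice(interleaved_ids(0, 2), 5))
node_b = list(islice(interleaved_ids(1, 2), 5))
```

The trade-off is that the writer count is baked into the stride, so adding a writer later means re-planning the offsets. A coordinator that hands out ranges avoids that at the cost of an extra dependency.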

Business data exposure can be avoided (if it’s actually a problem, and not just a theoretical one) in a variety of ways; two of the most common are:

* Don’t use the id in the slug.

* Have an iid column that’s random and exposed, while keeping the integer as the PK.
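
A minimal sketch of the second option, using SQLite from Python purely for illustration. The `invoice` table, its `amount_cents` column, and the helper functions are assumptions made up for this example; only the idea of a random exposed `iid` next to an integer PK comes from the comment above.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE invoice (
        id           INTEGER PRIMARY KEY,   -- internal key, never shown to users
        iid          TEXT NOT NULL UNIQUE,  -- random identifier, safe to expose
        amount_cents INTEGER NOT NULL
    )
    """
)


def create_invoice(amount_cents: int) -> str:
    """Insert a row and return only the random, exposed identifier."""
    iid = secrets.token_urlsafe(16)  # 16 random bytes, URL-safe text
    conn.execute(
        "INSERT INTO invoice (iid, amount_cents) VALUES (?, ?)",
        (iid, amount_cents),
    )
    return iid


def find_invoice(iid: str):
    """Look a row up by its exposed identifier; joins still use the integer id."""
    return conn.execute(
        "SELECT id, amount_cents FROM invoice WHERE iid = ?", (iid,)
    ).fetchone()


first = create_invoice(1200)
second = create_invoice(3400)
```

The integer `id` stays the join key, so foreign keys and indexes keep the compact type, while URLs and API responses only ever carry `iid`.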

If you need unique IDs across tables, then I question your use of an RDBMS, because you aren’t really making use of the relational aspect.


I could not have said it better myself. I would also add that I keep the slug as an entirely separate column (a "user-visible ID") that users can change. I have had too many systems where "the invoice ID is auto-generated" and then a customer comes back and says "the invoice ID has to be this or the auditors will scream!" Don't expose the internals of your database to your users and you won't have a bad time.


A hundred inserts per second is going to hit 2^32 within a year and a half. I've seen that volume repeatedly. A colleague has seen this limit blow up prod. Do you really want to spend time on a project you're sure can never succeed to this extent?
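
As a rough check of that arithmetic, assuming a constant insert rate and that every insert consumes exactly one ID (`days_until_exhausted` is just an illustrative helper):

```python
SECONDS_PER_DAY = 86_400


def days_until_exhausted(max_ids: int, inserts_per_second: float) -> float:
    """Days until max_ids IDs are used up at a steady insert rate."""
    return max_ids / inserts_per_second / SECONDS_PER_DAY


# Unsigned 32-bit counter: roughly 497 days, a bit over 1.3 years.
unsigned_days = days_until_exhausted(2**32, 100)

# Signed 32-bit counter: half the range, roughly 249 days.
signed_days = days_until_exhausted(2**31, 100)
```

With a signed 32-bit column the limit arrives in about eight months at the same rate.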


Yep, seen it many times, and I can't tell you how many thousands of times I have seen tables with UUIDv4 primary keys for "future safety" that have 12 rows.

Converting int to bigint is not a big project. I have done it more times than I can count, just like any other database evolution. I have also had customers say "oh, we'll just drop and recreate the table every year because it's just some trash data," or "oh wait, we didn't mean to create 100 rows every second every day, that's a bug and it's costing us a lot of money for something we don't care about."

There's no one size fits all in databases, but most people who don't know better don't need to design a database for scale... because their choices won't work in the long term anyway.


That table is a self-limiting problem. It's true that failure is likely, but optimizing for that outcome never adds value, and it would probably make me quit.



