Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

That was my initial thought as well. I'm hoping the original author reads this and can explain, because surely there must be a reason...?


I think because ID is random and not monotonically increasing.

From article one of them is: 'tpomh9yu1tffmdp6dopobwuc9h'

But the query is trying to page through the most recent posts first.

So a newer post doesn't necessarily have a higher ID or lower ID.

The ID is just there to have a stable tie break between the same timestamps.


One branch of the OR is comparing IDs as though they're monotonically increasing, so either they are or the query is broken.


I think it's a reasonable implementation. It is only used to separate rows with the same created timestamp. This query is run some time after the insert has happened, so it is unlikely that there have been more rows with the same timestamp at query time.

If this assumption is correct, then first sorting by the timestamp and then sort the id alphabetically will ensure that the pagination is deterministic.

I guess we could consider some edge cases where we have lots of rows with the same timestamp that is inserted after the ingestion query is run due to latency, but it might be acceptable for this use case.


In this case they should switch to uuidv7 or similar to have this guarantee


Switching IDs is going to be a breaking change across their whole product and customer base.

Thats a way bigger undertaking and decision then just optimizing a single query.

Like killing an ant with a Tsar Bomb.


One alternative could be to add a separate column that can be used to keep track of insert order. But they would need to consider the costs before they do this, as it could impact the insert performance.

The true "modern" cool kids solution would of course be to create a service that listens to WAL, inserts it into a Kafka cluster that is connected to a pipe for inserting into Elastic search. Much more fun and resumé friendly than optimizing a query. I bet the author of this blog would get a much larger audience.

Bonus points if the ingestion service was written in Rust and run on a serverless platform


Actually it's not a breaking change.

You can switch the uuid generation, wait a little bit and add an.'archive' switch which will use the old and slower query when the date is old.

Should definitely be helpful resource wise for a lot of people


Yes and sooner this gets started the better.

Otherwise it will cost more and more.

Especially for a chat app!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: