
TopN for your Postgres database - samber
https://www.citusdata.com/blog/2018/03/27/topn-for-your-postgres-database/
======
sfg75
We've been using this extension for a while now at Algolia, great to see that
it's now open sourced!

We heavily rely on this to power our analytics API. We use it to precompute
tops for billions of daily events. We can then fetch tops across a specific
time range on the fly, usually on the order of milliseconds. This was a game
changer for us :)
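
To make the precompute-then-merge idea concrete, here is a rough sketch in Python rather than SQL. It is only an illustration under assumptions: the function names and event data are made up, and it uses exact counters where the extension keeps approximate ones. It is not the actual pipeline described above.

```python
from collections import Counter

def daily_top(events, keep=1000):
    """Precompute one day's counts, keeping only the `keep` most frequent items."""
    return Counter(dict(Counter(events).most_common(keep)))

def top_over_range(daily_summaries, n):
    """Merge precomputed per-day summaries and return the top n items."""
    merged = Counter()
    for summary in daily_summaries:
        merged.update(summary)  # Counter.update adds counts together
    return merged.most_common(n)

# Hypothetical data: search queries seen on two different days.
day1 = daily_top(["shoes", "shoes", "hat", "shoes", "hat", "scarf"])
day2 = daily_top(["hat", "hat", "hat", "shoes"])
result = top_over_range([day1, day2], n=2)
print(result)
```

The point is that the expensive pass over raw events happens once per day, and a query over a time range only has to merge a handful of small summaries.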

------
zombieprocesses
Is this an advertisement? Because it reads like one.

PostgreSQL and most mature database systems already have topN/offset/paging
solutions.

Also, what's the point of aggregating JSONB data? If you need to calculate
topN, why not normalize the data properly and index the data? Then top N will
be blazing fast without needing an extension?

If the data set is extremely large then you can maintain an internal "Top N"
table that gets recalculated when data is added/removed. It all depends on the
workload: inserts/updates/deletions may be slightly slower, but reads of
topN will be constant speed.
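
A minimal sketch of that "maintain a counts table on write" idea, using an in-memory Python class as a stand-in for a database table kept current by triggers. The class and method names are hypothetical, chosen only for illustration.

```python
from collections import Counter

class TopNTable:
    """Stand-in for a summary table updated on every insert/delete."""

    def __init__(self):
        self.counts = Counter()

    def insert(self, item):
        # Extra work on the write path: bump the running count.
        self.counts[item] += 1

    def delete(self, item):
        # Decrement, and drop the row entirely once it reaches zero.
        if self.counts[item] <= 1:
            del self.counts[item]
        else:
            self.counts[item] -= 1

    def top(self, n):
        return self.counts.most_common(n)

table = TopNTable()
for item in ["a", "a", "a", "b", "b", "c", "c"]:
    table.insert(item)
table.delete("c")
print(table.top(2))
```

Note that `top()` here still scans every distinct item, so the read is only cheap relative to rescanning the raw data. A real table would need an index on the count column, and the number of distinct items still grows with the data.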

~~~
craigkerstiens
This is an entirely new PostgreSQL extension. Within the post we talk about
how GROUP BY and ORDER BY work fine for smaller datasets, but for larger
datasets the time needed to compute and roll things up is not feasible. TopN
(or TopK) is a very common algorithm for approximate counts of top items when
you have large enough datasets.
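
For readers unfamiliar with approximate top-k, here is a small sketch of one well-known approach, the Space-Saving algorithm, which tracks a fixed number of counters no matter how many distinct items appear. This is an illustration of the general idea only; it is an assumption on my part that it is representative, and it is not necessarily what the extension does internally. The stream below is synthetic.

```python
def space_saving(stream, capacity):
    """Approximate counts using at most `capacity` counters (Space-Saving)."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < capacity:
            counters[item] = 1
        else:
            # Evict the smallest counter; the newcomer inherits its count + 1,
            # so estimates can overcount but never undercount.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters

def top_k(counters, k):
    """Return the k items with the largest estimated counts."""
    return sorted(counters.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Synthetic stream: two frequent items buried in one-off noise.
stream = []
for i in range(100):
    stream.append("a")
    if i % 2 == 0:
        stream.append("b")
    stream.append(f"noise{i}")

counters = space_saving(stream, capacity=10)
print(top_k(counters, 2))
```

Memory stays bounded at `capacity` counters, and each estimate is off by at most `len(stream) / capacity`, which is why frequent items survive while the long tail of rare ones gets evicted.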

