Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Be careful of performance when using CTEs in Postgres: unlike in other DBMSs they are optimisation fences with regard to predicate pushdown so for some queries will result in extra scans for every level of call needed.

Doesn't affect all queries of course, and where is does the difference may not be significant compared to what else is going on (i.e. querying a small tree/graph structure to pull out some large/complex data), but it is something to watch out for when working with data of any appreciable size.



It's possible that this note at the bottom of the article was not there when you made your comment:

As a note CTEs at this time are an optimization fence in PostgreSQL, though there are hopes of that changing in the future. Common Table Expressions are an incredibly useful tool for reporting. At times the readability of CTEs outweighs the performance impact, but consider the trade-offs when using them


It is also possible that I skimmed over that part!


For the amount of data it processes it can be slow. I used it to render a category taxonomy of just maybe 100 nodes and it was a perceptible duration. The representation was so direct and clear that I kept it and just cached it server-side I validating went any category was changed. Also had to ensure there were no cycles in the data when saving changes.


For some kinds of data, such as taxonomies, there is a datatype called an ltree available in Postgres that can be indexed. Not as general as a CTE but useful for avoiding the optimization barrier in specific cases.


It's always to learn about new things in areas you already thought you'd covered. I'll definitely look this up.

Another common issue I've encountered is with representing reorderable items. I've usually just amortized a partial renumbering that works well enough but always wondered if there's a better way. Would storing balanced trees and using a recursive CTE work as well and be a bit simpler?


Re-orderable items are best represented using a rational.

There is a great blog post on the different approaches and trade-offs of user-orderable items[0].

[0] https://begriffs.com/posts/2018-03-20-user-defined-order.htm...


You can also mitigate this by judiciously ordering your CTEs, when you have more than one, having earlier ones do more of the winnowing work, and being used as inputs to later ones.

An often unrecognized consequence of CTEs being optimization fences is that they're much easier to farm out to background workers when parallel query execution is a thing.


What pattern ought one look for in the explained plan to identify this issue?

Is this something that can likely be improved, technically speaking?


Yes - most other DBMS don't do this. If you have a filter on a derived table that uses a CTE, Postgres will not be able to push it down to the CTE. If the CTE is expensive to calculate (a lot of rows, or joins, or complicated functions), but you don't need many rows from it in the final results, performance is likely to be very poor compared to rewriting as normal subqueries. You should be able to see this on the plan if you see an expensive CTE subquery with limited row results being used.

This is unfortunate as it can reduce readability significantly. There's resistance to fixing this as it is considered a breaking change apparently, which it may be for CTEs used for data manipulation (INSERT / UPDATE / DELETE can operate using CTEs), but it shouldn't be for plain SELECTs. however no obvious plans for this to change.

https://blog.2ndquadrant.com/postgresql-ctes-are-optimizatio... has more.


Is this something that would be evident upon running EXPLAIN over a query?


Excess index (or full table) scans on the recursively referenced tables, or "wrong" index choices otherwise on those objects, where your filtering/joining clauses would otherwise allow for more efficient options with the indexes that are available.

I'm not an expert on postgres (I spend most of my life in MS SQL Server's domain) so I'll not try be more detailed than that for fear of accidentally spreading/creating misinformation. Search for "postgress CTE optimisation fence explain" and you'll hopefully find some good examples as it is a commonly discussed topic once you know the right keywords to search for.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: