JIT-Compiling SQL Queries in PostgreSQL Using LLVM

murkt · on Feb 22, 2021

It's enabled by default since Postgres 12, and in some cases can be harmful, I mentioned it in my blog post about JSONB indexes [1].

I found it hard to make Postgres reuse JIT-compiled plans, and quite often JIT compilation can kick in when it thinks that there will be many results and spend more time in compilation than executing a non-compiled query. Postgres needs a setting to increase JIT compilation threshold based on a query plan complexity.

[1] https://vsevolod.net/postgresql-jsonb-index/#jit-compiling-b...

jeltz · on Feb 22, 2021

I am a bit confused by your post. PostgreSQL does not yet support reusing compiled plans at all (it is tricky to implement and there is only one guy working on JIT) but PostgreSQL already has a threshold based on query complexity.

murkt · on Feb 22, 2021

> PostgreSQL does not yet support reusing compiled plans at all

Well, that explains why I couldn't make it work :)

> PostgreSQL already has a threshold based on query complexity.

PostgreSQL has settings that threshold based on a query cost - jit_above_cost, jit_inline_above_cost, jit_optimize_above_cost.

For example, imagine we query a table with millions of rows, extract some columns, do some transforms. Will the cost be quite big? It will. Will it run for a long time? It will, let's say 2 second. JIT compilation will take, for example, 0.2 seconds and the query itself now will run in 1.4 seconds, so 0.4 second total win.

Imagine we have other query that joins seven tables, access many indexes, etc, returns a modest amount of rows and results in the same cost as the first query. Often it could finish even faster, like in 0.5 seconds or something, but let's say it's the same 2 seconds. However, as there are much more code to compile, JIT compilation now takes 10 seconds, and speeds the query up to the same 1.4 seconds.

So I want a setting for JIT compilation that will capture a query complexity.

baq · on Feb 22, 2021

i had exactly your scenario in tests: very complex query got jitted with full optimizations (expensive!) and then due to the data, etc. 95% of that query wasn't executed at all. performance tanked.

you have to carefully analyze your queries if you want the jit to be a net benefit - and by carefully, i mean look at explain analyze outputs to check for major mismatches of estimated vs actual rows (say 10x).

magicalhippo · on Feb 22, 2021

> query cost

That would be query execution cost, right? And you're after some knobs to tune query preparation/compilation cost?

I don't know how the PostgreSQL JIT thing works, but I imagine it already has an AST-ish structure of the query by the time it calls LLVM. In that case I guess it should be possible to assign some weights to the nodes/elements and get an estimate for how long it'll take to compile.

mulmen · on Feb 22, 2021

How does complexity relate to cost? Is it just a matter of weight?

murkt · on Feb 22, 2021

What I called complexity is an amount of different nodes (not sure this is the proper term) in a query plan. So, the more complexity for a given cost - the more work JIT needs to do.

mulmen · on Feb 22, 2021

But the planner sums each step in the plan so isn’t a more complex plan already more expensive? Or is the addition of another step not adequately considered in that cost?

murkt · on Feb 23, 2021

A more complex plan isn't already more expensive in a way that captures how much work the JIT will need to do. I think I'll need to write a separate blog post about that, with details and examples.

darksaints · on Feb 22, 2021

After reading through your blog post, I think the feature you should be looking to is expression indexes. Build up the expression that you use repeatedly and and is so costly, and directly index that.

murkt · on Feb 22, 2021

I have thought about expression indexes, and for expressions I need (@> on jsonb) they can't be used.

asah · on Feb 22, 2021

Can you wrap that operator call in a function, and index the output of that function?

(sorry if dumb q - but I've used this technique myself, and indeed, my favorite hack in recent years are JavaScript UDFs in plv8... super ugly but very very flexible...)

murkt · on Feb 23, 2021

I'm not sure I can, and I definitely do not want to do that. I'm happy with a separate table and `btree` index on it.

rowanseymour · on Feb 22, 2021

We noticed this recently when we enabled CI testing with PG12 and suddenly unit tests were taking 4x as long to run. Haven't tested it in more production-like settings, but definitely a bit surprised this became the default given its potential for causing performance issues.

derefr · on Feb 22, 2021

I know RDBMSes are supposed to hide the “raw” procedural query plan underneath a declarative “what data do you want” abstraction layer... but what if I know exactly the query plan I want? Why can’t I upload a “prepared query plan” and execute exactly that query plan?

In most RDBMSes, the problem with that would be that there’s no existing formal intermediate representation of a query plan, to use on the wire.

But in PG’s case, that’d be LLVM IR, no?

I haven’t looked too deeply into what Postgres exposes via its extension APIs, but I feel like it wouldn’t be impossible to write a PG extension that would let you call a stored procedure with a bytea containing the LLVM IR for a query plan, and get back a handle to a “prepared statement that’s a little bit more” — where EXECUTEing that statement would just reuse the cached query-plan attached to it, as if PG had previously JITed a plan for that statement.

If I built something like this, would anyone want to use it? It’d have the disadvantage of you having to figure out how to generate LLVM IR on the client side representing your desired query plan; and that query plan potentially getting out-of-sync with the DB. But with stable DB schemas + data sizes (e.g. in data warehousing) it seems like a win.

brokencode · on Feb 22, 2021

Knowing exactly what query plan you want is not as easy as it sounds. That depends on all the indexes and internal data structures at the database’s disposal. These can change at any time, and you’d need to update your query plan to match.

Also, using raw LLVM, it seems like you could do all kinds of things to shoot yourself in the foot, such as breaking ACID guarantees or failing to properly update indexes.

That’s always the trade off with databases. More speed for less safety. Even fancy SQL databases that like throwing around claims about ACID don’t usually enable the full ACID guarantees by default for all transactions due to performance reasons.

If you want low-level control, then maybe a NoSQL database would be better suited to your needs. But otherwise, I think improving the native query planner in SQL would be a better use of resources.

derefr · on Feb 22, 2021

> Also, using raw LLVM, it seems like you could do all kinds of things to shoot yourself in the foot, such as breaking ACID guarantees or failing to properly update indexes.

I was picturing using these only for read-only queries (i.e. the kind you can run against a hot standby.) That could be enforced by the extension, if need be, and I don't think it would make much difference in the extension's usefulness. Nobody needs this level of optimization for OLTP queries, because OLTP queries just ain't complicated enough to confuse the query planner. Only OLAP queries are.

And OLAP queries are usually either "generate this report and return it" or "generate this report into this table." The precompiled query-plan could just be for the "generate this report" part, and recuse itself from the "into this table" part, leaving that to just-in-time planning. (Remember, LLVM IR is — before the LLVM optimizer runs — still abstract enough to have stuff like functions in it. So it's possible to send a "module" of LLVM IR that will "plug into" a larger query plan at a defined attach-point, and then get Whole-Program-Optimized down to straight-line code by the optimizer.)

> If you want low-level control, then maybe a NoSQL database would be better suited to your needs

The thing is, mostly we don't need that low-level control. 99.9% of queries can be planned by the DB just fine. But there's always that one query in the hot path of an app, that the DB just has no idea how to plan correctly. The query that makes you want to just grab the steering wheel away from the DBMS and swerve.

I don't want to have to denormalize my data into an entire secondary representation (either in the same DBMS, or worse, in a separate NoSQL DBMS) just for the sake of that one query. I want to just do the DB's homework for it. Sometimes it drives me to the point of considering forking the DBMS and manually patching in a query-plan literal datastructure for that one case.

IMHO, client-side pre-compiled query plans are in the same class of solutions as a DBA manually going in and partitioning large tables invisibly to the application that relies on them—or, heck, as `CREATE INDEX` existing at all as a command†: sometimes fully-general algorithms for solving an entire class of problem are hard, but coming up with the answer to your specific problem is very simple.

† (Did you know that CREATE INDEX is actually non-standard SQL, precisely because a theoretical sufficiently-advanced RDBMS could come up with the right indices itself, and so manually-specified indices have no theoretical place in the abstract expression of relational algebra that is standard Structured Query Language? But of course, every RDBMS has CREATE INDEX, because there is — as of yet! — no fully-general algorithm for indices smart enough that it can't be out-planned by your average DBA. Well, and also because of the bootstrapping problem, where without indices some queries would take an eternity to resolve, and so the statistics collector would never observe enough queries to know what sort of indices it needs.)

brokencode · on Feb 22, 2021

I get what you’re saying, and it makes sense why you’d want that. I wonder if you posted on Postgres’s forms or asked one of the devs if they’d have some other ideas about improving performance.

Just at a initial take level, LLVM seems too low level for this kind of usage to me, and it doesn’t seem very robust or elegant. From that perspective, it doesn’t seem like something that would get much adoption or support from the Postgres community. I’m no expert in Postgres however, so maybe it would be better than I’m picturing.

Maybe creating a more efficient alternative query language would work, or a system for adding hints to your query about how it should execute. Like if you could specify in low level terms how you think it should work, and the database would feel free to ignore those hints or change any part if necessary because some internal data structure changed.

AaronFriel · on Feb 22, 2021

Two reasons, I think:

* there isn't a general method of starting from a query plan and verifying it aligns to a particular query, which means you're very much on your own if you try to change it to something Postgres has not (or maybe cannot) generate

* there is no "ABI guarantee" on query plans, so you're locked into a particular build

slashdev · on Feb 22, 2021

Some RDBMSes like Oracle let you give hints about how you want things joined, which is usually what you mean when you say you want to specify the query plan.

I think I would like to see that in Postgres, but I don't think it's a common requirement. Usually you can just update the statistics to get the planner on the same page as you. I just like to have options in the toolbox, just in case I need them.

parhamn · on Feb 22, 2021

I don’t imagine the plan could be constructed without a lot of lookups to internal information (cardinality, which data table it’s stored in, etc).

Makes me wonder what such a plan could look like and if it even could be constructed on the client side without the necessary reflection, or if it’d be too complex for a non-simple query.

darksaints · on Feb 22, 2021

A couple of things to point out: the JIT is very dependent on the Postgres cost model, which is still pretty primitive. For example, the cost model is not aware of TOAST sizes, which can drastically impact function execution costs if that function is size dependent (very common with things like geospatial queries). Another problem is the lack of join statistics...you can have perfectly recorded statics for both tables, but the combination of tables or columns used in the join, or possibly the expression used for the join, can cause cost estimate chaos.

I know JIT plan caching is a hard problem, but it would be very helpful. But I actually think JIT is the wrong solution to this problem. Instead of recompiling an optimized plan every time, we should actually be using PGO. We already have a profiling mechanism built in: Explain Analyze. All that is needed is for that profiling mechanism to link up with the optimizer, giving it all the information it needs to do all of the same optimizations that we get with JIT today, but combined with extremely powerful global optimizations that no JIT would ever even attempt. And we already have the perfect use cases too: every view or materialized view is already based on a query that will be used regularly, and is worth profiling and optimizing.

mattashii · on Feb 22, 2021

I would say that Postgres' cost model is quite sophisticated, and not at all primitive. Some weigh-able costs like toast size might indeed be missing, but I would not dare to argue that it is primitive because of the lack of taking toast size into account; indeed I would argue that you missing this is showing how many measured points already are available.

Regarding missing join statistics: What exactly do you mean by that? You can already add multi-column statistics, which would allow postgres to optimize on correllations that cross column boundaries (it can use statistics on [id, value] to estimate the occurance of column value when joined on id). But, you must specify the existence of cross-column corellated statistics manually, because all statistics slow down the planner by a bit, because it has to consider the statistic.

darksaints · on Feb 23, 2021

> I would say that Postgres' cost model is quite sophisticated, and not at all primitive. Some weigh-able costs like toast size might indeed be missing, but I would not dare to argue that it is primitive because of the lack of taking toast size into account; indeed I would argue that you missing this is showing how many measured points already are available.

They aren't measured though, they're configured. And poorly-defaulted or misconfigured costs are a major reason for bad plans. They could be measured though, and that would be some low hanging fruit for substantial improvements to the cost model. Furthermore, they seem to have no effect on filtering predicate evaluation order...even when selectivity of one predicate is highly predictive, and the other predicate is extremely costly to compute.

A bigger problem is that inaccuracies compound when more complex plans unfold. A small inaccuracy in the rowcount estimate from one set of rows, combined with a small inaccuracy from a rowcount estimate for a different set of rows, joined together, can result in cost estimates that are orders of magnitude off from reality.

> You can already add multi-column statistics, which would allow postgres to optimize on correllations that cross column boundaries

Multi-column statistics do help, but only with very simplistic scenarios. You can only collect cross-column statistics within a single table. You can't collect statistics across foreign-keyed tables, commonly joined column/tables, nor are they helpful for anything but pure equality comparisons. Geospatial relationships, range comparisons, inequalities, etc.., they all get no help from multi-column statistics, unless you're going to collect them across table relationships.

asguy · on Feb 22, 2021

This is a good read, but it probably should be marked as coming from 2017.

ksec · on Feb 22, 2021

Yes I remember reading it long time ago. Any update since then?

weliveindetail · on Feb 22, 2021

They moved to LLVM 12 and OrcV2 https://github.com/postgres/postgres/commit/6c57f2ed16e12810...

killingtime74 · on Feb 22, 2021

I googled it https://www.postgresql.org/docs/13/jit.html

jeltz · on Feb 22, 2021

There have been some improvements but a lot of work remains.

megous · on Feb 22, 2021

Seeing that PostgreSQL has pluggable JIT backends, maybe using MIR would be a faster and more lightweight alternative to LLVM.

https://github.com/vnmakarov/mir

MaxBarraclough · on Feb 22, 2021

For good measure here are two other JIT backend libraries:

* libFirm, https://pp.ipd.kit.edu/firm/

* GNU lightning, https://www.gnu.org/software/lightning/

1337_d00dZ · on Feb 22, 2021

Although GNU lighting is a JIT, it is probably not well-suited for this application (it has limited virtual registers, no register allocator, no optimizer). Instead, gccjit would be a better option: https://gcc.gnu.org/onlinedocs/jit/

ltbarcly3 · on Feb 22, 2021

In my experience tuning postgres the JIT has rarely been a net win. It is very reliant on postgres' cost estimate, which becomes a problem. The more complex the query is, the more likely the JIT is to help, but, in general, the worse the cost estimate becomes. This often leads to cases where the cost estimate causes a full JIT'ing of query, and the time to jit and optimize is longer than the entire runtime of the query with the JIT turned off.

I thought maybe this was due to the default operation costs not having been updated for a long time (and being created when there was no such thing as an SSD etc), and so I tried to optimize them. I wrote a script which used a scipy global optimizer (https://docs.scipy.org/doc/scipy/reference/generated/scipy.o...) to run a set of real life queries against a copy of my database, and with a goal of finding a postgresql cost estimate configuration that minimized the total query time.

This optimization took over a day to run, and it would have taken much longer except I wrote a function to diff query plans and cache the runtimes of each query for identical query plans.

Here is what it ended up with:

  cpu_index_tuple_cost = 0.05935355125176216
  cpu_operator_cost = 0.012593074414763324
  cpu_tuple_cost = 0.15786073958355235
  random_page_cost = 0.4529142183444561
  seq_page_cost = 1.163448763641469
  parallel_tuple_cost = 0.18328552089499073
  parallel_setup_cost = 30630.595453975257
  join_collapse_limit = 29.62367337441951
  from_collapse_limit = 18.30069861231032
  geqo_threshold = 20.35752330027858
  geqo_effort = 1.9431285958606317
  min_parallel_table_scan_size = 42.64686041069217
  min_parallel_index_scan_size = 897.8594030984586
  jit_above_cost = 1060139.219983833
  jit_inline_above_cost = 5674002.74709
  jit_optimize_above_cost = 4974481.895365894

Doing this global optimization was made more difficult by the fact that most, but not all of the costs are only meaningful in terms of relative size. For example, setting random_page_cost to 3x seq_page_cost is just a relative cost, if you scale all these costs by some constant you will get the same query plan. However, the introduction of the *_size parameters have the effect of anchoring all the other parameters, so the search space goes from searching an N-dimensional unit sphere to an entire N dimensional space. While the unit sphere is only one dimension less, it is also a much smaller space in terms of boundaries, since each variable can be limited to the range of (0, pi) without imposing artificial absolute numerical guesses (which could easily exclude the optimal configuration from the searched space) to make the search space small enough that a global optimization is tractable.

(Things like join_collapse_limit aren't scale free but they also refer only to the query and not to the data, so they are independent, and besides, they have a very easy to guess range of acceptable values.)

Anyway, that is a long way of saying that the above did have a measurable effect on performance for my use case, but only on average, some queries still ended up being slower than using default settings. Looking into these, they were for the most part cases where the rowcount estimates the planner were generating were wrong by orders of magnitude, and this incorrect rowcount composed with the incorrect cost configuration caused the planner to pick a good query plan by luck.

To sum up, after analyzing the performance of hundreds and hundreds of queries, I have yet to see the JIT make one significantly faster, and I very often see much much worse performance. When I looked into what the JIT was doing, it appears that it is compiling hundreds and hundreds of functions for somewhat straightforward queries, which looked like a bug, but I didn't have time to look into it further. I ended up just cranking up the thresholds on the usage of the JIT to very very large numbers, under the theory that I want the JIT basically turned off, but if there is a query with a cost estimate that is astronomical, the time taken to JIT that query is probably negligible compared to the total runtime so it probably won't hurt.

Meanwhile, if you have a database server with spare CPU (and IO if your database doesn't fit in memory) capacity during peak times, I would look at tuning the parallel query parameters to encourage more parallelization. So far in my testing this can give you very close to a 1/K query time for a query that uses K workers. It is fantastic.

apavlo · on Feb 23, 2021

Andy from Carnegie Mellon here. This is impressive work. My PhD student and I have looked into similar methods for autotuning config knobs for MySQL + Postgres: https://db.cs.cmu.edu/projects/ottertune/

One of the many challenges with DB tuning that we have found is that people don't know whether their database, application, or DBMS version has changed enough to warrant another round of tuning. The best values that your tuning produces today may be different two weeks from now. Also, all the knobs that you are tuning don't require restarting the DBMS. Some of the knobs that make the most significant performance difference require you to restart the DBMS first before they take effect, which is not something people want to do often on their production database. None of the knobs that you target require restarting.

This problem is super interesting and our research has shown that it can really improve DB performance for some applications. We are in the process of spinning out OtterTune as a new startup: https://ottertune.com/

ltbarcly3 · on Feb 24, 2021

I think the biggest improvement/effort thing you could look at for Postgresql is the cost estimates. It really does a poor job at estimating row counts (over/under estimating by multiple orders of magnitude quite often) when there are multiple joins, and the new extended statistics are for the most part useless for real scenarios that I have come across.

xyzzy_plugh · on Feb 22, 2021

Thank you for sharing this analysis, it's very fascinating.

I'm interested in inferring correlations between query plans and tunables through strictly observing queries in flight, as it would be much too costly to perform the sort of experiment you performed.

mhh__ · on Feb 22, 2021

In general tuning JIT's can be a can of worms

murkt · on Feb 22, 2021

Parent comments about tuning Postgres itself, though

rrdharan · on Feb 22, 2021

See also: https://www.citusdata.com/blog/2018/09/11/postgresql-11-just...

CalChris · on Feb 22, 2021

The talk is available:

https://www.youtube.com/watch?v=ZNTeVt8V6PU

runeks · on Feb 22, 2021

This sounds like a cool idea, but I don’t think it’s very usable in practice.

At least not when using SSD/HDD for storage, where — in my experience — CPU time never exceeds 50% of the time spent executing a query. This means that the maximum speedup is 2x, without even accounting for the time spent performing JIT-compilation.

So we have a situation where the maximal theoretical speedup is 2x and the maximal slowdown is unbounded.