I'm trying to find the logic that lays out the plan on a 2D plane so that the boxes don't collide with each other, but so far no luck. Can you give me a pointer please?
The sample JSON of the plan doesn't indicate how big the boxes would be when presented on the UI, nor how to draw the connections between them.
This is really great and I think I might start using this. I would also love a standalone version of this that runs outside the browser. Something that can maybe connect directly to my DB.
I use this tool often, and it's great. It's a lot easier to wrap your head around plans with the way it displays them.
The one thing I wish it had is either the ability to not save plans automatically, or at least a button to clear the history. As it is, I just pull up a console from time to time and do localStorage.clear()
I'm not sure why you're being downvoted without comment: that sucks. I think the fact this project can show per-node costs of the plan is interesting (and certainly a killer feature), but I think it's only part of the story. I think the structure is just as important.
Unfortunately, flame graphs don't show (much) structure.
Now, as a second note, I do a lot of performance work (systems programming in stacks starting at hand-assembled machine code, working all the way up to scripting languages). Personally, I find flame graphs not as useful as a traditional (inverted-)back-trace. I think the structure encoded in a traditional (inverted-)back-trace tells you a lot about poorly designed algorithms and bad systems interactions in a way that flame graphs (mostly) erase.
I'm not sure what you mean by flame graphs not showing much structure. They appear to show a lot of structure, so I'm surprised by the criticism. I haven't used them as much as I would have hoped, but I've found them useful. Otherwise I normally see @brendangregg banging on about how good they are, so could you elaborate on your criticisms?
Flame graphs are based on high frequency linear sampling of a (at least conceptually) single thread of execution, where each sample records the stack.
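To make that concrete: the raw data behind a flame graph is just a multiset of sampled stacks, which most tooling collapses into a "folded" one-line-per-stack format before rendering. A minimal sketch, using hypothetical stack samples:

```python
from collections import Counter

def fold_stacks(samples):
    """Collapse raw stack samples into the 'folded' format flame
    graph tools typically consume: one line per unique stack
    (frames root-first, joined by ';'), followed by a sample count."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {n}" for stack, n in counts.most_common()]

# Hypothetical samples from a single-threaded profile:
samples = [
    ["main", "parse", "lex"],
    ["main", "parse", "lex"],
    ["main", "exec", "sort"],
]
for line in fold_stacks(samples):
    print(line)
# → main;parse;lex 2
#   main;exec;sort 1
```

The point is that every sample is assumed to be one linear stack on one conceptual thread, which is exactly the assumption that breaks down for query plans.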
SQL query execution does not follow this structure. Query plans are executed in whatever order or degree of parallelism is appropriate. And linearising an SQL query plan is actually wrong. If you've ever watched SQL Server's Live Query Statistics play out, you'll know that the engine does whatever work it can when possible. It would be incorrect to think of the query execution as a stack the CPU can get stuck deep down in, and a flame graph would create an incorrect model.
As an exercise, where would you put self-time in a flame graph representation of an execution plan for a filtering node, or a sorting node? The engine doesn't report that, because filters happen inline (at least in batches) and sorts are optimised to happen progressively and can be scheduled in between IO handlers. Ask an expert about this, I'm not your guy.
Also, the data you want to see isn't represented by one dimension on the flame graph (how wide is this function call?); it is represented by many more: cost, CPU time, number of rows, IO operations, and estimates of all the above. The execution plan needs to be roomier and include all this information, not merely wall-clock time.
I should add, in my experience optimising queries, a time breakdown is generally the first step of many. Flamegraphs are great at telling you where in 1,000 invocations your time is spent, but your execution plan only has ~20 nodes max, and the percentages do that job well enough. At that point, you want to know why it's slow, and a flamegraph representation of those percentages won't help you with that.
I'm not batting for either side, but brendangregg did sort of invent flame graphs, so I'm not surprised that he finds them very good. [No offense to brendangregg, I've met him and he doesn't strike me as an egotistical person, but everyone does love their pet thing.]
The tooling I use records backtraces at some sampling frequency. The obvious sorting is from the root call site (_start), down. However, while that sorting shows the worst offending single backtrace, it can be a bit impractical once the top offenders (in terms of perf) have been fixed. Instead, I like to see an inverted tree: reverse every backtrace then create a tree. This method tends to show systemic issues.
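The inversion described above can be sketched in a few lines: reverse each backtrace so the leaf (where the time was actually spent) becomes the root, then count samples at every prefix. The backtraces here are hypothetical:

```python
from collections import Counter

def inverted_tree(backtraces):
    """Build an inverted call tree: reverse each backtrace so the
    leaf becomes the root, then count samples at every prefix.
    High-count nodes expose hot leaves that are shared by many
    different call paths -- the systemic issues mentioned above."""
    counts = Counter()
    for trace in backtraces:
        rev = list(reversed(trace))
        for depth in range(1, len(rev) + 1):
            counts[tuple(rev[:depth])] += 1
    return counts

# Two different call paths both ending in memcpy:
traces = [
    ["_start", "main", "load", "memcpy"],
    ["_start", "main", "save", "memcpy"],
]
tree = inverted_tree(traces)
print(tree[("memcpy",)])  # → 2: memcpy is hot across both paths
```

In a root-down tree these two paths would be split apart under `load` and `save`; inverted, `memcpy` surfaces once with the combined count.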
I've looked into creating a flame graph visualization for PostgreSQL EXPLAIN output before, but CTEs that are scanned multiple times complicate things a bit. That's because the first scan on a CTE node will cause the CTE query to be executed, while further scans will simply replay the cached result-set from the first execution.
I'm sure these problems can be overcome, but it's a bit more complicated than I had imagined when I tried to do it myself. I also suspected nested-loop-joins on CTEs could further complicate the flame graph construction.
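One plausible workaround, sketched here purely as an assumption about how the bookkeeping could go (the plan shape and key names mimic Postgres's JSON EXPLAIN output, but this is not pev's actual code): attribute the CTE subplan's cost to the first scan only, and treat later scans as near-free replays of the cached result-set.

```python
def flatten_with_cte_dedup(node, seen=None, depth=0, out=None):
    """Walk an EXPLAIN-style plan tree (dicts with 'Node Type',
    'Total Cost', optional 'CTE Name' and 'Plans'). The first
    CTE Scan of each CTE keeps its cost; repeated scans of the
    same CTE are emitted with cost 0, since they only replay
    the cached result-set from the first execution."""
    if seen is None:
        seen, out = set(), []
    cost = node.get("Total Cost", 0.0)
    if node.get("Node Type") == "CTE Scan":
        cte = node.get("CTE Name")
        if cte in seen:
            cost = 0.0          # replay of cached rows
        else:
            seen.add(cte)
    out.append((depth, node["Node Type"], cost))
    for child in node.get("Plans", []):
        flatten_with_cte_dedup(child, seen, depth + 1, out)
    return out

plan = {"Node Type": "Append", "Total Cost": 30.0, "Plans": [
    {"Node Type": "CTE Scan", "CTE Name": "t", "Total Cost": 10.0},
    {"Node Type": "CTE Scan", "CTE Name": "t", "Total Cost": 10.0},
]}
rows = flatten_with_cte_dedup(plan)
```

Nested-loop joins are where this gets murky, as noted: a CTE scanned once per outer row is neither a single execution nor a free replay.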
Cool project. Not a DBA but was interested in playing around with this. Be great to maybe add some example plans here:
http://tatiyants.com/pev/#/plans
So if you just want to check out the interface, you can click to load up an example or two.
I can completely see myself using this on my Postgres projects, but something like this would be most useful for me at work.
How feasible would it be to port this over to MySQL / MariaDB? I know EXPLAIN output on MySQL is much simpler than what you get out of Postgres, so my gut feeling is that it wouldn't be possible.
Something I've wanted from an explain viewer for a long time is simply using the "start time" "end time" information on the nodes to put things in a basic timeline. Most visualisers seem determined to keep the layout as a pimped up version of the tree given to them.
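A crude version of that timeline is easy to sketch if you treat each node as a (label, start, end) triple; the node data below is hypothetical, and mapping EXPLAIN ANALYZE's per-node timing figures onto absolute start/end times is left as an assumption:

```python
def ascii_timeline(nodes, width=40):
    """Render plan nodes as bars on a shared time axis rather
    than as a tree. Each node is a (label, start_ms, end_ms)
    tuple; bars are scaled so the latest end fills the width."""
    total = max(end for _, _, end in nodes)
    lines = []
    for label, start, end in nodes:
        a = int(start / total * width)
        b = max(a + 1, int(end / total * width))  # at least 1 char wide
        lines.append(f"{label:<12}" + " " * a + "#" * (b - a))
    return "\n".join(lines)

nodes = [
    ("Seq Scan", 0.0, 12.0),
    ("Sort", 12.0, 15.0),
    ("Limit", 15.0, 15.2),
]
print(ascii_timeline(nodes))
```

Even this crude rendering makes it obvious at a glance which nodes overlap and which dominate the wall clock, which the tree layout hides.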
You'd probably have to expand each step to see the specifics, but a simple possibility is that many customers don't have orders, and hence the first join filters out a bunch of them.
Thanks for the shout-out - I'm the author of the base library you listed that can be used to work with the Postgres parser (libpg_query).
That said, the tool being discussed here (pev) is used to interpret EXPLAIN plans, which libpg_query doesn't give you, as it's only concerned with the parsing stage, not the planning stage.
In order to plan a query you'll need more information from the actual database, including table statistics, config settings and the size of the underlying table :)
The output of EXPLAIN is not "parse this SQL statement and show me what it means" but "given this SQL statement, how will it be executed on this database (which is affected by the table's stored statistics and takes into account any indexes)". I am therefore not sure how these query parsing libraries are relevant.