I'm trying to find the logic that lays out the plan on a 2D plane so that the boxes don't collide with each other, but so far no luck. Can you give me a pointer please?
The sample JSON of the plan doesn't indicate how big the boxes would be when presented on the UI, nor how to draw the connections between them.
This is really great and I think I might start using this. I would also love a standalone version of this that runs outside the browser. Something that can maybe connect directly to my DB.
I use this tool often, and it's great. It's a lot easier to wrap your head around plans with the way it displays them.
The one thing I wish it had is either the ability to not save plans automatically, or at least a button to clear the history. As it is, I just pull up a console from time to time and do localStorage.clear()
I'm not sure why you're being downvoted without comment: that sucks. I think the fact this project can show per-node costs of the plan is interesting (and certainly a killer feature), but I think it's only part of the story. I think the structure is just as important.
Unfortunately, flame graphs don't show (much) structure.
Now, as a second note, I do a lot of performance work (systems programming in stacks starting at hand-assembled machine code, working all the way up to scripting languages). Personally, I find flame graphs not as useful as a traditional (inverted-)back-trace. I think the structure encoded in a traditional (inverted-)back-trace tells you a lot about poorly designed algorithms and bad systems interactions in a way that flame graphs (mostly) erase.
I'm not sure what you mean by flame graphs not showing much structure. They appear to show a lot of structure, so I'm surprised by the criticism. I haven't used them as much as I would have hoped, but I've found them useful. Otherwise I normally see @brendangregg banging on about how good they are, so could you elaborate on your criticisms?
Flame graphs are based on high frequency linear sampling of a (at least conceptually) single thread of execution, where each sample records the stack.
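To make that concrete: the raw data behind a flame graph is just a multiset of sampled stacks, which most tooling collapses into a "folded" one-line-per-stack format before rendering. A minimal sketch, using hypothetical stack samples:

```python
from collections import Counter

def fold_stacks(samples):
    """Collapse raw stack samples into the 'folded' format flame
    graph tools typically consume: one line per unique stack
    (frames root-first, joined by ';'), followed by a sample count."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {n}" for stack, n in counts.most_common()]

# Hypothetical samples from a single-threaded profile:
samples = [
    ["main", "parse", "lex"],
    ["main", "parse", "lex"],
    ["main", "exec", "sort"],
]
for line in fold_stacks(samples):
    print(line)
# → main;parse;lex 2
#   main;exec;sort 1
```

The point is that every sample is assumed to be one linear stack on one conceptual thread, which is exactly the assumption that breaks down for query plans.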
SQL query execution does not follow this structure. Query plans are executed in whatever order or degree of parallelism is appropriate. And linearising an SQL query plan is actually wrong. If you've ever watched SQL Server's Live Query Statistics play out, you'll know that the engine does whatever work it can when possible. It would be incorrect to think of the query execution as a stack the CPU can get stuck deep down in, and a flame graph would create an incorrect model.
As an exercise, where would you put self-time in a flame graph representation of an execution plan for a filtering node, or a sorting node? The engine doesn't report that, because filters happen inline (at least in batches) and sorts are optimised to happen progressively and can be scheduled in between IO handlers. Ask an expert about this, I'm not your guy.
Also, the data you want to see isn't represented by one dimension on the flame graph (how wide is this function call?); it is represented by many more: cost, CPU time, number of rows, IO operations, and estimates of all the above. The execution plan needs to be roomier and include all this information, not merely wall-clock time.
I should add, in my experience optimising queries, a time breakdown is generally the first step of many. Flamegraphs are great at telling you where in 1,000 invocations your time is spent, but your execution plan only has ~20 nodes max, and the percentages do that job well enough. At that point, you want to know why it's slow, and a flamegraph representation of those percentages won't help you with that.
I'm not batting for either side, but brendangregg did sort of invent flame graphs, so I'm not surprised that he finds them very good. [No offense to brendangregg, I've met him and he doesn't strike me as an egotistical person, but everyone does love their pet thing.]
The tooling I use records backtraces at some sampling frequency. The obvious sorting is from the root call site (_start), down. However, while that sorting shows the worst offending single backtrace, it can be a bit impractical once the top offenders (in terms of perf) have been fixed. Instead, I like to see an inverted tree: reverse every backtrace then create a tree. This method tends to show systemic issues.
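The inversion described above can be sketched in a few lines: reverse each backtrace so the leaf (where the time was actually spent) becomes the root, then count samples at every prefix. The backtraces here are hypothetical:

```python
from collections import Counter

def inverted_tree(backtraces):
    """Build an inverted call tree: reverse each backtrace so the
    leaf becomes the root, then count samples at every prefix.
    High-count nodes expose hot leaves that are shared by many
    different call paths -- the systemic issues mentioned above."""
    counts = Counter()
    for trace in backtraces:
        rev = list(reversed(trace))
        for depth in range(1, len(rev) + 1):
            counts[tuple(rev[:depth])] += 1
    return counts

# Two different call paths both ending in memcpy:
traces = [
    ["_start", "main", "load", "memcpy"],
    ["_start", "main", "save", "memcpy"],
]
tree = inverted_tree(traces)
print(tree[("memcpy",)])  # → 2: memcpy is hot across both paths
```

In a root-down tree these two paths would be split apart under `load` and `save`; inverted, `memcpy` surfaces once with the combined count.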
I've looked into creating a flame graph visualization for PostgreSQL EXPLAIN output before, but CTEs that are scanned multiple times complicate things a bit. That's because the first scan on a CTE node will cause the CTE query to be executed, while further scans will simply replay the cached result-set from the first execution.
I'm sure these problems can be overcome, but it's a bit more complicated than I had imagined when I tried to do it myself. I also suspected nested-loop-joins on CTEs could further complicate the flame graph construction.
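One plausible workaround, sketched here purely as an assumption about how the bookkeeping could go (the plan shape and key names mimic Postgres's JSON EXPLAIN output, but this is not pev's actual code): attribute the CTE subplan's cost to the first scan only, and treat later scans as near-free replays of the cached result-set.

```python
def flatten_with_cte_dedup(node, seen=None, depth=0, out=None):
    """Walk an EXPLAIN-style plan tree (dicts with 'Node Type',
    'Total Cost', optional 'CTE Name' and 'Plans'). The first
    CTE Scan of each CTE keeps its cost; repeated scans of the
    same CTE are emitted with cost 0, since they only replay
    the cached result-set from the first execution."""
    if seen is None:
        seen, out = set(), []
    cost = node.get("Total Cost", 0.0)
    if node.get("Node Type") == "CTE Scan":
        cte = node.get("CTE Name")
        if cte in seen:
            cost = 0.0          # replay of cached rows
        else:
            seen.add(cte)
    out.append((depth, node["Node Type"], cost))
    for child in node.get("Plans", []):
        flatten_with_cte_dedup(child, seen, depth + 1, out)
    return out

plan = {"Node Type": "Append", "Total Cost": 30.0, "Plans": [
    {"Node Type": "CTE Scan", "CTE Name": "t", "Total Cost": 10.0},
    {"Node Type": "CTE Scan", "CTE Name": "t", "Total Cost": 10.0},
]}
rows = flatten_with_cte_dedup(plan)
```

Nested-loop joins are where this gets murky, as noted: a CTE scanned once per outer row is neither a single execution nor a free replay.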
Cool project. Not a DBA but was interested in playing around with this. Be great to maybe add some example plans here:
http://tatiyants.com/pev/#/plans
So if you just want to check out the interface, you can click to load up an example or two.
I can completely see myself using this on my Postgres projects, but something like this would be most useful for me at work.
How feasible would it be to port this over to MySQL / MariaDB? I know EXPLAIN output on MySQL is much simpler than what you get out of Postgres, so my gut feeling is that it wouldn't be possible.
Something I've wanted from an explain viewer for a long time is simply using the "start time" "end time" information on the nodes to put things in a basic timeline. Most visualisers seem determined to keep the layout as a pimped up version of the tree given to them.
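A crude version of that timeline is easy to sketch if you treat each node as a (label, start, end) triple; the node data below is hypothetical, and mapping EXPLAIN ANALYZE's per-node timing figures onto absolute start/end times is left as an assumption:

```python
def ascii_timeline(nodes, width=40):
    """Render plan nodes as bars on a shared time axis rather
    than as a tree. Each node is a (label, start_ms, end_ms)
    tuple; bars are scaled so the latest end fills the width."""
    total = max(end for _, _, end in nodes)
    lines = []
    for label, start, end in nodes:
        a = int(start / total * width)
        b = max(a + 1, int(end / total * width))  # at least 1 char wide
        lines.append(f"{label:<12}" + " " * a + "#" * (b - a))
    return "\n".join(lines)

nodes = [
    ("Seq Scan", 0.0, 12.0),
    ("Sort", 12.0, 15.0),
    ("Limit", 15.0, 15.2),
]
print(ascii_timeline(nodes))
```

Even this crude rendering makes it obvious at a glance which nodes overlap and which dominate the wall clock, which the tree layout hides.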
You'd probably have to expand each step to see the specifics, but a simple possibility is that many customers don't have orders, and hence the first join filters out a bunch of them.
Thanks for the shout-out - I'm the author of the base library you listed that can be used to work with the Postgres parser (libpg_query).
That said, the tool being discussed here (pev) is used to interpret EXPLAIN plans, which libpg_query doesn't give you, as it's only concerned with the parsing stage, not the planning stage.
In order to plan a query you'll need more information from the actual database, including table statistics, config settings and the size of the underlying table :)
The output of EXPLAIN is not "parse this SQL statement and show me what it means" but "given this SQL statement, how will it be executed on this database (which is affected by the table's stored statistics and takes into account any indexes)". I am therefore not sure how these query parsing libraries are relevant.