Hacker News new | past | comments | ask | show | jobs | submit login
Pev: Postgres ‘Explain’ Visualizer (2016) (tatiyants.com)
612 points by insulanian on July 23, 2017 | hide | past | favorite | 48 comments



I needed a version that ran on the command line so I made one here: https://github.com/simon-engledew/gocmdpev


That’s fantastic and no JavaScript! Thank you so much for your time / effort on this!


Thats a fantastic tool - there is scope for running it for queries on CI. A bit like a code coverage tool but for db query performance.


That's awesome! Does something like this exist for MySql?


IIRC, it's already present in mysql workbench.

https://dev.mysql.com/doc/workbench/en/images/wb-new-visual-...


I mean something that runs on the command line. Preferable a small open source script like the one hotdogknight wrote.


creator of pev here, thanks for all the kind words!


I am trying to look for the logic to layout the plan on a 2D plane so that the boxes don't collide with each other but so far no luck. Can you give me a pointer please?

The sample JSON of the plan doesn't indicate how big the boxes would be when presented on the UI. And also how to draw the connections between them.


This is beautiful.

Can I ask, do you use a library to handle laying out the visual network, or is that using custom JS?


I don't use JS for layout, it's pretty much all custom CSS.


Still being developed/maintained? Last push in Oct...


Does this store the plans? I like these things, but I'm always a little leery that this will expose my database schema in Google search results.


I’ve used this before; data is only stored locally. Alternatively, you could run your own copy of the service using the NPM package.


Thanks!


Here's a version that runs without a server, entirely on IPFS, for your peace of mind, and also because lately I've been IPFSing everything:

https://www.eternum.io/ipfs/QmQ1Qh9moFgV4MikubnXzAmV3rD48JPG...


This is really great and I think I might start using this. I would also love a standalone version of this that runs outside the browser. Something that can maybe connect directly to my DB.


I use this tool often, and it's great. It's a lot easier to wrap your head around plans when the way it displays them.

The one thing I wish it had is either the ability to not save plans automatically, or at least a button to clear the history. As it is, I just pull up a console from time to time and do localStorage.clear()


Looks good, but why not dump is as a flame graph?


I'm not sure why you're being downvoted without comment: that sucks. I think the fact this project can show per-node costs of the plan is interesting---and certainly a killer feature---but I think it's only part of the story. I think the structure is just as important.

Unfortunately, flame graphs don't show (much) structure.

Now, as a second note, I do a lot of performance work (systems programming in stacks starting at hand-assembled machine code, working all the way up to scripting languages). Personally, I find flame graphs not as useful as a traditional (inverted-)back-trace. I think the structure encoded in a traditional (inverted-)back-trace tells you a lot about poorly designed algorithms and bad systems interactions in a way that flame graphs (mostly) erase.


I'm not sure how you mean flame graphs don't show much structure. They appear to show a lot of structure so I'm surprised by the criticism. I haven't used them as much as I would have hoped; but found them useful. Otherwise I normally see @brandangregg banging on about how good they are so could you elaborate more on your criticisms?

Thanks


Flame graphs are based on high frequency linear sampling of a (at least conceptually) single thread of execution, where each sample records the stack.

SQL query execution does not follow this structure. Query plans are executed in whatever order or degree of parallelism is appropriate. And linearising an SQL query plan is actually wrong. If you've ever watched SQL Server's Live Query Statistics play out, you'll know that the engine does whatever work it can when possible. It would be incorrect to think of the query execution as a stack the CPU can get stuck deep down in, and a flame graph would create an incorrect model.

As an exercise, where would you put self-time in a flame graph representation of an execution plan for a filtering node, or a sorting node? The engine doesn't report that, because filters happen inline (at least in batches) and sorts are optimised to happen progressively and can be scheduled in between IO handlers. Ask an expert about this, I'm not your guy.

Also, the data you want to see isn't represented by one dimension on the flame graph (how wide is this function call?), it is represented by many more: cost, CPU time, number of rows, IO operations, and estimations of all the above. The execution plan needs to be roomier, and include all this information, not merely wall clock time.


I should add, in my experience optimising queries, a time breakdown is generally the first step of many. Flamegraphs are great at telling you where in 1,000 invocations your time is spent, but your execution plan only has ~20 nodes max, and the percentages do that job well enough. At that point, you want to know why it's slow, and a flamegraph representation of those percentages won't help you with that.


I'm not batting for either side, but brandangregg did sort of invent flame graphs, so I'm not surprised that he finds them very good. [No offense to brendangregg, I've met him and he doesn't strike me as an egotistical person, but everyone does love their pet thing.]


Do you have an example of "traditional (inverted-)back-trace"?


I think thechao means the stuff you get from perf where it looks a bit like the output of pstack. But I'd like a clarification too.


The tooling I use records backtraces at some sampling frequency. The obvious sorting is from the root call site (_start), down. However, while that sorting shows the worst offending single backtrace, it can be a bit impractical once the top offenders (in terms of perf) have been fixed. Instead, I like to see an inverted tree: reverse every backtrace then create a tree. This method tends to show systemic issues.


I've looked into creating a flame graph visualization for PostgreSQL EXPLAIN output before, but CTEs that are scanned multiple times complicate things a bit. That's because the first scan on a CTE node will cause the CTE query to be executed, while further scans will simply replay the cached result-set from the first execution.

I'm sure these problems can be overcome, but it's a bit more complicated than I had imagined when I tried to do it myself. I also suspected nested-loop-joins on CTEs could further complicate the flame graph construction.


Looks awesome! How about adding a direction to your graph for people just starting with SQL.


Cool project. Not a DBA but was interested in playing around with this. Be great to maybe add some example plans here: http://tatiyants.com/pev/#/plans

So if you just want to checkout the interface you can click to load up an example or two.


It already has that feature!

Hit "New Plan", and then in the top right there's a link that says "Create Sample Plan"


Ahh, nice! Missed that. Maybe make it more prominent ;)


I can completely see myself using this on my Postgres projects, but something like this would be most useful for me at work.

How feasible would it be to port this over to MySQL / MariaDB? I know EXPLAIN output on MySQL is much simpler than what you get out of Postgres so my gut feeling would be that it wouldn't be possible.


Not exactly the same, but if you run MySQL Workbench as GUI there's a visual explain feature built around MySQL's JSON explain: https://dev.mysql.com/doc/workbench/en/wb-performance-explai...


I would imagine that as long as you could get a JSON representation of the plan, Pev could consume it.

Edit: looks like MySQL can output it as JSON: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html


Something I've wanted from an explain viewer for a long time is simply using the "start time" "end time" information on the nodes to put things in a basic timeline. Most visualisers seem determined to keep the layout as a pimped up version of the tree given to them.


Why, in the example, does the constituent `customerid` join take longer than the forming `orderid` one?


You'd probably have to expand each step to see the specifics, but a simple possibility is that many customers don't have orders, and hence the first join filters out a bunch of them.


Very cool! Now I want to figure out the Postgres EXPLAIN JSON format and start parsing other DBs to fit, just so I can use this tool on them.


This is great, should really be a part of pgadmin4.


pgadmin4's Query Tool [0] has a similar analyzer.

[0] https://www.pgadmin.org/docs/pgadmin4/1.x/query_tool.html


Very nice! I wonder if it would be possible to embed that into pgAdmin? Possibly with Electron?


Absolutely fantastic! Thank You!


Thank you, this is very useful!



Thanks for the shout-out - I'm the author of the base library you listed that can be used to work with the Postgres parser (libpg_query).

That said, the tool being discussed here (pev) is used to interpret EXPLAIN plans, which libpg_query doesn't give you, as its only concerned with the parsing stage, not the planning stage.

In order to plan a query you'll need more information from the actual database, including table statistics, config settings and the size of the underlying table :)


The output of explain is not "parse this SQL statement and show me what it means" but "given this SQL statement, how will it be executed on this database (which is affected by the table's stored statistics and takes into account any indexes)". I am thereby not sure how these query parsing libraries are relevant.


Thanks for taking the time to comment (?).

Never-before-on-the-HN-front-page tools for working with PostgreSQL queries seemed relevant to me; this info may help a few others.


Rad




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: