A Quick Introduction to Graphviz (worthe-it.co.za)
249 points by jworthe on Sept 24, 2017 | 44 comments

Thank you all for the comments and suggestions. Most of the criticisms are quite fair: graph language improvements, modernization including default styles that aren't from 1980, bringing JavaScript interfaces like viz.js into the code base, coding multithreaded solvers, etc. Emden Gansner and I left AT&T a few years ago (in my case, it was AT&T's idea, not mine) and we have not put a lot of work into the code lately. Emden's done a lot to move the website to gitlab.com, and that is about ready to go (instead of hosting the site in a VM on a 1.1GHz Pentium or whatever we left in the rednet closet at AT&T). We have a small backlog of algorithmic improvements. For instance, Emden implemented Ulrik Brandes' "untangling hairballs" algorithm but it's not in the code base. I realize this is a standard open source refrain, but "if anyone knows how we could support ourselves by working on this, let us know."

Some of the main lessons from this software, which can be applied by other projects:

- For certain applications, people need rich types of diagrams, not just dots and lines. Many automatic tools including ours often still do not reach the quality of the best handmade diagrams.

- We can benefit from using more general systems of constraints, and this area of research deserves further study and exploration.

- Interaction, complex APIs, and embedding your software in bigger, more elaborate systems are great, but they present some users with a lot of complication that is too costly to surmount.

- Documenting your work in a way that makes it accessible is as important as the work itself. We did fall short there.

We're grateful we had a good run with graphviz.

Best regards, Stephen North north from graphviz.org

You might like to look into Patreon as an income model -- you can set it up to get paid each time you release a new feature, or periodically, and people pledge money towards you.

Hi Stephen,

We are a small team specializing in data visualization, especially on the web. I feel like we can help you with this amazing project. And we would be proud to do so.

See our recent work published on HN to get the idea of what we can do [1].

If you have any thoughts on how we can contribute, let me know by emailing me at michael[at]tehcookies.com.

[1] http://tehcookies.com/devsalaries

- MK

For everyone interested in the algorithms behind Graphviz, here's a paper to start with:


This work is absolutely amazing. I tried implementing it a few years ago (as an exercise when preparing for a Google interview). So naive. I found out that I had to read each and every sentence extremely carefully to get things right. I got quite far but never finished the implementation. (And I did not get an offer from Google either.)

The clue is in the name - 1993. It is just a pile of incompatible heuristics. I have used it, but I feel its inadequacies like a stiletto to the heart. Draw a quadrilateral with NEATO and see how many times you get a hideous bow tie. Pathetic. Its penumbral effect has been to stifle any real progress in better rival open source libraries, and hence we are all so much more impoverished today.

neato uses distance embedding methods. The paper linked describes Sugiyama-style rank embeddings as implemented in dot. 1) They're incomparable. 2) the theory behind neato was published in 2004 (http://www.graphviz.org/Documentation/GKN04.pdf).
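To make "distance embedding" a bit more concrete: this family of methods starts from the graph-theoretic distance between every pair of nodes and then looks for positions whose drawn distances approximate those targets. Below is a minimal sketch of that input step only, for the quadrilateral mentioned above. It is not neato's code, and the function name `graph_distances` is just an illustration.

```python
from collections import deque

def graph_distances(edges):
    """All-pairs shortest-path hop counts for an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {}
    for src in adj:
        # Plain breadth-first search from each node.
        seen = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen[nxt] = seen[node] + 1
                    queue.append(nxt)
        dist[src] = seen
    return dist

# The 4-cycle a-b-c-d-a: sides have target distance 1, diagonals 2.
square = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
d = graph_distances(square)
print(d["a"]["b"], d["a"]["c"])  # 1 2
```

In a bow-tie drawing some adjacent pairs end up farther apart than non-adjacent ones, which is a poor fit to these targets, so it is the kind of result an iterative optimizer can get stuck in rather than something it aims for.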

There are research groups (I'm thinking of Tim Dwyer's specifically) that are specifically trying to improve the situation. When you say "it's just a pile of incompatible heuristics", it sounds like you haven't taken the time to try and implement a better solution.

Graph drawing is the kind of thing that seems trivial, right until you try to work on it.

(disclaimer: I'm clearly biased here since I worked with the people who wrote graphviz.)

I'm a little taken aback at your vitriol. Is "Stifled other libraries" a synonym for "Does the job well enough that nothing else was able to compete" ? That's been my sense. A lingua franca for graph visualization, pervasively available.

I hardly think they'd turn away algorithmic contributions, if you've got some.

Do you have an example of your hideous bow ties? I don't usually use graphviz to 'draw' anything, so I don't really have the context to understand your complaint. I've never seen a result I think of as a hideous bow tie.

Graphviz is such an amazing DSL. I taught it to myself in college and do all of my architecture drawings in it. (I even outlined a chatbot's conversation flow with it.) I'm surprised by how many developers don't know about it or use it. It might be because of its late-'90s website, http://www.graphviz.org/, which is a shame.

I love Graphviz and use it a lot but can't help but think every time: there must be a better way to describe directed graphs. Does anyone know of any alternative DSLs that try to solve the same problem in a radically different way?

Some that I've seen are js-sequence[1], flowchart.js[2], and Mermaid[3]. They aren't all directed graphs, but they are pretty neat and relevant.

[1]: https://bramp.github.io/js-sequence-diagrams/ [2]: http://flowchart.js.org [3]: https://mermaidjs.github.io/

Radically different, no, but I use blockdiag and its variations now (netdiag, seqdiag, etc) for automation of text based diagram generation. I still occasionally use graphviz and plantuml, but they aren't that different. I've played with but not really used the only two that you might consider radically different, yEd and Gephi.

Originally what spurred me towards text based diagram generation was a visceral hatred for out of date visio network maps, so I wrote a script to automatically update network maps and diffarchive old ones so I could walk back in time on the network.

It was something that could be put in a report and look nice (due to manual layout control) vs the automap generation of things like LibreNMS/Zabbix/Nagios.

There is also plantuml, IMO the best tool and language of all those mentioned. If you are on Windows, cinst plantuml.

It provides a bunch of diagram types, including GUI mockups (salt), Gantt charts, all UML types, timing diagrams etc.

It can also export to ASCII art, totally awesome.

Plugins are there for everything, such as Gitlab, VSCode, Redmine etc. and you can easily integrate it otherwise.

Highly recommended.

What problems do you see, for example, in the graphviz language?

It seems to me that the syntax is rather simple and minimal.

I’m not the OP, but off the top of my head, I would like to have:

  - scoping for node and edge styles (among other things, this would help
    towards copy-pasting graphs into subgraphs)
  - named styles (e.g. define *”shape=foo”* as red rectangles with italic
    text)
  - more consistency in graph attributes across layout engines
  - more intuitive effect of graph attributes on layout
  - easier way to style node content (the ‘old’ way was ugly, but the
    restricted subset of html isn’t that nice, either)
  - ability to switch layout engines in subgraphs (among other things,
    this would help towards copy-pasting graphs into subgraphs)
Also (from the article) ”If your rankdir is vertical, then you need to use {} to change the record type’s direction.”, IMO, is not what most users would expect, and is a pain in the ass when experimenting to find an optimal layout engine/graph attributes pair.
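For anyone who hasn't hit the record quirk quoted above, here is a rough sketch of what it looks like in practice. With the default top-to-bottom rankdir, a record's top-level fields are laid out left to right, and wrapping the whole label in braces flips them to a vertical stack. The helper names `record_label` and `record_node` are made up for this example, and field text containing characters such as `|`, `{` or `"` would need escaping that this sketch skips.

```python
def record_label(fields, flip=False):
    """Join fields with '|'; wrap in braces to flip the record's orientation."""
    body = " | ".join(fields)
    return "{ " + body + " }" if flip else body

def record_node(name, fields, flip=False):
    """Return one DOT node statement using shape=record."""
    return f'{name} [shape=record, label="{record_label(fields, flip)}"];'

fields = ["id", "name", "email"]
print("digraph G {")
print("  rankdir=TB;")
print("  " + record_node("plain", fields))               # fields side by side
print("  " + record_node("flipped", fields, flip=True))  # fields stacked
print("}")
```

Switching `rankdir` to `LR` swaps which of the two labels ends up stacked, which is why experimenting with layout direction tends to mean touching every record label as well.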

I agree with most of the issues the sibling commenter mentions though I haven't encountered all of them. It's a declarative language that has adopted a C/ALGOL style syntax. I find the way properties and declarations mutate and propagate quite unintuitive and I hate the boilerplate.

I haven't thought about it enough but I am a big fan of Elm's declarative graphics [1]. I think starting with something like that would be interesting.

    n1 = circle "node 1"
    n2 = square "node 2" |> outlined red
    main = graph [edge n1 n2, edge n2 n1]
[1]: https://csmith111.gitbooks.io/functional-reactive-programmin...

I don't know about radically different... after all, a graph is just a bunch of nodes, edges and associated attributes. There are standard alternatives to the DOT/GV graph description language though, such as GraphML (XML based) and GML (others too, but these seem to be most supported).

For graph layout, tools worth looking at include OGDF (an open source C++ library), the commercial yFiles Java library with free yEd editor/layout tool (includes support for GraphML attributes), and Gephi (an open source Java-based graph layout/visualization tool).

If your graph represents some type of directed dataflow, then maybe Google's Tensorboard graph visualizer (intended primarily for neural nets) is of interest - it's open source and very slick (esp. wrt subgraph collapsing/expanding), but you'd be on your own in terms of importing a foreign graph format.

Ahhh that table based layout: haven't seen a beauty like this for way too long :D Anyways content matters.

Graphviz is fantastic. I'm pretty sure you can still read your .dot files into OmniGraffle (for when you need to add a lot of color and annotations). I also think its power comes from being usable as a callable library from several other languages (e.g. Python). Also I can't imagine using a service that hosts my private data in order to build a layout... but maybe that's just me.

Two other alternatives, which are the best for network-bounce / sequence diagrams (great for when you need to explain all the bounces in your multi-tiered web app): https://www.websequencediagrams.com/ and http://sverweij.github.io/mscgen_js/

PLUG: Another alternative might be Breakdown Notes. It lets you upload graphviz xdot files so you can change colors, layout and text. https://www.breakdown-notes.com/blog/graphviz

I believe you need the pro version of OmniGraffle to open them.

I don't think so, the sites seem to indicate free can also open dot files[0], I do believe there is a limit on the number of elements though.

IIRC I did some graphs for my bachelor thesis (10+ years ago) with graphviz, and then basically just passed them through OmniGraffle to make them beautiful.

[0] https://support.omnigroup.com/doc-assets//OmniGraffle-Mac/Om...

Probably the only VCS-friendly format for graphs. If you want to write technical design documentation, you have to keep it alongside the sources (because that is the natural lifecycle environment of the sources; fixed word-processor-style documents or Confluence will never handle versions, branches, and backports well), and for this reason a diff-friendly text format is mandatory.

Using Graphviz is the only way I know to build such specs. Bonus point: many online VCS renderers are able to display .md files with embedded .svg images (directly produced from Graphviz).
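A minimal sketch of how the .dot-next-to-.md workflow could be automated, assuming Graphviz's `dot` executable is on the PATH. The directory layout and the helper names `svg_command` and `render_all` are hypothetical; the only Graphviz features used are the standard `-Tsvg` and `-o` command-line options.

```python
import subprocess
from pathlib import Path

def svg_command(dot_file):
    """Build the argv for rendering one DOT file to an SVG next to it."""
    src = Path(dot_file)
    return ["dot", "-Tsvg", str(src), "-o", str(src.with_suffix(".svg"))]

def render_all(root):
    """Render every *.dot file under root (needs `dot` installed)."""
    for src in sorted(Path(root).rglob("*.dot")):
        subprocess.run(svg_command(src), check=True)

# Show the command that would be run for one (hypothetical) file.
print(svg_command("docs/architecture.dot"))
```

Calling `render_all("docs")` from a build step or a pre-commit hook would keep the committed .svg files in sync with their .dot sources, so reviewers diff the text and the hosting site shows the picture.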

I've been developing a radix tree on the side and one of the hardest things was to visualize the tree when debugging. Adding some code to print it as DOT was super simple and made it easy to copy and paste into an online Graphviz visualizer to inspect the tree.
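For illustration, a debug dump along these lines might look like the sketch below. The `Node` class is a toy stand-in for a real radix tree, not the commenter's implementation, and edge labels containing double quotes would need escaping.

```python
class Node:
    """Toy radix-tree node: children are keyed by their edge label."""
    def __init__(self, terminal=False):
        self.children = {}        # edge label (str) -> Node
        self.terminal = terminal  # True if a stored key ends here

def to_dot(root):
    """Walk the tree and return a DOT digraph as a string."""
    lines = ["digraph radix {", '  node [shape=circle, label=""];']
    next_id = 1
    stack = [(root, 0)]  # (node, numeric id used in the DOT output)
    while stack:
        node, node_id = stack.pop()
        if node.terminal:
            lines.append(f"  n{node_id} [shape=doublecircle];")
        for label, child in sorted(node.children.items()):
            child_id = next_id
            next_id += 1
            lines.append(f'  n{node_id} -> n{child_id} [label="{label}"];')
            stack.append((child, child_id))
    lines.append("}")
    return "\n".join(lines)

# Keys stored: "test", "team", "toast".
root = Node()
te = Node()
root.children["te"] = te
te.children["st"] = Node(terminal=True)
te.children["am"] = Node(terminal=True)
root.children["toast"] = Node(terminal=True)
print(to_dot(root))
```

Putting the compressed prefix on the edge and marking key-terminating nodes with a double circle makes bad splits or merges easy to spot at a glance.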

I discovered Graphviz a few years ago and am a huge fan. I’ve built many visualizations with it, it’s amazingly simple to use yet amazingly powerful. I really wish I had found it ten years earlier.

I've used this tool a bit, it's the good kind of voodoo, but it desperately needs modernization, it looks like it was forged in the '90s and was never updated.

What would you change? To me it feels nearly perfect and feature complete.

At the moment most popular interpreters will generate graphs that look... old. It might be the serif font, it might be the slightly clunky label placement. It could even be the black outline, no fill of the default.

Of course, you can change the defaults, but creating the 'look' you want textually without an interactive creator is non-trivial. Given that the main strength of graphviz is that I can write it in a stream of consciousness (like markdown), or generate it automatically using some kind of script from another data source, this is somewhat of a blocker in its usefulness.
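One way to soften this when the DOT is generated by a script is to emit a block of default attributes once at the top of every graph. A minimal sketch follows. The specific fonts and colours are arbitrary choices, and `DEFAULTS`, `attr_stmt` and `styled_digraph` are names invented for the example. The attributes themselves (`fontname`, `shape`, `style`, `fillcolor`, `color`, `arrowsize`, `bgcolor`) are standard Graphviz ones.

```python
DEFAULTS = {
    "graph": {"fontname": "Helvetica", "bgcolor": "white"},
    "node": {"fontname": "Helvetica", "shape": "box", "style": "rounded,filled",
             "fillcolor": "#e8f0fe", "color": "#4a6fa5"},
    "edge": {"fontname": "Helvetica", "color": "#555555", "arrowsize": "0.7"},
}

def attr_stmt(kind, attrs):
    """Format one default-attribute statement, e.g. node [shape="box"];"""
    pairs = ", ".join(f'{key}="{value}"' for key, value in attrs.items())
    return f"  {kind} [{pairs}];"

def styled_digraph(name, edges, defaults=DEFAULTS):
    """Return DOT text with the default styles placed before any edges."""
    lines = [f"digraph {name} {{"]
    lines += [attr_stmt(kind, attrs) for kind, attrs in defaults.items()]
    lines += [f'  "{a}" -> "{b}";' for a, b in edges]
    lines.append("}")
    return "\n".join(lines)

print(styled_digraph("pipeline", [("ingest", "clean"), ("clean", "report")]))
```

This does not help with hand-written, stream-of-consciousness graphs, which is the case the comment is really about, but it keeps generated diagrams consistent without any per-node styling.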

And, of course, the customisation syntax is fairly clunky and not easily readable. I don't think I've seen a DSL that does colouring well in this regard, but graphviz goes a step further by not being consistent with CSS or other markups.

It is good in terms of feature-completeness, but as mentioned in the TC, there are areas where it could be polished up a bit to make it feel more modern.

I agree styling isn't exactly easy, but I have built very nice complex diagrams in a fraction of the time it would have taken in a tool like Visio. Compared to endlessly thrashing about in a GUI it is still a dream. I think because it is just polish no one has really had the desire to do a significant update.

There are a bunch of prettifiers for the SVG output, I couldn't find the one I used last time, but stumbled across this one that seems very neat and even makes the graph interactive: https://github.com/mountainstorm/jquery.graphviz.svg/

Edit: found it https://github.com/vidarh/diagram-tools/tree/master

Yeah, just the edges sanded off really - more of a markdown. The YAML to graphviz's JSON.

And then the graphics need to look more D3 than UML.

Love Graphviz, been using it for some time now. It's kind of painful rendering large graphs. Does anyone know the particular reason why `dot` and friends only seem to take advantage of one CPU core when drawing graphs? Multi-threading unimplemented? - or impossible to implement for some reason?

Sounds about right, 'dot' is about as old as the hills, so it probably wasn't an issue back in the day.

Does anyone have any favorite tools for visualizing very large graphs?

I'm thinking 200k nodes and 2 million edges. Graphviz segfaults on graphs a fraction of that size, and even at that scale a node-by-node representation isn't super helpful without being able to zoom out and look for patterns.

What layout algorithm are you using for graphviz? sfdp is less computationally intensive than dot/neato/circo, and I would have thought it would be able to cope (assuming you have enough RAM/swap) - this gallery [1] uses graphviz to visualize networks from the University of Florida Sparse Matrix collection, the largest of which "have tens of millions of nodes and over a billion of edges".

Gephi and Cytoscape are the main free graphical applications for analyzing networks; they might work for you. Tulip [2] might also be worth a try.

[1]: https://web.archive.org/web/20111102185002/http://www2.resea...

[2]: http://tulip.labri.fr/TulipDrupal/
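If you want to try sfdp on your own data, a rough sketch of streaming an edge list into DOT is below. The random edges are only a hypothetical stand-in for a real dataset, and `write_dot` and `random_edges` are names made up for the example. Nothing here says how far sfdp will scale on a given machine.

```python
import io
import random

def random_edges(n_nodes, n_edges, seed=0):
    """Stand-in for real data: yield random (u, v) node-id pairs."""
    rng = random.Random(seed)
    for _ in range(n_edges):
        yield rng.randrange(n_nodes), rng.randrange(n_nodes)

def write_dot(out, edges):
    """Stream an undirected edge list to a text file object; return edge count."""
    out.write("graph big {\n")
    out.write("  node [shape=point];\n")  # tiny nodes, no labels
    count = 0
    for u, v in edges:
        out.write(f"  {u} -- {v};\n")
        count += 1
    out.write("}\n")
    return count

# Written to memory here; a real run would pass an open file instead.
buf = io.StringIO()
n = write_dot(buf, random_edges(2_000, 20_000))
print(n, "edges written")
```

With a real file (for example `open("big.dot", "w")`), the result can be laid out with `sfdp -Tsvg big.dot -o big.svg`. Writing edges one at a time keeps memory use flat on the generating side, whatever the layout step ends up needing.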

We made http://github.com/graphistry/pygraphistry for that sort of thing, feel free to ping for an API key!

The idea is by connecting GPUs in your browser to GPUs in the cloud, we can do bigger and bigger datasets over time. We're currently 1M nodes & edges in interactive-time (so no leaving your computer for 1hr+ or crashing), and actively working on the V2 engine to get us to 100X more. And yep, generally we don't want to stay long in large views, so we see it more about being scalable / smart / usable enough to let you go in-and-out.

I've occasionally looked into larger projects, though not that large.

There are some tools in R and python that seem to turn up. I'm sorry I don't have anything more specific than that to suggest, though it might be a fruitful direction.

Out of curiosity: is IBM Rational Rose doing the job? Haven't used it for years.

Gephi might work.

For my senior thesis in undergrad (2008), I used graphviz as part of my tooling for visualizing graph algorithms. I held onto that code and it still chills in GitHub (although I wouldn’t be intro’d to github till grad school a year later).


Sometimes it’s painful to go back and see the code you wrote that long ago, but what’s great is to see frameworks like this still chugging along providing value for over a decade.

Great intro! I love graphviz, it's saved me tons of time. My former colleague wrote a "code parser" that went through SAS and SQL code, creating dependency trees of the data/tables and procedures/functions used. She wrote it in SAS, which alone is crazy. Her code would output .dot files, and then she'd create PNGs out of them.

When we needed high res images, we exported to SVG, and I was able to import those into Visio. She saved me days or weeks of documenting.
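As a rough idea of how small the DOT-emitting half of such a tool can be, here is a toy sketch. It is not the SAS program described above, and the regexes only understand simple `CREATE TABLE ... AS SELECT ... FROM/JOIN ...` statements, so treat the parsing as a placeholder for a real parser. All names (`dependencies`, `to_dot`, the sample tables) are hypothetical.

```python
import re

CREATE = re.compile(r"create\s+table\s+(\w+)", re.IGNORECASE)
SOURCE = re.compile(r"(?:from|join)\s+(\w+)", re.IGNORECASE)

def dependencies(statements):
    """Yield (source_table, target_table) pairs from simple CTAS statements."""
    for sql in statements:
        target = CREATE.search(sql)
        if not target:
            continue
        for source in SOURCE.findall(sql):
            yield source, target.group(1)

def to_dot(pairs):
    """Return a left-to-right DOT digraph with one edge per dependency."""
    lines = ["digraph deps {", "  rankdir=LR;"]
    lines += [f'  "{src}" -> "{dst}";' for src, dst in sorted(set(pairs))]
    lines.append("}")
    return "\n".join(lines)

statements = [
    "CREATE TABLE clean AS SELECT * FROM raw",
    "CREATE TABLE report AS SELECT * FROM clean "
    "JOIN customers ON clean.id = customers.id",
]
print(to_dot(dependencies(statements)))
```

The same .dot output can then go to PNG for quick viewing or to SVG for the high-resolution case mentioned above.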

I use graphviz quite a bit for doing diagrams of data flow or UI flows. It's easy to get something sufficient to communicate, but in my experience very hard to get something you'd be happy putting in a presentation that's going to be seen outside engineering.

I want to put the plug in for nwdiag which is like graphviz for network diagrams: http://blockdiag.com/en/nwdiag/nwdiag-examples.html

Saves heaps of time over fiddling with Visio.

Use it all the time to describe entwined processes, and for engineering when you have lots of different equipment connected in lots of ways & need to describe it.
