Glance, a visualizer for Haskell code (github.com)
128 points by chewxy on Jan 11, 2017 | 32 comments

If anyone like me thinks all these visualizations just make the code more confusing, I want to point out it might be a personal (although it can be very strong) preference.

I had always firmly believed that programs are universally clearer when represented as lines of text with indentation, and that any attempt to visualize them doesn't help except for simple toy programs. Then I met some architects who design buildings with code using Grasshopper 3D (which is a graphical functional language for 3D modelling) [0]. Those people can easily navigate a messy web of connected lines for their hugely complex models, yet find a block of text confusing and unintuitive. I am sure some of the more visually inclined Haskellers will find Glance very useful.


I think the question of whether 'visual' code is good/useful should be broken down a little more: which specific aspects of a programming language could usefully draw from a larger visual vocabulary?

It's clear that text is useful for some aspects: it allows you to easily define and incorporate a very large number of distinct symbols for naming things—and in code we need to name lots of things.

What about sequences of expressions (say a block of 20 lines within a single function) doing arithmetic, assigning values to variables, invoking functions, etc.? Again, the linear/sequential nature of text has intrinsic properties that make it effective for this.

Now what about things like function and class definitions, which are essentially conveying information about nesting/containment/category definition? Are there properties of text that make it intrinsically appropriate here? I say no. The fact that we use curly braces and tabbing to indicate these kinds of structures is pretty clearly a historical accident.

The true benefits of moving to a hybrid visual language may not be readily apparent because we haven't had an opportunity to easily experiment with different ideas in this realm. As inexperienced outsiders, only the obvious substitutions come to mind (e.g. replacing curly braces and tabs with a colored rectangle). But I think it's an area that offers a very large range of possibilities that we're still largely ignorant of.

An example of one structure I could see as more useful: the default view of class and function definitions etc. (any bits of code used for 'organizational' purposes) shows them as nodes in a network, and depending on how zoomed in you are, you may or may not see the text of the code inside. You frequently switch between this and a call graph view that connects modules by a control flow visualization instead. And every aspect of the appearance of the language is configurable by something like 'syntax-defining' CSS (e.g. it's not strictly for visual properties but could change keywords or whether semi-colons are used or not, etc.). (This would be close to impossible if you insist on parsing text to derive a model of your program, which, if you think about it, is another thing we do largely because of historical accident. I've written about an alternate approach here: http://westoncb.blogspot.com/2015/06/how-to-make-view-indepe...)

What you are describing sounds a lot like what Node-RED does. https://nodered.org/

Each kind of node is basically a class, and is distinguished by its color and interfaces - input nodes are blue with a button that triggers an action and one output, debug nodes are green with one input and a switch to toggle if output is being logged, function nodes are tan and have one input and three outputs (iirc, "normal" output - more on this later - stderr and return value).

Nodes are linked up output to input - no limit on the number of links a given interface can have - and transfer information as json objects. Often nodes expect a top level key named "payload" by default, though most node behavior can be customised.

So a hello world program would be something like: blue kickoff node with a button --> tan function node inside which you define a js function that takes an object named msg as an argument and sets msg.payload to "hello world!" --> green debug node set to log msg.payload and switched on. You click the button and "hello world!" appears in the log (and reappears as many times as you click).
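For anyone without Node-RED handy, here is a rough Python model of that flow. It is only a sketch of the idea, not Node-RED's API: real function nodes are written in JavaScript, and every name below (`inject`, `set_hello`, `make_debug`, `run_flow`) is made up for illustration.

```python
from typing import Callable, Dict, List

Msg = Dict[str, object]  # messages are plain JSON-like objects


def inject() -> Msg:
    """Stand-in for the blue kickoff node: one fresh message per click."""
    return {"payload": None}


def set_hello(msg: Msg) -> Msg:
    """Stand-in for the tan function node: sets msg.payload and passes msg on."""
    msg["payload"] = "hello world!"
    return msg


def make_debug(log: List[object]) -> Callable[[Msg], Msg]:
    """Stand-in for the green debug node: records msg.payload in a log."""
    def debug(msg: Msg) -> Msg:
        log.append(msg["payload"])
        return msg
    return debug


def run_flow(nodes: List[Callable[[Msg], Msg]], msg: Msg) -> Msg:
    """Pass the message through each node in order (a single linear wire)."""
    for node in nodes:
        msg = node(msg)
    return msg


log: List[object] = []
flow = [set_hello, make_debug(log)]

# Two "clicks" on the kickoff button produce two log entries.
for _ in range(2):
    run_flow(flow, inject())

print(log)
```

The real thing lets one output fan out to several inputs; this sketch only models a single chain.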

The nice thing is that there are a lot of kinds of nodes that abstract away common tasks - very much like what ansible does with modules. One reads/writes files, one executes shell commands, one listens/broadcasts to mqtt channels, one queries sql databases, etc. When the correct node is used for a task vs wrapping all the desired logic in a single function node (which you could do), the resulting graph or flow is very easy to understand and reason about.

It's not quite what I was getting at, but it does sound interesting.

What I had in mind was staying in the procedural/oop paradigm of e.g. Java, but without the requirement of a fixed syntax, with way more flexible rendering options, and with the possibility of more diverse input schemes than typing one character at a time.

I think you nailed an important point here: text is great at representing code because text is sequential, and so is a lot of the code we write, and so is the computer it runs on. Maybe if the code had a different structure, some 'visual' code would serve better. Idk, maybe coding for an FPGA/GPU would benefit from a visual representation?

LabViz is another widely used graphical programming environment. It's also the basis of what LEGO uses in their toys.

I'll have to disagree. It's not a matter of personal preference. Those tools scale only to a few hundred blocks, and even for naturally visual things (like logic circuit design) people can only grasp more complex constructions when they are text, even if you make the same abstraction blocks available in both.

Those architects probably just don't hit their limit. Physicists hit their limit on LabViz scripts all the time, and are very vocal about it.

How can you accurately comment on something that you don't even know the name of? You are almost certainly referring to LabVIEW, which is a graphical programming language commonly used in instrumentation applications and also serves as the foundation of the LEGO MINDSTORMS software.

I have built many large applications in LabVIEW (>1,000 VIs), and I can confidently state there is no inherent problem with building large systems in a graphical language vs a text-based one. Why? Because I have actually done it, resulting in extensible and maintainable code. The dataflow nature of the language makes it easy to understand how data flows through your system. It even has by-value OOP and an actor framework.

Physicists complaining about it have no merit because they have zero idea about software development, architectures, design patterns, etc. They love Python for whatever reason and don't complain about it, yet it's still terrible code. I have seen LabVIEW sworn off by top of the line physicists only to be shown their Python program that was a single file with about 15,000 lines of code and such atrocities as functions with greater than 20 arguments spread over 10 lines. It was an atrocity.

I have written multithreaded applications in Python, and the equivalent LabVIEW application would be far simpler because you get multithreaded behavior for free.

Many software engineers swear graphical languages are only for toy applications, yet they've never actually built anything of note in one themselves. So it's a poor argument to listen to (as with the person you replied to) when it's purely speculation based on no real data. They somehow forget that they spent four years in college getting it drilled into their heads that linear, text-based files are the only way to program a computer. It's interesting to note that any high level thought (e.g. mathematics) bears more resemblance to graphical notation than pure text-based notation.

> It's interesting to note that any high level thought (e.g. mathematics) bears more resemblance to graphical notation than pure text-based notation.

Exactly! The more general the structures you're thinking about, the more visual approaches seem to become effective. That's my experience and Hadamard has some good discussion and data on it being a widespread trend in mathematical thinking.

Maybe a good approach for thinking about (at least partially) visual languages is considering which approaches are effective in thought for which types of subject matter. It definitely seems like the more general/abstract/high-level thought-categories make better use of relationship-focused visuals.

Also: that's why 'LabViz' yielded no seemingly relevant results on image search...

Check out Bubble.is; I think they got the abstraction layer right.

what do you think of visualizations such as this? it seems helpful for what he's doing here.



Brian Beckman: The Zen of Stateless State - The State Monad


Creator of Glance here, happy to answer any questions.

I didn't realize graphviz could do graphs of that quality. I'm used to graphviz's examples of DAGs and badly formatted text. Can you tell us more of how you're using graphviz? Wouldn't you need to move away from it as you move towards an interactive editor, in favor of other more javascript-based layout algorithms? What path do you think you'd follow for that?

Graphviz is only used to find the positions for the nodes using Graphviz's Neato algorithm. The nodes and the lines themselves are all rendered using the Haskell library Diagrams [0].

The next step for the project is to improve graph layout (see Glance issue here [1]), which likely means moving away from Graphviz.

What tools to use for interactivity or an editor is still up in the air.

[0] http://projects.haskell.org/diagrams/

[1] https://github.com/rgleichman/glance/issues/1

Why are the nodes rotated like that? It seems very distracting, and I think that might also be partly causing your "layout is too spread out" issue? [1] It could also be exacerbating your second issue of crossing edges - you are getting crossing edges in even the simplest graphs (e.g. your "f1" function example, with 5 nodes).

GraphViz's "dot" algorithm (i.e., Sugiyama-style graph drawing algorithm) [2] should give a fairly compact representation that is organized into layers, and avoids crossing edges in at least simple cases, but rotating the nodes would again "spread out" the layout by forcing increasing height of each layer.

Under "Possible solutions" you mention "create a better graph layout algorithm" - that sounds quite ambitious, wouldn't this be a PhD-thesis-level research task in itself?

The only graph drawing library I'm aware of that might be competitive with GraphViz's algorithms is MSAGL [3] but that's a .NET library.

[1] https://github.com/rgleichman/glance/issues/1

[2] https://en.wikipedia.org/wiki/Layered_graph_drawing

[3] https://www.microsoft.com/en-us/research/project/microsoft-a...

Rotating nodes is an easy way to reduce line crossings. Here's a comparison [0].

However, it does seem that vertical and angled text is harder to read than horizontal text, so there is room to improve here.

[0] https://gist.github.com/rgleichman/f812150151b549ca9f634832c...

Fantastic work!

I've tried implementing something very similar for Clojure, but I thought I was insane so I stopped :)

Notice how close to the lisp syntax these visualisations are, eg: (* 3 5) or (* (+ 8 7) 2)
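To make the resemblance concrete, here is a small Python sketch (my own illustration, not anything from Glance) that stores those two expressions as nested tuples, which is roughly the tree a node-and-edge drawing would show, then prints them back in lisp syntax and evaluates them. The helper names `evaluate` and `to_lisp` are hypothetical.

```python
import operator
from functools import reduce

# Only the two operators used in the examples above.
OPS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Evaluate a nested-tuple expression such as ("*", ("+", 8, 7), 2)."""
    if isinstance(expr, tuple):
        op, *args = expr
        # Evaluate the child nodes first, then combine them with the operator.
        return reduce(OPS[op], (evaluate(a) for a in args))
    return expr  # a leaf: a plain number


def to_lisp(expr):
    """Render the same tree in lisp syntax."""
    if isinstance(expr, tuple):
        return "(" + " ".join(to_lisp(e) for e in expr) + ")"
    return str(expr)


simple = ("*", 3, 5)
nested = ("*", ("+", 8, 7), 2)

for expr in (simple, nested):
    print(to_lisp(expr), "=", evaluate(expr))
```

Each tuple is one box in the picture and each nested element is an incoming edge, so the text and the drawing carry the same structure.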

Visual representation of code opens up a lot of possibilities, like being able to click on a block and visually debug it or write all kinds of unit tests on its 'back'.

You can also 'visually' show how data passes from function to function, or edit the constants in their slots and see how the changes propagate throughout the code.

Anyway, I'll check it out in more detail later, great work !

If this were more compact and didn't rotate text I think it would work much better.

I am very interested in program visualization, but I think trying to visualize things at this level is just nutballs.

Anyway, it's the wrong problem. I don't need help understanding x * 3 + y. I need help understanding what these 30kLOC in these 17 files do.

If for some reason you actually need to understand 30kloc in 17 files in any meaningfully substantial fashion, I posit this will always require some time investment on your part. (Such as commenting out most of `main` / a module, then progressively and interactively diving into the codebase via repl/compile trial+error etc.) I gather that such diagrams and visual aids help much more once one's basic understanding "clicks", as most of us can more easily lock in our understanding for later recollection via some visual mnemonics / cues.

As always.. not the silver bullet, just the perfect tool for a context and use-case only you can decide/suspect/discover =)

Edit: actually looked at the screens. Hmmm seems like a language/programming learner-aid tool mostly. Fair game for dem young'uns I guess!

Getting an understanding of the architecture of a code base from the code alone is often very hard (time consuming). There are usually some high-level principles behind how the code base is put together, but rarely are these effectively communicated in the code. Usually they are communicated through a side channel like a diagram (which may be outdated), or orally by someone experienced with the code base (assuming someone is available). Module import graphs and call graphs can be useful for this. But because they are rarely used when writing/designing the software, the code is rarely optimized to produce good ones. One such optimization is reducing the number of levels in the module hierarchy, which may simplify both the architecture and the resulting diagrams.
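As a rough idea of how cheap a module import graph can be to produce, here is a Python sketch using the standard `ast` module. The three-module code base in `sources` is invented for the example, and the helpers `imports_of` and `import_graph` are hypothetical names; a real tool would read files from disk and handle relative imports, which this skips.

```python
import ast
from typing import Dict, Set


def imports_of(source: str) -> Set[str]:
    """Collect the module names imported by one source file."""
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports like "from . import x" have no module name
            # and are ignored in this sketch.
            found.add(node.module)
    return found


def import_graph(sources: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each module to the in-project modules it imports."""
    project = set(sources)
    return {name: imports_of(src) & project for name, src in sources.items()}


# Hypothetical three-module code base, given as source strings.
sources = {
    "app": "import db\nfrom ui import render\nimport os\n",
    "ui": "import db\n",
    "db": "import sqlite3\n",
}

graph = import_graph(sources)
for module in sorted(graph):
    for dep in sorted(graph[module]):
        print(f"{module} -> {dep}")
```

Filtering to in-project modules is a deliberate choice here: standard-library and third-party edges tend to drown out the architecture you are trying to see.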

Sure it will involve time investment on my part, but I want a visualization tool to help lighten that investment and help me come to the understanding more quickly.

The question of how concepts and algorithms are represented is interesting. While this is a neat project, a lot of people including myself might say that a "visual" representation of code is actually more confusing. Perhaps it's because textual representations are 1-dimensional i.e. the representation of code as text can only grow in one direction. This makes it inherently simple in some sense.

I have found in my time teaching people about deep learning that proper visualizations of algorithms can help people grok things better[0]. Granted backprop and neural networks are really simple one dimensional algorithms.

[0]: http://blog.chewxy.com/2016/12/06/a-direct-way-of-understand...

The link is a blank page for me.

Yeah... I seem to have borked my wordpress over the weekend while attempting to transfer it out to Hugo. Apologies.

I enjoy the developer's definition of 'visual'; just surrounding textual code in nicely coloured boxes.

Those are textual identifiers, not code. The relations are all graphic.

If he started using graphic icons, he wouldn't be representing Haskell code anymore.

I admire the Haskell folks, much in the way I admire theoretical mathematicians... which is to say, I'm sure the work they are doing/did has some value, somewhere (in the past).

But the Haskell proponents that I have known have admitted that the reason they don't use Haskell more is because the libraries and ecosystem aren't adequate. That's not to mention that maybe Haskell is just too much for even the above average dev to grok.

So why push deeper? Is this just academia stroking itself? Why not invest this energy in something slightly more commercial?

> […] the reason they don't use Haskell more is because the libraries and ecosystem aren't adequate.

The adequacy of the Haskell ecosystem really varies depending on the kind of application that you want to develop. If you're writing a compiler or a web service, Haskell is a pretty good choice.

This [1] is a nice overview of the state of the Haskell ecosystem.

Let me also plug two more links [2] [3] that explain why Haskell is actually a very good choice for (large) commercial projects.

[1] https://github.com/Gabriel439/post-rfc/blob/master/sotu.md

[2] https://www.fpcomplete.com/blog/2016/12/software-project-mai...

[3] https://www.reddit.com/r/haskell/comments/54umkh/haskell_for...

As I dove into the Haskell space, I noticed where all the major "innovations" I encountered (as game-changing and substantially productivity/expressiveness/deeper-understanding-enhancing for the developer) in the mainstream tech stacks (whether .net java python ruby rust jquery coffeescript node or the more recent hypes) in the last decade were really discovered/pioneered/refined/formulated/scrutinized/formalized. It's definitely the most interesting and educating ecosystem I have discovered to play around in and evolve the breadth and depth of my developer skills since I wrote my first lines of code in Basic and Pascal long ago. Just when I got fed up with the boilerplate staleness of it all, I found the actual frontier. Good enough for me; some hackers are kept going by figuring out where their old "endless fascination with brainy stuff but also immediately interactively attackable with a REPL" nerve will be well tickled for the next n years.

So yes "why push deeper"? It's just what some developers feel they need to do in order not to just throw out their laptops and turn to full-time gardening, full-time redditting/4chaning/twittering, or cooking shows. Not that there's anything wrong with any of these..

I really appreciate this comment. I hope that at some point I see the balanced outcome of such approaches to technology. I'm so close to the "throwing out the laptop" scenario you describe.

> So why push deeper?

This would appear, contrarily, to be shallower.
