Hacker News new | past | comments | ask | show | jobs | submit login
Grim C++ Tales from the Crypt: The Visitor Pattern (2017) (cppcrypt.tumblr.com)
114 points by meebob 12 days ago | hide | past | web | favorite | 60 comments





The thing that annoys me most about this pattern is that people always miss the "there needs to be lots of things you want to perform on those thingies".

And so we end up with a huge pile of visitor code, and there is one or maybe two actual visitors.

When it would have been far easier to read and modify if it was just coded out normally.


I’ve never considered this, but I think it’s a fair assessment! It’s particularly striking when you see a compiler with multiple visitors that are refactored into one or two visitors to reduce the number of traversals. I think I understand the cause, it’s easier to start with a visitor pattern than refactoring into a visitor pattern.

Nevertheless, for many problems, like compiler construction, I’ve never seen a pattern that matched the utility of the visitor (even outside of imperative and object-oriented patterns)


Use the right language: one with ML style pattern matching.

check out multi-methods and multiple-dispatch OO

IMHO, the most important condition is the need to do different stuff on different (small) subsets of classes, typically combined with a (non-trivial) tree traversal. Otherwise, its more straightforward to add regular virtual methods.

Or in this case, since the IThingyInteractor can do exactly the same thing with an IThingy if IThingy has a name() method, you can just implement it with single dispatch.

I don't know if I have ever actually seen a true implementation of this pattern. What I have seen is a pattern someone called a Visitor where a tree is iterated over and the "Visitor" makes virtual function calls on the composite node in a single function body without ever doing something different depending on the type of the node.


If you're like me and couldn't get past the first few slides because it's a terrible reading experience, go to the archive, which shows the whole thing in thumbnails. It makes it little better, even though the thumbnails are in reverse:

https://cppcrypt.tumblr.com/archive


This would definitely be better as a blog post than a dog post.

> dog post

This kind of abomination even has a name.


What is a dog post?

It's when you cut your blog post into tweet-sized blocks and post each tweet in a separate jpeg with a cartoon dog.

I requested desktop site and then from there I read them one by one, tapping the “next” links. It was very readable IMO but I am used to having read cartoons online before that you navigate like this so.

The visitor pattern isn't appropriate everywhere, but it's quite good for traversing and manipulating ASTs (abstract syntax trees). An AST typically has many kinds of nodes, but programs that traverse the AST usually only care about some small subset of those nodes. The visitor pattern provides a clean solution: a generic AST traversal library visits all nodes in the tree while allowing the program to customize what happens when visiting specific nodes. I've found the visitor pattern very effective for creating derivative ASTs without knowing about everything that might be in the AST.

Yes, this is the only place I have ever used the visitor / double dispatch pattern. The main idea is it fulfills the "Open-Closed" principle:

https://en.wikipedia.org/wiki/Open–closed_principle

Otherwise you find that your AST data structure is never finished -- you are constantly added stuff to it. I wish there was a clearer way in C++, but this is the C++ solution.


But often a pattern matching switch expression would be nicer, if your language has such a thing.

It's things like the visitor pattern that led me to implement Clasp - a Common Lisp that interoperates with C++ and uses llvm as the backend (github.com/clasp-developers/clasp.git). Common Lisp has generic functions and multiple dispatch - so it doesn't need the wretched visitor pattern. Try writing a compiler using the visitor pattern - I dare you. :-)

The visitor pattern doesn't disappear under CLOS, but all the framework boilerplate code does.

So that is to say, we can still have a situation where we have a tree of objects T with nodes N of different classes (expression, if-statement, ...) and visitor objects of different classses (pretty-printer, evaluator, ...).

Then given some generic function G and visitor object V, we walk the nodes of the tree, and simply (funcall G N V) for each node N that we encounter.

The remaining verbiage is then in all the method specializations we have to write for G for all the N V combinations that occur.

For all the framework brevity, we yet have an additional flexibility relative to the Visitor Pattern: namely, we can not only vary the choice of visitor object V, but also of G. The Visitor Pattern fixes the generic function in terms of some hard-coded some visit()/accept() protocol.


This... I can actually see the point in this. The Visitor pattern is one of those patterns I never really saw the point of, and which mostly struck me as an over-engineered hack about a shortcoming in a language.

This example actually makes clear why you'd need to do it this was in C++ at least. Not sure which other languages would need this. It's certainly not pretty. Then again, dispatching twice is not so bad compared to your average Java-style over-engineering.


It's not a "shortcoming" of the language per se. Visitor is a mitigation of the Expression Problem[0] on the Object Oriented spectrum. Object oriented designs need to use Visitor to add functionality to existing data structures, whereas doing such is a natural feature of Functional languages (though FP languages have their own awkward angle in attempting to add new data structures to existing functionality). This is a big motivator for most modern programming languages supporting multiple design paradigms.

[0] https://eli.thegreenplace.net/2016/the-expression-problem-an...


It's a shortcoming of languages that don't have multiple dispatch.

Again, Eli Bendersky has an excellent article about the issue. In particular, he notes that this has been discussed for the case of C++: https://eli.thegreenplace.net/2016/a-polyglots-guide-to-mult...

A follow-up goes into Python: https://eli.thegreenplace.net/2016/a-polyglots-guide-to-mult...


Why should anyone force changes into the core language if all you want to do is use an obscure programming construct in a corner case that's easily implemented with a design pattern?

Some of these complains are at best very misguided.


Have you considered that multiple dispatch is obscure because it requires jumping through such hoops to implement in most mainstream languages, rather than because of its lack of utility?

The short answer is that the existence of multiple dispatch isn't motivated by the visitor pattern.

Moreover, multiple dispatch isn't obscure at all.

C++ has it, but only statically, for instance, in the form of overloaded functions. C++ chooses a function overload by looking at the types of all parameters and arguments.

Multiple dispatch is similar, but the run-time types of the arguments are used.

This is broadly applicable in making all sorts of situations nicer.


Why have functions and control structures when we can just jmp?

That's a disingenuous comparison. Functions are used pervasively, but even the visitor pattern isn't used very often.

But in languages that support multiple dispatch at the language level, they're just another way of writing polymorphic functions. They're not just a niche feature. When you don't have to write all of the ceremony of the visitor pattern, it's easier to use visitors everywhere without thinking about them.

I've always heard that pattern itself is older than C++, originating from Smalltalk.

It does come from Smalltalk, which is in part why it looks so sloppy and strange in other languages (a Ruby example I saw even had an intermediary class for the visitor part, which doesn't really happen in Smalltalk).

The reason to use this pattern is to leverage real polymorphism. In Squeak/Pharo, the graphics system is called Morphic, which uses this pattern all the time to draw different kinds of objects to different kinds of Canvases. With a visitor pattern, each object can "know" how to draw itself to a given canvas.


A state machine with context is one case I have used it for.

I clicked the link and then tumblr asked me about privacy, with a link to "Manage options".

I clicked the link, hoping to disable x,y,z and there is just text. No options!

Not reading further.


What a terrible UI on that website. Constantly clicking "Next"! Yuk. And every page required zooming in and out to make it fit on my screen. For a tech focused comic, you'd think the author could run it through an ImageMagick batch resize job. Bonus points for allowing keyboard back and forth navigation.

I agree, horrible, luckily I use vimium so after the first page it was a repetitive key sequence to move forward

You can click anywhere on the image too; not aiming for that tiny [next] does help a bit.

Welcome to Tumblr.

While I, as a rule, try not to stand against good storytelling, I clicked far too many times while thinking "get to the point" before I just closed the tab.

The example code the author comes up with: https://github.com/dpugson/examples/blob/master/chpt1_the_vi...

Which implements the pseudocode: https://cppcrypt.tumblr.com/post/168134402897

    main {
        thingies = [ purpleThingy, littleThingy ]
        interactions = [ commentOn, cherish ]

        for (interaction in interactions)
          for (thing in thingies)
             interaction.interact(thing)
    }
The author does mention things like "function pointers can be used in simple cases" and "std::variant would avoid the need to overload the method". But the main reason for having the C++ code the way it is is because "you can't dispatch to overloaded methods at runtime". https://cppcrypt.tumblr.com/post/169439207562 ff.

The thing with the visitor pattern is that it indeed is useful for many operations over fixed data. But the same holds for simple pattern matching. Inheritance (or rather subtyping) is useful for changing data but a fixed set of operations. But what the eff do we do when we have many operations on changing data? The expression problem still is an interesting problem, because a data structure that solves it would basically be the "gold standard".

I don't understand why the class instance needs to accept the interaction. Why not just invoke the interaction directly?

    interactor_p->interact(thingy_p)
Would this still be considered the visitor pattern or is the extra layer of indirection important?

Your code won't compile because interactor::interact is not defined on the base class. That line is needed because the overload resolution is static. So with the macro ACCEPTS, you build the code where the overload is resolved.

I see. There is no interact signature that accepts IThingy, which is the whole point of this pattern. Thanks.

I don't know about the limitations of C++, but something like what you describe would happen in Smalltalk. Take for example a Canvas object and a bunch of different shape objects (square, circle, etc):

  "Square class"
  drawOn: aCanvas
      aCanvas line: (some point) to: (some point).
      "etc"
  
  "Triangle class"
  drawOn: aCanvas
      aCanvas line: self leftVertex to: self rightVertex.
      "etc"

  "Canvas class"
  drawShapes: aCollection
      aCollection do: [ :shape |
          shape drawOn: self ]

C++ match expressions would make std::variant a really nice alternative.

I assume there is already a proposal out?


Pattern matching is on many people's radar, including the "Direction Group" (See: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p093...). This proposal was discussed at the committee meeting last month: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p137... I'd love to see it standardized in 23, but I have no idea how likely that is. (It definately won't be in c++-20, which is now closed except for tweaks).

For reference, Rust version shows how much simpler that could be with pattern matching - https://gist.github.com/km216/aad5c0fa11f32aa562af2370a32208...

Until C++2z rolls out with match support, you can roll your own. Nikolai Wuttke gave a lecture at MeetingCpp 2018 (https://www.youtube.com/watch?v=CELWr9roNno) that presents code that lets you do this:

    // Client visitor code
    match(
      thingy,
      [](const PurpleThingy& ) {std::cout << "purple thingy\n";},
      [](const LittleThingy& ) {std::cout << "little thingy\n";},
      [](const auto& ) {std::cout << "any other type\n";}
    );

Working example: https://coliru.stacked-crooked.com/a/dfed39bc7fcdfb01

std::visit mostly allows you to do that already:

https://coliru.stacked-crooked.com/a/be5c44281eea8bc4

Then only unfortunate missing piece of the puzzle is that there's no trivial way to create a closure out of this, so it requires a bit more manual work to propagate local state to the visitor.


If you build your visitor out of lambdas, instead of a struct, you can propagate local state to the visitor easily by using lambda captures. There are good examples of this approach at https://en.cppreference.com/w/cpp/utility/variant/visit

Unfortunately, you can't leverage template substitution rules with the lambda approach. And that's really necessary if you want to have actual powerful match expressions.

For those working with C++ an answer might come in C++23.

http://open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1371r0....


When I originally read about Visitor in the GoF book it was fun to work through its convoluted way but seemed ugly in any language, including the original SmallTalk.

I'd probably just do type checks and maybe use a 2D table for the possible interactions.


I enjoyed the format of this, kept me interested. Nowadays I feel like if I read a blog post I have such a short attention span I'd just glance over it.

Would love to see more stories.


The only time I have ever encountered a code base where the visitor pattern was necessary and fulfilling its purpose is Babel (the JS compiler).

Agreed. The _only_ times I've ever been justified in implementing this pattern have been when doing compiler work.

Someone else up-thread mentioned what I think is the key requirement making visitors worthwhile is non-trivial tree traversal. ASTs seem to fit this description more than any other data structure I've had to work with day-to-day.

Outside of language trees, I wonder where else visitors are common?


ETL for very complex data structures maybe? Oh the irony. I've never seen the visitor pattern used in an ETL pipeline, which might be why they're always such God forsaken messes.

[flagged]


I probably don't need to read about C++ patterns anymore, so if this was an article I'd probably have skipped. The cartoon dog was pointless but somehow made me read it. It felt like there would be a punchline.

their GDPR compliance screen is full of anti patterns. Will not read.

What do you mean, you just have to click the right nondescript links in the right order, then manually uncheck sharing data with each of their 200+ partners.

How the &$#! do I read this

If you ask me, the actual problem is the tree data structure... Maybe post order forms lend themselves better to processing.

The even deeper problem is what ASTs are meant to represent, to which the answer would be "possibly a lot of diverse things". Diversity is never good in a computational context. But I don't see a good way to avoid it in the context of programming languages and ASTs. The reason for the diversity is that programming languages should allow humans to specify what should happen in very few keystrokes.




Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: