I'm quite happy to see more work on discrete generative models -- probabilistic programming languages are still wrestling with (or simply ignoring!) the problem of "disintegration", where conditioning changes the base measure because it collapses the dimensionality of the probability manifold (similar to the issue Arjovsky identified with GANs). See e.g. http://homes.sice.indiana.edu/ccshan/rational/disint2arg.pdf and https://probprog.cc/assets/posters/thu/78.pdf. To me (although this is probably too radical a move to be palatable to most people), we should largely abandon continuous distributions and build out discrete probability spaces and methods with the same vigor that continuous probability received in the form of measure theory. These probability trees seem like a natural data structure to begin with. I'd also like to see representations for working with probabilities on discrete manifolds that approximate continuous space in computationally efficient ways -- there may be work in this direction already, but I'm not aware of it.
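To make the disintegration problem concrete (this is my own toy illustration, not taken from the linked papers): the naive operational reading of conditioning -- rejection sampling -- degenerates as the conditioning event shrinks toward a measure-zero set, because the acceptance rate goes to zero. Conditioning on an exact value of a continuous variable therefore can't be defined as a limit of "keep the samples that match" without choosing how the event shrinks, which is exactly where the base measure changes out from under you.

```python
import random

def acceptance_rate(eps, n=100_000, seed=0):
    """Fraction of Uniform(0,1) samples falling in the event
    |X - 0.5| < eps. For this distribution the true rate is ~2*eps,
    so it vanishes as eps -> 0: rejection-based conditioning on the
    exact event X = 0.5 would never accept a sample."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if abs(rng.random() - 0.5) < eps)
    return hits / n
```

On a discrete probability space every event has positive mass (or is genuinely impossible), so this failure mode simply doesn't arise -- which is part of why I find the discrete direction appealing.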
Also, a fun implication of these probability trees I'd like to see explored is structural sharing: you need not copy an entire tree to represent the result of a conditioning or intervention. In general you need to copy only the nodes lying above the cut set, much as immutable data structures like HAMTs represent modified hash maps with persistent space efficiency by reusing unchanged nodes. If one expects to condition frequently on certain variables, it would then be useful to hoist their cut sets as high in the tree as possible -- does such a 'transpose' operation exist? I admit I did not read the paper thoroughly enough to know whether this was mentioned.
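Here's a rough sketch of what I mean, under a hypothetical encoding of my own (the `Node`/`Leaf` representation and `condition` function below are not from the paper): immutable nodes, where conditioning rebuilds only the ancestors of the cut set and reuses every subtree hanging below a surviving branch by reference.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical minimal probability-tree encoding: an internal node
# carries a variable name and (value, probability, subtree) branches.

@dataclass(frozen=True)
class Leaf:
    pass

@dataclass(frozen=True)
class Node:
    var: str
    branches: Tuple  # of (value, prob, child) triples

def condition(tree, var, value):
    """Condition a subtree on var == value.
    Returns (new_subtree, event_mass). Only nodes above the cut set
    (the var-nodes) are rebuilt; subtrees below a surviving branch
    are reused by reference, HAMT-style."""
    if isinstance(tree, Leaf):
        return tree, 1.0
    if tree.var == var:
        kept = [(v, p, c) for (v, p, c) in tree.branches if v == value]
        mass = sum(p for (_, p, _) in kept)
        # the surviving child c is shared, not copied
        return Node(tree.var, tuple((v, 1.0, c) for (v, _, c) in kept)), mass
    rec = [(v, p, *condition(c, var, value)) for (v, p, c) in tree.branches]
    mass = sum(p * w for (_, p, _, w) in rec)
    if mass == 0:
        raise ValueError("conditioning on a zero-probability event")
    # renormalize branch probabilities by the total event mass (Bayes)
    branches = tuple((v, p * w / mass, c) for (v, p, c, w) in rec)
    if all(nb[1] == ob[1] and nb[2] is ob[2]
           for nb, ob in zip(branches, tree.branches)):
        return tree, mass  # nothing changed below: share this node too
    return Node(tree.var, branches), mass
```

For example, with A at the root and B below it, conditioning on B only copies the root and the B-nodes; the leaves (and anything deeper) are the very same objects as in the original tree. An intervention (do-operation) would be the same sharing pattern minus the upward renormalization.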