The fun thing is that this sounds like "just" a more type-safe version of the way DSLs are usually implemented in imperative languages, particularly scripting languages.
In Python, if I have:
def foo(x):
return x**2 - 1
I can plug in a concrete argument:
>>> foo(2)
3
But I can also plug in abstract values provided by some library where all operators are overloaded to return another AST node. For instance:
And then I can perform abstract operations on the resulting AST node, like solving for x:
>>> solve(expr == 0)
[x = -1]
Of course this is completely non-type-safe. And without purity, there's no guarantee that `foo(Int('x'))` is truly equivalent to running foo on an abstract value `x`. If there's an `if` statement in there, the condition will either pass or not, and the returned AST node will only include that control flow path. The DSL in question tries to block that by preventing AST nodes from being converted to boolean, but it's not perfect. Haskell has an advantage there.
On the other hand, for the situation at the end of the post, where the author wants to model imperative computations, it seems like an imperative language would make for easier embedding. In an imperative language, each AST node can be given its own unique ID when it's constructed. So there's no trouble distinguishing between, e.g., calling some function bar() and then using the return value twice, versus calling bar() twice. No need to use a separate <- operator for assignments.
But to be fair, explicitly distinguishing side effects is kind of the whole point of Haskell!
(And if you wanted badly enough to do it in Haskell without requiring <-, you could use unsafePerformIO to assign AST node IDs.)
I'm surprised that you say an imperative language would give you an _easier_ embedding than using Haskell. I would have said the reverse: Haskell makes it impossible to confuse an imperative command used during AST assembly with a representation of an imperative command. That's one of the payoffs of using monads.
> I'm surprised that you say an imperative language would give you an _easier_ embedding than using Haskell.
I don't think your parent says this. The three comparisons I found were:
> The fun thing is that this sounds like "just" a more type-safe version of the way DSLs are usually implemented in imperative languages, particularly scripting languages.
This sounds like saying that this trick is already common in imperative languages, not necessarily easier.
> The DSL in question tries to block that by preventing AST nodes from being converted to boolean, but it's not perfect. Haskell has an advantage there.
This explicitly gives the win to Haskell.
> So there's no trouble distinguishing between, e.g., calling some function bar() and then using the return value twice, versus calling bar() twice. No need to use a separate <- operator for assignments.
This one seems to give the win to imperative languages, but, as far as I can tell, just on the basis of syntax.
I didn't mean to imply that you couldn't do the same in Haskell without the extra type-checking tricks in the article.
I guess my main point is that the actual runtime semantics – like, if you take away the type checking, which functions are being called on what – are no different from what you'd see in Python.
That differs from a lot of other Haskell techniques, where a direct port to Python would include tons of nested lambdas (which are relatively awkward in Python, and indeed in most imperative languages), and would also be fairly error-prone without the type checker to keep things straight. Even relatively simple uses of do notation fall into that category.
You can directly port the Python technique you mentioned without any extra closures.
(I mean not more than the normal amount of closures you'd get from straight-forward code that just calculates the answer as a number, instead of giving you the AST.)
In Python you get run-time typing and dispatch. In Haskell you get static typing [and monomorphisation]. GP means that static typing is safer and faster.
The linear functions/SMC work is really cool, but I am surprised to see the author calling it more mature than the compiling to CCCs work. The linear-smc library hasn't had an upload in a while, and it's currently missing any haddocks beyond the extracted type signatures. It's also a shame that it had to build its own typeclass for monoidal categories instead of using the one in https://hackage.haskell.org/package/categories .
I think the linear-smc work is more exciting and I hope it matures with a bit more elbow grease. There was an ICFP talk that accompanied the linked paper ( https://www.youtube.com/watch?v=90OJz0QE4qE ) and monoidal categories can model lots of "boxes and wires" things. Linear functions potentially give you a much more ergonomic DSL for building up those boxes-and-wires models, and in a way that lets you write abstractions that work over any model.
> The linear functions/SMC work is really cool, but I am surprised to see the author calling it more mature than the compiling to CCCs work.
TFA doesn't call that more mature, just more workable and a thinner abstraction that presumably compiles to more efficient code than the proc-based alternative.
Calling the proc stuff not mature might be totally fair when you consider that TFA got almost everything they wanted w/o it and the interface they got is more natural and the compiled code more efficient! Or it might just be fair to say that it is mature and just the wrong tool for TFA's task. I wouldn't know which is the case -- I'm not at the level of making my own categories.
What a delightful blog post. One need not fully understand all the prologue bits to understand the interface and implementation produced.
This technique could be very useful in developing other DSLs on Haskell. Basically the author found a simple monadic-looking mapping of Haskell functions onto a different Category where you can do interesting things with those functions. In TFA's case the idea is to write flow diagrams where "boxes" embed [user-provided] Haskell functions, then the machinery can do whatever might be expected for diagrams, like: render visual representations of them, evaluate them, analyze them, etc. Because the boxes embed Haskell functions the flow diagrams can do useful work, so they're more than just documentation, they can be code and documentation. Basically this is a flow diagram programming language as a DSL embedded in Haskell.
Imagine using this technique to build things like networking stacks, say. Or emulators of various sorts (CPUs/ISAs, say). Or maybe one could use this to make something like a PLC in Haskell. It's really quite a neat and powerful idea, especially if it really does compile to efficient code.
Kinda tangential (since the linear vs Cartesian distinction wasn't made a big deal of here) but I liked the clarifying discussion (to layman me) of how "linear" and "affine" are related to the usage ones learns in uh middle school?
In Python, if I have:
I can plug in a concrete argument: But I can also plug in abstract values provided by some library where all operators are overloaded to return another AST node. For instance: And then I can perform abstract operations on the resulting AST node, like solving for x: Of course this is completely non-type-safe. And without purity, there's no guarantee that `foo(Int('x'))` is truly equivalent to running foo on an abstract value `x`. If there's an `if` statement in there, the condition will either pass or not, and the returned AST node will only include that control flow path. The DSL in question tries to block that by preventing AST nodes from being converted to boolean, but it's not perfect. Haskell has an advantage there.On the other hand, for the situation at the end of the post, where the author wants to model imperative computations, it seems like an imperative language would make for easier embedding. In an imperative language, each AST node can be given its own unique ID when it's constructed. So there's no trouble distinguishing between, e.g., calling some function bar() and then using the return value twice, versus calling bar() twice. No need to use a separate <- operator for assignments.
But to be fair, explicitly distinguishing side effects is kind of the whole point of Haskell!
(And if you wanted badly enough to do it in Haskell without requiring <-, you could use unsafePerformIO to assign AST node IDs.)
reply