Judging from the article, if you parsed "if 3", you'd get back out
(prefix if 3)
instead of an error, since `if` is just a regular prefix operator. So, presumably there's some sort of well-formedness checking that goes on after parsing? Does anyone know more? The website says very little.
True, 'if 3' will stay there if there is no rewrite for it. You can also use treat a lack of possible rewrite as an error, this is what the Tao 3D document description language does. You'd get an error like "No form matches 'if 3'".
Underneath, there is a rewrite operator, ->, that means to transform the shape on the left into the shape on the right.
The precise semantics are a tiny bit more complicated than that, but it iterates until there is no more transformation that applies. Some specific rewrites will be transformed into "opcodes" of the underlying machine model. So A+B with integers on each side will execute an addition (the inner representation of the transform is something like A+B -> opcode Add).
Yeah that puzzles me a bit too. Multifix operators are not more complicated than prefix/postfix/infix (well, only marginally). For instance, you can give each operator a left priority and a right priority and merge them when they meet with the same priority. Then `if x then y else z` would become (if/then/else x y z) or something like that. I think that's saner and easier to handle than what he does.
In XL, a multifix is a combination of infix, prefix or postfix. Note that the various nodes exist only for the programmer's convenience, i.e. they represent a user-visible syntax. Internally, I could do with only a node with two children.
Something like [A,B,C,D] writes as:
Block with [] containing
Infix , containing
Symbol A
Infix , containing
Symbol B
Infix , containing
Symbol C
Symbol D
So this allows me to represent multifix along with their expected syntax. I could also represent for example:
[A, B; C, D]
if A then B else C unless D
for A in B..C loop D
The last one is actually one of the for loop forms in XL
Multifix is more general than a combination of infix, prefix and postfix, though. For instance, a [] block is a multifix operator, but you can't implement it as a combination.
The advantage of supporting multifix directly is that you get a more convenient and less confusing representation. For instance, given "if A then B else C" why should I expect (else (then (if A) B) C) and not (if (then A (else B C))? It's not obvious, and both are unwieldy compared to a ternary operator like (if/then/else A B C).
Multifix is not more general than combinations of the 8 node types in XL, only more powerful than a combination of some subset.
It's not simpler. XL started with a multifix representation, see http://mozart-dev.sourceforge.net/notes.html. Switching to the current representation was a MAJOR simplification.
The current representation captures the way humans parse the code. Infix captures "A+B" or "A and B". Prefix captures "+3" or "sin x". Postfix captures "3!" or "3km". Block captures "[A]", "(A)", "{A}" or indentation. Since humans perceive a difference, you need to record that difference somewhere. XL records that structure in the parse tree itself, not on side data structures such as grammar tables.
This approach also enables multiple tree shapes that overlap. Consider the following XL program (I replaced asterisks with slashes because the asterisk means "italics" for HN):
A/B+C -> multiply_and_add A, B, C
A+B/C -> multiply_and_add B, C, A
A+B -> add A, B
A*B -> mul A, B
In that case, I can match a multifix operator like multiply_and_add without needing a representation that would exclude matching A+B or A/B in other scenarios. This is especially important if you have type constraints on the arguments, e.g.:
A:matrix/B:matrix+C:matrix -> ...
A:matrix/B:real+C:real -> ...
Those would be checked against X/Y+Z in the code, but if X, Y and Z are real numbers, they would not match.
I have gone the other way with the languages I designed, from the system you describe to multifix, and have had the opposite experience. A proper implementation of general operator precedence grammars is much simpler than implementing prefix, postfix, infix and blocks specifically.
Now, I'm not sure how you implemented multifix, but a good general implementation for it is not immediately obvious, so I suspect you simply used an algorithm that's more convoluted than necessary and made you overestimate the scheme's actual complexity. Implemented properly, though, it's ridiculously simple. To demonstrate, I have hacked together a JavaScript version here (sorry, I didn't get around to commenting the code yet):
The core of the parser is the oparse function, which clocks in at 32 lines of code. Along with a basic tokenizer, order function, evaluator and some fancy display code, the whole thing is barely over 200 LOC. I dug around for XL's source and from what I can tell, what you have is not simpler than this.
I also never suggested making multifix operators like multiply_and_add. Multifix is for operators which are inherently ternary, quaternary and so forth. For instance, when I see "a ? b : c" I don't parse it as prefix, postfix or binary infix, I parse it as ternary.