> "Foo | Bar variants are stored as ascending OCaml ints, starting from 0." I was initially worried that OCaml could only do this because they have type inference and therefore the compiler knows much more about what type every AST/IR node is.
No, they are able to do that because they have static typing and they don't have to distinguish between values of "type A = FooA | BarA" and "type B = FooB | BarB" at run-time, so they can represent FooA, FooB, 0, false, unit etc. all the same. They can't do the same with their "extensible variants" though, so those IIRC are encoded as objects.
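To picture the point about static typing, here is a small Python sketch (a model of the idea, not anything OCaml actually runs). It numbers each type's constant constructors from 0 independently, which is safe only because a type checker has already ruled out comparing a FooA with a FooB. The helper name `number_constructors` is made up for illustration.

```python
def number_constructors(constructors):
    """Assign ascending ints, starting from 0, within one type declaration."""
    return {name: index for index, name in enumerate(constructors)}


# type A = FooA | BarA  and  type B = FooB | BarB, numbered independently.
type_a = number_constructors(["FooA", "BarA"])
type_b = number_constructors(["FooB", "BarB"])

# At run time FooA and FooB are the same integer; only the static types,
# which are erased after checking, ever told them apart.
print(type_a["FooA"] == type_b["FooB"])  # True
```

Without static types this numbering is ambiguous, which is what the question below is getting at.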
Wait, how does Scrapscript distinguish between values of different types? Say,
Right now they're all like OCaml polymorphic variants. There are no static types. That's a future thing we'd like to add to the interpreter/compiler.
Your example is legal (modulo the constraint that every tag must have an associated value) and works because we store the tag either in the pointer or the tag field.
So for every variant case, you generate a unique integer tag, different from the tag of any other variant case ever compiled, past or future? OCaml, for example, tries to do exactly that by hashing the names of the cases in polymorphic variants, and in case of a hash collision it just gives up [0].
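A rough Python sketch of the "hash the name, give up on collision" scheme described here. Because the hash depends only on the name, separately compiled units agree on a tag without coordinating. The multiplier 223 and the 31-bit width are illustrative choices, not a claim about OCaml's exact hash function, and `hash_tag` / `assign_hashed_tags` are hypothetical names.

```python
def hash_tag(name, bits=31):
    """Toy multiplicative string hash truncated to `bits` bits."""
    acc = 0
    for byte in name.encode("utf-8"):
        acc = (acc * 223 + byte) % (1 << bits)
    return acc


def assign_hashed_tags(names, bits=31):
    """Map each name to its hash; give up if two distinct names collide."""
    seen = {}  # hash value -> name that produced it
    for name in names:
        h = hash_tag(name, bits)
        other = seen.get(h)
        if other is not None and other != name:
            raise ValueError(f"tags {other!r} and {name!r} collide on {h}")
        seen[h] = name
    return {name: h for h, name in seen.items()}


tags = assign_hashed_tags(["Foo", "Bar"])
```

Shrinking `bits` makes collisions easy to trigger: with `bits=2` there are only four possible tags, so any five distinct names must hit the error path.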
That's correct. No hashes; just string equality on the tag name. Everything gets collapsed to an integer. I'll have to read about why OCaml hashes things. Maybe separate compilation?
Edit: oh, it's unique per compilation and isn't meant to be shared around. The variants are shared by string name in flat scraps (small serialized ASTs).
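A minimal Python sketch of that idea, assuming a per-compilation table that hands out the next free integer the first time it sees a tag name. `TagTable` and `tag_id` are hypothetical names, not Scrapscript's actual code.

```python
class TagTable:
    """Per-compilation map from tag name to a small ascending integer."""

    def __init__(self):
        self._ids = {}

    def tag_id(self, name):
        # Lookup is by the full tag name, so two tags are the same exactly
        # when their names are equal. A new name gets the next free integer.
        if name not in self._ids:
            self._ids[name] = len(self._ids)
        return self._ids[name]


# Two separate compilations that meet the same tags in a different order.
first = TagTable()
second = TagTable()
first_ids = [first.tag_id(n) for n in ["foo", "bar", "foo"]]
second_ids = [second.tag_id(n) for n in ["bar", "foo"]]
print(first_ids)   # [0, 1, 0]
print(second_ids)  # [0, 1]
```

The same name ends up with different integers in the two tables, which is why the integers only mean something within one compilation and the names are what travel.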
Yes, separate compilation, and since your language, as I understand it, will support receiving and using scraps from wherever, you'll have to deal with it too unless you insist on strictly source-code sharing only.
You see, those are eternal (well, perennial) problems with distributed code management, and every language/computing platform has to deal with them somehow. That's why I'm interested in what your approach is, especially since you aim to "solve the software sharability problem".
For the (slow) interpreter, it's just a string. For the compiler, we may end up just using a string too: intern it, then compare tags by pointer identity. This is especially cheap because we have small/in-pointer/immediate strings.
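For anyone unfamiliar with interning, here is a short Python illustration using the standard `sys.intern`. It only shows the general technique (one canonical object per distinct string, then identity comparison). It does not model the compiler's immediate strings, and the tag name `some_tag` is made up.

```python
import sys

# Build the same tag name twice at run time, the situation interning is
# meant to handle: equal contents produced by separate computations.
left = "".join(["some", "_tag"])
right = "".join(["some_", "tag"])

# sys.intern returns one canonical object per distinct string value.
left_interned = sys.intern(left)
right_interned = sys.intern(right)

# After interning, tag equality is an identity (pointer) comparison
# instead of a character-by-character one.
print(left_interned is right_interned)  # True
```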
Is that legal? If yes, how does it work?