Around that time I'd noticed that the stringly-typed languages of shell, awk, and make have no garbage collection. They're value-oriented, not reference-oriented. The goal of Oil was basically to "upgrade" shell into something more like Python, and on the surface that looks straightforward, but the reference model vs. the value model makes it pretty different.
I've gone back and forth on that... whether I should just slap GC into a shell, or whether I should try to preserve the value model while making it less stringly-typed and more expressive.
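To make the value vs. reference distinction concrete, here is a minimal Python sketch. It is only an illustration, not Oil code, and both function names are hypothetical. The second function imitates a value model by copying explicitly, which shell-style languages do implicitly.

```python
import copy

def append_reference_model(items, x):
    # Reference model: the caller's list is mutated through the shared reference.
    items.append(x)
    return items

def append_value_model(items, x):
    # Value model (imitated): work on a copy, so the caller's value is untouched.
    new_items = copy.deepcopy(items)
    new_items.append(x)
    return new_items

a = ["ls", "-l"]
b = append_reference_model(a, "/tmp")   # a and b are the same object; a changed too

c = ["ls", "-l"]
d = append_value_model(c, "/tmp")       # c is unchanged; d is an independent value
```

With pure values nothing is shared, so no cycles can form, and a value can be freed as soon as the variable holding it goes away. That is roughly why such languages get by without a tracing collector.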
I have a more ambitious idea for a language where every value is essentially a pair (Go-like string slice, structured data), but I probably won't get to that in the near term. For a concrete example, I noticed from writing some HTML processors that both DOM and SAX APIs have a bunch of flaws for processing semi-structured text.
Also, on the other point, let's consider these two possibilities about Chez / Dybvig:
1) Dybvig is an average programmer who got "superpowers" from Lisp, i.e. got a 10x boost in productivity.
2) Dybvig and Chez are an outlier, e.g. like Fabrice Bellard and his recent QuickJS (and other projects), or Lattner and LLVM/Clang/Swift. The salient point is that these projects have nothing to do with Lisp.
It's obviously not an either/or thing, but I'd say the answer is closer to #2. The complexity is from the problem domain, and many major software projects are minor miracles that only a few people are qualified for, regardless of language.
As I understand it, shell, awk, and Tcl are more or less linear languages, except that copying and destruction are implicit. It's fairly straightforward to make a linear language that includes more expressive types than just strings; Tcl 8 even did it without breaking backward compatibility. Linearity is pretty incompatible with the OO worldview in which objects have identity and mutable state, but to the extent that you provide Clojure-like FP-persistent data structures whose operations merely return new states, maybe the OO worldview can just go suck it. Rust takes a different tack in which copying is optionally explicit (though destruction is still implicit: "affine typing" rather than "linear typing") and you can use "borrowed references" to avoid the headache of explicitly returning all the argument objects you didn't destruct.
A nice thing about variations like Rust's and Clojure's is that you can preserve the expressiveness of the Lisp object-graph memory model without all of its drawbacks: no aliasing bugs, no circular references complicating the life of the GC, and in Rust's case, no garbage collector at all.
> I have a more ambitious idea for a language where every value is essentially a pair (Go-like string slice, structured data), but I probably won't get to that in the near term. For a concrete example, I noticed from writing some HTML processors that both DOM and SAX APIs have a bunch of flaws for processing semi-structured text
Right, text markup isn't tree-structured, and I think HTML5 actually prescribes fixups for <i>some <b>incorrectly</i> nested</b> markup. The GNU Emacs/XEmacs schism was largely about how to handle this problem for text editing. The XEmacs side added "extents" to which you could attach structured data (such as a font or a link destination), which are more or less the type of values you're describing. The GNU Emacs side instead added "text properties", which conceptually applied independently to each of the characters of buffers or strings, but under the covers were of course optimized with an extent-like representation. Neither side of the schism considered replacing text buffers with S-expression-like trees like the DOM; that was Microsoft's fault :)
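A minimal Python sketch of the extent idea, to show why overlapping markup is no problem once you stop insisting on a tree. The `Extent` class and `props_at` function are hypothetical names for illustration; they are not the XEmacs or GNU Emacs APIs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Extent:
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    prop: str    # attached data, e.g. "italic" or "bold"

text = "some incorrectly nested markup"

# The two ranges overlap without nesting, which a tree cannot express directly.
extents = [
    Extent(0, 16, "italic"),   # covers "some incorrectly"
    Extent(5, 23, "bold"),     # covers "incorrectly nested"
]

def props_at(extents, pos):
    """Text-property view: the set of properties on a single character."""
    return {e.prop for e in extents if e.start <= pos < e.end}
```

Here `props_at(extents, 10)` yields both properties, while positions 2 and 20 yield only one each. The per-character view and the extent view describe the same information; they differ in what is stored.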
Are you thinking that when you concatenate a couple of such string slices, the operation will be O(1) because it uses ropes behind the scenes? Or are you thinking of not having concatenation at all as a primitive operation, instead using an unboundedly-growing scratch buffer (not, itself, a value) that you can append them both to and then take slices of? Is the text in the slices mutable, and if so, can it also grow and shrink? I think there's a large design space of interesting things you could do.
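The first two options can be sketched in a few lines of Python. This is only an illustration of the design space, with hypothetical class names, not a proposal for how it should be done.

```python
class Leaf:
    """A rope leaf holding an actual piece of text."""
    def __init__(self, s):
        self.s = s
        self.length = len(s)

    def pieces(self):
        yield self.s

class Concat:
    """A rope interior node: records its two children, copies nothing."""
    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.length = left.length + right.length

    def pieces(self):
        yield from self.left.pieces()
        yield from self.right.pieces()

def concat(a, b):
    return Concat(a, b)              # O(1): no characters are copied

def flatten(rope):
    return "".join(rope.pieces())    # O(n): the copy is paid once, here

class ScratchBuffer:
    """Option two: an append-only buffer; slices are (start, end) offsets."""
    def __init__(self):
        self.data = bytearray()

    def append(self, b):
        start = len(self.data)
        self.data += b
        return (start, len(self.data))

    def slice(self, start, end):
        return bytes(self.data[start:end])

rope = concat(concat(Leaf("foo"), Leaf("bar")), Leaf("baz"))

buf = ScratchBuffer()
s1 = buf.append(b"foo")
s2 = buf.append(b"bar")
joined = buf.slice(s1[0], s2[1])     # adjacent appends form one contiguous slice
```

The rope keeps concatenation as a cheap primitive at the cost of slower indexing. The scratch buffer has no concatenation primitive at all, but anything appended back to back can be sliced as a unit.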
About Dybvig and Chez, sure, Kent Dybvig is a wizard. But so is Guido van Rossum, whatever embarrassing errors he may have made regarding first-class blocks and tail-call optimization in Python. I haven't asked Dybvig, but I don't think he's a Bellard-class wizard, so I don't think that's the answer.
Or are you saying that Python's problem domain is much harder than Chez Scheme's domain?
By the way, my apologies for having delayed so long in responding to your thoughtful notes.
A tentative slogan is "bijections that aren't really bijections", i.e. to describe the "fallacy" of (bytes on disk -> data structure in memory -> bytes).
Hence the (slice, data structure) representation.
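A small Python sketch of both halves of that: first a round trip that is not the identity, then a value that carries its source slice along with the parsed data. `Spanned` and `parse_ints` are hypothetical names made up for this example.

```python
import json
import re
from dataclasses import dataclass

# The "bijection" that isn't: decoding and re-encoding changes the bytes.
original = '{"a":  1.0e0}'
roundtrip = json.dumps(json.loads(original))

@dataclass(frozen=True)
class Spanned:
    src: str        # the whole source text (shared, never copied)
    start: int      # slice start offset into src
    end: int        # slice end offset into src (exclusive)
    value: object   # the structured data parsed from that slice

    @property
    def text(self):
        return self.src[self.start:self.end]

def parse_ints(src):
    """Parse every run of digits, keeping the exact source span of each."""
    return [Spanned(src, m.start(), m.end(), int(m.group()))
            for m in re.finditer(r"\d+", src)]

nums = parse_ints("width=007 height=42")
```

`nums[0].value` is the integer 7, but `nums[0].text` is still `007`, so the original bytes can be reproduced exactly even though the structured value has forgotten the leading zeros.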
That's the correctness aspect. I also think there's a big performance aspect, e.g. it appears to me that the performance of many/most parsers is dominated by object allocation.
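One way to picture the allocation point is a token table stored in parallel arrays rather than as one object per token. The Python sketch below is an assumption-laden illustration (the `FlatTokens` class and the token kinds are invented here), and Python's `re` still allocates match objects while scanning, so it shows only the storage layout, not a speedup.

```python
import re
from array import array

# Three alternatives, one capture group each: integer, name, any other symbol.
TOKEN_RE = re.compile(r"(\d+)|([A-Za-z_]\w*)|(\S)")
KIND_INT, KIND_NAME, KIND_OP = 0, 1, 2

class FlatTokens:
    """Tokens stored as three packed arrays instead of one object per token."""
    def __init__(self, src):
        self.src = src
        self.kinds = array("b")    # token kind, one small int per token
        self.starts = array("i")   # slice start offsets into src
        self.ends = array("i")     # slice end offsets into src
        for m in TOKEN_RE.finditer(src):
            self.kinds.append(m.lastindex - 1)   # group number maps to kind
            self.starts.append(m.start())
            self.ends.append(m.end())

    def __len__(self):
        return len(self.kinds)

    def text(self, i):
        # The token text is recovered on demand as a slice of the source.
        return self.src[self.starts[i]:self.ends[i]]

toks = FlatTokens("x = 42")
```

Each token is just a row index into the arrays, and its text is a slice of the original source, which ties back to the (slice, data structure) idea.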
Some vaguely related stuff here: https://github.com/oilshell/oil/wiki/Compact-AST-Representat...
Anyway, I don't think this will make it into Oil any time in the next year, but I think there is room for it in programming languages, and lots of other people are barking up that tree (e.g. FlatBuffers relate to a big part of it). I would be happy to continue batting around ideas in a non-nested HN thread :) So as mentioned elsewhere, feel free to send me a mail or ping me on Zulip!
I'm looking at Dercuano but I'm confused why it's a tarball and not a live website :)
So you can keep reading it after I'm dead.