> I do believe that a better code editor is possible through non-plaintext programming. (Serialize token trees and ASTs, instead of plaintext.)

Funnily enough, this is itself another of those mirages, like visual programming, that have been chased for years and failed to gain traction (outside specialised applications).

For my money, the reason for this is that a human editing code needs to write something invalid - on your way from Valid Program A to Valid Program B, you will temporarily write Invalid Jumble Of Bytes X. If your editor tries to prevent you writing invalid jumbles of bytes, you will be fighting it constantly.

The only languages with widely-used AST-based editing are the Lisp family (with paredit). They get away with this because:

1. Lisp 'syntax' is so low-level that it doesn't constrain your (invalid) intermediate states much (i.e. you can still write a (let) or (cond) with the wrong number of arguments while you're thinking).

2. Paredit modes always have an "escape hatch" for editing text directly (e.g. you can usually highlight and delete an unbalanced parenthesis). You don't need it often (see #1) - but when you need it, you really need it.




I've been thinking along the same lines. The "fighting your editor" problem cannot be ignored, and it's common in visual programming tools. Real programming code needs "jank". We need to be able to move between different states:

Bags of characters <-> Unstructured trees of tokens <-> ASTs

(BTW paredit is super cool and I'd like to see more of its kind!)
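
To make the three states concrete, here is a rough sketch for a toy Lisp-like syntax. Everything in it (`tokenize`, `to_tree`, `is_valid_let`, and the rule that a `let` has exactly a binding list and a body) is made up for illustration, not taken from paredit or any real editor. The point is only that each step up can refuse invalid input while the step below still holds it losslessly.

```python
import re

# Every character is whitespace, a paren, or part of an atom, so
# joining the tokens always reproduces the original text exactly.
TOKEN = re.compile(r"\s+|[()]|[^\s()]+")


def tokenize(text):
    """Bag of characters -> flat token list (whitespace kept)."""
    return TOKEN.findall(text)


def untokenize(tokens):
    """Flat token list -> bag of characters."""
    return "".join(tokens)


def to_tree(tokens):
    """Tokens -> unstructured nested lists, or None if parens don't balance."""
    root = []
    stack = [root]
    for tok in tokens:
        if tok.isspace():
            continue
        if tok == "(":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok == ")":
            if len(stack) == 1:
                return None  # stray closing paren
            stack.pop()
        else:
            stack[-1].append(tok)
    return root if len(stack) == 1 else None


def is_valid_let(node):
    """AST-level check (toy rule): (let <bindings-list> <body>)."""
    return (
        isinstance(node, list)
        and len(node) == 3
        and node[0] == "let"
        and isinstance(node[1], list)
    )


half_typed = "(let ((x 1)) (f x"   # unbalanced: only valid as text/tokens
no_body = "(let ((x 1)))"          # balanced tree, but not a valid let
complete = "(let ((x 1)) x)"       # valid at all three levels
```

Here `half_typed` survives the text <-> tokens round trip but `to_tree` returns None for it, `no_body` becomes a tree that `is_valid_let` rejects, and `complete` passes everywhere. An editor would need to let you sit at whichever level your half-finished edit currently fits.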


Thing is, the architecture you just described is a modern IDE with refactoring etc.

This is a well-trodden road. It starts with "wouldn't it be awesome if we could manipulate everything as ASTs", then usability intervenes and we fall back to "well, we need to be able to selectively edit as text", which means you need to be able to convert everything to bags of characters and back. And now you've built that conversion, you might as well represent the "source of truth" as bags of characters like everyone else does.


I agree with what you're saying, except the last part. Using ASTs as the source of truth (embedded in a source control forest) has benefits that are worth the difficulty.


What's the benefit? If your tooling already requires lossless round-trips to and from text, why invent a funky storage format that doesn't interoperate and can't be fixed with a text editor when it all blows up? You already have a perfect serialisation of the AST, in a format every other tool understands - that is, source code.

(And heaven forbid you should want to make a checkpoint commit that doesn't parse...)


Some potential use cases:

- Using Merkle trees, we can assign every node a hash-based ID. So now we can refer to other nodes by ID. This lets us store a graph whose vertices are all the tree nodes.

- With the graph, we can now reference bindings not by string literal, but by ID. This eliminates shadowing problems and missing imports.

- There is now one source of truth for the names of variables and functions. As a result, in source control, a commit that renames something is a one-line change.
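
A minimal sketch of how those three points could fit together. It rests on one big assumption that the list above doesn't state: display names are kept out of the hash and stored in a separate table, since otherwise renaming a definition would change its ID and every reference to it. `Store`, `node_id` and the node kinds (`lit`, `def`, `ref`, `add`) are hypothetical names for illustration only.

```python
import hashlib


def node_id(kind, children=(), literal=""):
    """Merkle-style ID: hash of the node kind, its literal, and its child IDs."""
    h = hashlib.sha256()
    h.update(kind.encode())
    h.update(b"\0")
    h.update(literal.encode())
    for child in children:
        h.update(b"\0")
        h.update(child.encode())
    return h.hexdigest()[:12]  # shortened for readability


class Store:
    def __init__(self):
        self.nodes = {}  # id -> (kind, literal, child ids)
        self.names = {}  # id -> display name (assumed: not part of the hash)

    def add(self, kind, children=(), literal=""):
        nid = node_id(kind, children, literal)
        self.nodes[nid] = (kind, literal, tuple(children))
        return nid

    def render(self, nid):
        """Turn a node back into text, looking names up by ID."""
        kind, literal, children = self.nodes[nid]
        if kind == "lit":
            return literal
        if kind == "ref":  # points at a definition by ID, not by string
            return self.names[children[0]]
        if kind == "def":
            return f"{self.names[nid]} = {self.render(children[0])}"
        if kind == "add":
            return f"{self.render(children[0])} + {self.render(children[1])}"
        raise ValueError(f"unknown node kind: {kind}")


store = Store()
one = store.add("lit", literal="1")
two = store.add("lit", literal="2")
definition = store.add("def", [one])
store.names[definition] = "x"
use = store.add("add", [store.add("ref", [definition]), two])
```

With this layout, `store.names[definition] = "count"` is the whole rename: `store.nodes` is untouched and `store.render(use)` picks up the new name. One catch of hashing without names is that two definitions with identical bodies get the same ID, so a real design would need some way to tell them apart.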



