Treefrog: A code editor that uses both AST and text editing commands (treefrog-editor.com)
114 points by gushogg-blake 5 months ago | 78 comments

Backstory - I started experimenting with AST editing ideas after getting frustrated with my editor's lack of understanding of the structure and meaning of code, i.e. the editing primitives all dealt with text, and simple changes to the code often required repetitive manual text edits that had nothing to do with the meaning of the change. I went through a couple of prototypes and landed on the idea of augmenting a standard editing interface with a new mode that works on a simplified representation of the structure, which basically follows indentation levels - blocks of code like functions, loops, etc. can be moved and manipulated as units, and there are specific editing commands for the different elements.

The main idea is to design all of the editing and navigation commands around the thought processes (instead of designing around a data structure, which I think is where traditional editors have gone wrong), so that it feels natural and intuitive to use - like something that's been designed for what editing code actually involves, as opposed to a text editor that's had lots of code intelligence stuff added on top of it.

I wrote a tree editor for programs for my Master's thesis, in 1980. You can get it at https://www.cs.toronto.edu/~radford/MSc-thesis.abstract.html

The editor operates on the AST down to the lowest level, with careful design of keyboard input and cursor feedback to make that not too painful for things like infix operators. The language is described by a grammar, and there is an AST macro language.

My conclusion afterwards was that the shift at that time away from keyboard input towards editing with a mouse was negative for editing in terms of the AST (rather than text), since with AST editing there's a less direct connection between seeing and pointing and what you're editing. I got interested in other things.

But perhaps that was a premature conclusion.

What do you think of TeXmacs, which uses a WYSIWYG tree editor for writing scientific/math documents?

(TeXmacs is not based on TeX nor emacs but is inspired by both.)

Math notation is hopelessly stuck in the 19th century and needs an update to the 20th. Polish notation with operators of known arity is an excellent way to ease natural scientists/mathematicians into a modern tree based grammar for mathematics without a soup of parentheses, e.g. / 1 / 2 3 vs / / 1 2 3

A full s-expression based grammar for mathematics is unfortunately not something that you can write on paper and requires white space delineation to be readable. On a screen with something like emacs paredit both of those are a breeze however. The fact that higher order functions don't need a special notation is really freeing.

What value would Polish notation add over (largely) infix for the typical mathematician? Why do mathematicians need to be told what notation to use by non-mathematicians when their notations (note the plural) are in wide, and apparently successful, use today?

How would Polish notation help, for instance, with things like describing set membership or logical relations? How is this more readable (using ASCII):

  -> a b
Than the current approach:

  a -> b
Which provides a conventional sense of the relationship that the former does not? The second one can be "read" in order: a implies b. The first cannot.

>Why do mathematicians need to be told what notation to use by non-mathematicians when their notations (note the plural) are in wide, and apparently successful, use today?

The same reason why geometers needed to be told to start using Arabic numerals by algebraists: mathematics is asking new questions and the old tools aren't good enough. That we even need to make a notational distinction between operators like addition and operators like high order partial derivatives should show that clearly enough.

>How would Polish notation help, for instance, with things like describing set membership or logical relations? How is this more readable (using ASCII):

Start working on longer expressions and the lack of parentheses becomes a godsend.

-> -> a b c and -> a -> b c can each be parsed in only one way. By comparison, with infix notation you need extra parentheses to show how you want a -> b -> c to be parsed: (a -> b) -> c or a -> (b -> c). Or you need to memorize an arbitrary number of rules, which increase exponentially with each new operator you add.

> Which provides a convention sense of the relationship that the former does not? The second one can be "read" in order: a implies b. The first cannot.

In English and only for binary operators, and even then you can just as easily understand "add 5 and 6" as you can "5 plus 6". How would you use infix notation to express a ternary or higher operator? The notation for the definite integral needs a super and sub script as well as a dummy variable for an operator that is fundamentally a ternary one. By comparison in prefix notation it becomes definite_integral upper_limit lower_limit function.
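To make the "only one parse" claim concrete, here is a minimal sketch of a prefix parser driven by a table of known arities. The `ARITY` table, the `Expr` type and the function names are assumptions made for illustration; nothing here comes from an existing tool.

```typescript
// Hypothetical operator table: every operator has a fixed, known arity.
const ARITY: Record<string, number> = {
  "->": 2,               // implication
  definite_integral: 3,  // upper limit, lower limit, function
};

type Expr = string | { op: string; args: Expr[] };

// Recursive-descent parse of a whitespace-separated prefix expression.
function parsePrefix(source: string): Expr {
  const tokens = source.trim().split(/\s+/);
  let pos = 0;

  function next(): Expr {
    if (pos >= tokens.length) {
      throw new Error("unexpected end of input");
    }
    const token = tokens[pos++];
    if (!Object.prototype.hasOwnProperty.call(ARITY, token)) {
      return token; // anything not in the table is an operand
    }
    const args: Expr[] = [];
    for (let i = 0; i < ARITY[token]; i++) {
      args.push(next());
    }
    return { op: token, args };
  }

  const expr = next();
  if (pos !== tokens.length) {
    throw new Error("trailing tokens after a complete expression");
  }
  return expr;
}

// Render with explicit parentheses so the grouping is visible.
function showGrouping(expr: Expr): string {
  if (typeof expr === "string") {
    return expr;
  }
  if (expr.args.length === 2) {
    return `(${showGrouping(expr.args[0])} ${expr.op} ${showGrouping(expr.args[1])})`;
  }
  return `${expr.op}(${expr.args.map(showGrouping).join(", ")})`;
}
```

With this table, `-> -> a b c` renders as `((a -> b) -> c)` and `-> a -> b c` as `(a -> (b -> c))`, with no precedence rules involved. Whether that is easier for a human reader is a separate question that the sketch does not settle.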

> The same reason why geometers needed to be told to start using Arabic numerals by algebraists. Mathematics is changing and the old tools aren't good enough to answer current questions.

At least those were other mathematicians who could understand the domain and the importance of notations for the work.

Hah, no, "-> -> a b c" is not easy to parse, except for a computer. People are not going to read it easily. It's a silly argument that prefix is fundamentally easier for humans when it disrupts the flow even in your example.

Even using the words, "implies implies a b c", does not parse well as a reader, even if it can only be parsed one way. If you go to a higher arity operation, switch to function (which is prefix, like polish) notation if it makes sense. But even then, we have things like the integral which uses physical placement to take (typically) up to 3 arguments:

  ∫ f(x) dx
With other similar operations. And if order of operations is not clear from an expression, then yes, use parentheses. That's what they have been used for for quite a while.

>Hah, no, "-> -> a b c" is not easy to parse, except for a computer. People are not going to read it easily. It's a silly argument that prefix is fundamentally easier for humans when it disrupts the flow even in your example.

The only argument I'm seeing here is that you can't parse prefix notation and won't put the effort into trying it, which is the same argument people used against Arabic numerals in the middle ages.

A prefix only language is fundamentally easier to read and write to anyone who hasn't spent two decades learning the mishmash of pre, in, post, and upside-down fix that mathematics currently uses.

>But even then, we have things like the integral which uses physical placement to take (typically) up to 3 arguments:

Using four variables. It's amazing to me people defend this type of mess as something meaningful instead of a terrible historical contingency which we should get rid of as quickly as we can. In prefix notation you can just use "integral upper lower function" to convey the same information in pure text without the need for inventing TeX and vector displays.

Try out your notation, which I will agree can be unambiguous in a technical sense, but will quickly become unreadable for actual human beings, on a few modest examples.

  = 1 + square sin x square cos x -- a trig identity

  / + - b √ - square b * * 4 a c * 2 a
      ^ oops, ambiguous, you need a new symbol for negation
Note as well, in that second example, the great distance between the division operation and the divisor. This does not lend itself to readability by people, they have to keep the entire stack of operations in their head. Ok, so you switch the notation to be more tree-like and use indentation (how I write complex Lisp math expressions):

      v still ambiguous, what alternate symbol can we use?
  / + - b
      √ - square b
           * * 4 a c
    * 2 a
Ok, somewhat clearer, but hardly "easy" to read for a person. At this point you may as well just put back the parentheses (and be back at Lisp s-exprs, which as you noted earlier is not well-suited to handwriting). The mathematical expression is clearer for people. It requires them to keep less in their head at any one point in the reading.

  (-b + √(b^2 - 4ac)) / 2a
I'm not saying this is perfect, there is room for improvement in this form, but it is going to be better for the average math literate person (by which I mean, someone who has at least covered high school algebra). Even if you taught prefix notation from the start, the user of that notation still has to keep more in their head as they try to parse it.
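The ambiguity of `-` flagged above goes away once unary negation gets its own token. The sketch below assumes a hypothetical `neg` spelling for it, plus small hand-picked operator lists, and converts the prefix form back to fully parenthesized infix so the two notations can be compared side by side.

```typescript
// Assumed token sets; `neg` is a hypothetical spelling for unary minus.
const BINARY_OPS = ["=", "+", "-", "*", "/"];
const UNARY_OPS = ["neg", "sqrt", "square", "sin", "cos"];

// Convert a whitespace-separated prefix expression to parenthesized infix.
function prefixToInfix(source: string): string {
  const tokens = source.trim().split(/\s+/);
  let pos = 0;

  function next(): string {
    if (pos >= tokens.length) {
      throw new Error("unexpected end of input");
    }
    const token = tokens[pos++];
    if (BINARY_OPS.indexOf(token) !== -1) {
      const left = next();
      const right = next();
      return `(${left} ${token} ${right})`;
    }
    if (UNARY_OPS.indexOf(token) !== -1) {
      return `${token}(${next()})`;
    }
    return token; // operand
  }

  const result = next();
  if (pos !== tokens.length) {
    throw new Error("trailing tokens after a complete expression");
  }
  return result;
}
```

Feeding it `/ + neg b sqrt - square b * * 4 a c * 2 a` gives `((neg(b) + sqrt((square(b) - ((4 * a) * c)))) / (2 * a))`. The converter has to consume the whole dividend before it reaches the divisor, which is the same bookkeeping the comment above says a human reader has to do.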

> Try out your notation, which I will agree can be unambiguous in a technical sense, but will quickly become unreadable for actual human beings, on a few modest examples.

It's perfectly legible to me and I use it daily in my (mathematical) work.

Again all you're saying is that you're not used to it and it must be a bad notation. Just like how people who were shown Arabic numerals defended Roman numerals.

I'd do you one better and point out that the example should be = 1 λ x square sin x square cos x since having unbound variables is poor form after someone invented lambda calculus.

And yes, you need a symbol for negation which is a step up since a-b is also ambiguous in the same sense.

For the second one you easily see why words are superior to symbols when you stop trying to use hieroglyphs and start using letters instead: div add neg b sqrt sub pow b 2 sub times 4 times a c times 2 a. I can define new functions with well known meaning without having to copy special characters or invent type setting systems.

So, you're a troll then, you can't even be consistent in your presentation.

+ and - are bad to use, but λ is ok to use? Words are the thing to use for "add" and "sub" and "div" (why not the whole word?), but not "equals" or "define function"? And the words you choose are strictly English, which makes your mathematical notation no longer mutually comprehensible across most of the globe? Those are all positives to you? That's how math notation will be dragged into the 20th century?

And, what, we now teach lambda calculus to elementary school kids once they start learning about variables? At least try to come up with a plausible argument.

> I use it daily in my (mathematical) work.

I find this highly unlikely, but sure, if you're not a troll, you've found a notation that (apparently) works for you (what kind of math?) and you think it's better but haven't bothered to think it through.

And you still haven't answered the question on the second example, you just made it more verbose. The reader still has to remember (with zero punctuation or other indicators in your example) that the dividend for the div is the next 14 tokens and the divisor the last 3. Because that's totally readable and comprehensible.

If you're not a troll, your idea isn't even half-baked.

Nice! One thing I thought ASTs weren't that good at was the initial writing of code - did you find that?

A big part of the design work was trying to get it to be good at that, while still being entirely tree based. But you'll have to judge yourself how successful I was.

Well, your judging that would be easier if you could use it, rather than just read about it... I still have the source code, but since it was designed to work with a vector graphics display attached to a PDP-11, getting it to work in a modern environment would take some work. We had only one vector graphics display, so there was no chance then to have a significant user community, which was another problem.

An explanation of what AST actually means would be helpful at the top of the intro page. Quite a few of the selling points were lost on me due to lack of familiarity with the term. I can (and will) google it, but if I am looking at a new product, especially one I have to buy, it'd be nice to have the resources available to understand what it offers without turning to a third party.

Yeah, I had to dredge my memory for what an AST is - ran into it ages ago on a compiler design project, but I could see how it works in a text manipulation context as well.


Interesting, there's been an [ongoing discussion to use tree-sitter with Emacs](https://archive.casouri.cat/note/2021/emacs-tree-sitter/inde...) because right now, syntax highlighting is a mix of regex and functions.

I don't know how language parsing should be exposed to make it easy to have parsing in editors. But I believe some efforts like tree-sitter are most welcome and also more in line with what we expect editors to be.

Neovim has moved in this direction too. There’s a lot of potential there.

I hacked up a plug-in yesterday to let me run tree sitter queries over my codebase. I’d upgraded a dependency and some patterns were no longer valid so I was able to track them down in a way that would never have been possible with regex etc (certain function called with more than one argument chained with one of several other functions).
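A rough sketch of the kind of structural search described above, covering only the "called with more than one argument" part. It does not use the tree-sitter API: `SyntaxNode` is a hand-built stand-in that holds named children only, and the node type names are modelled on the tree-sitter JavaScript grammar as an assumption.

```typescript
// Minimal stand-in for a syntax node; not the tree-sitter API.
type SyntaxNode = { type: string; text?: string; children: SyntaxNode[] };

// Depth-first walk that collects every node satisfying `match`.
function findAll(root: SyntaxNode, match: (n: SyntaxNode) => boolean): SyntaxNode[] {
  const found: SyntaxNode[] = [];
  const stack: SyntaxNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop() as SyntaxNode;
    if (match(node)) {
      found.push(node);
    }
    // Push in reverse so children are visited in source order.
    for (let i = node.children.length - 1; i >= 0; i--) {
      stack.push(node.children[i]);
    }
  }
  return found;
}

// First direct child with the given node type, if any.
function firstChildOfType(node: SyntaxNode, type: string): SyntaxNode | undefined {
  for (const child of node.children) {
    if (child.type === type) {
      return child;
    }
  }
  return undefined;
}

// True for a call to `calleeName` that passes more than one argument.
function isCallWithManyArgs(node: SyntaxNode, calleeName: string): boolean {
  if (node.type !== "call_expression") {
    return false;
  }
  const callee = firstChildOfType(node, "identifier");
  const args = firstChildOfType(node, "arguments");
  return (
    callee !== undefined &&
    callee.text === calleeName &&
    args !== undefined &&
    args.children.length > 1
  );
}
```

A regex would have to cope with nested parentheses, strings and line breaks to count arguments; on a tree it is a length check on one node.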

Yeah, tree-sitter and LSP are both steps in the right direction I think.

There has been a movement to try something like this under the buzzword "projectional editing". The idea was to edit a representation of the code and not the code itself. People and companies experimented with it but didn't get much adoption. Not sure why.

Perhaps you need to allow the developers to reach invalid states, because the editors I tried just refused entries which would result in a syntax violation. This is harder to implement though. :)

Yeah, that's one of the main issues with ASTs and why I think they make the same mistake as text editors ultimately - designing around a particular data structure just constrains you into weird UX scenarios that don't have anything to do with the actual problem of making something that lets you write code.

Friend of mine said he worked with a system like that. Problem with it was it couldn't save code that didn't compile. That's a defect. Am reminded of someone's comment that they would like a compiler that just stubs out code that doesn't compile. For a system to be workable you need to be able to handle that.

I feel that because no one uses AST-aware editors, no one understands what they would bring to the table. As a counterexample, I've used CAD programs, and all of them store the design files as databases internally. That means you can, and people do, perform database operations on them: generate reports and perform update operations just like you would with an SQL-type database.

This looks broken for me -- nothing loads on the homepage and I get some JS errors. I am on Firefox on macOS.

same. tried FF and Chrome on Windows, and FF on Ubuntu. Blank space where the editor should be.

Thanks - fixed

I vaguely remember someone telling me, a long time ago, that Thomas Reps' PhD thesis


showed that it wasn't practical to build this kind of editor. Perhaps that was not at all the point; it was a long time ago.

I wonder, though, can anyone explain 1) if Reps' thesis does bear on the Treefrog work 2) what it was in Reps' thesis that bears on the difficulty of making syntax-tree based code development tools?

I wonder if it was important that an actual abstract syntax tree was used? I'm realising the title might be a bit misleading - it should probably be "AST-inspired" commands, as the tree-sitter trees are technically concrete as opposed to abstract and the commands aren't tightly coupled to a particular representation.

The S,D shortcuts work for me, but I can't get J or K to do anything? Some UX feedback on what is valid at the current time may be useful. One problem with structural/projection editors is it's not as clear what is a valid operation in the current state.

There's some existing work in this area you might find interesting as well:



https://hazel.org/build/dev/ (expand the left side pane by clicking the (?) to see the valid operations)

Thanks. Just checked and j/k are actually not implemented yet, so that explains that!

Clever line of thinking.

Is it fair to think of IDEs that include language-specific refactoring capabilities as providing a developer abstraction over AST editing?

Thanks! Yeah, I think of refactorings as being a level of abstraction above text and AST editing - kind of like macros for AST edits.

Intriguing idea. This would be very interesting to try with a Lisp or Scheme like language. Does it have any support for S-exp based languages?

Speaking of... By 1978 InterLISP had grown up quite a bit and its tree-based editor was quite powerful. It supported several tools like Masterscope and DWIM (Do What I Mean), the likes of which I would not see in text-based editors for almost two decades. It was quite habitable, even though a CLI tool and tree-based. I wrote all the code for my dissertation using that editor on a DEC-20. So much better than even EMACS at the time. The simplicity of S-expressions was critical to its success, IMHO. Extending the editor was simple, too. LISP, you know.

A long long time ago on a toy lisp called Interlisp/65 I used something that was like the s-exp equivalent of a line editor.


Source is here: https://atariwiki.org/wiki/Wiki.jsp?page=LispEditor

If you really wanted a vi-like experience for editing lisp, you could figure out how to make an editor like that full-screen and interactive.

I haven't added any yet, but all parsing is done with tree-sitter so languages can be added pretty quickly. One issue is that tree mode works on whole lines (I think the benefit of AST-style navigation vs moving a text cursor diminishes as the selections you want get smaller), so it's better suited to languages where the blocks are naturally separated by newlines as opposed to the ")))))" style.

Are you using the official tree-sitter grammars as-is? Or tweaking/writing your own to support your use case? From what I've seen not all of them are suited to structural editing, since the emphasis is on syntax highlighting.

Yeah using them as is and haven't had any issues so far. I think they are at a slightly lower level of abstraction than an AST, e.g. an async method in JS is a method_definition with an "async" node followed by a property_identifier, whereas an AST might wrap all this into an AsyncMethod node - but all the info you need is there.
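As an illustration of "all the info you need is there", the sketch below lifts a concrete `method_definition` node of the shape described above into a hypothetical abstract node. `CstNode` and `MethodNode` are stand-ins invented for this example and are not part of tree-sitter or Treefrog.

```typescript
// Hand-built stand-in for a concrete syntax node (not the tree-sitter API).
type CstNode = { type: string; text: string; children: CstNode[] };

// Hypothetical abstract node, as an AST might model the same method.
type MethodNode = { kind: "Method" | "AsyncMethod"; name: string };

// Derive the abstract node from the concrete children.
function toAbstractMethod(node: CstNode): MethodNode {
  if (node.type !== "method_definition") {
    throw new Error(`expected method_definition, got ${node.type}`);
  }
  let isAsync = false;
  let name: string | undefined;
  for (const child of node.children) {
    if (child.type === "async") {
      isAsync = true;
    } else if (child.type === "property_identifier" && name === undefined) {
      name = child.text;
    }
  }
  if (name === undefined) {
    throw new Error("method_definition without a property_identifier");
  }
  return { kind: isAsync ? "AsyncMethod" : "Method", name };
}
```

The abstract form is derived entirely from the concrete children, so an editor can build on the lower-level tree and add such wrappers only where a command needs them.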

Parsing on newlines is probably not a great fit for Lisp-syntax languages. Does the availability of LSPs make a difference? Asking as I don't know your tool.

Yeah, s-exprs don't break down by lines very well, but they are easy-ish to work with in a structured format, similar to HTML/XML editors since the delineation between elements is usually very clear. The only problem with lisps would be when macros are involved where there is more fluidity in the structure. But something like this:

  (defun double-all-the-things (things)
    (mapcar (lambda (x) (* 2 x)) things))
would be straightforward to navigate through with an s-expr aware editor.

It's more that the selection operates on whole lines in Tree mode, so it would parse fine with the tree-sitter grammar (for syntax highlighting etc) but selecting the inner contents of a block would take the block's closing ) with it. This would only matter if you wanted to move the contents somewhere though - editing and refactoring commands (including with the help of LSPs) will work, with the selection being the outermost node on the first selected line.
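A small sketch of the "outermost node on the first selected line" rule, under the assumption that every node records its 0-based start and end rows. `RowNode` is a hypothetical stand-in; Treefrog's real data structures are not shown in this thread.

```typescript
// Stand-in node with a 0-based, inclusive row span; not the tree-sitter API.
type RowNode = { type: string; startRow: number; endRow: number; children: RowNode[] };

// Return the outermost (shallowest) descendant that begins on `row`, or null.
function outermostNodeStartingOn(parent: RowNode, row: number): RowNode | null {
  for (const child of parent.children) {
    if (child.startRow === row) {
      return child;
    }
    // Descend only into a child that started earlier and still covers the row.
    if (child.startRow < row && row <= child.endRow) {
      return outermostNodeStartingOn(child, row);
    }
  }
  return null;
}

// A whole-line selection covers every row the node touches.
function selectedRows(node: RowNode): [number, number] {
  return [node.startRow, node.endRow];
}
```

For the two-line `defun` example above, row 1 resolves to the `(mapcar ...)` list with a selection of rows 1 to 1. As whole lines, that selection also contains the final `)` that closes the `defun`, which is the limitation being described.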

Paredit has entered the chat.

Interlisp, fructure

Some feedback on the actual implementation:

- I cannot reach inside if and while statements using tree mode

- It is not clear which lines are available for refactorings

- dragging a line reveals "+ If" and "+ else if" under an if statement. Dropping the line onto it does not work.

- tree based moving does not move the viewport

About the business side of it:

I think it will be hard to create a whole editor just for these features. I know that Jetbrains editors also have ast aware refactorings and I often use moveBySymbol in sublime text which acts a lot like your tree navigation. I think it makes more sense to create this as a plugin, but it will be difficult to monetize that.

Anyway I applaud you for creating something that actually works!

Thanks for the feedback!

- can you move into an if/while if it's not already selected? For already selected blocks I've had it both ways and found it better to be able to drag from anywhere in the block - you can select a child by clicking it. Both ways have their advantages and disadvantages though. "d" should also work to navigate to a block's first child.

- yes maybe some kind of visual indicator would be good, but if required after the first few uses then the interaction would need to be redesigned I think, as there shouldn't be any unexpected or unintuitive behaviour

- hmm, that's working for me, does anything at all happen when you drop it?

- yeah, haven't implemented that yet

I thought about doing it as a plugin as well, but I think trying to add something genuinely new to an existing editor would be possibly harder than just making one (I've been working on Treefrog for about 7 months full time and using it as my only editor for the last 2), and would end up either not fitting in well with the existing UI or having clunky UX to switch between modes etc. Designing new modes from the ground up (and being able to prototype them quickly with web tech and Electron) allows for much easier experimentation.

> can you move into an if/while if it's not already selected?

Sorry I meant the condition of while, for and if.

> hmm, that's working for me, does anything at all happen when you drop it?

Now that I try it again it seems to be working. Before it would just drop the line below or above the "+ if".

Ah OK. Tree mode navigation only works on whole lines as I think there's benefit to keeping the model simple and text cursors are good enough for horizontal navigation. I do have a couple of ideas for this though - 1) a key to select the condition so that you can edit it in normal mode (standard text editing), e.g. select the if then press c to change the condition; 2) some way of temporarily expanding single-line constructs into multi-line ones so that the vertical navigation commands navigate and edit them - so you would navigate to it then press a key that would temporarily expand the if (condition) to if (\n\tcondition\n). This would also work with things like argument lists - temporarily expand to get the arguments on separate lines for editing.
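Idea 2 above could look roughly like the following sketch. It is text-based and assumes tab indentation and balanced parentheses with none hidden inside string literals; a real implementation would presumably use the parse tree. Both function names are made up for the example.

```typescript
// Expand `if (cond) ...` so the condition sits on its own line.
function expandCondition(line: string): string {
  const indentMatch = /^\s*/.exec(line);
  const indent = indentMatch === null ? "" : indentMatch[0];
  const open = line.indexOf("(");
  if (open === -1) {
    return line; // nothing to expand
  }
  let depth = 0;
  for (let i = open; i < line.length; i++) {
    if (line[i] === "(") {
      depth++;
    } else if (line[i] === ")") {
      depth--;
      if (depth === 0) {
        const condition = line.slice(open + 1, i).trim();
        return `${line.slice(0, open + 1)}\n${indent}\t${condition}\n${indent}${line.slice(i)}`;
      }
    }
  }
  return line; // unbalanced parentheses: leave the line untouched
}

// Undo the temporary expansion by joining the pieces back together.
function collapseCondition(expanded: string): string {
  const [first, ...rest] = expanded.split("\n");
  return first + rest.map((part) => part.trim()).join("");
}
```

`expandCondition("if (a && (b || c)) {")` yields three lines (`if (`, the tab-indented condition, and `) {`), and `collapseCondition` restores the original single line once the edit is done.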

The feature I've always dreamed about from an AST-aware editor is shell-like functionality, where operations to rename, move, copy, create, etc. are akin to file manipulations done on the command line. At one point I was thinking about the possibility of FUSE-ifying (https://en.wikipedia.org/wiki/Filesystem_in_Userspace) some code so that a normal shell could be used. Any chance of shell-like functionality from treefrog?

Interesting! Yes, one of the main things I'm doing at the moment is thinking of and prototyping new commands and ways to represent code. I like the shell idea.

So things like classes and functions would be represented as files within a dir representing the actual file?

Yeah! I think the difficult part of this idea though is maintaining/moving the necessary dependencies, or at least keeping track of them to warn that they're no longer satisfied.

In many cases it probably requires awareness of packaging in a way that normal AST manipulation does not, at least to be particularly useful it would.

Right - this is where I would use an LSP server

The text in the editor is blurry - looks like it's rendered at a lower resolution, and then upscaled with bilinear filtering.

(This is on a 4K display running Win11 with UI scale set to 200%)

Ah yeah, it's canvas based so I suppose it will be being scaled like an image. Thanks!

I think you need to scale your canvas by window.devicePixelRatio for higher density screens.

Anyway I am receiving an error on page load: Uncaught TypeError: Cannot read properties of undefined (reading 'watch') under platform.jsonStore.watch

Thanks! Error should be fixed now
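The `devicePixelRatio` suggestion above usually amounts to something like this sketch. `CanvasLike` and `ContextLike` are structural stand-ins introduced here so the function can be exercised without a DOM; a real `HTMLCanvasElement` and `CanvasRenderingContext2D` have the same members. How Treefrog actually sets up its canvas is not shown in the thread.

```typescript
// Structural stand-ins so the sketch runs without a DOM.
type CanvasLike = { width: number; height: number; style: { width: string; height: string } };
type ContextLike = { scale(x: number, y: number): void };

// Size the backing store in device pixels while keeping the CSS size fixed.
function fitCanvasToDevicePixels(
  canvas: CanvasLike,
  ctx: ContextLike,
  cssWidth: number,
  cssHeight: number,
  dpr: number
): void {
  // Backing store in device pixels...
  canvas.width = Math.round(cssWidth * dpr);
  canvas.height = Math.round(cssHeight * dpr);
  // ...displayed at the original CSS size.
  canvas.style.width = `${cssWidth}px`;
  canvas.style.height = `${cssHeight}px`;
  // Setting width/height resets the context transform, so scale afterwards.
  // Drawing code can then keep using CSS-pixel coordinates.
  ctx.scale(dpr, dpr);
}
```

In a browser this would be called with `window.devicePixelRatio || 1` as the last argument, and again whenever the canvas is resized or the window moves to a display with a different scale factor.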

One pet peeve of mine that this kind of editor might be able to address is how live compilation is sketchy when you're actively editing a file. For example when you're in the middle of an edit and you end up with a huge number of compilation errors because the parse of the file is in flux. Having the editor and the syntax parser aware of each other can help fix such issues.

Semantic merging would also be a nice feature.

I remember watching a talk about this, the performance improvement was staggering. Would be cool to see a big vendor like VSCode use it.

The scroll is really hard to use on my macbook. I can't scroll slowly enough. It moves massively up and down with each scroll even when I am very careful.

Thanks, I haven't tried it on a touchpad yet, will have a look.

Should be fixed now

That was fast : )

:) I was surprised touchpad did anything tbh - turns out it sends wheel events with different deltaY/deltaX values, so I just had to use the actual values instead of what it was doing before (checking if < or > 0 and scrolling by 3 rows/columns worth)
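A sketch of the difference between the two approaches described above. `WheelLike` is a stand-in for the two `WheelEvent` fields used here, the `deltaMode` constants are the values defined by the DOM specification, and the function names are invented for the example.

```typescript
// WheelEvent.deltaMode values as defined by the DOM specification.
const DOM_DELTA_LINE = 1;
const DOM_DELTA_PAGE = 2;

type WheelLike = { deltaY: number; deltaMode: number };

// Old approach as described: look only at the sign, always scroll 3 rows.
function scrollPixelsBySign(wheel: WheelLike, rowHeight: number): number {
  if (wheel.deltaY === 0) {
    return 0;
  }
  return (wheel.deltaY > 0 ? 1 : -1) * 3 * rowHeight;
}

// New approach: use the reported delta, converted to pixels.
function scrollPixelsByDelta(wheel: WheelLike, rowHeight: number, viewportHeight: number): number {
  switch (wheel.deltaMode) {
    case DOM_DELTA_LINE:
      return wheel.deltaY * rowHeight;
    case DOM_DELTA_PAGE:
      return wheel.deltaY * viewportHeight;
    default:
      return wheel.deltaY; // DOM_DELTA_PIXEL (0): already in pixels
  }
}
```

A touchpad typically sends many events with small pixel deltas, so a `deltaY` of 4 with 20-pixel rows scrolls 60 pixels under the sign-based version and 4 pixels when the actual value is used. That matches the "can't scroll slowly enough" report.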

This is the future of programming. Code isn’t (just) text, it’s a tree and we can edit it in a much more elegant way utilising its structure

A quick note: the editor looks blurry/low-res on a 4K monitor (and I would guess, probably on other high-res displays as well).

Thanks! Are you on a mac? Someone else noticed it on mac, just on the actual editor, so I thought it might be to do with rendering text on canvas (sometimes you have to use half-pixel coordinates to get things to align properly)

Not on a Mac. Just a 27-inch 4K display (with a 2.5x scaling configuration).

What are hypothetical benefits of using this versus Webstorm/Intellij with Vim mode?

One example would be turning a group of statements "xyz" into "if (some condition) then xyz, else (something else)" - with a text editing interface you have to think a bit about the exact text selection you want (or enter visual mode in vi to get the lines), cut them, write the if statement, paste them in, and usually some reformatting, whereas with Tree mode (the dedicated mode for expressing meaningful edits to the code) you can do Esc-w for "wrap", type the if statement, and insert the original statements from a dedicated clipboard.

Other cases are similar, e.g. being able to just drag and drop a div into another div without thinking about the exact text selection - the benefit is having to think slightly less about the mechanics of your editor or the particular data structure (text or ASTs).
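To show why "wrap" is a single operation once the editor works on whole lines, here is a minimal sketch over an array of lines. It assumes tab indentation and JavaScript-style braces; `wrapInIf` is a made-up name, not a Treefrog command.

```typescript
// Wrap lines[start..end] (inclusive, 0-based) in an if block, reusing the
// first selected line's indentation. Tab indentation is an assumption.
function wrapInIf(lines: string[], start: number, end: number, condition: string): string[] {
  if (start < 0 || end >= lines.length || start > end) {
    throw new RangeError("selection is outside the document");
  }
  const indentMatch = /^\s*/.exec(lines[start]);
  const indent = indentMatch === null ? "" : indentMatch[0];
  // Indent the selected lines one level deeper; leave blank lines blank.
  const body = lines
    .slice(start, end + 1)
    .map((line) => (line.length > 0 ? "\t" + line : line));
  return [
    ...lines.slice(0, start),
    `${indent}if (${condition}) {`,
    ...body,
    `${indent}}`,
    ...lines.slice(end + 1),
  ];
}
```

The caller only names the rows and the condition; the selecting, cutting, pasting and reindenting that a text interface makes explicit all happen inside the one call.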

Thanks for the explanation!

I think for the first case there is a ‘wrap in if-else’ refactoring in Webstorm/Intellij, but I rarely use it.

Second case is interesting, I would be excited to have this for React components.

Isn't this what the Intellij/Eclipse IDEs have been doing since the very beginning? All the powerful refactoring features are based on AST manipulation.

Yes, sort of, but they've always been added on top of a standard text editing interface so tend to be behind context menus (which is OK as they're also often high-level refactorings that aren't used that often). Treefrog's structure-aware edits are lower-level and more deeply integrated into the editing interface - there's a whole mode dedicated to them where the mouse and keyboard behave differently. The aim is to design an intuitive editing interface; the AST itself is just one tool to help with that.

Any chance of opensourcing this?

I'm not necessarily against it in the long term, just keeping all options open for now as I want to be able to work on it full time and that obviously involves making money somehow.

Understandable. I wonder if a patreon model might work.

I've not heard many success stories about building companies around single not-yet-established pieces of software. But I've had similar dreams and wish you luck.

IDE-features like “Refactor” are my idea of how this kind of editing already exists.

some notes:

* ctrl-home/end do not go to very top/bottom

* python tree-mode does not do anything sensible

the project page doesn't say anything about supported languages ?

Current langs are javascript, html and (s)css - there's a dropdown on the toolbar to create a new file in a given lang (the others there are partially supported, no Tree mode commands). The goal is to support all popular languages, as they can be added fairly easily with tree-sitter grammars.
