Show HN: Monocle – bidirectional code generation library
148 points by lucasluitjes on April 12, 2022 | 39 comments
I just published a bidirectional code generation library. Afaik it's the first of its kind, and it opens up a lot of possibilities for cool new types of dev tools. The PoC is for ruby, but the concept is very portable. https://blog.luitjes.it/posts/monocle-bidirectional-code-gen...



There is a fair amount of academic work on bidirectional tree transformation with lenses, e.g. <https://www.cs.cornell.edu/~jnfoster/papers/lenses.pdf>. It boils down to defining three operations (Get, Put, Create) and proving that they observe three laws (called GetPut, PutGet, and CreateGet); these give you bidirectional transformations you can compose arbitrarily. Later work introduces concepts like "quotienting" (for when you actually want the transformation to be lossy in certain ways) or "discerning" (non-total) lenses.
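To make the three laws concrete, here is a minimal Ruby sketch of a toy lens over a hash. It is only an illustration of the lens vocabulary, not Monocle's API or the paper's formalism: `Lens`, `NAME_LENS`, and the three checker methods are hypothetical names, and the checks test single examples rather than proving anything.

```ruby
# Toy lens: the "source" is a hash, the "view" is its :name value.
# All names here are hypothetical and chosen for illustration only.
Lens = Struct.new(:get, :put, :create)

NAME_LENS = Lens.new(
  ->(source) { source.fetch(:name) },             # get:    source -> view
  ->(view, source) { source.merge(name: view) },  # put:    view, old source -> new source
  ->(view) { { name: view } }                     # create: view -> fresh source
)

# GetPut: putting back exactly what you got changes nothing.
def get_put?(lens, source)
  lens.put.call(lens.get.call(source), source) == source
end

# PutGet: after putting a view, getting returns that same view.
def put_get?(lens, view, source)
  lens.get.call(lens.put.call(view, source)) == view
end

# CreateGet: a source created from a view yields that view back.
def create_get?(lens, view)
  lens.get.call(lens.create.call(view)) == view
end

example_source = { name: "User", table: "users" }
puts get_put?(NAME_LENS, example_source)
puts put_get?(NAME_LENS, "Account", example_source)
puts create_get?(NAME_LENS, "Account")
```

Note that `put` keeps the `:table` key it was not asked about; preserving the parts of the source that the view does not cover is the reason `put` takes the old source as an argument.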


OP here, we actually considered using the GetPut/PutGet/CreateGet terminology during a refactor, but it didn't seem to map perfectly. As I understand it, with lenses you generally define them through a DSL. Monocle lets you define them through example code. The goal was to make it as easy as possible for a decent programmer to write a lot of them. I couldn't find that type of abstraction in academic work, but it's entirely possible I didn't use the right search terms.


Super cool. Bidirectional code generation is something I've spent a bit of time thinking about: I've been building a spreadsheet that generates Python code when you edit it [1], but some of our users also want the ability to edit the Python code they generate and have that reflect in the sheet itself.

Template -> Code -> Template is one really hard part of this, and something this tool seems to take a really good approach to. If you're interested in related subjects (like transpilation), I'd recommend this overview article as a great approach [2]. From this article: "Now, there is a single biggest mistake we see in persons trying to implement a transpiler without experience in this: they try to generate directly the code of the target language from the AST of the original language."

Question to the OP - you mention you parse the Ruby AST - do you also transform this into an AST of your template syntax before generating the template? Aka, do you avoid this sin, or is it not an issue for you?

With Mito, there is additional complexity beyond just going from Template -> Code -> Template, in that we also need to understand _which_ variables are being changed and in what way. This is necessary because a spreadsheet stores other data about your variables beyond just their current value. As an example, which of these columns in a dataframe are a result of a formula vs. being in the original dataset isn't something that is just stored in the dataframe itself.

I haven't tried too hard, but I don't think there's a general solution; it feels like it requires some sort of symbolic execution in the general case, and is tough to do well even in simple cases. Our "fail loudly and early" equivalent feels like it would be a lot higher than the 10-20% this tool can deliver!

Anyways, bidirectional spreadsheet code generation is low on the priorities... but it's a fun one to dream about :-)

[1] https://trymito.io [2] https://tomassetti.me/how-to-write-a-transpiler/


Mito looks interesting! Regarding the sin, I don't think it entirely applies here, unless I'm misunderstanding you - I'm not super familiar with transpilers.

In this case the template is also ruby code, but with placeholders for the values. So both the template and the input get parsed into ASTs, then both of them are recursively walked and compared on the fly. During that walk it builds up the data structure with initial values.

In the code generation direction it walks one AST and does lookups and rewrites with values from that data structure.

Does that make sense?
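A rough Ruby sketch of the two walks described above. It makes several assumptions that are not in the post: ASTs are modelled as plain nested arrays instead of real parser output, placeholders are symbols starting with `__`, and `extract` / `generate` are hypothetical names. It shows the general technique only and is not Monocle's implementation.

```ruby
# Assumption: a placeholder is any symbol whose name starts with "__".
def placeholder?(node)
  node.is_a?(Symbol) && node.to_s.start_with?("__")
end

# Parse direction: walk template and input side by side, collecting the
# value found at each placeholder. Returns nil when the shapes differ.
def extract(template, input, values = {})
  if placeholder?(template)
    # A placeholder used twice must match the same value both times.
    return nil if values.key?(template) && values[template] != input
    values[template] = input
    values
  elsif template.is_a?(Array) && input.is_a?(Array)
    return nil unless template.length == input.length
    template.zip(input).each do |t, i|
      return nil if extract(t, i, values).nil?
    end
    values
  else
    template == input ? values : nil
  end
end

# Generation direction: walk the template alone and swap each
# placeholder for its value from the data structure.
def generate(template, values)
  if placeholder?(template)
    values.fetch(template)
  elsif template.is_a?(Array)
    template.map { |child| generate(child, values) }
  else
    template
  end
end

# Hand-written stand-ins for the ASTs of
#   class __NAME < ApplicationRecord; end
#   class User < ApplicationRecord; end
TEMPLATE_AST = [:class, [:const, nil, :__NAME], [:const, nil, :ApplicationRecord], nil]
INPUT_AST    = [:class, [:const, nil, :User],   [:const, nil, :ApplicationRecord], nil]

extracted = extract(TEMPLATE_AST, INPUT_AST)
p extracted
p generate(TEMPLATE_AST, extracted) == INPUT_AST
```

Returning `nil` on the first mismatch is one simple way to "fail loudly and early" when the input does not fit the template.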


Gotcha. Pretty much the template language is close enough to the destination language that you don’t need an AST for the template language at all.

That makes sense! Seems like a legit simplification to take advantage of.

Do you have any plans to make the template language more advanced/complex?


Well, looking at the rails monocles (for matching controllers, views, etc) a common pattern is custom matchers/replacers for things like "ast.children.first.to_s.singularize.camelize", so a shorthand for defining those in both directions could be handy. Other than that, not right now.

But if there are common situations where you end up writing the same matchers/replacers, those would be prime candidates for built-in placeholders.
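One possible shape for such a shorthand, sketched in plain Ruby: pair the matcher with its inverse so both directions are defined in one place. Everything here is hypothetical (`Converter`, `TABLE_TO_CLASS`, and the naive helpers); the helpers are deliberately simplistic stand-ins for ActiveSupport's `singularize`, `pluralize`, `camelize`, and `underscore`, and only handle regular plurals ending in "s".

```ruby
# Hypothetical two-way converter: forward is used when parsing,
# backward when generating.
Converter = Struct.new(:forward, :backward)

# Naive stand-ins for ActiveSupport inflections (assumption: regular
# plurals only, snake_case input, CamelCase output).
def naive_singularize(word)
  word.chomp("s")
end

def naive_pluralize(word)
  word + "s"
end

def naive_camelize(word)
  word.split("_").map(&:capitalize).join
end

def naive_underscore(word)
  word.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# "blog_posts" <-> "BlogPost"
TABLE_TO_CLASS = Converter.new(
  ->(table) { naive_camelize(naive_singularize(table)) },
  ->(klass) { naive_pluralize(naive_underscore(klass)) }
)

puts TABLE_TO_CLASS.forward.call("blog_posts")
puts TABLE_TO_CLASS.backward.call("BlogPost")
```

A pair like this only round-trips for inputs where the two functions really are inverses, so irregular plurals would need real inflection rules.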


Inflex (https://inflex.io/) does this bidirectional editing: you can edit any data structure as either a graphical object or the associated code, and the other one updates appropriately. (See e.g. https://discourse.inflex.io/t/how-to-make-and-access-a-recor...)

As for the problem of which things of a data frame are generated from a formula and which are from normal form, Inflex compares the structure with the AST of the source, making it easy to tell that [{foo: ...}] are normal form and so editable graphically, whereas xs.filter(..) is not. The neat thing is you can still edit formulae that are deep within a normal form nested structure.

It helps that Inflex doesn’t have syntactic sugar, what’s parsed is what is in the final AST. It also has a symbolic evaluator, somewhat, so it’s fine to have a list of functions for example and edit the list. The evaluator produces the same AST, rather than an alien format.

This bidirectionality also applies to rich text editors. (https://mobile.twitter.com/InflexHQ/status/14923564133263360...)

Mito has the advantage of being a familiar language and ecosystem, but Python itself has a traditional runtime, it’s imperative and not expression oriented, and lacks sound static type information, so it’s inherently more difficult to achieve some things with it that are easier with a typed pure functional language, especially a custom one.


Whoops, looks like I should've posted this as a link rather than text containing a link. Here's something clickable: https://blog.luitjes.it/posts/monocle-bidirectional-code-gen...


Delete the post and repost it and I'll vote it up.


I don't have the delete button, but I've emailed support. Thanks!


Anyway, you're following a good trail. It drives me nuts that conventional parsing tools only work in one direction, and I can just barely comprehend why that is (people who write compilers really care about speed).


Thanks! Yes, it seems like an underappreciated direction. Hoping that this'll be a push towards that.


Yeah, back in the 90s it was called "round-tripping".

https://www.ibm.com/docs/en/rhapsody/8.2?topic=developing-ro...

I did a lot of code generation work in those years, working on the two dominant Mac-based generators (AppMaker and Prototyper) but was never ambitious enough to try round-tripping because of the horrors of parsing C++.


https://en.wikipedia.org/wiki/Round-trip_engineering :

> Round-trip engineering (RTE) is a functionality of software development tools that synchronizes two or more related software artifacts, such as, source code, models, configuration files, and even documentation.[1] The need for round-trip engineering arises when the same information is present in multiple artifacts and therefore an inconsistency may occur if not all artifacts are consistently updated to reflect a given change. For example, some piece of information was added to/changed in only one artifact and, as a result, it became missing in/inconsistent with the other artifacts.

Source-to-source compiler > See also > ROSE: https://en.wikipedia.org/wiki/Source-to-source_compiler


This is a powerful idea! If I understand Monocle's use case, JS has similar AST parsing and code generation tools that are used broadly. There may be some ideas to learn from that community.

JS AST specs: estree [0] and babel's AST [1]

Parsers: babel [2], acorn [3], or espree [4]

Transformers: babel, recast [5], or jscodeshift [6]

Codegen: babel or escodegen [7]

[0] https://github.com/estree/estree

[1] https://babeljs.io/docs/en/babel-parser#output

[2] https://github.com/babel/babel

[3] https://github.com/acornjs/acorn

[4] https://github.com/eslint/espree

[5] https://github.com/benjamn/recast

[6] https://github.com/facebook/jscodeshift

[7] https://github.com/estools/escodegen


Haha, the old Java GUI builders in the 90s did something like this. You could either drag around the window (and the code would be updated) or modify the code (within limits) and it would parse it into the GUI builder.

Who's old enough to remember the Symantec Visual Cafe IDE?


I remember! The times where RAD was the next big thing.

I only used it after it was "remixed" into JBuilder. I actually used Visual Age from IBM, which I reckon was similar to Visual Cafe.


Yep, visual cafe was interesting. I was used to Delphi and I didn’t understand why cafe was so much worse.


This is neat! I’m curious if you see this being extended for other languages, or the concept being applied in other projects?

As for similar concepts, several projects by builder.io have some overlap. Most notably Mitosis[1], but I’d be shocked if TS-Lite[2] isn’t using similar techniques. Potentially Qwik[3] as well but I’m not sure, I would have bet that’s using Mitosis but it looks like that’s the other way around.

1: https://github.com/BuilderIO/mitosis

2: https://github.com/BuilderIO/ts-lite/tree/main/packages/core

3: https://github.com/BuilderIO/qwik


Wow, those projects look really cool! There's definitely some overlap there.

I'd love to see it used for other projects/languages. At the bottom of the post I put a bunch of ideas that I'd integrate with it, if I had more time.


I don’t know enough about the Ruby ecosystem to definitively rule out the possibility you’re already using it, but in case you’re not currently… are you interested in using/supporting tree-sitter grammars? There’s a healthy ecosystem of well maintained grammars (of which the Ruby grammar is widely regarded as notably complex ;).

Feel free to get in contact (I’m easy to find) if you want to see if it makes sense to join forces/make this or something like it less of a solo effort. I have professional (open source, nothing will go to waste) and personal interest in this space and know other folks who do too.


My dream would be bidirectional OpenAPI. I used it for a project and liked it, but after the generators are run, keeping the spec and code in sync when making changes becomes more work, not less.

Something like this would be amazing if it could be integrated as a generator. I'd love to do it if I had the time


I use a mixed approach for OpenAPI, but not bidirectional.

I have OpenAPI pieces generated from my Go source code (comment, types, function signatures) as JSON.

I also have a manually-edited master YAML document that refers to generated bits via $ref links.

I then use openapi-preprocessor [1] (disclaimer: I wrote it) to produce a final openapi.json file which is committed in the repo.

When I want to extend the API in a spec-first process, I can add the new routes manually in the YAML file. When I do the implementation, I replace the manual bits with the generated ones once they are ready. When committing, I can check the diff of openapi.json to verify I'm not losing anything in the process.

[1] https://github.com/dolmen-go/openapi-preprocessor


What you've done looks pretty smart and definitely worth a deeper look. My main interest is in visual design generating code, especially for animation timing.

The concept may be portable - the devil is in the millions of details on which I've seen many promising tools bog down and die.

Also, please, don't say _first of its kind_ unless you've done enough research to be confident.


Cool idea! Was wondering if you could elaborate a little on what type of new dev tools would benefit from this. One of the prototypes I worked on was bidirectional code gen for no-code tools but it felt like it might not be the most useful thing.


Yeah, bidirectional code gen for no-code tools was our use case too (there's a demo video in the post if you're interested). I saw a bidirectional no-code platform called Vision X the other day, people are definitely working on it. Are the prototypes you worked on online somewhere? It sounds interesting!

There are a bunch of ideas for dev tools in the original post. For example, if you integrate with linters, you could define more complex code smells without all the AST juggling. Upgrading rails apps (or other frameworks that have a similarly well-defined structure) to new versions might work, by defining monocles for the old and new versions.


It's not online but a little bit of that initial work went into greppo.io. I was thinking more in terms of serverless platforms and not dev tools in general. However the theme of my thinking is that I don't know if there is a biz model here. As an open source tool, for sure!


Have you read or studied Category Theory?

I find it incidentally related that you named your project "Monocle" while it does something quite similar to a concept there called lenses, so have a look at it if you have a chance!

Btw, really neat idea.


A friend who did some work on the project said it reminded him of lenses in functional programming. Since the company name was Snooty Software, he suggested calling the library Monocle, which seems appropriate :-)


A very popular Scala optics library is also called Monocle. I’ve been a happy user for a few years:

https://github.com/optics-dev/Monocle


Is the codebase flexible enough to add other source and templating languages? What would be involved in that?


For regular languages (not templating), if they have good tooling for converting to/from AST it's quite possible. The codebase isn't very flexible right now, but it's also not very big. In fact monocle itself is less than 900 lines of code.

I'm not sure if it makes sense for monocle to support multiple languages, or if each language should have its own port. Someone writing monocles in a specific language probably wants to do custom scripting in the language they're used to.

Templating languages are trickier, because they usually don't have great tooling. In fact, we wrote our own tooling to convert from ERB to builder (a template system where you generate HTML through Ruby methods) and back. So for any templating language you would probably write a tool that converts from that language to builder, and back.

On the other hand, ERB is about as free-form as it gets. Templating languages that are more strict are probably easier to add. For more info on how ERB is supported, I wrote another post that goes into detail: https://blog.luitjes.it/posts/erb2builder/


If there is an existing parser, like for a lot of templating languages, this can be done, since templates are parsed to ruby in a separate step. Otherwise, it would require creating a ruby-based parser for a language, or porting the concepts to that language.


Presumably the results in the reverse direction are sometimes ambiguous, so x1 -> y -> x2.

I wonder if there’s a general code improver that tries this transform on every function looking for a shorter x2 than x1.


Looks super cool!

For those more experienced in Programming Language Theory, how does code generation slot into PL theory? Is there some kind of common formalism for it?


"Code generation... for data models" is important context that isn't in the post (I assumed it would be a Ruby compiler and decompiler, for example).


It is a compiler/decompiler, data models is just a simple example. There are also examples of generating and parsing controllers, views, schema and route files. But you can define your own templates, so you can parse/generate pretty much anything that conforms to a common structure.


Cool. How about speed? Could you run benchmarks and show results?


Love that it's for Rails. Might actually use this!



