
Type-safely embed DSLs directly into Java - mckinney
https://github.com/manifold-systems/manifold/tree/master/manifold-core-parent/manifold#embedding-with-fragments-experimental
======
lmilcin
Except it is not really embedded in Java, only in comments.

I sincerely hope this dies a quick death.

Java with its jars and infrastructure already makes it plenty convenient to
work with resources without them having to be in your face when you are
editing the file. A good IDE lets you move between the resource file and the
code with a single click.

~~~
ulkesh
Yeah, I'm not sure I get it either. More power to the author for solving a
problem they may have, but in my 20+ years of experience with Java I have yet
to need this kind of code embedding in the source code (whether in comments or
not). Sure, it's nice to be able to run other languages at runtime (e.g.
GraalVM), but is there really a need to have this kind of language
interoperability at compile time?

~~~
mckinney
> but is there really a need to have this kind of language interoperability at
> compile time?

Well, if you want to leverage Java's static type system (and why not?), the
answer is, yes. I imagine you'd want type and member references to the other
language to resolve statically using the compiler, right? Similarly, why not
have the same functionality in your IDE? Plus code completion, usage
searching, refactoring?

Now, as I mentioned in an earlier comment, the embedding part of this
addresses just a small segment of use-cases e.g., scoped query editing. The
vast majority of other cases work directly against resource files, type-
safely. Read more about that here:

[https://github.com/manifold-systems/manifold](https://github.com/manifold-systems/manifold)

~~~
ulkesh
I’m not at all contesting your reasons, which I expect were plentiful enough
to build this system. I’m simply saying that if I want to use Java’s static
type system, I’ll just write in Java. Perhaps I’m an old codger these days in
my ripe early 40s.

------
marktangotango
I'm curious how this works under the hood. It looks like it's added as an
annotation processor, and the DSL is embedded in specially formatted comments,
but not in annotations. This implies to me that the annotation processor is
processing the comments in a source file? I did not know annotation processors
could do this! It also implies comments carry through to the compiled .class
files, which I did not think they did either (runtime bytecode weaving)?

~~~
mckinney
Hi, you can blame me for this. It's a Java compiler plugin[1], which is
similar to an annotation processor, but can hook into the compiler at a much
earlier stage, which allows it to contribute to all phases of the compiler,
including the Parser phase. The Manifold plugin takes full advantage of this,
hence its ability to analyze comments, contribute to bytecode generation, etc.

[1]
[https://docs.oracle.com/javase/8/docs/jdk/api/javac/tree/com...](https://docs.oracle.com/javase/8/docs/jdk/api/javac/tree/com/sun/source/util/Plugin.html)
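
For readers who have not seen the javac plugin API referenced in [1], here is a minimal sketch of a plugin that hooks the parse phase. It is not Manifold's code: the class name `ParseLoggingPlugin` and the `parsedFiles` field are made up for illustration, and it only records which files were parsed rather than changing anything. The interfaces used (`Plugin`, `JavacTask`, `TaskListener`, `TaskEvent`) are the ones from `com.sun.source.util`.

```java
import com.sun.source.util.JavacTask;
import com.sun.source.util.Plugin;
import com.sun.source.util.TaskEvent;
import com.sun.source.util.TaskListener;

import java.util.ArrayList;
import java.util.List;

// Hypothetical example plugin; the name and behavior are assumptions for this sketch.
public class ParseLoggingPlugin implements Plugin {

    // Names of the source files whose PARSE phase has finished.
    final List<String> parsedFiles = new ArrayList<>();

    @Override
    public String getName() {
        // The name javac matches against the -Xplugin:<name> option.
        return "ParseLogging";
    }

    @Override
    public void init(JavacTask task, String... args) {
        // javac calls init before compilation starts, so the listener
        // observes every phase, including parsing.
        task.addTaskListener(new TaskListener() {
            @Override
            public void started(TaskEvent e) {
                // Nothing to do when a phase starts in this sketch.
            }

            @Override
            public void finished(TaskEvent e) {
                if (e.getKind() == TaskEvent.Kind.PARSE) {
                    parsedFiles.add(e.getSourceFile().getName());
                }
            }
        });
    }
}
```

To try it, the class would be listed in a `META-INF/services/com.sun.source.util.Plugin` file on the processor path and enabled with `-Xplugin:ParseLogging`. Doing what Manifold does (reading comments, contributing types) goes well beyond this public API surface.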

~~~
eatonphil
So you have a javascript frontend hooked into the java compiler? Or how does
your javascript integration work?

~~~
mckinney
At compile-time manifold-js[1] parses JS and generates Java types (stubs),
which forward execution to rhino at runtime. Basically, JS seamlessly mapped
onto Java's type system.

[1] [https://github.com/manifold-systems/manifold/tree/master/man...](https://github.com/manifold-systems/manifold/tree/master/manifold-deps-parent/manifold-js)
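
To make the "typed stub forwarding to an interpreter" idea concrete, here is a rough sketch of the shape such a stub could take. Everything in it is an assumption for illustration: `ScriptRuntime` is a stand-in for a real engine such as Rhino (its functions are plain Java lambdas so the example runs without extra dependencies), and `GreeterStub` is a hand-written imitation of what a generated type might look like, not actual manifold-js output.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for an embedded script engine; a real setup would
// delegate to an actual JavaScript interpreter instead of Java lambdas.
final class ScriptRuntime {
    private final Map<String, Function<Object[], Object>> functions = new HashMap<>();

    // Registers a "script function" under the given name.
    void define(String name, Function<Object[], Object> body) {
        functions.put(name, body);
    }

    // Looks the function up by name at runtime and calls it dynamically.
    Object invoke(String name, Object... args) {
        Function<Object[], Object> f = functions.get(name);
        if (f == null) {
            throw new IllegalStateException("No such script function: " + name);
        }
        return f.apply(args);
    }
}

// Imitation of a generated stub: callers see an ordinary, statically typed
// Java method, while the body forwards to the dynamic runtime.
final class GreeterStub {
    private final ScriptRuntime runtime;

    GreeterStub(ScriptRuntime runtime) {
        this.runtime = runtime;
    }

    String greet(String name) {
        return (String) runtime.invoke("greet", name);
    }
}
```

The compiler and IDE only ever see `GreeterStub.greet(String)`, so a misspelled method name fails at compile time. The string-keyed lookup is confined to the stub body.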

------
tofflos
Thank you for making this.

Have you considered an approach based on annotations and multi-line strings?
I'm not sure string constants are valid targets for annotations but maybe that
could be added to the language?

    
    
        @Language(name="Javascript")
        """
            function callBark(aBarker) {
                aBarker.bark();
            }
        """

~~~
mckinney
Thank you!

Yes, Manifold already supports Java 15 multi-line strings (aka text blocks)
like this:

    
    
        var myValue = """
        [>.js<]
          function foo() {
            return "hi";
          }  
    
          foo();
        """;
    

You can embed a resource fragment as either a declaration or a value. You use
a string to embed a value fragment as with the JS example above. Note the
[>.js<] header indicates the resource type for the string, similar to your
annotation. As you surmised, an expression such as a string literal cannot be
annotated.
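
As a rough illustration of how a header like `[>.js<]` can carry the resource type, the sketch below pulls the extension out of a string and separates it from the fragment body. This is not Manifold's parser. The class `FragmentHeader`, its method names, and the regular expression are assumptions made for this example, including the optional name before the dot.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only; the header grammar here is a guess based on the
// [>.js<] example above.
final class FragmentHeader {
    // Leading whitespace, then [> optionalName .extension <]
    private static final Pattern HEADER = Pattern.compile(
        "^\\s*\\[>\\s*([A-Za-z_$][A-Za-z0-9_$]*)?\\.([A-Za-z0-9]+)\\s*<\\]");

    /** Returns the extension named by the header, if the string starts with one. */
    static Optional<String> extensionOf(String literal) {
        Matcher m = HEADER.matcher(literal);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    /** Returns the text after the header, or the input unchanged if there is none. */
    static String bodyOf(String literal) {
        Matcher m = HEADER.matcher(literal);
        return m.find() ? literal.substring(m.end()) : literal;
    }
}
```

For example, `FragmentHeader.extensionOf("[>.js<]\nfoo();")` yields an `Optional` containing `js`, which a tool could use to pick the handler for the rest of the string.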

------
rhacker
The graphql example is not the greatest, but I would personally use GQL Code
Generator to create a little strongly typed client for all your .graphql files:

[https://graphql-code-generator.com/docs/plugins/java](https://graphql-code-generator.com/docs/plugins/java)

Not sure if there's a strong use-case in general here. Even Groovy code, which
is a popular choice for JVM DSL languages, has pretty excellent stub
generation.

~~~
mckinney
But the main idea with this is to avoid the pitfalls of conventional code
generation such as GQL Code Generator. See "The Big Picture"[1] in the core
docs.

[1] [https://github.com/manifold-systems/manifold/tree/master/man...](https://github.com/manifold-systems/manifold/tree/master/manifold-core-parent/manifold#the-big-picture)

~~~
sreque
When I see "no code on disk" I see that as a downside. What happens when this
generated code has a bug? How do I set a break point in this code? How do I
see the code to know what it is doing?

Extending the compiler is great... until it isn't. Imagine a feature in the
Java language not working correctly due to a compiler bug. Debugging or working
around this issue could be very difficult. With compiler extensions, you are
now widening the opportunity for this type of bug to occur.

More principled code generation systems like Racket's macro system do in fact
let you expand the code to see what's generated while avoiding having a
separate build step, and even have a macro debugger for interactive debugging.
Mainstream languages have a ways to go before we get this kind of tooling for
language extension frameworks. Until then, I think I prefer having a separate
build step with on-disk, visible, and debug-able code.

~~~
abeppu
I definitely agree; the author clearly thought no on-disk produced code was a
feature, but it makes me nervous. Similarly, the author then goes on to say
how the framework supports circular dependencies between manifolds, or
manifolds that each modify one another's types ... and I immediately think
that will create debugging nightmares.

~~~
mckinney
In most cases, the devs that should be concerned about debugging generated
code are the authors of the generator. If there are bugs in released code,
consumers of the API should file against them. In any case, as I mentioned
in a previous comment, when a resource is selected you can view its generated
source via Edit | View Java Source.

------
jnellis
You can write other languages like this in Groovy using IntelliJ. IntelliJ
allows you to annotate a multiline string as another language, then it will
parse, syntax color, and autoformat it for you. The use case would be when you
are templating frontend code snippets on the server side. I'm not sure if it's
just Groovy though; I'm betting IntelliJ can probably do this with other
languages that have multiline strings.

~~~
mckinney
Right, but that is just syntax highlighting. This is resolving the content of
the string type-safely, at compile-time.

------
qisyou
This is cooool!

