We are absolutely swimming in little languages. Consider these languages:
- The language of regular expressions
- SQL queries
- In web frameworks, the language of routes
- etc.
Unfortunately we embed a lot of these languages as strings. This is problematic because the language usually sees just opaque strings—we can't apply any of our lovely static analysis tools to these little embedded languages.
I'm doing some research in this area. We just got a paper published at ECOOP. The big idea is that, with a little bit of clever metaprogramming, we can help the type checker understand these little languages better and give us more helpful hints or execute more efficiently. This isn't a new idea, but no one has given it a name before. Here's the blog post version: https://lambdaland.org/posts/2024-07-15_type_tailoring/
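To make the "opaque strings" problem concrete, here is a minimal Python sketch. It is not the technique from the paper, which works at compile time through metaprogramming; it is only a rough runtime analogue. The helper name `checked_regex` and its `expected_groups` parameter are made up for illustration. The idea is to check the embedded regex once, where it is defined, instead of discovering a malformed pattern or a wrong group count deep inside the code that uses it.

```python
import re


def checked_regex(pattern: str, expected_groups: int) -> re.Pattern:
    """Compile `pattern` eagerly and verify its capture-group count.

    Hypothetical helper: a runtime stand-in for what a macro system
    could check before the program ever runs.
    """
    compiled = re.compile(pattern)  # raises re.error on a malformed pattern
    if compiled.groups != expected_groups:
        raise ValueError(
            f"pattern {pattern!r} has {compiled.groups} groups, "
            f"expected {expected_groups}"
        )
    return compiled


# Checked once, at definition time, rather than at the first call.
DATE = checked_regex(r"(\d{4})-(\d{2})-(\d{2})", expected_groups=3)


def parse_date(text: str) -> tuple[int, int, int]:
    match = DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a date: {text!r}")
    # Unpacking three values is safe: the group count was verified above.
    year, month, day = match.groups()
    return int(year), int(month), int(day)
```

In a language with a good macro system the same check can move to compile time, so the group count becomes information the type checker can use.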
What, type tailoring? Language-oriented programming is a different idea than type tailoring. If you don’t believe me, ask one of my coauthors. ;) Unsurprisingly though, Racket has some of the best support for type tailoring.
No, having embedded dsls that aren't just treated as strings, but as integral parts of the system that get syntax checks etc. at compile time and that analyzers can understand.
Right. The power and beauty of a good metaprogramming system is that you can extend the language to meet your domain, not the other way around. Type tailoring is all about (ab)using metaprogramming to target the elaboration of surface syntax into something the type checker understands. So it sounds familiar because type tailoring uses metaprogramming and targets eDSLs. Some of the stuff you're likely referring to (please send me a link with examples) is probably doing type tailoring.
The problem is that, until now, there hasn't been much awareness of the underlying thing going on here with types and metaprogramming. You might implement an eDSL with macros and get some nice type checking for free (I'm working on another project right now that does exactly this) but, with this type tailoring framework we're proposing, you might see ways that you can get more out of what you're doing by leveraging static information better and by programming the elaborator more cleverly. Does that help?
This is a super minor correction on my part, but this is a weird phrasing because "the ACM" is a huge organization that publishes proceedings for a great range of interests --- it isn't a venue in itself. The name of the venue you're looking to reference is Communications of the ACM, more commonly just referred to as Communications or CACM.
As an aside, Communications is a sort of unique venue for publication. The submissions are peer-reviewed, but the nature of the submissions is more similar to a blog post or editorial article than traditional papers you'll find in other proceedings from conferences and journals (and, indeed, the ACM refers to CACM as a "magazine"). It makes for good "fun" reading!
The vibe of CACM also has changed significantly over the decades it's been around. When it started in the late 1950s, and through the mid 90s, it was really a "Journal of the ACM"-lite; the research articles were quite good! Then it morphed into a trade magazine and, by the late 2000s, it was kind of an embarrassment.
To Vardi's credit (the previous CACM EIC), CACM clawed back some of its technical chops in the 2010s. I wouldn't claim it's near the quality of 1970s CACM, but it actually has technical content in it again. Equations, even, gasp!
Yes, the CACM along with all of the SIG* journals. TBH I gave up my membership when the ACM seemed more of a money grab and less about computing.
You are correct, and I'll deny having a 1978 copy of the SIGPLAN (Programming Languages) Conference docs on the history of some of the languages that were popular then. :-)
SIGPLAN is my only ACM membership; I figure I can support my primary publishing organization at $25/year or whatever. So your 1978 issue sounds like a pretty cool piece of history to have to me!
- Grammar, parser, compiler, interpreter (delete as appropriate)
- Editor plugins for nice syntax highlighting
- Language server
- Packages for common things
- Nice website (or no one will use it)
- etc...
So the pressure is always to shoe-horn a big existing language into your problem. Maybe you can build a nice library if your language has decent syntax (or little to no syntax). If you have an AST representation, you probably dump it to JSON etc.
I am curious if any projects are trying to make this easier.
Charles Simonyi and his Intentional Software tried to solve this, publishing some interesting articles in the 1990s. However, their technology was not broadly used and they were acquired by Microsoft.
The key ideas are called Intentional Programming and Language Workbenches.
The best accessible implementation of that is JetBrains’ MPS (it is free). It allows you to define a language and “projectional” editors together.
It is really fascinating but it suffers from a learning curve where there is no small step from what people use in their everyday common languages and IDEs to building domain-specific solutions with MPS, so adoption is low.
Markus Voelter has some highly recommendable publications and elaborate applications of MPS for domain-specific languages, see http://voelter.de/
I am sure there is something great in that area but it has not found the right form and shape yet, so keep exploring.
A rare mention of Intentional Programming aka IP (https://en.wikipedia.org/wiki/Intentional_programming) on HN. I first came to know of this from an article by Charles Simonyi titled "The Death of Computer Languages, The Birth of Intentional Programming" on MSDN. But alas, the promise never came to pass. The only other place I know of which covers it is a chapter in the book "Generative Programming: Methods, Tools, and Applications" by Krzysztof Czarnecki et al. IP is rather hard to understand (I still don't get it completely) and afaik there are no publicly available tools/IDEs to learn/play with it.
I don't believe JetBrains MPS is an IP editor; it is meant for designing DSLs. IP has aspects of a DSL but is not the same.
Finally, a huge upvote for mentioning Markus Voelter, who is THE expert in DSL design/implementation/usage. Check out his articles/essays and the free ebook "DSL Engineering: Designing, Implementing and Using Domain-Specific Languages" from his above-mentioned site.
The original Intentional Programming in the mid-nineties was a much broader vision than Language Workbenches, more like a grand unified theory of software development and related tools such as IDEs, languages, compilers, and a marketplace of components in that space.
My understanding, from the demos they were giving around 15 years ago, is that the Intentional company ended up focusing on a smaller feature set similar to MPS (I don’t have personal development experience with the Intentional product, only MPS).
It would be interesting to learn more about their work and lessons learned.
GraalVM's Truffle languages try to make this easier. You still have to write a parser and an interpreter. But from these you get a compiler for free. Plus tools like a debugger, I think maybe a language server as well. And your language can easily call out to the Java standard library, which solves the problem of standard packages. But you're tied to Java.
Building a small language is super easy. You can even just use JSON as a functional but ugly stand-in for the syntax. However, I usually just write a quick recursive descent parser. It is easy and quick to do once you know how to do it.
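For anyone who hasn't written one, here is a rough Python sketch of both options from the comment above: a recursive descent parser for arithmetic expressions, plus an evaluator that runs on the resulting tree. The grammar, the names (`tokenize`, `Parser`, `parse`, `evaluate`) and the choice of nested lists for the tree are all illustrative assumptions, not anything from a particular project. Because the tree is just nested lists and integers, it round-trips through JSON, so the same evaluator also accepts JSON as the "ugly stand-in" syntax.

```python
import json
import operator
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(source: str) -> list[str]:
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character near offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each returns a JSON-compatible tree."""

    def __init__(self, tokens: list[str]):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self) -> str:
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expr(self):  # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            node = [self.next(), node, self.term()]
        return node

    def term(self):  # term := factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            node = [self.next(), node, self.factor()]
        return node

    def factor(self):  # factor := NUMBER | '(' expr ')'
        token = self.next()
        if token.isdigit():
            return int(token)
        if token == "(":
            node = self.expr()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")


def parse(source: str):
    parser = Parser(tokenize(source))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate(tree):
    """Walk a tree of the form [op, left, right] or a bare integer."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))
```

With this sketch, `parse("1 + 2 * (3 - 1)")` gives `["+", 1, ["*", 2, ["-", 3, 1]]]`, and `evaluate(json.loads('["*", 2, ["+", 3, 4]]'))` skips the parser entirely by feeding the JSON form straight to the evaluator.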
Tcl was, or is, a nice way to add a language that's very flexible and customizable to a larger system. It's pretty easy to create your own DSLs with it.
I think he mentions in his book that he was inspired by Bentley's above article to write his book. I remember also that the book was quite detailed (with the implementation in C) though not a full-blown compiler/interpreter book and the little language was something to do with image processing (i have to browse my copy again :-). This book deserves to be better known.
The HN discussion of the type tailoring blog post: https://news.ycombinator.com/item?id=40990232