Building a language server for Muon (nickmqb.github.io)
101 points by nickmqb on Nov 26, 2019 | 54 comments



An interesting article.

I was minded to write an Ask HN related to this. Why do so many new languages launch without what I'd consider to be essential functionality for any language; autocomplete, good IDE integration, intellisense, auto refactoring and navigation support?

I get that all languages are just text at the end of the day, but are people really holding the whole API surface and type arguments of standard libraries and 3rd party libraries in their head? Do they remember variable names across files and directories? I feel like my memory must be absolutely terrible because there's no way I could work in a language without decent autocomplete.


You have to pick your battles. But I'll say that the Language Server Protocol is a game changer. Now at least the battle is communicating with a single, well-documented, high-level editor protocol that works everywhere. Previously you'd have to first choose an editor to integrate with, chug through far lower-level details of how to color code, make dropdowns, etc., and do it in the editor's plugin language rather than your compiler's. My guess is you'll see more editor integration for new languages in the future if LSP stays relevant.

In my own little toy languages, integration with LSP is one of the first things I do after I have the basis for a hello world. So, before if statements or loops. It's a nice feeling to see your language look like a professional thing so rapidly. I also think it helps in developing the compiler, because you know in advance what functionality it needs in order to provide good editor support. It also helps guide what unit tests should look like, and helps find bugs because they are apparent as you type.

So we could see an explosion of domain languages, since they are usually more about tooling, and with tooling a nice editor integration is very helpful.


To step back a bit, the way I see it, the main goal as a language designer is to ensure that users can be productive in the language. Many factors play a role here. Language features are an obvious one, but there's also (in no particular order): standard library, ease of integration with other code, documentation, community, tooling, etc. That's a lot, so as daxfohl says, picking your battles is key. Not everyone will agree on the relative priorities of these factors, and that's totally fine too. However, I personally think that tooling is one of the most important factors. "Fast & snappy tools" is actually one of the explicit design goals for Muon [1] (design principle #12). So releasing a language server so early on in the life of the language underscores that, yes, Muon is a language with a strong tooling focus, and I hope that resonates with people.

[1] https://github.com/nickmqb/muon


Thank you for your detailed response and the great article.

I'm interested in whether the focus on tooling as one of the priorities ended up informing elements of the language design. To take an example mentioned elsewhere, autocomplete is (generally) easier to implement for OO languages than FP ones because the dot syntax for method calls helps scope the options.

Are there instances where you've changed design or felt constrained by making it work with tools?


One design decision that comes to mind is Muon's significant whitespace. Having that makes it much easier to do parser error recovery, which in turn made it easier to build the language server. Another one is that I initially had the rule that all typenames had to start with a capital letter, and nothing else could. That simplified parsing even further, but it turned out to be too restrictive, so I got rid of it. Your question is something I've wondered about too: how can the design of a language co-evolve with its tools? If you have pointers to existing research on this I would be interested to learn about it.
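To illustrate why significant whitespace can help with error recovery, here is a minimal sketch. It is not Muon's parser: the toy syntax, the function names and the parenthesis check standing in for a real parser are all assumptions made for illustration. The idea is that indentation gives the parser cheap resynchronization points, so a half-typed definition can be reported as broken while the definitions after it are still checked.

```python
from typing import List, Tuple


def split_top_level_blocks(source: str) -> List[List[str]]:
    """Group lines into blocks; a new block starts at each non-indented line."""
    blocks: List[List[str]] = []
    current: List[str] = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no structure
        if not line[0].isspace() and current:
            blocks.append(current)
            current = []
        current.append(line)
    if current:
        blocks.append(current)
    return blocks


def parens_balanced(lines: List[str]) -> bool:
    """Toy stand-in for a real parser: accept a block if its parentheses balance."""
    depth = 0
    for ch in "".join(lines):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def check(source: str) -> List[Tuple[str, bool]]:
    """Check each block on its own so one broken block cannot poison the rest."""
    return [(block[0], parens_balanced(block)) for block in split_top_level_blocks(source)]


# Made-up syntax (hypothetical): the first definition is mid-edit and missing a ")".
SOURCE = """\
def main
    x = foo(1, 2
def helper
    return bar(3)
"""

for header, ok in check(SOURCE):
    print(header, "ok" if ok else "error")
```

With brace-delimited blocks, the unclosed parenthesis could make it harder to tell where `main` ends; here the next non-indented line settles it.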


Ooh, great question and answer. I'd never thought of that advantage of significant whitespace.


Because you either invest in language features or editor features.

I agree, but most languages are built by a single person or a very small team. They have to pick their battles. The last languages developed with that in mind were, in order: TypeScript (essentially only for editors), Kotlin (developed by an IDE company) and C# (developed by a HUGE team, with a huge IDE team in the office next door).


Notably both C# and TypeScript are developed in the same Microsoft division as Visual Studio and VSCode. :p


Not only that, but many of the people are the same.


I've spent three years working on and off on a language. It isn't ready for prime time yet, but I'm beginning to think about the editor side of things because I can now write sizeable programs in it.

But I can understand not wanting to.

I need to go learn an API, and learn how to express the grammar of my language in a new way, because I probably can't just reuse the highly tuned recursive descent parser that's already part of the language.

Parts of my language aren't well expressed by some of the ways an LSP is expected to work: stuff inside blocks is only verified as valid, not parsed to check its types or construction. So even a "jump to definition" or "this is clearly an error" is going to involve a lot more work than the interpreter actually does.

Autocomplete itself could be a huge headache due to the highly dynamic nature of the language (variable names can be created dynamically). Without actually running the program, guessing what's inside the environment is going to be incomplete.

In short - editor support is either going to be half-assed (why do it), or more work than writing the language was (tedious).


Many projects are open-source, developed by the community. It comes with several caveats:

1) many open-source developers use vim/emacs, maybe something like Sublime Text and don't have experience with IDE offerings for languages like Java/C#. As such, they don't really require much more than syntax highlighting and basic autocompletion.

2) many of the new languages boast powerful metaprogramming capabilities. Metaprogramming makes it very hard to implement working refactoring, because the IDE cannot reason about the code as well as in more 'static' languages like Java.

3) most new languages coming out are closer to the functional paradigm than to OOP. I think OOP lends itself much better to refactoring features and things like autocomplete.


I have yet to find an actual Java autocomplete plug-in that’s not just “we’re running Eclipse in the background over your code”. Just because it’s highly static and regular doesn’t mean it has good tooling outside of IDEs!


That probably has more to do with the success of the JDT and it being open source than anything else. It really isn’t that hard to write an auto complete engine for Java, but you have to spend a lot of time plumbing a partial or incremental code compiler (like Roslyn did with C#).

The primary reason auto complete works better for OO languages is an accident of syntax in the use of dot access on a subject expression.


I think part of my question was related to your part 1. Are people genuinely remembering the entire API surface of their application, their language's standard library and all 3rd party dependencies for their code when using a text-mode editor like Vim?

The only way I could possibly work like that is if my program wasn't bigger than a couple of files not exceeding 50 lines or if I spent all day in the API docs.

It has me worried I've just got incredibly early onset memory loss I guess.


I’m one of those vim users - I never use autocomplete, autorefactor or a debugger for that matter. Occasionally I have them available in something like RStudio but I don’t feel like I gain much from it.

I find myself constantly jumping through the code (with / for find) and I have 3 or 4 terminals open on the same display (to see how things are defined in various files) with stackoverflow a key binding away. I definitely don’t remember how exactly I named my variables but I find that at the same time knowing just the variable name is not enough, I also want to see where and how it’s defined. I use “grep -R” a lot for that as well.

Also often I end up editing files over ssh somewhere so then an IDE also wouldn’t be available, and vim always is.

I’ve never tried tools for refactoring. But as mentioned elsewhere that probably wouldn’t work well for functional and/or niche languages.

I generally feel pretty productive. If I felt there were a reward to learning these IDEs, I wouldn't hesitate to do so.


Part of the answer is languages have very different properties.

> a couple of files not exceeding 50 lines

Which in say Stan, might be a complete application, and a month's work, spent mostly wrestling with domain math and algorithm performance.

Versus in say stereotyped industry Java, where it might be a day's work, doing a bit of boilerplate, design patterns, and api plumbing.

Autocomplete is important if it's common to see API names of length 40. Less so if 12 is common, and even less so for a Haskell-ish 4.

I wonder if there's an opportunity to better convey the feel of working in various languages and environments. With so many live coding youtube videos, perhaps one could extract gif animations, to illustrate and weave into cross-language stories of best practices and inspiring expertise, of nifty tooling and UX concepts. For some better professional onboarding story than "use a diversity of languages and environments, each for some months or years, and you'll eventually develop a feel for what's possible, for what they each do well".


> I think part of my question was related to your part 1. Are people genuinely remembering the entire API surface of their application, their language's standard library and all 3rd party dependencies for their code when using a text-mode editor like Vim?

No. For example, I use emacs, and when I want to use an API for a type I just... grep for the type name, find its file, and skim its APIs. That takes me a couple of seconds, even on code bases with multiple millions of lines of code.

Every year people try IntelliJ, or some other IDE, with better auto completion support. After half a day full of getting it to crash, they just go back to vim or emacs, and wait till next year to try these IDEs again.


Lisp and Smalltalk practically invented the IDE concept, they have more than enough metaprogramming capabilities.

In fact the first refactoring tools were developed on Smalltalk.


> Why do so many new languages launch without what I'd consider to be essential functionality for any language; autocomplete, good IDE integration, intellisense, auto refactoring and navigation support?

Probably because developing all of that stuff takes a lot of time still, even though it is probably easier now than it was before.

Everything gets done eventually if it’s important, but it may take a while until something gets done even though it is important because there is so much else to do as well.


It depends on the audience and goals for your language. If your language is intended to be a playground for new ideas, like a new type system or semantic model, you may only be concerned about getting that working and into other people's hands for experimentation, rather than usability.


I can certainly see why you'd not take the time to/need to do it if the purpose of the language is to explore concepts in language design.

But it seems to be missing from a lot of languages that launch with the intention of being used for large non-trivial multi-user codebases.

My question I guess was intended less as a criticism of language designers for not including these features and more to find out what I'm missing, that I'm not able to remember the entire (for example) Python standard library, or even the name of methods and classes I previously implemented.


Most of the effort in making a language a production language is on the tools and libraries end of things, and LSP is sort of the tip of the iceberg in terms of getting into that stuff. Before that, you probably want to have features like "good error messages" or "working string and math libraries".

For a long period in the past 15-odd years, new "Web" languages were getting plenty of adoption because the state of the tooling for that segment remained barebones everywhere, with a lot of functionality already in SQL or JS and anything in between being glue, and so competition on language features and syntax took precedence. It's probably in a consolidation phase now: things are getting more exciting in the lower layers of the stack instead.


> Why do so many new languages launch without what I'd consider to be essential functionality for any language; autocomplete, good IDE integration, intellisense, auto refactoring and navigation support?

Because their goal at launch time is not to reach "end-users", but to create a community with which to share the task of language building. And that is a very different kind of project compared to using a language to write applications.

It is not meant to be a functionally complete product, but a demo of what could be the kernel of a really cool product (say, v0.1) -- for purposes of recruiting a team/investment to help carry out that vision and actually ship a useful product (v1.0).


I can certainly see this is the case for new languages intended to introduce new and paradigm shifting concepts.

To move the goalposts somewhat I primarily had languages like say Python, Elm, maybe Rust or Go, in mind. I haven't done Python in ages so this is probably an unfair judgement in light of tools like PyCharm but it still seems its supported use-case is text-mode script writing and the ecosystem is still lacking when it comes to large-scale application development. Yet it seems to be far more popular (on here and from a developer happiness standpoint) for application development versus languages which provide tools specialized for this (Java, Kotlin, C#, etc).


> Why do so many new languages launch without what I'd consider to be essential functionality for any language

Maybe because what 'you consider' is not a universal metric of utility?

I, personally, care not one whit about the useless stuff you mentioned. (Probably because I never programmed in Java and intend to keep it that way while I live.)


Lisp had this kind of interactive programming support for a long time using SLIME and Emacs. Why isn't this a more common feature in the more recent languages?


There are two aspects here. First, interactive development is incredibly difficult. There is a talk by Anders Hejlsberg about how compiler building is no longer the same as it was 10 years ago. When you code, you need sub-100-ms reactions, the parser needs to be happy parsing incomplete, typically erroneous code, and then you need a code analysis engine that deals with that half-baked AST. Also, traditional compilers are a strict pipeline and not a loop, etc. Microsoft had to rewrite the C# compiler for that. The PHP LSP had to create its own parser engine because no suitable one was available. It is essentially a lesson for future language creators: if you write a compiler, do not only build it for emitting binaries but also for editors and analyzers.

And yes, the smart-asses of the 70s/80s (aka LISP and Smalltalk) thought about that a long time before anyone else. But hey, today's landscape is a bit more complex (think about CSS styles in HTML embedded in a TypeScript-enriched JavaScript JSX file).

The second aspect: LSP is great because it was a cross-vendor, cross-language initiative which already had two dozen languages on board in its first year, in addition to half a dozen editors (VS Code, Emacs, Atom, Vim, ...). Emacs plugins could never deliver that because they were bound to Emacs.


RE the second aspect: that's why I'm a fan of LSP and the related debugging server protocol. There's no way Emacs can have native IDE-like support for all the languages out there, and the small developer base makes it hard to compete with IDEs for popular languages, that have commercial interests backing them. LSP promises to level the playing field - it only takes one good language server to make IDE features suddenly available not just for IDEs, but also Emacs and Vim and others.


For that matter, VB, C#, Java and even Delphi all had a RAD approach to developing UIs. Why don't more modern languages work on even a basic UI as part of their standard libraries? What kind of system doesn't have a UI these days? Outside of servers, obviously. The closest Go and Rust (modern languages...) are getting to a UI seems to be WebAssembly, which for now points towards an Electron UI solution.

I'm not super familiar with SLIME, but the goal of LangServ is to consolidate everything into a known protocol. There's also a spec for debugging.

Oddly enough the Wikipedia page for SLIME doesn't seem to mention Racket[0], but it does mention Clojure.

[0]: https://en.wikipedia.org/wiki/SLIME


A flexible UI library that is up to modern expectations is complex and hard. It is even harder if it is supposed to be cross platform. So the only realistic option is to bind to a good existing library. However, these are all written in C++ and hard to bind to anything else. Gtk isn't sufficiently cross-platform to be attractive.

In the end, this is a project that takes at least as much effort as creating a programming language.


And then there are UWP apps. They are based on a vNext edition of COM and use language projections to enable OO-style usage in C#, JavaScript and C++. It actually works. In the end you work with these objects in each language like any other object in that language.

It is possible; it just requires major engineering smartness (like writing and improving compilers, code generation, etc.).


Hell, it doesn't even have to be ultra advanced. Something close enough to Tk (not GTK, but Tcl/Tk which gets used by Python and a few other languages) would be fine in my book. It is simple, and a start, and allows the community to derive from a working solution to form amazing alternatives.

In fact... Racket has their own UI approach, but I'm not sure if it's just GTK under the hood or what, but it works cross platform.


You're pretty much making my point for me, aren't you? UWP is a single platform interface only and the development of the bindings has taken a tremendous amount of work, as you say yourself.

Projects like sip and PySide show how much effort it takes to make Qt bindings for Python.

And all of this discussion is just about the bindings, not the actual widget libraries themselves and the effort that goes into those.


Yep. It is just a sub-point I was addressing: that you can successfully translate C++ UI libs into other languages.

But generally, you are right. Language development is expensive enough. UI stack development, unlike class library development, is not a duty of a language developer.


It's actually really common but the problem was that, until recently, it was an IDE specific feature and some of the best IDEs weren't free.

What this article is about is the new(ish) open protocol originally developed for Visual Studio Code that is being adopted by quite a few other editors. The language server protocol (LSP) enables developers to write the kind of tools you're describing and have them support any editor (which has LSP support) on any OS. It's pretty cool. However, writing one is also a fairly involved process, so many of the smaller languages haven't yet had the bandwidth to write their own language servers.


The problem is much easier (at least 10x easier) for a dynamic language with a REPL or playground environment (scratch in Emacs) than it is with a static language, or providing stateless insight based on cursor location rather than the set of forms you've evaluated.

With a REPL or playground, you've got all the libraries loaded, all the function definitions in a map in your interpreter, and if you have a partial identifier you can simply do a prefix search in the interpreter. The interpreter is live; define a new function, and it's added to the map, so it's available for subsequent completions. If the function definition is wrong, hey, it's a REPL, next line try again please.

In a stateless situation, or for a static language, you need to build that map. Ideally you use the compiler's front end, but then the front end needs to be hardened to recover properly from errors. It needs to be told about the cursor location, and when it hits the cursor location in the middle of an incomplete symbol, it needs to figure out what's appropriate to complete at that point based on the stack of scopes available, and jump back out to the editor (or use coroutines, or pause its thread, or whatever) with what it found. It's simply more work. And if you can't use the front end, because the front end wasn't built for it, and you can't modify it, well you're going to have to build what is effectively a new front end to solve the problem.
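The "stack of scopes" lookup described above can be sketched as follows. This is a minimal illustration rather than any real compiler's front end: the `complete` function, the scope contents and the type strings are hypothetical, and a real implementation would build the scopes from an error-tolerant parse up to the cursor.

```python
from typing import Dict, List


def complete(prefix: str, scopes: List[Dict[str, str]]) -> List[str]:
    """Return names visible at the cursor that start with `prefix`.

    `scopes` is ordered outermost first. Inner scopes are searched first,
    so a shadowing inner name is listed once, ahead of outer names.
    """
    seen = set()
    results: List[str] = []
    for scope in reversed(scopes):  # innermost scope first
        for name in sorted(scope):
            if name.startswith(prefix) and name not in seen:
                seen.add(name)
                results.append(name)
    return results


# Hypothetical scopes at the cursor position: name -> type.
SCOPES = [
    {"print": "fun", "parse": "fun"},      # global scope
    {"path": "string", "count": "int"},    # enclosing function
    {"parts": "Array<string>"},            # innermost block
]

print(complete("pa", SCOPES))
```

In the REPL case the same prefix search runs against the interpreter's live environment, which is why it is so much less work: the map already exists.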


> Lisp had this kind of interactive programming support for a long time using SLIME and Emacs.

For those like me who have Vim burnt into their fingers and brain, there is Slimv[1] for Vim. Another alternative is Vlime[2] for Vim. Both these plugins are based on the same client-server architecture that SLIME is based on. In fact, these plugins rely on Swank TCP server (the same thing that SLIME also relies on). Swank receives SLIME commands from Slimv or Vlime and executes them. Of these two Vim plugins, I prefer Slimv personally because apart from supporting Common Lisp, it supports MIT Scheme and Clojure too. Vlime supports Common Lisp only at this time.

[1]: https://github.com/kovisoft/slimv

[2]: https://github.com/l04m33/vlime


Or Spacemacs with Slime.


I thought this article was about "building a language server" like the title says, but this is really just a grammar set for an existing language server.. unless I'm missing something? I realize that this is being picky about wording, but it's a little misleading.


Author here. I wrote both the language server and Muon compiler [1] from scratch. The language server reuses the compiler code, for parsing and type checking. I'm not sure which grammar you're referring to? The VS Code extension [2] does contain a grammar, but it's only used for syntax highlighting.

[1] https://github.com/nickmqb/muon/tree/master/compiler [2] https://github.com/nickmqb/vscode-muon


The properties of the language listed on the website are compelling, but I wonder what you are using the language for besides the compiler?


My main focus has been on the compiler and language server so far. Next up is a tool for generating foreign function definitions. This tool will make it easier to work with 3rd party libraries. Once ready, I'm planning on writing some examples, such as perhaps a small OpenGL program to demonstrate this functionality. Hopefully that will encourage users to give Muon a try for their own projects!


Sorry for the misunderstanding, I didn't read clearly.. my apologies. Very cool!


Are there any good resources on writing language servers? I've tried writing my own, but while LSP is well documented, it wasn't really beginner friendly (i.e. where do I start when writing one from scratch?).


I didn't look for any resources beyond the LSP spec and VS Code documentation, so I can't offer any recommendations here, I'm afraid. That said, looking back I'm pretty happy with the choice of using VS Code as the initial editor to get things up and running. While not perfect (e.g. VS Code will silently ignore your JSON message if you miss a closing brace), the docs are pretty good and it offers a reasonable level of debugging functionality (such as an output window where you can see messages sent to the language server). The Muon VS Code extension [1] is pretty minimal in itself, and it's MIT licensed, so feel free to fork it and use it as a starting point.
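For anyone starting from scratch, the first thing to get right is the message framing. In the LSP base protocol each message is a `Content-Length` header, a blank line, and a UTF-8 JSON-RPC body. The sketch below shows that framing; the helper names `encode_message` and `decode_message` are made up for illustration, and it assumes the whole message is already in memory rather than arriving in pieces on stdin.

```python
import json


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload the way the LSP base protocol expects."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data: bytes) -> dict:
    """Parse one framed message; raise ValueError instead of failing silently."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    if len(rest) < length:
        raise ValueError("truncated body")
    # json.JSONDecodeError is a subclass of ValueError, so a missing
    # closing brace surfaces here as an exception.
    return json.loads(rest[:length].decode("utf-8"))


request = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
framed = encode_message(request)
print(framed)
print(decode_message(framed) == request)
```

Running your own server's output through a strict decoder like this during development is one way to catch a malformed message before the editor quietly drops it.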

[1] https://github.com/nickmqb/vscode-muon


Thank you. I may well do just that. (this is what I love about open source -- when it works well everyone is giving each other a helping leg up).


The Sublime Text LSP plugin has some debugging facilities, you can poke it however you like since it's written in Python, and the project is run by two of the most positive, supportive, and just plain nice people that I've ever dealt with in an open source project.

https://github.com/tomv564/LSP


My impression was that it's glue code that wraps around an existing compiler and turns it into a language server. There are piles of resources out there already on writing compilers.
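As a rough picture of what that glue looks like, here is a minimal sketch of a request dispatcher. The method names `initialize` and `textDocument/completion`, the `capabilities` result shape and the JSON-RPC error code -32601 come from the LSP and JSON-RPC specs; everything else (the handler names, `KNOWN_SYMBOLS` standing in for a compiler's symbol table) is hypothetical.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical stand-in for whatever the compiler front end knows about.
KNOWN_SYMBOLS = ["parse", "path", "print"]


def on_initialize(params: Dict[str, Any]) -> Dict[str, Any]:
    # Advertise which features this server supports.
    return {"capabilities": {"completionProvider": {}}}


def on_completion(params: Dict[str, Any]) -> Any:
    # A real server would ask the compiler for symbols at params["position"].
    return [{"label": name} for name in KNOWN_SYMBOLS]


HANDLERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "initialize": on_initialize,
    "textDocument/completion": on_completion,
}


def handle(message: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Route one decoded JSON-RPC message to its handler."""
    handler = HANDLERS.get(message["method"])
    params = message.get("params", {})
    if "id" not in message:
        # Notifications carry no id and never get a response.
        if handler is not None:
            handler(params)
        return None
    if handler is None:
        error = {"code": -32601, "message": "Method not found"}
        return {"jsonrpc": "2.0", "id": message["id"], "error": error}
    return {"jsonrpc": "2.0", "id": message["id"], "result": handler(params)}


print(handle({"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}))
```

The dispatcher itself is small; the work is in making the compiler behind the handlers fast and tolerant of incomplete code.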


Good article. In fact, Muon is very well documented. Still, one of the curses of copying a segment of Python code is having to handle tabs and whitespace. Muon helps by having a feature for this (the first line says tab or space). Sigh, Muon opting for significant whitespace turns me off.


I agree that Muon's current handling of whitespace is not ideal. I have plans to allow users to customize it. For now, you can use the //tab_size=N directive as documented here: https://github.com/nickmqb/muon/blob/master/docs/muon_by_exa...


Muon’s for loops make me angry.


Can you expand on this?


From "Muon by example" [0], the for loops look pretty natural/intuitive subjectively.

  for i := 1; i < n {
    temp := a
    a += b
    b = temp
  }

> `for` works mostly like in other imperative languages. For convenience, the 3rd term can be omitted; in that case, the index variable is incremented by 1 at the end of each iteration of the loop. The index variable is scoped to the loop body. The variable can be omitted if you want to use an existing index variable, e.g.: for ; i < n; i += 1 { ... }

With arrays:

  for x in arr {
    sum += x
  }

  for x, i in arr {
    sum += x
  }

  for arr {
    sum += it
  }

> A secondary loop variable may be specified which holds the index of the current element. If no loop variable is specified, the name `it` is used.

[0] https://github.com/nickmqb/muon/blob/master/docs/muon_by_exa...


I’m also confused; Muon’s loop syntax and semantics look like nothing out of the ordinary, except that they allow omitting expressions…



