Language Server Protocol, Rust and Emacs (kellner.me)
119 points by fanf2 11 months ago | 85 comments



https://lists.gnu.org/archive/html/emacs-devel/2017-04/msg00... ("I am urging the GCC developers to work on implementing the language server."). Interesting statement from RMS, given LSP is a Microsoft technology, but it makes sense. The software on both sides of the reference implementation is under a GPL-compatible license. Also, LSP doesn't expose enough information to allow expropriation of GCC's frontend, which was RMS's concern with previous such efforts.

Really says something about Microsoft's evolution if you think about it.


I'm pretty sure RMS said something to the effect that the ship has sailed with Clang. There's no longer any point in trying to prevent people from using proprietary back ends with the GCC front end. If they want to do it, then they can do it with Clang, so there's no point anymore in trying to "adjust" GCC's architecture.

I believe a prominent example is NVidia's CUDA compiler. As far as I understand, it is a modest fork of the Clang front end plus a big new LLVM back end, and it's proprietary.

However Google has developed an open source alternative CUDA compiler, which is also based on Clang/LLVM: https://research.google.com/pubs/pub45226.html

Also I would have thought that Clang already has LSP support for C++, but maybe I'm wrong.


Another example is the companies selling compilers for embedded development.

Previously they had proprietary compilers or gcc forks whose changes they weren't that happy to contribute back.

Now they can have their proprietary clang forks and don't contribute anything back.

Plenty of slideware examples at LLVM conferences.

On one side, clang was great for improving the whole safety and tooling story for C and C++.

On the other side, if clang had been available to Brad Cox, NeXT would never have released Objective-C to the community.


> On the other side, if clang had been available to Brad Cox, NeXT would never have released Objective-C to the community.

Actually the Stepstone "compiler" was a preprocessor that generated C (like cfront was for C++). Brad Cox never worked for NeXT.

The Objective-C front end for gcc was written by Steve Naroff at NeXT. Later he and I persuaded Jobs that that front end should be merged into the official gcc tree, and so we (Cygnus) worked with NeXT to merge them and, amazingly, produce Objective-C++ in the process.

Jobs was focused on the goal. When NeXT (and later Apple) were underdogs, he was happy to embrace open standards (e.g. JPEG, MPEG, MPEG-3...) when it would help, and to adopt outside proprietary standards when he thought it was needed (e.g. he paid for the RTF license).


Thanks for clarifying how everything went, and correcting my "what if".


> Now they can have their proprietary clang forks and don't contribute anything back.

References? Most of the "custom" compilers are based on some ancient version of gcc.

I haven't yet bumped into an embedded compiler that was clang-based.

> Plenty of slideware examples at LLVM conferences.

Well, the conference can obviously stop that, if they wish.


One example is the guys from Embecosm, http://www.embecosm.com/

Not everything they do comes back to LLVM; it depends on the customer's willingness to let it happen.


Where do I find GPUCC so that I can install it?


I guess it's upstream now:

http://llvm.org/docs/CompileCudaWithLLVM.html

The team at Google published a paper in CGO 2016 detailing the optimizations they’d made to clang/LLVM. Note that “gpucc” is no longer a meaningful name: The relevant tools are now just vanilla clang/LLVM.

There is a Hello World program on that page, which looks good.


It's a free and open standard. Also a good idea. And technologically on a firm-ish foundation.

Ignoring it just because Micro$oft is attached is idiotic.


> Ignoring it just because Micro$oft is attached is idiotic.

True, but to a casual observer it also sounds like something RMS might do.


You don't need to be so aggressive about this, everyone already agrees with you: nobody, not even the usual Microsoft-haters, is hating on LSP. Not even RMS!


LSP is such a good idea, it is hard to ignore.


History says otherwise; they've been ignoring SLIME's Swank for at least a decade.


No, it's merely a work-around for two very distinct and self-inflicted problems:

- The pain of exposing language-native compiler/language AST and indexing APIs to a JavaScript IDE.

- RMS' inability to accept that GCC might need to expose an AST to external consumers, and his paranoia regarding how that might undermine his GPL-everywhere ideals.

Using JSON for local IPC -- and requiring IDE language support to funnel itself across that high-overhead IPC connection -- is fey.


LSP is sensible for the same reason that the middle end in a compiler is sensible - it turns an MxN problem into an M+N problem.

For compilers, M is source languages and N is target platforms. For LSP, M is source languages and N is editors. (With, say, 10 languages and 10 editors, that's 20 implementations against a common interface instead of 100 bespoke integrations.)

They both have the same downside. The more esoteric the language feature, the lower the fidelity of support in the common middle.


Libraries solve the problem of said middle end in a compiler, and in an IDE.

If you insist on using IPC, then you've incurred a great deal of friction in a place that it matters.

If you insist on using JSON IPC, then you've incurred a great deal of overhead in a place that it matters.


This times 1000. AN API IS ALREADY A PROTOCOL, but one that doesn't require external processes, serialization, extra failure modes, etc, etc.

The fact that the HN community seems to have jumped aboard this idea, "yeah let's just require a server to do something simple like format text in your editor", is completely flabbergasting. People just seem to have NO IDEA how much complexity they are adding, and don't care.

Maybe in 5 years our machines will be running 10,000 processes at boot because people will want a server for every operation...


Do you know how these IDE features are currently implemented in editors like vim? Unless there is built-in support (e.g. ctags), most plug-ins that provide language-specific features do so by running an external tool, sometimes going so far as scraping the compiler output.

This means that on every single change, a new heavyweight process is created, communication happens over unspecified textual formats, and everything is likely to break with the next update because there is no stable interface.

JSON IPC with a continuously-running process using a well-specified protocol is a huge step up in comparison.
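For the curious, the wire format really is that simple: JSON-RPC bodies framed with a Content-Length header, sent to one long-lived process. A minimal sketch in Rust; spawning "rls" here is just an example (any language server speaks the same base protocol):

    // Sketch only: start a long-lived language server and send one
    // framed message, per LSP's base protocol (a JSON-RPC body
    // preceded by a Content-Length header).
    use std::io::Write;
    use std::process::{Command, Stdio};

    fn main() -> std::io::Result<()> {
        let mut server = Command::new("rls") // any LSP server works the same
            .stdin(Stdio::piped())
            .stdout(Stdio::piped())
            .spawn()?;

        let body = r#"{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"processId":null,"rootUri":null,"capabilities":{}}}"#;
        let stdin = server.stdin.as_mut().expect("piped stdin");
        write!(stdin, "Content-Length: {}\r\n\r\n{}", body.len(), body)?;
        Ok(())
    }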


Just because the current way it's done is terrible, doesn't mean you should "upgrade" to a bad way instead of a good way.


What would a good way be? Why is having the code completion and parsing in a self-contained library a bad way? Or are you just opposed to using TCP instead of an ABI?


A separate server also has some advantages: (1) it's not going to crash your editor, and (2) it opens up the possibility of keeping the language server running outside of the editor and available to other tools.

Yes, serialization and communication have a lot of extra cost compared to a function call, but consider that (1) the request rate is limited by user typing speed, which means around 10 requests per second tops, low enough that it won't matter, especially compared to running a type checker, and (2) all the calls in LSP can take a very long time to complete, so you won't be able to turn this protocol into blocking API calls made from the UI loop anyway.

Even with a plugin you would need to run it in a separate thread (or possibly threads) and cancel requests after a timeout.

That said, LSP is a terrible protocol. They chose to represent all offsets in terms of UTF-16 (!) code units, which is a baffling choice, since most editors won't be reading UTF-16 files, nor will they be representing them internally as UTF-16.
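To make the mismatch concrete, here is a small Rust sketch (the example line is made up) of the conversion every UTF-8 editor is forced to do before reporting a position to the server:

    // LSP's Position.character counts UTF-16 code units, so an editor
    // holding UTF-8 text must convert byte offsets before every request.
    fn utf16_column(line: &str, byte_offset: usize) -> usize {
        line[..byte_offset].chars().map(char::len_utf16).sum()
    }

    fn main() {
        // 'π' is 2 bytes in UTF-8 but 1 UTF-16 unit, so the byte offset
        // of "//" (15) and its UTF-16 column (14) disagree.
        let line = "let π = 3.14; // ok";
        let byte_offset = line.find("//").unwrap();
        assert_eq!(byte_offset, 15);
        assert_eq!(utf16_column(line, byte_offset), 14);
    }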


You can also keep state, caching, and such within the server, so the client doesn't have to think about that either.


> They chose to represent all offset in terms of UTF-16

Wow, yeah, that's pretty terrible.


I imagine JSON serialisation overhead is pretty small compared to parsing/typechecking a Rust program, which is probably what the Rust Language Server has to do whenever anything changes.

Not to mention that a lot of people capable of writing the tooling would struggle to export a C API. I write Scala in my day job and it would take me a while to learn how to do that - and I've done some programming in C/C++ before.


I'm a little confused as to what you propose instead.

Suppose I have vim, and the Rust compiler. I want to add RLS level of support to vim. I download some vimscript plugin, and what? Do you distribute the rust language server as a compiled plugin that you add to the address space of the editor at runtime? And if there's a bug, and it segfaults, then it takes down my entire VIM process?

It seems like there's some complexity in directly calling the code with an API too. It's actually not too bad to just open a pipe and communicate.

Maybe I'm missing something, but wrangling compiled plugins seems like it'd be a bad time.


While I do love vim, its own high level of internal implementation brokenness doesn't really have much bearing on how one implements this sort of thing in a real multi-language IDE.

And as for your question, the answer is ... yes. Sure. Why not? That's what we already do in nearly all IDEs.


I mean, if you want to build a universal system, leaving out vim and emacs is just shooting yourself in the foot.

Are there any plugins which are binary compatible between more than one IDE? That seems hard.


> I mean, if you want to build a universal system, leaving out vim and emacs is just shooting yourself in the foot.

Said no IDE user, ever.

> Are there any plugins which are binary compatible between more than one IDE? That seems hard.

No. And who cares?


An API that can be accessed from heterogeneous languages will involve IPC.

Particularly since the best API will use the compiler's symbol tables (avoiding implementing syntactic and semantic analysis twice, buggily), and compiler implementation languages are even more diverse than editor implementation languages.


> An API that can be accessed from heterogeneous languages will involve IPC.

No. If your language cannot call into a dynamic library using a well-defined C ABI for your platform, then it is already failing to speak a standard protocol. Building all kinds of crazy, complicated, slow infrastructure in order to get it to successfully speak some other protocol, is a symptom of modern-day clueless programming.

> Particularly since the best API will use the compiler's symbol tables (avoiding implementing syntactic and semantic analysis twice, buggily)

Yes, this is of course a good idea. Why one presumes this requires a separate running process, I have no idea.
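As a hedged illustration (every name here is invented; this is not any real compiler's API): an analyzer written in Rust could export its entry points behind a C ABI like this, and any language with C FFI could load it in-process:

    // Hypothetical sketch, not a real API: a completion query exposed
    // through the platform's C ABI from a Rust cdylib.
    use std::os::raw::{c_char, c_int};

    #[repr(C)]
    pub struct Completion {
        pub label: *const c_char, // NUL-terminated, owned by the library
        pub kind: c_int,          // function, struct, module, ...
    }

    #[no_mangle]
    pub extern "C" fn analyzer_complete(
        _path: *const c_char,
        _line: u32,
        _column: u32,
        out: *mut *const Completion, // receives a library-owned array
    ) -> c_int {
        // A real implementation would run the compiler front end and
        // fill `out`; this stub just reports zero completions.
        unsafe { *out = std::ptr::null() };
        0
    }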


> No. If your language cannot call into a dynamic library using a well-defined C ABI for your platform, then it is already failing to speak a standard protocol.

This also involves a marshalling cost at the ABI boundary, which may be lower overhead than parsing JSON, but is significantly more brittle. And it's less ergonomic for many plugin/editor authors. And it can't be spec'd with a schema that isn't just "read the headers."


> This also involves a marshalling cost at the ABI boundary

Only for some languages, and that cost should be far, far less than running a separate process, shipping JSON over pipes, and parsing the JSON.

> And it's less ergonomic for many plugin/editor authors

I think the fact that many modern programmers find this more ergonomic than a C ABI is part of what he is complaining about. Let's get comfortable with what is good, rather than settle for what we are comfortable with?


Agreed that pipes+JSON is much higher overhead; just noting that this proposed approach still isn't free.

> Let's get comfortable with what is good, rather than make what we are comfortable with?

I agree in the abstract! I just emphatically disagree with characterizing a C ABI as "what is good."


I don't understand what you are saying here.

Why is it "significantly more brittle"? It is a well-specified interface. It is less brittle than talking over a socket because the kinds of points of failure involved with sockets don't exist in this case.

> And it can't be spec'd with a schema that isn't just "read the headers."

What does that even mean? It's a protocol just like any protocol, except you get the added benefit that for many languages it can be typechecked. Why are you claiming it can't be specified or that someone has to "read the headers"? What headers?


From your endorsement of "using the compiler's symbol tables" (paraphrasing), I took you to mean that you were proposing binding directly to GCC (or another tool) as a library, relying on its internal data structures as this C API. Based on this comment, it sounds like you're now suggesting that this API should still be standardized, and require translating from the compiler's internals into some standardized AST/symbol format anyway. I still think the latter is bonkers for several reasons (SIGABRT being one), but it's significantly less bonkers than what I had thought you were proposing initially.


> I still think the latter is bonkers for several reasons (SIGABRT being one)...

The fact that we're typing this, and it works -- without the entire world falling apart because of crashes in complex library code -- demonstrates why this is not remotely bonkers.


Not only that, but it means if the library crashes, your editor process dies. That hardly seems better than sending some text over a socket. At least if the external process crashes, your editor can just restart it.


Several people have said this. Look ... a "crash" in a modern operating system is a recoverable exception.


How do you recover from a segfault? How do you know what data might be corrupted?


A library API is bound to a specific language/runtime. But every language out there can speak JSON. Language servers are mostly written in the language they are for, because that language already has the compiler APIs. The editor is often written in a different language.


> extra failure modes

Well, different failure modes, maybe. If an external process crashes, then you just have your editor restart it. But if you've linked a library into your editor, and it crashes, then your editor crashes.

I much prefer either keeping that code in a separate process, or having that code written in a memory safe language, where it won't take down your editor when something goes wrong.


Incremental recompilation isn't fast enough to wait for between keystrokes, so in-process servers would run in their own thread. Along with accounting for arbitrarily incompatible language runtimes and memory management schemes, wouldn't we be looking at badly re-implementing half of a process-and-ipc infrastructure here, just without memory protection?
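A sketch of what that would look like in Rust, with a channel standing in for the memory-protection boundary you'd be giving up; `analyze` is a made-up stand-in for incremental recompilation:

    // The analysis runs on its own thread; the editor's UI thread
    // gives up after a deadline instead of blocking on it.
    use std::sync::mpsc;
    use std::thread;
    use std::time::Duration;

    fn analyze(source: String) -> Vec<String> {
        // Stand-in for incremental recompilation.
        source.lines().map(str::to_owned).collect()
    }

    fn main() {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            let _ = tx.send(analyze("fn main() {}".to_string()));
        });
        // Wait at most 50 ms, then move on -- effectively a timeout
        // and cancellation protocol, just without process isolation.
        match rx.recv_timeout(Duration::from_millis(50)) {
            Ok(result) => println!("diagnostics: {:?}", result),
            Err(_) => println!("analysis still running; keep typing"),
        }
    }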

Agreed on JSON, though.


What library supports C, C++, Lisp, Java, JS and C# callers?


I think any C API should be targetable by all of those.


The JetBrains IDE libraries?

Those are separate libraries with support for C, C++, Clojure, Java, Kotlin, C#, JS, PHP, Python, and more.

All of them exposing the entire AST and the entire environment, which is far more than LSP ever did.


The question was not which library could present an AST for all those languages, but which kind of library format can be consumed from all those languages, ideally without FFI and writing complex wrappers. C libraries are not very convenient to use from C#, the JVM, or JS (Node.js), and sometimes it isn't even possible (e.g. for JS in the browser).

The question is important, because otherwise one would constrain editors to be written only in a language which is compatible with the library format.


Using C libraries from C# or the JVM is very convenient, even today. There are even automated systems to generate the entire bindings for the JVM; I've written bindings myself for a few libraries. You can just generate the interface file for Java from the .h with JNAerator, import it, and you're done.


That doesn't solve the problem of a segfault in the C library crashing the entire JVM.


It does, because there are also libraries to automatically spawn a separate JVM and communicate with it via an IPC system. Or even spawn other things.

But if you want a system where I have to transfer gigabytes via a JSON IPC bus every minute, sure. That's totally not going to destroy performance in projects that are several million lines of code with major auto-generated assets.

The language server protocol is useless for larger interwoven projects. The same issue already appears with JetBrains Rider (a C# IDE where the C# parser is implemented in a separate process).


> The same issue appears already with JetBrains Rider (a C# IDE where the C# parser is implemented in a separate process)

The fact that Resharper (which powers the Rider language processing bits) blocks the VS UI should tell you that this problem lies elsewhere.


Resharper is out-of-process nowadays, afaik. Even in VS.


Then definitely that should raise some alarm flags. Why would an out-of-process server block the VS UI, unless there's some other problem?


Why would you be sending gigabytes of data? You don't have to send the whole project across every time a change is made.


Because tiny changes can have major effects, for example, if you use templating (or Java annotation processing) and change a template that’s used everywhere.

The IDE has to always stay responsive, no matter what is happening, no matter how large the change is.


Such a change may have a huge impact on the internal state of the language server, but on the protocol side it's probably minor. The editor just fetches the one relevant piece of the new state on the next Goto Definition / autocompletion / etc. request; there's no need to stream the whole new state from the language server to the client on each update.

My current estimate is that the protocol costs are O(1) with respect to project size.


That's the issue: that shouldn't be part of the language server, but of the IDE.

Please read this article, which was previously #1 on HN and is from a developer who has implemented IDE support for a language by using LSP: https://perplexinglyemma.blogspot.de/2017/06/language-server...

The developer of the Rust plugin for KDevelop said he had to rebuild half of the Rust parser because the Language Server Protocol is so completely inadequate that it can't even be used to implement half of KDevelop's features.

Same with stuff like data stream analysis: how does the Language Server Protocol let you mark a variable and see a graph of where it comes from and how it's transformed? How do you get proper structural autocomplete?

For many features the IDE needs actual full access to the AST. The language server protocol doesn't provide that, so you have to reimplement half of the language server in your own IDE plugin. And it's far too slow.

The Language Server Protocol is comparable in quality to Kate's language plugins, which is okay for an editor but completely unacceptable for an IDE. You can't do proper structural refactoring either: no proper structural extraction, no proper autocomplete or AST transformations, no automated inlining across files.

For a lot of functionality you end up having to try to reconstruct the AST from the autocomplete and cursor info that LSP gives you, in order to fill the IDE's internal data structures, which is ugly, painful, and slow.


I've read the article and I partly agree with the author: to get the best possible IDE experience you will need more information than LSP provides.

However, I disagree that LSP is completely unacceptable for IDEs. I work daily with VSCode and its LSP-based language support for TypeScript, C#, and others. The number of IDE features that I get for TypeScript is higher than what lots of real IDEs provide for their built-in languages. The quality of the LSP-based addons seems to depend mainly on the implementation of the language servers and not on the protocol, e.g. the language server experience for TS is top-notch, the one for C# is OK-ish, and the one for C++ is pretty weak.

I see no problem integrating any kind of auto-complete support through LSP: the IDE/editor just sends the position inside a document which needs to be completed, and the language server responds with all possible suggestions. There's absolutely no need to store an AST in the editor for that; the AST can stay in the language server. It's the same for references to types, refactoring commands, etc. Yes, if someone needs some non-standard "data stream analysis", the command and results for that might not yet be standardized in LSP - but they could be added if it's worth it for the users. And I guess one can also have non-standardized optional extension APIs between a language server and an editor. In the end it's the same: the editor asks the language server (which has full access to the AST), "Give me a graph of where the variable comes from and how it's transformed", and the language server responds in a well-defined way which the editor just needs to render.
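Concretely, that exchange is just a small JSON-RPC message. A sketch in Rust using the serde_json crate; the URI and position are made up, but textDocument/completion is the real LSP method:

    // Building the completion request described above. The server's
    // reply is a list of suggestions; no AST ever crosses the wire.
    use serde_json::json;

    fn main() {
        let request = json!({
            "jsonrpc": "2.0",
            "id": 7,
            "method": "textDocument/completion",
            "params": {
                "textDocument": { "uri": "file:///project/src/main.rs" },
                // Zero-based; "character" counts UTF-16 code units.
                "position": { "line": 41, "character": 12 }
            }
        });
        println!("{}", serde_json::to_string_pretty(&request).unwrap());
    }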

And back to the article: the point of the article is not necessarily that a Rust Language Server plugin wouldn't be good enough for 90% of all users. From my point of view, it's more that the author is excited about implementing language support themselves, in their preferred way and as perfectly as possible. That's an absolutely reasonable thing. If one is interested in implementing type inference for Rust or other complex languages oneself, then I'm sure it's a great learning experience. However, it will also be LOTS of work, which for a language as huge as Rust might be outside the scope of what most mortal developers can do in their private time. If the end result is a half-baked implementation for one IDE which is buggier or has fewer features than a language-server-based solution (which could be shared between multiple IDEs), then nothing has been won. The language server approach helps here in that, instead of many 30% language implementations (or even abandoned ones) in individual IDEs, work can be concentrated on one LSP implementation, which might reach a 70-80% feature set.


> The IDE/editor just sends the position inside a document which needs to be completed, and the language server responds with all possible suggestions. There's absolutely no need to store an AST in the editor for that; the AST can stay in the language server. It's the same for references to types, refactoring commands, etc. Yes, if someone needs some non-standard "data stream analysis", the command and results for that might not yet be standardized in LSP - but they could be added if it's worth it for the users. And I guess one can also have non-standardized optional extension APIs between a language server and an editor. In the end it's the same: the editor asks the language server (which has full access to the AST), "Give me a graph of where the variable comes from and how it's transformed", and the language server responds in a well-defined way which the editor just needs to render.

That would be pretty stupid, though, as every language server would duplicate a massive amount of code.


And if you put the logic in the IDE, then every IDE has to duplicate the functionality for every language.

It seems like the language implementation is the right place to put logic that requires information about how the language works.


To implement a proper IDE, you're going to have to "duplicate" that logic anyway.


LSP--or rather the general idea of a common interface between IDEs and language-specific static analyzers--is a very good idea regardless of GCC or VSCode. Which isn't to say that LSP made universally correct design decisions (e.g. it's fair to be grumpy at JSON), and also isn't to say that it represents the be-all end-all of IDE integration (the fact that it necessarily represents a lowest-common-denominator interface is well-understood). But having such a standard to establish a baseline of support between languages and editors with the minimal amount of work possible is something that needed to be done.


Whether or not the IDE is JS-based is beside the point. A common format for any IDE to integrate with is a fantastic idea.


That would be an API; JSON IPC is used because of the difficulty in hoisting language-native AST/indexing APIs into JavaScript.


I don't see that as actually being true. There are thousands of native npm modules which work just fine with Electron. There's really no problem exposing native APIs to JS.

What VSCode has pioneered, and this is something which Atom got wrong, is a multi-process model, where the various IDE components communicate via IPC. This improves stability and allows isolation of 3rd party plugins, and provides mostly foolproof parallelism. When plugins are such a big part of the experience, this is a very good thing. It's a deliberate design decision, not some attempt to overcome the failings of JavaScript.

Couple this with the fact that limiting "API" to mean "C ABI" forces every compiler author to start exposing complex C structures, unsafe pointers, and the like - which, if their compiler is written in, say, Haskell or Lisp, is going to be particularly painful - vs. implementing text-based RPC over a socket in whatever way is best for them.

If the Language Server Protocol had been a C API, then I seriously doubt it would have gotten much traction, as it's just too awkward for many compiler authors to implement. Unsafe C APIs are, frankly, last century's technology.


> Unsafe C APIs are, frankly, last century's technology.

Which is why I see the UWP, iOS, and Android models for application extensions as a positive.

They might be more complex to implement than a straight unsafe C API, but that is exactly the goal, to be a safer alternative.


That's true for every combination of language / platform where the languages differ. Rewriting AST parsers for every language would be nuts. Overall this vastly increases accessibility / choice when it comes to languages. That's purely a good thing. People are still free to make hyper-optimized versions for whatever case they would like to.


Regardless of those points, having a standard protocol for this kind of thing makes a lot of sense, instead of binding directly to the AST data structures of each compiler.


Let me clarify:

Shipping, along with your language, some kind of standardized API so that editors can easily add high-quality, IDE-level features for new languages is a great idea.

That this is currently being done via a server + JSON over pipes (or whatever) is a suboptimal way to do it, I agree.


As a vim user, I'm very excited about LSP. Neovim (close enough) may add native support, which would be great: https://github.com/neovim/neovim/issues/5522

It's possible to do a lot of the nice IDE features in vim (and I'm sure emacs) today w/ all sorts of contortions, but a standard way to get this done would be amazing.


> On the initial opening of a project it will hog the CPU and spin up the fans but that doesn't last too long. Sometimes it completely hangs Emacs. Nothing that killall rls followed by a M-x revert-buffer can't fix.

Yep, sounds like emacs. (I use emacs, but my god is it a horrible collection of half-broken hacks. Still the most useful editor going, though. If anyone wants to fund me, I'll start a project to make an editor that combines the flexibility of emacs with a rigorous, ruthless obsession with speed and correctness. :)


emacs: Emacs Makes Any Computer Slow

Boy it's been a while since I've broken out that one ;)


It's as good a text editor as it is a terrible operating system

(to flip an old one on its head).


The Language Server Protocol is flawed already in its concept, as this article (previously #1 on HN) argues: https://perplexinglyemma.blogspot.de/2017/06/language-server...

It instead proposes having the language server expose the entire AST and the environment available at each point, and having the IDE use that for autocompletion, as this is far more powerful than what is currently doable with LSP. (Currently, the editor just transmits cursor position and content to the language server, which then does all the highlighting, autocompletion, etc. This is not only less configurable and less consistent, but also less usable, as the language server often isn't able to offer autocompletions that are as smart.)

Many IDEs, such as the JetBrains IDEA platform, do exactly this with their language plugins.


I mentioned this at the time too: I think this was a deliberate decision by Microsoft and it is not a deal-breaker. LSP is a Minimum Viable Product that was deliberately kept simple to encourage uptake and not spook people away. Once people are comfortable that it is a good idea and not an EEE play by Microsoft, version 2.0 could be upgraded to support things like ASTs.


> version 2.0 could be upgraded to support things like ASTs.

That’s called "you can create an entirely new API that’s entirely different, but might use the same name".

That’s the issue here, LSP is so fundamentally flawed you’d have to scrap it entirely to support what is needed for a real IDE.


Just a warning to anyone expecting auto-completion in Rust to be anything like C++ or Java, or even Typescript. It isn't close to that yet.


To counter that, I've got a pretty moderately sized Rust project that uses hyper, websockets, glium, and a bunch of other libraries. RLS + VSCode has been incredibly solid.

It's come a long way in the last few months, I'd recommend giving it a shot again.


The lack of code-inlined error reporting is still a big miss (unless I'm using the extensions wrong somehow). It's also pretty complicated to get right in Rust, as the errors tend to link through to multiple files for lifetime issues and such. I started tinkering with a version of this yesterday for VSCode. Hoping to get a sane v0 off the ground.

Would be cool to have a linter that can run file-by-file as well, though that seems tough for clippy, since it's a compiler plugin. Maybe just hacking it to only report on a particular file would work.


> code-inlined error reporting is still a big miss

Do you mean errors next to lines? If so, that works today in the RLS variant. You get a red highlight and can mouse over for details.


For me this only seems to work in the legacy racer mode (in https://github.com/editor-rs/vscode-rust). Have you had success in RLS mode? RLS just spits stuff into the console. =|


I've gotten RLS mode to work for VSCode personally. I've had issues with larger projects, but for smaller projects it seems to work fine.


Hmm. I've had trouble getting that working. Will have to toy with it more.


While true, it is good enough for me to use VSCode, otherwise it wouldn't be installed.


How does it compare with racer-mode? Does it have roughly the same functionality or is one a subset of the other?



