Rust Moving Towards an IDE-Friendly Compiler with Rust Analyzer (infoq.com)
235 points by sethev 27 days ago | 64 comments

> Another thing is difference in handling invalid code. A traditional compiler front-end is usually organized as a progression of phases, where each phase takes an unstructured input, checks the input for validity, and, if it is indeed valid, adds more structure on top. Specifically, an error in an early phase (like parsing) usually means that the latter phase (like type checking) is not run for this bit of code at all. In other words, "correct code" is a happy case, and everything else can be treated as an error condition. In contrast, in IDE code is always broken, because the user constantly modifies it. As soon as the code is valid, the job of IDE ends and the job of the batch compiler begins. So, an IDE-oriented compiler should accommodate an incomplete and broken code, and provide IDE features, like completion, for such code.

This has been one of the most frustrating things about Rust's development experience. A piece of code looks error-free, then you fix an error somewhere else, reach the next compiler phase, and an error pops up where you just were. It makes it really hard to develop something piece by piece.

I'm glad they're aware of these problems and are making a concerted effort to address them.

> A piece of code looks error-free, then you fix an error somewhere else, reach the next compiler phase, and an error pops up where you just were.

Isn't this the case with all compilers? The most basic example: I have a syntax error on line 10, and that's the only reported problem, but then I fix the syntax error and now I have an error on line 200.

I agree it is a problem, the world would be better if all languages could avoid it, but I don't expect it ever to go away. Do you believe this problem is better or worse in Rust compared to other languages?

Eclipse's ECJ will keep compiling a class even when a specific method is invalid.

And even inside a method it'll try to massage the AST into a somewhat valid state while you're typing so it can keep analyzing the code and show errors/make suggestions. A missing bracket on line 15 won't prevent it from soldiering on to spot a misspelled method call in line 40 even though the nesting is totally messed up and if taken at face value line 40 would be syntactically invalid since it's outside the class body.

And it doesn't just march on to identify errors elsewhere, it can often continue all the way to runnable bytecode. The invalid parts will just be replaced by exceptions that parrot the compiler error. Meanwhile other code paths can already be exercised which can be very handy at times.

On the flipside, I have managed to actually run non-compiling Java classes because of that a couple of times, only to stare at the debugger with wide eyes when I realized what was going on.
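
For anyone who hasn't seen the "invalid parts get replaced by exceptions" behaviour, here is a minimal sketch of the idea in Rust. It is not how ECJ is implemented; the toy "language" (a method body is valid only if it is an integer literal) and the names `compile_class`, `constant` and `failing_stub` are made up for illustration.

```rust
use std::collections::HashMap;

/// A "compiled" method: calling it either yields a value or reports the
/// compile error that was deferred to run time.
pub type Method = Box<dyn Fn() -> Result<i64, String>>;

/// Code for a method whose body compiled fine.
fn constant(value: i64) -> Method {
    Box::new(move || Ok(value))
}

/// Code for a method that failed to compile: it parrots the error when called.
fn failing_stub(message: String) -> Method {
    Box::new(move || Err(message.clone()))
}

/// Toy compiler (hypothetical): a body is valid only if it is an integer literal.
/// An invalid body does not abort the build; it becomes a stub that fails when
/// that method is actually invoked.
pub fn compile_class(methods: &[(&str, &str)]) -> HashMap<String, Method> {
    let mut table = HashMap::new();
    for (name, body) in methods {
        let compiled = match body.trim().parse::<i64>() {
            Ok(value) => constant(value),
            Err(_) => failing_stub(format!(
                "compile error in `{}`: cannot parse `{}`",
                name, body
            )),
        };
        table.insert(name.to_string(), compiled);
    }
    table
}
```

Compiling `[("answer", "42"), ("broken", "4 +")]` gives a table where calling `answer` works and only calling `broken` returns the deferred error, so the other code paths can already be exercised.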

I could even understand not trying to handle the brackets scenario. In RLS, a type error on a local variable can hide a borrow-checker problem in a completely different file.

It’s more pronounced than in other languages. You could have an entire file with zero errors. No red squiggles, nothing. Then after you fix your errors in different files suddenly errors show up (eg for borrow checking, thread safety analysis, etc).

That always gets me. I'll be writing some code and think I finally got really proficient with Rust because I'm not getting any errors or warnings, then I try to compile and find out an error in another file was preventing the compiler from showing me everything wrong in my current file.

Yes I think what rust fans should argue is not that it’s just like compile time errors in other languages, but more like run time errors.

In other languages I'm familiar with as long as the syntax can actually be parsed, errors are mostly localized. Of course if you change the type of something that can cause errors elsewhere. But this is different. The new errors have nothing at all to do with the one that was fixed except that they come from an earlier stage of the compilation process.

> Do you believe this problem is better or worse in Rust compared to other languages?

Unlike other languages, Rust still has borrow checking to go after all the type errors are gone.

An idea currently being added to Swift, C++ (via static analysis, core guidelines lifetime profile), Ada/SPARK, D, OCaml and Haskell, originally developed in Cyclone, further explored in ATS.

So that "Unlike other languages" isn't quite true.

What Rust has done is prove that it is viable to push such ideas into mainstream computing.

This is incorrect. Neither Cyclone nor ATS had borrow checking. In fact the ATS author at one point dismissed borrow checking as too inflexible.

The borrow checker is actually original, though based on known techniques. Specifically, the original part is that it allows the per-use-site choice of aliasing vs. mutability instead of making the decision globally.

I might be wrong on the origins, but I am certainly right that everyone else is now adopting similar algorithms.

So while it is great that Rust is making these concepts mainstream, in about 5 years time, the other languages will have their New Jersey style implementations mature and available for their communities.

Borrow checking cannot be soundly added to C++, or most other languages, for systematic reasons. Soundness isn't a theoretical property here: C++ has constantly been adding things that make it safer as opposed to making it safe, and memory safety problems are still a huge issue.

See "Safe C++ Subset is Vaporware" by Robert O'Callahan: https://robert.ocallahan.org/2016/06/safe-c-subset-is-vapour...

Why do you think I mentioned New Jersey style implementation?

Google and Microsoft are the ones driving that effort, in what concerns C++.

That vapourware is quite welcome when using DriverKit, Unreal, COM/UWP, Project Treble drivers, ...

It depends on the compiler. A naïve parser might reach a syntax error and error out. A more advanced parser will look for a synchronization point that it can move to and carry on. If the parser is in a statement, that might be a semicolon; for a function it could be a closing brace. From there it can reset to some degree and carry on parsing the file.
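
A minimal sketch of that synchronization idea (often called panic-mode recovery), assuming a toy whitespace-separated language made only of `let <name> = <number> ;` statements. The names `parse_with_recovery`, `parse_let`, `LetStmt` and `ParseError` are hypothetical and this is not taken from any real compiler.

```rust
/// A diagnostic with the index of the token where the problem was noticed.
#[derive(Debug, PartialEq)]
pub struct ParseError {
    pub token_index: usize,
    pub message: String,
}

/// A successfully parsed `let <name> = <number> ;` statement.
#[derive(Debug, PartialEq)]
pub struct LetStmt {
    pub name: String,
    pub value: i64,
}

/// Parses one statement starting at `start`; returns it plus the next position.
fn parse_let(tokens: &[&str], start: usize) -> Result<(LetStmt, usize), ParseError> {
    let expect = |index: usize, wanted: &str| -> Result<(), ParseError> {
        match tokens.get(index) {
            Some(tok) if *tok == wanted => Ok(()),
            other => Err(ParseError {
                token_index: index,
                message: format!("expected `{}`, found {:?}", wanted, other),
            }),
        }
    };

    expect(start, "let")?;
    let name = match tokens.get(start + 1) {
        Some(tok) if tok.chars().all(|c| c.is_ascii_alphabetic()) => tok.to_string(),
        other => {
            return Err(ParseError {
                token_index: start + 1,
                message: format!("expected a name, found {:?}", other),
            })
        }
    };
    expect(start + 2, "=")?;
    let value = match tokens.get(start + 3).and_then(|tok| tok.parse::<i64>().ok()) {
        Some(v) => v,
        None => {
            return Err(ParseError {
                token_index: start + 3,
                message: "expected a number".to_string(),
            })
        }
    };
    expect(start + 4, ";")?;
    Ok((LetStmt { name, value }, start + 5))
}

/// Parses every statement it can. On an error it records a diagnostic, skips
/// to just past the next `;` (the synchronization point) and keeps going.
pub fn parse_with_recovery(source: &str) -> (Vec<LetStmt>, Vec<ParseError>) {
    let tokens: Vec<&str> = source.split_whitespace().collect();
    let mut stmts = Vec::new();
    let mut errors = Vec::new();
    let mut pos = 0;

    while pos < tokens.len() {
        match parse_let(&tokens, pos) {
            Ok((stmt, next)) => {
                stmts.push(stmt);
                pos = next;
            }
            Err(err) => {
                // Resume at the token where the error was noticed and scan
                // forward until a `;` is found.
                pos = err.token_index;
                errors.push(err);
                while pos < tokens.len() && tokens[pos] != ";" {
                    pos += 1;
                }
                pos += 1; // consume the `;` itself (or step past the end)
            }
        }
    }
    (stmts, errors)
}
```

With the input `let a = 1 ; let = 2 ; let c = 3 ;` the middle statement produces one diagnostic, and both `a` and `c` are still parsed. The trade-off is that a missing `;` swallows the following statement too, which is the usual weakness of this kind of recovery.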

Hadn't thought of that potential benefit to the rust-analyzer work: I could get borrow errors upfront even if there are irrelevant errors in other areas of code.

(borrow checker seems like it is one of the last phases).

That’s how you know you’re reaching the light at the end of the tunnel, when the borrow checker errors start showing up

I've started associating a shower of warnings with "success", because those only appear once all the errors have been taken care of

Or, in my case, it tells me why I have to restructure my code :(

I mean, an early warning, before I actually made those mistakes would help a lot, if it can be done that is :)


Do you have to dump on Rust because of its learning curve? It isn't even that bad. It's not like Golang doesn't have warts of its own.

I have a hard time believing you make this choice often or "usually". This seems like flat out prejudice.

If Go was fast enough, why didn't you start out using it, or some other language that is 3x slower than Rust but has GC and comfy ergonomics? C#, Clojure, Julia, OCaml (if you don't need threads). Or just plain Java.

> Another thing is difference in handling invalid code. A traditional compiler front-end is usually organized as a progression of phases, where each phase takes an unstructured input, checks the input for validity, and, if it is indeed valid, adds more structure on top.

Is this true? The way the d compiler does it is to make a special ast node type called 'error', which allows it to avoid this problem. I have also gotten error messages from c compilers that indicated they were doing something similar.

Now, in the d compiler, this is not a panacea, because erroneous code can cause erroneous error messages about unreachable code, but it generally works quite well.
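
A rough sketch of the "error node" approach, written in Rust rather than taken from the D compiler or rustc. The `Expr`, `Type` and `check` names are invented for this example; the point is only that a placeholder node lets later phases keep running while staying quiet about code that was already reported as broken.

```rust
/// Toy expression tree.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Int(i64),
    Bool(bool),
    Add(Box<Expr>, Box<Expr>),
    /// Placeholder the parser leaves where it could not build a real node.
    Error,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Type {
    Int,
    Bool,
    /// "Poisoned" type: something below already failed and was reported.
    Error,
}

/// Type checks `expr`, pushing new diagnostics into `diagnostics`.
pub fn check(expr: &Expr, diagnostics: &mut Vec<String>) -> Type {
    match expr {
        Expr::Int(_) => Type::Int,
        Expr::Bool(_) => Type::Bool,
        // The parser already complained about this spot; do not complain again.
        Expr::Error => Type::Error,
        Expr::Add(lhs, rhs) => {
            let left = check(lhs, diagnostics);
            let right = check(rhs, diagnostics);
            match (left, right) {
                (Type::Int, Type::Int) => Type::Int,
                // An operand is already poisoned: stay quiet to avoid a cascade.
                (Type::Error, _) | (_, Type::Error) => Type::Error,
                (l, r) => {
                    diagnostics.push(format!("cannot add {:?} and {:?}", l, r));
                    Type::Error
                }
            }
        }
    }
}
```

Checking `Add(Error, Int(1))` yields `Type::Error` with no new diagnostic, while `Add(Int(1), Bool(true))` reports exactly one. The spurious follow-up messages mentioned above are what you get when some later check forgets to treat the poisoned value specially.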

rustc does the same. I believe the op complaint is that rustc does not perform type and lifetime checks on blocks of code that haven't been successfully parsed.

Aleksey is true to his “avoid success at all cost” strategy, constantly emphasizing that rust-analyzer is experimental and alpha quality. Even the installation process is I think deliberately clunky. In fact in my opinion it already provides a better IDE experience than RLS and is improving daily, whereas RLS is stuck in maintenance mode.

Also, you won't find a link to the Open Collective project for rust-analyzer on the GitHub repo or website.

I'm also contributing to https://opencollective.com/rust-analyzer, I want it to be a viable project.

That is understandable in the sense that the original RLS seemed to follow an exact opposite of a strategy: it was frequently said to be "pretty good already" and "close to 1.0", but some people experienced crashes and the code completion not working, and were obviously let down, which led to even some animosity towards the project.

Being alpha quality and providing better experience than RLS are not incompatible.

rust-analyzer is definitely alpha-quality at the moment.

Sure, it is a good idea to be upfront about the current state of the project. Still I think I am not mistaken in my impression that you don't wish for rust-analyzer to become popular too early as that would put some unwanted constraints on experimentation.

BTW, thanks for the project and good luck! I use it daily and it is awesome.

>Still I think I am not mistaken in my impression that you don't wish for rust-analyzer to become popular too early as that would put some unwanted constraints on experimentation.

You are not mistaken indeed. More specifically, I do want to maintain freedom of pushing completely broken code. It's not necessarily about popularity, it's about user expectations. And I also don't allocate as much time as I could into things that directly increase popularity. Like, the "install extension from the market place" could have been done almost a month ago, but I still haven't got to it. It's not that I am deliberately pushing back against such improvements, I just don't push them forward too actively.

What would a compiler for a language like Rust or C++ look like if it was optimized for development iteration speed first? Could you produce binaries that were good enough to be useful for testing and developing?

You'd end up sharing a lot of the same infrastructure that you'd want for an LSP server, basically to do as much preprocessing as you could on the initial compilation and then work incrementally on top of that.

You'd also of course still want your globally-optimizing batch mode compiler for production, but for many projects it seems like such a thing could produce good enough binaries for development, and greatly improve engineer productivity.

A good example is what IBM did with Java back in the day (already in 1997). They built an incremental Java compiler (jikes) that was intended for use during development. This later became the basis for the eclipse compiler which is the compiler Eclipse uses to ensure all Java code is compiled and ready to run the millisecond you stop typing. It's also able to compile and run broken code and it simply will error if you hit the broken parts. Things like autocomplete and other IDE features work even if your code is clearly broken and it will tell you it is broken. Being able to iterate on editing and running unit tests without any noticeable delay is very nice.

Here's a good overview of this on Stackoverflow: https://stackoverflow.com/questions/3061654/what-is-the-diff...

For Rust, it looks like they are following a similar approach with the rust analyzer. This is indeed not intended for building production code, which you'd typically do on some CI server and using optimizations that maybe make the process a bit slower but the output a bit faster/smaller.

Microsoft's Roslyn [1] design (now called .NET Compiler Platform) is what I think any language should strive for, turning the classical black box of the compiler into an incremental pipeline where each stage is annotated/augmented/refined by different mechanisms such as type checkers.

If each step is incremental, it follows that it's easier to also let each step "integrate" deltas; for example, adding a comment shouldn't need to recompile anything at all, merely modify the AST, because it doesn't lead to any actual code artifacts. The second benefit is that all the information needed to inspect the program — from syntax highlighting to type information — is already there.
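
To illustrate the "adding a comment shouldn't recompile anything" point, here is a small Rust sketch. It is a simplification and not how Roslyn does it: instead of diffing a real AST it fingerprints the token stream with `//` comments stripped, and `IncrementalBackend`, `code_fingerprint` and `codegen_runs` are names invented for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes the source with `//` line comments and whitespace differences
/// removed, so edits that cannot affect generated code keep the same value.
pub fn code_fingerprint(source: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    for line in source.lines() {
        // Toy comment stripping: assumes `//` never appears inside a string literal.
        let code = line.split("//").next().unwrap_or("");
        for token in code.split_whitespace() {
            token.hash(&mut hasher);
        }
    }
    hasher.finish()
}

/// Stand-in for the expensive back half of a compiler pipeline.
pub struct IncrementalBackend {
    last_fingerprint: Option<u64>,
    pub codegen_runs: u32,
}

impl IncrementalBackend {
    pub fn new() -> Self {
        IncrementalBackend {
            last_fingerprint: None,
            codegen_runs: 0,
        }
    }

    /// Feeds a new version of the source; returns true only if code
    /// generation had to run again.
    pub fn update(&mut self, source: &str) -> bool {
        let fingerprint = code_fingerprint(source);
        if self.last_fingerprint == Some(fingerprint) {
            return false; // only comments or whitespace changed
        }
        self.last_fingerprint = Some(fingerprint);
        self.codegen_runs += 1; // a real compiler would emit code here
        true
    }
}
```

Feeding it the same function with an extra comment line leaves `codegen_runs` unchanged, while renaming a call bumps it. A real implementation would track this per function or per item rather than per file.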

If you can keep the entire high-level compilation state in memory, then it's easy to query it for all sorts of IDE features. This same design was used for the TypeScript compiler. Anders Hejlsberg has a great little lecture [2] about it.

[1] https://github.com/dotnet/roslyn/wiki/Roslyn-Overview

[2] https://www.youtube.com/watch?v=wSdV1M7n4gQ
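
A toy version of the "keep the state in memory and query it" design, sketched in Rust under loud assumptions: `AnalysisDb`, `set_source` and `function_names` are hypothetical names, the "analysis" is just a scan for the token after `fn`, and none of this is the Roslyn or TypeScript API. It only shows the pull-style shape: results are computed on demand and reused until their input changes.

```rust
use std::collections::HashMap;

#[derive(Default)]
pub struct AnalysisDb {
    /// Source text per file, with the revision at which it last changed.
    sources: HashMap<String, (u64, String)>,
    /// Cached results, tagged with the source revision they were computed from.
    symbol_cache: HashMap<String, (u64, Vec<String>)>,
    revision: u64,
    /// How many times the query actually had to do work.
    pub recomputations: u32,
}

impl AnalysisDb {
    /// Records an edit; nothing is analysed yet.
    pub fn set_source(&mut self, file: &str, text: &str) {
        self.revision += 1;
        self.sources
            .insert(file.to_string(), (self.revision, text.to_string()));
    }

    /// Pull-style query: names of functions declared in `file`, recomputed
    /// only if that file changed since the cached answer was produced.
    pub fn function_names(&mut self, file: &str) -> Vec<String> {
        let (source_rev, text) = match self.sources.get(file) {
            Some((rev, text)) => (*rev, text),
            None => return Vec::new(),
        };
        if let Some((cached_rev, names)) = self.symbol_cache.get(file) {
            if *cached_rev == source_rev {
                return names.clone();
            }
        }
        // Toy analysis: the identifier part of whatever token follows `fn`.
        let tokens: Vec<&str> = text.split_whitespace().collect();
        let names: Vec<String> = tokens
            .windows(2)
            .filter(|pair| pair[0] == "fn")
            .map(|pair| {
                pair[1]
                    .chars()
                    .take_while(|c| c.is_alphanumeric() || *c == '_')
                    .collect::<String>()
            })
            .collect();
        self.recomputations += 1;
        self.symbol_cache
            .insert(file.to_string(), (source_rev, names.clone()));
        names
    }
}
```

Asking for the functions of `a.rs` twice does the work once, and editing `b.rs` in between does not invalidate the cached answer for `a.rs`. That per-input invalidation is what makes IDE queries cheap after the first run.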

I'm not too familiar with either Roslyn or the Eclipse Java Compiler (ecj), but since ecj was developed specifically for deep integration into Eclipse JDK, I'd assume that it ended up being really similar (and it would have preceded Roslyn by a couple of years).

I don't know the internal architecture of the Eclipse compiler, but one of the points of the Roslyn architecture is to turn the compiler into an API, allowing things like inserting plugins — custom linter rules, for example — at each point in the pipeline.

Eclipse is interesting in that it grew out of IBM's VisualAge for Java, which was written in Smalltalk as an extension of the original VisualAge IDE/compiler, which goes back all the way to the mid-1980s. I wonder if any of the inspiration for this design came out of Smalltalk's runtime state being persistent. If all your state is derived from earlier state, then there's no need to always start processing from scratch, since you can just continue from where you were before.

That Anders Hejlsberg video is great. Watched it last night and I'm still thinking through the implications of pull vs push. Thanks!

It'd look like Delphi i'd guess, at least the earlier versions. Delphi 2 is stupidly fast, almost instant compilation on a contemporary PC that runs at around 200MHz with 16MB of RAM and practically instant (in that it is impossible to notice it) in any modern PC (a synthetic benchmark i wrote some weeks ago had it compile a bit above 10MLOC in 5.56 seconds on my 3700X CPU, which is an average priced consumer CPU - on a single thread).

While it doesn't have the niceties added in later versions and in Free Pascal (like dynamic arrays, generics, anonymous functions and a few other things) it still has more than enough features to create big complex applications (e.g. object oriented language, rich RTTI that allows automatic serialization, properties, native string support, optional reference counting, real module support).

In terms of generated code... well, it isn't exactly great (it came last on this benchmark i wrote some time ago http://runtimeterror.com/tools/raybench/ that compares some C and Pascal compilers - i mainly wrote it for retrocoding, but threw in some modern compilers as well - EDIT: i misremembered, it didn't come last, Borland C++ 5.0 came last, in fact it came slightly above Free Pascal 3.0.4 with its default settings, though by enabling 64bit and modern instructions FPC was able to generate much faster code).

But then again, it is written in itself and as i wrote, it is stupidly fast. So it does the work perfectly fine for most tasks - as long as you don't need brute force number crunching.

Of course i'm not advocating the use of Delphi 2 (and i do not know how it compares with modern Delphi - last time i tried out the free version they have the entire IDE felt very weird and sluggish). However the fact that it exists is a proof that you can have a very fast compiler and IDE for a rich language that produces very good code - maybe not the best possible code, but good enough for a large majority of tasks.

I started out with Pascal. Compilation was such a non-event... Subjectively entire programs compiled faster than a line of Julia takes today. Then again, if I remember correctly Pascal the language was written with ease of compiling in mind. My compiler today has to do a lot more work so I can do a lot less...

Yeah although Delphi/Object Pascal - even in its Delphi 2 incarnation - is a much more complicated beast than Wirth's Pascal. But i think most of the extensions that were introduced over the years were still in the same spirit of easy compilation and parsing.

The Delphi compiler is probably structured similarly to the Rust compiler. They're both classical compiler designs. Rust takes longer to compile for other reasons.

So C++ Builder. :)

Modern Delphi is still quite fast.

C++ Builder 1 (which is contemporary to Delphi 2) is much slower than Delphi 2. It just happens to be faster than any other (full featured - i mean including IDE, debugger and such) C/C++ compiler of its time, but doesn't hold a candle to Delphi 2. Even the IDE is much more sluggish (in 90s hardware, in anything released this side of the millennium it doesn't make much difference :-P).

I don't know about the latest version of modern Delphi, but as i wrote, when i tried the free version 2-3 years ago it wasn't that fast. Lazarus felt faster and much snappier. Perhaps the non-LLVM Delphi compiler (if it is still available) is faster (the Delphi 2 compiler is certainly faster than FPC) but it is held down by all the bloat piled on top.

My point is that it still runs circles around several AOT compiled languages, including the C++ compilers FOSS devs tend to use, let alone languages like Rust and Swift (in their current state).

I would be quite happy with the compile speed of the latest Delphi version when using Rust.

Ah yeah, but i do not think C++ or Rust are any sort of compilation speed benchmarks (though the C++ compiler used in BCB1 is very fast... for C++) :-P. Free Pascal (which is still much slower than Delphi 2 in terms of compilation speed), D, Go and i'm sure several other AOT languages i'm forgetting right now have fast compilers.

BTW i haven't used Delphi since its use of LLVM (or it wasn't in the free version) but didn't it affect the compiler's speed?

I was curious about the current state of Delphi, so I downloaded its latest community edition last week, and it still looks quite fast to me.

It would look like Lucid Energize for C++, here a 1993 sales demo:


Or the version 4 of IBM's Visual Age for C++

Visual Age for C++ was quite impressive, the whole program was serialized into a database, an approach similar to Energize, and you could edit C++ code just like in a Smalltalk environment.

Sadly not much of it has survived, so save pages like this while they exist


The problem with those two products was that they were ahead of their time, so with their hardware requirements, they ended up being commercial failures.

Then for something more down to earth, there is C++ Builder, the only actual C++ RAD tooling, but their market is only deep pocket corporations. Think VB experience, but using C++ instead.


Finally there is VC++, which also supports edit-and-continue with incremental compilation and linking, and with C++/CX I thought it could eventually be like C++ Builder but was wrong. Also with the C++/WinRT reboot, the tooling is still playing catch up with C++/CX.

However there is also the option to use interpreters like Cling.

Then you also have the development experience of Common Lisp, Dylan and Eiffel.

All of those have rich IDEs, with interpreter/JIT compilers for development, and then you can also AOT optimize for release builds.

Eiffel to this day uses their JIT (or MELT as they call it) for rapid development, and then relies on a C or C++ compiler for the release builds.

Dylan, while killed by Apple, had a short commercial life, and you can still get hold of its open source variant, although the IDE is Windows only.


Common Lisp experience is available on LispWorks and Allegro CL.

The Mun language is going down the “iteration speed as #1 priority” path. It’s written in Rust and made to also be compatible with C/C++


> optimized for development iteration speed first

I know! we need differentiable compilers! :)

I'm surprised that you consider compilation to be a big productivity limitation (at least where current rust or C++ compilers at low optimization levels aren't enough). A faster computer with more cores addresses compilation speed, but also makes tests run fast. :)

Supermicro sells boxes with up to 4 dual socket zen2 boards (512 cores!) that fit in 2U, e.g. 2124BT-HNTR ... :P and you can stick a bunch of developers remotely connected concurrently compiling on it.

That's pretty much what rust-analyzer aims to be.

Probably go or d.

Slightly tangential but one thing I absolutely love about sbt/play framework is the frigging development loop. You start play using `play ~run`, save changes and by the time you've switched to your browser, the source has been compiled to `.class` and you are ready to hit f5.

It's almost magical, DESPITE scala being slow as snail at compilation.

The scala metals LSP server also has better than enterprise grade documentation. https://scalameta.org/metals/docs/editors/vim.html

The Bloop project is doing amazing work as well in making Scala compilation leagues faster (even faster than SBT). I hear Scala 3 should help significantly in this area as well.

I wish this would be a higher priority generally, developer time being as expensive as it is.

It would be great to have a timeline as well: when does the project plan to supersede the current RLS? What's the goal for end of 2020?

At the same time I know how hard it is to estimate projects.

From what I've read, the maintainers don't yet want to discuss when rust-analyzer will be the recommended IDE experience.

As a user, I can say it's pretty good and you should use it today if you're writing Rust code. So to me, it doesn't matter whether the docs recommend it, or if it's distributed by default with the compiler. I just use it and tell everyone I know to use it.

Quote: "The main thing is that the command line (or batch) compiler is primarily optimized for throughput (compiling N thousands lines of code per second), while an IDE compiler is optimized for latency (showing correct completion variants in M milliseconds after user typed new fragment of code)"

And then there's Delphi, doing both.

Delphi could pull this off by being a language designed to be easy to parse (and often unnecessarily restrictive or verbose as a result). This goes all the way back to Turbo Pascal - while Borland's DOS C compilers were also very fast, their Pascal was unbeatable because of very straightforward parsing, and true separate compilation of units.

*can pull this off. To this day Delphi is stupidly fast when compiling. Despite lately being bloated with stuff it didn't really need, but hey! C/C++ standard is all the rage so why not.

I am so grateful for the effort that people are putting into the rust-analyzer project. It keeps getting better. I wonder whether an entire class of challenges that the project faces is attributed to electron?

I use rust-analyzer and I don't think it has any issues related to electron.

You might be confusing rust-analyzer with coc[1]?

1: https://github.com/neoclide/coc.nvim

> I wonder whether an entire class of challenges that the project faces is attributed to electron?

No, not at all.

I have been using racer and neomake with a lot of success, fwiw.
