
Nitra, JetBrains’ research project for language tooling, goes open-source - rdemmer
http://blog.jetbrains.com/blog/2014/05/27/nitra-goes-open-source/
======
bodski
This looks very much in the spirit of Yegge's 'Grok' project at Google [1][2].

Does anyone know if that project is still alive?

[1] [http://bsumm.net/2012/08/11/steve-yegge-and-grok.html](http://bsumm.net/2012/08/11/steve-yegge-and-grok.html)

[2]
[https://www.youtube.com/watch?v=KTJs-0EInW8](https://www.youtube.com/watch?v=KTJs-0EInW8)

~~~
sitkack
You might be interested in
[https://github.com/yinwang0/pysonar2](https://github.com/yinwang0/pysonar2)

------
drdaeman
Just curious, how do they handle broken code? (Like when you start writing a
line in the middle of the file, not yet done with it, but already need all the
goodness like highlighting and code completion to work.)

A common approach with libraries I've encountered is that the parser just
stops with an error - but that's almost unacceptable for use in a proper code
editor, which should really try its best to recover and continue processing,
even if some chunk in the middle is failing.

~~~
mwsherman
That part of Visual Studio impresses me a lot, though it’s not obvious to a
user. I haven’t (say) closed the brace yet, so the code is invalid and
therefore a correct AST can’t be parsed.

Does it have to be heuristic? Or does a given line (say) fall back to its last
known good state?

~~~
lobster_johnson
I suspect the latter, helped by an ability to recover by inserting missing
characters such as braces and quotes. (Xcode actually offers to fix trivial
errors such as those, I'm sure VS does the same.)

It's an interesting problem. I suppose that as it knows the point of breakage,
it can annotate the AST to indicate breakage, but preserve the subsequent
nodes; breakage itself becomes a kind of AST node. In such a situation, any
subsequent AST nodes probably have to point to their pre-breakage nodes as
parents in order to stay sane. Thus the AST becomes a kind of Git-like
revision history that stays fragmented until the next time the AST fully
parses. It could easily be something even simpler, however.
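
A minimal sketch of the "breakage becomes a kind of AST node" idea, not a
description of how Visual Studio or Xcode actually work. It assumes a toy
language of `name = number;` statements and uses plain panic-mode recovery
(resynchronize at the next `;`). The `Assign` and `Error` node types and the
`parse_program` function are hypothetical names made up for this illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Assign:
    name: str
    value: int


@dataclass
class Error:
    text: str  # raw source of the statement that failed to parse


# One well-formed statement body (without the trailing ';').
STATEMENT = re.compile(r"\s*([A-Za-z_]\w*)\s*=\s*(\d+)\s*")


def parse_program(source):
    """Parse every statement; a broken one becomes an Error node, not an abort."""
    nodes = []
    for chunk in source.split(";"):      # ';' is the resynchronization point
        if not chunk.strip():
            continue                     # skip empty trailing chunk
        m = STATEMENT.fullmatch(chunk)
        if m:
            nodes.append(Assign(m.group(1), int(m.group(2))))
        else:
            nodes.append(Error(chunk.strip()))
    return nodes


if __name__ == "__main__":
    # The half-typed middle statement does not hide the valid ones around it.
    for node in parse_program("a = 1; b = ; c = 3;"):
        print(node)
```

Because the statements after the broken one still produce ordinary nodes,
features such as highlighting or completion could keep working on them while
the middle line is being typed.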

------
girvo
Hah! I've been looking for something like this lately, to have some platform
to integrate Hack and TypeScript into some IDE. Textadept with LPEG has been
nice, Komodo Edit/IDE would be better but their docs are out of date, Sublime
Text is great but you're too limited in the UIs you can build.

I had an idea of an IDE that instead of being built around a language was
built around frameworks and workflows in that language, which would require
deep understanding of the target language. This seems like a great step in
that direction, but it's a shame I can't run it on Linux :(

~~~
phpnode
The rest of JetBrains' products are cross platform, so there's a good chance
that this will be too in the future.

~~~
citizenmatt
We do have a lot of future plans for Nitra, but for now, it's worth pointing
out that the products that are cross platform are Java based, while Nitra is
based on .net. The project is currently Windows only.

~~~
mythz
Interesting, what made you decide on .NET (i.e. given you're predominantly JVM
based)? Was it just based on the Nemerle team's preference?

~~~
citizenmatt
I wouldn't say we're predominantly JVM based. We've also got ReSharper,
dotTrace, dotMemory, dotCover and dotPeek, which are all .net based.

But basically, it's because Nitra is an extension and evolution of work done
on Nemerle, and Nemerle is a .net language.

~~~
mythz
Is ReSharper written with C#? I thought I heard it was written with C++.

~~~
citizenmatt
It's written in C#, with some of the Visual Basic support written in VB. Parts
of ReSharper for C++ are written in C++ and C++/CLI.

------
shadowmint
Can't wait to have a play with this (once I dig up my windows laptop...).

I have such respect for the amazing products from JetBrains; having toys to
play with like this is just fantastic.

I'm particularly interested in the component based grammars; I don't quite
understand how you can get away with not breaking the 'parent' grammar when
you drop an arbitrary child grammar inside of it, but I'm quite looking
forward to finding out~

~~~
sparkie
This doesn't _solve_ the problem of combining arbitrary grammars - there are
obviously restrictions on what you can add where, or a requirement to add
special delimiters around child grammars so that they can be parsed correctly,
but Nemerle takes a practical approach to the problem. You still cannot nest
arbitrary grammars inside others several layers deep, as each nested language
requires consideration of its parents to get the parse you intended.

If you're interested in the problem of combining grammars, I'd encourage you
to check out Diekmann & Tratt's Language Boxes ([http://soft-
dev.org/pubs/pdf/diekmann_tratt__parsing_compose...](http://soft-
dev.org/pubs/pdf/diekmann_tratt__parsing_composed_grammars_with_language_boxes.pdf))
[demo:
[http://www.youtube.com/watch?v=LMzrTb22Ot8](http://www.youtube.com/watch?v=LMzrTb22Ot8)],
which provide an elegant solution to the problem, although with the obvious
caveat that it diverges from plain-text file representation of code, and
requires an intelligent editor like their example implementation, eco
([https://bitbucket.org/softdevteam/eco](https://bitbucket.org/softdevteam/eco)).

Perhaps an interesting project would be to combine the two approaches, by
having a language-box aware editor which could automatically insert the
_correct_ delimiters around language-boxes (inferred by usage), and produce
plain-text representations which could still be understood by Nemerle/Nitra,
which is language-box unaware.

------
jimmcslim
I wonder if the open-sourcing of Roslyn, Microsoft's C# compiler project, had
an impact on this decision?

At any rate, I might have a look at this and see if a grammar for Delphi can
be built... the state of tooling on that platform is quite frankly dire.

~~~
citizenmatt
No, it's always been the plan to open source Nitra. It's come from the team
who built Nemerle, which is open source, and the team obviously wanted to
continue in this manner. And JetBrains has a pretty good track record with
open source - e.g. the IDEA platform that is IntelliJ's Community Edition is
fully open source.

------
Igglyboo
Having a minimal idea of how these work, how similar is this to antlr?

"It is also a build tool to compile the grammars into parsers" this line
specifically caught my attention.

~~~
sparkie
The parsing part is similar conceptually and syntactically, but their
implementation is very different. Antlr parses LL grammars - an unambiguous
subset of context-free grammars which are quite restrictive in the production
rules they allow. This tool on the other hand uses PEGs, which parse a
different (but overlapping) set of grammars, which aren't necessarily limited
to CFGs, but are always guaranteed to be unambiguous. The main feature of PEGs
that allows this is the ordered choice operator (|): the correct parse
depends on the order in which you specify alternations, unlike with Antlr,
where all alternations have equal precedence.
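
To make the ordered-choice point concrete, here is a small hand-rolled sketch
in Python. It is not Nitra's or Antlr's implementation; `literal`,
`sequence`, `ordered_choice` and `matches_fully` are hypothetical helpers
written only for this illustration. Each parser takes a string and a position
and returns the new position, or `None` on failure.

```python
def literal(text):
    """Match an exact string at the current position."""
    def parse(s, pos):
        return pos + len(text) if s.startswith(text, pos) else None
    return parse


def sequence(*parts):
    """Match each part in turn; fail if any part fails."""
    def parse(s, pos):
        for part in parts:
            pos = part(s, pos)
            if pos is None:
                return None
        return pos
    return parse


def ordered_choice(*alternatives):
    """PEG choice: commit to the first alternative that succeeds."""
    def parse(s, pos):
        for alt in alternatives:
            end = alt(s, pos)
            if end is not None:
                return end  # later alternatives are never tried
        return None
    return parse


def matches_fully(parser, s):
    return parser(s, 0) == len(s)


# S <- ("a" | "ab") "c"   versus   S <- ("ab" | "a") "c"
short_first = sequence(ordered_choice(literal("a"), literal("ab")), literal("c"))
long_first = sequence(ordered_choice(literal("ab"), literal("a")), literal("c"))

if __name__ == "__main__":
    print(matches_fully(short_first, "abc"))  # False
    print(matches_fully(long_first, "abc"))   # True
```

With `short_first`, the choice commits to "a", the following "c" fails against
"b", and the parser never goes back to try "ab". Read as a CFG with unordered
choice, both grammars would describe the same language and both would accept
"abc"; under PEG semantics only the ordering changes the result.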

It should be noted though that this tool is much more than just a parser-
generator - it's a framework for developing tools for interacting with
languages, which just happens to use PEG as part of that implementation.

~~~
quotemstr
Note that PEGs are _not_ context-free grammars. They're both more and less
powerful than traditional CFGs, and they're tricky to use: because PEG choice
is _ordered_ and traditional CFG choice is _unordered_, it's hard to
translate standard language grammars to a PEG recognizer system. That's why,
for my forever-project, I've opted to use scannerless GLR instead of PEGs.
Both PEGs and GLR recognize languages that are closed under composition (the
property that gives you extensibility), but the formalisms for GLR parsers are
much better.

The Harmonia project is the best whack at the problem I've seen. See
[http://harmonia.cs.berkeley.edu/papers/twagner-parsing.pdf](http://harmonia.cs.berkeley.edu/papers/twagner-parsing.pdf).

As others have mentioned, for an IDE, you also want strong error recovery.
Doing that in a general way when using tools based on declarative grammars is,
well, very hard, especially when you want to recover from brace mismatch
problems. The best approach is "island and reef parsing", where you actually
parse your buffer twice: you first build a map of all the "reefs"
(parentheses) using a simple recursive descent parser, pair up mismatched
parentheses using an ad-hoc algorithm, insert corrections for mismatches, then
apply your fully general parser to the result. (The word "parentheses" here
refers to any balanced construct, even "begin" and "end". You can actually
infer what the "parentheses" for a given language are by examining the
grammar!)
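
A rough sketch of what the first pass could look like, under several
assumptions: it uses a stack rather than a recursive descent parser, it only
knows the three bracket pairs listed in `PAIRS`, it ignores string literals
and comments, and the repair policy is one arbitrary ad-hoc choice among
many. `repair_brackets` is a hypothetical name, not taken from any of the
linked papers.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def repair_brackets(text):
    """Return (repaired_text, corrections) with all brackets balanced.

    Assumed policy: a closer with no matching opener is dropped; a closer
    that matches an outer opener first closes the inner ones; openers still
    open at end of input are closed there.
    """
    out = []
    stack = []        # closers expected for the currently open brackets
    corrections = []  # (offset in the original text, description)
    for i, ch in enumerate(text):
        if ch in PAIRS:
            stack.append(PAIRS[ch])
            out.append(ch)
        elif ch in CLOSERS:
            if ch in stack:
                # Close inner brackets that were left open.
                while stack[-1] != ch:
                    missing = stack.pop()
                    out.append(missing)
                    corrections.append((i, "inserted " + missing))
                stack.pop()
                out.append(ch)
            else:
                corrections.append((i, "dropped " + ch))
        else:
            out.append(ch)
    while stack:
        missing = stack.pop()
        out.append(missing)
        corrections.append((len(text), "inserted " + missing))
    return "".join(out), corrections


if __name__ == "__main__":
    print(repair_brackets("f(a[1)"))  # ('f(a[1])', [(5, 'inserted ]')])
```

The repaired text is what would then be handed to the general parser, while
the corrections list lets the editor report the mismatches at their original
offsets.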

See also
[http://fileadmin.cs.lth.se/cs/Personal/Emma_Soderberg/docs/S...](http://fileadmin.cs.lth.se/cs/Personal/Emma_Soderberg/docs/SLE08pres.pdf)

------
moondowner
They have a Confluence wiki space for Nitra:
[http://confluence.jetbrains.com/display/Nitra/Home](http://confluence.jetbrains.com/display/Nitra/Home)

Here's the developer installation:
[http://confluence.jetbrains.com/display/Nitra/Developer+Inst...](http://confluence.jetbrains.com/display/Nitra/Developer+Installation)

------
moogly
I would love to see some ReSharper-quality IntelliSense for Rust code in
Visual Studio :)

------
caniszczyk
The Eclipse equivalent of this is Xtext:
[https://www.eclipse.org/Xtext/](https://www.eclipse.org/Xtext/)

------
S4M
So, this will be the Emacs Lisp of IntelliJ?

~~~
lapusta
It's written in Nemerle, which runs on top of the CLR. I believe it's targeted
at the .NET, Visual Studio & ReSharper audience.

~~~
S4M
I don't see how that addresses my point. Emacs Lisp is the scripting
language of Emacs, and you can use it to build an AST to deal with syntax
highlighting and autocompletion, among other things. And that's what Nitra is
intended for, and it doesn't matter that it is written in Nemerle.

~~~
lapusta
The IntelliJ platform runs on top of the JVM; Nitra runs on top of the CLR.

------
th3iedkid
They also have MPS for grammar-less IDE/language tooling...

------
cbsmith
The article was written as though Emacs didn't exist.

------
juggty_dev
What about security?

~~~
citizenmatt
What do you mean? Security of what?

------
JackFr
Sounds a lot like parser combinators, without ever mentioning parser
combinators.

~~~
sparkie
Because they use PEGs, not parser-combinators.

------
ThinkBeat
Kinda like

Lex and Yacc, Bison, Antlr, Lemon, LPEG, Ragel, re2c

or any other tool on this list

[http://en.wikipedia.org/wiki/Comparison_of_parser_generators](http://en.wikipedia.org/wiki/Comparison_of_parser_generators)

~~~
acdha
Except, of course, for the areas where it's different as explained in the
original post and their post last November:

[http://blog.jetbrains.com/blog/2013/11/12/an-introduction-
to...](http://blog.jetbrains.com/blog/2013/11/12/an-introduction-to-nitra/)

Before dismissing someone's work you could at least skim a blog post.

