
A generic AST and reference graph covering a common denominator of languages is not as easy to do as a set of common operations and queries

I think you may be underestimating the feasibility of what you propose.




You don’t need to do a generic AST. Have it be a protocol ... that runs in-process via an API. Because you know what, that’s what an API is ... a protocol for communicating with someone else’s code. But you don’t have to run extra processes or suffer context switches, and you don’t have to be in the business of debugging distributed systems in order to accomplish any tiny thing. Amazing!!!

This whole LSP thing is a mindbogglingly bad idea, brought to you by the same kinds of thought processes that created the disaster that is today’s WWW.


One of the design goals for LSP is that it is not in-process. Process isolation gives the text editor stability in the face of unstable plugins and allows multiple language servers to run concurrently without requiring the target language (which could be anything) to support threads and a C ABI - something which many languages have no need for.

Furthermore, many languages ship with a runtime which does not play nice with other runtimes. Try running a JVM, .NET VM, Golang runtime, and Erlang BEAM VM in the same process and see what happens. Better yet, try debugging it.
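
To make the "protocol, not in-process API" point concrete, here is a rough sketch of how an editor talks to a language server: spawn it as a separate process and exchange Content-Length-framed JSON-RPC over stdio. The server name is a placeholder, but the framing and the initialize request follow the LSP spec.

    // Minimal sketch of an editor-side LSP client: the server is a separate
    // process, and all communication is Content-Length-framed JSON-RPC on stdio.
    // "some-language-server" is a placeholder, not a real binary.
    import { spawn } from "node:child_process";

    const server = spawn("some-language-server", ["--stdio"]);

    function send(msg: object): void {
      const body = JSON.stringify(msg);
      server.stdin.write(
        `Content-Length: ${Buffer.byteLength(body, "utf8")}\r\n\r\n${body}`
      );
    }

    // The first request any client sends: advertise capabilities and get the
    // server's capabilities back.
    send({
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: { processId: process.pid, rootUri: null, capabilities: {} },
    });

    server.stdout.on("data", (chunk) => console.log(chunk.toString()));

    // If the server crashes, the editor process is unaffected; it just respawns it.
    server.on("exit", (code) => console.log(`language server exited: ${code}`));

Whether that isolation is worth the serialization overhead is exactly the trade-off being argued about in this thread.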


The downside to that approach is that it limits you to C calling conventions (most language communities are going to want to write in the target language) and means that the process will be as stable as the least stable plugin. Given how many bug reports were filed against every editor which has ever done that, it's easy to see the appeal of containment to the people working on an editor.

The other side of that is that we're not in the 90s with single core processors and there's been a lot of hardware & software optimization over time. Most people can run something like VSCode and a language server and still have multiple cores left over — and since the events which trigger language server interactions are generally made by a human with a keyboard it's not like you need to be in the microsecond range to keep up.


Try harder.

A PC full of programs built with these assumptions will probably grind to a halt all the time despite having very high-spec hardware.


That’s a lot of fearmongering with no evidence. Do you have profiler data showing that the microseconds needed to pass a message between processes are a significant limiting factor in a program which is rate-limited by human text entry?

I mean, taking your argument seriously would mean everything should be hand-tuned assembly. Obviously we figured out that other factors like time to write, security, portability, flexibility, etc. matter as well and engineering is all about finding acceptable balances between them. Microsoft has been writing developer tools since the 1970s and in the absence of actual evidence I’m going to assume they made a well-reasoned decision.


It seems like the person was suggesting that if all processes on your PC used a client/server model with message passing/RPC instead of the existing API model, the idle cores you speak of would not be idle.

While you're right that productivity versus performance is a trade-off, and an editor is not necessarily a high-performance application, it's not clear to me whether future optimizations would reduce the gap as much as optimizing compilers did vis-a-vis C and assembly.

In any case, that aside, whether LSP's core promise of stability actually holds remains to be seen.


> In any case, that aside, whether LSP's core promise of stability actually holds remains to be seen.

I don't follow this conclusion: haven't we already seen it with the way language servers crash and are just restarted without other side effects?


Fearmongering? That's a strange choice of words.

> Do you have profiler data showing that the microseconds needed to pass a message between processes are a significant limiting factor

What? I think you didn't get my point. Let me try again.

You can look at a single operation and say "oh, that's nothing, it's so cheap, it only takes a millisecond", even though there's a way to do the same thing in much less time.

So this kind of measurement gives you a rationale to do things the "wrong" way, or shall we say the "slow" way, because you deem it insignificant.

Now imagine that everything in the computer is built that way.

Layers upon layers of abstractions.

Each layer made thousands of decisions with the same mindset.

The mindset of sacrificing performance because "well it's easier for me this way".

And it's exactly because of this mindset that we've ended up where we are:

Now you have a supercomputer that's doing busy work all the time. You'd think every program on your machine would start instantly because the hardware is so advanced, but nothing works that way. Everything is still slow.

This is not really fearmongering; this is basically the state of software today. _Most_ software runs very slowly without actually doing that much.


I don't think this is true. Also, any solution that involves humans just "trying harder" is doomed to failure. History has demonstrated that over and over again.

The technologies that win are the ones that account for that.


Right, this seems obvious. Any idea why they opted not to go that way?

I guess the reason is that the original use-case was VS Code & TypeScript (?) and developing a lowest common denominator API for all clients would mean C, so they would have had to program to a C API even though both the client (VS Code?) and the server (Node?) are running JavaScript.

But then maybe the answer should be to provide better C invoke wrappers for high-level languages, not to use HTTP instead of C.
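
For contrast, here is a sketch of what "better C invoke wrappers" might look like from a high-level language, using Deno's FFI from TypeScript. The library and the provider_hover symbol are entirely hypothetical; the point is just that an in-process C-ABI call is possible, with the stability caveats discussed upthread.

    // Hypothetical in-process alternative: load a language provider that exposes
    // a C ABI and call it directly via FFI (Deno here; other runtimes differ).
    // liblangprovider.so and provider_hover are invented for illustration.
    const lib = Deno.dlopen("./liblangprovider.so", {
      // const char* provider_hover(const char* file, int line, int col);
      provider_hover: { parameters: ["buffer", "i32", "i32"], result: "pointer" },
    } as const);

    const file = new TextEncoder().encode("src/main.ts\0"); // NUL-terminated C string

    const ptr = lib.symbols.provider_hover(file, 10, 4);
    const hover = ptr === null ? "" : new Deno.UnsafePointerView(ptr).getCString();
    console.log(hover);

    lib.close();

Of course, a crash or memory bug in that shared library takes the whole editor down with it, which is exactly the isolation argument made earlier.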


Knowing programmers, it's probably an absolutely irresistible concept that the provider for a given language must be easy (in the Hickey-ian sense) to write entirely in that language.


Not just a concept. Many languages are bootstrapped, and therefore the canonical tools for understanding that language's syntax and semantics are in that language's stdlib. You would have to replicate all of that in your IDE's language instead. It's DRY at work.


Seems a reasonable request to me.


I think the language server is not a solution to a technical problem; it's a solution to a social/political problem.

How do you support intellisense for a language once and have it work in many editors?


The best part is when I simulate bad network conditions by increasing latency and my autocompletion stops working (timeout 10ms) or becomes unusable.


Why would you need to simulate bad network conditions over localhost? It's probably your most stable network connection in any situation.

You are simulating a condition that will almost never happen unless your machine itself is dying in which case you have bigger problems than auto-completion to worry about.


You have a custom communication protocol over UDP and discover that with high latency, transferring large chunks takes a long time. You have a laptop available for testing. What do you do?

Sure, you could do some hyper-complex setup with multiple VMs etc., or you could just run one command to temporarily increase latency on localhost (on Linux, for example, a single tc/netem invocation).

If you know some better way of doing this, feel free to tell me.


Create a new loopback interface for testing? There's no reason you should be using your actual loopback for testing bad networking, especially since the loopback interface is used for far more things these days than the one application you are testing.

Setting up a new interface is pretty easy on most Unix OSes, and then you can increase the latency on just that interface without impacting the interface that most software on your machine expects to be blazing fast. And you can mess with that interface to your heart's content, knowing the only things affecting it are the applications you are running on it.


That's a good idea actually; it didn't occur to me.


You mean overestimating the feasibility.


I might be, but the general concept of formal grammars is one of the most basic achievements of computer science.

I'd claim that for most languages you could define a formal grammar as an EBNF that would provide some basic utility.

Such a grammar would probably be overly permissive (it doesn't know anything about references, types or host environments) but it would provide you a baseline for syntax checking and autocompletion and could generate an AST that more language-specific rules could evaluate.
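
As a rough illustration of that baseline (not any existing tool's API), here is a language-agnostic AST node shape plus one generic query. The node kinds are invented; the point is that even this much is enough for a crude completion list, with language-specific layers adding references, types, and host-environment knowledge afterwards.

    // Invented, minimal shape for a "common denominator" syntax tree.
    interface SyntaxNode {
      kind: string;            // e.g. "function", "identifier", "call"
      text: string;            // source text covered by this node
      start: number;           // byte offsets into the file
      end: number;
      children: SyntaxNode[];
    }

    // Generic query over any such tree: every identifier, deduplicated.
    // Even with zero knowledge of types or scoping, this is a usable
    // starting point for autocompletion and basic syntax checks.
    function identifiers(root: SyntaxNode): string[] {
      const seen = new Set<string>();
      const walk = (n: SyntaxNode): void => {
        if (n.kind === "identifier") seen.add(n.text);
        n.children.forEach(walk);
      };
      walk(root);
      return [...seen];
    }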


I think if you actually try this, you will find out why it's not a feasible concept (or why it's so vague as not to be useful). I think Hejlsberg and the C# team know all about grammars.

In particular, I know that using a BNF is not useful for shell, having ported the POSIX grammar to ANTLR.

http://www.oilshell.org/blog/tags.html?tag=parsing#parsing


Thinking that you can describe every programming language in BNF is similar to the RegEx problem: just as not every language is a Regular language that can be easily described as a single RegEx, not every programming language is (solely) a Context-Free Language with a grammar that can adequately be described in just BNF (or any other CFG description language) without ambiguity.

(Some programming languages are context-sensitive; some allow ambiguity in the main grammar and rely on precedence rules or "tie-breakers" to deal with those situations. A classic example: in C, "x * y;" is either a multiplication or a pointer declaration depending on whether "x" has been declared as a type name, which no context-free grammar alone can decide.)

"Universal grammar engines" and "Universal ASTs" are wondrous dreams had by many academics and like the old using RegEx in the wrong place adage: now you have N * M more problems to solve.


Right, CFGs are insufficient for most languages, and yacc is also insufficient for most languages. Yacc is simultaneously less powerful and more powerful than CFGs. Less powerful because it uses the LALR(1) subset of CFGs, and more powerful because you can embed arbitrary code with semantic actions.

I think the OP is looking for something more like this. This paper is from the same author as Nix, so he has some credibility. But I think this paper is not well written, and I'm not sure about the underlying ideas either. (The output of these pure declarative parsers is more complicated to consume as far as I remember.)

Still, the paper does show how much more there is to consider than "BNF". Thinking "BNF" will solve the problem is a naive view of languages and, as you point out, is very similar to the problem of not understanding which languages "regexes" can express.

http://eelcovisser.org/post/135/pure-and-declarative-syntax-...

Mainstream parser generators pose restrictions on syntax definitions that follow from their implementation algorithm. They hamper evolution, maintainability, and compositionality of syntax definitions.



