
What is this trying to say? If a crucial portion of your job depends on the ability to parse a language (and it does, if you're a programmer who uses an IDE), then that's a point in favor of a language that's LL(1) rather than context-dependent. Making an analogy to natural languages here isn't relevant.



In the context of the discussion, it has been pointed out that multiple very successful compilers for very successful real-world languages do not use parser generators in practice. The grandparent to my comment claimed that the fact that GCC uses a handwritten recursive descent parser is proof that that approach to parsing is good enough for anything. The parent claimed that no, it's proof that the grammar of C++ is 'terrible'.

My point was that, in the real world, grammatical purity doesn't appear to correlate with the success or popularity of a language. My analogy to natural languages was relevant in that context. If parser generators are not well suited to some very successful existing languages, it's not particularly useful to blame that on the languages and point to currently niche (and relatively young) languages that don't have that 'problem'. In the real world, the most used languages have 'terrible' grammars and will continue to have them for the foreseeable future, so there will continue to be a need for parsing techniques that can handle them.

I'd actually speculate further that the analogy to natural languages is relevant in that it may not be a coincidence that the most used languages have some of the most complex grammars. Why that might be the case is an interesting question to think about.


> that's a point in favor of a language that's LL(1) rather than context-dependent

And 5 points in favor of a language that's LL(0) rather than LL(1). The logical conclusion of your argument is to use Lisp everywhere.


Lisp is LL(1). If we have #, we don't know what we're looking at until we read the next character, like #s structure, #( vector, #= circle notation, etc.

Actually there can be an integer between the two; but that doesn't change what kind of syntax is being read so it arguably doesn't push things to LL(2).
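To make the one-character lookahead concrete, here is a minimal Python sketch of the dispatch described above. The function name classify_hash_syntax and the labels in the table are made up for illustration; only the #s, #( and #= cases come from the comment, and the ## entry (a reference back to a label) is an added assumption. The optional integer is skipped without changing which branch gets taken.

```python
DIGITS = "0123456789"

# Hypothetical dispatch table: the character after '#' (and after any
# integer argument) selects the kind of syntax being read.
HASH_DISPATCH = {
    "s": "structure",
    "(": "vector",
    "=": "label definition",
    "#": "label reference",  # assumption: not mentioned in the comment
}


def classify_hash_syntax(text):
    """Classify a '#'-prefixed form using one character of lookahead."""
    if not text.startswith("#"):
        raise ValueError("expected '#'")
    i = 1
    # Skip the optional integer between '#' and the dispatch character,
    # as in #1= or #3( ... ). It does not affect the decision.
    while i < len(text) and text[i] in DIGITS:
        i += 1
    if i == len(text):
        raise ValueError("input ended before the dispatch character")
    ch = text[i].lower()
    if ch not in HASH_DISPATCH:
        raise ValueError("unsupported dispatch character: " + repr(text[i]))
    return HASH_DISPATCH[ch]
```

For example, classify_hash_syntax("#(1 2 3)") and classify_hash_syntax("#3(a b c)") both take the vector branch, which is the sense in which the integer doesn't push things to LL(2).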

Another example: on seeing "(a ." we don't know whether this is the consing dot notation or the start of a floating-point token.

Speaking of tokens, the Lisp token conversion rules effectively add up to LL(k). 12345 could be an integer or a symbol. If the next character is, say, "a" and the token ends there, we get a symbol. Basically, once we see a token constituent character, k more characters have to be scanned before we can decide what kind of token it is and consequently what object to reduce to.
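A rough Python sketch of that "scan the whole token, then decide" behaviour, under heavy simplification: the terminator set, the two regexes and the name read_token are assumptions for illustration, not the actual Common Lisp reader algorithm (no ratios, escapes, packages or read base). It also covers the lone dot versus floating-point case.

```python
import re

# Assumed, simplified set of characters that end a token.
TERMINATORS = set(" \t\n()\"';")

INTEGER = re.compile(r"[+-]?[0-9]+\.?")
FLOAT = re.compile(
    r"[+-]?[0-9]*\.[0-9]+([eE][+-]?[0-9]+)?"
    r"|[+-]?[0-9]+(\.[0-9]*)?[eE][+-]?[0-9]+"
)


def read_token(text, start=0):
    """Scan to the end of the token, and only then classify it.

    Returns (kind, token, end_index).
    """
    end = start
    while end < len(text) and text[end] not in TERMINATORS:
        end += 1
    token = text[start:end]
    if not token:
        raise ValueError("no token at position %d" % start)
    # The decision is made only after every constituent has been seen.
    if token == ".":
        kind = "dot"
    elif INTEGER.fullmatch(token):
        kind = "integer"
    elif FLOAT.fullmatch(token):
        kind = "float"
    else:
        kind = "symbol"
    return kind, token, end
```

With this sketch, "12345" classifies as an integer while "12345a" classifies as a symbol, and the difference only shows up at the last character. The amount of lookahead is bounded by the token length, not by a fixed k.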


That variety of Lisp, with its particular brand of syntactic sugar, is LL(1), and its lexing is LL(k). Because of its circumfix notation, a Lisp language can be LL(0) when the prefix symbols and tokens are defined appropriately. That was the point of my original comment.



