Interview with author of Cappuccino and Objective-J

olifante · on July 5, 2011

The interview links to an interesting post by author Francisco Tolmasky: http://cappuccino.org/discuss/2008/12/08/on-leaky-abstractio...

"Objective-J is our take on JavaScript 2.0. Now, this is usually the point at which someone chimes in that JavaScript is perfect, and that I just don’t understand it (you know, despite the fact that I’ve been using it for upwards of 10 years and have worked on its implementation in a major shipping browser). However, I think most people in the community can agree that there are some pretty big holes remaining in JavaScript."

jashkenas · on July 5, 2011

It seems a little bit risky to work on version 2.0 of a language while designing and building your own parser generator at the same time. I'd be curious to know why all of the existing JavaScript parser generators (including PEGs) were deemed to be unsuitable for Objective-J 2.0.

strmpnk · on July 5, 2011

It seems like he needed a specific understanding of the performance profile, and writing your own is a good way to guarantee that. Granted, language.js has some very interesting features that I've not seen in other PEG parsers, so it's already pretty promising.

As for the second edition of the language, I think 1.0 gives them a good foundation for knowing where the grammar gotchas will be (and specifically, the syntax has been designed to avoid parser overhead since the days of ObjC). A powerful PEG parser will also make iterative extensions to this a lot easier (in my experience at least).

Zef · on July 5, 2011

Maybe he could have adapted an existing one, but I think the main issue was that parsers traditionally just quit when they find an error, whereas language.js has an error recovery feature (naughty or) that lets it proceed with parsing.
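To make the error recovery idea concrete, here is a minimal Python sketch. It is not how language.js implements its "naughty or"; it only illustrates the general technique of an ordered choice whose last alternative is a recovery rule. The toy grammar (`name = number;`) and the names `statement`, `recover`, and `parse_statements` are assumptions made up for this example.

```python
import re

# Hypothetical toy grammar: a program is a run of `name = number;` statements.
STATEMENT = re.compile(r"\s*([A-Za-z_]\w*)\s*=\s*(\d+)\s*;")


def statement(source, pos):
    """Normal alternative: match one well-formed statement or fail with None."""
    match = STATEMENT.match(source, pos)
    if match is None:
        return None
    return ("assign", match.group(1), int(match.group(2))), match.end()


def recover(source, pos):
    """Recovery alternative: report an error and skip past the next ';'."""
    semicolon = source.find(";", pos)
    end = len(source) if semicolon == -1 else semicolon + 1
    return ("error", pos), end


def parse_statements(source):
    """Ordered choice per statement: try `statement`, fall back to `recover`."""
    pos, nodes = 0, []
    while source[pos:].strip():
        result = statement(source, pos) or recover(source, pos)
        node, pos = result
        nodes.append(node)
    return nodes


nodes = parse_statements("a = 1; b = ; c = 3;")
```

A parser that quits on the first error would stop at `b = ;`. Here the bad statement becomes an error node and `c = 3;` is still parsed.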

tlrobinson · on July 5, 2011

So far there are very few actual syntax additions slated for Objective-J 2.0, and it will of course be backwards compatible, so we have a large base of existing tests and applications to test against.

Most of the features of Objective-J 2.0 are around what the new parser enables us to do "under the hood" rather than in syntax:

1. dramatically improved performance, in particular eliminating the "with" statements in generated code

2. line number parity in generated code for easier debugging

3. possibly features that require knowing types at compile time:

* aggressive dead-code stripping

* function inlining

* optional static type checking

* object immutability

* Objective-C 2.0-style properties

tolmasky · on July 5, 2011

So the answer to these questions has to be understood in the context of our goals: first, we needed a parser fast enough to compete with our current hand-coded solution, and second, it had to be very easy to use to encourage people to contribute. A big part of why I did this is because I (like you, I believe) want to see people experimenting more with new languages, and I simply don't think that's realistic with the current offerings. It's unfortunate that we haven't released my video from CappCon yet because I go into this a lot there, but basically our experience with Objective-J is that people feel intimidated by the parser code, and on top of that it's very hard for us to properly analyze the patches they do send (it's just a hard piece of code). So the concrete considerations we had:

1. Modify an existing hand-coded parser like Narcissus: This would probably have given us the fastest solution (both in time to create and in running time), but it's probably obvious why we'd want to shy away from this: we'd be right back where we started, with thousands of lines of very difficult code. The lexer alone is 500 lines of finely tuned code that I would never want to merge a patch into. On top of this, we would have to track that project and be careful when keeping up to date. As such, we decided we really wanted to go down the parser generator route and see if that was feasible.

2. Why not use an existing non-PEG parser generator? My non-PEG parser choice basically boiled down to Jison (I tried a few others, but Jison was the fastest/most complete by far; it's really quite well done). The reasoning here was more qualitative: I am pretty convinced that LALR, LR, etc. parsers are simply never going to be accessible to most people. I still find them somewhat obtuse, and I think most people agree that PEGs are "easier" (with the downside of purportedly also being "slower"). The top-down, no-lexer nature of PEGs makes them very easy to grasp in my opinion and is the right direction in terms of "ease-of-use". Also, the experimental features I wanted to add made a lot more sense in the PEG world. So basically it came down to "if we can get PEG fast enough, we'd definitely prefer that; if not, it's worth it to come back to Jison and see if we can make any headway in the ease-of-use department".

3. So why not use PEG.js? Now we get into the problem of speed. Parsing the entirety of jquery.js (~9000 lines of uncompressed JavaScript) with PEG.js on my MacBook takes roughly 90 seconds, and the generated parser is 15K lines long, or 614KB. On the other hand, language.js currently takes 7 seconds to parse all of jquery, and the parser is only 61KB. (BTW, if you're interested, for comparison it also takes Jison about 7 seconds, and the generated parser is around 250KB.) On top of this, language.js has a lot of "philosophical" differences with PEG.js relating to creating a very user-friendly experience, such as the super easy error support that can never affect performance, a stricter separation of syntactic and semantic analysis, etc etc etc -- suffice it to say it's different enough that the optimal strategy was not to modify PEG.js itself.
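For readers unfamiliar with the "top down no-lexer" point in item 2, here is a small Python sketch of PEG-style parsing. It is an illustration only and does not reflect the internals of language.js, PEG.js, or Jison. The combinator names (`regex`, `sequence`, `choice`, `many`) and the toy grammar are assumptions chosen for this example. Each rule consumes characters directly (no separate tokenizer), and `choice` is ordered: the first alternative that matches wins.

```python
import re


def regex(pattern):
    """Terminal rule: match a regular expression at the current position."""
    compiled = re.compile(pattern)

    def parse(text, pos):
        match = compiled.match(text, pos)
        return (match.group(0), match.end()) if match else None

    return parse


def sequence(*parsers):
    """e1 e2 ...: every sub-rule must match, one after another."""
    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos

    return parse


def choice(*parsers):
    """e1 / e2 / ...: ordered choice, the first matching alternative wins."""
    def parse(text, pos):
        for parser in parsers:
            result = parser(text, pos)
            if result is not None:
                return result
        return None

    return parse


def many(parser):
    """e*: match zero or more times; stop if no input is consumed."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None or result[1] == pos:
                return values, pos
            value, pos = result
            values.append(value)

    return parse


# Toy grammar (hypothetical):
#   total <- atom (("+" / "-") atom)*
#   atom  <- [0-9]+ / "(" total ")"
number = regex(r"[0-9]+")


def atom(text, pos):
    # Defined as a function so it can refer to `total` before it exists.
    return choice(number, sequence(regex(r"\("), total, regex(r"\)")))(text, pos)


total = sequence(atom, many(sequence(regex(r"[+-]"), atom)))

tree, end = total("1+(2-3)", 0)
```

The grammar reads top to bottom much like the rules it encodes, which is the ease-of-use argument. The naive backtracking in `choice` is also the usual source of the "slower" reputation.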

So all in all, I'm really happy we made this choice. We have a PEG that is very competitive in terms of speed with Jison, and has the experimental features I was hoping for. language.js is by no means done of course, which is why it hasn't been rolled into Cappuccino proper, but I think thus far it's clear that it was the right decision for us.
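For a sense of scale, the figures quoted in item 3 can be turned into ratios with a few lines of Python. The numbers are taken as reported in the comment (rough, single-machine measurements), and the variable names are just labels for this example.

```python
# Figures as reported above for parsing jquery.js (approximate).
pegjs_seconds, languagejs_seconds = 90, 7
pegjs_kb, languagejs_kb, jison_kb = 614, 61, 250

# How many times faster and smaller language.js is, by these numbers.
speedup_vs_pegjs = pegjs_seconds / languagejs_seconds
size_ratio_vs_pegjs = pegjs_kb / languagejs_kb
size_ratio_vs_jison = jison_kb / languagejs_kb

print(f"parse time: about {speedup_vs_pegjs:.1f}x faster than PEG.js")
print(f"parser size: about {size_ratio_vs_pegjs:.1f}x smaller than PEG.js")
print(f"parser size: about {size_ratio_vs_jison:.1f}x smaller than Jison")
```

By these numbers the generated parser is roughly an order of magnitude faster and smaller than the PEG.js one, while matching Jison on time.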