Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Particularly regarding parsing: Here's an excerpt from an interview with Anders Hjelsberg I heard recently. He's the creator of the Turbo Pascal compiler and later on of the creators of C#.

https://www.aarthiandsriram.com/p/our-dream-conversation-and...

``` Turbo Pascal 2 and Hash Tables

Sriram: I think there is some, we're going to talk later about things like GitHub co-pilot, we're dealing with different kinds of code reuse, but I do think there's a little bit of romance in the, for example, I idol you and folks like John Carmack. John Carmack noodling away and trying to make sure every instruction and memory access is aligned on some cash line. The amazing thing is, even today, if you look at AI at the heart of it, you have matrix multiplication and doing things with floating point numbers. Some of these things really tend to matter, which on the theme of optimization, tell us a story of Turbo Pascal 2 and you discovering hash tables because it's legendary.

Anders: [laughs] It is a funny story. You have to remember here that I am self-taught in the art of writing compilers. Now, I have since discovered and read a lot of the literature, but at the time, I was just coding away and doing it the best way I knew how. When you create a compiler, one of the things you have to create is simple tables, or they're not necessarily tables. They could be linked list, they could be whatever, but you have to look up names. When you're trying to compile a code that tries to assign one to X, well, you have to figure out where is X? What memory address have I associated with X? You have to look it up the declaration of X.

In Turbo Pascal 1.0, the first version of Turbo Pascal, all of the variable declarations were just kept on a linked list because that much I knew, but of course, searching linked lists is not particularly efficient when they get large. For really large functions, that would just take an awfully long time. Then, well, maybe you could try first go by first letter, and then have 26 linked lists or whatever, and that could help a little bit. Then I remember reading the literature about these things called hash tables actually in this book by Niklaus Wirth, Algorithms +, what is it? Yes, Algorithms + Data Structures = Programs.

A great book. He explains hash tables and I go, "Oh my God, that's amazing. I got to go try this." I went and implemented it and boom, the compiler went twice as fast. There's Turbo Pascal version 2.0.

[laughter]

Aarthi: Awesome.

Anders: There's a good reason to sell upgrades. ```

Do you really need a proper parser to sell 1.0? A literal compiler project did not fail due to the lack of a "proper" parser (though it can be argued that the lookup table is after the parsing stage; obviously there was not a strong distinction in the Turbo Pascal implementation between tokenization and compilation).



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: