Interviews with programming language creators (2010) [pdf] (bas.bg)
212 points by MrXOR 22 days ago | 44 comments

Here's a quote from the Ada interview with S. Tucker Taft:

> Do you have any advice for up-and-coming programmers?

> Learn several different programming languages, and actually try to use them before developing a religious affection or distaste for them.

> Try Scheme, try Haskell, try Ada, try Icon, try Ruby, try CAML, try Python, try Prolog. Don’t let yourself fall into a rut of using just one language, thinking that it defines what programming means.

> Try to rise above the syntax and semantics of a single language to think about algorithms and data structures in the abstract. And while you are at it, read articles or books by some of the language design pioneers, like Hoare, Dijkstra, Wirth, Gries, Dahl, Brinch Hansen, Steele, Milner, and Meyer.

I think this point...

Try to rise above the syntax and semantics of a single language to think about algorithms and data structures in the abstract.

...is a stage that not many programmers reach (I certainly haven't). Agree or disagree? Or do you think its importance is overstated?

I agree with him.

Read CS books about algorithms and data structures that use pseudo-code instead of specific languages.

Lambda calculus, denotational semantics, theory of objects, ...

Then see how those concepts map to the languages that you use daily.

A very contrived example, a B-Tree doesn't stop being a B-Tree just because you switched languages. It just might require a different way to map the abstract concept "B-Tree" to the actual hardware, using the specific set of language features available.
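To make that concrete, here is a minimal sketch (my own, hypothetical example) of the language-independent part of a B-tree: the invariant. The `BTreeNode` class and its fields are illustrative, not from any particular codebase; only the representation (Python lists here, raw arrays in C) would change across languages.

```python
class BTreeNode:
    """One node of a B-tree: sorted keys, plus children if internal."""

    def __init__(self, keys, children=None):
        self.keys = keys                # sorted list of keys
        self.children = children or []  # empty for leaf nodes

    def is_valid(self, order):
        # The abstract invariant is the same in any language:
        # keys are sorted, an internal node has len(keys) + 1 children,
        # and a node of the given order holds at most order - 1 keys.
        if self.keys != sorted(self.keys):
            return False
        if self.children and len(self.children) != len(self.keys) + 1:
            return False
        return len(self.keys) <= order - 1


leaf = BTreeNode([3, 7, 9])
root = BTreeNode([5], [BTreeNode([1, 3]), BTreeNode([7, 9])])
print(leaf.is_valid(5), root.is_valid(5))  # True True
```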

It is absolutely correct. And if you can, you should try to gain some understanding of the inner workings of the languages you use. That often explains why feature X is not implemented.

Absolutely spot on ... learn assembly language, then use it to draw 3D graphics and implement I/O or your favorite data structure ... a true eye-opener

I will be immodest and say that I reached that stage a long time ago. Because, for better or worse, I learned CS before I really learned to program :-/ I did program in middle/high school, but not very often, and I was bad at it. I was better at reasoning about data structures than debugging, and I think the latter skill is "table stakes" before calling yourself a programmer.

As a concrete example of thinking outside a particular language, my shell Oil is 24K lines of code that does most of what bash does in 140K lines, written not in plain Python, but in:

1. OPy, a subset of Python 2 -- http://www.oilshell.org/blog/2018/03/04.html#faq

2. Zephyr ASDL -- http://www.oilshell.org/cross-ref.html?tag=zephyr-asdl#zephy...

3. a mathematical dialect of regular expressions (that runs under both Python's re engine and re2c's dialect, via translation).

In other words, it's a composition of DSLs, independent of any language.

I started a translator from OPy (typed with MyPy) to C++. It partially works but doesn't translate the whole codebase yet. If it works then I will have fully achieved this "language abstraction" goal.

It's analogous to how TeX was implemented by Knuth. TeX is not implemented in Pascal and compiled with a Pascal compiler. It's implemented in H-Pascal, an abstract subset of Pascal. Then it's translated to C and compiled with a C compiler!


In case it isn't obvious, this makes the style of code VERY different and more abstract than what you see in every other codebase.

This approach has downsides -- namely that the shell is way too slow right now. Hopefully the translation will fix that. Either way, I can definitely claim that the ideas and architecture are completely separate from any programming language. I've documented a lot of these ideas on the blog:


Interesting, is there a name for this subset pattern? I write in subsets too. Basically, simplify the language slightly to make it easier to build ad hoc parsers and code generators for the problem domain. Linting on steroids, perhaps.

I don't call it this, but maybe:


i.e. composing different DSLs to solve a problem. The codebase is separated into parts and each part may be expressed in a different language.

I tend to call it "metaprogramming", which might be vague but captures the spirit IMO. Metaprogramming is how you implement bespoke DSLs. I categorize textual code generation as the most basic form of metaprogramming, i.e. programming where your data is code.
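A toy illustration of "your data is code" (names here are hypothetical, not from the Oil codebase): generate Python class source as text, then execute it.

```python
# The simplest form of metaprogramming: the program's data is Python
# source text, which we generate and then execute.

FIELDS = ["name", "args", "redirects"]

def gen_class(class_name, fields):
    """Emit the source text of a class with one attribute per field."""
    lines = [f"class {class_name}:"]
    params = ", ".join(fields)
    lines.append(f"    def __init__(self, {params}):")
    for f in fields:
        lines.append(f"        self.{f} = {f}")
    return "\n".join(lines)

source = gen_class("Command", FIELDS)
namespace = {}
exec(source, namespace)  # compile the generated text into a real class
cmd = namespace["Command"]("ls", ["-l"], [])
print(cmd.name)  # ls
```

In a real project the generated text would be written to a file and checked in or compiled, rather than `exec`'d, but the principle is the same.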

It's one level removed -- rather than talking about the problem, you're talking about the tools/language/construct you're using to solve the problem.

And yes I would say the other downside is that it's easy to go off the deep end :) But I think that certain problems are difficult and you need some leverage to solve them. For example, writing a bash-compatible shell would be extremely repetitive otherwise. It's like a dozen different ways of groveling through backslashes and braces one-by-one.

I'm a big fan of Language Oriented Programming and Racket, but I guess here what I am talking about is an "in moderation" flavor of that, where you are not building new languages by addition, but rather by subtraction (sticking to a smaller subset of an existing language). It's certainly the same idea, but this one is more of a pragmatic approach that hopefully helps you avoid falling into unanticipated complexity traps that you encounter when building a new language from the bottom up.

That's true -- I don't know of a name for it. There should be one!

Someday I would like to define the subset of HTML I actually use... HTML is messy with lots of implementation quirks. And plenty of tools operate on it in ad hoc ways.

It would be nice to have some notion of correctness for those tools, and defining a subset seems like the best way to do that.

Anytime there's a pattern in language design I don't know the name for, I turn to my monoid-loving coworker next to me and ask what the horribly complicated term for it is in Haskell.

For now how about "Chop" Programming? "Choose Half" Oriented Programming. Or "Super" programming for "Subtract Unwanted Programming Elements Rigorously".

HTML5 is supposed to be a specification of HTML, with conformance tests and everything.

Have you got an article where you talk a bit about point #3 - "a mathematical dialect of regular expressions"? The words all make sense, but I cannot picture it :-). Your Oil posts always appear on lobste.rs shortly after they're posted and I thoroughly enjoy them, but I must have missed a few!

I don't know if this is relevant to what Chubot's doing but Brzozowski’s Derivatives of Regular Expressions are pretty mathy. (Disclosure, I wrote this) http://joypy.osdn.io/notebooks/Derivatives_of_Regular_Expres...
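The derivative idea fits in a few lines of Python. This is a textbook sketch of Brzozowski's algorithm over a tiny regex AST (tuples standing in for Empty, Epsilon, Char, Cat, Alt, Star), not code from the linked notebook:

```python
# Brzozowski derivatives: deriv(r, c) is the regex matching what r matches
# after consuming character c. Matching is just repeated differentiation.

EMPTY, EPS = ("empty",), ("eps",)

def nullable(r):
    """Does r match the empty string?"""
    tag = r[0]
    if tag == "eps" or tag == "star":
        return True
    if tag == "empty" or tag == "chr":
        return False
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return nullable(r[1]) or nullable(r[2])  # alt

def deriv(r, c):
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "cat":
        d = ("cat", deriv(r[1], c), r[2])
        return ("alt", d, deriv(r[2], c)) if nullable(r[1]) else d
    if tag == "alt":
        return ("alt", deriv(r[1], c), deriv(r[2], c))
    return ("cat", deriv(r[1], c), r)  # star

def matches(r, s):
    for c in s:
        r = deriv(r, c)
    return nullable(r)

ab_star = ("star", ("cat", ("chr", "a"), ("chr", "b")))  # (ab)*
print(matches(ab_star, "abab"), matches(ab_star, "aba"))  # True False
```

In practice you'd also simplify the intermediate regexes (e.g. `alt(EMPTY, r)` to `r`) to keep them from growing, but the core really is this small.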

That's cool -- I have a big page of notes on derivatives and have thought about replacing my usage of re2c with them (re2c being automata-based).

One benefit of that is aesthetic -- re2c is a 30K line piece of C code itself. The other benefit is practical -- it would be nice to "hoist" it up to the Oil language level, so shell users can use efficiently compiled regexes.

And I filed this issue to try derivatives a while ago:


There is a possible performance win, but I'm not sure if it makes sense. If you see anything interesting there I'd love to chat about it! (contact info in profile)


What I mean by "math" is either automata-based methods or derivatives. In contrast, Perl-style backtracking engines aren't math.

To me, the "magic" of "regular languages" is that they give you non-determinism for free. The equivalence between NFAs and DFAs isn't obvious, and it's useful in practice.

I've taken advantage of this free nondeterminism in my huge shell lexer. (See my other reply for details)
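As a toy illustration (not Oil's actual lexer), the standard trick is to union all token patterns into one big alternation; the "which token starts here?" nondeterminism is handled inside the single combined pattern:

```python
import re

# Hypothetical miniature lexer: each token is a named alternative in one
# combined regex. The engine resolves which alternative matches at each
# position -- that's the "free nondeterminism" of regular languages.

TOKENS = [
    ("NUM",   r"[0-9]+"),
    ("NAME",  r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP",    r"[+\-*/=]"),
    ("SPACE", r"[ \t]+"),
]
LEXER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKENS))

def lex(s):
    pos = 0
    while pos < len(s):
        m = LEXER.match(s, pos)
        if m is None:
            raise ValueError(f"bad character at {pos}")
        if m.lastgroup != "SPACE":
            yield (m.lastgroup, m.group())
        pos = m.end()

print(list(lex("x = 10 + y2")))
# [('NAME', 'x'), ('OP', '='), ('NUM', '10'), ('OP', '+'), ('NAME', 'y2')]
```

Because the token patterns are all regular, the same alternation can be fed to an automata-based compiler like re2c and become a single DFA.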

I'm enamored of dRE (as I call it) but pragmatically it seems to me you could stick with re2c.

I think it's interesting to generalize dRE to general control-flow (not just parsing.)

Are you aware of Abstract State Machines? https://en.wikipedia.org/wiki/Abstract_state_machine

I've discussed related topics but haven't had the chance to address it directly. I'm glad someone is paying attention :)

Concretely, the first two links in this post show (old versions of) frontend/lex.py and osh-lex.re2c.h. TODO for me: put up the latest versions, as well as the huge C file with state machines that re2c eventually generates.

When Are Lexer Modes Useful? http://www.oilshell.org/blog/2017/12/17.html

It works like this:

    ( ) frontend/lex.py (a bunch of Python regexes, has some "metaprogramming") ->
    (+) frontend/lex_gen.py ->
    ( ) osh-lex.re2c.h (re2c input file) ->
    (+) re2c ->
    ( ) osh-lex.h (state machines in C, i.e. a DFA as a big switch/goto)
where the (+) nodes are compilers, and the ( ) are source code files.

The reason I call this "a mathematical dialect" is because the same regular expressions run under Python's re engine (a backtracking engine) and as native code via re2c, an automata-based compiler.

If you scroll toward the bottom of this doc there's a useful table:

Regex Theory and Practice http://www.oilshell.org/share/05-31-pres.html

One side is "Perl-style regexes", which Python's engine is based on. The other side is "regular languages". Regular expressions were always mathematical, but the name got taken over by programmers to mean something different, so I call them "regular languages" or this "mathematical dialect".
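A quick illustration of the split (my own example, not from the linked doc): backreferences are a Perl-style extension that no automata-based engine can implement, because the language they describe is not regular.

```python
import re

# Perl-side only: a backreference. The language { ww : w in a+ } is not
# regular, so no DFA/NFA engine can express this pattern.
perl_only = re.compile(r"(a+)\1")
print(bool(perl_only.fullmatch("aaaa")))  # True  ("aa" + "aa")
print(bool(perl_only.fullmatch("aaa")))   # False (no equal split exists)

# The mathematical dialect sticks to operators with automata semantics --
# literals, alternation, concatenation, repetition -- so the same pattern
# text also works under a DFA compiler like re2c.
portable = re.compile(r"(ab|cd)+")
print(bool(portable.fullmatch("abcdab")))  # True
```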

These articles explain the difference,


but they're very long and a lot of people still don't understand the difference. https://news.ycombinator.com/item?id=20311630

(which is understandable since it mainly comes up in performance corner cases, and when you want to compile regexes, which most people don't do)

I should write about this, but the lexer is one of the more solid pieces of the project. That is, it's "done" for now, and I need help with all the other parts, so I prioritize writing about those parts!!!

More lexing posts: http://www.oilshell.org/blog/tags.html?tag=lexing#lexing

Learn what you need the machine to do to accomplish a task. In theory, the language shouldn't matter much. In practice, I find syntactical conciseness important.

You remind me of one of my favorite jokes: "Computer science could be called the post-Turing decline in the study of formal systems."

"Start with a brand new language and you essentially start with minus 1,000 points. And now, you’ve got to win back your 1,000 points before we’re even talking. Lots of languages never get to more than minus 500. Yeah, they add value but they didn’t add enough value over what was there before." - Anders Hejlsberg

I think this is very insightful. It's not enough to add (some) value. A language has to add enough value to matter to enough people in enough situations, or it doesn't gain any traction.

That’s a great quote. It also explains why we are seeing fewer new languages per developer nowadays, but far more packages. (There are about 10k languages, over 2M packages.)

Sometimes your job is just to be a shoulder on whom others can stand. If you have a new idea but can't figure out everything, it doesn't mean you should abandon the idea. New and better languages often have taken things from other failed efforts. It's not all that bad a thing to have created a language that is -500 from that perspective.

The xkcd "Standards" comic is always relevant:


Mindblowing quote from Anders Hejlsberg: “You can sort of view the sharp sign as four pluses, so it’s C++++” [1]

Never thought about it that way.

[1] https://www.computerworld.com.au/article/261958/a-z_programm...

I always thought of it as (only) two pluses, just differently arranged (diagonally overlapped).

Yeah but I think the idea is that it's (C++)++, i.e., the version that comes after C++ (which is the version that comes after C).

Which does not compile in C++ or C# :)

Only because postincrement returns an rvalue. (++C)++ would be valid :)

I always thought that is obvious :)

Impressive to see how the design decisions of Clojure hold up so well 10 years later.

Also couldn't help but smirk when I read the answer to "What do you think will be Clojure’s lasting legacy?":

> I have no idea. It would be nice if Clojure played a role in popularising a functional style of programming.

Stroustrup C++ 'interview' [1]

[1] https://www-users.cs.york.ac.uk/susan/joke/cpp.htm

I loved reading Coders at Work [0], it had a wealth of insight. This looks similar and interesting!

[0] https://en.wikipedia.org/wiki/Coders_at_work

Nice, this looks pretty similar to Coders at Work, and they both came out around the same time. Does anyone know if they share a common origin?

I'm a bit disappointed that there's no DMR here, but seems to be a pretty interesting document otherwise.

Huh. Guido van Rossum's title was originally "First Interim Benevolent Dictator For Life".

Where is the interview with the author of Brainfuck??

Our minds would not be able to comprehend the answers.

I know you are joking, but as an esoteric programming language nerd...

Actually, brainfuck is pretty straightforward: it's basically a very low-level language for dealing with Turing tape machines using only 8 primitives.

It's definitely not very human readable, but then, neither is actual machine code -- we use assembly mnemonics to be able to read and understand it.
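To back that up, all 8 primitives fit in a tiny interpreter. This is a generic sketch (the 30,000-cell tape size is the conventional choice, not part of any spec):

```python
def run(program, input_bytes=b""):
    """Interpret brainfuck's 8 primitives: > < + - . , [ ]"""
    tape, ptr, out, inp = [0] * 30000, 0, [], list(input_bytes)
    jumps, stack = {}, []
    for i, op in enumerate(program):   # pre-match the brackets
        if op == "[":
            stack.append(i)
        elif op == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    pc = 0
    while pc < len(program):
        op = program[pc]
        if op == ">":   ptr += 1                      # move tape head right
        elif op == "<": ptr -= 1                      # move tape head left
        elif op == "+": tape[ptr] = (tape[ptr] + 1) % 256
        elif op == "-": tape[ptr] = (tape[ptr] - 1) % 256
        elif op == ".": out.append(tape[ptr])         # output current cell
        elif op == ",": tape[ptr] = inp.pop(0) if inp else 0
        elif op == "[" and tape[ptr] == 0: pc = jumps[pc]  # skip loop
        elif op == "]" and tape[ptr] != 0: pc = jumps[pc]  # repeat loop
        pc += 1
    return bytes(out)

# 8 * 8 + 1 = 65 = ASCII 'A'
print(run("++++++++[>++++++++<-]>+."))  # b'A'
```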

The most important language is missing :P Java

C is also not there, nor any Lisp.

Detailed historic material about these is copious.

It's interesting to see interviews like the one with Luca Cardelli of Modula-3 and such.

> C is also not there, nor any Lisp

Shots fired at Rich Hickey and Clojure, then, as they are in the PDF.

The TOC lists C# and C++, but no C.

And Modula-3 but not Modula-2, Pascal or Oberon.

Well, those three would be one interview with the same person. The problem here is getting an interview, not just gathering historic material. Anyway, there is no shortage of detailed historic knowledge about the Wirth languages.

