(How to Write a (Lisp) Interpreter (in Python)) (norvig.com)
222 points by alexkay on Sept 30, 2010 | 39 comments



Be sure to also check "(How to Write a ((Better) Lisp) Interpreter (in Python))": http://norvig.com/lispy2.html


He implements call/cc with Python's exceptions. Pretty clever.


It does mean you don't get the full power of call/cc: you can't invoke a continuation again after it's completed. There are some cool things that this stops you from doing -- e.g., using call/cc to implement a backtracking searcher.
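For anyone curious, the exception trick looks roughly like this -- a minimal sketch in the spirit of lispy2's callcc, not the exact code:

    def callcc(proc):
        "Call proc with an escape-only continuation, implemented as an exception."
        ball = RuntimeWarning("can't continue this continuation any longer")
        def throw(retval):
            ball.retval = retval
            raise ball                  # unwind the Python stack back to this callcc
        try:
            return proc(throw)
        except RuntimeWarning as w:
            if w is ball:
                return ball.retval      # our continuation was invoked; hand back its value
            raise                       # someone else's escape, keep propagating

So callcc(lambda throw: 10 * throw(3)) returns 3, but once the try block has returned there is no Python stack left to unwind to, which is exactly why you can't re-enter the continuation to do backtracking.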


Every time I read code from Norvig, programming seems so simple and powerful, and I wonder how it is I spend such effort getting such meager results.


The prettiness of the solution is bounded on top by the prettiness of the problem. The problems we usually solve aren't very pretty.

Granted, some of that ugliness is of our own creation... but not all of it by any means.


This is true. But it's still worth trying to write an interpreter for any language. I made up my own language which is a slimmed down and extremely consistent version of JavaScript. I was making the problem easy for myself but I still felt that I learned something - a way of thinking. Plus at the time my day job was throwing up some very unpretty problems. Writing the interpreter on the side generated momentum that kept me going through all the crap. I often think this may be the real benefit (intended or otherwise) of Google's decision to let its employees work on a side project one day a week. It keeps them inspired. It also keeps them from getting creatively frustrated and taking it out on the product code by overengineering it.


Ten years of deliberate practice. (http://www.norvig.com/21-days.html)

Actually, in his case, more like 35.


For those interested in a more in-depth treatment of Lisp interpreters: "Lisp in Small Pieces", by Christian Queinnec is one of the canonical references in that area. (http://www.amazon.com/exec/obidos/ASIN/0521562473)

At university we had a course on the implementation of functional programming languages following "The Architecture of Symbolic Computers" by Peter Kogge, which is very good, too. (http://www.amazon.com/exec/obidos/ASIN/0070355967)


I like the quote that a powerful language should fit a page of code:

"I asserted that you could define the "most powerful language in the world" in "a page of code." I had orignally made the boast because McCarthy's self-describing LISP interpreter was written in itself. It was about "a page", and as far as power goes, LISP was the whole nine-yards for functional languages." --- Alan Kay in http://gagne.homedns.org/~tgagne/contrib/EarlyHistoryST.html


Norvig references Alan Kay's "Maxwell's equations of Software": http://queue.acm.org/detail.cfm?id=1039523

I thought this was a particularly interesting nugget:

""" Just as an aside, to give you an interesting benchmark—on roughly the same system, roughly optimized the same way, a benchmark from 1979 at Xerox PARC runs only 50 times faster today. Moore’s law has given us somewhere between 40,000 and 60,000 times improvement in that time. So there’s approximately a factor of 1,000 in efficiency that has been lost by bad CPU architectures.

The myth that it doesn’t matter what your processor architecture is—that Moore’s law will take care of you—is totally false. """


Processors have also gotten dramatically cheaper. How much computing power can you get with modern hardware if you spend the same amount Xerox PARC would have? That cuts out a good portion of the discrepancy.


I've always wondered about this quote. What, exactly, accounts for the differences here? What is it about that architecture that was so much better than modern architectures?


Actually the early Lisp had some bugs (or strange features). Like no static scoping. But they fixed it.


"They" as in Scheme.


Emacs Lisp still has that "bug."
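For anyone poking at the code: lexical scope in a Python-hosted Lisp comes down to chaining environments. A simplified sketch along the lines of lis.py's Env class:

    class Env(dict):
        "An environment: a dict of {'var': val} pairs, plus a link to an outer Env."
        def __init__(self, parms=(), args=(), outer=None):
            self.update(zip(parms, args))
            self.outer = outer
        def find(self, var):
            "Return the innermost Env in which var is bound."
            if var in self:
                return self
            if self.outer is None:
                raise NameError("unbound variable: " + var)
            return self.outer.find(var)

A lambda hangs on to the Env it was defined in, so lookup walks the chain of definitions rather than the caller's bindings; that's the difference from the dynamic scoping of the earliest Lisps and classic Emacs Lisp.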


I love how short and simple the tokeniser is:

    def tokenize(s):
        "Convert a string into a list of tokens."
        return s.replace('(',' ( ').replace(')',' ) ').split()
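For example, it just pads the parens with spaces and splits on whitespace:

    >>> tokenize("(+ 1 (* 2 3))")
    ['(', '+', '1', '(', '*', '2', '3', ')', ')']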


It's simple because it doesn't do much. A real tokeniser would have the ability to handle strings, escape characters and comments.


And that's why it was changed to this kind-of ugly function in lispy2:

    import re

    def tokenize(s):
        """Separate string s into tokens.  A token can be:
        a comment (which is ignored); a paren or ,@ or , or quote or quasiquote; a non-string atom;
        or a string consisting of quotes around (non-quotes or backslash plus anything)."""
        tokens = re.findall(r"""\s*(;.*|,@|[('`,)]|[^\s('");]+|"(?:[\\].|[^\\"])*")\s*""", s)
        return [t for t in tokens if not t.startswith(';')]
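It does earn its keep, though: strings with embedded parens survive intact and comments are dropped in a single pass. For example (my own illustrative input):

    >>> tokenize('(display "hi (there)") ; a comment')
    ['(', 'display', '"hi (there)"', ')']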


Kind of an unrelated question. But the phrase without the text inside the parentheses would be "How to write a Interpreter". Wouldn't "An" be a better choice?

In this case, what is the best option: to be faithful to the original sentence, or to stick to what sounds best with the parentheses inserted?


I’d try to follow the rule that a sentence should always work without the parenthesis (counting the preceding space as part of the parenthesis). So I would do “an”.

A similar problem comes up with capitalization in sentences beginning with a parenthesis. For example, in this:

(Luckily) Some are already there.

Capitalizing “Some” looks weird, even granted it’s a clumsy construction. But I think it’s probably best to do it that way.


Thanks. I thought so myself too: though it would look and maybe sound awkward with the parentheses, the original phrase should take precedence.

PS. I just read my comment and noticed I may have come off as a Grammar Nazi or something like that. It was honest interest in the rules of composing sentences with parentheses (which I don't know myself).


I don’t think there’s necessarily a hard-and-fast rule about this, and I find sentences sometimes read easier when a/an is set by the immediately following word inside parens, and sometimes when the choice is set by the word after the parens.


I think in this case it's probably best as is...reading the title as 'Lisp syntax' before English.


Norvig's interpreter is pretty solid, but another fellow did it recently as well: http://fogus.me/fun/lithp/


Python's a great language for teaching this. In my undergrad CS languages course we wrote both an interpreter and a compiler (to MIPS assembly) for Lisp in Python in just a couple of weeks. I'd written an interpreter for another language in C++ once and spent 3/4 of my time getting pointers and other language constructs right.


glad to see this up here.

this is a pretty standard exercise in university programming language theory courses. last year when i took the course, we had to write an OCaml interpreter in OCaml, and had about a week to do it (complete with environments, bindings, expressions, custom operators, higher-order functions, etc...). it was challenging, but paid off as far as contributing to my understanding of a) how functional languages work and b) how interpreters work. an interesting follow up would be to write a compiler for the language as well.

if writing an interpreter is something you've never done before then this is, by all means, a worthwhile activity. it would have been nice had it been structured as a series of descriptions and exercises rather than with the answers posted along with it. so easy to look :)


I am in awe every time I read some of Norvig's code; it seems so simple, short and readable. He is truly a coding master :))


Corrected Title:

(HowTo (In (Write LispInterpreter) Python))

[if the title actually was LISP syntax, everybody would be programming in LISP today]


Oh, for crying out loud! Have you no sense of history? The whole point of Lisp and Scheme is that you can write the interpreter in Lisp or Scheme! It's a sad day when you have to write a Lisp interpreter in Python, which is itself an interpreter written in C.

God has killed so many kittens because of you. ;-)


Is it turtles all the way down?


Well, the benefit of writing a Lisp-ish interpreter in Lisp is that you can focus on the semantics, rather than how things like garbage collection, type tagging, and symbol interning are implemented. When using another language, the relevant infrastructure might not already be lying around, but you can take advantage of its novel features instead.
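In Python's case, quite a lot of that infrastructure does come with the host. A sketch along the lines of lis.py's reader, where symbols are just Python strings and numbers are Python numbers:

    Symbol = str          # symbols piggyback on Python strings

    def atom(token):
        "Numbers become Python ints or floats; every other token is a Symbol."
        try:
            return int(token)
        except ValueError:
            try:
                return float(token)
            except ValueError:
                return Symbol(token)

Garbage collection, bignums and hash tables all come for free from CPython, so the interpreter can stay focused on the semantics.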


That may be valuable, but there's value in this too.

I've had a play this morning, and am already most of the way towards having the readline sitting in Twisted and the instance feeding a webserver. Which is to say: Lisp notation with all the benefit of Python's great libraries, and I understand the layers completely.

No kittens were killed.


I wasn't the one who said the trite thing about kittens - I think writing toy interpreters and compilers is a great way to figure out how languages work.


yeah ... that was my point ... it's a tradition to write Lisp in Lisp / Scheme in Scheme but it's also a testimony to the simplicity of Lisp 1.0. The actual first functioning Lisp interpreter was written in IBM 704 assembler, but once that was done, everything else from that day forward could have been done in Lisp. A fair amount of it was, once macros and Lisp compilers showed up.


Lisp is not particularly unique in that regard. Forth is even simpler (perhaps too simple), and it's not that hard to write a naive implementation of Joy or Tcl.

Besides, piggybacking off the existing Lisp implementation doesn't help much when you're trying to figure out how the infrastructure (GC, etc.) is implemented. I learned a LOT implementing a Scheme interpreter in OCaml a couple years ago, and wouldn't discourage anyone from doing so in whatever language they feel most comfortable with. (I'm actually taking a break from writing a small compiler right now, for a proprietary database query language we need to convert.)

I think you are/were mostly being voted down because of your tone.


Relax :)


Upvoted just for the title. :-)


To be honest, when I first read the title I thought that it was yet another basic article about writing a basic Lisp interpreter (Google returns 47,000+ links on this subject). But of course when I saw it was written by Lisp über guru Peter Norvig I knew it'd be good.

(I was amazed when I saw that he wrote a Scheme interpreter in Java in order to learn Java better: http://norvig.com/jscheme-design.html)

Edit: fix grammar.


I think that's a natural course of action for people comfortable with formal semantics of languages: "implement it, or implement something in it" is an excellent way to learn.

OTOH, one gets a lopsided view of the language. I have been learning Java for the last 1.5 months, most of it hacking on ABCL. I have got a good understanding of the runtime system, reflection, class-loading, security and the type-system. But I still look up primitive stuff like string manipulation and file I/O operations.

I miss Lisp a lot, but Java makes up in platform what it lacks as a language. You can use class instrumentation libraries to generate and manipulate code at run time. It's well worth tolerating, at least until I know enough of it to wrap the libraries I need to use from ABCL.

Eclipse is nice once you ditch the mouse and install the Emacs bindings plugin.



