Hacker News new | past | comments | ask | show | jobs | submit login
Learn C and Build Your Own Lisp (2014) (buildyourownlisp.com)
603 points by AlexeyBrin 9 months ago | hide | past | web | favorite | 86 comments

I have been following this for the past few weeks in my off-time and would recommend it to everyone. It is also geared towards teaching C. I wrote in C many years ago but with the past 8 or so years writing exclusively C# it feels like a homecoming.

It does make use of the author's grammar parser, which is my only minor complaint. Does anyone know a good resource in which the author handcrafts a parser? I'm aware of the various parser generators that exist. I also read the dragon book in my college's compiler design course ~12 years ago. But we also used a parser generator in that class.


That's the algorithm for the common lisp reader. Writing a recursive descent parser based on that spec is fairly straightforward.

And once you hand write your own recursive descent parser you'll probably not want to give up the control that gives you to a parser generator. Yuck.

I was playing with a parser generator last summer, and it was immensely frustrating to bump into bugs it had & features it lacked. In the end, it was much easier easier to write the entire parser myself. So if I understand the point you're making, I agree that giving up control can suck!

You should check out parser combinators. I've also used parser generators in the past and experienced similar frustrations. Parser combinators are just so much easier to use.

Hey, I saw your comment on ketamine but couldn't reply. Are you still taking it?

And then you will write a parser generator with Parsec and you will not want to give up the ease compared to writing your own recursive descent parser :)

Yeah, parser combinators are basically magic, and not in an “overly clever code that you’ll regret later” sense.

And, while a Haskell-style type system makes them nicer, they work really well in dynamically typed languages too. (e.g. https://github.com/drewc/smug )

I'd love a good parser generator to save me writing recursive descent ones roughly annually in my career, but the ones I've encountered are awful.

Hand crafted lisp parser: https://github.com/sctb/lumen/blob/master/reader.l

Equivalent JavaScript: https://github.com/sctb/lumen/blob/master/bin/reader.js

It’s 158 lines of lisp, fully bootstrapped, and zero fanciness.

The "Let's Build A Simple Interpreter" series of posts (starting at https://ruslanspivak.com/lsbasi-part1/) is excellent, geared towards building an interpreter for a Pascal-like language in Python. It does not use a parser generator, and builds up everything incrementally. Unfortunately it's incomplete...

I read through parts of this book a few months back. I also did not like using that parser because I wanted to keep external dependencies to a minimum. Writing a parser for s-expressions turned out to be relatively simple. You can check out my implementation here: https://github.com/00vareladavid/scheme/blob/master/src/pars... . I believe the technique is called "recursive descent parsing".

> Does anyone know a good resource in which the author handcrafts a parser?

http://craftinginterpreters.com/contents.html the book implements the Lox language in Java and C.

I found LLVM’s Kaleidoscope tutorial to be great for learning how to write a parser from scratch.


Honestly, if I were you I would try to write it blind. It's one of the more rewarding things to write, because when you do it correctly, it collapses down to a bunch of macro calls. You'll start off writing it long hand, and eventually you'll start to see some patterns. Then you'll see the need for certain combinators, and then finally you'll see how everything collapses down. It's one of the more rewarding things to write if you're a fan of DRY, and the fuzzy feeling you get when you properly abstract something.

minischeme[0] has a pretty good parser built into the SECD machine.

I've been hacking on it here and there and almost have it r4rs compliant -- replaced the lexer with re2c and was going to do the parser in lemon but the design they already have is just too simple to mess with.

[0]https://github.com/catseye/minischeme --note: the bog stock original is floating around on the internets if you google hard enough.

I read this book about a year ago and later wrote another Lisp [1] from scratch but based on the design in the book. It's still one of my favorite computer science books. He covers quite a range of topics (C programming, traditional Lisp stuff, some other functional programming topics like currying) and the quality is superb. One thing the Lisp interpreter described in the book lacks (if I recall correctly) is a proper garbage collector, which can be an interesting extension to the project if you're up for a challenge.

[1] https://github.com/iafisher/scam

Semi-related, racket http://racket-lang.org/ is a lisp dialect and touts itself as "the world’s first ecosystem for developing and deploying new languages. Make your dream language"

Honest question: why do people like lisp so much? Everything I've been able to learn about the language is that it was designed for a non-register based hardware that never caught on, so now it's just a less effective language that doesn't run as fast as anything else and has a steep learning curve since you need to learn a new programming paradigm. The only good thing I've heard about it is it's simple to implement since it doesn't have a ton of functions, but that doesn't seem like a very good reason to do something just because it's easy to write the code.

I programmed professionally in common lisp for about a decade. Everything you said is exactly backwards:

- You can use whatever paradigm you want.

- Most implementations are fast enough.

- A competent programmer with expert guidance can be effective in common lisp within a day.

- It has many functions, perhaps more than any other language

- It is tricky to implement fully and correctly (I used to contribute to ECL; Embeddable Common-Lisp)

Lisp is perhaps the most effective language.

ESR says[1]: [LISP is worth learning for] the profound enlightenment experience you will have when you finally get it. That experience will make you a better programmer for the rest of your days, even if you never actually use LISP itself a lot. (You can get some beginning experience with LISP fairly easily by writing and modifying editing modes for the Emacs text editor, or Script-Fu plugins for the GIMP.)

However RG says[2]: I just wasn't a very good programmer any more. Lisp's power had made me complacent, and the world had passed me by. Looking back, I actually don't think I was ever a very good programmer. I just happened to have the good fortune to recognize a good thing when I saw it, and used the resulting leverage to build a successful career. But I credit much of my success to the people who designed Common Lisp.

[1]: http://www.catb.org/esr/faqs/hacker-howto.html

[2]: http://coding.derkeiler.com/Archive/Lisp/comp.lang.lisp/2006...

It wasn't designed to run on specific hardware, and originally ran on the IBM 709, which has registers. Lisp Machines were designed to run Lisp, not the other way around.

Why Lisp? For me, it gives me the freedom to get software written with the minimum of fuss in the least amount of time. I can keep my Lisp up and running, while I make changes to my code. I can type in and print out complex data structures without having to write parsers or print methods. I can experiment with data without defining new classes. If some functionality doesn't exist, I can add it: there's no distinction between the language itself and the libraries.

Maybe you should speak to more people who use Lisp, or try it yourself. If you don't know functional programming, you can still write basic Lisp until you learn it.

SBCL is together with luajit the fastest dynamic programming implementation there is. It runs laps around things like python and ruby. It does not have a steep learning curve. You become quite capable quite fast,but there is a lot to an implementation like SBCL and it will take a long time to understand all of it.

The metaprogramming utilities of lisp are still unmatched.

Scheme (my daily driver) is a smaller language. It consists of a small set of well chosen primitives that easily compose to build higher abstractions. It is really nice to work with.

Chez is also in this performance league

do you write about your usage / workflow with scheme ? I want to make it my daily driver but I fail so far

Are there benchmarks of SBCL against LispWorks and Allegro?

Generally they are in the same league. I would expect that SBCL is faster in some benchmarks and applications, but slightly slower in real life Lisp applications. Allegro CL and LispWorks have been used for some very large commercial applications - where the application developers demanded special optimizations for those over the years. The implementors actually don't put too much attention to benchmark performance - but application developers pay for real-life performance.

I would expect that SBCL might have an advantage in some areas where it's easier to write fast code, because of its type inference and compiler hints. That's very useful.

Some benchmarks on smaller machines like Macs and ARM-boards. http://lispm.de/lisp/benchmarks.html

Thanks for the insight.

Not that I know of, but LW 64bit and SBCL are comparable in my experience. Allegro I have heard is slightly slower but with a better memory usage (so maybe like CCL?)

I see.

Just mentioned them, because many ignore them as they are commercial products, yet I would consider them the surviving ones from Lisp Machine days developer workflows.

So I would assume, parallel to the the graphical developer tooling, they also offer quite good compilers.

The compiler is only 1/3 of the equation. One needs also a fast runtime (memory management, signal handling, implementation of language primitives like bignum arithmetic, ...) and a fast implementation of the core language library - since that is also mostly written in Lisp.

Additionally there are roughly three usage modes for the larger CL implementations:

1) interpreter, often used for development/debugging

2) safe compiled code with full error checking and full generics, often with lots of debug info

3) optimized code, with various degrees of unsafeness, with no debug information, often with limited or no generics

Often applications are a mix of 2) and 3), where often 3) is limited to the portion of the code that actually needs to be VERY fast. This means, that the large majority of the code is fully safe and fully generic code -> thus it has a lot of influence how fast the language/application feels.

For example when one uses CLOS (providing a lot of generic and extensible machinery), a CLOS implementation will use a lot of caching to make it run fast -> caches cost memory. Now, when one starts a CLOS-based application these caches need to be computed and filled - which might make the application at start up feel a bit slow or sluggish. So it makes sense when generating an application to save it with the caches pre-filled - then at application start, caches are simply already loaded and there is no performance hit at startup. That's not something one can see in a simple micro-benchmark or which depends on the 'compiler' (the part which compiles code and generates the machine code) - that's performance from the wider CL system architecture -> one needs to be able to provide that to CLOS applications to improve user acceptance.

Which is one of the reasons why CLOS is better than any of the bolted on object systems of scheme (coops, goops etc).

I really want to prefer CL, but I always end up trying to write scheme which kind of works, but quickly becomes weird.

There are many Lisps, some of them rank as arguably the fastest dynamic programming languages there are. They can even compete with statically typed languages in performance. So you should expect Java level performance to be possible on many Lisps. In the 1950s, Lisps were considered too slow in comparison to Fortran and C. You have to think though, (as I explain later), Lisp is arguably the first higher level language. In the 1950s, Java, Python, C#, Ruby, JavaScript, Haskell, F#, would all have been considered too slow also if they existed. Lisps compete with Java, C#, Haskell, etc. in terms of performance.

There's now also more archaic Lisps that don't use a GC, and compile to very efficient machine code, that could compete with C speeds.

Most Lisps actually have too many functions. Not that its bad, but I think you got confused with they require very little primitive functions to mean there standard libs to be small.

On to why, it's hard to explain, but it's arguably the greatest syntax ever designed. In that it's simple and consistent, yet can express all things and easily be extended. You really have to try it to understand though, and give it a good 30 days. It's only once you're familiar with it and past the initial hump that you understand its appeal.

Finally, I'll explain the historical reasons people like Lisps. Its because they pioneered garbage collection, conditionals, closures, multiple inheritance, mixins, dynamic typing, macros, meta-programming, functional programming and REPLs (aka programming prompts). They were also the second OOP language, some people say to this day, only Smalltalk and some Lisps are truly OOP. So there's a lot of love due to its legacy of everything it gave us.

They're now being rediscovered, and because most software can now be successful even at Java's performance and memory usage, Lisps are making a come back. Since they no longer have any defeating drawback.

You missed one feature, most of the IDE ideas related to the work done at Xerox PARC languages (Interlisp-D being one of them) and Lisp Machines. :)

"Everything I've been able to learn about the language is that it was designed for a non-register based hardware that never caught on"

That's a silly thing to say, especially when two of Lisp's primitive operations, CAR and CDR, are literally named after IBM 704's registers and the first implementations were done on register-based hardware (IBM mainframes and DEC minicomputers).

I don't know where you get that it can't be fast, that's certainly not true. Gambit Scheme is just one example of a fast lisp.

People like lisps because they are written as ASTs. So you can manipulate programs (ASTs) to create DSLs very easily.

Metaprogramming. At least that's what I enjoy. In Lisp you can "quote" s-expressions (think "function body"), manipulate the AST and evaluate them how you see fit. It's like Ruby's blocks or Python's metaclasses, but with any construct in the language.

Performance wise it's not slower than languages like Python. For some people (not me) macros simplify their development by a lot. The learning curve is honestly pretty shallow; in my experience Lisp is hard for people who haven't really nailed down recursions, and in this respect it's also a great learning tool. Being simple to implement is a big plus as it allows you to focus on the crucial parts -- IMO people learn better when they get to learn one thing at a time and having a small core to your language is important.

> Performance wise it's not slower than languages like Python.

That's a low bar. Performance wise, most languages are faster than Python.

However, CL and Scheme both blitz Python when it comes to performance.

Are there any CL implementations that are as slow as python?

The slowest common lisp I'm aware of is CLISP (used in the Land of Lisp book). I don't even think it is nearly as slow as Python. In fact, even Picolisp (weird, but cool interpreted Lisp outside the Common Lisp and Scheme families) is probably faster than Python, but I've never checked.

I am just providing a lower-bound to its performance.

I doubt anyone in the modern era thinks about 60's hardware when evaluating Lisps. If this is what you think of them, you're reading some odd stuff. Check out Paul Graham's On Lisp, or Seibel's Practical Common Lisp. Most people using lisps aren't implementing them.

People are drawn to them for various reasons, among them: strong metaprogramming facilities, real homoiconicity, DSLs, multiple paradigms, REPL development, hot-loading, and sheer elegance.

why do people not like it so much? i mean, nearly everything you have said is simply incorrect, so it’s hard to know where to start.

No comment on why people like to use it, but I know why they like to learn it: it's a new programming paradigm, as you said. It broadens the way you think about code.

Clojure/ClojureScript are pretty popular for a language that "never caught on."

OK reread and believe you were referring to architecture not language. My mistake.

You can also support the author by purchasing the book from Amazon. (: http://www.amazon.com/Build-Your-Lisp-Daniel-Holden/dp/15010...

There is also "Write Yourself a Scheme in 48 Hours"[0].

I am looking towards following this tutorial.

[0]: https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_...

Not a very good tutorial if you want to learn how to make a lisp interpreter. It uses a parser combinator library, when in reality a lisp parser is so simple to implement that all you need is basic C skills. Infact it feels like a self promotion of his combinator library than an actual tutorial. Better study MAL.

As usual, the classical book for anyone interested digging into Lisp development environments, is "Lisp in Small Pieces".

this book is really beautiful. it is in fact filled with exposition about how to build a lisp, but also how to bootstrap a language (or more generally any non-trivial environment).

queinnec went to some pains to make this book accessible, in particular by heavily reducing his usual reliance on denotational semantics. but its not really a good causal introduction to the topic, it requires a certain amount of work to get through.

but can't recommend this book strongly enough, maybe as the ideal followon to SICP

I've spent an hour on this now, and I've learned more about actual C programming than in several previous attempts put together. What a teacher!

Pretty comprehensive site I should say. Even if not for LISP, it contains good patterns for C, lexers, parsers, interpreters, etc.

As someone interested in Language. I’ll definitely be giving more thought to it.

This is a solid exercise. I particularly enjoy reimplementing the C primitives in Lisp once the system is bootstrapped. I used a quine as one of my unit tests for EVAL too.

Is there a guide like this that uses Golang instead of C?

You could try MAL (Make A Lisp) - it has a go version.


Another way to describe the MAL concept is that it aspires to be implementation-language agnostic, so that it gives you a series of steps that you're meant to follow regardless of what language you're writing your interpreter in. So the fact that someone else has (or hasn't) already succeeded in doing it in language X needn't affect your ability to do it yourself with language X.

I worked through most of MAL with Python and am currently doing it in Haskell. It's a great experience! (I think a few of the concepts could be explained in more detail, though. Maybe I should send in a patch.)

I highly recommend MAL. It does a great job introducing these concepts.

MAL is great!

Writing an interpreter in Go https://www.amazon.com/Writing-Interpreter-Go-Thorsten-Ball/... is pretty good. The book assumes that you know Go and the resulting interpreter is not a Lisp.

I had fun extending the sample language to add new features https://github.com/skx/monkey/ though so far I've still not read the book.

I've been thumbing through that every so often (as is, presumably, the person who stole the package off my doorstep the first time they shipped it) and it's easy enough to follow without having ever written a single line in go.

Though I would suggest reading other sources on how pratt parsers work if you want to fully understand the pure simple genius behind them.

I recently wrote slightly similar example, much simpler, where I essentially implemented 10% of Basic in modern C++: https://github.com/Const-me/MicroScript

the question is which one learns you c the better? learn c and build your own lisp? or learn lisp and build your own c? both poses challenges with regards to the inner workings of c.

The comments here mention a lot of advanced stuff like recursive decent and compiler design. Should a second year student get into this book?

That's a question that doesn't have any really good answer, right? Unless you're looking for someone to say 'yes, unconditionally' or 'actually [x] requires masters-level+ mathematics'.

I'm self-studying and have been working through 'the Dragon book', regarding compilers. I like Lisp due to beginning my learning with HtDP and SICP, and the functional paradigm appeals to me. I'm just beginning K&R C, and Kernighan's Unix Programming Environment. Naturally I'm really excited to find this here so I'll definitely be working through it.

Were I more concerned about ticking boxes on my CV for the person in HR then I probably wouldn't bother.

I got into compilers when I was still in high school, I was 16 years old at the time.

It is only a matter of reading the required material and get going at it.

I'm currently looking into the clojure source code with the tim daly's book.

A great way to learn about the value of syntax.

Does this cover memory management?


Every value has both a constructor that assigns memory, and a destructor that frees it. Every expression also reallocates it's memory as it processes. So memory is managed and doesn't leak, but it doesn't quite fit with what we know as a GC, and could trash performance in some cases.

And Tail-Call-Optimisation isn't really touched on.

If you excuse the plug:

Scheme 9 from Empty Space (https://www.t3x.org/s9book/index.html) explains GC, tail call elimination, continuations, macros, bignums, flonums, etc.

doesn't seem to do GC or TCO!

I've got those in my scheme https://github.com/rain-1/scheme_interpreter but it isn't perfect

Find the style and code lovely.

Is this a good resource to beat the fear of recursion out of me? I get recursion, but it’s still really uncomfortable to think about.

I liked Common Lisp: A Gentle Introduction to Symbolic Computation. You can find its first edition online (free of charge): https://www.cs.cmu.edu/~dst/LispBook/

Apart from explanations, it also contains exercises and their solutions.

Woah! Thank you for the suggestion!

Just read The Little Schemer - the best book on recursion.

I'll check it for sure. Thanks!

learn Haskell or OCaml or something

alternatively just write more code because tbh that sounds like you just started programming

I find this dismissive comment funny: I started programming a long time ago in BASIC then assembly then Pascal then C and I still don't like recursion. I remember having to fix a bug in a recursive code that I didn't write: my feeling is that it is more difficult to debug recursive code than normal code.

If you want a job as an entry level programmer, do not bother learning C, go with something like C#, JavaScript, Java, or Python.

The author doesn't really list any viable reasons to use C over any more modern, convenient language. Just the whole "shows you how memory management really works". A lot of C programmers remind me of the die hard Gentoo/Arch Linux people who think it'll help them "show you how Linux really works".

The value of learning asm and C doesn't necessarily come from using these languages, but lies in understanding how a CPU works and how memory is laid out and managed. So that you can put abstractions and automatic memory management to good work. There's a reason Knuth uses his MIX assembler language for discussing algorithms and algorithmitic complexity analysis. If you don't know these things, you basically don't know what you're doing when you're programming a computer.

Maybe, but the author writes: "This book is for anyone wanting to learn C, or who has once wondered how to build their own programming language ... But C is not about software development and careers."

So the author has something else in mind: building a high level language from the hardware up. C#, JavaScript, Java, and Python already contain much of the basic functionality of Lisp, most importantly a garbage collector, memory safety, and high-level data types.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact