
Simple steps to implementing a programming language - kvalle
http://kjetilvalle.com/posts/implement-a-programming-language.html
======
Udo
Inventing your own programming language is something I can highly recommend.
If you've never done it before it will dramatically broaden your programming
horizon. It's a journey.

About a year ago I decided to make a language mainly because I wanted to
experiment with some programming paradigms. I had made little domain-specific
"languages" many times before, but this was going to be different.

I'd suggest that you try to figure out most things on your own, however. It's
way more rewarding to think about the problems you're trying to solve, go
ahead and implement them. That way you're also forced to have eye-opening meta
thoughts about programming in general. Only after you've gained some
appreciation for the issues involved, look them up and see how other people
solved them. This worked very well for me.

Lisps and Schemes are certainly easier to implement than most syntaxes, but
consider doing something else so you can get a better understanding of the
complexities of parsing a non-Lisp.

In my case, I did my prototype in Java, that's the lexer, parser, runtime and
standard library. While I wouldn't necessarily want to do it that way again,
it was a great learning experience and I was pleased with the result.

About two months ago I decided to give it another try, but this time in C,
using the Lua codebase. So far this has been a very good choice for a second
stab at the idea. And I was pleased that I could recognize a lot of the
patterns in the Lua codebase where I had implemented equivalent parts in my
earlier project. If I hadn't taken the hard route back then, I probably would
have a harder time now.

~~~
the_watcher
How far along as a programmer should one be before attempting building a
language. I am a novice (very, very much so) Rubyist, and don't think I am
anywhere near being able to attempt it, but I do know that, when I was still
in school, the process most likely to make a complicated subject click with me
was to understand it all the way to it's origin (the starkest example was
actually the quadratic formula - I had tons of trouble understanding it and
how to use it until I had a teacher who made us actually derive it, and all of
a sudden I was one of the best in the class). Building a programming language
seems like it might have the same kind of effect on me, and, being a firm
believer of living outside your comfort zone, would like to know if it's
possible for someone who is nowhere near being able to call himself a
programmer to eventually build one.

~~~
mcguire
This may not be a common opinion, but building a language is certainly within
the reach of a novice programmer. The base ideas are very, very simple. You do
have to be careful not to be overwhelmed by either the terminology ("recursive
descent parsing" is a slightly more elaborate version of "reading a file") or
some of the more elaborate things that you don't need for the basics but might
want for a truly useful tool ("parsing" is a deep subject built mostly around
making "reading a file" simpler, for arbitrarily complicated files).

Building languages is also one of the most thoroughly studied topics, which is
both a boon and a bane. On one hand, there are lots of good tutorials and
excellent tools. On the other hand, there is _lots_ of stuff elaborating on
the basics and much of it seems almost deliberately complicated.

Go for it! There are a lot of much worse ways of going into the rabbit hole.

~~~
the_watcher
Thanks! Any suggestions for resources to help me get started? I'm not afraid
of complicated terminology as long as there is a way for me to easily figure
out what it means (whether Wikipedia, a glossary, or simply a great example).

~~~
loup-vaillant
Start with untyped lambda calculus, augmented with numbers, and addition.
That's what I did. Such a language is dead simple: you only have four kinds of
expressions: numbers, additions, functions, and function calls.

Write a parser and a printer (a pretty printer would be best, but you don't
have to). Use one to test the other. Then write an interpreter, who takes an
expression as input, evaluates it, and spits out a number.

The whole thing should fit in a couple hundred lines, tops. The only real
difficulty is to evaluate function calls. The rest is only a matter of
catching runtime errors (like trying to add a number and a function). To
properly implement your language, you may have to learn about closures and
lexical scoping, though there are tricks to sidestep the issue.

To test your language, check out the Y and Z combinators here
[https://en.wikipedia.org/wiki/Fixed_point_combinator](https://en.wikipedia.org/wiki/Fixed_point_combinator)
and use the Z combinator to implement the factorial function.

The next step is to flesh out your language a bit. I suggest you augment it
with "let" bindings first: it's just syntax sugar over lambdas: you change
your parser, but you don't have to touch the interpreter.

------
throwaway13qf85
Nice work. There's a similar article which uses Haskell as the base language
instead of Python, which is also worth reading (this is actually how I learned
both Haskell and Scheme..!)

[https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_...](https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours)

~~~
dmunoz
There is also SICP Chapter 4: Metalinguistic Abstraction. More precisely, 4.1
The Metacircular Evaluator [0]. Chapter 5 is about compiling to a toy register
machine, as well.

I'm a big fan of the approach taking in Programming Languages: Application and
Interpretation (PLAI). The Brown course from 2012 has video lectures available
[2]. This one uses what is very close to Racket as the implementation
language. It's typed, and called plai-typed.

The course these days uses the later chapters of Programming and Programming
Languages (PAPL). This uses a new language, Pyret [3]. It's a bit more clear,
due to the implementation language and source language having obviously
different syntax, compared to plai-typed vs. the s-expression based language
implemented.

[0] [https://mitpress.mit.edu/sicp/full-text/book/book-
Z-H-26.htm...](https://mitpress.mit.edu/sicp/full-text/book/book-
Z-H-26.html#%_sec_4.1)

[1]
[http://cs.brown.edu/courses/cs173/2012/book/](http://cs.brown.edu/courses/cs173/2012/book/)

[2]
[http://cs.brown.edu/courses/cs173/2012/Videos/](http://cs.brown.edu/courses/cs173/2012/Videos/)

[3] [http://www.pyret.org/](http://www.pyret.org/)

------
mqsiuser
During my project I was once at the point where I actually implemented my own
(interpreted) scripting language. I have never thought that I'd be one of the
guys who creates something like this, but once you are so fed up with
something over the years (and I was with ESQL just for being "closed source")
you'd just may be starting to go for it and just try out an dig into it.

And it works. I am now saying things like "If I'd need a JIT compiler for it,
I'd write one. I'll look at everything and evaluate if possible and how to be
done."

Inventing wheels is great for learning and I am now back to Java (good enough
;)

------
NaNaN
Yet another LISP-like language parser tutorial. Just like _Learning ???
Programming in ?? days_. This is just the beginning, guys. LISP-like languages
are not always great. (I'm not to look down this tutorial! Really!) I prefer
imperative programming language parser tutorial. Every time I see such title,
I will think of LISP.

~~~
vidarh
If you want an imperative tutorial then the Crenshaw tutorial [1] or Nicklaus
Wirth's book Compiler Construction[2] are great starting points.

[1] [http://compilers.iecc.com/crenshaw/](http://compilers.iecc.com/crenshaw/)

[2]
[http://www.ethoberon.ethz.ch/WirthPubl/CBEAll.pdf](http://www.ethoberon.ethz.ch/WirthPubl/CBEAll.pdf)

~~~
NaNaN
Thank you. I have read Crenshaw's tutorial before, and I think I should get
some time to translate it with modern language with unicode support to make
more fun. :)

------
kabdib
My rule of thumb when doing a new language: Write the debugger first, you're
going to need it anyway....

~~~
mcguire
Bah! If you need a debugger, you're not thinking hard enough.

~~~
WalterBright
Debuggers are crutches for losers. I use an oscilloscope to debug.

------
mamcx
This is great, but I want the _other_ things that are important but forgotten:

" We will not have:

A proper type system Error handling _Good performance And much, much more "

_Good performance is something that have a lot of literature, so can be
avoided in this tutorials.

I have been in this talk at lambda:

[http://lambda-the-ultimate.org/node/4929#comment-79628](http://lambda-the-
ultimate.org/node/4929#comment-79628)

Where is discussed where to talk about _implement_ compilers. I ask about the
other things that bother me:

\- Debugger/ Debugging experience \- REPL \- Tracing (ie: Dtrace or similar)

and other things that I wish to understand _before_ just make another parser.

How build a "hacker news" for it? Exist some project I could use?

------
DougWebb
This is nice, but I was hoping to find guidelines about designing the syntax
for a new programming language. As others are saying, s-expression syntax is
simple to parse because it's basically already structured like the AST. But
s-expression syntax isn't so easy to develop in, especially for non-
developers.

Why non-developers? I'm interested in DSLs that are used to add plugin-type
scripting capabilities, so that advanced non-developer users can extend
behavior on their own. Think Lua, but more specific to the domain. As far as
syntax, I'm thinking about something more like BASIC for its simplicity.

~~~
dragonwriter
> This is nice, but I was hoping to find guidelines about designing the syntax
> for a new programming language. As others are saying, s-expression syntax is
> simple to parse because it's basically already structured like the AST. But
> s-expression syntax isn't so easy to develop in, especially for non-
> developers.

IME, the "especially for non-developers" part doesn't seem to be true. People
who are exposed to s-expression syntax _first_ don't seem to be any more
likely to have problems with that syntax style than those introduced to
programming with other syntaxes are with whatever syntax the are introduced to
first; the people that are most put-off by s-expressions as a syntax of
expressing programs are people who are attached to some other syntax first,
and particularly people whose experience is with multiple languages all from
the same syntax family, usually the ALGOL-style family.

------
inetsee
I would recommend the Racket tutorial, especially, the section on using
Racket's macro system to implement a slight variation of Algol 60. Working
through this tutorial would give you a foundation on building your own DSLs,
as well as giving you a solid foundation on using Racket and its macro system
itself.

~~~
dmunoz
Could you please link to this? I didn't find anything when searching for the
relevant keywords, or see anything of the sort at [http://docs.racket-
lang.org/](http://docs.racket-lang.org/).

~~~
inetsee
I found the starting point here "[http://docs.racket-
lang.org/algol60/index.html"](http://docs.racket-
lang.org/algol60/index.html").

~~~
dmunoz
I found that as well, but it is not what inetsee described in his post.

I was interested specifically in "using Racket's macro system to implement a
slight variation of Algol 60", which is not what that section or the Racket
guide section on macros [0] does.

It's not clear what was meant by "the Racket tutorial".

[0] [http://docs.racket-lang.org/guide/macros.html](http://docs.racket-
lang.org/guide/macros.html)

------
platz
Handy reference for those embarking on creating a new PL:

[http://colinm.org/language_checklist.html](http://colinm.org/language_checklist.html)

~~~
Marcus316
While this is still amusing, and poignant, I don't believe the goal of the
original link is to encourage people to try and compete with modern languages.

For those of us (like me) who learn best by trying things out, a PL project is
a great way to wrap your head around some of the more abstract concepts of
modern programming languages (like closures, as mentioned in the post).

Still, +1 for amusement.

------
thomassvensen
Fun and useful tutorial, great work, Kjetil!

~~~
kvalle
Thanks!

------
_random_
... 8\. Find someone to do the most boring part - IDE - for you (most OSS
languages).

