
Writing Your Own Programming Language - ingve
https://github.com/marciok/Mu#writing-your-own-programming-language
======
bediger4000
Peter Norvig's "(How to Write a (Lisp) Interpreter (in Python))"
([http://norvig.com/lispy.html](http://norvig.com/lispy.html)) covers a
superset of this material and makes more sense, and actually has a portable
implementation you can run yourself. If you're going to do this, use Norvig as
a guide.

~~~
k__
When I think about writing my own language, I think of something with as few
parentheses as possible, yet all the examples use Lisp.

~~~
cardiffspaceman
The reason for LISP or SCHEME as the tutorial is that parsing is easy and
doesn't call for Flex/Bison. The semantics are also small enough that, if you
don't strain yourself on edge cases, you can do a decent interpreter or
compiler pretty quickly.
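
To give a sense of how little parsing an s-expression syntax needs, here is a rough Python sketch of a reader. The names `tokenize`, `atom`, `read` and `parse` are made up for illustration, and strings, comments and quoting are all left out.

```python
def tokenize(source):
    """Split source text into parens and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def atom(token):
    """Turn a token into an int or a float, or leave it as a symbol string."""
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass
    return token


def read(tokens):
    """Consume one expression from the front of the token list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens and tokens[0] != ")":
            expr.append(read(tokens))
        if not tokens:
            raise SyntaxError("missing )")
        tokens.pop(0)  # drop the closing paren
        return expr
    if token == ")":
        raise SyntaxError("unexpected )")
    return atom(token)


def parse(source):
    """Parse the first expression in source into nested Python lists."""
    return read(tokenize(source))
```

With this, `parse("(* n (fact (- n 1)))")` gives nested Python lists that mirror the parens one for one, which is why no parser generator is needed.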

Without making this post long I could "tempt" or encourage you by suggesting
that if you take the outermost paren pair off an s-expression, you sort of have
a language of function application:

defun fact n = if (< n 2) 1 (* n (fact (- n 1)))

I think there is an SRFI for Scheme that proposes something like that plus
offside rule to get rid of parens.

You could get rid of more parens by adding operator precedence to the syntax
(and that offside rule would help too), but now you're making the parsing
interesting instead of making the execution interesting.
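
As a rough illustration of how the parsing gets more interesting once operators have precedence, here is a Python sketch of precedence climbing. The `PRECEDENCE` table and the function names are assumptions made up for this example, the input is taken to be already split into tokens, and every operator is treated as left-associative.

```python
# Hypothetical precedence table: a higher number binds tighter.
PRECEDENCE = {"<": 1, "+": 2, "-": 2, "*": 3, "/": 3}


def parse_infix(tokens, min_prec=1):
    """Parse infix tokens into a prefix-style nested list."""
    left = parse_operand(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Operators are left-associative here, so the right-hand side may
        # only contain operators that bind strictly tighter than op.
        right = parse_infix(tokens, PRECEDENCE[op] + 1)
        left = [op, left, right]
    return left


def parse_operand(tokens):
    """Parse a number, a name, or a parenthesized sub-expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        expr = parse_infix(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("missing )")
        return expr
    if token in PRECEDENCE or token == ")":
        raise SyntaxError("unexpected token")
    return int(token) if token.isdigit() else token
```

Calling `parse_infix("n * ( n - 1 ) + 2".split())` builds the same kind of nested list an s-expression reader would hand you directly, only with more machinery to get there.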

~~~
kazinator
And the reason you don't want to drag in Flex and Yacc/Bison is that then the
bulk of the course will revolve around the banal issues surrounding syntax,
rather than semantics: high in heat, low in light.

Not even the smarter side of syntax (abstract syntax), but rather character-
and-token level syntactic sugaring.

------
drusepth
Honest question: why do tutorials in this topic seem to always use functional
languages/syntax as examples? Our compilers class at MST had us re-implement a
lisp compiler, but didn't touch on why we used lisp specifically (other than
the professor liking it; we were a largely C++ school). Do they think
functional languages are simpler / less complex / easier to understand? Is
there something inherently easier about implementing a functional language
instead of something more imperative?

~~~
johnbender
I've built interpreters for both a subset of Java and a full Lisp. Here's my
take.

> Is there something inherently easier about implementing a functional language
> instead of something more imperative?

Other answers have focused on the parsing of the language (that is, the
production of an AST), which is much easier to cover instructionally for a Lisp
because the source is basically the AST already.

To my mind the semantics of imperative languages is the real issue. In
particular when defining and/or implementing the semantics for an imperative
language, eventually store (memory) management comes up and everything gets
much more complicated instantly.

[edit] And to go along with the store there are often more forms for which you
need to define the semantics (statements, expressions, classes, etc).

In contrast functional languages can frequently be implemented using term
rewriting which can deal directly with the AST itself.
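
A minimal Python sketch of that idea, assuming a toy language with only numbers, a few arithmetic primitives and `if` (no variables or functions). The `PRIMITIVES` table and the `step` / `normalize` names are invented for the example. Each step rewrites the nested-list AST into a smaller one until only a value is left, and there is no store anywhere.

```python
import operator

# Hypothetical primitive table for the sketch.
PRIMITIVES = {"+": operator.add, "-": operator.sub, "*": operator.mul, "<": operator.lt}


def is_value(term):
    """Anything that is not a list counts as fully reduced."""
    return not isinstance(term, list)


def step(term):
    """Rewrite one reducible sub-term and return the new term."""
    head, *args = term
    if head == "if":
        test, then_branch, else_branch = args
        if is_value(test):
            # Pick a branch; the branches stay untouched until this point.
            return then_branch if test else else_branch
        return ["if", step(test), then_branch, else_branch]
    # Reduce arguments left to right, one rewrite per call.
    for i, arg in enumerate(args):
        if not is_value(arg):
            return [head, *args[:i], step(arg), *args[i + 1:]]
    # Every argument is a value, so apply the primitive.
    return PRIMITIVES[head](*args)


def normalize(term):
    """Keep rewriting until only a value is left, recording every term seen."""
    trace = [term]
    while not is_value(term):
        term = step(term)
        trace.append(term)
    return trace
```

For example, `normalize(["if", ["<", 1, 2], ["+", 1, ["*", 2, 3]], 0])` returns the list of intermediate terms, and the last entry is the final value.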

More broadly, this is why I wish students were required to implement an
interpreter of an imperative language. The act of debugging programs becomes
more difficult for the same reason the semantics is more difficult to define
and implement: it's more complex and there are more nuts and bolts to
consider.

~~~
ced
_In particular when defining and/or implementing the semantics for an
imperative language, eventually store (memory) management comes up and
everything gets much more complicated instantly._

Why? Isn't C-like "call malloc" simpler than the GC which is required for most
(all?) functional languages?

~~~
kd0amg
It's not really being functional that calls for GC -- if you have data
structures (whether they're closures, records, or whatever) escaping up from
the scope in which they were created, you have a tricky lifetime question to
deal with. In both the imperative and functional cases, you _could_ require
explicit allocation and deallocation (like malloc/free), and you _could_ make
it automatic (garbage collection). Functional programming tends to use GC
because passing/returning closures is a common thing to do. These days, I
would say imperative programmers also tend to use GC.
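
A small Python sketch of the lifetime question, with `make_counter` as a made-up example name. The variable `count` is created inside one call, but the returned closure keeps using it after that call has returned, so it cannot simply be freed when the scope exits.

```python
def make_counter():
    count = 0  # local to this call of make_counter

    def increment():
        nonlocal count
        count += 1
        return count

    # The closure escapes the scope that created `count`, so `count`
    # has to outlive the call that allocated it.
    return increment


counter = make_counter()
first = counter()
second = counter()
other = make_counter()  # an independent `count` with its own lifetime
```

With malloc/free-style management, someone would have to decide when each `count` may be released. With GC, it goes away once nothing can reach its closure any more.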

------
xutopia
I have a friend who wrote this book:
[http://createyourproglang.com/](http://createyourproglang.com/)

Jeremy Ashkenas created CoffeeScript after reading that book. I can't
recommend it enough for someone going down this road.

~~~
k__
Ah, I searched for this link for a while, thanks.

------
uiri
There are a TON of resources like this that focus on lexing and parsing, which
is all fine and dandy, but interpreting the resulting Abstract Syntax Tree will
be extremely slow.

Are there any resources on the code generation side of things? Even getting
from a high level language down to SSA seems like a big leap (never mind going
from SSA to assembly).
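
For the simplest case the first step is smaller than it looks. Below is a Python sketch that lowers a nested-list expression AST to three-address code. The `lower` and `compile_expr` names and the tuple instruction format are invented for the example, and it only handles binary operators in straight-line code. Every temporary is assigned exactly once, so the output is trivially in SSA form. The hard part of real SSA construction (phi functions for branches and loops) is not covered here.

```python
import itertools


def lower(expr, code, fresh):
    """Emit three-address instructions for expr; return the operand holding its value."""
    if not isinstance(expr, list):
        return str(expr)  # a constant or variable is already an operand
    op, lhs, rhs = expr
    left = lower(lhs, code, fresh)
    right = lower(rhs, code, fresh)
    target = f"t{next(fresh)}"  # fresh name, so each temporary is assigned once
    code.append((target, op, left, right))
    return target


def compile_expr(expr):
    """Return (instructions, name of the operand holding the result)."""
    code = []
    result = lower(expr, code, itertools.count(1))
    return code, result
```

For `["+", "a", ["*", "b", 2]]` this emits one instruction for the multiplication into `t1` and one for the addition into `t2`, in that order.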

~~~
munificent
> Are there any resources on the code generation side of things?

Cooper and Torczon's "Engineering a Compiler" is mostly focused on the back
end. I liked it a lot.

------
morbidhawk
I took a university course where we built an interpreter for MicroScheme. It
was a difficult project but was also really awesome and rewarding. I'd like to
go back and do it again without deadlines to really understand it better. I
think functional programming languages can be a great choice for implementing
an interpreter since metaprogramming is their forte. Racket's match
function helped a lot; it's like the most powerful thing I've ever seen. [1]

[1] [https://docs.racket-lang.org/reference/match.html](https://docs.racket-lang.org/reference/match.html)

------
psewell
No. First, understand what the language is for (if nothing, stop here...).
Second, make the main design choices (expressiveness of type system, level of
control of mutability or lack thereof, etc.). Third, design the type system,
with a view to type inference. Fourth, design and define the semantics. Do
those last two in a way that will let you test your implementation against
these definitions automatically. Fifth, think about sufficiently efficient
implementation strategies. Sixth, pick a syntactic style that will be familiar
to most of your users. Seventh, design the actual syntax. Eighth, implement
it. Ninth, try it out on users and go back to the start. Tenth, rest.

------
hota_mazi
I'm confused, the article says the language uses a postfix operator and then
shows examples like `(s 2 3)`. Isn't that a prefix operator?
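
To make the distinction concrete, here is a small Python sketch of both notations, using `+`, `-` and `*` as stand-ins since the thread doesn't say what `s` denotes. The `OPS` table and function names are made up for the example. In prefix form the operator comes before its operands, as in `(s 2 3)`. In postfix form it comes after them.

```python
import operator

# Hypothetical operator table for the sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_prefix(tokens):
    """Operator first: consume the operator, then evaluate its two operands."""
    token = tokens.pop(0)
    if token in OPS:
        left = eval_prefix(tokens)
        right = eval_prefix(tokens)
        return OPS[token](left, right)
    return int(token)


def eval_postfix(tokens):
    """Operator last: push operands, then apply each operator to the top two."""
    stack = []
    for token in tokens:
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))
    return stack.pop()
```

So `eval_prefix("+ 2 3".split())` and `eval_postfix("2 3 +".split())` compute the same sum, and only the position of the operator differs.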

------
simono
Why do bloggers focus on lexing and parsing? These things should not take up
80% of the article about creating a programming language.

This fascination with "how" to build something, without considering "what" and
"why", seems to be an issue that gets repeated time and time again.

~~~
lmm
I found [http://hokstad.com/compiler](http://hokstad.com/compiler) a lot
easier to understand than any other compiler tutorial I've seen. It approaches
writing a compiler the way you'd write any other program. It does end up with
e.g. a distinct parser, but only when the reasons that might be a good idea
become apparent.

~~~
vidarh
Thanks :)

Though in retrospect I think it should have been two separate series (and I
need to write a few more parts; I'm _very_ close to having it compile itself
as of last night).

I think it'd have been better to evolve the initial simple language into an
equivalently simple language to parse, and to keep the long slog towards
compiling Ruby as a separate thing.

Especially as that has complicated things enough to be in severe need of
various refactoring (which I'm avoiding until it will compile itself, at which
point I'll start cleaning it up while insisting on keeping it able to compile
itself..).

The parser itself started out conceptually quite clean, for example, but the
Ruby grammar is horribly complex, and I keep having to add exceptions that have
turned it quite convoluted. I don't doubt it can be simplified with some
effort once I have the full picture, but it's not great for teaching.

------
TurboHaskal
[https://raw.githubusercontent.com/marciok/Mu/master/WriteYou...](https://raw.githubusercontent.com/marciok/Mu/master/WriteYourLanguage.playground/Pages/Interpreter.xcplaygroundpage/Resources/simple-ast.png)

 _(Beavis and Butthead laugh)_

