
A programming language in 450 lines of JavaScript - breuleux
http://jsfiddle.net/osfnzyfd/
======
drewdahlman
Would love to see a little behind the scenes for this, maybe a write up / blog
post. Pretty neat little experiment.

~~~
breuleux
I agree I should post one, but it usually takes me a lot of time, and I wasn't
sure how much interest there would be. I'll start working on one.

~~~
bramgg
I started writing my own language in JavaScript last Summer but gave up after
only writing the Lexer because I didn't know how to continue. I would LOVE to
read this blog post!

~~~
Theyeard
Kaleidoscope - LLVM Tutorial [0] is an excellent resource with code samples in
C++ and OCaml.

[0]
[http://llvm.org/docs/tutorial/index.html](http://llvm.org/docs/tutorial/index.html)

~~~
eric_bullington
Agreed, it's an excellent resource. One of the best PL tutorials I've ever
read.

But if you use the OCaml version with a recent version of OCaml, you'll run
into a few issues. I sent in a patch with a few simple updates to the path
mailing list earlier this year, but last I checked it wasn't merged.

Patch is here (it's a text file, despite the "bin" extension):

[http://permalink.gmane.org/gmane.comp.compilers.llvm.cvs/183...](http://permalink.gmane.org/gmane.comp.compilers.llvm.cvs/183464)

I also pasted the patch here:

[http://pastebin.com/YvzB71nJ](http://pastebin.com/YvzB71nJ)

------
marijn
Mine's only 157 lines:
[http://eloquentjavascript.net/11_language.html](http://eloquentjavascript.net/11_language.html)

~~~
Aissen
Yours is a lisp-like, so much easier to parse; and it doesn't have the nice
in-dom ast viewer nor nearly the same number of comments (I compared with
[http://eloquentjavascript.net/code/chapter/11_language.js](http://eloquentjavascript.net/code/chapter/11_language.js)).

~~~
V-2
Still very interesting

~~~
Aissen
It's even more interesting because it's part of book. Just setting the
expectations, because it's not an apple-to-apple comparison.

------
moron4hire
Damn, beat me by 6 lines, though I also cheated and translated to JS to
interpret the code: [https://github.com/capnmidnight/betty-
boop/blob/master/loosp...](https://github.com/capnmidnight/betty-
boop/blob/master/loosp.js)

EDIT: though I did end up writing a Pong game in it:
[https://github.com/capnmidnight/betty-
boop/blob/master/pong....](https://github.com/capnmidnight/betty-
boop/blob/master/pong.html)

------
owlish
Peter Norvig's lis.py[1] is an oldie but a goodie. It really shows you the
beauty and simplicity of Scheme/Lisp interpreters.

[1]: [http://norvig.com/lispy.html](http://norvig.com/lispy.html)

------
lectrick
The best thing about this is being able to play with the entire mechanism
right on jsfiddle. Thanks for this work and want to read the writeup!

------
primitivesuave
This is awesome! I wish somebody had shown me this when I learned about
compilers in CS. The best way I've found in learning about anything
theoretical is to write the code for it.

~~~
Jonovono
Yup! I found this course pretty useful when I was studying it. Fun to jump
right in alongside the theory and all, I find:

[http://createyourproglang.com/](http://createyourproglang.com/)

------
marai2
That's a nice example of some elegant and readable code.

------
EpicDavi
Just a semi-unrelated comment:

The Fibonacci code example classically has a horrid runtime. Will like likely
crash your browser when supplied a number higher than the low thirties. IIRC
it has a O(2^n) runtime and can be improved either by a bottom up approach
(non-recursive) or a "dynamic-programming" (OOohhhh, buzzwords) method where
you use memoization (memo table) to store (or memoize) results.

~~~
MrRage
I always thought the fastest was the closed form formula:
[http://en.wikipedia.org/wiki/Fibonacci_number#Closed-
form_ex...](http://en.wikipedia.org/wiki/Fibonacci_number#Closed-
form_expression)

Unless looping and counting is that much faster than doing some floating point
math?

~~~
aarongolliver
Only "faster" because you precompute digits of the golden ratio, and will
quickly become inaccurate unless you compute more of them

------
whitten
This seems to be a language where there are several kinds of blocks: if-elif-
else , a let-in-end, begin-end and curly braces, square brackets, and round
parenthesis.

There are a few operators, with several levels of precedence, both prefix
operators, infix notation, but no sign of post-fix notation like (!) for 3! ==
6 and 5! == 120

There also seems to be list data type as well as strings and numbers.

~~~
breuleux
Postfix operators do work. Try this:

    
    
        let x! = if x == 0 then 1 else x * (x - 1)! end in 5! end
    

Any operator can be prefix, infix or postfix depending on the relative
priority with the operators next to them. For instance, in "1 ^ * 2" the
engine interprets ^ as postfix because ^ has higher priority (it just gets
null as its right hand side). In "1 * ^ 2", ^ is prefix. Unless specified
otherwise operators are given maximal priority, which is why the factorial
example works.

The engine just has specific provisions to allow operators in "prefix
position" to have different priority than in infix position, otherwise prefix
minus would fail to work as expected.

~~~
eric_bullington
I'm trying to figure out how much work it would take to add on OCaml style
pattern matching to this language, particularly list matching.

Would you mind pointing me in the right direction? Like a very brief(ie., a
sentence or two), high-level overview of how you might approach the task?

~~~
breuleux
Not a lot of work. I have implemented pattern matching before, so I just went
ahead and did it:

[http://jsfiddle.net/f1ycatas/2/](http://jsfiddle.net/f1ycatas/2/)

See line 266 for the syntax definition (the precedence entry for "->" on line
278 is also relevant) and line 409 for the implementation of pattern matching.
I use a few auxiliary functions here: extractArgs just checks that an AST node
has a certain form and extracts parts of it, and listify is a kind of
normalization function that will give you a list of expressions or statements.

I think the code is relatively straightforward: we have a list of "pattern ->
body" statements. For each entry we create a fresh environment for the
variables declared by the pattern and we try to match the pattern with the
value. For matching, if a pattern is a variable name we set that variable in
the environment, if it's a number or string we check for equality, and if it's
a list of subpatterns we check that we have a list of the correct length and
we check the subpatterns recursively. If there's a match, we execute the
corresponding body.

If that interests you, I wrote a compile-to-JS language called Earl Grey
([http://breuleux.github.io/earl-grey/repl/](http://breuleux.github.io/earl-
grey/repl/)) using a similar parser and it has rather sophisticated pattern
matching. The implementation is quite a lot more than 450 lines, though :)

~~~
eric_bullington
Thanks! Nice explanation and very clear code.

------
V-2
Let's live it once again
[https://news.ycombinator.com/item?id=7745561](https://news.ycombinator.com/item?id=7745561)

;)

"Have you heard about our Lord and Saviour JavaScript?"

"How to touch yourself at night without JavaScript knowing it."

"Breaking news! POP3 and IMAP written in Javascript!"

"How I ported the control software of a nuclear reactor to reactive
Javascript"

"Linux kernel ported to JavaScript running in the browser. See how we did it."

"How I made a filesystem in javascript."

"How I got my girlfriend pregnant using JavaScript."

"How to avoid getting HIV using JavaScript."

"The Linux kernel doesn't have enough javascript."

"I recommended my boss to rewrite the local powergrid infrastructure to
javascript and how I lost my job."

"I decided to re-implement Javascript in Javascript. It failed. Here is my
story."

"ABC in # lines of JavaScript (400 comments, 1000 points)"

------
dabockster
Wow, nice project! I would love to see both the lexical and syntactic
analyzers broken down a little more.

------
coppolaemilio
Loving it! It's so nice to see this kind of things around :)

~~~
ravi_thakare
You can also check out
[http://javascript.crockford.com/tdop/tdop.html](http://javascript.crockford.com/tdop/tdop.html)

The article is also in the book Beautiful Code (O'Reilly)

~~~
breuleux
Indeed, that is a very good article. I highly recommend it. TDOP is the
combination of recursive descent and Floyd's operator precedence grammars, the
latter of which is the technique I used (I find it sufficient for my needs).

~~~
barbudorojo
TDOP of Vaughan Pratt is also used in maxima (that is the way they use an
infix language similar to python as a frontend to Lisp. In maxima you can
define easily infix, prefix and postfix operator with a left and right binding
force for your grammar. I couldn't find with grep the file I was looking for
in the maxima source, but for examen Fateman paper "Syntax Extension as a Tool
for Application Programming" is a good reference.

------
kensai
Since some 150 lines are comments or blank, it's more like a 300 lines
programming language. Even more impressive!

------
taternuts
Thanks so much for this - really looking forward to going through it in the
next couple days

~~~
treehau5
Yes me too! I am curious to know the motivation behind it as well. I have been
putting on the back burner going through the "Dragon Book" and I figured a
project like this would be motivation to finally get around to it.

------
Yadi
Ah this is pretty baddass, kudos!

------
pjmlp
This is great! Specially the AST viewer.

Any inspiration from the Tiger book for the language?

------
notastartup
can you create your own programming language with python?

~~~
pjmlp
Any Turing complete language can be used for writing compilers/interpreters,
it is just harder in some than other.

Python is actually pretty neat for doing it and you even have tools like PyPy
(with RPython) to write a JIT for the language if you wish.

~~~
sklogic
You don't even need a Turing-complete language for writing compilers, see
CompCert for example.

~~~
pjmlp
I need to dive into it.

Do you mean it because of the formal logic being used?

~~~
sklogic
This is because compilation can be represented as a _finite_ sequence of
applying simple rewrite rules. You don't need a true Turing-complete language
to do so, because you don't have any indeterminate loops in such process. Any
simple total language (like Coq) is more than enough.

------
worldsayshi
Dat AST rendering.

------
elwell
Or you could use replace() with a fat regex, and then eval(). But this is
awesome.

~~~
vidarh
That's one of the most terrifying sentences I've read.

~~~
klibertp
What? That's actually the way all the compilers/transpilers work. You just
take some string and transform it into some other string that happens to be
runnable.

~~~
vidarh
No, that is actually _not_ the way most compilers/transpilers work.

Using a regex for parsing a whole language is something I've never seen done,
in 25+ years of writing and reading compilers as a hobby.

Using a regex for tokenization or parsing simple sub-expressions is
_sometimes_ done, but that too is fairly rare outside of toy parsers.

Doing the translation step through replace on the basis of a regex is also not
going to work for anything but the simplest translations, not least because
the resulting monstrous regex would be nearly impossible for a human to reason
about.

But the thing that terrified me about it was the thought of someone lifting
something lie that out of the browser sandbox and running it server side: The
combination of a near undecipherable regex for parsing/translation coupled
with eval() to run the result would be a near certain security nightmare.

~~~
klibertp
I think you misunderstood me completely. I was referring to a much more
fundamental nature of compilation, not to any particular implementation
technique. I agree that "a couple (because I assumed the OP meant this and not
a single, giant regexp) of regexes with replace" is not a particularly often
used implementation technique for compilers, but it is a workable one. For
example, compiling some simple custom markup language to html is a good use
case for this. Take a look at "Text processing in Python"
([http://gnosis.cx/TPiP/](http://gnosis.cx/TPiP/)) for a longer discussion.

The `eval()` part is also nothing new or strange. There are many systems which
let you input expressions, compile it and run. Scala does this, as does Nim,
as did Forth for a longest time.

In general compilation is not magic, on the contrary, it's conceptually simple
and it's a good thing to know the basics. This is the view I wanted to
express.

~~~
vidarh
> I think you misunderstood me completely. I was referring to a much more
> fundamental nature of compilation, not to any particular implementation
> technique.

Then you seem to have misunderstood the point of my initial comment entirely,
which is down to the specific case of suggesting regexes + eval() as a
reasonable way to implement a compiler.

> because I assumed the OP meant this and not a single, giant regexp

The linked implementation already uses regexps for tokenization. OPs comment
explicitly states "you could use replace() with a fat regex, and then eval()".
I find it hard to interpret that as anything other than a suggestion to use a
single, giant regexp.

> The `eval()` part is also nothing new or strange. There are many systems
> which let you input expressions, compile it and run.

Having an eval() is nothing "new or strange" but that was not what I was
reacting to.

> In general compilation is not magic, on the contrary, it's conceptually
> simple and it's a good thing to know the basics. This is the view I wanted
> to express.

I've implemented several compilers and interpreters, and is writing a
multiple-year-long article series on writing a Ruby compiler, so I agree
(though Ruby is testing my patience...).

But personally dragging out regexps is the last thing I'd do for illustrating
the conceptual simplicity of compilation... Personally, when I see people
dragging out regexps, and they're longer than about 5 characters, I assume
that's where I'll be most likely to find bugs.

~~~
klibertp
> Then you seem to have misunderstood the point of my initial comment entirely

Yeah, I agree, sorry about that.

> I find it hard to interpret that as anything other than a suggestion to use
> a single, giant regexp.

Now that I look at this you're probably right. I assumed any meaningful
compilation is actually impossible with a single regexp and so tried to
interpret original comment in a way which made at least some sense for me.

> But personally dragging out regexps is the last thing I'd do for
> illustrating the conceptual simplicity of compilation...

Well, there are some advantages to using regexps as an example: they are
widely known and they also are able to describe regular languages. But I have
no experience at all talking to people about compilation, so I'll assume that
you're right and that using regexps makes it actually harder to explain things
:)

------
CmonDev
Actually both are scripting languages. Still impressive though.

