
Show HN: Minimal Lisp in ~70 lines of Haskell - die_sekte
https://gist.github.com/950199
======
tel
I think this might be some of the most beautiful code I've seen in a long
while

    
    
        value :: Parser Value
        value =
              List <$> (char8 '(' *> sepBy value (takeWhile1 isSpace_w8) <* char8 ')')
          <|> Number . fst . fromJust . B.readInteger <$> takeWhile1 isDigit_w8
          <|> Symbol <$> takeWhile1 (inClass "A-Za-z\\-")
    

The power of parser combinators representing the elegant lisp syntax.

~~~
snprbob86
I'm familiar with parser combinators, but not the Attoparsec or other
libraries. Furthermore, I have limited Haskell experience, so I'm struggling to
read this. In particular, the use of symbols and overall terseness are hard to
get through. I'm also really uncomfortable with the order of operations of all
these operators.

Now, I know that Hoogle exists, so I'm able to look these things up, but it's
pretty tough to untangle. Particularly when searching for

    
    
      *>
    

I found something that reads "This module describes a structure intermediate
between a functor and a monad: it provides pure expressions and sequencing,
but no binding. (Technically, a strong lax monoidal functor.)" Which is so
full of jargon, my head hurts. And I even know what most of that jargon means!
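For what it's worth, the jargon cashes out to something simple. A minimal sketch using Maybe (any Applicative behaves the same way): `*>` sequences two computations and keeps only the right-hand result, while `<*` keeps only the left-hand one.

```haskell
-- *> and <* both sequence two computations; the angle bracket
-- points at the result that is kept.
keepRight :: Maybe Int
keepRight = Just 1 *> Just 2   -- Just 2

keepLeft :: Maybe Int
keepLeft = Just 1 <* Just 2    -- Just 1

-- If either side fails, the whole sequence fails:
failed :: Maybe Int
failed = Nothing *> Just 2     -- Nothing
```

So `char8 '(' *> ... <* char8 ')'` reads as: match both parens, but keep only what's between them.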

So as far as I can tell, here's what's going on....

    
    
      Foo <$>
    

Defines a grammar production using the left argument as a constructor on the
matched value of the right argument.

    
    
      <|>
    

Separates productions as alternatives, in a PEG-grammar style. Both the <$>
and <|> operators combine functions into one big master function that takes
input and whose result type is the big `one of A, B, C` algebraic type defined
earlier: "Value".
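That matches my reading too. As a tiny sketch outside of parsing (Maybe again standing in for Parser, and this Value a stripped-down stand-in for the gist's type):

```haskell
import Control.Applicative ((<|>))

data Value = Number Integer | Symbol String
  deriving (Show, Eq)

-- <$> applies the constructor to the result inside the context:
wrapped :: Maybe Value
wrapped = Number <$> Just 42             -- Just (Number 42)

-- <|> tries the left alternative and falls back to the right:
fallback :: Maybe Value
fallback = Nothing <|> Just (Symbol "x") -- Just (Symbol "x")
```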

    
    
      char8
    

Matches a literal character (passed as an argument).

    
    
      char8 '(' *>
    

Reads a parenthesis and throws it out of the result set. Similar for the close
parenthesis later on.

    
    
      takeWhile1 isSpace_w8
    

Produces a parser that consumes whitespace.

    
    
      sepBy value (takeWhile1 isSpace_w8)
    

Combines to produce a parser which reads a list of values (recursive here) and
presumably eliminates the separators from the result.
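Right: sepBy keeps only the elements and drops the separators. A quick check using base's ReadP combinators (munch1 playing the role of attoparsec's takeWhile1; runFull is a hypothetical helper, not part of any library):

```haskell
import Text.ParserCombinators.ReadP
import Data.Char (isDigit, isSpace)

-- Whitespace-separated digit runs; the separators vanish from the result.
numbers :: ReadP [String]
numbers = sepBy (munch1 isDigit) (munch1 isSpace)

-- Run a parser and demand that it consumes the whole input.
runFull :: ReadP a -> String -> Maybe a
runFull p s = case [x | (x, "") <- readP_to_S p s] of
  (x:_) -> Just x
  _     -> Nothing
```

Here `runFull numbers "1 22 333"` yields `Just ["1","22","333"]`: three elements, no whitespace.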

    
    
      takeWhile1 isDigit_w8
    

Similar story, reads an integer.

    
    
      Number . fst . fromJust . B.readInteger
    

Is a composed constructor: readInteger may or may not parse an integer, and
fromJust forcibly assumes it succeeded (which we know it did, because we're
reading digits). The integer parse stops when it hits a non-digit, and the
unconsumed characters are returned as the second element of a tuple, so we
call fst to take the first element, discarding the unread characters (the main
parser will consume those).
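Exactly. readInteger from Data.ByteString.Char8 (bytestring ships with GHC) returns Maybe (Integer, ByteString): the parsed integer plus whatever bytes it didn't consume. A small sketch:

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Maybe (fromJust)

-- readInteger hands back the parsed integer and the unconsumed rest:
parsed :: Maybe (Integer, B.ByteString)
parsed = B.readInteger (B.pack "123)")   -- Just (123, ")")

-- The gist's pipeline, minus the Number constructor:
asInt :: B.ByteString -> Integer
asInt = fst . fromJust . B.readInteger
```

In the gist, fromJust is safe only because takeWhile1 isDigit_w8 already guaranteed at least one digit.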

    
    
      takeWhile1 (inClass "A-Za-z\\-")
    

Similar story again to the whitespace and numbers.

OK, yeah, so.... amazing. Incredibly brilliant little bit of code. But holy
hell was that tough to read. And I have no idea if I'd be able to write it in
only a few hours. I'm far from certain I'd ever learn to write it as fast or
faster than something more verbose.

Please let me know if I've misunderstood something....

~~~
palish
_"I'm familiar with parser combinators, but not the Attoparsec or other
libraries. Furthermore, I have limited Haskell experience, so I'm struggling to
read this. In particular, the use of symbols and overall terseness are hard to
get through. I'm also really uncomfortable with the order of operations of all
these operators."_

This is why I've never identified with "code should comment itself".

I'm not afraid to admit it: I am a dumb programmer. As a dumb programmer, the
only way for me to be a good programmer is to be effective.

To be effective, I need to minimize the "time spent understanding code" phase
of software development. Furthermore, I will likely be working within a team.

The opposite of that is to be writing "prototype" code --- code which is
written to explore the problem space and to gain a better understanding of
specific patterns to best accomplish a goal.

But once the prototype phase is over, all of that code is deleted, and is
replaced with commented code. (If possible, I try to explain each step of the
algorithm in English, first, and _then_ write the code. This is slower, but
results in far fewer bugs and a more solid foundation.)

I look at it this way: when my life ends, either I will have gone as far as my
own brain was able to take me --- or I will have gone as far as _my team's_
brains were able to take _us_. I'm willing to bet my life that teams of "less
brilliant" people are more effective than individuals of excessive brilliance.
I write my code accordingly.

-------------------

That said, there is no excuse for laziness. Even if I am merely "competent",
it's important for me to strive to be brilliant, even if I'll never attain it.

~~~
ezyang
In my opinion, this code comments itself, even more so than a manually
written-out parser would.

Here's how I would think about it. What would a grammar for Lisp look like?

    
    
        value := list | number | symbol
        list := '(' value+ ')'
        number := [0-9]+
        symbol := [A-Za-z-]+
    

OK, great. Now how do I look at this code?

    
    
        value :: Parser Value
        value =
              List <$> (char8 '(' *> sepBy value (takeWhile1 isSpace_w8) <* char8 ')')
          <|> Number . fst . fromJust . B.readInteger <$> takeWhile1 isDigit_w8
          <|> Symbol <$> takeWhile1 (inClass "A-Za-z\\-")
    

Even if I don't understand the combinators, I can blank them out for now and
focus on the recognizable semantic bits. I see List, Number and Symbol, ok, so
my guess is that <|> does something like a | might in my grammar, and each
line corresponds to a different way a value can be expressed. For list, I see
some quoted '(' and ')', so I guess that 'char8' means "match this literal."
In fact, I can just read all of that off, and it makes sense. Never mind what
`*>` and `<*` are doing; I'll ignore that for now. And so forth.

Suppose I wanted to make a superficial change, like make $ a recognized symbol
in the language. That's super easy. I don't even need to look at the docs. If
I want to introduce a new syntactic construct? A little harder; I'll have to
go check the attoparsec docs. But you'd have to look up the docs for a library
in any language, anyway.

All's not well; in particular, this code conflates the tokenizing and parsing
steps (notice that I don't say anything about whitespace in my grammar, but
there's some line noise here dealing with it). That decreases the readability
a little, but you gain efficiency with it.

Maybe there's a tradeoff: the use of a library and symbolic combinators makes
it harder to tell precisely how the code manages to actually do any parsing.
That's true of any abstraction. But what's really great about this is that I
can easily tell what the big picture of the code is. A page or two of state
machine would not do that for me!
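To make that concrete without installing attoparsec: the same three-line grammar transliterates almost symbol-for-symbol into base's ReadP (a sketch, not the gist's actual code; munch1 stands in for takeWhile1, and parse is a hypothetical driver):

```haskell
import Text.ParserCombinators.ReadP
import Control.Applicative ((<|>))
import Data.Char (isAlpha, isDigit, isSpace)

data Value = List [Value] | Number Integer | Symbol String
  deriving (Show, Eq)

-- value := list | number | symbol, exactly as in the grammar above
value :: ReadP Value
value =
      List <$> (char '(' *> sepBy value (munch1 isSpace) <* char ')')
  <|> Number . read <$> munch1 isDigit
  <|> Symbol <$> munch1 (\c -> isAlpha c || c == '-')

-- Accept only parses that consume the whole input.
parse :: String -> Maybe Value
parse s = case [v | (v, "") <- readP_to_S value s] of
  (v:_) -> Just v
  _     -> Nothing
```

And the superficial change really is superficial: recognizing $ as a symbol character is one more character in the symbol predicate.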

------
Peaker
The content of Fun is actually the type: Context -> Value -> (Context, Value).

If you flip the arguments, you get: Value -> Context -> (Context, Value). The
(Context -> (Context, Value)) is actually the State type:

    
    
        State s a = s -> (s, a)
    

I modified your code to use this type. During the process, I encountered some
peculiar things (and factored out some things), see comments in code.

<https://gist.github.com/950445>

Funny that it turns out slightly longer when re-using more code due to
syntactic artifacts. Monad comprehensions (recently restored to GHC via an
extension) would resolve that.
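For anyone who hasn't met State: it's nothing more than a function that threads a piece of state through a computation. A hand-rolled sketch matching Peaker's s -> (s, a) ordering (the library version in Control.Monad.State uses s -> (a, s), and andThen here is just monadic bind spelled out):

```haskell
newtype State s a = State { runState :: s -> (s, a) }

-- Thread the state through two steps in sequence (bind, spelled out).
andThen :: State s a -> (a -> State s b) -> State s b
andThen m f = State $ \s ->
  let (s', a) = runState m s
  in  runState (f a) s'

get :: State s s
get = State (\s -> (s, s))

put :: s -> State s ()
put s = State (\_ -> (s, ()))

-- Read the state, bump it, and return the old value.
tick :: State Int Int
tick = get `andThen` \n ->
       put (n + 1) `andThen` \_ ->
       State (\s -> (s, n))
```

`runState tick 5` gives `(6, 5)`: the state went from 5 to 6, and the result is the old value. Under this reading, a Fun with its arguments flipped, Value -> Context -> (Context, Value), is exactly Value -> State Context Value.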

~~~
die_sekte
The eval throwing away context does not make any sense in retrospect. I think
I made it that way because of some ghetto-scoping ideas.

You broke car and cdr somehow (these don't evaluate their arguments anymore),
but I have no idea how you did this.

This was mostly a training exercise, this is why there is no significant use
of monads in there: I simply haven't learned enough about them to feel
comfortable using them.

~~~
Peaker
Can you give the example that breaks car/cdr? They seem to be evaluating their
args here:

    
    
        fst $ repl "(car (cons 1 (quote ())))"
        Number 1

~~~
die_sekte
Sorry, that was my fault. I didn't make lambdas evaluate their arguments and I
didn't notice this in my code. I.e. car/cdr work correctly, fun doesn't.

------
stiff
I do not know Haskell enough to fully judge this, so maybe someone can correct
me, but that seems _very_ minimal. From what I understand, it doesn't even
have arithmetic operations, it "executes" incorrect programs without any error
(try repl "())") and also it doesn't have any kind of scoping, for example in
the following code a function argument is bound in the global (and only)
environment:

    
    
        *Main> repl "(begin ((fun (x y) y) 1 2) y)"
        ([("x",Number 1),("y",Number 2),("begin",Fun),("car",Fun),("cdr",Fun),("cons",Fun),("cond",Fun),("def",Fun),("eval",Fun),("fun",Fun),("t",Symbol "t"),("quote",Fun)],Number 2)
    

I don't think it's fair to call this a Lisp at all at this point. There is a
nice writeup of implementing a Lisp in Python by Peter Norvig, where at least
the most basic things are implemented correctly (and the code is documented):
<http://norvig.com/lispy.html>

Also, I get errors even with some very simple statements that theoretically
seem to be implemented like:

    
    
        *Main> repl "(cons 1 2)"
        ([("begin",Fun),("car",Fun),("cdr",Fun),("cons",Fun),("cond",Fun),("def",Fun),("eval",Fun),("fun",Fun),("t",Symbol "t"),("quote",Fun)],List
        [Exception: lisp.hs:48:8-48: Irrefutable pattern failed for pattern (ctx', [v', (Main.List vs')])
    

Am I doing something wrong here?

~~~
sjs
This is clearly a golfing exercise. If you want a more full-featured Lisp,
check out Write Yourself A Scheme in 48 Hours by Jonathan Tang:
[http://jonathan.tang.name/files/scheme_in_48/tutorial/overvi...](http://jonathan.tang.name/files/scheme_in_48/tutorial/overview.html)

------
kleiba
Cool, only 70 lines!!

(plus a ton of libraries)

j/k ;-) Actually, I like this - it shows how cool Haskell really is! No, wait
- it shows how cool Lisp really is! No wait... I'm confused... :-)

~~~
GregBuchholz
You might also like: Lisprolog
<http://stud1.tuwien.ac.at/~e0225855/lisprolog/lisprolog.html>

"Some online books show how to implement a simple "Prolog" engine in Lisp.
They typically assume a representation of Prolog programs that is convenient
from a Lisp perspective, and can't even parse a single proper Prolog term.
Instead, they require you to manually translate Prolog programs to Lisp forms
that are no longer valid Prolog syntax. With this approach, implementing a
simple "Lisp" in Prolog is even easier ("Lisp in Prolog in zero lines"):
Manually translate each Lisp function to a Prolog predicate with one
additional argument to hold the original function's return value. Done. This
is possible since a function is a special case of a relation, and functional
programming is a restricted form of logic programming.

Here is a bit beyond that: lisprolog.pl

These 162 lines of Prolog code give you an interpreter for a simple Lisp,
including a parser to let you write Lisp code in its natural form."

------
guard-of-terra
See also <http://www.defmacro.org/ramblings/lisp-in-haskell.html>

~~~
thinkingeric
And "Write Yourself a Scheme in 48 Hours":

[http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_H...](http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours)

------
nkassis
Now if someone will just write a 70 line Haskell interpreter in lisp we can
have lispkell all the way down.

------
meric
How long did it take to write?

~~~
die_sekte
About 2-3 hours. Lots of breaks though; I started about 16 hours ago and
finished about 4 hours ago.

~~~
yason
So it's really 6-12 hours. You're working on the code in the background; you
_probably_ couldn't just have written it three hours straight. Or more likely,
you were sorting out the details within the 12-hour period but you had been
toying around with the idea for days if not weeks.

Just guessing, and certainly not trying to look down on your work; it's beautiful.
It's beautiful even if it had taken a week to write. The time to write it is
just interesting because it is exactly this that makes programming hard to
measure.

How does anyone explain their productivity to a boss after getting seemingly
nothing done for four days and then writing a nearly complete solution on the
fifth day?
How much is one actually working after all, counting all the time that affects
the work? Does it count as work if you go biking on a Saturday but you kind of
subconsciously think about your work and then you write great stuff on Monday,
thanks to that?

One thing is for sure: you're much better off measuring a programmer's efforts
by his accomplishments instead of the time to do them. This is backed up by
the fact that an entrepreneur programmer gets rewarded much more fairly than a
salaried programmer.

