
While writing YAQL I accidentally discovered Lisp - gnosis
http://www.jroller.com/trasukg/entry/while_writing_yaql_i_accidentally
======
stcredzero
_And then the AHA! moment happens. I finally "get" Lisp! It's just a string
representation of the abstract syntax tree for any language!_

So instead of having a separate meta-syntax to manipulate Lisp at the level of
the AST, you simply write Lisp. Lisp is the language that's its own AST.
Combine this with a complete exposure of the meta-level of the language, and
you get a language where many things that would be very expensive or
impossible in other languages are relatively inexpensive and possible.
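A minimal sketch of the "code is data" idea, using Python nested lists as a
stand-in for cons cells (the evaluator and operator table here are
illustrative, not anything from the thread):

```python
# Code-as-data sketch: a Lisp-style expression is just a nested list,
# so "manipulating the program" is ordinary list manipulation.

OPS = {'+': lambda a, b: a + b,
       '-': lambda a, b: a - b,
       '*': lambda a, b: a * b}

def evaluate(expr):
    """Evaluate a nested-list expression like ['+', 1, ['*', 2, 3]]."""
    if not isinstance(expr, list):
        return expr                      # a number evaluates to itself
    op, *args = expr
    return OPS[op](*[evaluate(a) for a in args])

program = ['+', 1, ['*', 2, 3]]          # (+ 1 (* 2 3))

# Because the program is data, rewriting it is trivial:
swapped = [program[0], program[2], program[1]]   # (+ (* 2 3) 1)

print(evaluate(program))   # 7
print(evaluate(swapped))   # 7
```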

Smalltalk is _nearly_ its own AST. There are enough tools and exposure of the
meta-level to enable things like Lisp macros, but it's still 2-3X harder than
in Lisp. Also, Smalltalk Blocks by themselves are fairly powerful. As a
result, people don't use Lisp style macros in Smalltalk even though they could
if they really wanted to. (Implementing them would just take a few days.) The
cost-benefit just doesn't work out.

(Yes, even Smalltalk is a kind of "Blub" compared to Lisp.)

~~~
sedachv
The reason you can do metaprogramming with blocks in Smalltalk and not most
other "OO" languages is that Smalltalk is based on message-passing, and you
have control over how objects interpret messages sent to them.

Specifically, the doesNotUnderstand method is equivalent to having a fully
late-bound APPLY in Lisp (which means you can implement FEXPRs yourself when
you need them, and FEXPRs are as good as CL or Scheme-style macros for many
purposes):

<http://carcaddar.blogspot.com/2009/04/closure-oriented-metaprogramming-via.html?showComment=1239507240000#c7399536579144180363>

<http://c2.com/cgi/wiki?DoesNotUnderstand>

This is why Alan Kay says message passing is more important than objects.
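For readers who don't write Smalltalk, `__getattr__` in Python gives a rough
analogue of `doesNotUnderstand:` (the proxy class below is an illustration I'm
adding, not from the linked posts):

```python
# Rough Python analogue of Smalltalk's doesNotUnderstand: every message
# (attribute lookup) the object doesn't already handle is intercepted,
# so the receiver decides at runtime how to interpret it.

class LoggingProxy:
    def __init__(self, target):
        self._target = target
        self.log = []

    def __getattr__(self, name):
        # Called only when normal lookup fails -- the "message not
        # understood" hook.  Here we record the message, then forward it.
        def handler(*args, **kwargs):
            self.log.append(name)
            return getattr(self._target, name)(*args, **kwargs)
        return handler

p = LoggingProxy([3, 1, 2])
p.append(0)
p.sort()
print(p.log)       # ['append', 'sort']
print(p._target)   # [0, 1, 2, 3]
```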

~~~
igravious
Is this the method_missing function that Ruby invokes when it can't find the
method it was looking for?

<http://www.alef1.org/ruby/method_missing/index.html>

I guess Ruby _ahem_ borrowed it from doesNotUnderstand in Smalltalk.

~~~
froydnj
Yes, that's the same thing. It's always fun to realize that $HOT_FEATURE in
$LANGUAGE is several decades old, instead of several years.

------
gfodor
On this topic, are there any other _good_ examples of languages that are as
homoiconic as lisp? Or is lisp the logical extreme of what you get when you
follow that line to its end, so by definition there can "be only one"?

~~~
Dav3xor
Forth, in a way, out of left field.

------
tgflynn
Wikipedia defines a homoiconic language as a language

 _in which the primary representation of programs is also a data structure in
a primitive type of the language itself_

This definition doesn't seem quite right to me. It seems like it would apply
to C because C source code is a string which is "a data structure in a
primitive type of the language itself".

Wouldn't a better definition be "a language in which the source code and the
corresponding abstract parse tree have identical representations"?

~~~
jules
Then Lisp wouldn't qualify.

C arguably is homoiconic. It's just not a convenient representation.

~~~
tgflynn
Care to explain your reasoning?

I thought this post was all about how Lisp meets that definition.

~~~
jules
Well Lisp doesn't meet this definition:

> "a language in which the source code and the corresponding abstract parse
> tree have identical representations"

The representation of source code in Lisp is still strings, but the AST is
cons cells.

~~~
tgflynn
OK:

s/identical representations/identical external representations/

~~~
jules
What do you mean by external representations?

~~~
tgflynn
By external representation I mean a representation of code and/or data that
the programmer can see and/or interact with. With current technologies this
almost always means some form of serialized text (an exception would be
LabView where programs are represented through an interactive graphical
environment).

An internal representation is a representation of code and/or data that is
processed internally by a computer but is not directly visible to the
programmer.

Example:

Internal representation:

    
    
      (defun fact (n)
        (case n (1
                 1)
              (otherwise (* n (fact (- n 1))))))
    

External representation:

The set of linked cons cells that the above list gets parsed into by a Lisp
interpreter.

~~~
tgflynn
I reversed internal and external in the above comment and for some reason am
unable to edit it.

~~~
jules
You can only edit posts for a certain amount of time.

Doesn't the requirement for homoiconicity become vacuous then?

The definition was:

> "a language in which the source code and the corresponding abstract parse
> tree have identical representations"

Now is:

> "a language in which the source code and the corresponding abstract parse
> tree have identical external representations"

Where external representation of source code is defined as the representation
that's used in the interpreter or compiler, that is, the abstract parse tree.
So the definition becomes: "a language in which the abstract parse tree has
the same representation as the abstract parse tree"?

Alternatively replace "external" by "internal" everywhere in this post, if
"internal" is the right name for CONS cells.

~~~
tgflynn
No the reversal in my above comments only applied to the example, not the way
I defined internal vs. external representations. Sorry for the confusion.

My point is that you can't tell by looking at the defun in my above comment
whether you're looking at Lisp source code or the parse tree (represented as a
nested list) for that source code. The same would not be true in C where the
external representation of the parse tree would be the C syntax for
initializing some set of nested structs. That would look very different from
the source code itself.

~~~
jules
Aha, so homoiconicity == supports quoting. That seems like a reasonable
definition; although it's not perfect, I can't think of a better one.
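The quoting idea is easy to demonstrate with Python nested lists standing in
for s-expressions (my own sketch; the `defun` spelled as a list is just the
thread's factorial example transliterated):

```python
# Quoting sketch: a "program" stored as data prints back as something
# you could paste straight into source -- the property the thread is
# calling homoiconicity.
import ast

quoted = ['defun', 'fact', ['n'],
          ['if', ['=', 'n', 1], 1,
           ['*', 'n', ['fact', ['-', 'n', 1]]]]]

text = repr(quoted)
print(text)

# Round trip: parsing the printed form gives back the same tree.
assert ast.literal_eval(text) == quoted
```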

------
jerf
I sometimes call it the "Lisp non-syntax", because it's so trivial to define
something-Lispish and parse it, even in a language with crappy string support
like straight C with nothing but the standard library, that it's hardly like
you're writing a parser at all.

Slamming out a halfway decent parser is getting easier than it used to be with
things like Parsec, and JSON support is nice enough in the languages I work in
that I tend to just use that now, but in the 90s I went to this solution
several times. (XML, on the other hand, wasn't much help. It did indeed take
care of the real grunt work of processing but what was left over still took a
lot of love to translate into the native idioms of the program you were in,
what with the attributes and the mixed content and whatnot.)
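As a rough illustration of how little work the "Lisp non-syntax" takes, here
is a toy s-expression reader in Python (minimal by design, with no error
handling; function names are mine):

```python
# Toy s-expression reader: tokenize, then build nested lists.
# This is essentially the whole "parser".

def tokenize(text):
    return text.replace('(', ' ( ').replace(')', ' ) ').split()

def read(tokens):
    token = tokens.pop(0)
    if token == '(':
        lst = []
        while tokens[0] != ')':
            lst.append(read(tokens))
        tokens.pop(0)                    # discard ')'
        return lst
    try:
        return int(token)                # numbers become ints...
    except ValueError:
        return token                     # ...everything else, symbols

def parse(text):
    return read(tokenize(text))

print(parse("(+ 1 (* 2 3))"))   # ['+', 1, ['*', 2, 3]]
```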

------
haberman
It's true that Lisp syntax is simply a way of writing ASTs. But I think the
real truth is deeper than that.

Lisp feels powerful because you can manipulate programs as if they were data.
This is easy in Lisp because your programs are already expressed as a tree,
using a simple data model (cons cells).

In most other programming languages, the program first goes through this
generally very hairy and complicated parser. The parser is not usually
accessible from the language itself, which means you can't easily get your
hands on an AST. Even when it is (like in Python:
<http://docs.python.org/library/ast.html>) the AST is a complicated data
structure that takes a lot of work to make any sense of.
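For instance, a minimal look at the module linked above (the printed dump is
approximate, since the exact field names vary across Python versions):

```python
# Even a tiny expression becomes a deeply nested structure in Python's
# own ast module -- compare with the Lisp spelling (- (+ 1 2) (* 3 4)).
import ast

tree = ast.parse("1 + 2 - 3 * 4", mode="eval")
print(ast.dump(tree))
# Prints something like:
# Expression(body=BinOp(left=BinOp(left=Constant(1), op=Add(),
#     right=Constant(2)), op=Sub(), right=BinOp(left=Constant(3),
#     op=Mult(), right=Constant(4))))
```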

But this doesn't mean that the solution is to write programs as raw ASTs!
Programming language syntax is nice because it lets us write things in ways
that are convenient for _humans_. Most humans would rather read 1 + 2 - 3 * 4
than (- (+ 1 2) (* 3 4)).

I strongly believe that the way to bring the power of Lisp to other
programming languages is to provide much easier access to parsers and ASTs,
and better APIs for them. If we could accomplish that, we could keep writing
syntax that works for humans, but have easy access to data structures so that
the AST can be very easily inspected, manipulated, generated, compiled,
interpreted, or any of those things that Lispers love to do.

When I say "better APIs," I mean that you should have lots of flexibility for
how you want to consume, process, and store the AST. You should have SAX-like
stream processing and data structures like DOM (but something that binds more
naturally to programming languages, so you can say foo.bar instead of
foo.getElementById("bar")). There should be a convenient serialization format
for the AST, and there should be a nice way to specify the schema of the AST
(like a more generic version of
<http://scottmcpeak.com/elkhound/sources/ast/index.html>).

You might sense this coming, but I've been working for a long time on what I
consider a major solution to this problem. There are two parts to it.

upb (<https://github.com/haberman/upb/wiki/>) is a standalone and very
minimalist implementation of Google's Protocol Buffers serialization format. I
believe Protocol Buffers are an ideal data model for ASTs. Forget for a moment
that Protocol Buffers is a serialization format and consider that you can use
their schema language to specify something like:

    
    
        message Expr {
          // Both operands can be either a raw value or another expression.
          optional double val1 = 1;
          optional Expr expr1 = 2;
    
          optional double val2 = 3;
          optional Expr expr2 = 4;
    
          // The operator that should be applied to the two values.
          enum Operator {
            PLUS = 1;
            MINUS = 2;
            MULTIPLY = 3;
            DIVIDE = 4;
          }
          optional Operator operator = 5;
        }
    

With that, you can generate data structures (an AST) that can be used in any
language, with convenient syntax (expr.operator). Now Protocol Buffers are
just a convenient way to create typed trees, just like the Elkhound "AST"
project I linked above, but in a way that can be accessed from ANY programming
language (that has protobuf support), not just one. upb will also support a
streaming interface like SAX.
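To make the data model concrete, here is a hedged sketch that mirrors the
Expr message with a plain Python dataclass and walks the tree (the real
generated protobuf classes would differ in detail; the evaluator is mine):

```python
# Stand-in for the generated protobuf Expr message: each operand is
# either a raw value or a nested Expr, plus an operator enum.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Operator(Enum):
    PLUS = 1
    MINUS = 2
    MULTIPLY = 3
    DIVIDE = 4

@dataclass
class Expr:
    operator: Operator
    val1: Optional[float] = None
    expr1: Optional["Expr"] = None
    val2: Optional[float] = None
    expr2: Optional["Expr"] = None

OP_FUNCS = {Operator.PLUS: lambda a, b: a + b,
            Operator.MINUS: lambda a, b: a - b,
            Operator.MULTIPLY: lambda a, b: a * b,
            Operator.DIVIDE: lambda a, b: a / b}

def eval_expr(e):
    # Each operand is whichever of the raw value / nested Expr is set.
    a = e.val1 if e.expr1 is None else eval_expr(e.expr1)
    b = e.val2 if e.expr2 is None else eval_expr(e.expr2)
    return OP_FUNCS[e.operator](a, b)

# (1 + 2) - (3 * 4)
tree = Expr(Operator.MINUS,
            expr1=Expr(Operator.PLUS, val1=1, val2=2),
            expr2=Expr(Operator.MULTIPLY, val1=3, val2=4))
print(eval_expr(tree))   # -9
```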

The second part is my somewhat dormant project Gazelle
(<http://www.gazelle-parser.org/>). I mean to resume it once upb is ready.
Gazelle uses most of the
same parsing theory as ANTLR, but instead of generating code for every parser
it has a more VM-like approach. And as its output? You guessed it -- upb and
Protocol Buffers. You can consume the parse tree in a streaming way (SAX) or
as a data structure (DOM), and you can serialize the AST itself as a Protocol
Buffer.

tl;dr: Programming languages are just a human-friendly serialization of an
AST. Parsing is simply tree traversal. Protocol Buffers are simply a universal
way of specifying and manipulating trees. And I'm working on an attempt to
unify it all into a tiny but powerful library that aims to bring the power of
Lisp to all programming languages.

~~~
anamax
> 1 + 2 - 3 * 4 than (- (+ 1 2) (* 3 4)).

Most humans think that, but no humans know what a << b | c ? a : 5 && j * k <
z ^ r % u means.

Until they learn differently, most humans also think that training wheels
help.

------
bcheung
I studied compilers before I studied Lisp so when I first started reading
about Lisp it was very intuitive that it is just an AST.

When trying to design my own languages I kept needing to handle special
conditions, and it seemed all roads led to either Lisp or Smalltalk. It seems
those two are fundamental paradigms.

------
sedachv
When you have a parser generator, suddenly all problems start to look like
they need a complicated grammar.

------
motxilo
Quite shocking, the moment I realized that ASTs in Lisp are just lists of
lists, and that the closure property allows you to build up complex ASTs
astonishingly fast.

------
troels
It took him 20 years to realise that? Even as a guy who wrote parsers for a
living?

I call shenanigans.

