
Lisp: It's Not About Macros, It's About Read - jlongster
http://jlongster.com/2012/02/18/its-not-about-macros-its-about-read.html
======
masklinn
It's not really about read either, it's about this:

> Wait a second, if you take a look at Lisp code, it’s really all made up of
> lists:

Haskell has `read`, most of your data types can just derive Read and Show and
they'll "magically" get a representation allowing you to `read` and `show`
them.

But that only works for datatypes, you can't do that with code.

In Lisp you trivially can, because code is represented via basic datatypes
(through a reader macro if needed).

It's not macros. It's not read. It's much more basic than that: it's
homoiconicity. From that everything falls out, and without that you need some
sort of separate, special-purpose preprocessor (whether it's a shitty textual
transformer — as in C — or a more advanced and structured one — as with Camlp4
or Template Haskell — does not matter).

And I don't get why the author got the gist (and title) so wrong when _he
himself notes read is irrelevant and useless in and of itself_ :

> Think of read like JSON.parse in Javascript. Except since Javascript code
> isn’t the same as data, you can’t parse any code

`read` does not matter if the language isn't homoiconic: you can `read` all
you want, but it won't give you anything.

~~~
majmun
Isn't every language that takes strings homoiconic? Code is a string, and a
string is also a valid data type in that language (by the Wikipedia definition).

~~~
defen
Sort of, but only in a trivial and uninteresting sense - similar to how f(x) =
0 is its own derivative, but is much less interesting in that sense than f(x)
= e^x.

A string has no additional structure, so if you want to do any transformations
beyond simple string/regex substitutions you have to parse it into a more
suitable format.
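
A minimal sketch of that difference, assuming Python 3.9+ (for `ast.unparse`). The source line and the `Rename` class are made up for illustration. Renaming a variable by string substitution corrupts unrelated text, while the same rename on a parsed tree touches only identifiers:

```python
import ast

source = "total = a + abs(a)"

# String-level substitution has no idea what an identifier is.
naive = source.replace("a", "b")

class Rename(ast.NodeTransformer):
    """Rename variable `a` to `b`, touching only identifier nodes."""
    def visit_Name(self, node):
        if node.id == "a":
            return ast.copy_location(ast.Name(id="b", ctx=node.ctx), node)
        return node

# Parse into a tree, transform it, and turn it back into text.
structured = ast.unparse(Rename().visit(ast.parse(source)))

print(naive)       # totbl = b + bbs(b)
print(structured)  # total = b + abs(b)
```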

------
dpkendal
I disagree with this article's perspective, but I understand what the author
intended to say and to a certain extent I like how the idea is presented.

Yes, Lisp's power comes from the embodiment of code and data together in one
manner, and the ability to treat them this way when writing code is good, but
`read` is a coincidence of that power, not a demonstration of how it is used.
Macros are the method by which we harness the power of homoiconicity in an
efficient, powerful manner.

------
kenjackson
As masklinn put it, it's really about the homoiconicity of Lisp. But with that
said, I'm not convinced of how great that is. While macros in general can be
useful, is homoiconicity generally a good thing?

It's rarely the case that I want a single representation for all my data --
and if we treat code as data, do I want a representation that is
indistinguishable from all my other data?

For example, the distinction between data that specifies layout (html, xaml,
etc...) and that which performs logical computation (javascript, c#, etc...)
seems like a useful distinction to have.

While I can appreciate the AST form of s-exprs I also do like the richness of
many standard languages -- and the semantic richness of their ASTs.

Lastly, treating code as data (and vice-versa) has been the bane of many
programmers of days past. Go back 40 years and you can find many developers
who did treat code as data (it was all actually viewed as sequence of bits by
many) and this caused no end of problems. In most modern systems there are
often safeguards to specify data and code segments and ensure that you don't
treat one as the other. While not completely analogous to Lisp macros, it does
show that you tread dangerous ground when you attempt to treat all forms of
data as indistinguishable.

Given the special purpose nature of code, I don't mind (and actually
appreciate) a well thought through syntax, and a special set of functionality
to interact with it -- as I do most special purpose forms of data.

~~~
jhuni
In emacs I represent everything in Lisp and I use separate color schemes for
SXML documents and code. This effectively allows me to distinguish between
these different classes of data. Furthermore, emacs has different sets of
functionality for different types of data, so I feel that I have all the
features you mentioned already.

~~~
kenjackson
It's unsurprising you mention sxml given that XML and s-exprs have a trivial
homomorphism. With that said, I'm also not a big fan for representing
relational data in XML (or sexprs).

Having a uniform representation for data isn't a big enough win to trump
having data represented in a way that is more natural for me to think about.

With that said, if your brain thinks in XML (or sexprs) maybe Lisp will always
work best for you.

------
_delirium
Unfortunately, in practice, it's not really the case that "it’s really easy to
parse Lisp code". Yes, you can do it in some cases, but to do it correctly in
general, at least in CL, is the somewhat infamous "code-walker" problem, which
needs to do all sorts of strange things:

 _Do you handle all varieties of lambda lists? recognize and descend into all
the special forms? what do you do with macros? expand them (a mess)? try to
walk into special-cased standard ones like 'loop' and ignore user-defined
ones?_

The closest you can come to doing it sanely is to use a code-walking library
like the one in arnesi:
<http://common-lisp.net/project/bese/docs/arnesi/html/A_0020Code_0020Walker.html>

~~~
dpkendal
These are all issues of compilation, not reading. Reading Lisp code into cons-
pairs, symbols, numbers, etc. is not that hard at all: you can do it in about
100 lines of code. Lisp can be trivially parsed by a recursive-descent parser.
"'Recursive-descent'," as the UNIX-HATERS Handbook says, "is computer science
jargon for 'simple enough to write on a liter of Coke.'"
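
As a rough illustration of how small such a reader can be, here is a sketch in Python. It is far simpler than a real Lisp reader: it assumes only parentheses, integers and symbols, with no strings, quote syntax, dotted pairs, comments or reader macros. The function names are chosen for this example only.

```python
def tokenize(text):
    """Split source text into parentheses and atoms (no strings or comments)."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse_atom(token):
    """Integers become int; everything else stays a symbol (a plain str)."""
    try:
        return int(token)
    except ValueError:
        return token

def read(tokens):
    """Recursive descent: consume one expression from the front of the token list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read(tokens))
        if not tokens:
            raise SyntaxError("missing )")
        tokens.pop(0)  # discard the closing ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected )")
    return parse_atom(token)

def read_from_string(text):
    """Read the first expression found in `text` into nested Python lists."""
    return read(tokenize(text))

print(read_from_string("(define (square x) (* x x))"))
# ['define', ['square', 'x'], ['*', 'x', 'x']]
```

The result is ordinary nested lists, which is all that "reading" means here. Nothing has been expanded, resolved or compiled.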

~~~
_delirium
Usually resolving things like which identifiers refer to which kinds of things
is considered part of the parsing step, not compilation; for example, the is-
it-a-typedef-or-not resolution problem is frequently cited as the reason C's
grammar isn't context-free. And you can't do even that level of parsing---
determining which symbols are function calls and which aren't---for Lisp
without expanding macros, or special-casing how to descend into known ones.

~~~
dpkendal
Macro-expansion is sometimes considered part of compilation, sometimes a
separate stage between parsing and compilation. I've never seen it referred to
in Lisp as a part of parsing, though I guess it might be possible to do that
with C's text-based macros.

`read` does not perform macro-expansion: that would break data reading and the
reading of quoted forms. macroexpand expands macros at a later stage. Once
expanded, macros either refer to primitive special forms or function calls,
and it's trivial to determine which. Primitive special forms can have their
components macroexpand'd as appropriate. Function calls can be left as-is to
be compiled (well, the arguments can be macroexpand'd). Eventually there will
just be primitive special forms and raw function calls left, ready to be
handed to the compiler.
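
A sketch of that expansion stage in Python, operating on forms that have already been read into nested lists. The `unless` macro and the `MACROS` table are hypothetical. As a deliberate simplification, every form other than `quote` is treated like a function call, so binding forms such as `let` are not handled specially:

```python
def expand_unless(form):
    """(unless test body) => (if (not test) body); a hypothetical macro."""
    _, test, body = form
    return ["if", ["not", test], body]

MACROS = {"unless": expand_unless}

def macroexpand(form):
    """Expand macros in `form`, represented as nested Python lists."""
    if not isinstance(form, list) or not form:
        return form                      # atoms and () are left alone
    head = form[0]
    if head == "quote":
        return form                      # quoted data is never expanded
    if isinstance(head, str) and head in MACROS:
        # The expansion may itself contain macro calls, so expand again.
        return macroexpand(MACROS[head](form))
    # Simplification: treat anything else as a call and expand each component.
    return [macroexpand(part) for part in form]

code = ["print",
        ["unless", "done", ["unless", "quiet", ["beep"]]],
        ["quote", ["unless", "a", "b"]]]
expanded = macroexpand(code)
print(expanded)
```

Both nested `unless` forms become `if` forms, while the quoted `(unless a b)` stays untouched as data.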

~~~
_delirium
I can buy that in terms of definitions. My objection is more based on the
common use-case of "parsing" Lisp in this fashion to do some kind of source-
to-source transformation, like the example in this post, which I've also
wanted to do. The post saves itself by only replacing the first symbol in a
form. But what if you wanted to do something fairly simple like replace every
call to _foo_ with some other bit of code?

In idiomatic Lisp code, you either miss lots of the calls, or you have to
complicate your code-walker significantly. This is especially the case if you
write CL in the (common, but not universal) style that makes significant use
of the _loop_ macro, because you either ignore it as a macro, and consider
anything inside it opaque until macro-resolution time (because you don't know
what it does to its arguments), or you special-case it as a new bit of CL
syntax, in which case your parsing is now fancier. Usually you want something
like the latter, because source-to-source transformations expect to also
replace things inside loops. Same with, say, special-casing _setf_ forms, if
you want source-to-source transformations to "do what I mean" in a large
number of cases.

It's true that it's very easy to literally get the list representing the code,
but there's precious little sensible you can _do_ with that list unless you're
willing to descend into some of the more commonly used built-in macros that
most CLers treat as de-facto syntax, which requires knowing something about
the syntax they in effect define.
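
A small Python sketch of the problem, using nested lists to stand in for read Lisp forms. The walker and the sample code are made up for illustration. A naive walker that rewrites every list starting with `foo` cannot tell a call from a binding or from quoted data:

```python
def replace_calls(form, name, replacement):
    """Naively rewrite every list whose first element is `name`."""
    if not isinstance(form, list) or not form:
        return form
    walked = [replace_calls(part, name, replacement) for part in form]
    if walked[0] == name:
        return [replacement] + walked[1:]
    return walked

# Stands in for: (let ((foo 1)) (list (foo 2) '(foo 3)))
code = ["let", [["foo", 1]],                 # binds a variable named foo
        ["list", ["foo", 2],                 # a real call to foo
                 ["quote", ["foo", 3]]]]     # quoted data, not a call

result = replace_calls(code, "foo", "bar")
print(result)
# ['let', [['bar', 1]], ['list', ['bar', 2], ['quote', ['bar', 3]]]]
```

Only one of the three rewrites is a genuine call. Getting the other two right means teaching the walker about `let` and `quote`, and the same goes for every other special form and macro it may meet.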

~~~
mcn
I agree with you and lispm about the complexity of code walking in CL.

For your specific example of replacing calls to foo with another bit of code
you may be able to get away with macrolet. (Example:
<http://letoverlambda.com/index.cl/guest/chap5.html#sec_4> )

------
akkartik
> You can implement a macro system in 30 lines of Lisp. All you need is read,
> and it’s easy.

The linked pastebin isn't a macro system. It's merely a macroexpansion system;
its output still needs to be evaluated. And it's not as simple as merely wrapping it in
'eval' because of subtleties in getting at the right lexical scope.

More generally, no fair claiming macros are easy because you managed to build
them atop a lisp. You're using all the things other comments here refer to;
claiming it's all 'read' is disingenuous.

I'm still[1] looking for a straightforward non-lisp implementation of real
macros. The clearest I've been able to come up with is an fexpr-based
interpreter: <http://github.com/akkartik/wart>

[1] From nearly 2 years ago: <http://news.ycombinator.com/item?id=1468345>

~~~
wes-exp
The currently front-paged Julia claims to have macros in a non-lisp.
<https://news.ycombinator.com/item?id=3606380>

~~~
calibraxis
And there's Dave Moon's PLOT (Programming Language for Old Timers).
<http://news.ycombinator.com/item?id=537652>

With the interesting critique that "objects" are better than s-expressions for
representing sourcecode. (BTW, Moon did a lot of work on Lisp.)

I've too thought that s-expressions don't necessarily contain as much
information as you'd want. Using Rich Hickey's word from "Simple Made Easy",
maybe they're used to "complect" visual presentation and internal
representation.

Then again, there's metadata...

~~~
akkartik
Thanks for those links (wes-exp as well). But I meant a non-lisp
_implementation_ of _lisp_ macros. Obviously common lisp and racket qualify,
but I'd love to see an implementation that's as simple as possible without
needing to be production-quality.

~~~
akkartik
Also, is PLOT actually available? I think I must have come across it 3 times
and searched for a download link without luck.

~~~
calibraxis
Last I heard, it's not — Moon didn't think his work was high enough quality to
release, and invited others to do so.

------
haberman
Right now there is a divide among programmers. On one side you have people
like the author who crave the power of code-as-data more than they care about
nice syntax and therefore love Lisp. On the other side you have people who
like more conventional syntax more than they care about code-as-data and
therefore don't love Lisp.

Neither side can understand the other: one side says "why do you resist
ultimate power?" and the other side says "how can you possibly think that your
code is readable?"

My belief (and what I am starting to consider my life's work) is that the gap
can be bridged. Lisp's power comes from treating code as data. But _all_ code
becomes data eventually; turning code into data is exactly what parsers do,
and every language has a parser. The author says "it's about read," but "read"
(in his example) is just a parser.

The author asks "How would you do that in Python?" The answer is that it would
be something like this:

    
    
      import ast
      
      class MyTransformer(ast.NodeTransformer):
          pass  # Implement transformation logic here.
      
      # Parse source text into a tree, run the transformer, and show the result.
      node = MyTransformer().visit(ast.parse("x = 1"))
      print(ast.dump(node))
    

This works alright, but what I'm after is a more universal solution. With
syntax trees there's a lot of support functionality you frequently want: a way
to specify the schema of the tree, convenient serialization/deserialization,
and ideally a solution that is not specific to any one programming language.

My answer to this question might surprise some people, but after spending a
lot of time thinking about this problem, I'm quite convinced of it. The answer
is Protocol Buffers.

It's true that Protocol Buffers were originally designed for network
messaging, but they turn out to be an incredibly solid foundation on which to
build general-purpose solutions for specifying and manipulating trees of
strongly-typed data without being tied to any one programming language. Just
look at a system like <http://scottmcpeak.com/elkhound/sources/ast/index.html>
that was specifically designed to store ASTs and look how similar it is to
.proto files.

(As an aside, programmers have spent the last 15 years or so attempting to use
XML in this role of "generic language-independent tree structured
serialization format," but it wasn't the right fit because most data is not
markup. Protocol Buffers can deliver on everything people _wanted_ XML to be).

Why should manipulating syntax trees require us to _write_ in syntax trees?
The answer is that it shouldn't, but this is non-obvious because of how
inconvenient parsers currently are to use. One of my life's goals is to help
change that. If you find this intriguing, please feel free to follow:

    
    
      https://github.com/haberman/upb/wiki
      https://github.com/haberman/gazelle

~~~
nessus42
_On one side you have people like the author who crave the power of code-as-
data more than they care about nice syntax and therefore love Lisp._

I crave _both_ the power of code-as-data _and_ nice syntax, which is why I
love Lisp.

~~~
haberman
Some people like putting salt on grapefruit. I'm not saying that it's
impossible to like Lisp's syntax, but empirically most people prefer the
ALGOL-like syntax, which is why I referred to it in the next sentence as
"conventional syntax."

I could attempt to prove to you that "conventional syntax" is inherently
superior to Lisp syntax, but that would be a waste of both of our time.

~~~
krunaldo
(+ 1 2 3) =

add 1 and 2 and 3

The syntax is a bit terse, but if you teach people a good way to read it, it
becomes much more readable than 1+2+3.

The only reason we prefer that way is that we are taught that syntax when we
do math in school. I have found it much easier to teach Lisp to people who have
no or very little formal education in math.

~~~
Confusion
This argument is based on too shallow an analysis and doesn't stand up to
closer examination.

    
    
      (/ (+ (- b) (sqrt (- (* b b) (* 4 a c)))) (* 2 a))
    

Yeah, so it divides (the addition of (-b and the (sqrt of (the difference
between (the product of b and b) and (the product of 4, a and c))) by (the
multiplication of 2 and a))

Right, that's much easier than

    
    
      (-b + sqrt(b*b - 4*a*c)) / (2*a)
    

(-b plus the sqrt of ((b times b) - (4 times a times c))) divided by (2 times
a)

~~~
lispm
If math is a problem for the user, there is the option to use a modified
parser. For example using an infix parser called by a readmacro:

    
    
        (defun foo (a b c)
          #I( 
    
              (-b + sqrt(b*b - 4*a*c)) / (2*a)
    
            ))
    
        CL-USER 8 > (foo 1 2 3)
        #C(-1.0 1.4142135)
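
The same infix-to-prefix idea can be sketched in Python. This is not how the `#I` reader macro is implemented; it assumes Python's own `ast` parser for the infix side and handles only arithmetic operators, names, constants and calls. The `to_prefix` helper is hypothetical.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/", ast.USub: "-"}

def to_prefix(node):
    """Render a Python arithmetic expression node as an s-expression string."""
    if isinstance(node, ast.Expression):
        return to_prefix(node.body)
    if isinstance(node, ast.BinOp):
        return f"({OPS[type(node.op)]} {to_prefix(node.left)} {to_prefix(node.right)})"
    if isinstance(node, ast.UnaryOp):
        return f"({OPS[type(node.op)]} {to_prefix(node.operand)})"
    if isinstance(node, ast.Call):
        parts = [to_prefix(node.func)] + [to_prefix(arg) for arg in node.args]
        return "(" + " ".join(parts) + ")"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return str(node.value)
    raise ValueError(f"unsupported syntax: {type(node).__name__}")

infix = "(-b + sqrt(b*b - 4*a*c)) / (2*a)"
prefix = to_prefix(ast.parse(infix, mode="eval"))
print(prefix)
# (/ (+ (- b) (sqrt (- (* b b) (* (* 4 a) c)))) (* 2 a))
```

Because Python's operators are binary, `4*a*c` comes out as `(* (* 4 a) c)` rather than the flatter `(* 4 a c)` a Lisp programmer would write.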

~~~
Confusion
Sure, there are solutions and it's awesome that they're both possible and easy
to use. I'm not arguing against Lisp; I just don't agree that its syntax is
_better_. I agree it is not worse, if you survey a sufficiently large variety
of cases.

It may be bikeshedding, but I would not let 'blue is better than red, because
the sky is blue' pass either.

------
mattdeboard
So, I have a question. After reading this article and thinking about
homoiconicity and macros, I remembered the Python source for the `namedtuple`
function: <http://dpaste.com/704870/>

Is this considered a macro? Is it homoiconic? It's code as data and using
input variables to generate code based on that input. It struck me as weird
the first time I read through it but figured since I'm pretty stupid that
there's a good reason for it.

~~~
andreasvc
It's not as powerful as what you could do with Lisp. The code is not a first-
class object, so you can't for example substitute variable a for variable b.
The "data" in code as data refers to an abstract syntax tree, not just a
string which happens to contain code. Python does give you access to abstract
syntax trees, but because of the complexity of the language this is less
useful than with Lisp.
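
For comparison, here is a minimal sketch of the string-template-plus-`exec` technique the question describes. It is not the actual `namedtuple` source; `TEMPLATE` and `make_record` are invented for this example, and it assumes at least one field name.

```python
TEMPLATE = '''
class {name}:
    def __init__(self, {args}):
{assignments}
'''

def make_record(name, fields):
    """Generate a class by filling a source template and exec'ing it."""
    source = TEMPLATE.format(
        name=name,
        args=", ".join(fields),
        assignments="\n".join(f"        self.{f} = {f}" for f in fields),
    )
    namespace = {}
    exec(source, namespace)   # the generated code is just text until this point
    return namespace[name]

Point = make_record("Point", ["x", "y"])
p = Point(3, 4)
print(p.x, p.y)  # 3 4
```

Until `exec` runs, the generated class is an opaque string. Any change to it has to be made by editing text, not by rearranging a structure the language already understands.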

~~~
mattdeboard
Thanks for helping with the distinction

------
yason
Macros are a method to do programmatically what read can do lexically. Using
macros avoids having to load strings into sexps and modify them and build
callable constructs out of the modifications. With macros, you just define the
modification as a transform on s-expressions and let the macro facility do the
rest. The point is to modify the AST, and while read can be abused to do that,
macros are designed to do just that.

------
chimeracoder
Am I missing something, or should that be read-from-string, not read? My
understanding is that read only operates on streams.

~~~
jlongster
Technically, you are correct. Some schemes implement read as a simple function
which takes a string. "Lisp" isn't referring to any specific lisp and more
just the idea of it.

~~~
chimeracoder
Hm, guile doesn't seem to like it either. Anyway, the original article
mentioned Common Lisp, so I assumed that's what you were working with in the
examples.

------
njharman
I do a lot of data-driven development. I'm sure some of the time, 20%?, it'd
be nicer if the code/syntax was directly data. But 80% of the time, it's much
easier for me to deal with them being separate.

I'm willing to give up that 20-25% to enjoy and be happy writing the other
80%.

------
lninyo
I tried the example code in cygwin/clisp and it doesn't work. Way to come up
with examples. At least mention which version it does work with?

~~~
spacemanaki
The examples aren't valid Lisp or Scheme. "read" takes an input port not a
string, as someone else mentioned, so you should replace it with "read-from-
string" in Common Lisp or something like "(read (string->input-port "(1 (0) 1
(0) 0)"))" in Scheme. The second and third block of code is Scheme, not CL, so
it won't work in clisp at all.

~~~
lninyo
Thank you for taking the time to explain. In my opinion, these types of
attitudes are what are causing beautiful languages to die horrible deaths. The
blog post extols the virtues of "Lisp", but fails to mention which dialect.
Extremely off-putting to newcomers if I may say so. At least put a tl;dr up
top saying "This is for experts, noobs GTFO!" ...

