
Homoiconicity isn’t the point (2012) - jxub
http://calculist.org/blog/2012/04/17/homoiconicity-isnt-the-point/
======
zmonx
In my opinion, this article puts too much emphasis on _reading_, and too little emphasis on actually _reasoning_ about programs in homoiconic languages like Prolog and Lisp, and due to this imbalance the conclusion is not sufficiently justified.

It is true: Being able to "read without parsing" is definitely nice.

But that is only a subset of the advantages that a homoiconic language gives you. An at least equally important advantage is that programs in homoiconic languages are typically _very easy_ to reason about by built-in mechanisms in that language.

For example, Prolog programs are readily represented as Prolog _terms_, and can be easily reasoned about by built-in mechanisms such as unification.
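
The same property is easy to see in Lisp, the other homoiconic language mentioned above. As a minimal sketch (the names are made up for illustration): a program is an ordinary list, so ordinary list accessors take it apart.

    ;; A program is just a list...
    (defparameter *defn* '(defun square (x) (* x x)))

    ;; ...so plain accessors inspect it like any other data:
    (second *defn*)   ; => SQUARE  (the defined name)
    (third *defn*)    ; => (X)     (the parameter list)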

Since I regard it as a key advantage of homoiconic languages that their
_abstract_ syntax is completely uniform and can typically be easily reasoned
about within such languages, I disagree with the main point that the article
is trying to make.

One interesting fact about homoiconicity is that extremely low-level languages
(like assembly code) _and_ extremely high-level languages (like Prolog) are
homoiconic, yet there is a large gap "in the middle", where there are many
languages (like Java, C, Python etc.) that lack this property.

~~~
catnaroek
> _very easy to reason_ [sic] _about_ by built-in mechanisms in that language.

Languages allow you to express your reasoning, but they don't do the reasoning
by themselves. Also, there is no conclusive evidence that homoiconic languages
have simpler semantics, especially of the denotational kind.

~~~
taeric
I think the meaning is that fewer things are implemented by bringing in outside capabilities. In Lisp, everything can be expressed in terms of other Lisp code, typically directly so. In other languages that are not as macro-friendly, most keywords get the explanation of "this is how the computer will act" and then explanations of new behaviors.

~~~
catnaroek
> In Lisp, everything can be expressed in terms of other Lisp code.

Um, and how exactly are primitive forms defined?

> most keywords get the explanation of "this is how the computer will act" and
> then explanations of new behaviors.

Have you heard of Hoare logic? The meaning of ordinary ALGOL-style imperative
programs can be given in terms of relating preconditions to postconditions.
Suppose that you have the Hoare triples:

    
    
        {P} foo {Q}
        {Q} bar {R}
    

Then you can derive the Hoare triple:

    
    
        {P} foo; bar {R}
    

Note that `Q` is not mentioned at all. Hence, any implementation is free to
translate the program

    
    
        foo; bar
    

into something that doesn't have `Q` as an intermediate state.

~~~
taeric
You seem to be arguing past me. If you are just upset that I said everything instead of most things, then yes, obviously some things are defined elsewhere. To that end, how things are actually implemented takes a trip to assembly. (I mainly blame that I'm writing on my phone, often while on the bus.)

My point was that you don't typically see C constructs explained in terms of other C constructs. This is quite common in Lisp: Lisp constructs explained in terms of other Lisp constructs, in large part because there are so few constructs.

You showing me that you can explain using other logic is kind of my point. It
is awesome that you can do this. I recommend the skill. It is still not
showing c or Java or Haskell or whatever in terms of themselves.

Note that I think you actually can do this, in large part. It is not typically done, though.

~~~
catnaroek
> To that end, how things are actually implemented takes a trip to assembly.

Don't confuse “defined” with “implemented”. This is the entire point of having an axiomatic semantics!

> My point was that you don't typically see C constructs explained in terms of
> other C constructs.

Languages can't be entirely defined in terms of themselves. At some point you
need to _drop down_ to something else. If most of Lisp is defined in terms of
other Lisp constructs, that is perfectly fine, but, for my purposes, i.e.,
proving things about programs, there are two mutually exclusive possibilities:

(0) The semantics of Lisp is the semantics of its primitive forms, and derived
forms are just Lisp's _standard library_.

(1) So-called “derived” forms have an abstract semantics in their own right, and their implementation in terms of so-called “primitive” forms is, well, an implementation detail.

So, my answer to “most of Lisp is defined in terms of Lisp itself” is “that's
cute, but how mathematically elegant is the part of Lisp that is _not_ defined
in terms of itself?”

~~~
taeric
I said explained. Not defined. Not implemented. Definitely not proved. Just
explained. There is typically exposition, as well.

So, by all means, keep arguing points I'm not making. I was offering what I
suspect the parent post meant by it being easier to reason using the mechanics
of the language. Nothing more.

~~~
catnaroek
> I was offering what I suspect the parent post meant by it being easier to
> reason using the mechanics of the language.

And it still doesn't make sense. “Reasoning about programs” is making inferences about their _meaning_, i.e., deriving judgments from prior judgments. How exactly do homoiconic languages make it any easier to make inferences about the meaning of programs, given that homoiconicity is largely a property of how concrete and abstract syntaxes are related to each other? (Not that homoiconicity makes things more difficult either. It is just completely unrelated to reasoning about programs.)

~~~
zmonx
For one particular example where homoiconicity makes _reasoning_ about programs easier, consider an important reasoning method called _abstract interpretation_:

[https://en.wikipedia.org/wiki/Abstract_interpretation](https://en.wikipedia.org/wiki/Abstract_interpretation)

Using abstract interpretation, you can derive interesting program properties.
The uniformity and simplicity of Prolog code, as well as its built-in language
constructs like unification and backtracking, make it especially easy to write
abstract interpreters for Prolog.

Here is a paper that applies this idea to derive several interesting facts
about programs and their meaning:

Michael Codish and Harald Søndergaard, _Meta-circular Abstract Interpretation
in Prolog_ (2002)
[https://link.springer.com/chapter/10.1007%2F3-540-36377-7_6](https://link.springer.com/chapter/10.1007%2F3-540-36377-7_6)

Abstract interpretation is also applicable to other programming languages.
However, it is much easier to apply to homoiconic languages like Prolog.
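
As a toy illustration of the technique (in Lisp rather than Prolog, and far simpler than the paper's meta-circular interpreters): because programs are plain list data, an abstract interpreter is just a small recursive function over that data. A hypothetical sign analysis for arithmetic expressions:

    ;; Abstract values: :pos, :neg, :zero, :unknown.
    (defun sign-of (expr)
      (cond ((numberp expr)
             (cond ((plusp expr) :pos) ((minusp expr) :neg) (t :zero)))
            ;; Multiplication: combine the signs of the abstract operands.
            ((and (consp expr) (eq (first expr) '*))
             (let ((a (sign-of (second expr)))
                   (b (sign-of (third expr))))
               (cond ((or (eq a :unknown) (eq b :unknown)) :unknown)
                     ((or (eq a :zero) (eq b :zero)) :zero)
                     ((eq a b) :pos)
                     (t :neg))))
            ;; Anything we don't model: give up soundly.
            (t :unknown)))

    (sign-of '(* -3 (* 2 5)))   ; => :NEG, without running the code

The analysis derives the sign of the result without evaluating anything, which is exactly the kind of program property abstract interpretation is meant to compute.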

~~~
kd0amg
Abstract interpretation is entirely about the semantics of programs and has
nothing to do with their surface syntax.

~~~
zmonx
Yes, indeed!

Please note that what makes this reasoning method so easily applicable in this
case is uniformity of the _abstract_ syntax, _not_ of the surface syntax,
which is also called _concrete_ syntax.

Homoiconicity relates the concrete and _abstract_ syntax tree (AST) of programs to the language's built-in data structures.
~~~
catnaroek
No idea about you, but when I _reason_ about programs, I spend very little
time manipulating syntax. Most of the time is spent manipulating semantic
objects, like predicates on the program state, whose representation is
independent from even the abstract syntax of a programming language.

------
kazinator
> _You simply can’t produce an AST without expanding macros first._

False; the syntax before expanding macros is an AST, so is the one after. It's
an AST-AST transformation.

If anything, the one with macros is more abstract: because it, like, has the
abstractions in their original abstract form!

Also, non-Lisp languages perform AST-AST transformations; just usually not
with Lisp-like macros. For instance, the AST node for a _while_ loop in a C
compiler might be replaced with a combination of _if_ , _goto_ and statement
label nodes (with generated labels: analogous to a Lisp macro's gensyms).
That's a form of expansion. The input is an AST with _while_ nodes; the output
is one without.
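
In Common Lisp, the same transformation can be written as an ordinary macro (a minimal sketch; `while` is not a standard operator), with gensyms playing the role of the C compiler's generated labels:

    (defmacro while (test &body body)
      (let ((top (gensym "TOP")) (end (gensym "END")))
        `(tagbody
            ,top (unless ,test (go ,end))   ; label + conditional exit
            ,@body
            (go ,top)                       ; back to the top
            ,end)))                         ; exit label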

So, no: the advantage really is that with Lisp we are reading rather than parsing. Or, alternatively, that the parsing is very simple and uniform, and that the language of the parser _over-generates_: it produces a large space of forms which do not have a meaning, but serve as arbitrary data or can be given a meaning with new abstractions.

In, say, C, we have a syntax in which there are numerous lists: lists of declaration specifiers in a declaration, lists of parameters, lists of structure members in a struct declaration, lists of global definitions, lists of statements in a statement body. These all have their own grammar productions with their own quirks. And none of them have an object model to which they correspond.

In Lisp, the analogous things are all the same list type with the same
syntactic representation. It corresponds to an object, and is operated upon by
the same access and construction methods.

Homoiconicity isn't the point, because that just refers to storing procedures
in the form in which they were entered ("code is characters, and nothing
but"). Code is structured data is code is the point, with a nice,
straightforward printed representation for working with the data textually.

~~~
derefr
> False; the syntax before expanding macros is an AST, so is the one after.
> It's an AST-AST transformation.

I mean, yes, in the same vacuous sense that the flat stream of tokens output
by a lexer technically qualifies as an AST. If you like, your lexer could
output a "tree" of 1+N nodes: a Parse node, and within it, a list of N
arguments (the lexer-tokens.) You would then apply an AST-AST transformation
that responds to the "parse" node by parsing its contents.

When we talk about an AST in the context of programming, we usually mean to
refer not to any airy CS concept, but specifically to the output of an LR(k)
parser—that is, a bottom-up, context-free parser. To parse the lexer-token
stream in an LR(k) parser, you need to be able to output ("produce") a
structure given a sequence of tokens, without having any context of the
greater rule you're executing "within" (i.e. above on the call-stack) other
than the fact of what the current rule is.

Homoiconicity is exactly the property of a programming-language _grammar_ that
allows code containing macros to pass cleanly through this initial "lexical
parsing" step. Usually, this requires a separation of the grammar from the
syntax of the language, such that the "lexical parsing" grammar will no longer
directly produce any of the "special forms" of the language itself, but these
will rather be later handled by a similar (or the same!) process as macro-
expansion—consuming AST subtrees to produce other, more specialized AST
subtrees.

Or, to put that another way: macros require top-down parsing. If you want to
avoid using a full-on top-down parser, you can instead apply a traditional
bottom-up context-free parser followed by applying a folding transformation to
the generated tree to do "the rest" of the parsing. But, to achieve this
parsing strategy, the property you have to imbue your language grammar with is
called "homoiconicity."

~~~
kazinator
Lisp macros (the kind the article is talking about) work with the same input and output language, not with two different languages/representations as described above. Macros can expand to other macros and even to themselves. It's an iterative process in which the intermediate pieces of tree have fewer and fewer macros, until none are left. Other than not using macros, the target code is in the same language.
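
A small Common Lisp illustration of that iterative expansion (`twice` and `double` are made-up macros):

    (defmacro twice (x) `(double ,x))   ; expands into another macro call
    (defmacro double (x) `(* 2 ,x))

    (macroexpand-1 '(twice 5))   ; => (DOUBLE 5)  -- still contains a macro
    (macroexpand   '(twice 5))   ; => (* 2 5)     -- fixpoint, no macros left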

(In C, the preprocessing phase also works with the same input and output
language. It's not a tree data structure, but rather a sequence of
preprocessor tokens. Tokens in, tokens out. Parsing is then done in the next
phases of translation. Macros have to do some light parsing in order to
delimit the argument lists, of course, and to identify the operators like the
token pasting ##, stringification # and whatever, plus to handle the
preprocessing directives and #if expressions with arithmetic.)

------
whalesalad
This is the most pedantic post I have ever read on the subject. The point is
that code is data, enough said. You manipulate your functions the same way you
manipulate lists, because they are the same thing.

Stuff like this is frankly why so many programmers shy away from lisp and
s-expression based languages.

~~~
lisper
> The point is that code is data

That's the popular slogan, but there's actually quite a lot more to it than that. After all, strings are data too, and C programs are represented as strings, so "code is data" in C too. But that is obviously missing the point.

What's really going on is that, in Lisp, code is a particular kind of data,
specifically, it's a tree rather than a string. Therefore, some (but not all)
of the program's structure is represented directly in the data structure in
which it is represented, and that makes certain kinds of manipulations on code
easier, and it makes other kinds of manipulations harder or even impossible.
But (and this is the key point) the kinds of manipulations that are easier are
the kind you actually want to do in general, and the kind that are harder or
impossible are less useful. The reason for this is that the tree organizes the
program into pieces that are (mostly) semantically meaningful, whereas
representing the program as a string doesn't. It's the exact same phenomenon
that makes it easier to manipulate HTML correctly using a DOM rather than with
regular expressions.
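
A one-line Common Lisp sketch of the difference: because code is a tree, "rename every reference to FOO" is a single standard tree operation rather than a regex:

    (subst 'bar 'foo '(defun test () (foo (+ (foo 1) 2))))
    ;; => (DEFUN TEST () (BAR (+ (BAR 1) 2)))

A string-level rename would have to worry about accidental matches inside longer identifiers; the tree version cannot make that mistake.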

~~~
torstenvl
_The reason for this is that the tree organizes the program into pieces that are (mostly) semantically meaningful, whereas representing the program as a string doesn't. It's the exact same phenomenon that makes it easier to manipulate HTML correctly using a DOM rather than with regular expressions._

Mind. Blown. Your whole comment is incredible. I've never thought about it
this way or realized what the value proposition was. Thanks!

~~~
lisper
Check out:
[https://news.ycombinator.com/item?id=16395810](https://news.ycombinator.com/item?id=16395810)

------
Peaker
But you can do this without S-expressions.

DLang, for example, has free-form macros that they weirdly call "mixin" (as
in, mixing in some text or declarations into the AST, syntax-wise).

Each mixin must be a valid AST subtree on its own, which gives the same
guarantee that paren matching prior to macro expansion gives you.

Then, the compiler can interleave:

    
    
         parsing
    
      -> evaluating mixin strings
      -> resuming parsing of the mixin subtrees
      -> evaluating the deeper mixin strings
      -> ...
    

You get the power of Lisp macros, but without the Lisp syntax that is
unattractive to many.

------
roenxi
The languages I know typically have a fairly standard way of implementing
functions - either values or references to values are passed as arguments and
then the function does something. Maybe it returns a value. That means a
construct that short circuits, such as:

`if is.open(file) && read(file) { ... }`

can't actually be implemented by a function. That is, you can't write a
function:

`special_and(is.open(file), read(file), ...)`

Because the `read` is automatically evaluated before the function is called.

This means a programming language with Lisp-style macros can very easily implement constructs that behave like `&&` and short-circuit, because they are implemented as a macro instead of a function. This opens up new (and fast) ways to control program flow that aren't available in most languages. The profound impact is that Lisp libraries can tack on new control structures in a way that, say, C libraries can't.
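
Here is what that looks like as a Common Lisp macro (a sketch; `special-and` and the file operations are the hypothetical names from above). Because the macro receives its argument forms unevaluated, it can short-circuit just like the built-in `and`:

    (defmacro special-and (&rest forms)
      (cond ((null forms) t)
            ((null (rest forms)) (first forms))
            (t `(if ,(first forms)
                    (special-and ,@(rest forms))
                    nil))))

    ;; (special-and (is-open file) (read-contents file)) only evaluates
    ;; (read-contents file) if the first test returns true.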

I'm no language designer, so this is basically out there to see if I get
corrected.

~~~
theoh
For some technical discussion of the order of evaluation of arguments:
[https://en.wikipedia.org/wiki/Evaluation_strategy#Non-strict_evaluation](https://en.wikipedia.org/wiki/Evaluation_strategy#Non-strict_evaluation) (Short-circuiting is specifically mentioned as having relevance.)

Note that even in a language which is generally strict, laziness can be
provided selectively for individual values, and vice versa. Haskell's laziness
has proven to be a cause of troublesome performance problems, sometimes
needing to be solved by "strictness annotations"; the newer language Idris is
strict, but offers laziness as a type.

[http://docs.idris-lang.org/en/latest/faq/faq.html#how-can-i-make-lazy-control-structures](http://docs.idris-lang.org/en/latest/faq/faq.html#how-can-i-make-lazy-control-structures)
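
For what it's worth, selective laziness can even be sketched in a strict Lisp with closures; `delay` and `force` here are hypothetical helpers, not standard Common Lisp:

    (defmacro delay (form)
      `(let (done val)
         (lambda ()
           (unless done (setf val ,form done t))   ; evaluate at most once
           val)))

    (defun force (thunk) (funcall thunk))

    (defvar *p* (delay (progn (print "computed") 42)))
    (force *p*)   ; prints "computed" once, => 42
    (force *p*)   ; cached this time, => 42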

EDIT: phamilton beat me to it, maybe the links are still useful.

~~~
sampo
Scala also has lazy values and arguments as an option.

------
lispm
I would not call that 'syntax tree', but something like 'token tree' or
'nested token lists'.

For the Lisp reader the form (+ 1 2) in

    
    
      (first '(+ 1 2))
    

and

    
    
      (* (+ 1 2) 3)
    

looks the same. It has no idea that the first is data or code represented as
data. It has also no idea that the second is actual code and which syntax
(here some prefix syntax) it uses.

All we represent is a bunch of tokens in nested lists.
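
One way to see this concretely: reading the text and quoting the data produce the same plain structure.

    (equal (read-from-string "(+ 1 2)") '(+ 1 2))   ; => T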

------
dang
Short but interesting thread at the time:
[https://news.ycombinator.com/item?id=3854262](https://news.ycombinator.com/item?id=3854262).

------
AnimalMuppet
If I understand correctly, homoiconicity isn't really about whether the symbols used in the source file to represent code are the same as the symbols used in the source file to represent data structures. It's really about the parse tree of the program using the same data structures that programs use for data, _and therefore you can operate on the parse tree just the same as you can on regular program data_.
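
A minimal sketch of "operating on the parse tree like regular data" in Common Lisp (`count-atoms` is a made-up tree walker):

    (defun count-atoms (form)
      (if (atom form)
          1
          (reduce #'+ (mapcar #'count-atoms form))))

    (count-atoms '(defun sq (x) (* x x)))   ; => 6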

~~~
benlorenzetti
This is probably the raison d'être of homoiconicity, but it's a more specific concept than that. It's a property of the grammar of the language: there need to be control characters/words in the grammar, like spaces and parentheses, so that the parser can correctly generate nested lists, or trees matching the intended AST, even when it doesn't necessarily have full information about what the language's set of operative keywords is. Basically, the parser has to support adding new, user-defined operators.

And then, in addition to being able to parse user-defined operators, you have to have a decently organized, simple system for applying AST transformations and modifying tree objects. And there again, homoiconicity in the grammar can be useful if it makes it easy to textually serialize and set object attributes.

------
zerker2000
> None of this is to say that it’s _impossible_ to design a macro system for
> languages with non-Lispy syntax

For me this was most poignantly demonstrated by
[https://chrisdone.com/z/](https://chrisdone.com/z/)

------
pjc50
I'm starting to think that the reason Lisp has never taken off outside a
minority of users is that homoiconicity is a downside for the _human_ readers
of the language; computers are happy to count brackets, but humans prefer
either different types of brackets or other separators like ';' or 'newline'.

~~~
nothrabannosir
Brackets in lisp are like the stick shift in a car. It looks scary and
dreadful to anyone who has never learned to use it.

Then, it turns out to be the easiest thing about the entire endeavour, and you forget about it within your first 10 minutes on the job.

~~~
allover
That's your opinion, but I spent weeks with the Clojure book and various
exercises and could honestly never get past the syntax.

Because of this I have zero interest in working with a lisp day-to-day, but
there are multiple C-style langs I'd be happy working with for a day-job.

I think GPP is right in asserting that _most people_ just won't ever get over
it, and you shouldn't be so quick to hand-wave their opinion away.

~~~
lebski88
I introduced Clojure to a Java shop. We took 30 Java developers from zero
experience in lisp to 100% Clojure developers. I have to say that my
experience agrees with the parent. People didn't struggle with the syntax
after a few days.

They struggled with functional programming and immutability. At least to start
with. I actually found it amazing how easily people moved over and how
enjoyable they found it. Out of 30 people only one didn't take to Clojure. He
moved to a C# team for a while but eventually decided to rejoin and pick
Clojure back up.

Obviously everyone is different but this was quite a good sample.

~~~
kazinator
> _They struggled with functional programming and immutability._

So, in other words, they were struggling with just the stuff in Clojure that makes it a non-Lisp.

Those programmers should have been informed that there are real Lisps out
there in which you don't have to do functional programming, and things are
mutable.

~~~
willismichael
Of course any true Scotsman that wants to program in a real lisp wouldn't use
Clojure.

~~~
kazinator
> _true Scotsman_

I'm not defending a generalization against counterexamples by trying to
exclude them with a moving-goalpost definition.

Mutation and pure procedural programming are part of Lisp. They are part of
Lisp when they are bad, and part of Lisp when they are good. I've never
shifted a definition of what is Lisp to exclude or include these
characteristics in order to suit an argument at hand.

------
quadcore
Guys, it's simpler than that, I think. First, homoiconicity means _one representation_ for code and data. Code and data are represented by the same data structure (in Lisp, it's lists).

So what's it good for? Well, it's a generalised way of doing objects. In OO code, when you're given an object, say as the parameter of a function, you're given data and, joy, you're given code as well. That's super handy because now data and _code-that-runs-on-those-data_ come in the same package. You don't have to know the details (and more importantly, you don't care about the details) of how that piece of code-and-data was made (I'm looking at you, polymorphism); you can interface your algorithm to it and things will run the way they are supposed to. Notice how your programming has become more powerful. You've decoupled things here: now you don't need to know how the code works, but you still can interface to it. Other teams can supply pieces of code-and-data, and, as long as you've agreed on the interface, things will run. That's classy.

Now you could go one step further. You could go literally _Matrix_ on this concept, while changing virtually nothing. Let's just represent an object in a different, yet equivalent way: as an ordered list of members and methods, which it literally is. What's that cool for? Well, now you have a list; you can splice it. You can add and remove code-and-data at will, which is what you were doing when using polymorphism (you were swapping methods, adding members, that kind of stuff).

What's it good for? Step back a minute: what is polymorphism good for? We mentioned it before; it allows you to decouple implementation from execution. Well then, homoiconicity is good at the exact same thing. That's it; there is nothing more to say. If you understand what inheritance and polymorphism are good for, you understand what homoiconicity is good for: they are tools for representing and manipulating code-and-data. Notice how polymorphism and inheritance are compile-time tools. Homoiconicity is most useful at compile time too, yet can be used at runtime as well.

All in all, that's why coding in Lisp will make you a better programmer. OO languages and Lisp have the same goals; only one is the nerd version of the other. Code in Lisp and you'll come back to OOP thinking "this looks like BASIC now".

Ultimately, OO is good. The only thing that's bad with OO is that it's clunky in practice (what a pain to change a class hierarchy) and therefore gets in the way of _refactoring_. _Refactoring_ is the key difference between _waterfall_ and good software development. I had a friend who used to say "you should be refactoring 30% of the time", and I believe he's right. So while OO features are arguably good enough, programmers tend to waterfall with them, and that's a killer.

------
russellbeattie
I have to admit I've never heard of the term homoiconicity before... Parsing
it out, I clicked on the link thinking it might be an article about making
icons or emoji on different platforms look similar.

