Homoiconicity Revisited (expressionsofchange.org)
49 points by vanschelven on June 5, 2020 | 61 comments



I think this is close, but sort of missing the point: it’s possible to extend Common Lisp to take code in the form of Python source code (clpython), JavaScript source code (cljs) or many other textual syntaxes: originally, the iconic s-expression syntax of lisps was intended to be a sort of IR dump for m-expressions.

What makes CL homoiconic has nothing to do with the textual syntax, but rather the existence of a function READ that takes _some_ textual syntax and produces the code to be evaluated in the form of the language’s ordinary data structures: EVAL and friends consume a representation of the code that is neither just a blob of bytes nor some exotic AST objects; it’s lists, hash tables, strings, vectors, and all the types of things programmers manipulate every day.

The implication of this is that intercepting and modifying the code between READ and EVAL doesn’t really require any special knowledge: you have to know the semantics of your language, and how to manipulate its basic datastructures, but you don’t need to understand any special “metaobjects” for representing code.
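
For example, a minimal sketch in Common Lisp (the + to * rewrite here is purely illustrative):

    ;; READ: text -> ordinary list structure
    (let ((form (read-from-string "(+ 1 2 3)")))
      ;; "intercept": rewrite the form with plain list operations,
      ;; e.g. swap + for * before handing it to EVAL
      (eval (cons '* (rest form))))
    ;; => 6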


Here is why that reasoning is wrong:

https://stackoverflow.com/questions/9533139/what-would-cloju...

The point is that macros become much harder once you move away from leading parenthesis.

Apple tried your reasoning with the Dylan language in the 90s - and the Dylan language died.


Dylan did not die because of any technical deficiency. Dylan died because Apple management decided to kill the project and so it never achieved critical mass.

Programming languages never live or die on their technical merits. Javascript is a hot mess, and yet it is one of the most popular languages in the world.

Also:

> macros become much harder once you move away from leading parenthesis

That is not really true. You do lose the backquote syntax, but that is only useful for writing the simplest, most trivial macros. That turns out to be the kind that most people write, and so backquote is a big deal, but if you want to write more complex macros (e.g. cl-who) then backquote becomes less useful and the surface syntax becomes less relevant.


> The point is that macros become much harder once you move away from leading parenthesis.

This is... an extraordinary claim. Prolog uses foo(hello, world) syntax (note the comma) and has no problem with macros or any other manipulation of such terms. Note that there are longer, better reasoned, and higher voted answers to that StackOverflow question than the accepted answer which opts for "macros would become more difficult" FUD.


Sounds like any random language can become homoiconic with the addition of a parser library.


If it parses to primitive data structures (and actually produces a structure) and allows you to intercept the code between parsing and evaluation/compilation, this is exactly what I’m saying.


This sounds like LINQ would definitely qualify? https://docs.microsoft.com/en-us/dotnet/api/system.linq.expr...

I think your best insight here is the importance of the "lid off" intermediate representation; plenty of languages have an eval function of some sort, but it does both READ and EVAL in your terms with no intermediate access to the parsed representation.


The issue I’ve had with things like that is that I have to learn a whole new API for manipulating syntax: while there are definite advantages to this, representing code as everyday types that very generic functions can operate on has the advantage of making metaprogramming look like normal code. e.g. if I’m doing try...finally... stuff a lot, I can write a macro that transforms:

(with-db [conn (connect ...)] ...)

To:

(let [conn (connect ...)] (try ... (finally (close conn))))

Just like:

    (defmacro with-db [[sym expr] & body]
      `(let [~sym ~expr]
        (try ~@body
          (finally (close ~sym)))))
The backtick/tilde notation isn’t macro-specific: it’s a generic way to template readable datastructures that I can use anywhere, so when I see it in a macro, it isn’t some strange API to learn, it’s just a handy way to build up the datatypes I use all the time.
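
For instance (in Common Lisp notation; Clojure’s syntax-quote works the same way apart from symbol namespacing), backquote builds plain data just as happily outside any macro:

    (let ((user "Bob"))
      `(greeting :to ,user :words ("hello" "world")))
    ;; => (GREETING :TO "Bob" :WORDS ("hello" "world"))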


Nope. The homoiconicity of Common Lisp and its related ancestor Lisps is based on the fact that the source code you compile is not textual. Yes, there is CL:READ. Notice however that CL:COMPILE and CL:EVAL do not take text. They take a data structure that is also the data structure manipulated with macros, which you can easily operate upon, and which can easily be serialized in the form of S-expressions.
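
A minimal illustration in standard Common Lisp, assuming nothing beyond EVAL and COMPILE themselves:

    ;; EVAL and COMPILE consume list structure, not strings:
    (eval (list '+ 1 2))                               ; => 3
    (funcall (compile nil (list 'lambda '() '(* 6 7)))) ; => 42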


Wouldn't a parser library that returns a data structure that is the one passed to the equivalent of EVAL and also the one acted on by macros fit this definition? And then also the ability to easily serialize it back into textual syntax? If so, this doesn't seem to require S-Expressions or lisp, but rather, like the parent comment said, just a good macro and parser library design. (It is true that languages haven't generally seemed to prioritize this feature, I'm just saying that I don't see any reason it must be fundamentally unique to lisps.)


Think of it more in terms of how the language is specified. Common Lisp is specified in terms of data structure, not text, and you can depend on that particular data structure and manipulate it etc.

A third-party library that, to be portable, ultimately needs to serialize to text does not support homoiconicity.


Is it really specified in terms of data structure? Are there not rules in the language specification regarding how s-expressions are parsed? Could I create a, for instance, C-like syntax that parses to and can be serialized from this data structure specification, and call that a valid Common Lisp? If so, neat! I do think more languages should be less defined by their concrete syntax.


Yes - the standard does cover the "Common Lisp Reader", but it's essentially a self-contained chapter - every special form, standard macro etc. is defined in terms of the data structures. So what you'd have is, at most, an extension - to be compatible with Common Lisp, CL:READ would still need to read S-Expressions with the standard readtable, and CL:WRITE write S-Expressions, but nothing stops you from adding an extra reader that uses a different syntax.
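
A toy sketch of such an extra reader, using only the standard readtable machinery (the bracket syntax is invented purely for illustration; a real extension would install it into a fresh copy of *readtable*):

    ;; Make [a b c] read as (LIST a b c):
    (set-macro-character #\[
      (lambda (stream char)
        (declare (ignore char))
        (cons 'list (read-delimited-list #\] stream t))))
    (set-macro-character #\] (get-macro-character #\)))

    (read-from-string "[1 2 (+ 1 2)]")   ; => (LIST 1 2 (+ 1 2))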


Lisp is written in the data structure.

But that's not a requirement for macros. One can write macros from a parsed AST. But that's not what Lisp does.


Nope. It’s not about “parsing”, it’s about representation.

Languages such as Python and C draw clear distinction between literal values on one hand and flow control statements and operators on the other. Numbers, strings, arrays, structs are first-class data. Commands, conditionals, math operators, etc are not; you cannot instantiate them, you cannot manipulate them.

What homoiconic languages do is get rid of that (artificial) distinction.

Lisp takes one approach, which is to describe commands using an existing data structure (list). This overloading means a Lisp program is context-sensitive: evaluate it one way, and you get a nested data structure; evaluate it another, you get behaviors expressed. The former representation, of course, is what Lisp macros manipulate, transforming one set of commands into another.
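
A quick illustration in Common Lisp terms; the same list is both things depending on what you ask of it:

    (defvar *form* '(format t "hello~%"))
    (length *form*)   ; treated as data  => 3
    (eval *form*)     ; treated as code  => prints hello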

Programming in Algol-descended languages, we tend to think algorithmically: a sequence of instructions to be performed, one after the other, in order of appearance. Whereas Lisp-like languages tend to encourage more compositional thinking: composing existing behaviors to form new behaviors; in Lisp’s case, by literally composing lists.

Another (novel?) approach to homoiconicity is to make commands themselves a first-class datatype within the language. A programming language does not need swathes of Python/C-style operators and statements to be fully featured; only commands are actually required.

I did this in my kiwi language: a command is written natively as `foo (arg1, arg2)`, which is represented under the hood as a value of type Command, which is itself composed of a Name, a List of zero or more arguments, and a Scope (lexical binding). You can create a command, you can store it and pass it around, and you can evaluate it by retrieving it from storage within a command evaluation (“Run”) context:

    R> store value (foo, show value (“Hello, ”{$input}“!”))
    R> 
    R> input (“Bob”)
    #  “Bob”
    R> 
    R> {$foo}
    Hello, Bob!
Curly braces here indicate tags, which kiwi uses instead of variables to retrieve values from storage. (Tags are first-class values too, literally values describing a substitution to be performed when evaluated.)

..

When it comes to homoiconicity, Lisp actually “cheats” a bit. Because it eagerly (“dumbly”) evaluates argument lists, some commands such as conditionals and lambdas end up being implemented as special forms. They might look the same as every other command but their non-standard behaviors are custom-wired into the runtime. (TBH, Lisp is not that good a Lisp.)

Kiwi, like John Shutt’s Kernel, eliminates the need for special forms entirely by one additional change: decoupling command evaluation from argument evaluation. Commands capture their argument lists unevaluated, thunked with their original scope, leaving each argument to be evaluated by the receiving handler as/when/only if necessary. Thus `AND`/`OR`, `if…else…`, `repeat…`, and other “short-circuiting” operators and statements in Python and C are, in kiwi, just ordinary commands.
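
kiwi and Kernel do the thunking implicitly; a rough Common Lisp sketch of the same effect (thunking by hand, since CL evaluates arguments eagerly) shows how AND stops needing special-form status once its arguments arrive unevaluated:

    ;; AND as an ordinary function over thunks: arguments are forced
    ;; left to right, stopping at the first NIL.
    (defun my-and (&rest thunks)
      (let ((result t))
        (dolist (thunk thunks result)
          (setf result (funcall thunk))
          (unless result (return nil)))))

    ;; The second thunk is never forced, so no division by zero:
    (my-and (lambda () nil)
            (lambda () (/ 1 0)))   ; => NIL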

What’s striking is how much non-essential complexity these two fundamental design choices eliminate from the language’s semantics, as well as from the subsequent implementation. kiwi has just two built-in behaviors: tag substitution and command evaluation. The core language implementation is tiny; maybe 3000 LOC for six standard data types, environment, and evaluator. All other behaviors are provided by external handler libraries: even “basics” like math, flow control, storing values, and defining handlers of your own. Had I tried to build a Python-like language, I’d still be writing it 10 years on.

There are other advantages too. K&R spends chapters discussing its various operators and flow control statements; and that’s even before it gets to its stdlibs. I once did a book on a Python-like language; hundreds of pages just to cover the built-in behaviors: murder for me, and probably not much better on readers.

In kiwi, the core documentation covering the built-in data types and how to use them, is less than three dozen pages. You can read it all in half an hour. Command handlers are documented separately, each as its own standardized “manpage” (currently auto-generated in CLI and HTML formats), complete with automated indexing and categorization, TOC and search engine. You can look up any language feature if/when/as you need it, either statically or in an interactive shell. Far quicker than spelunking the Python/C docs. A lot nicer than Bash.

Oh, and because all behaviors are library-defined, kiwi can be used as a data-only language a-la JSON just by running a kiwi interpreter without any libraries loaded. Contrast that with JavaScript’s notorious `eval(jsonString)`. It wasn’t created with this use-case in mind either; it just shook out of its design as a nice free bonus. We ended up using it as our preferred data interchange format for external data sources.

Honestly, I didn’t even plumb half the capabilities the language has. (Meta-programming, GUI form auto-generation, IPC-distributable job descriptions…)

..

Mind, kiwi’s a highly specialized DSL and its pure command syntax makes for some awkward reading code when it comes to tasks such as math. For instance, having to write `input (2), + (2)` rather than the much more familiar `2 + 2`, or even `(+ 2 2)`. Alas it’s also proprietary, which is why I can’t link it directly; I use it here because it’s the homoiconic language I’m most familiar with, and because it demonstrates that even a relative dumbass like me can easily implement a sophisticated working language just by eliminating all the syntactic and semantic complexity that other languages put in for no better reason than “that’s how other languages do it”.

More recently, I’ve been working on a general-purpose language that keeps the same underlying “everything is a command” homoiconicity while also allowing commands to be “skinned” with library-defined operator syntax to aid readability. (i.e. Algebraic syntax is the original DSL!) It’s very much a work in progress and may or may not achieve its design goals, but you can get some idea of how it looks here:

https://github.com/hhas/iris-script/blob/f9d9298824d05eccb22...

Partly inspired by Dylan, a Lisp designed to be skinnable with an extensible Pascal-like syntax, and also worth a look for those less familiar with non-Algol languages:

http://www.gwydiondylan.org/books/drm/drm_7.html

And, of course, by Papert’s Logo:

https://www.amazon.com/Mindstorms-Children-Computers-Powerfu...


> it’s lists, hash-tables, strings, vectors, and all the types of things programmers manipulate every day.

Except the lisps with good error messages don't really use those things without tagging on the file and line number where they came from. You end up working with "syntax objects" or something instead.


TXR Lisp keeps a weak hash table which associates objects in the syntax with file/line information, without changing their representation in any way.

There is no need for ugly, cumbersome syntax objects that destroy Lisp.

Whether or not the recording of source loc info is enabled is controlled by a special variable. It is disabled by default, because you probably don't want that overhead if you're calling the reader for reams of data. Functions like compile-file and load enable it locally.
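
A rough sketch of the general idea in Common Lisp, for anyone who wants to play with it (this is not TXR's actual code, and :weakness is an SBCL extension, hence the feature conditional):

    ;; An EQ hash table keyed on the cons cells READ returns,
    ;; mapping each form to its source file and line.
    (defvar *source-locations*
      (make-hash-table :test 'eq #+sbcl :weakness #+sbcl :key))

    (defun note-location (form file line)
      (setf (gethash form *source-locations*) (cons file line)))

    (defun form-location (form)
      (gethash form *source-locations*))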

To take advantage of the info for error reporting in a user defined macro, you can simply do this:

   (defmacro mymacro (:form f arg1 arg2 ...) ;; get the form using :form
     (when (somehow-no-good arg1) ;; something wrong with arguments
       (compile-error f "first argument ~s is no good!" arg1))
     ...)
You get an error which mentions mymacro, and the location.

   (foo.lisp:10) mymacro: first argument 42 is no good!


This is interesting and clever - thank you for sharing it. However, it seems like the hash table would work on either the address or the shape of the code, and in either case you could build a pathological case which would break it. It's probably a really nice solution for the common/realistic case though.


Common Lisp does that. There is a portability library that provides an implementation-independent interface to that functionality for Lisp programmers: https://github.com/scymtym/trivial-with-current-source-form


Racket does this. I personally have no issues with the error messages in Common Lisp implementations, which generally implement this sort of thing by generating special source forms that are just lists wrapping other parts of the program.


Isn't what you're saying captured (mostly) by the second bullet point?


Sort of, but I think the third bullet point is irrelevant: the correspondence of the visual syntax to the data structure representation isn’t really essential to homoiconicity. The alternative syntax in http://readable.sourceforge.io is still homoiconic, despite being a somewhat different syntax for the same program.

EDIT: If anything, I’d define homoiconicity by the ability to write the same program (as EVAL sees it) in a variety of different textual formats: e.g. ‘foo is equivalent to (quote foo) and both programs are indistinguishable to EVAL


I think this definition makes sense, though I agree with https://news.ycombinator.com/item?id=23426047 that the notion of homoiconicity, and therefore a precise definition of it, is not really useful.

What I'm disappointed by is the analysis of the examples and counter-examples, which even for a cursory analysis is somewhat sloppy. For example: "S-expressions are trees, and the nesting of items is immediately apparent: ( denotes the start of a child, and ) its end." Not really, as I would say that "(f x y z)" either has four children, or one distinguished function symbol and three children. But there is only one pair of parentheses. Even if you take the reductionist view that there really are two children, namely "f" and "(x y z)", there is only one pair of parentheses in the original.

Further: "JavaScript has strong language support for composable, tree-like, structures (bullet 1) and the structure is immediately apparent by looking at the pairs of { and } brackets (bullet 3)." It's more complicated than that, since "f(g(x + y * z))" is also highly structured, and much of the structure is immediately apparent, but some of it isn't because it uses precedence rules. Also, no {} in sight. And then: "Java has no strong language support for literal representation of simple composable data in the language (bullet 1) and also fails to meet the other criteria." I don't see how Java fails bullet 3 if JavaScript is claimed to pass it. The syntax uses {} and other grouping and expression forming mechanisms in essentially the same way. If anything, Java makes structure a tiny bit more explicit since it requires explicit semicolons to delimit statements.


> I don't see how Java fails bullet 3 if JavaScript is claimed to pass it.

I understood it as Java not having a data structure that can be used as a tree available as literal. JS has that. This is valid JS but not valid Java: [[1, 2, [1, 2, []]], [[], [], 0], 1, 2, 3]


It might not be what the article meant since it only mentioned { } braces in connection with JavaScript, but this is a very good point!


I read the javascript part as specifically referring to JSON, not to all javascript code.


Interesting. I don't think that's the intended reading, since there are lots of references throughout to "programs", but JSON files aren't programs in the same sense as JavaScript files are. Also there is an explicit distinction between JavaScript and JSON in the article: "...a JavaScript program is not provided as a JavaScript object (or JSON object)".


I wonder where tcl falls on that list. It's sorta like everything is a command / list of commands and a command is a list of strings ?


> let’s say a word means what the people who use it mean by it.

This is how words work, so I don't know why the author says "let's say". There is no "correct" definition of a word; the dictionary which you might consult to obtain such a definition is itself just recording what is the most commonly-used meaning of the word among people who use it most natively.

I make this point not out of pure pedantry, although that may play a part, but also because nobody seems to know what "homoiconicity" actually is — nor does it seem to actually be very important for us to find out! When people cannot agree on what a word means, I think it's reasonable to say that that word doesn't really have a meaning.

Some time ago, I had the opportunity to talk to Matthias Felleisen — the progenitor of the Racket project (a derivative of Scheme, itself a derivative of Lisp). When I asked whether he felt homoiconicity was an important quality of a language, he indicated something to the effect of (heavily paraphrasing) "I don't know what that word means, and you don't know what that word means, but whatever you think it means is not important. It doesn't matter."

Shriram Krishnamurthi, a former student of Felleisen's and one of the original members of the Racket project, can be found to disparage the word whenever he finds it on Twitter. He also consistently refers to it as h11y (that's "H-eleven-Y"), not just to save characters in tweets, but also, I think, because the word itself is not very important. I've seen multiple exchanges where he challenges peoples' attempts at defining the term precisely, and invariably they give up. (One such exchange is visible at [0].)

Another Racketeer, Matthew Flatt, recently started a push to build a new Racket language (now called Rhombus, I think) which abandons s-expressions almost altogether, but without losing any of the benefits of Racket.

If these people, who are all everyday Lispers, and who are widely known and respected in the programming language community, don't have a solid definition of "homoiconicity" and, further, don't think it's worth investing the time to define it... maybe we should stop getting articles like this, and maybe we should instead tell people to abandon their search for the One True Meaning.

(I do want to say that I think this is one of the better articles about the topic that I've read, and the author did a good job. I just think the topic is not as interesting as the author expects their readers to find it.)

Homoiconicity doesn't matter. Nobody can define it very precisely, so no two people will have the exact same definitions in their head when discussing it, so any talk regarding it will be necessarily lossy (or, in the worst case, completely non-productive). If the literal representation of your language happens to evoke some imagery in your mind of the secret machinations of the underlying forms or something, that's great! But I don't think it's worth pursuing this quality specifically. What syntax works this way for you may not work well for others, or vice versa.

I think that, instead, it is better to focus on more specific goals. Maybe your language is very tree-oriented, so you make it easy to represent trees directly. Something like that. But the specific goal of "homoiconicity" — whatever that means to you — is perhaps not worthwhile, nor should we be seeking to define it. We should leave it by the wayside and move on!

[0] https://twitter.com/ShriramKMurthi/status/104694495053182976...


I believe words and the concepts they represent are important. Homoiconicity is very, very important, but the concept is something subjective that you only get after working with it for years.

After programming for a long time in other languages, I had the intuition of wanting something like lisp, but I could not express those ideas with exactitude. They were diffuse, not clearly defined ideas.

Then after having a master teach me lisp I got it. I also had an intuition of lisp's limitations, and started working around those limitations, getting to know in specific terms what those limitations are, and, most importantly, working on solving them.

You have people around the internet with very little programming experience discussing all day about concepts they hardly understand, because they have not programmed a lot to really grasp it.

This is like discussing the sex of Angels.

A lot of the problems that exist have a lot to do with the limitations of the input technology of computers.

Creating anything takes vastly more energy and work than just thinking about it.

I could think about computers talking to me and me talking with the computer. It takes lots of work (years) to actually do it.

I could dream about the computer writing code for me, but it can take decades of actual work to do it.

Spending your time just talking is so easy compared to actually going the distance working around it.


> Homoiconicity is very very important

Why is homoiconicity "very very important"? Nobody seems able to say.

> You have people around the internet with very little programming experience discussing all day about concepts they hardly understand, because they have not programmed a lot to really grasp it.

The people I quoted were all principal developers of arguably the most successful Lisp derivative in the past few decades. The annual Scheme Workshop is almost entirely Racket these days. I doubt you are in a place to claim that "they have not programmed a lot to really grasp [homoiconicity]". So I'm confused what your point is here.


“Why is homoiconicity "very very important"?”

I’ll take “What is parsimony?” for twenty, Alex.

That said, I think “very very advantageous” would be more correct.

If your language is a sprawling inconsistent mess of non-essential complexity, what chance of the systems built with that language being any better? And what chance of the language itself growing and evolving in future?

Complexity kills progress. Forward movement finally grinds to a halt under its own impossible weight; at which point some genius will pronounce “Let’s abandon all that and start over completely afresh”, only to make the exact same foundational mistakes as the last time. Rinse-and-repeat ad nauseam for the illusion of progress without actually ever progressing anywhere. And so it goes.

The only way to maintain sustainable growth is to eliminate all non-essential complexity constantly and ruthlessly. You can either climb that ladder and discover where it goes, or forever sit on its bottom rung playing with yourself.


You're actually proving my point. :)

You never defined what it is that you mean by "homoiconicity", nor did you motivate why homoiconicity specifically is important or advantageous. You made, essentially, two points:

- homoiconicity is "very very advantageous"

- excessive, non-essential complexity will hinder progress eventually

You did not argue that homoiconicity defeats complexity in any way, and you can't make such an argument without first defining what exactly you mean by "homoiconicity".

Like I said, this has only proven my point: people like to argue in favor of homoiconicity without defining it sufficiently. Until someone comes up with an unambiguous definition that exactly fits all of their preconceived notions about the word, the word itself is absolutely useless.

For further clarity: I was never arguing against any of the features that people often mean when they say "homoiconicity is important". Rather, I was arguing that the word "homoiconicity" is unimportant because nobody knows what it means. I have yet to find a truly great definition of the term that encompasses exactly all those languages people tend to feel are homoiconic and none of those that people feel aren't.



The first Lisp syntax idea was to use s-expressions only for data and to use a kind of algebraic notation for the code part. Thus programs would use s-expressions where such data would be written. But that was the syntax on paper. The actual Lisp implementation was interpreting and compiling s-expression code and data. Developers had to hand-translate the original syntax into s-expressions. Lisp developers found that very useful, and attempts to replace it never had much success with actual Lisp users. At the same time it was always a hurdle for new users or people who would learn Lisp in some university or school setting. In books from the 60s one can read the same complaints about parentheses, etc., that we still hear today.

Later there were several large- and small-scale attempts to abandon the s-expression syntax or to develop new languages with all the advantages of Lisp, but not its syntax. The "Lisp 2" project was a huge effort in that direction and failed. Logo was another. ML was similar (and it was successful, spawning a bunch of new languages). Apple Dylan was another one. Plus many others. Racket now starts the next attempt to get rid of s-expressions. Scheme R6RS went some way in that direction early on, by offering fixed syntax definitions for much of the language.

For Lisp the base idea is that programs are not text, but basic data made out of lists, symbols, and a bunch of other data types. The input to the EVALuator is such a data structure. Additionally there are procedures like READ and PRINT, which can translate data from and to textual representations.

So the idea is: code is data. Plus: code has mostly a direct textual representation in the form of nested s-expressions.

The Racket project is more concerned with reaching a larger audience for computer science education, and not so much about sticking to these basic Lisp principles. Their problem is that the s-expression syntax and the code is data idea is difficult for students and of little use, since most popular programming languages work differently (Python, Java, JavaScript, ...).

As a Lisp user I don't care about their educational problem and like the simple code as data idea and what follows from that.


I have a different view:

Everyone knows what homoiconicity is supposed to mean; however, it is not actually a feasible thing to achieve completely, and so people use the word approximately.

In my mind I transform "lisp is homoiconic" into "lisp is more homoiconic than most other languages" that is to say, they are closer to achieving this impossible idea.

Really you could let go of this particular "fancy word" (after all it's not like it's the only needlessly complex programming language theory jargon...) and just call it "syntax consistency" or something along those lines.

I sympathize with your stance, precision is important, but people generally don't treat words with any respect so it's somewhat of a futile endeavor.


> they are closer to achieving this impossible idea.

But what is the impossible idea? What is homoiconicity? Can you give a succinct definition that is consistent with your usage?

I don't mean this to sound so antagonistic, but I've seen multiple times where people try to argue in favor of it and come up short when pressed for details.

Unfortunately, I do not think I am sufficiently skilled to carry the debate forward. I really recommend looking through Shriram's Twitter exchange to see some of his rebuttals on the topic.


That makes me think of "pure languages". Some languages are "pure" because they have no side effects? Yes, maybe, but which practical programming languages have no side effects?

So, similarly to what you state about homoiconicity, some languages are simply purer than others.


The use of "pure" in this case isn't to say "better", but rather to do with a mathematical viewpoint.

In math, if I say "x = 3" and then "y = x + 2", it is true that "y = 5" and nothing else (assuming our regular understanding of addition is given).

In programming, it might also have come to pass that some data was output to the console, and some logging took place, and maybe an HTTP request or two were carried out in the interest of making that evaluation. These are not "pure" mathematics; side-effects are impure because they are not accounted for directly in the semantics.

Purity is not on a scale. Either a language is pure, or it isn't. Nobody made any claim that purity is something to strive for in a language. Haskell has it because it was designed specifically to be a language where academics could carry out their "What if?" fantasies (which is catalogued in "A History of Haskell", a paper from HOPL III (2007)).

I will say that purity is useful in most practical programming — just not at the top level. The more purity you have in your code (by which I mean, the fewer side-effects that are not evident in the type system), the easier it is to make changes to your code and to use it with abandon. The more side effects you have, the harder it is to restructure code and reason about how things work. So within the implementation of a particular application, there might be regions of purity and then (encompassing) regions of impurity. But a given function is either pure or impure — no gradient to it at all.


> Purity is not on a scale. Either a language is pure, or it isn't.

This is not quite right. "Purity" means that, with respect to some unspecified but fine-grained semantics, various pairs of programs are equivalent. For example, in the denotational semantics of Haskell the following are equivalent

    let x1 = rhs
        x2 = rhs
    in body
and

    let x  = rhs
        x1 = x
        x2 = x
    in body
But they are not equivalent under the operational semantics. The former allocates two closures, the latter one. The "scale" of purity then is measured by how fine-grained the semantics is. Haskell is pure at the "scale" of denotational semantics, but not at the scale of operational semantics.


Ooh I had not heard "purity" defined this way previously in such explicit detail, but it makes perfect sense and I really really like it. Thank you so much for giving me better words for this going forward!

Edit: I think, though, that my point kind of still stands in that, with respect to a specific level of semantic detail, a language is either pure or it isn't. Further, I think it isn't useful to compare Language A at Semantic Level 3 to Language B at Semantic Level 1 to compare purity; you're generally going to be sticking to one level of semantic detail at a time, and within that lens all languages will either be pure or impure. Right? Hmm maybe I need to think on this more.


> This is how words work, so I don't know why the author says "let's say".

I say that specifically because this word was coined in 1965 by Calvin Mooers with a particular specific definition; I'm saying that even though it's a (relative) neologism, the word has already shifted in meaning.

> maybe we should stop getting articles like this, and maybe we should instead tell people to abandon their search for the One True Meaning

I actually agree... the article (and the previous one) are mostly consequences of starting out in the belief that homoiconicity is something to strive for, and then failing to understand what homoiconicity would really be -- but discovering the things to strive for in the process.


> I say that specifically because this word was coined in 1965 by Calvin Mooers with a particular specific definition; I'm saying that even though it's a (relative) neologism, the word has already shifted in meaning.

Mm what I meant to get at was that you give a definition of "meaning", which is that "a word means what the people who use it mean by it", and your use of the phrase "let's say" makes it sound like this is a charitable admission on your part. And my point is that this is what it means for a word to "mean" something — it's entirely based on what people who use it mean by it, so the way you say "let's say" was rubbing me the wrong way, so to speak.

> I actually agree... the article (and the previous one) are mostly consequences of starting out in the belief that homoiconicity is something to strive for, and then failing to understand what homoiconicity would really be -- but discovering the things to strive for in the process.

I did not take the time to read the previous article, but I regret that now. I will go back and read it. :)

I don't think I did a good enough job saying that I think this is among the best articles on homoiconicity that I've read! I meant for my lament to be more general and not specifically directed at you, but more just at those people striving for homoiconicity without a clear idea of what that means. I think, rereading parts of this article again now, we're more aligned than I had previously thought. I'm sorry for not realizing that earlier!


> There is no "correct" definition of a word; the dictionary […] is itself just recording what is the most commonly-used meaning

This is an extremely descriptivistic view of reality. I urge you to moderate the stance.


Linguistics is inherently descriptivist.

Words only have meaning due to community consensus. Yes, we have dictionaries to record what the consensus is (or, more accurately, what the consensus was at a particular point in time), but the dictionary cannot predict semantic drift, nor can it adapt with sufficient speed in the modern era of light-speed communication.

The dictionary only works because most people who use a given word already use it consistently with how the dictionary describes it. It is not the case that most people learn the word "dog" from the dictionary; the dictionary's definition is "correct" because it aligns with the way people already use the word. If Merriam-Webster published a revised definition stating that a dog is "a large, four-legged creature with wings which breathes fire", people would immediately agree that the dictionary is wrong. It only reflects words as they are used, not words as they must be used. (The Merriam-Webster official Twitter account is adamant in promoting this view of dictionaries, for what it's worth.)


Maybe off topic, but I find the styling of this website so pleasant. The colors and fonts really improve my reading experience.


Thanks, though credit where credit is due: it's basically https://github.com/coletownsend/balzac-for-jekyll/ with some minor tweaks


Not sure why, but the list numbers in your working definition are vertically chopped on my iPad (but not my iPhone). Only the right half of each index number is visible.


It seems like there are two ways to go from here -- abandon it, or formalize it. Just because formalizing a concept is difficult, that doesn't make it impossible. I think everyone agrees that there is _something_ that distinguishes lisps from other languages, even if it's hard to say exactly what it is.


An article about homoiconicity without once mentioning either the word "transform(ation)" or "macro". I can't quite make my mind up whether this is a tour de force or if it misses the point entirely.


Isn't the idea of homoiconicity in its simplest form that every data-structure can also be interpreted as a program?


I'd reverse it: every program can also be interpreted as a data-structure.

And this is what the article is about, it's just a lot more in depth dissecting all the ingredients needed to get there.


Every program in every programming language is interpreted as a data structure. Otherwise we would not have compilers and interpreters.


Maybe it's just: 1. code is written as an AST, and 2. (macros) you can construct objects that can be interpreted as an AST


PostScript is homoiconic, and semantically quite Lisp-like, when it's not being syntactically somewhat Forth-like.

https://news.ycombinator.com/item?id=21968842

>[...] PostScript and Lisp are homoiconic, but Forth is not. The PSIBER paper on medium goes into that (but doesn't mention the word homoiconic, just describes how PS data structures are PS code, so a data editor is a code editor too).

https://medium.com/@donhopkins/the-shape-of-psiber-space-oct...

Also, here is a metacircular PostScript interpreter, ps.ps: a PostScript interpreter written in PostScript! Since PostScript is homoiconic and so much like Lisp, it was as easy as writing a metacircular Lisp interpreter (but quite different in how it works, since PostScript and Lisp have very different execution models).

https://donhopkins.com/home/archive/psiber/cyber/ps.ps

The heart of the metacircular interpreter, "iexec", uses some dynamically defined macros like MumbleFrotz, PushExec, PopExec, containing embedded literal references to data (a dictionary representing the interpreter state, and ExecStack, an array representing the execution stack).

    % interpretivly execute an object

    /iexec { % obj => ...
      100 dict begin
        % This functions "end"s the interpreter dict, executes an object in the
        % context of the interpreted process, and "begin"'s back onto the
        % interpreter dict. Note the circularity.
        /MumbleFrotz [ % obj => ...
          /end load /exec load currentdict /begin load
        ] cvx def

        /ExecStack 32 array def
        /ExecSP -1 def

        /PushExec [ % obj => -
          /ExecSP dup cvx 1 /add load /store load
          ExecStack /exch load /ExecSP cvx /exch load /put load
        ] cvx def

        /PopExec [ % obj => -
          ExecStack /ExecSP cvx /get load
          /ExecSP dup cvx 1 /sub load /store load
        ] cvx def

        /TraceStep {
          iexec-step
        } def

        PushExec

        { ExecSP 0 lt { nullproc exit } if % nothing left to execute? goodbye.

          ExecStack 0 ExecSP 1 add getinterval
          TraceStep pop

          % pop top of exec stack onto the operand stack
          PopExec

          % is it executable? (else just push literal)
          dup xcheck { % obj
            % do we know how to execute it?
            dup type
            //iexec-types 1 index known { % obj type
              //iexec-types exch get exec % ...
            } { % obj type
              % some random type. just push it.
              pop % obj
            } ifelse
          } if % else: obj

        } loop % goodbye-proc

        currentdict /MumbleFrotz undef % Clean up circular reference
      end
      exec % whoever exited the above loop left a goodbye proc on the stack.
    } def

It also has a "vexec" function that executes arbitrary PostScript code, and prints out another text PostScript program to draw an animated trace diagram of the operand stack depth -vs- execution stack depth. That's another simpler kind of macro that produces text instead of structures. So I was using the metacircular PostScript interpreter to visualize one structural PostScript program's execution, by producing and executing the resulting text PostScript program in another PostScript interpreter!

I don't currently have the PS output or rendered images decoded and online, but here's a readme:

https://donhopkins.com/home/archive/psiber/cyber/twist.readm...

This is a plot of the execution stack (x-axis) and the operand stack (y-axis) during the execution of the PostScript /quicksort routine. Each successive picture is more twisted than the last. Twisting is accomplished by rotating the coordinate system clockwise slightly around the center of each of the dots as they are drawn. The rotation around each plotted point accumulates to make the whole drawing curl up. The more twisted away from the original orientation a point is, the later it occurred in time. In the first picture, the untwisted version, up corresponds to a deeper operand stack (pushing things on the stack moves you up), and right corresponds to a deeper execution stack (procedure calls and control structures move you right). The lines follow changes in the state of the stack between steps of the interpreter. (This was made possible by a PostScript interpreter written in PostScript.)

To see the twist animation, run monochrome NeWS, type "psh" to the shell, then type "(/wherever/you/put/it/twist.ps)run". The reason you can't just psh the file directly is that NeWS 1.1 psh does $1 $2 $3 arg substitution, even on binary data! (X11/NeWS psh should work, so you can just go "psh twist.ps")

What the file twist.ps contains is a short header defining the function "c", which reads a canvas and displayed it on the framebuffer. That is followed by a series of "c"'s each followed by a 1 bit deep 1152x900 sun raster files.

-Don Hopkins (don@brillig.umd.edu)


somewhat off-topic, but as both the author and original poster of this article, it's interesting to note that the article appears here as though it was posted 3 hours ago, even though I actually posted it 23 hours ago (when it gathered no attention).


That's because it got put in the second-chance pool - see https://news.ycombinator.com/item?id=11662380 and the links back from there.


Dan,

Emailed, still shadow banned, should I email again?


I wouldn't. Each time you do you put yourself at the top of the inbox, and we work through the inbox from the bottom up—this seems fairest, plus is a natural form of comeuppance for the most demanding emailers.

Please don't post off topic things like this to HN. I'm sorry it takes a long time to answer emails sometimes, but those are the constraints we're under. Believe me, I don't like it either, especially on nights when I'm up late answering them. You're not being treated worse than others, and it's a tad off-putting to be harangued.


Please correct the misspelling of TRAC in the examples section. (Or link directly to Wikipedia.)

And I really enjoyed your analysis. But I think Forth (or Joy) deserves a discussion.

I also wonder how homoiconicity relates to lambda calculus. Pure Lisp contains additional facilities that basically allow one to introspect lambda expressions. Somebody must have studied this theoretically.


> Please correct the misspelling of TRAC in the examples section.

Thanks, done



