Homoiconicity isn’t the point (2012) | 79 points by pcr910303 9 months ago | 52 comments

 Fwiw, I think what's important about homoiconicity isn't so much that the language uses "boring" data structures for its (intermediate?) syntax tree, but that both the syntax itself and the representation of the syntax as an AST are simple and obvious.

Haskell has TemplateHaskell, which can be used for macro-like things, but it's substantially less ergonomic, not because Haskell isn't "homoiconic", but because the grammar is actually really complex and non-obvious. There are tons of little things that you don't think about when writing Haskell code, but that you have to deal with when manipulating it. For example:

https://hackage.haskell.org/package/template-haskell-2.15.0....

That's a node in the AST that stands for something that is at most one character in the source text, and usually zero. So code manipulating this stuff gets really verbose and clunky. It's still powerful, but it's not the same.

As a side project, I'm actually working on an ML-family language with a macro system. It still has a more traditional ML-style syntax, but it is simple, so working with it should be comparatively ergonomic. In an ML you wouldn't frequently want to be working with loosely defined data structures anyway; the first thing you'll do is convert the syntax to a more strongly typed form that captures what you really want to be manipulating.
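The "boring data structures" point can be sketched outside Lisp too. Here is a toy illustration in Python (purely hypothetical; this is not the commenter's ML language or any real macro API): when the syntax tree is just nested lists, a macro-style transform is plain list surgery, with no Loc-style bookkeeping nodes to thread through.

```python
# Toy sketch: an s-expression represented as "boring" nested lists.
# Code that manipulates code stays short because there is nothing
# in the representation except the structure itself.

def swap_args(expr):
    """Rewrite (f a b) into (f b a): a one-line transform over list-shaped syntax."""
    head, a, b = expr
    return [head, b, a]

tree = ["subtract", 10, 3]      # the s-expression (subtract 10 3) as plain data
print(swap_args(tree))          # ['subtract', 3, 10]
```

The contrast with TemplateHaskell is that there the equivalent transform must pattern-match a much larger, non-obvious AST vocabulary.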
 I hope I stay active in the industry long enough for people with influence to start talking about Human Factors as they apply to development tools. Ten years ago I thought that might be right around now, but today I'd still say ten years from now. I might be in my fusion-powered self-driving car waiting for that boat to come in.

When I'm digging through a large body of code looking for subtle bugs, I want the code to be boring but not bland. By that I mean, yes, all of the bits should be obvious, because I'm having to contend with the Cartesian product of all of the bits. But if everything is self-similar top to bottom, there are no landmarks. It becomes very easy to get 'lost' in the code and have trouble telling whether the next candidate for debugging is 'up', 'down', or sideways in the call stack.

Fractals are really cool to look at, but they're murder for navigation purposes.
 > Fractals are really cool to look at, but they're murder for navigation purposes.

Are they? They imply that the structure is self-similar, which is a good trait for a structure, and makes it easy to read it at any level and get what's going on. That's what trees are, lists of lists are, strings of characters are, etc.

> But if everything is self-similar top to bottom, there are no landmarks.

The specific functions called at each level are the landmarks.
 What specific functions? That's my point. If you go all in on recursive design, all the functions, variables, and object names are the same all the way up and down your graph. There are no specific functions. It's all grey goo.
 Forgive my failing imagination, but can you give some concrete examples of what you’re describing?
 I cannot speak for the others here, but with languages like, say, JavaScript, the symbols usually represent something: {...} will usually represent a block of code; [...] will usually represent an array-like index.

With Lisp you don't have such visual cues; you have to read the function name and perform a mental translation (lookup) of that function name to "find" a purpose in order to know what the "parenthesized unit" is. Thus, it's more mental steps to compute the general meaning of the code. One is mentally searching (mapping) on name, not visual appearance.

Lisp fans seem to perform this lookup faster than average. Whether it's because they've been doing it for so long or they have an inborn knack is unknown. It would make a fascinating area of research. I tried to get the hang of name-based fast recognition, but was progressing too slowly for my comfort.
 Lisp code usually has a complex tree-like indentation & layout. Indentation is provided as a standard service by editors and by Lisp itself. See the function PPRINT as the interface to the pretty printer, which does layout and indentation of source code.

We look for visual tree patterns. For example, the LET special form:

  LET BINDINGS
    BODY

or in more detail:

  LET VAR1 VALUE1
      VAR2 VALUE2
      ...
    BODYFORM1
    BODYFORM2
    ...

The list of binding pairs is another common pattern. There is a small number of tree patterns which are used in the core operators of Lisp. Once you've learned them, reading Lisp is much easier than most people think.

  CL-USER 13 > (pprint '(let ((one-number 1) (two-numbers 2) (three-numbers 3)
                              (four-numbers 4) (five-numbers 5) (six-numbers 6))
                          (+ one-number two-numbers three-numbers
                             four-numbers five-numbers six-numbers)))

  (LET ((ONE-NUMBER 1)
        (TWO-NUMBERS 2)
        (THREE-NUMBERS 3)
        (FOUR-NUMBERS 4)
        (FIVE-NUMBERS 5)
        (SIX-NUMBERS 6))
    (+ ONE-NUMBER TWO-NUMBERS THREE-NUMBERS FOUR-NUMBERS FIVE-NUMBERS SIX-NUMBERS))
 All indentation does is tell you that there is a hierarchy of some sort. The fact that something is one level deeper than another still doesn't tell me generally what it does, because that depends on what the parent(s) does. And it's not a difference maker, because "regular" languages can also use indents.

Further, the coder decides the indentation, and I'm not convinced it's consistent enough. In my example, a curly brace is a curly brace regardless of a coder's preference. It's enforced by the rules of the language, not the coder.

Another thing I'd like to point out is that languages like JavaScript provide two levels of common abstraction. Using curly braces, parentheses, and square brackets as an example, you spot them and immediately know the forest-level category something is: code block, function call, or array/structure index. With Lisp, all you are guaranteed to have is the function name (first parameter), on which you have to do a mental name-to-purpose lookup, which could involve thousands of names. Splitting the lookup into levels improves the mental lookup performance, at least in my head.

It also helps one quickly know to ignore something. For example, if I'm looking for a code block, I know I can probably ignore array indexes, and vice versa. It's forest-level exclusion, which is hard to do with single-level lookups because the list is too long: it has to contain the general category of each name. It's extra mental accounting. Hard-wiring big-picture categories into the syntax allows quickly making forest-level reading decisions, and individual coders generally can't change them.

When it comes to team-level readability, consistency usually trumps abstraction and many other things. Maybe Lisp can do something similar with "let", bindings, and body, but then it starts to resemble "regular" languages, along with their drawbacks, which typically means less abstraction ability. Consistency "spanks the cowboys", both the good cowboys and the bad cowboys.
But at least you know what you have, and can estimate and plan accordingly.
 With lisp you stop thinking about syntax rules (because the program is in a very simple form), so you can focus purely on semantics. Some things, like you say, are too overloaded in Clisp, for example:

  (defun averagenum
    (n1 n2 n3 n4)
    (/ (+ n1 n2 n3 n4) 4))

Clojure recognizes this goo problem and expresses the parameters in a vector instead of how Clisp does it in a list:

  (defn averagenum
    [n1 n2 n3 n4]
    (/ (+ n1 n2 n3 n4) 4))

Helps a lot.
 That isn't how that defun would normally be formatted, though. Lisp knows that the 'body' part starts with the 3rd argument (the same way Clojure knows it starts with the second), so it indents the lambda list farther right (if it doesn't fit on the first line; otherwise it would go there).

For my eyes, [] aren't distinct enough from () to make the second style preferable. I'd rather have indentation to set it apart.
 I just made them consistent. It certainly helps me visually. Also a list implies you intend to evaluate it by executing a function. A vector implies you do not intend to call a function.
 Yeah, it's definitely a taste thing, but as the first style isn't used in Lisp, comparing its readability doesn't really make sense. Indentation is important to reading Lisp. Indenting it in an odd way makes it harder to parse. It's a bit like giving

  def foo(x) bar(x) end

as an example of Ruby syntax being overly homogeneous.
  (defun averagenum (n1 n2 n3 n4)
    (/ (+ n1 n2 n3 n4) 4))

The typical pattern is:

  DEFSOMETHING name arglist
    body

A Lisp programmer reads those structural patterns, not the delimiters.

Lisp programming is more about thinking of trees of code and their possible manipulation - even independent of a visual notation and especially independent of the exact delimiter used. In shape recognition, the delimiters are much less important than the shape itself.
 "A Lisp programmer" ... Sheesh, I am a lisp programmer, man, and I assure you I know what trees of code independent of visual notation are. You're making a point that doesn't need to be made here. It's this kind of phrasing that really turns off people from the lisp community.

I wrote it that way so it's easier for non-lisp programmers to compare with what they're more used to as well.
 That doesn't help them. Explain it like it is. Lisp is different from what they are used to.
 > All indentation does is tell you that there is a hierarchy of some sort.

As I said, it's not just indented, but also laid out. As a Lisp developer I see a LET and then know that a binding+body structural pattern follows. binding+body and variations of it are used in a bunch of operators.

That's a simple sequence:

1) identify the operator on the left

2) it determines the layout pattern

3) visually apply pattern recognition based on the layout pattern and classify the blocks

Lisp is actually quite easy to read, but you have to spend a few days learning the basic structural patterns. The biggest hurdles are mental blocks and prior exposure to other types of syntax systems.

> the coder decides the indentation

He doesn't. Lisp does it. There are indentation rules provided by the development environment. If Lisp does the layout, it uses a complex layout system to adjust the code to the available horizontal space, taking into account the constructs being laid out. That's why Lisp developers have used these indentation and layout tools for decades. Lisp code itself is fully insensitive to formatting. Tools (editor, Lisp, ...) do the indenting and sometimes also the layout for the user.

> Using curly braces, parentheses, and square brackets as an example, you spot them and immediately know the forest-level category something is: code block, function call, or array/structure index.

Lisp uses symbolic names for that. People are extremely good at reading names. Since Lisp names are always at the beginning of a form, there is no guessing/backtracking needed to find the most top-level form. Whereas with languages with infix operators, I would need to parse the whole thing, find the operators, and select the one with the highest precedence.

> With Lisp, all you are guaranteed to have is the function name (first parameter), on which you have to do a mental name-to-purpose lookup, which could involve thousands of names. Splitting the lookup into levels improves the mental lookup performance, at least in my head.

Lisp has not just functions, but also macros and special operators. Function calls have a fixed syntax and there is not much to understand. Special operators and macros usually signal with their names the kind of structure, and thus the layout, expected to follow. For example, there are a bunch of macros which begin with WITH-. Those signal this pattern:

  WITH- items and property list
    BODY

Where 'items and property list' is something like:

  (name item :option1 value1 :option2 value2 ...)

For example, a file will be opened like this:

  (with-open-file (stream file :direction :input)
    (read stream))

I read WITH-OPEN-FILE and know that it is a WITH- type macro. Then I know the basics of the structure of the code. Lisp has a number of these built-in patterns.
 This is interesting, but not something I've really struggled with in Clojure. I generally try to write small functions, at a single level of abstraction, and if I feel there's too much complexity I'll extract part of it out.

It's true that this then results in lots of names. But that's also how I write code in every other language - 90% of reasoning about code is through named variables and functions, and compositions thereof. I'm struggling to picture how this becomes an issue, unless you genuinely don't use abstractions in code at all.
 In particular I’m thinking of architectural astronauts who get excited about a “system” they want to build (or god forbid, already have built) where “everything is a foo”. The hallmarks of these systems include: lots of very self-similar code, often recursive; very low inclusion of domain nouns and verbs, instead substituting a new (impoverished) domain that is full of vague nouns and verbs (like “node”, “execute” or “handle”). In the worst cases, different parts of the code use separate definitions of the vague nouns.

Such people would (and in at least one case that I’ve seen, have) rush to call their system homoiconic. I think because they’re more interested in looking clever than being helpful.

Not only is such code hard to follow, but as one smart person once put it, any time your code uses different concepts than those from the problem domain, there’s a place where impedance mismatches live. Those look like bugs to your users, and are often the hardest to fix.

As the other responder guessed, this is a part of lisps that I’m not overfond of. It’s easier for such frippery to expand to the entire system.
 The alternative to self-similarity at different scales has already been tested and more or less rejected: it's DSLs of the kind Lisp and Smalltalk let you create. The idea being that you build a high-level language out of abstractions, and it levers up the language.

The problem is that a custom language so built is harder for newcomers to the project to get to grips with. If it's a super common problem, and the solution gets popular, it might work out - see e.g. Rails with its DSLs for migrations, routes, etc. But that's a small fraction of problems. Hired developers don't really want to learn your custom thing either, as it reduces their market value.
 How was that idea rejected? Building languages out of abstractions is almost all of what you do when programming. When you create a bunch of functions and group them in a module, you've created a piece of language. An abstraction layer is a language that things above are coded in.

Lisp/Smalltalk DSLs only expand your capabilities here to syntactic abstraction / code generation. But the overall principle is the same.
 > but that both the syntax itself and the representation of the syntax as an AST is simple and obvious.

The PLT literature is full of programming languages with simple and easily serialized syntax. It's not all about LISP or Church's lambda calculus or whatever; there's plenty more out there.

As one example, the bit of Haskell syntax you mention is only "complex and non-obvious" because the Haskell community has yet to internalize the ideas that PLT research has come up with as to how laziness and strictness should be specified, interact, etc. in a programming language (i.e. polarity, focusing, call-by-push-value).
 > the Haskell community has yet to internalize the ideas that PLT research has come up with, as to how laziness and strictness should be specified

Could you explain how those ideas should be used in practice? I'm a Haskell programmer and vaguely familiar with the ideas you mention, but it's not clear to me exactly how they fit into a practical language.
 My experience with DSLs in Rust is that they’re hard to use because you can’t use the types to read the source code, because the DSL can do funny things to the types. I guess powerful macro systems are useful in complex languages, but maybe we shouldn’t rely on them too much?
 This may be missing the point, but PostScript is not only homoiconic, but also point-free!

https://en.wikipedia.org/wiki/Tacit_programming#Stack-based

https://en.wikipedia.org/wiki/Talk%3AHomoiconicity#PostScrip...

https://news.ycombinator.com/item?id=18317280

> The beauty of your functional approach is that you're using PostScript code as PostScript data, thanks to the fact that PostScript is fully homoiconic, just like Lisp! So it's excellent for defining and processing domain specific languages, and it's effectively like a stack based, point free or "tacit," dynamically bound, object oriented Lisp!

https://medium.com/@donhopkins/the-shape-of-psiber-space-oct...

> Interacting with the Interpreter: In PostScript, as in Lisp, instructions and data are made out of the same stuff. One of the many interesting implications is that tools for manipulating data structures can be used on programs as well.

This is the point:
 Maybe that's why I always love both stack-based concatenative languages and point-free Haskell.
 The point of homoiconicity is that your macro language is your language. And your data structure language is your language. As well as your AST language. They are all the same.

So the author is right in observing that the "intermediate ast" and the "ast" share the same concrete syntax, and macros transform one into the other. But macros are also defined in that language. Furthermore, the output of your program is definitely in that language.
 There are two features of Jai I hope turn up in other languages. I'm fascinated by the 'struct of arrays' data pattern (columnar vs row-oriented storage, in effect), but also the ability to declare functions that run at compile time instead of run time, instead of macros.
 > also the ability to declare functions that run at compile time instead of run time, instead of macrosLisp macros are such functions. But you can also declare regular functions to be compiled around macroexpansion time, which makes them available to call during macro expansion. You can use the same functions during macroexpansion/compiling and runtime too, and you can compile new functions at runtime.
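For readers less familiar with Lisp, the idea that a macro is just an ordinary function running at expansion time can be sketched in a few lines of Python (an illustrative toy, not real Lisp machinery; `when` and `expand_when` are made up for the example):

```python
# Toy sketch: a "macro" is an ordinary function that takes code-as-data
# and returns new code-as-data, run during an expansion pass before
# evaluation. For brevity this expander rewrites only the head position;
# a real macroexpander would also walk subexpressions.

def expand_when(expr):
    # (when test body) => (if test body nil), the classic macro example
    _, test, body = expr
    return ["if", test, body, None]

def macroexpand(expr, macros):
    while isinstance(expr, list) and expr and expr[0] in macros:
        expr = macros[expr[0]](expr)
    return expr

code = ["when", ["<", 1, 2], ["print", "yes"]]
print(macroexpand(code, {"when": expand_when}))
# ['if', ['<', 1, 2], ['print', 'yes'], None]
```

The point the comment makes is that in Lisp nothing distinguishes `expand_when` from any other function: the same function can run during macroexpansion, at compile time, or at runtime.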
 > declare functions that run at compile time instead of run time, instead of macros.

Lots of languages have this, including C++.
 Jai does it to a degree that is certainly not common. C++ in particular only has truly guaranteed compile-time execution for constexpr functions in static_asserts and a few other small cases. Jai, on the other hand, can do pretty much anything at compile time.
 constexpr is Turing complete and its only limit is the recursion depth of the compiler, which can be changed. And as of C++17/20, some of the weird restrictions, like not being able to use conditionals, aren't present anymore. Also, if you're a masochist, the template system is Turing complete.
 I was not talking about the Turing completeness of constexpr. There are only a few contexts in which constexpr functions are guaranteed to be evaluated at compile time. This contrasts with Jai's compile-time-only functions.

Additionally, Jai can perform arbitrary IO at compile time, which is simply not possible with C++. One of the early demos had a program which played a videogame at compile time.
 And Zig!
 That's something Forth can do. It's how the language is extended.
 A thread from 2018: https://news.ycombinator.com/item?id=16387222

Discussed at the time (a little, but quite well): https://news.ycombinator.com/item?id=3854262
 Homoiconicity means same representation. Literally, homo means same, and icon means sign or representation: the source code and the data structures are represented the same way, using the same iconography, aka syntax.

What's the point of that? Well, let's see... Why is working with JSON in JavaScript so much better than in most other languages? When the syntax for data structures, code, data serialization, configuration, etc. is exactly the same, it is really harmonious to work with. That's one of the great things about homoiconicity.

The article is saying that macros are the point, and macros are great for sure, and they are one of the points, but homoiconicity is also a great point. It is useful even in non-macro scenarios, and it is even more useful when combined with macros, since it makes writing and reading them that much easier.
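The JSON observation above can be sketched in a few lines. This toy uses Python rather than JavaScript (the values are made up for illustration), but the same harmony holds: the literal you type in source code is essentially the same notation as the serialized wire format, so data, config, and code that builds them all read alike.

```python
import json

# The literal syntax for the data structure...
config = {"host": "localhost", "port": 8080, "tags": ["a", "b"]}

# ...is essentially the serialized form, and round-trips unchanged.
text = json.dumps(config)           # e.g. '{"host": "localhost", ...}'
assert json.loads(text) == config
```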
 Maybe what Lisp needs to make it popular with kids these days is a hip new syntax. Instead of nested in-and-out bubbles like (foo (bar)), it could have nested up-and-down ramps like \foo \bar// or /foo /bar\\, to represent a "change of level". That way you could have positive and negative nesting, and turn programs inside-out!

Y-Combinator:

  \defun Y \f/ \\lambda \g/ \funcall g g// \lambda \g/ \funcall f \lambda \&rest a/ \apply \funcall g g/ a//////

Y-Uncombinator:

  /defun Y /f\ //lambda /g\ /funcall g g\\ /lambda /g\ /funcall f /lambda /&rest a\ /apply /funcall g g\ a\\\\\\
 > "change of level"

The obvious interface for such a concept would be to have matching parentheses "snap" into a kind of token field around the corresponding text. It would be pretty easy to then build a kind of raised relief map of code, built from nested token fields that suggest elevation changes instead of nested parentheses. All in realtime as the user types, of course.

In the history of Lisps, is there such an interface? My gut tells me either a) no, or b) yes, there's a maximally unusable version which was abandoned immediately upon recognition that it would suffice as bait to inculcate newcomers into the cult of Lisp parentheses.

Edit: clarification
 That's a great suggestion. I'll start the wiki!
 Aren't you just exchanging the pair (,) for /,\ or \,/? What real benefits does that imply, apart from aesthetics?
 Purely aesthetics! ;) I wouldn't want to mess up a good thing. You would just have a positive and a negative way of writing the same thing.

You could turn the paren bubbles inside-out like )foo )bar(( but that doesn't look as cool to me as flipping the ramps upside-down like /foo /bar\\ and \foo \bar//.
 It would give the kids something to keep bikeshedding about, and spawn 20 different code style standards and associated autoformatters. All of that pointless, but hopefully, such injection of energy into the ecosystem would help Lisp get more popular.
 Racket's Rhombus will attempt that.
 I wonder what you could say about this foreign dialect of lambda calculus, lambdatalk: http://lambdaway.free.fr/lambdaspeech/?view=lambda or http://lambdaway.free.fr/ where the evaluator is mainly built on a single regexp going back and forth over the code string, directly replacing s-expressions with words. Just read and replace, without parsing.
 TCL is kind of like Lisp with strings instead of s-expressions, if you squint at it right and hold your nose. All evaluation is simply text substitution.

But TCL is not nearly as powerful or efficient as Lisp. TCL's historic advantages (which were unique and important in 1988) are that it's free, easy to integrate with C code, and it comes with a nice user interface toolkit: Tk, which also has a great interactive canvas drawing API.

Tk is nice because it was designed around TCL from day one, which vastly simplified its design, since it didn't suffer from Greenspun's tenth rule like most GUI toolkits do, because it already had half of Common Lisp: TCL.

https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

> Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

Advantages of Tcl over Lisp (2005) (tcl.tk): https://wiki.tcl-lang.org/page/Advantages+of+Tcl+over+Lisp
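The "evaluation is text substitution" model mentioned above is easy to sketch. Here is a toy in Python (nothing like a real Tcl interpreter; the `add` command and the bracket-only grammar are made up for illustration): innermost bracketed commands are repeatedly replaced by their string results until none remain.

```python
import re

# Toy Tcl-flavored evaluator: everything is a string, and "evaluation"
# is just repeatedly substituting [cmd args] with the command's result.
COMMANDS = {"add": lambda a, b: str(int(a) + int(b))}

def substitute(src):
    # Matches an innermost bracketed command: no nested brackets inside.
    pattern = re.compile(r"\[(\w+) ([^\[\]]*)\]")
    while True:
        m = pattern.search(src)
        if not m:
            return src
        cmd, args = m.group(1), m.group(2).split()
        src = src[:m.start()] + COMMANDS[cmd](*args) + src[m.end():]

print(substitute("result is [add 1 [add 2 3]]"))   # result is 6
```

Because the program is itself a string being rewritten, this is also roughly the mechanism the lambdatalk comment above describes, just with brackets instead of s-expressions.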
 Yeah, that's... exactly what homoiconicity means?
 This seems similar to Ant's relationship to XML?
 That Ant is a domain specific language for translating XML into Java stack traces?
 I meant that Ant is built on a generic language for representing data.

But XML isn't a great choice, and JSON wouldn't work well either. S-expressions are popular with Lisp programmers and unpopular with most other people. It seems like there might be some other solution?