Personally I find that LISP syntax removes a layer of complexity by directly exposing the AST to my brain instead of adding a layer of internal parsing.
1 + x * 2 - 3 % x
is longer to decipher than
(% (- (+ (* x 2) 1) 3) x)
which is itself harder than
(-> x (* 2) (+ 1) (- 3) (% x))
But it takes a while to get used to it.
And yes, it really helps with writing macros, but I wouldn't say that's always a good thing. Macros are far from being the alpha and omega of programming, as they add an implicit layer of transformation to your code, making it easier to write but very often harder to read and reason about.
I started working through Crafting Interpreters, building up a language syntax and grammar from scratch. A lot of work and 75 pages of lex/parse logic later, and we now have an AST... that we can debug and inspect by looking directly at its sexp representation.
It was the ah-ha moment for me... why not express the source code directly as that AST? Most languages require lots of ceremony and custom rules just to get here. Sexps are a step ahead (inherently simpler) since they're already parsable as an unambiguous tree structure. It's hard to unsee - reading any non-Lisp language now feels like an additional layer of complexity hiding the real logic.
Much of the complexity and error reporting that exists in the lexer or parser in a non-Lisp language just gets kicked down the road to a later phase in a Lisp.
Sure, s-exprs are much easier to parse. But the compiler or runtime still needs to report an error when you have an s-expr that is syntactically valid but semantically wrong like:
(let ())
(1 + 2)
(define)
Kicking that down the road is a feature because it lets macros operate at a point in time before that validation has occurred. This means they can accept as input s-exprs that are not semantically valid but will become so after macro expansion.
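To make that concrete, here's a minimal sketch in Clojure (the macro name infix is made up for illustration): the reader happily parses (1 + 2) as plain data, evaluating it directly would fail because 1 is not a function, but a macro receives the form unevaluated, before that check, and can rewrite it into something valid.

(read-string "(1 + 2)")   ;; => (1 + 2), a perfectly good s-expression
;; (eval '(1 + 2))        ;; would throw at run time: a number can't be called as a function

(defmacro infix [[lhs op rhs]]
  (list op lhs rhs))

(infix (1 + 2))           ;; expands to (+ 1 2) => 3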
But it can be a bug because it means later phases in the compiler and runtime have to do more sanity checking, and program validation ends up woven throughout the entire system. Also, the definition of what "valid" code is for human readers becomes fuzzier.
> later phases in the compiler and runtime have to do more sanity checking
But they always have to do all the sanity checking they need, because earlier compiler stages might introduce errors, or propagate errors they neglected to check.
> program validation is woven throughout the entire system
Also normal and unavoidable.
Insofar as processing has logical phases and layers, validation aligns with those layers (the compiler driver ensures that input files can be read and have the proper text encoding, the more language-specific lexer detects mismatched delimiters and unrecognized keywords, and so on); combining phases, e.g. building a symbol table on the fly to detect undeclared identifiers before parsing is complete, is a deliberate choice that improves performance but increases complication.
> because earlier compiler stages might introduce errors and propagate errors they neglect to check.
Static analyzers for IDEs need to handle erroneous code in later phases (for example, being able to partially type check code that contains syntax errors). But, in general, I haven't seen a lot of compiler code that redundantly performs the same validation that was already done in earlier phases. The last thing you want to do when dealing with optimization and code generation is also re-implement your language's type checker.
Those rules help reduce runtime surprises though, to be fair. It's not like they exist for no purpose. They directly represent the language designer making decisions to limit what is a valid representation in that language. Rule #1 of building robust systems is making invalid state unrepresentable, and that's exactly what a lot of languages aim to do.
Note that this approach has been reinvented with great industry success (definitions may differ) at least twice - once in XML and another time with the god-forsaken abomination of YAML, both times without the Lisp engine running in the background, which is what actually makes working with ASTs a reasonable proposition. And I'm not what you could call a Lisp fan.
I don't find them to be clearer, with the background of knowing many languages, because now I have to worry about precedence, and I'd better double-check so I don't get it wrong or misread it.
I agree, and this is why, for math expressions that involve more than composing ordinary (non-operator) functions, I like to use a macro that lets me type them in as infix. It's the one case where Lispy syntax just doesn't work well, IMO.
As someone who isn't a trained programmer (and has no background or understanding of lisp) that looks like you took something sensible and turned it into gibberish.
Is there a recommended "intro to understanding lisp" resource out there for someone like myself to dive in to?
The part that is confusing if you don't know Clojure is (->). This is a threading macro, and it passes "x" through a list of functions.
So it basically breaks this down into a list of instructions to apply to x. You will multiply it by 2, add 1 to it, subtract 3 from it, then take the modulus with the original value of x (the value before any of these steps).
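If you want to see the mechanics, here's a sketch of what the threading macro expands to (using mod, since Clojure's modulo function is spelled mod rather than %): each step's result gets spliced in as the first argument of the next form.

(macroexpand-1 '(-> x (* 2) (+ 1) (- 3) (mod x)))
;; => (mod (- (+ (* x 2) 1) 3) x)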
Clojurists feel like this looks more readable than the alternative, because you have a list of transformations to read left to right, vs the nested form above, which you have to read inside-out.
But that nested reading requires looking back and forth between the operator and its operands. The further out you go, the more you have to shift your eyes, and the harder it becomes to quickly jump back to the nesting level you were at on the other side.
'(
1. This is a list: ( ). Everything is a list. Data structures are all lists.
2. A program is a list of function calls
3. Function calls are lists of instructions and parameters. For (A B C), A is the function name, B and C are parameters.
4. If you don't want to execute a list as a function, but as data, you 'quote' it using a single quote mark '(A B C)
5. Data is code, code is data.)
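A tiny illustration of points 2-5, in Clojure syntax (assuming Clojure counts as a Lisp here):

(+ 1 2)          ;; evaluated as a call: + is the function, 1 and 2 are the arguments => 3
'(+ 1 2)         ;; quoted: the same three elements, but treated as data, a plain list
(eval '(+ 1 2))  ;; code is data: hand the quoted list back to the evaluator => 3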
The language-fundamentals part of "Clojure for the Brave and True" (best intro to Clojure book IMO) is excellent (if you consider Clojure a Lisp). I find the author's style/humor engaging.
So the correct S-exp (let's use mod for modulo rather than %):
(+ 1
   (* x 2)
   (- (mod 3 x)))
It's a sum of three terms: 1, (* x 2), and something negated (- ...), where the negated term is (mod 3 x), the remainder of 3 modulo x.
The expression (% (- (+ (* x 2) 1) 3) x) corresponds to the parse
((x * 2 + 1) - 3) % x
I would simplify that first of all by folding the + 1 - 3:
(x * 2 - 2) % x
Thus:
(% (- (* 2 x) 2) x).
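A quick numeric check of that fold (Clojure, using mod for %, and picking x = 7 as an arbitrary example):

(mod (- (+ (* 7 2) 1) 3) 7)  ;; => 5
(mod (- (* 2 7) 2) 7)        ;; => 5, same result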
Also, in Lisps, numeric constants include the sign. This is different from C and similar languages where -2 is a unary expression which negates 2: two tokens.
So you never need this: (- (+ a b) 3). You'd convert that to (+ a b -3).
That way, trailing constant terms in formulas written in Lisp don't need extra brackets for a - function call.
In real Lisp code you'd likely indent it something like this:
(%
  (-
    (+ (* x 2)
       1)
    3)
  x)
This makes the structure clearer, although it's still wasteful of space, and you still have to read it "inside-out". The thread macro version would be:
(-> x
    (* 2)
    (+ 1)
    (- 3)
    (% x))
It's more compact, there's no ambiguity about order-of-operations, and we can read it in order, as a list of instructions:
"take x, times it by 2, add one, subtract 3, take modulus with the original x".
It's pretty much how you'd type it into a calculator.
For what it's worth (speaking only for myself), I could not live without the threading macros (-> and ->>) in Clojure. Below is an example of some very involved ETL work I just did. For me this is very readable, and I understand if others have other preferences.
(defn run-analysis [path]
  ;; load data from converted arrow file
  (let [data (load-data path)]
    (-> data
        ;; Calc a Weeknumber, Ad_Channel, and filter Ad_Channel for retail
        add-columns
        ;; Agg data by DC, Store, WeekNum, Item and sum qty and count lines
        rolled-ds
        ;; Now Agg again, this time counting the weeks and re-sum qty and lines
        roll-again)))
> In real Lisp code you'd likely indent it something like this:
Not only would that not be idiomatic, but the operator for modulus in Common Lisp is mod, not %, and the brackets you and the parent used in the s-expr are around the wrong groups of symbols. So you're more likely to see:
Nobody said it had to be Common Lisp. I'm going by the notation the grandparent commenter used. My point was that indentation can clarify the structure of nested sexps vs putting them on one line. And that is actually what people do. "mod" vs "%" has nothing to do with it. This isn't even really about arithmetic; those are just at-hand examples the GP commenter chose. It could just as well have been
(foo
  (bar
    (baz (bax x 2)
         "hello")
    "world")
  "!")
> the brackets you and the parent used in the s-expr are around the wrong groups of symbols
No they're not. Yours is wrong. Multiplication has higher precedence than addition, so the order of evaluation begins with (x * 2), not (1 + x).
Interestingly, the second form is just infix notation where every operator has the same precedence and is thus evaluated left to right. That says to me that it's not infix notation that's inherently weird, but rather the operator precedence rules of mathematical infix that are weird.
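A made-up toy example of that difference in Clojure: the threaded form behaves like a precedence-free calculator, applying each step strictly in order, while conventional infix rules would group differently.

(-> 5 (+ 1) (* 2))  ;; => 12, i.e. (5 + 1) * 2, strictly left to right
;; with the usual infix precedence, 5 + 1 * 2 would instead be 7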
> that looks like you took something sensible and turned it into gibberish
This is the main thing I use Lisp (well, Guile Scheme) for. I used to use bc for little scratch pad calculations, now I usually jump into Scheme and do calculations. I don't recall if I thought it looked like gibberish at first but it's intuitive to me now.
Unfortunately our brains are broken by PEMDAS and need clear delineations to not get confused; this syntax also extends to multiple arguments and is amenable to nesting.
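For instance (plain Clojure, nothing beyond core functions assumed), prefix operators take any number of arguments and nest without any precedence rules:

(+ 1 2 3 4)           ;; => 10
(* (+ 1 2) (- 10 4))  ;; => 18, the grouping is explicit so there's no PEMDAS to remember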
When I learned APL, the expression evaluation order at first seemed odd (strictly right to left with no operator precedence, so 5*5+4 evals to 45, not 29). After working with it a couple of hours I came to appreciate its simplicity, kind of like the thread operator in your last example.
For writing a program, the s-expression form might become:
(+ (* 2 (^ x 3))
   (^ x 2)
   (- (* 5 x)))
Whereas:
2*x^3 +
x^2 -
5*x
Would probably error out in most languages, due to parsing issues and ambiguity. The ambiguity is even worse if you put the signs at the front of each line, since then every line could be a complete expression by itself.
It might do the wrong thing in some languages but wouldn't necessarily raise a compiler error, and I'm fairly certain e.g. sympy should have no issue with it.
That's how my brain feels. It connects information (compound terms) to entities directly; it's close to the minimum information required to represent something, unlike ALGOL-based languages.