Format is awesome; it is a domain-specific language for printing things. It can walk across lists, print numbers in all sorts of radixes, make tables of data, and has its own conditional directives. I don't get the hate. It is not lispy at all in style, but isn't making a DSL just about the most lispy thing you could do?
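For readers who have not used it much, here is a small sketch of the features mentioned above, using only standard directives. The argument values are made up for illustration.

```lisp
;; Walk across a list; ~^ stops before the trailing separator.
(format nil "~{~a~^, ~}" '(1 2 3))        ; => "1, 2, 3"

;; Radix directives: binary, octal, hexadecimal.
(format nil "~b ~o ~x" 10 10 10)          ; => "1010 12 A"

;; ~r prints cardinal English; ~@r prints Roman numerals.
(format nil "~r / ~@r" 42 4)              ; => "forty-two / IV"

;; Conditional: ~:[ picks the first clause for NIL, the second otherwise.
(format nil "~:[no~;yes~]" t)             ; => "yes"
```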
> isn't making a DSL just about the most lispy thing you could do?
Yes, but with great power comes great responsibility. The problem with FORMAT and LOOP is not that they are DSLs, but rather that they are badly designed DSLs. The reasons are different in each case.
FORMAT is badly designed because it is write-only, kind of like Perl regexps. It is very, very hard to debug a complex FORMAT string.
LOOP, by way of very stark contrast, is very readable, even more so than regular Lisp. But it is badly designed because it is chock-full of non-orthogonal constructs. For example, WITH is completely equivalent to a set of LET bindings outside the loop, so it doesn't actually add any functionality. All it does is give you a new non-lispy syntax for creating external bindings. The semantics of FOR depend on what comes after. "for x =" does something completely different from (for example) "for x in". And the list of problems with LOOP goes on and on and on.
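A minimal sketch of the two points above, with made-up data: the clause after FOR changes what FOR means, and WITH behaves like a LET wrapped around the loop.

```lisp
;; FOR ... IN steps through the elements of a list.
(loop for x in '(1 2 3) collect x)                  ; => (1 2 3)

;; FOR ... = (without THEN) re-evaluates the form on every iteration.
(loop for i below 3
      for x = (* i i)
      collect x)                                    ; => (0 1 4)

;; WITH binds once before the loop starts ...
(loop with total = 0
      for x in '(1 2 3)
      do (incf total x)
      finally (return total))                       ; => 6

;; ... which gives the same result as an enclosing LET.
(let ((total 0))
  (loop for x in '(1 2 3) do (incf total x))
  total)                                            ; => 6
```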
So the problem is not that DSLs are bad, the problem is that these DSLs are bad. But since they are part of the standard, we are stuck with them.
The idea is that the WITH variables are local to the LOOP construct. If one leaves the LOOP, the variables are gone. With an outside LET this would not be the case. The LOOP thus gives scope for all lexical variables inside of it.
Additionally the LOOP sees the WITH definition in standard Common Lisp. It does not see any outside LET bindings, unless the Common Lisp would provide a feature to ask the current environment. This allows the LOOP to, for example, see a type declaration and provide a default value for the common cases.
(let (a) ; a is initialized to NIL
  (declare (type integer a))
  ....
  (loop repeat 10 do (incf a))
  ...
  a)
vs.
(loop with a fixnum ; a is initialized to 0
      repeat 10 do (incf a)
      finally (return a))
It provides a compact notation with ONE scoping construct, the LOOP macro.
Similarly for the LOOP name. In the same way you could argue that this is just a named BLOCK around the LOOP construct. But, again, the name inside the LOOP makes it clear that this is the name of this LOOP construct.
> The idea is that the WITH variables are local to the LOOP construct. If one leaves the LOOP, the variables are gone. With an outside LET this would not be the case.
Of course it would. This:
(loop with a = ...)
is exactly equivalent to:
(let ((a ...)) (loop ...))
and so you can e.g. capture with-bindings in closures:
(funcall (loop with a = 123 return (lambda () a))) ==> 123
> Additionally the LOOP sees the WITH definition in standard Common Lisp. It does not see any outside LET bindings, unless the Common Lisp would provide a feature to ask the current environment. This allows the LOOP to for example see a type declaration and provide a default value for the common cases.
I think it is highly questionable whether providing implicit typed default values is actually a useful feature, but assuming for the sake of argument that it is, why not just make a general binding construct that does this? If it's useful, it should be useful (and usable) everywhere, not just inside a loop.
> you could argue that this is just a named BLOCK around the LOOP construct
Indeed you could. :-)
> the name inside the LOOP makes it clear that this is the name of this LOOP construct.
Yes, but why is that useful? How is (loop named foo ...) any better than (block foo (loop ...)) ? And if it is better, then why not (progn named foo ...) ? (tagbody named foo ...) ?
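For comparison, a small sketch of the two spellings side by side. The name OUTER and the search condition are made up for illustration; both forms leave the nested loops the same way.

```lisp
;; NAMED gives the loop's implicit block a name for RETURN-FROM.
(loop named outer
      for i below 3
      do (loop for j below 3
               when (= (+ i j) 3)
                 do (return-from outer (list i j))))    ; => (1 2)

;; The same thing with an explicit BLOCK around the loop.
(block outer
  (loop for i below 3
        do (loop for j below 3
                 when (= (+ i j) 3)
                   do (return-from outer (list i j))))) ; => (1 2)
```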
As I said LOOP is designed such that names, local variables, local iteration variables are all defined INSIDE the LOOP, such that LOOP can process them and that they are all using the LOOP syntax.
(loop with a = ...) may be equivalent to (let ((a ...)) (loop ...))
but
(let ((a ...)) ... (loop ...) ...) is not equivalent to (loop with a = ...)
The difference is that in the latter it is not clear to the Lisp developer that A is meant as a LOOP local variable.
That's a design choice, not bad design.
Historically LOOP was named FOR and was only a tiny part of CLISP (Conversational Lisp) in Interlisp. There CLISP had a different syntax than normal Lisp, including iteration, binding, infix operations, etc. The idea of such a LOOP macro was then brought to Maclisp, ZetaLisp and, eventually, ANSI Common Lisp.
> The difference is that in the latter it is not clear to the Lisp developer that A is meant as a LOOP local variable.
That's because in the latter example, A is NOT a loop-local variable. If you write:
(let ((a ...))
  (some-form))
then it is clear that a is local to (some-form) regardless of what (some-form) actually is. If you write:
(let ((a ...))
  (some-form)
  (some-other-form))
then obviously it is not clear whether a is local to (some-form) or (some-other-form) or both.
But so what? If it matters that this be clear, just don't put multiple forms inside the body of your LET, and then it is clear. You don't need to design a whole new language to solve this non-existent problem.
> That's a design choice, not bad design.
It introduces a lot of additional complexity, including a syntax that is radically different from anything else in the language, and confers very little benefit. If that is not diagnostic of a bad design I can't imagine what would be.
> You don't need to design a whole new language to solve this non-existent problem.
The designers thought that a whole new language would be useful. One may disagree with it. I would have preferred something like the ITERATE macro, but it has exactly the same idea of ENCLOSING all of these LOOP clauses, but with more parentheses.
(iterate
  (with a = 1)
  (incf a)
  (when (> a 10)
    (return a)))
is
(loop with a = 1
      do (incf a)
      when (> a 10)
        return a)
If one cares about radically different syntax then you wouldn't use LOOP at all. (loop for i in l do (print i)) is equivalent to (dolist (i l) (print i)) which is equivalent to something with a LET/TAGBODY... construct. Why not just use the latter? Maybe its design choices made it practical enough?
> The designers thought that a whole new language would be useful.
Yes, obviously they thought this. They were wrong.
> ITERATE
Iterate is a slight improvement over loop because at least it doesn't require a whole new editor mode to format it properly, but simply adding parens to loop doesn't really fix it. The lack of parens is the least of LOOP's problems.
I'll give you two more examples of problems with loop. This is still nowhere near an exhaustive list.
1. There is no way to iterate over a sequence. You can iterate over a list, or you can iterate over a vector, but you cannot efficiently iterate over something that is either a list or a vector. You have to write something like:
(typecase seq
  (cons (loop for item in seq ...))
  (vector (loop for item across seq ...)))
which forces you to write all of the LOOP code twice. Worse, you have to actually duplicate all the code represented by the ellipses. You can't abstract it away in a function or a macro.
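For contrast, outside of LOOP the standard sequence functions accept either kind of sequence. A minimal sketch, where SUM-SEQUENCE is a hypothetical helper and not anything from the thread; it says nothing about LOOP itself or about the efficiency concern raised above.

```lisp
(defun sum-sequence (seq)
  "Sum the elements of SEQ, which may be a list or a vector."
  (let ((total 0))
    ;; MAP with a NIL result type iterates purely for side effects.
    (map nil (lambda (item) (incf total item)) seq)
    total))

(sum-sequence '(1 2 3))   ; => 6
(sum-sequence #(1 2 3))   ; => 6
```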
2. LOOP is not extensible. There are a lot of things I might want to be able to loop over, like streams, but I can't. There are a lot of control constructs I might want to embed in a loop, like with-open-file, but I can't. Instead, I have to resort to manually writing idioms like:
(loop
  with stream = (open path)
  with eof-marker = (gensym "EOF")
  for thing = (read stream nil eof-marker)
  while (not (eq thing eof-marker))
  ...
  finally (close stream))
and again, because of the non-orthogonal syntax, I can't abstract all that away into a macro either. I have to write it all out manually every single time I want to iterate over the contents of a file.
(And after all that it doesn't even do the right thing if the loop code signals a condition!)
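One way around the cleanup problem is to put the control construct outside the LOOP rather than inside it. A sketch, assuming a hypothetical helper READ-ALL-FORMS; the string stream stands in for a file so the example runs on its own.

```lisp
(defun read-all-forms (stream)
  "Collect every form READ can produce from STREAM until end of file."
  (loop with eof-marker = (gensym "EOF")
        for thing = (read stream nil eof-marker)
        until (eq thing eof-marker)
        collect thing))

;; With a file, WITH-OPEN-FILE closes the stream even on a non-local exit:
;; (with-open-file (stream path) (read-all-forms stream))

(with-input-from-string (stream "1 (2 3) foo")
  (read-all-forms stream))            ; => (1 (2 3) FOO)
```

This still does not let LOOP itself iterate over a stream; it only moves the opening and closing out of the loop clauses.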
> I don't see how it's just a "slight improvement"
I'd probably have to write a whole blog post to explain why I think iterate is only a slight improvement. But the TL;DR is that IMHO if you are writing code that uses a lot of the features of iterate or loop that is an indication that you are doing something wrong.
To cite but one example: both loop and iterate include constructs for collecting values. But collecting values has nothing to do with iterating or looping. It should be a separate construct. The right way to collect values is something like:
(with-collector collect
... (collect value) ...)
Now you can collect values whether or not you are looping, and regardless of what iteration construct you decide to use. You don't need special constructs for conditional behavior. So, for example, you could do this:
(with-collector collect
  (dotimes (i 100)
    (if (primep i) (collect i))))
to get a list of primes under 100.
See https://github.com/rongarret/ergolib for an implementation of WITH-COLLECTOR and lots of other constructs that are IMHO the Right Way to write code.
Loop is fine for simple examples like this. But I am currently maintaining a code base that has LOOPs with dozens and dozens -- sometimes a few hundred -- clauses. A single LOOP can extend over multiple pages. It's a nightmare.
Note that LOOP can fail even for simple examples. Suppose I have a list of lists of numbers and I want to collect all the prime numbers. With WITH-COLLECTOR I can do this:
But with LOOP I can't because there is no way for an inner loop to collect into a collector bound in an outer loop. I have to collect the individual sub-lists and then append them, or something like that, which is both inelegant and inefficient.
And if I have a tree of items which I want to walk over and collect all of the ones satisfying a predicate, LOOP just doesn't handle that at all. But by separating collection from iteration it becomes trivial:
Neither WITH-COLLECTOR nor DO-TREE is part of CL, of course, but writing them is an elementary exercise (and both are part of ergolib if you really don't want to be bothered).
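As a rough idea of what that exercise might look like, here is a sketch. These definitions are assumptions written for illustration, not ergolib's actual implementations; WALK-LEAVES and PRIMEP are hypothetical helpers, and the tree is taken to be a plain cons tree.

```lisp
(defmacro with-collector (name &body body)
  "Run BODY with NAME bound to a local function that collects its argument.
Returns the collected values in the order they were collected."
  (let ((items (gensym "ITEMS")))
    `(let ((,items '()))
       (flet ((,name (value) (push value ,items) value))
         ,@body)
       (nreverse ,items))))

(defun walk-leaves (function tree)
  "Call FUNCTION on every non-NIL atom in the cons tree TREE."
  (cond ((null tree) nil)
        ((consp tree)
         (walk-leaves function (car tree))
         (walk-leaves function (cdr tree)))
        (t (funcall function tree))))

(defmacro do-tree ((var tree) &body body)
  "Evaluate BODY with VAR bound to each leaf of TREE in turn."
  `(walk-leaves (lambda (,var) ,@body) ,tree))

(defun primep (n)
  "Trial division; adequate for small illustrative inputs."
  (and (> n 1)
       (loop for d from 2 to (isqrt n)
             never (zerop (mod n d)))))

;; An inner loop collecting into a collector bound around the outer loop.
(with-collector collect
  (dolist (numbers '((1 2 3) (4 5) (6 7)))
    (dolist (n numbers)
      (when (primep n) (collect n)))))     ; => (2 3 5 7)

;; Walking a tree and collecting the leaves that satisfy a predicate.
(with-collector collect
  (do-tree (leaf '(1 (2 (3 4)) ((5))))
    (when (oddp leaf) (collect leaf))))    ; => (1 3 5)
```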
There are lots of things in the Common Lisp standard, which are not extensible, but where implementations and libraries provide extensible versions. Example: the CLOS MOP, sequences, hash tables, ...
Pretty much all of those are extensible, especially given that a) they are provided in source code and b) those LOOP macro implementations have internal extension features. The paper I've linked gives an example how to extend LOOP in SBCL for iteration over sequences.
ITERATE is nice, except for one very serious blemish: the accumulation into variables requires a code walker, because the variables are not declared at the top of the iterate form. Macros that require code walkers do not play nicely with other kids. Macrolet/symbol-macrolet, for example, can fail inside iterate.
I don't see a lot of value in "X is just Y and Z, why use X?" arguments. Clearly it has enough ergonomic benefit that people like using it.
I definitely think the design of the DSL is a little chaotic, not to mention difficult to remember, and having two entirely different DSLs implemented by the same macro is downright ridiculous. But it's definitely good enough at its job and better than not having it at all. If it didn't exist, people would probably be DIYing their own (even worse) versions of it, without the assurance of it being part of the specification and ostensibly a well-tested part of any implementation.
It's still lisp right? You can just do your "(out ...)" like the 2003 guy suggests and not use the format form if you don't like, why get rid of it for everyone else?
And sharing lisp code, is that a thing people do? (This is half an insult, half a joke.) So you can easily enforce your own ideals in your own project.
I'm not advocating getting rid of format, or even loop (which is by far the greater of the two evils). I'm pointing them out as cautionary tales for future DSL designers.
Agreed. People give a lot of hate to FORMAT and LOOP, but I think with both of them for the domains they're working in, it's kind of a natural progression of the evolution of a DSL (which as you said is very lispy since reader macros are one of the biggest strengths of CL) that you'll end up with something like FORMAT or LOOP that handles a lot of common use cases for string formatting or iteration as tersely as possible. It's inherently less flexible than pure s-exps (nothing is as flexible), but IME solves most problems where you need string formatting or iteration. Plus it's part of the ANSI CL spec, and sticking as close to the spec as possible and being conservative with external dependencies is usually a good thing in CL.
I feel like avoiding the use of LOOP could be a Master Foo koan.
An ambitious young student of Lisp came to Master Foo, seeking to deepen her knowledge and understanding.
One day, during their studies and practice, the student stopped and said, "Master, I am troubled. In Lisp we have minimal syntax, and we use S-expressions to convey structure, and this syntactic uniformity gives us great power to build our own composable syntax constructs. And yet we have LOOP in the standard, which is an ad-hoc imitation of the same arbitrary syntax of other languages that we use Lisp in order to avoid!"
The next day, the Master brought the student on a hike into the hills. As the sun was setting late in the afternoon, they reached a small cabin near the summit of a hill. A village of huts was visible in the valley below them. "Observe that this building is constructed from many small pieces of wood, each nearly uniform in size and shape. With those pieces I long ago built a hut that resembles the shoddy huts of the village, yet this one is both strong and easy to modify. You could even build it yourself from plans." The student observed and agreed, and was very impressed, but was also confused. "Master, what does this teach me about Lisp?" she asked. Master Foo continued, "You shall stay here on the mountain tonight. Return to me in the morning."
With that, the Master drew a key from his robe, firmly locked the door of the cabin, and hiked away. As he did so, a cold rain and steady wind began to blow. The student spent the night cold and damp on the ground beside the cabin, and awoke enlightened.
The loop macro is impressive for many of its esoteric keywords. Maximizing, collect, append, etc. There is also the for/then construct that I find much easier to work with than alternatives for the same behavior.
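A few of those keywords in action, as a quick sketch with made-up inputs:

```lisp
;; MAXIMIZE keeps the largest value seen.
(loop for x in '(3 1 4 1 5) maximize x)            ; => 5

;; FOR ... = ... THEN gives an initial value and a step expression.
(loop for x = 1 then (* 2 x)
      while (< x 100)
      collect x)                                   ; => (1 2 4 8 16 32 64)

;; APPEND splices each list into the result.
(loop for row in '((1 2) (3) (4 5)) append row)    ; => (1 2 3 4 5)
```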
I have no doubt much of this can be done with the racket constructs. But I still find I can more easily read loop usages I've written in the past. Whereas many similarly dense for/folds are opaque within a few minutes for me. :(
Well, the point is that it could have been an actual macro instead of a function that processes a syntactically opaque control string.
I do appreciate the convenience and brevity of FORMAT as it currently exists, but I also think it would be pretty cool to have a proper S-expression macro that expands to a FORMAT control string. It would be analogous to a macro that generates SQL or HTML.
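To make the idea concrete, here is a toy sketch of such a macro. Everything in it is hypothetical: the name FMT, the helper functions, and the tiny keyword vocabulary (:a, :s, :newline, :stop, :each) are invented for illustration and cover only a sliver of what FORMAT can do.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun escape-tildes (string)
    "Double every tilde so STRING is printed literally by FORMAT."
    (with-output-to-string (out)
      (loop for ch across string
            do (when (char= ch #\~) (write-char #\~ out))
               (write-char ch out))))

  (defun control-piece (form)
    "Translate one element of the S-expression notation to directive text."
    (etypecase form
      (string (escape-tildes form))
      (keyword (ecase form
                 (:a "~a")
                 (:s "~s")
                 (:newline "~%")
                 (:stop "~^")))
      (cons (ecase (first form)
              (:each (concatenate 'string
                                  "~{" (control-pieces (rest form)) "~}"))))))

  (defun control-pieces (forms)
    "Concatenate the translations of all FORMS into one control string."
    (apply #'concatenate 'string (mapcar #'control-piece forms))))

(defmacro fmt (destination pieces &rest args)
  "Expand PIECES into a FORMAT control string at macroexpansion time."
  `(format ,destination ,(control-pieces pieces) ,@args))

;; Expands to (format nil "Items: ~{~a~^, ~}" '(1 2 3)).
(fmt nil ("Items: " (:each :a :stop ", ")) '(1 2 3))   ; => "Items: 1, 2, 3"
```

Because the translation happens at macroexpansion time, the running code still calls plain FORMAT with a constant control string.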
Common Lisp provides an explicit macro for this called formatter. The macro produces a function. The format function allows a functional argument in place of the format string for this reason.
(format t (formatter "~a-~b") x y)
is like
(format t "~a-~b" x y)
except that "~a-~b" is transformed into a lambda which takes a stream and two arguments, and which does the specified formatting.
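A small sketch of both ways to use the resulting function; the arguments 5 and 2 are arbitrary (note that ~b prints its argument in binary).

```lisp
;; Passing the function to FORMAT in place of a control string.
(format nil (formatter "~a-~b") 5 2)          ; => "5-10"

;; Calling the function directly with a stream and the arguments.
(with-output-to-string (stream)
  (funcall (formatter "~a-~b") stream 5 2))   ; => "5-10"
```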
There are so many corners of the spec that I have not yet explored, so I always appreciate trivia like this. However I was thinking more like the macro that the article was proposing.
The point of fp is not to get rid of effects, but to make them explicit and better controllable and do that in a consistent way, not on a case by case decision.
I think that's still pretty well-understood. The real draw to FP in general is you reduce the globally modifiable state to be the things that are truly global to an application. For web applications, this may be a database. For systems applications it may be a filesystem.
Richard P. Gabriel (one of the co-creators of Common Lisp, later a founder of the Lisp vendor, Lucid) has a few interesting things to say about CL's FORMAT in his Patterns Of Software, starting on page 101:
> What are the trade-offs? Format strings don’t look like Lisp, and they constitute a non-Lispy language embedded in Lisp. This isn’t elegant. But, the benefit of this is compact encoding in such a way that the structure of the fill-in-the-blank text is apparent and not the control structure.
IMO McDermott's OUT macro has precisely the drawbacks Gabriel predicts. While McDermott seems to think it's an advantage that
> we no longer have to squeeze the output data into a form intelligible to format, because we can use any Lisp control structure we like
his example PRINT-XAPPING basically duplicates the same logic for extracting data from the xapping structure as Steele's FORMAT call, but buries the data extraction in control structure alongside constant strings that go into the output.
And that's where I think FORMAT is really a win: a FORMAT control string deliberately separates the control flow and constant text from the data extraction logic. Presumably you could write functions that do the same thing using McDermott's OUT macro, but they'll be more verbose and no more enlightening; what's the point?
Does it really make that much difference if comments exist "inside" the embedded sublanguage vs. "outside"? After all, you can always construct a format specification by string concatenation, and in CL, you can trick the parser into doing this for you:
(format stream
        #.(concatenate 'string
                       ;; comment about the first piece
                       <piece 1>
                       ;; comment about the second piece
                       <piece 2>)
        ...)
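A runnable instance of that pattern, with made-up pieces and arguments. It relies on *READ-EVAL* being true (the default) so that #. is evaluated at read time.

```lisp
(format nil
        #.(concatenate 'string
                       ;; how many items were processed
                       "~d item~:p"
                       ;; separator and error count
                       ", ~d error~:p")
        3 1)                                   ; => "3 items, 1 error"
```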
Additionally, for the record, CL's FORMAT is sufficiently hairy that you can achieve the effect of in-band comments if you wanted to:
(format t "~
~0{ Because the previous line ended with a tilde,
the following newline is ignored. All of this text
occurs inside an iteration construct that loops
zero times, so will not be output. However, it
will consume one element from the list of arguments,
so after this loop, we'll use tilde-colon-asterisk
to back up one element, and let what comes next
decide how to format it. So this is more or less
a comment inside a CL FORMAT string.
~}~:*~
~S" 'Foo)
I think what I'd really like is some way to expand format strings into McDermott-style code (and ideally vice versa).
I think CL-PPCRE gets this correct for a similar domain: regex strings. The library is not pedantic about whether you provide the compact string or an expanded nice version.
Out of curiosity, how often do you find yourself using CL-PPCRE's S-expression notation? (This is a genuine question: I've never felt a desire for an S-expression notation for regular expressions, so I'm curious what I'm missing out on.)
Anyhow, while it's certainly possible to parse FORMAT control strings into S-expressions, ISTM that if you want them to be invertible back into FORMAT strings, you'll end up with control structure and constant strings being contained within the S-expression, with data extraction as a separate concern. IOW, you won't get McDermott's preferred style of interwoven control, data extraction, and constant strings. For instance, you could have this FORMAT control string
I never use it for regex, because when using CL, I tend to be authoring the regex from scratch. But I found it elegant and thought it might be useful to interpret someone else's hairy regex. But the reason I mentioned it, was because I thought it might be more useful to me for FORMAT... just because decades of Perl 5 taught me regex really well, but I don't get to use CL's FORMAT syntax every week. ;-)
"Lisp is a syntactically extensible language, meaning that it is quite easy, using macros, to create arbitrary language extensions, so long as they obey two basic rules: (1) A new statement must look like (op ...), where "..." has balanced parentheses; (2) the lexical conventions inside the new statement must be Lisp's (e.g., more characters (including '*', '+', and such) are ordinary symbol constituents, in contrast to their role in other languages, so adjacent symbols must be separated by whitespace; double quote starts a string; single quote, sharpsign, and a few other characters have special meanings). If you're used to Lisp, these rules are barely noticeable, so that Lisp hackers come to think of it as having the most flexible syntax in the world."
I'm tired of reading about lisp, but I always do it. This gem is the sort of thing I'm after: an explanation of why (specifically) some folks just can't seem to give it up. I admit I still find the language incomprehensible. But at least I understand why others don't. It's a start.
I would kill for the opportunity to work on Lisp full time (well, maybe not kill, but you get the level of motivation).
The reason for it is in your quote - extremely regular syntax. No operators, no precedence, no special syntax forms - just a bunch of lists with symbols. I didn't realise how important it is until I started writing macros in Elixir and realised they are not quite "native", even if really powerful.
You can use reader macros to add any operators or special forms you want to Common Lisp. Things can get especially crazy if you use a reader-macro to override '(
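A minimal sketch of a reader macro, assuming a made-up bracket syntax where [a b c] reads as (list a b c). The readtable is copied first so the change stays local to the example.

```lisp
(defun read-bracket-list (stream char)
  "Read forms up to the closing bracket and wrap them in a call to LIST."
  (declare (ignore char))
  (cons 'list (read-delimited-list #\] stream t)))

(let ((*readtable* (copy-readtable)))
  (set-macro-character #\[ #'read-bracket-list)
  ;; Give ] the same reader function as ) so it terminates tokens.
  (set-macro-character #\] (get-macro-character #\)))
  (eval (read-from-string "[1 2 (+ 1 2)]")))    ; => (1 2 3)
```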
What surprises me is how lisp is such an obvious delight for some at first sight, while it's pure hell for most. There was no need for an explanation about beneficial syntactic reduction, extension and reuse; it just massages a part of the brain that screams "Yes please".
There is a night-and-day difference between those two, for you and me. Now imagine you have the hypothetical cognitive disorder which I have christened "dyslispia". Maybe to someone with dyslispia, there isn't much difference between these two; their brain doesn't process and "auto-complete" the enclosure hint of the shape of the parentheses. All they see is noise. Those people are helped by dimming the color or using some indentation-only notation.
I also observe the curious effect that if we swap the shapes of the parentheses, I can still train myself to see the closure by rewiring my cognition to see the parentheses as pointing toward an interior in the convex direction:
Whatever the reason, this is not as bad as ^ and $. It helps if I imagine this in 3D, and pretend that the ))s nil(( is pushed into a pillow, sorta thing.
If dyslispia is real, it's basically a form of dyslexia; it's a fundamental brain-wiring problem for which there is no cure. No amount of explanation about Lisp will fix it.
In general, to enjoy working with Lisp, the raw experience as such, it probably helps to be well-sighted (no visual impairment) and to have no dyslispia-like cognitive impairment. My remarks here are mainly about those who report persistent difficulties with Lisp, but who do not mention any visual impairment.
I've read that recently (maybe it was a previous comment of yours) and it's interesting that the geometry of a glyph matters. That said, a ( ... ) has a circularity that meshes well with the brain's notion of 'defined' 'finite' 'circled' 'closed' 'wrapped'. Which is often a desirable property. Any entity in lisp is simply enclosed.
Any sequence is simply separated by space.
Having syntactic genericity for generic trees is quite a free meal IMO.
I feel like having all the options you'll want available is the best practice. I feel like python people went from liking long concatenations with infix arguments in python2 print statements, to str.format and friends in early python3, to liking long concatenations all over again with f-strings. And now we get the joy of having long conditional expressions in the middle of a string, rather than having them assigned to a variable before the string, as god intended.
Really, the right answer is having options for string interpolation (as long as they are safe (that is not perl)) and let programmers choose because string interpolation is just another one of those bikeshedding things that developers change their opinions on as the seasons change.
I think they meant string interpolation, not format. SNOBOL looks like a very early example, not sure it's the earliest. Though the example there looks more like a demonstration of, I guess, automatic concatenation versus what most people think of as interpolation ("foo {bar} baz" or "foo $bar baz" or similar).
You might really enjoy Zsh. It's a lot like the best of Ksh and Bash but better, because it has sensible word splitting behavior (disabled by default!) as well as both sequential and associative arrays. I use it for a lot of scripts.
1> (let ((words '#"how now brown"))
(put-line `@(+ 2 2) --- @{words ","}, she said`))
4 --- how,now,brown, she said
t
2> (put-line (pic "0,###,###,###.## <<<<<<< >>>>>>" 1234567.93 "xx" "yy"))
0,001,234,567.93 xx yy
t
3> (format t "~0,8x\n" 12345)
3039
t
Probably slow compared to format, but more easily extensible. The slowness could be fixed for the included formatters with something like CL's compiler macros.
There is always a question of format-strings (or format-string like) and native syntax with the manipulation of parameters to be printed in native code. This is the same as C++ iostream vs printf.
Meanwhile Scheme has at least two SRFIs for format strings!
As I suggested elsewhere in this thread, IMO the ideal scenario would be to have both the string and sexpr "interfaces". Since we are stuck with the string version, it would be nice to have a macro that expands to a valid control string. Not unlike how you'd write a macro to generate SQL or HTML.