This article is somewhat puzzling for me. On one hand, the OP clearly knows Clojure very well. The disadvantages of laziness are real and well described.
On the other hand, though, this sounds like a theoretical/academic article to me. I've been using Clojure for 15 years now, 8 of those developing and maintaining a large complex SaaS app. I've also used Clojure for data science, working with large datasets. The disadvantages described in the article bothered me in the first 2 years or so, and never afterwards.
Laziness does not bother me, because I very rarely pass lazy sequences around. The key here is to use transducers: that lets you write composable and reusable transformations that do not care about the kind of sequence they work with. Using transducers also forces you to explicitly realize the entire resulting sequence (note that this does not imply that you will realize the entire source sequence!), thus limiting the scope of lazy sequences and avoiding a whole set of potential pitfalls (with dynamic binding, for example), and providing fantastic performance.
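For example (a minimal sketch, not from the article; xf is just an illustrative name):

;; A reusable transformation that knows nothing about its source:
(def xf (comp (filter odd?)
              (map #(* % %))
              (take 5)))

;; The five-element result is realized eagerly, but `take` stops the
;; reduction early, so the million-element source is never fully walked:
(transduce xf + 0 (range 1000000))
;; => 165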
I do like laziness, because when I need it, it's there. And when you need it, you are really happy that it's there.
In other words, it's something I don't think much about anymore, and it doesn't inconvenience me in any noticeable way. That's why I find the article puzzling.
> Laziness does not bother me, because I very rarely pass lazy sequences around.
Sounds like that is getting at the point the article is making: the best way to use lazy sequences is not to. Lazy sequence bugs make for a miserable experience. Clojure already has an onboarding problem where every new learner has to discover all the obscure dos and don'ts and go through the lessons of which parts of the language are more of a gimmick vs the parts that do real work. Attempting to do tricks with lazy sequences is part of that, but it is polite to warn people before they try rather than when they get to Stack Overflow after hours of head-to-desk work.
Although I will put in a small plug for lazy sequences, because they work well in high-latency, I/O-bound situations like paged HTTP calls or reading DB collections from disk. When memory gets tight it can be helpful to be processing partially realized sequences. But the (map println [1 2 3]) experience that everyone has is a big price to pay.
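For the record, that experience is just map being lazy; the usual eager alternatives are run! and doseq (a tiny sketch):

(map println [1 2 3])   ;; lazy: outside the REPL this may print nothing
(run! println [1 2 3])  ;; eager: prints 1, 2, 3 and returns nil
(doseq [x [1 2 3]]      ;; eager: idiomatic for side effects
  (println x))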
I disagree — I do use lazy sequences, I just rarely pass them around. Very few functions in my code return lazy sequences, and those are usually the "sources": functions that can return database data, for example.
Most of the code does not return lazy sequences, and thanks to transducers can be abstracted away from the entire notion of a sequence.
Well, if you already use transducers, then you are not exactly the target audience :). It's meant more for less experienced developers who see a core language feature and feel inclined to use it despite the drawbacks.
> In other words, it's something I don't think much about anymore, and it doesn't inconvenience me in any noticeable way.
Interesting, because it still does bother me, at least when I use lazy sequences and the functions that produce them. Sure, if I consciously avoid them, then it doesn't bother me anymore, but that's the point of the article :D.
I enjoyed the article, even though I have very comparable Clojure experience.
There's always new minutiae to learn. Plus I get a handy link that I can just paste next time the topic of laziness comes up in a code review.
I'd simply point out that there are a few sub-tribes within the Clojure world. Some are very attracted to formalism/correctness; others pride themselves on rejecting it.
Same here. In practice it's almost never an issue. I always try to use transducers first. Where that's not possible, I ask whether the data is very large: if so, I use lazy sequences; otherwise, mapv.
I am using Clojure for my side projects and hustles. If the project is quick and dirty, who cares how it's implemented. If the project evolves into a more serious product, I should rewrite it anyway and optimize the critical code paths.
Macros that rely on parsing and rewriting their bodies are a great way to introduce bugs. The regular threading macros work well enough because they are simple. More complex rewrites don't compose with other macros.
Clojure authors are quite conservative about adding such opinionated instruments into the language, especially when they come from outside. But fortunately, being a Lisp allows Clojure to add such language modifications with libraries without forcing those modifications upon every user of the language.
1) They did add threading macros, which are very popular, and these seem to serve a very similar purpose for transducers. You could argue their inclusion would make the language a little more uniform, even.
2) There is precedent for doing the same operation with different mechanisms, like map vs mapv, so the change fits in nicely in that sense.
3) As the article points out (edit: just realized you are the author, heh, hello!), transducers often have better performance than alternatives. I think it makes sense to highlight them and encourage their usage as much as possible. The more high-performance code in the wild, the better. Even just introducing new ergonomics that facilitate their usage will influence that.
This is sort of true, but it doesn't scale well. You quickly end up having a very verbose `ns` declaration at the top of every file as you try to extend your Clojure this way.
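Roughly the shape of the problem, as a hedged sketch (all the my.util.* namespaces below are invented purely for illustration):

(ns my.app.orders
  (:require [clojure.set :as set]
            [clojure.string :as str]
            ;; every "extension" namespace has to be required by hand,
            ;; file after file:
            [my.util.seq :refer [index-by partition-when]]
            [my.util.xforms :refer [dedupe-by sliding]]
            [my.util.macros :refer [guard cond-thread]]))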
To cut down on the noise, you can coalesce all your extensions into one `ns` but it's not technically feasible with `core` and needs to be done with another library:
It’s too bad that transducers were created long after Clojure’s inception. Can you always replace a lazy seq with a transducer? Could the language theoretically be redesigned to replace all default usages of lazy seqs with transducers, even if it were a major breaking change? And have lazy operations be very explicit?
It could certainly be redesigned to have explicit lazy collections/operations and use transducers as the composition glue. It would be a breaking change, so it's never going to happen in Clojure. But if somebody plans to design a language inspired by Clojure, they should certainly take this hint.
If we ever found ourselves in a position where Clojure’s market share was decreasing YoY, do you think it would ever make sense for Clojure’s maintainers to design a new language that addresses this and any other issues that come up in Clojure’s yearly survey (and other lessons learned) that might be more easily fixed by sacrificing backwards compatibility? Or do you think the community would want them to focus on maintaining Clojure itself?
I realize the maintainers likely would not even be interested in such a thing, of course, just daydreaming.
Transducers were at least 20 years old when Clojure was first created. I have the book "Common Lisp: the language" from 1989 that describes transducers as found in Clojure.
Clojure is the only language where it is baked in that prominently though.
Using transducers is really easy and intuitive (with "comp" letting you compose a pipeline of transformations in a readable way).
Writing custom transducers, especially stateful ones, is really difficult. But that's not something you'll do often. My 10kLOC complex app has three stateful transducers that I wrote.
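For a feel of what a hand-written stateful transducer looks like, here's a minimal sketch (it mirrors what clojure.core/dedupe already does, so it's for illustration only):

(defn dedupe-consecutive
  "Stateful transducer that drops inputs equal to the previous input."
  []
  (fn [rf]
    (let [prev (volatile! ::none)]          ;; per-pipeline mutable state
      (fn
        ([] (rf))                           ;; init
        ([result] (rf result))              ;; completion
        ([result input]                     ;; step
         (if (= input @prev)
           result
           (do (vreset! prev input)
               (rf result input))))))))

;; (into [] (dedupe-consecutive) [1 1 2 2 3 1]) ;=> [1 2 3 1]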
I think transducers are an under-appreciated aspect of Clojure. They are an extremely valuable and flexible tool, and have allowed me to write reusable and composable code and tackle significant complexity, all with great performance.
Transducers aren’t just faster; they offer more functionality, with the ability to reuse the logic no matter where your inputs and outputs are coming from. So it’s no real surprise that there’s more implementation complexity. Client code isn’t much more complex, and arguably has lower mental overhead, because you can give stacks of transducers names without having to introduce a whole new function with a sequence as an argument.
Obviously in languages that can reliably perform stream fusion transparently, maybe you care less, but the abstraction isn’t just about the speedup.
Yup. And to be honest there’s a bunch of additional complexity in Clojure transducers that I very much dislike (all the reducing function stuff which is almost never actually used and which is secondary to the purpose of creating a sequence transform via function composition). But it’s a tradeoff you make for additional functionality or speed. Or you move complexity to reduce cognitive overhead for certain use cases. There’s rarely a free lunch.
I've been circling around Lisp for a couple of years. I'm starting in a month, and I'll spend several hours a day. I still don't know which language I want to learn.
I was drawn to Clojure because it looked like a lisp for getting stuff done. But a few things put me off. This article puts me off more. I want to get the semantics down before I have to think about what's going on under the hood.
Clojure's lazy sequences by default are wonderful ergonomically, but it provides many ways to use strict evaluation if you want to. They aren't really a hassle either. I've been doing Clojure for the last few years and have a few grievances, but overall it's the most coherent, well thought out language I've used and I can't recommend it enough.
There is the issue of startup time with the JVM, but you can also do AOT compilation now so that really isn't a problem. Here are some other cool projects to look at if you're interested:
Despite its weak points, Clojure is still an excellent lisp for getting things done. For long running programs that live on the server, and especially for multithreaded/asynchronous workloads, I find it far better to work with than other lisps.
I wouldn’t let this article put you off. HN is often full of really negative takes like this that bear far less significance than they might suggest.
Clojure is a fantastic language, and probably the best lisp you could start out with due to the fact that you have the entire Java ecosystem at your fingertips.
I’m personally a fan of Clojure, in part due to how practical it is. Some of that practicality comes at the expense of simplicity, however.
I wonder if a Scheme dialect would be a better fit for you? They tend to be smaller and might let you focus on semantics more.
Full disclosure: I haven’t spent nearly as much time with any of the Scheme/Scheme-inspired dialects as I have with Clojure. I’m basing this off of their design philosophy and others’ observations.
He's overanalyzing. I think it's best to choose one randomly, even with a coin toss and learn enough about it. Then later he'll know better what to choose next.
I'm not trying to put down anyone. I'm trying to challenge the person to actually do something instead of doing analysis-paralysis.
The main issue is that the Clojure compiler doesn't really optimize lazy sequences, right? Most language compilers do this. Rust's lazy iterators, for example, often exhibit faster performance than for-loops.
And Clojure also doesn't give an error/warning when lazy sequences are never realized.
It's not really Rust's compiler that 'optimises lazy sequences'. It's LLVM, which notices that the code emitted happens to be able to be optimised down, if you run it for a really, really long time with some very strong optimisations.
GHC would be a better example, I think. It performs stream fusion. This means it can turn 'map f (map g xs)' into 'map (f . g) xs', and of course it gets more complex than that, but that's the basics. It directly optimises lists (which, this being Haskell, are lazy sequences).
> GHC would be a better example, I think. It performs stream fusion. This means it can turn 'map f (map g xs)' into 'map (f . g) xs', and of course it gets more complex than that, but that's the basics. It directly optimises lists (which, this being Haskell, are lazy sequences).
Is that only for the built-in map, or would it work in a general way for, say, `myMap f (myMap g xs)`?
Rust's lazy iterators don't cache the results, to be iterated over again. It allows the optimiser to inline all calls and end up with a loop that doesn't need bounds checks, which is why they can be faster than C-style loops over arrays.
This critique is mostly about the semantics. And I agree. For me it's mainly that things happening out of order introduces surprises for reasoning in the normal edit-debug cycle if you've forgotten to use the *v versions of functions. As for performance optimizations, there are some, such as chunking and locals clearing mentioned in the article.
I've been using the trick of enforcing realization by serializing to strings a few times. Slow, but quite useful in many contexts. However, instead of using `(with-out-str (pr ...`, there's simply `pr-str`, which is easier to remember.
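In other words (a tiny illustration):

;; both force the lazy seq by printing it to a string:
(with-out-str (pr (map inc (range 3))))  ;; => "(1 2 3)"
(pr-str (map inc (range 3)))             ;; same result, easier to remember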
> The good parts of laziness: Avoiding unnecessary work
Actually, be very careful with side effects. Some functions like `map` and `for` process their input in chunks, typically in steps of 32, since most of the underlying structures are tries with 32-element leaves.
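A quick demonstration (the chunk size is an implementation detail, but 32 is what you'll typically see):

;; Asking for one element still realizes the whole first chunk,
;; so 32 printlns fire even though we "only" wanted the first value:
(first (map #(do (println %) %) (range 100)))
;; prints 0 through 31, then returns 0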
It can be legitimate to have side effects in lazy processing, and in particular to rely on a lazy sequence not being accessed beyond the visible access that is coded in the program.
Suppose we make a sequence of numbers which grows very rapidly, so that by the time we hit the 17th one, we have a bignum that is gigabytes wide.
You probably don't want this to be chunked in batches of 32.
Another situation might be if we have some side effect: the lazy sequence is somehow connected to an external API or foreign code. You might want the observable behaviors to happen only to the extent that the sequence is materialized.
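In Clojure terms, the usual escape hatch for both cases is to build the sequence with lazy-seq directly (or otherwise avoid chunked sources), so elements are realized strictly one at a time; a minimal sketch:

(defn squares-from
  "Unchunked lazy sequence: n, n^2, n^4, ... one element per step."
  [n]
  (lazy-seq (cons n (squares-from (*' n n)))))

;; Once realized, only the first three (still small) numbers get computed:
(take 3 (squares-from 2N))  ;; => (2N 4N 16N)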
The advice to be careful with side effects is good in general; not sure why you're downvoted.
F# is similar in that it supports lazy sequences but is mostly eager otherwise, and often handles errors using exceptions. One does have to be careful, but the benefits far outweigh the risks in my experience.
I think the main issue with lazy sequences is understanding and controlling their scope. Transducers, particularly when used within an `into` call, can encapsulate that scope very neatly (and `sequence` still gives you a lazy result when you do want one), and the OP shows their clear performance advantage. I think the article would be more effective if it shifted tone to "Clojure laziness best practices" rather than damning the idea wholesale. There be dragons, for sure.
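For instance (a one-line sketch):

;; `into` runs the transducer eagerly and returns a realized vector,
;; so no lazy sequence escapes the expression:
(into [] (comp (map inc) (filter even?)) (range 10))
;; => [2 4 6 8 10]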
1> (len
     (with-stream (s (open-file "/usr/share/dict/words"))
       (get-lines s)))
** error reading #<file-stream /usr/share/dict/words b7ad7270>: file closed
** during evaluation of form (len (let ((s (open-file "/usr/share/dict/words")))
                                    (unwind-protect
                                      (get-lines s)
                                      (close-stream s))))
** ... an expansion of (len (with-stream
                              (s (open-file "/usr/share/dict/words"))
                              (get-lines s)))
** which is located at expr-1:1
The built-in solution is that when you create a lazy list which reads lines from a stream, that lazy list takes care of closing the stream when it is done.
If the lazy list isn't processed to the end, then the stream semantically leaks; it has to be cleaned up by the garbage collector when the lazy list becomes unreachable.
It is possible to address the error issue with reference counting. Suppose that we define a stream with a reference count, such that it has to be closed that many times before the underlying file descriptor is closed.
I programmed a proof of concept of this today. (I ran into a small issue in the language run-time that I fixed; the close-stream function calls the underlying method and then caches the result, preventing the solution from working.)
(defstruct refcount-close stream-wrap
  stream
  (count 1)
  (:method close (me throw-on-error-p)
    (put-line `close called on @me`)
    (when (plusp me.count)
      (if (zerop (dec me.count))
        (close-stream me.stream throw-on-error-p)))))
(flow
  (with-stream (s (make-struct-delegate-stream
                    (new refcount-close
                      count 2
                      stream (open-file "/usr/share/dict/words"))))
    (get-lines s))
  len
  prinl)
With my small fix in stream.c (already merged, going into Version 292), the output is:
$ ./txr lazy2.tl
close called on #S(refcount-close stream #<file-stream /usr/share/dict/words b7aecee0> count 2)
close called on #S(refcount-close stream #<file-stream /usr/share/dict/words b7aecee0> count 1)
102305
One close comes from the with-stream macro, the other from the lazy list hitting EOF when its length is being calculated.
Without the fix, I don't get the second call; the code works, but the descriptor isn't closed:
$ txr lazy2.tl
close called on #S(refcount-close stream #<file-stream /usr/share/dict/words b7b70f10> count 2)
102305
In the former we see the call to close in strace; in the latter we don't.