Hacker News new | comments | show | ask | jobs | submit login
Clojure Tradeoffs (gtrak.wordpress.com)
78 points by puredanger on June 26, 2013 | hide | past | web | favorite | 23 comments

"Tradeoffs" implies a balanced look at the positives and negatives of each design/engineering decision under discussion, and what is being traded off against what.

This article is a paean to Clojure, a torrent of praise with little to no (or very faint) criticism, and has very little discussion of what is being traded off for what, and what the positives and negatives are, what the benefits and costs are, of each tradeoff.

The "Tradeoffs for Individuals" paragraph is a good example. Literally every sentence of that paragraph is a variant of "Clojure is cool". There are no 'tradeoffs for individuals' actually discussed in that paragraph, inspite of the title.

This is not to say Clojure isn't cool. It is. Rich Hickey has a great sense of design and is a brilliant engineer. But an article titled "Clojure Tradeoffs" should actually discuss, you know, tradeoffs.

Misleading title, but otherwise a decent article. A good alternative title would be "Why I Think Clojure is a Great Language"

I'll give a real tradeoff - persistent datastructures are much less efficient to iterate through, since they usually involve a) returning "mutated" copies at each iteration (not as bad as it sounds due to structural sharing, but much more expensive than incrementing an index) and b) lots of pointer following & manipulation.

Also, the STM has an interesting intersection with garbage collection depending on your access patterns. Returning modified copies & throwing away references to the old one, rather than mutating, can increase pressure on the GC since you end up with more transient objects. Especially if you're "mutating" objects that all end up living in the tenured generation.

persistent datastructures are much less efficient to iterate through

As far as Clojure iteration is concerned, Vectors (immutable, persistent) are actually faster than ArrayLists (mutable, unsynchronized). Criterium reports:

  (let [v (vec (range 10000))] (bench (reduce + v)))
  Evaluation count : 95340 in 60 samples of 1589 calls.
               Execution time mean : 642.508722 µs
      Execution time std-deviation : 5.452816 µs
     Execution time lower quantile : 636.188082 µs ( 2.5%)
     Execution time upper quantile : 658.023220 µs (97.5%)

  (let [a (java.util.ArrayList. (range 10000))]
    (bench (reduce + a)))

  Evaluation count : 77640 in 60 samples of 1294 calls.
               Execution time mean : 739.523157 µs
      Execution time std-deviation : 6.064187 µs
     Execution time lower quantile : 730.818794 µs ( 2.5%)
     Execution time upper quantile : 751.968107 µs (97.5%)
Amusingly, using iterators directly instead of reduce speeds up ArrayLists (since there's no need to go through seq, I think), and slows down Vectors to the same speed:

  (defn iterator-sum
    [^java.util.Iterator i]
    (loop [sum 0]
      (if (.hasNext i)
        (recur (+ sum (.next i)))
  (defn iterable-sum
    [^Iterable i]
    (iterator-sum (.iterator i)))
  (let [v (vec (range 10000))]
    (bench (iterable-sum v))))
  Execution time mean : 682.147166 µs

  (let [a (java.util.ArrayList. (range 10000))]
    (bench (iterable-sum a))))
  Execution time mean : 683.351808 µs
After conferring with ztellman, I suspect this is due to Clojure's InternalReduce avoiding extra iterator allocations over vectors, since it can recur f directly over the internal array at each leaf node.

For those of you playing along at home, this is because of Clojure's reducers framework [1]. If we were to extend 'clojure.core.protocols.CollReduce' to ArrayList, we'd get similar performance, but the important point is that efficient iteration across immutable collections is very much possible.


EDIT: and my response is just a little redundant now

Are you sure? Reducers framework has its own version of major functions (map, reduce, etc.) and has to be called explicitly by use or require. Last time I checked, in version 1.5, if you just call normal reduce, it does not use reducers framework.

there were changes to the core data structures in order to support reducers, namely the addition of polymorphic reduction by the collection. See: https://github.com/clojure/clojure/commit/5b281880571573c591...

It was a pretty transparent change.

Yes, having to understand the details and implications of clojure's compilation was definitely painful, but the implementing code is accessible. There's always going to be some performance bottlenecks, I guess I'm more concerned with how hard are they to identify and overcome than they might have been having coded stuff up in pure java from the start? Or, once you're committed to a design, say refs and STM, and you hit such a problem, is it difficult to change course?

Why does the iteration have to return copies? Can it not just return the object itself? Or are you talking about return-by-value versus return-by-reference (Java does the latter)?

Immutability means that if you're "modifying" the structure you must actually be "making a new copy with a small change". Structural sharing means that usually only the small change itself gets new allocation, but it's still more pointer-chasing than a continuous memory map.

Lots of algorithms don't work with this kind of copying semantics, but you have all of Purely Functional Data Structures [1] to help.

[1] http://www.amazon.com/Purely-Functional-Structures-Chris-Oka...

Fair enough, but I wasn't talking about modification, I asked about iteration.

Just iterating doesn't make any copies. Replace iterating over with modifying.

Efficient immmutable data structures are easily one of Clojure best features. Easiest trade off ever (though the actual hit to performance isn't that bad) imo.

Not only that, but you can also make a persistent structure locally transient for the duration of the (modifying) iteration, so in lots of cases you do not have to pay the persistence penalty.

Oh, true—I jumped his point

It doesn't. I suspect the parent was thinking of mutation.

I should clarify: "iteration" is a bit of a fuzzy concept in Clojure that encompasses things like "for", "reduce", "some", "map", etc., some of which have differing implementations under the hood. If you're leveraging InternalReduce you're likely to be pretty darn efficient, especially because under the hood it's allowed to multithread with Java fork/join (which you would have to implement much more verbosely if you wanted to use it with an ArrayList). But in a lot of cases, you scan through a collection by taking the first element, then the first of the rest (which is generated by returning a "modified" copy of the original collection), etc.

See for instance, the "some" implementation: https://github.com/clojure/clojure/blob/c6756a8bab137128c811...

> I'll give a real tradeoff - persistent datastructures are much less efficient to iterate through

Pretty sure you mean "mutation", because none of your criticisms makes sense for iteration of persistent datastructures in general (the former one makes no sense at all, the latter one may make sense in tree-based persistent datastructures which will have higher constant factors due to pointer chasing)

If you're constantly returning and saving modified immutable data structures you're doing it wrong. Clojure has mutable data, you use them when you need them, and don't use them when you don't.

Yes, you're right. Tradeoffs was more of a theme than a commitment to being comprehensive. I guess I meant to portray design tradeoffs and their implications, and sort of what perceptions I have about them. It wasn't meant to speculate about what could have happened if different choices were made, or where I feel I'm missing out, or even to be super-objective. The main message was that the interaction of the design tradeoffs guides our subjective aesthetics, and those subjective inclinations result in our programs. Plus, I just wanted to compress the case for clojure for different people, in terms of clojure's design tradeoffs and their implications.

Sure. There is nothing wrong with writing an article " to compress the case for clojure for different people". We need those too.

When I read the title here on HN, I anticipated a balanced and comprehensive discussion on the design tradeoffs and was (mildly) disappointed to see that the article seemed to have a different focus (making a case for why Clojure is a language worth investing in, as far as I can make out).

I think Clojure is a brilliant language, but don't have enough experience with it to write about tradeoffs, and would really like to see a good discussion.

As I said in my original comment, it is a decent blog entry. My criticism is only about the (imo) mismatch between title and content.

FWIW, I agree, I would love to see an expert talk about real tradeoffs, both abstractly and in regards to clojure. I'm just kinda piecing it together from what I think Rich was thinking, what I've picked up from JoC, blogs, talks, and experience.

Don't beat yourself up over it too much. The structure of narrative form is something Clojure inherits from the Lisp tradition.

All essays about Clojure must either be a narrative in which "a man goes on a journey" or one in which "a stranger comes to town." [1] Yours, being the first type is just an old-fashioned love song to Lisp. [2] The other narrative form produces rants about s-expression syntax - though now that we have Python, praise of the semicolon is no longer a given.[3]

[1] https://news.ycombinator.com/item?id=1213770

[2] e.g. RPG's www.dreamsongs.com

[3] Such essays, lacking condescension, may be said to fail to comprehend the essence of Lisp.

This might be more useful (but a little dated) ... http://lisp-univ-etc.blogspot.com/2011/11/clojure-complexity...

I think at least the criticism of immutability and FP resulted from taking concepts to extremes. I made a point to address lock-in of different aspects of clojure in my post.

I think if you're mutating deeply nested things, or doing heavy algorithmic bookkeeping, it's worthwhile to evaluate different approaches (maybe pure java) as an 80/20 thing, and I wonder how hard it would be to have comparable mutability facilities and data structures as a library?

Of course, programmers want to get things done without spending time on building up their own tools, and I understand that.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact