
Ask HN: How dangerous is Clojure's immutability assumption? - Estragon
I'm a long-time Python programmer, thinking of learning Clojure because I want to play with my droid, and Java drives me nuts. I've been reading _Learning Clojure_ [1], where it talks about data immutability as a fundamental assumption of Clojure's model. It says "Watch out for cases of mutable Java objects stored in immutable Clojure collection objects. If the mutable object changes, this won't be reflected in the collection's hash." This sounds potentially horrendously difficult to debug.

I don't want to start a language war, here. I'd be interested to hear people's _personal experience_ with this issue, and also of any tools or language features which mitigate this issue, either by making it a very unlikely mistake, or making such cryptic mutability easy to detect.

[1] http://en.wikibooks.org/wiki/Learning_Clojure
======
cemerick
After using Clojure as my primary environment for ~18 months, and leaning
heavily on existing Java libraries for a lot of foundational stuff, I'd say
that this is a non-issue. Of course, you can get yourself into a lot of
trouble in any environment, but (at least for me and other experienced Clojure
programmers I've worked with and whose results I've looked at) it's almost
always obvious where unrestrained (i.e. Java-related) mutability is in the mix
-- and in those areas, you take all the usual precautions that you would if you
were using those mutable libraries in Java.

The upshot of this is that if you're using a Java library, you'll generally
want to either:

(a) build a Clojure wrapper API so as to enforce some sane semantics on it
(see clojure.contrib.http.agent for a good example of this, where HTTP
interactions are wrapped in Clojure agents and a good set of convenience
functions that make working with the JDK's HttpURLConnection and friends _way_
more pleasant than usual), or

(b) confine the usage of key Java libraries in such a way that there's a clear
line of demarcation between Clojure's mutability and concurrency semantics and
the free-for-all in the rest of Java. This is where the big win is in
programming Swing interfaces, for example, where your core data model would
ideally be implemented using persistent data structures and Clojure's
reference objects to ensure sane concurrency semantics, and you take all the
usual precautions when touching the Swing APIs.

~~~
Estragon
Thanks for the advice. Your approach sounds simple enough to follow, if you
pursue it from the start.

------
dons
In 10 years I've had maybe 5 bugs in Haskell caused by foreign language code
mutating objects under the hood and thereby breaking referential transparency
guarantees in the Haskell code.

I would not consider this "dangerous". It's a side condition you'll need to
check. The language can make this more or less easy to establish.

Typically it looks like your value is changing under the hood. It's relatively
easy to debug -- since the result is so unexpected.

It's rare in Haskell. I imagine it is a bit more common in Clojure, where they
rely more on Java code than Haskell does on C code.

I don't believe Clojure has an optimizing compiler -- it's not doing any
optimizations based on static guarantees of referential transparency -- so
that simplifies the issue. If the compiler can't guarantee purity, there's
less it can do to your code to take advantage of that, so there are fewer
unusual semantics.

~~~
swannodette
Clojure isn't an optimizing compiler because the JVM is the optimizing
compiler. Clojure datastructures are all declared static final which allows
the JVM to work a lot of magic.

I've been using Clojure for a year and a half and have never run into such a
bug. You learn quickly that you lose all of the benefits of persistent
datastructures if you're putting mutable objects into them.

As a side note, Rich Hickey has been working on a project called "cells" which
allows you to use even unsafe mutable Java objects with the same safe
concurrency guarantees as Clojure persistent data structures.

------
Confusion
If I understand the problem correctly, then it's much like a problem that
plagues most hashmap/associative array implementations: when an object that is
used as a key is modified (without removing and re-adding it around the
modification), you'll often not be able to find it again.
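
A minimal Python sketch of this failure mode. The `Point` class and its simple arithmetic hash are hypothetical, chosen only so that the hash visibly depends on mutable fields; the same effect is what the wikibook warns about for mutable Java objects inside Clojure collections.

```python
class Point:
    """Hypothetical mutable key type whose hash depends on its fields."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Deliberately simple, value-based hash (an assumption for the demo).
        return 31 * self.x + self.y


p = Point(1, 2)
table = {p: "payload"}

found_before = Point(1, 2) in table        # an equal key finds the entry

p.x = 99                                   # mutate the key while it is stored

# The entry was filed under the old hash, so a key equal to the new value
# hashes differently and is never matched against it.
found_by_new_value = Point(99, 2) in table

# A key equal to the old value has the filed hash, but the equality check
# against the mutated stored key now fails.
found_by_old_value = Point(1, 2) in table

# The entry has not gone anywhere: iteration still reaches it.
still_stored = any(k is p for k in table)
```

The entry is still in the table but unreachable by lookup, which is why removing and re-adding the key around the modification avoids the problem.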

It takes some debugging to find this is happening, but in five years of
writing Java and Python, I've only had it happen once in either language.

------
mark_l_watson
I use a lot of my (sometimes ancient) Java code with both Clojure and Scala.
So far I have always written wrappers that copy Java data into Clojure or
Scala 'native' data types. If you do this, then there is little chance of
having problems like those you are concerned about. It might be more efficient
to use Java types, but not worth the hassle in most cases. (I also use (J)Ruby
a lot, and I have found the secret to happy use of (J)Ruby is to give up the
desire for good runtime performance :-)
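
The copy-at-the-boundary idea can be sketched in Python, as an analogy rather than actual Clojure or Scala interop. The `freeze` helper and the `legacy_record` data are hypothetical, and the sketch assumes that anything which is not a dict, list, tuple, or set is already immutable.

```python
def freeze(value):
    """Recursively copy mutable containers into immutable, hashable ones."""
    if isinstance(value, dict):
        return frozenset((k, freeze(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    if isinstance(value, (set, frozenset)):
        return frozenset(freeze(v) for v in value)
    return value  # assumed to be immutable already (int, str, ...)


# Stands in for data handed over by older, mutation-happy code.
legacy_record = {"name": "widget", "tags": ["a", "b"]}

snapshot = freeze(legacy_record)   # copy taken at the boundary
index = {snapshot: "row 1"}        # safe to use as a key

legacy_record["tags"].append("c")  # later mutation of the original

# The snapshot is unaffected, so a lookup by the original value still works.
lookup_result = index[freeze({"name": "widget", "tags": ["a", "b"]})]
drifted = freeze(legacy_record) != snapshot
```

The cost is the copy itself, which is the efficiency trade-off mentioned above.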

------
jrockway
_I don't want to start a language war, here._

Then it's probably not a good idea to use the word "horrendous". Your
commentary is not what would start a language war; your tone is.

------
barrkel
Hash functions should generally hash based on a value's identity. Mutable
objects passed around by reference have an identity independent of their
value; mutable objects passed around by value, on the other hand, change their
identity when they're modified.

For example, one list isn't equal to another list, even if it has the same
elements, if modifying one list doesn't modify the other. If they're not the
same, then they shouldn't compare as equal.

This is one reason I think Java's implementation of hashCode() on collection
classes isn't very smart. I think .NET gets it right, having GetHashCode()
return a consistent value for mutable collections. (Similar comments apply to
the corresponding equality operation.)
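
For comparison, Python sits closer to the identity-based position for mutable objects. A small sketch, with a hypothetical `Basket` class that defines neither `__eq__` nor `__hash__` and therefore keeps the default identity-based behaviour:

```python
class Basket:
    """Hypothetical mutable container that keeps Python's default hashing.

    Without __eq__/__hash__ overrides, equality and hashing follow object
    identity, not contents.
    """

    def __init__(self, items):
        self.items = list(items)


a = Basket([1, 2])
b = Basket([1, 2])
seen = {a}

equal_by_contents = a == b        # same contents, but different objects
hash_before = hash(a)
a.items.append(3)                 # mutate while stored in the set
hash_unchanged = hash(a) == hash_before
still_found = a in seen

# Built-in lists take a different way out: they compare by value,
# so Python refuses to hash them at all.
try:
    hash([1, 2])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
```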

But mutable objects passed around by value are bad for other reasons, such as
the risk of modifying copies when you think you're modifying an underlying
value.

------
jacquesm
It sounds like it should be relatively easy to write a function that checks,
in case of doubt, whether any object's contents have changed since its hash
was computed.

You could enable something like that during the debugging phase of your
development to get 'peace of mind' that such behaviour is not the source of
any bugs.
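
One way such a debug-time check could look, sketched in Python. The `HashAudit` helper and the `Account` class are hypothetical names for illustration; the sketch assumes you can route objects through `register` before putting them into a hashed collection.

```python
class HashAudit:
    """Debug-only helper: remember each object's hash at registration time."""

    def __init__(self):
        self._recorded = []  # list of (object, hash at registration)

    def register(self, obj):
        self._recorded.append((obj, hash(obj)))
        return obj

    def changed(self):
        """Return registered objects whose current hash differs from the recorded one."""
        return [obj for obj, old_hash in self._recorded if hash(obj) != old_hash]


class Account:
    """Hypothetical mutable type with a value-based hash."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        return isinstance(other, Account) and self.number == other.number

    def __hash__(self):
        return self.number


audit = HashAudit()
a = audit.register(Account(1))
b = audit.register(Account(2))
members = {a, b}

clean = audit.changed()   # nothing has been mutated yet
b.number = 7              # mutate an object that is sitting in the set
dirty = audit.changed()   # now reports b
```

A check like this only catches mutations that alter the hash, and it keeps every registered object alive, so it is better suited to a debugging session than to production code.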

------
jganetsk
This is a problem in Java too. Any object can mutate, effectively breaking any
ordered collections that hold it.
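
The same thing can be shown for an ordered collection in Python, using a sorted list maintained with the standard `bisect` module. The `Job` class is a hypothetical stand-in for any mutable object whose ordering depends on a mutable field.

```python
import bisect


class Job:
    """Hypothetical mutable item ordered by its priority field."""

    def __init__(self, priority):
        self.priority = priority

    def __lt__(self, other):
        return self.priority < other.priority


jobs = [Job(1), Job(5), Job(9)]
queue = []
for job in jobs:
    bisect.insort(queue, job)     # keeps the list sorted on insert

before = [j.priority for j in queue]

jobs[0].priority = 7              # mutate an element already in the list

# insort assumes the list is sorted; after the mutation that assumption is
# false, so the binary search places the new job relative to stale order.
bisect.insort(queue, Job(6))
after = [j.priority for j in queue]
is_sorted_after = after == sorted(after)
```

Nothing raises an error here; the collection silently stops honouring its ordering invariant.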

