Immutable Data Structures and JavaScript (jlongster.com)
177 points by jlongster on Oct 5, 2015 | 70 comments

The main issue I've had using immutablejs with Redux is debugging. Whereas previously I could simply mouse-over a data structure when I hit a breakpoint (or crash), I now have to do a REPL dance to see what's in any of my data structures.

I also find myself often having to do trial and error stuff to fix my code (also while in a paused state in the console). I mean it's pretty nice that you can actually do this, don't get me wrong. But overall, I am slower and less productive with immutablejs than I am with vanilla JS / JSON objects.

It's a trade-off people really should keep in mind before pulling the trigger on immutable data structures. Sure, you get that performance boost, but do you actually need it? Are your React views really so slow (or are you running on slow embedded hardware like I am)? Or are you just drinking the Kool-Aid because immutable data is hot right now?

You get runtime perf improvements in certain cases, at the cost of opaqueness, productivity and complexity. Make sure it is worth it.

Hi, shameless plug: another (typed) immutable data structure library, https://github.com/gcanti/tcomb. Main features:

- works with regular objects and arrays
- runtime type checks
- easy debugging with Chrome DevTools
- immutability and immutability helpers
- runtime type introspection
- easy JSON serialization / deserialization
- pattern matching

Also https://github.com/gcanti/redux-tcomb (Immutable and type-checked state and actions for Redux)

I wouldn't call it an immutable data structure library. As far as I see, it does not implement any data structures.

This looks like a nice alternative to seamless-immutable though.

The solution to this and similar problems is to define more protocols like the iteration protocol [1], but for showing objects, accessing their properties etc.

If libraries and tools rely on protocols instead of using built-in objects then we would be able to plug any kind of data structure implementing the required protocols into any kind of library that takes advantage of those protocols.

[1]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

At least in Chrome, you can add a custom devtools formatter so you can inspect Immutable.js datastructures just like native ones: https://gist.github.com/jsdf/e27e0e467dc8db3fb1e1

> Whereas previously I could simply mouse-over a data structure when I hit a breakpoint (or crash), I now have to do a REPL dance to see what's in any of my data structures.

Why is that? IMHO, this sounds like a problem with the dev tools more than anything.

Because the data contained inside immutablejs structures is opaque. It uses "structural sharing" which probably means internally it's more like a fragmented hard drive than a linear sequence of data.

How do they even improve performance with Redux?

If the whole state is one immutable tree, then the next state is a new tree and the whole app has to be rendered again.

I know it probably doesn't work like that, but this is what I think when I read about immutability.

I imagine it works somewhat like this example:

If the toplevel object contains 3 items:

  { messages: list,
    users: list,
    me: object }
and the toplevel component contains 3 components that correspond to these 3 sub-states: MessageList, UserList, MyInfo

then updating the user list will only cause a re-render of the toplevel component and UserList, but not MessageList or MyInfo or their child components (the same VDOM will be reused and no diff / patching will be performed on that part of the VDOM / DOM, respectively)

Something like

  if (this.shouldComponentUpdate(...)) {
    return this.render(...);
  } else {
    return this._oldVDOM;
  }
then the diff algorithm can quickly compare by reference equality and only fall back to deeper object equality if the reference equality returns false.

Ah. I thought the diff would re-render all children, if their parent state changed.

Clojure's motivating reasons for immutability are around correctness, not speed. In typical Clojure, immutable data structures are somewhat slower (25%-100%) than standard Java, but make up for it because they can be trivially parallelized, don't require locking, and are significantly easier to reason about.

From that perspective, immutable structures should be the default, with mutable structures being reserved for when they're truly needed.

Mutable structures should be optimizations of immutable implementations.

People often say that immutable data structures are fast because pointer comparison is fast. But as soon as you do any transformation, even something as simple as map(), the pointers all change. So it seems like the speed of pointer checks is only relevant in cases where you're not doing anything interesting with the data.

In particular, if you need to apply any transformation to your model before displaying it in a view, it's not going to be a fast pointer check - you need to do dirty checks on all the elements just like you would with mutable data structures.

Sure you would need to do go one level deeper if the top level is different, but you can still use pointer equality at the lower levels

  function immutableEq(a1, a2) {
    return (a1 === a2) || all(mapPair(a1, a2, immutableEq));
  }
Now if you update the value of a single element in the array a2, you would not have to check all the elements in depth (only their references):

  a1 = Immutable.List.of(el1, el2, el3, el4, el5)
  a2 = a1.set(2, Immutable.Map({hello: "world"}));
  immutableEq(a1, a2) // all but el3 compared by reference equality

In your example you're just changing one element and leaving the rest the same - a pretty trivial transformation. If you're applying map() to a list then all of the elements change.

If you have a pipeline then it's going to result in everything downstream re-running, unless you compare the output of the map() transformation to the previous value, like some build systems do.

Or alternatively, run all the major data transformations necessary to get from data to VDOM from the render function, passing as much as possible of the original data to the bottom-most components.

To generalize. If a change occurred at a [hot] path of a [deeply nested] map, then only [React] components that depend on the data at any subpath would perform the dirty check (i.e. executing shouldComponentUpdate) to re-render the component.

That depends on your program. If your function is pure, you can memoize, and then it will be a fast pointer check.
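That memoization point can be sketched concretely: a single-entry cache keyed on the input's identity, so re-running a pure transformation on the same immutable value is just a pointer check. The names here are illustrative, not from any particular library:

```javascript
// Hypothetical sketch: cache a pure function's result keyed on the
// identity of its argument. With immutable data, "same reference"
// implies "same value", so the cache hit is sound.
function memoizeByRef(fn) {
  let lastArg, lastResult, called = false;
  return function (arg) {
    // Reference check: same immutable input => reuse cached output.
    if (called && arg === lastArg) return lastResult;
    lastArg = arg;
    lastResult = fn(arg);
    called = true;
    return lastResult;
  };
}

let calls = 0;
const toViewModel = memoizeByRef(function (users) {
  calls += 1; // counts how often the real work runs
  return users.map(function (u) { return u.name.toUpperCase(); });
});

const users = [{ name: "ada" }, { name: "bob" }];
const first = toViewModel(users);
const second = toViewModel(users); // cache hit: the map() never re-runs
```

Rendering twice from the same state costs one pointer comparison the second time, which is the "fast pointer check" being discussed.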

True that.

This conforms with my understanding as to why Clojure is slowish. That being said, I don't really understand why Haskell is seen as fast and Clojure is seen as slow (I am ignoring JVM warm-up times here).

Do they use fundamentally different algorithms under the covers?

Clojure isn't very slow. It's possible to write Clojure with performance identical to Java's (though this isn't idiomatic, and avoids some of Clojure's nicer features).

Clojure's immutable datastructures are ~25% slower than standard Java, but also eliminate entire classes of bugs.

I'm not familiar with Haskell's immutable data structures, but I'd be very surprised if there's an optimization that hasn't been adopted by Clojure. Rich Hickey put a lot of work into optimizing them.

Haskell's GHC is a super clever optimizing compiler with a garbage collector built around the sort of memory pressure that arises when you've got tons of immutable data. Some of GHC's brand of speed comes from optimizations where GHC automatically derives a mutating, low-level, primitive loop from your high-level description. Some comes from loads and loads of tuning. Clojure can't benefit from the latter because JVMs are tuned differently, and can't benefit from the former because its transformation semantics are weaker.

So Immutability is Facebook's solution to excessive virtual DOM checks, which is Facebook's solution to the render everywhere problem, which is Facebook's solution to state change, which is a problem created by abstracting away DOM manipulation, which was never a real problem in the first place...

The chain of reasoning is wonderfully put, but DOM manipulation was definitely a problem. When elements were inserted into or removed from lists, you had to do a dance of many jQuery() steps. There was no concept of state; it was all tangled up in DOM values, and the most elegant solution was to use $.serialize(), if it worked in that context.

This mess was why people hated front-end programming. It has only been a year or two since Angular/React became mainstream and allowed us to think of state as plain JS objects. Lest we forget, it was truly bad times before.

Maybe it wasn't, but being able to declare the UI as a function from state (viewmodel) to DOM just once, and never having to manually patch the UI to get it to the right state... is a huge development time saver.

Angular did the same but went on a different path: watching the state (scope) for changes and then digesting the resulting spaghetti updates

React users are simply adopting well known techniques from functional programming languages because they work well with the virtual dom concept. Except for that concept, there is nothing new being invented here.

> abstracting away DOM manipulation, which was never a real problem in the first place

Until you try to build a large app and get murdered by maintainability problems (eg. treating the DOM as a store of mutable state, managing event listener binding lifecycles, etc) and performance issues (eg. DOM reads and writes occurring in an interleaved fashion inside inner loops)

> which was never a real problem in the first place...

You obviously haven't worked on any legacy app built in jQuery.

"And strict mode, which is the default in ES6", this is not true, I think you probably meant "in ES2015 modules". "Loose" mode still very much exists by default in ES2015 (ES6).

Thanks, I changed that

I'm a big fan of immutable data structures, but I will caution people against blindly adopting them just because they've heard they are "fast" and they've been popularized by companies like Facebook.

The vast majority of apps, when properly designed (and using things such as paginated datasets, and properly managing memory/cleaning up after themselves), do not need immutability.

You should first ask yourself, is this a problem that can be solved with immutable data structures, or why do I need to have huge data structures in the browser for my frontend? Unless it's a very special case, you can do more for frontend performance with lazy loading, and some good old fashioned performance analysis (take inventory of event listeners, look at a heap dump from Chrome dev tools, etc) than you can from adding the significant developer and cognitive overhead of immutable.js or its ilk.

It may seem nice that obj === obj2 can tell you whether any deeply nested thing has changed with just a pointer comparison, but as others have mentioned, this is not the primary use case for many UIs that are constantly recomposing and creating different kinds of data structures from other data structures (think map, creating a hash from an array, etc).

There is no silver bullet and premature optimization with a highly specific design pattern is the root of all evil. Don't worry about performance until there is a legitimate performance problem that needs to be worried about.

I've been pushing functional collections for years (with implementations for Common Lisp and Java: http://www.ergy.com/FSet.html) and I have to say this post amuses me greatly. For most of this time, I've had to deal with people being skeptical about them because they were afraid they would be too slow -- and now the word is spreading that they're actually faster!

I think that's probably wrong. Well-implemented mutable collections are faster than functional collections, if for no other reason than that they allocate less. As others here have said, the argument for functional collections has almost always been on style and clarity grounds. In some cases there's a stronger argument because you need to have many slightly different versions of a collection existing in memory simultaneously; I have written a piece of code like that, but such situations are pretty rare.

I don't want to sound like I'm laughing at your argument, which is all true. It's just funny to see someone putting it that way.

The posts here are missing context: I'm talking about immutable data structures specifically in the context of UIs. It's common knowledge in the React world that there is quite a performance improvement using them, because as you change data you no longer have to do expensive checks to figure out exactly what has changed. You just rerender from the top-down and only update specifically what has changed (which just requires pointer checks now). See http://swannodette.github.io/2013/12/17/the-future-of-javasc....

In general, sure, if you can't take advantage of these semantics then there is going to be overhead and you just gain simplicity. But in certain domains we can take advantage of this simplicity for big performance wins.

I don't use immutable structures for speed (they're often slower), I use them because they can eliminate entire classes of bugs.

This. By making everything immutable by default, a program becomes much easier to reason about. If I can quickly see that a variable is set here, and know that since it's immutable it will never change, I can reason about other parts of the function much faster.

Same here. People are missing the point.

If "fast" is what makes people adopt immutable data structures, awesome. I'm glad even if its for the wrong reasons as they come with many other benefits. We've all been using immutable strings in JS for quite a while without much troubles - why not use immutable objects and lists too?

I really hope immutable collections get added to the standard library eventually (probably ES2017 at this rate). Proxies should allow them to be used with libraries which expect mutable objects and arrays, as long as they don't mutate them.

Having them built in to the language would open up some interesting new possibilities too. It should be possible to send immutable data between Worker threads without the overhead of serialization and deserialization, which is currently one of the main barriers to doing heavy computation in a web worker rather than on the UI thread. Of course, there would be some nasty internal implementation details to sort out, as it would require sharing heaps, but it should be possible.

Where does React's Immutability Helpers fall in this spectrum? I was mulling over switching my entire codebase to ImmutableJS when a friend asked me to try them first. It worked out well, but I wonder whether anyone here has experience with it in a large long-running project. So far it has allowed me to use JavaScript primitives everywhere; all I need to ensure is to avoid direct mutation and use `update`, which gives a new object whose properties share the same references with the original one, except along the path to the leaf node that needs to be changed.
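The path-copying behavior described above can be hand-rolled in a few lines. This `setIn` is a hypothetical stand-in for the helpers, not their actual API: it copies only the nodes along the updated path, so every untouched branch keeps its identity.

```javascript
// Hypothetical setIn: return a new root where only the nodes along
// `path` are copied; all untouched siblings are shared by reference.
function setIn(obj, path, value) {
  if (path.length === 0) return value;
  const [key, ...rest] = path;
  // Shallow-copy the current node, then recurse along the path.
  const copy = Array.isArray(obj) ? obj.slice() : Object.assign({}, obj);
  copy[key] = setIn(obj[key], rest, value);
  return copy;
}

const state = { users: [{ name: "ada" }], messages: ["hi"] };
const next = setIn(state, ["users", 0, "name"], "grace");

// next !== state            (root was copied)
// next.users[0].name        is "grace"
// next.messages === state.messages  (untouched branch is shared)
```

That sharing is exactly what makes the reference-equality checks in shouldComponentUpdate cheap: components hanging off `messages` see the same object as before.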

I am the author of react-cursor, which abstracts over the react immutability helpers[1], which I have used in four large projects. It scales great and I recommend it over full-on action-store-dispatcher flux (redux and the rest) for almost all projects except the very largest (like multiple frontend teams). Action-store-dispatcher is less direct and has an abstraction tax compared to just connecting views to state directly.

A React.js lead even just wrote on twitter [2], "Starting to see a trend of premature fluxing of react apps. Try plain @reactjs first and bring in flux only when you absolutely need it."

Here's a screenshot of a pretty big project that scaled great with cursors: http://curator-lilita-10664.bitballoon.com/work-area-metadat.... We had all sorts of complexity scaling problems while building this, but none of them were due to state management; migrating to cursors made 99% of the pain from complicated UI state go away.

[1] https://github.com/dustingetz/react-cursor

[2] https://twitter.com/floydophone/status/649786438330945536

Those helpers are basically the equivalent of seamless-immutable, which uses native JS objects but updates them by copying. You make the exact same tradeoffs as if you chose it over Immutable.js. Comparing those two libs represents the choice of either persistent data structures or copying JS ones.

(Note that seamless-immutable also enforces immutability by using `Object.freeze` in development)

It's worth mentioning that the ES6 array spread operator provides similar functionality. The object spread operator (proposed for ES7) is essentially syntactic sugar for Object.assign. I've even used jQuery's extend the same way in legacy projects.
Thanks for the helpful reply, much appreciated.

A big issue that these immutable libraries and Redux seem to neglect is JavaScript's OO nature. I'm talking about instantiable constructor functions, something that's more readily expressed with ES6's class notation. Most of these libraries (sans Redux) take advantage of this aspect of JavaScript, yet leave you to figure things out if you do the same. This blog post kind of touches upon this regarding Map and Set.

Now, functional programming is nice and lends itself some useful benefits, but unfortunately JavaScript just wasn't built in the value-oriented sense. And given the native features of JavaScript, such as the ability to create "classes," I'd like to take advantage of these features and reap all their benefits, such as encapsulation (which of course, some in the FP community don't seem to care about[1]).

I'd like to see a middle ground somewhere between having e.g. immutable objects yet respecting the way JavaScript is built. Until then, I'd have to ditch common JavaScript idioms, which are only being advanced in ES6. Not something I'm very keen on doing.

1. http://programmers.stackexchange.com/a/216358

Flux seems like the FP equivalent of OOP's MVC to me. So there's no need for what you're describing.

I don't follow the point you're trying to make. What I'm referring to is how JavaScript's object orientedness, which relies heavily on references, doesn't play very well with immutable values.

The only way to really eschew mutability is to ditch the concept of objects representing a state and a set of methods that mutate its state (which maintain encapsulation and enforce separation of concerns). Unless you decide to make all your prototype methods return new instances of the constructor upon mutation, which would be both inefficient and prone to error.
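For illustration, that last option - prototype methods returning new instances - looks roughly like this (the class and method names are made up):

```javascript
// Hypothetical value-style class: "setters" return fresh instances
// instead of mutating `this`.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    Object.freeze(this); // enforce immutability at runtime
  }
  withX(x) { return new Point(x, this.y); }
  withY(y) { return new Point(this.x, y); }
}

const p1 = new Point(1, 2);
const p2 = p1.withX(5); // p1 is unchanged; p2 is a new instance
```

As the comment notes, every update allocates a new object, and forgetting to use the returned instance is an easy mistake - which is the inefficiency and error-proneness being described.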

The problem with Immutable.js is that you have to remember to convert its structures to regular objects to use with third-party modules that expect plain JavaScript objects. One way to address that is to use Object.freeze instead. There's a nice library called icepick[1] that facilitates using Object.freeze to create immutable plain objects.

[1] https://github.com/aearly/icepick

*EDIT: Clarification
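The core idea behind that approach is a recursive Object.freeze. This sketch shows the principle only; icepick's actual API is richer:

```javascript
// Hypothetical deepFreeze: recursively freeze a plain object graph,
// so the frozen data can be handed to third-party code as ordinary
// objects while attempted mutations throw in strict mode.
function deepFreeze(obj) {
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    if (value !== null && typeof value === 'object') deepFreeze(value);
  }
  return Object.freeze(obj);
}

const config = deepFreeze({ ui: { theme: 'dark' } });
// 'use strict'; config.ui.theme = 'light';  // would throw a TypeError
```

Since the result is still a plain object, anything that reads JS objects works with it directly - no .toJS() conversion step.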

The article discusses this, and mentions seamless-immutable.

I haven't really had an issue with this. You can use the functions .toJS(), .toArray(), .toObject() on any Immutable object to go back to a plain JavaScript object.

Sounds like you just skimmed the top of the blog post.

I read it. I was just commenting on an alternative that uses native objects rather than custom objects.

Immutable Records have predefined fields that getters are created for, allowing them to be used as regular objects.

Nice article, but I think some of the ideas need some clarification.

> For example, keys of a Map object are value equality checked... This has a lot of really nice implications.

I'm not sure how immutability helps here. And it's not possible to use reference equality, since with queries, you're converting user input into a data structure.

Edit: never mind. Precisely because you're dealing with input, you have to use value-equality.

One issue with the sample code: in the runQuery snippet, you call .set(...) on queryCache. But since it's an instance of Immutable.Map, without re-assigning the result to queryCache it has no effect.

    queryCache.set(query, results);  // returns a new Map; result is discarded

    queryCache = queryCache.set(query, results);  // correct

Mixing immutability with mutation (using let) seems weird to me. If you want to go immutable to make understanding your app easier, then great, but use const throughout.

Why? `const` isn't any more immutable than `let` is. For any reference type that's bound to a variable with `const`, you can mutate it without getting errors. This was very confusing to me at first, even though I understand the mechanics behind it.

    const a = {}
    // That's OK:
    a.a = 1
    // That's not OK:
    a = 1
While I don't disagree with your point, one could easily make the counter-point that by using `const` you're implying something that's not true.

The translation to C would be * const, not const *. Const is strictly better than let because it prevents a certain class of bugs - use it everywhere possible instead of let/var.

Thanks, fixed!

One problem with the Immutable.js library is that you still can't run a fast "diff" of two immutable values. This makes it difficult to write fast updating code.

Correct me if I am wrong, but the big advantage is that in most cases you no longer need to run a diff on the two values. You can do a naive comparison, and if their references are different then you can assume the values have changed.

If you truly do need to run a diff, you're going to run into the exact same optimization problems with both immutable values and mutable values.

Since assignment of an object is by reference, if you say var stateB = stateA, then both stateA and stateB now point to the same JS object. If you change stateB then stateA will be changed too, since they both point to the same object.

So you would have to do: var stateB = JSON.parse(JSON.stringify(stateA))

to get a copy and then you can change stateB and it will not be equal to stateA.
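Concretely (note the JSON round-trip is a quick-and-dirty deep copy: it drops functions and undefined, and mangles Dates):

```javascript
// Aliasing: both names point at the same object.
var stateA = { count: 1 };
var stateB = stateA;
stateB.count = 2; // stateA.count is now 2 as well

// A JSON round-trip produces an independent deep copy.
var stateC = JSON.parse(JSON.stringify(stateA));
stateC.count = 3; // stateA is unaffected
```

Persistent data structures exist precisely to avoid paying this full-copy cost on every update.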

The problem is that without actual "persistent data structures" (not "immutables" based on Object.freeze etc), the copy/clone operation is expensive. With persistent data structures as in ImmutableJS you get a faster (O(1)?) copy. Therefore, building your state store on ImmutableJS and doing an explicit comparison in shouldComponentUpdate (if you're using React) of prevState !== curState is the only thing you need, and I've seen benchmarks showing a 35% increase in performance over not using ImmutableJS.
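Stripped of React specifics, the dirty check described above reduces to a single reference comparison (this `shouldUpdate` is a standalone sketch, not React's API):

```javascript
// Sketch of the reference-equality dirty check: with persistent data
// structures, "did anything change?" is a pointer comparison.
function shouldUpdate(prevState, nextState) {
  return prevState !== nextState;
}

const state1 = { todos: ['a'] };
const state2 = state1;                               // no update occurred
const state3 = { todos: state1.todos.concat('b') };  // a new state tree

shouldUpdate(state1, state2); // false: skip the re-render entirely
shouldUpdate(state1, state3); // true: something changed, re-render
```

The check is sound only if state is never mutated in place - mutate state1 directly and the references stay equal while the data differs.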


In response to the question about having a list of items and React invoking shouldComponentUpdate: that should not be the case if React stops comparing when the parent (i.e. the list itself) says the update should not happen. Does React in that case descend down the tree to compare the state of the list items? I suppose it would, but there must be a way around it. I heard about the batchUpdate add-on in this context. Researching now!

But let's say the UI consists of a list of N text elements. That list is encoded as an immutable vector. When one of the elements changes, React will run through the complete list and call shouldComponentUpdate for each item (!)

In contrast, with a fast "diff" operation, we could simply determine which of the elements in the state have changed, and this would be fast.

Take a look at this:


"Cortex is simply a store that works for updates at any level. It achieves this by utilizing cursors, which lets each nested node keep track of its path as accessed from the root. When an update occurs, the affected node emits an update event whose payload contains the path of the node as well as instructions on how to update the data at the root. Cortex's internal PubSub picks up the event and routes it to the affected root node. From there, every affected node is rewrapped and updated to maintain immutability while leaving all unaffected nodes untouched. This allows for extremely simple and efficient"

How would you implement a fast diff that didn't involve looking at each text value anyways? Would there be a separate store in the structure that tracks changes, or something along those lines?

You can do this by skipping over shared sub structures, while comparing, for example.
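A sketch of such a diff over plain nested objects - identical (===) subtrees are skipped without descending, which is exactly what structural sharing buys you:

```javascript
// Hypothetical structural diff: collect the paths whose values changed,
// short-circuiting on reference equality so shared subtrees cost O(1).
function diff(a, b, path = [], out = []) {
  if (a === b) return out; // shared structure: skip the whole subtree
  if (a && b && typeof a === 'object' && typeof b === 'object') {
    const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
    for (const key of keys) diff(a[key], b[key], path.concat(key), out);
  } else {
    out.push(path.join('.')); // leaf difference found
  }
  return out;
}

const shared = { name: 'ada' };
const v1 = { user: shared, count: 1 };
const v2 = { user: shared, count: 2 }; // `user` subtree is shared

diff(v1, v2); // only the changed path is reported; `user` is never walked
```

With persistent structures, an update produces exactly this shape - old and new trees sharing everything except the path that changed - so the diff touches only the changed spine.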

I don't see how that is related to immutability. Immutability gives you the ability to verify that it's unchanged in O(1), but it doesn't change your ability to do optimizations like that...

Well, immutable data-structures use "structural sharing", internally. Search for it. For example, this image, [1], shows two data-structures (the green nodes), that internally share many nodes. Now, comparing those two structures can be done efficiently, because the shared parts don't need comparing.

[1] http://eclipsesource.com/blogs/wp-content/uploads/2009/12/cl... (a random image I found on the internet that illustrates my point)

Wouldn't that be an argument for using an immutability library then? If you can implement a shallow equality that takes shared structures into account, what about non-immutable structures makes them a more efficient option for such a thing?

You can still optimize better with immutable types, because if you are diffing you are most likely diffing objects that share a lot of data, and for those subsections of the tree you can still do an `===` equality check and skip it during the diffing process.

Is there any library that does have the diffing feature you're talking about? It sounds like you're describing something like two-way binding, where some code is listening to a callback that fires whenever (for example) an item gets added to an array.

I don't know about every JS UI library, but React certainly doesn't work this way. React would indeed just see that the array object has changed and would go through every element inside. That doesn't seem to be a performance problem, even with thousands of items, although you'll probably want some pagination if your app would otherwise need to generate or change thousands of DOM elements.

Regarding JavaScript interop and Immutable.js, it seems that the authors of Immutable.js are trying to keep their List/Map/Set objects as close to the ES2015 ones as possible [1].

[1]: https://github.com/facebook/immutable-js#javascript-first-ap...
