Hacker News new | past | comments | ask | show | jobs | submit login
Immutable Data Collections for Javascript (github.com/facebook)
204 points by gagan2020 on July 30, 2014 | hide | past | favorite | 89 comments

Hey, I'm the author of this library. It's definitely inspired by mori (and clojure and Haskell) and the reason I ended up building something different was to present a more JavaScript friendly API (and academically, to learn about HAMT). I've built this over the last couple weeks, and we are not using it internally yet - but I wanted to ensure development of it happened in public.

A question: how influential has Swanodette's work with ClojureScript/Om in relation to React been to the direction that you (yourself and Facebook) are taking the evolution of React and its supporting libraries in?

From what I've seen, I'm guessing: quite a bit, but I'm not sure that I've noticed it being mentioned. In particular, I'm guessing that the performance characteristics of Om might be especially inspiring.

Edit: Oh, and good work on the library! It's great to see efforts to make the case of immutability within the world of JS.

I contribute to React. I'm definitely a big fan of ClojureScript and I borrow ideas from it when I can. I currently have a few crazy ideas that are direct results of looking into the Clojure (or Haskell) ecosystem and finding that they've solved my problems, but better. In general, some of the problems we face, at language/library/ecosystem level, might simply not exist in these languages. It's a shame they're not more popular.

Regarding popularity, I feel that sometimes the web development community can be ironically dogmatic, considering that it's a free and open platform (see: HN/Reddit's initial reaction to React). Looking back, I still find it incredible that React actually gained so much traction when it defies so many of the pre-established "best practices". I think lots of people, including me, are learning more and more to take a step back before dismissing an idea.

So if you were one of those that dismissed Clojure/ClojureScript because "parentheses in front of my function call?", "snake case isn't for me", or "it's too different from idiomatic js" (I hope this one isn't too much of a strawman here), then I urge you to give it another try. Learning React doesn't have to be the only time you open your mind to new ideas and reconsider best practices.

Would you mind espousing a bit on what problems we currently face that might not exist in functional languages like Clojure/Haskell?

Also, I just started getting into ClojureScript in order to learn Om. I would love to know your thoughts on why it's worth it to learn Clojure/Haskell.

I don't want to go further into a language/library/ecosystem war here. Glad to talk more about it on #reactjs though =).

Personally, a good amount! I know a lot of the React team are also big fans of Om. React has always been inspired by functional programming and I think in another universe we would have preferred to write it in Scala or Haskell or Clojure. With React (and this Immutable lib) we've been trying to incorporate ideas of functional programming into JavaScript product while still feeling familiar and "native to JS".

Swanodette has already posted on his blog about immutable-js [0].

[0] - http://swannodette.github.io/2014/07/30/hijacking-json/

How's the performance on the immutable vectors?

I wrote an immutable vector library once, but it turned out impractical to do game physics with because it was too slow. Switching to self-modifying vectors fixed the problem. I think the performance hit of allocating new JS objects might be too high for games.

Thanks—this looks pretty exciting! I've been using Mori, but this appears to have a more JS-ish API. Can you give some other comparisons to Mori (i.e. feature set, performance, etc.)?

First off it's very exciting and cool to see a company like Facebook release an open source persistent data structure library for JavaScript.

Mori just has more stuff and the benefit of 3 years of development and production use. This becomes quite apparent on the metrics you asked about:

  - mori is a 32K gzipped browser dependency, tons of helpers functions
  - immutable-js is a 10K gzipped browser dependency 
    (with Uglify and aggressive mangling), mostly just data structures
ClojureScript (and thus Mori) has been continuously tuned for 3 different major JS engines - JavaScriptCore, V8, and SpiderMonkey. Running some basic benchmarks overall Mori comes out ahead on most operations. But there's nothing surprising here, immutable-js is brand spanking new :). Also there are micro-optimizations (which add up) in ClojureScript that are impossible to expose to JS users w/o a compiler.

In anycase this is great stuff and I'm excited to see how JavaScript and more specifically React devs leverage these to explore the ideas currently be played around with in ClojureScript/Om/Elm/etc.

I've run across a couple HAMT implementations you might not have seen:

https://github.com/mattbierner/hamt/ https://github.com/Tvaroh/moreartyjs

The first implementation (hamt) is interesting because it's in a same semantics compile-to-js language (by the author) called Khepri and it benchmarks pretty well on datastructure microbenchmarks.

http://khepri-lang.com/ https://github.com/mattbierner/js-hashtrie-benchmark

The second is a less interesting pure JS implementation that I show as faster than mori and slower than hamt on the hashtrie-benchmark. I've been meaning to test them on a wider variety of benchmarks (more aggregate operations and vs transients) but I only ran across them last week and I've been busy.

Thanks for the links. Sadly doing mori benchmarks like this does't say as much about the performance of ClojureScript data structures than it does about mori specifically - in many cases you are likely just benchmarking multi-arity function dispatch overhead because you don't have the ClojureScript compiler doing direct dispatch for you. If you want accurate benchmarking of ClojureScript data structures you will need to write the benchmark in ClojureScript for the time being and call them from JavaScript.

Also benchmarking in Node.js doesn't reflect the wide variance you find in JS engines. In several cases in the past we avoided V8 specific optimizations because it punished other JS engines.

UPDATE: I also ran some benchmarks and I can't replicate the perf degradation as the size of the hash map increases. My suspicion is that the random key generation may result in many hash collisions w/ Murmur3 which would explain the dramatic perf degradation which is unlikely to occur in the wild.

With the js-hashtrie-benchmark, I wanted to get a rough idea of the relative performance characteristics of the various libraries. Compiling the Mori benchmarks in ClojureScript may produce technically more accurate results, while also giving Mori a sizable compiler optimization advantage over the other libraries, but I was mainly interested in how the various implementations performed when called from JS.

I'm not a benchmarking or ClojureScript expert, so please let me know if you find any problems with my benchmarks or can make any improvements to make them more accurate. Hashing is one area that I need to investigate more in my libraries and in the benchmarks.

Getting more than one datapoint would be nice too. None of my code is optimized specifically for Node, so seeing benchmarks on other engines would be interesting.

I appreciate the response. I was wondering about the lower Mori numbers here because I was sure I'd seen much higher numbers in cljs benchmarks.

Author of the HAMT library and Khepri here. Hadn't heard of moreartyjs before, but it looks like a neat project.

I would be interested in seeing your benchmarks too. Please let me know if you find any problems or have any suggestions for HAMT or js-hashtrie-benchmarks.

The feature set is comparable sans yet to come SortedSet and SortedMap (aka Red Black Tree). Performance should also be comparable if not slightly faster than mori in some iteration cases, as it's tuned for JS instead of ClojureScript and I've forgone some of the functional purity for speed. But this is anecdotal and I don't have any perf tests yet.

I think you might be surprised exactly what running code through an optimising compiler achieves.

I'm curious at what you are hinting at here. Especially when you are discussing the data structure, I've not seen too much magic that an optimizing compiler can really achieve. That is, the compilers are usually more geared to help speed up the code, not so much the data.

Any good links showing otherwise?

I just read through the discussion, and created an "awesome style" note on GitHub based on the resources mentioned. Did I miss anything? https://github.com/memect/awesomeport/blob/master/awesome/im...

does facebook intend to use this internally? The code is (C) Facebook, so Facebook is at least interested enough to pay for the development, right? or is this just some sort of R&D that won't have near term impact at facebook?

What is HMAT?

It would help if I could spell. Fixed the typo and:


There is also Mori[1], a JavaScript API for using ClojureScript's persistent data structures:

  [1] http://swannodette.github.io/mori/

Mori also provides clojure like functionality for the new collections greatly simplifying how you would use them - if you don't use Clojure you might think of the additional functions as an underscore for immutable collections.

I don't see a reason to use any other immutable js lib when Mori exists.

Mori's a bit impractical for use in the browser due to its size.

I'm not seeing how a 32 KB download is impractical in browser use do to size (assuming the above statement about 32KB is true). Is there something else to it?

I didn't see any such statement above. From 176K to 37K (that's what I got on my console anyway) is impressive.

If you use ClojureScript, it strips out unused functions...

That would be true for any properly annotated JS fed to Google Closure Compiler, right?

I definitely need to learn more about it, as dead code elimination for JavaScript seems very, very hard due to its dynamic nature and I'm curious how they do it. Does anyone know about other JS compilers which try to perform dead code elimination?

I really appreciate when a new project mentions the ones in the same space - or the ones that inspired it - and why they aren't good enough and a new project was started.

The guys in the ATP podcast talked about a similar gesture in the Overcast app, and that most users really enjoyed: https://dl.dropboxusercontent.com/u/19746944/overcast_others...

Did FB reference mori? I didn't see it in the README. I was curious if it was just a NIH policy, or if they had reasons why mori didn't fit the bill.

They referenced Clojure. Mori is just Clojure's data structures.

Mori with it's underscore like API indeed seems much simpler and less C++ and Java like. I guess it depends on your taste.

Bit confused (ignorant actually) here. Can someone give a 'real world' example of the benefits of immutabile objects in js?

The first time I understood why immutable objects are important was in this podcast with ClojureScript and Mori author David Nolan: http://podcasts.thoughtbot.com/giantrobots/93

What I thought was the key point was this:

Let's say you have a regular JS array. There is a reference to it in memory and it has a certain value. Now you change an element in that array. You've just mutated it. In JS, the reference in memory doesn't change, but now the value of what it's pointing to has. So in order to know if the value has changed, you need to do a comparison on every element on that array, which is expensive.

Now let's say you have a immutable array (such as from the Immutable FB library or Mori). If you change an element in that array, you get a new array and a new array reference in memory. In other words, if you check that the reference in memory is the same, you are guaranteed that the value is the same (in this case, the array). So checking for equality is fast and cheap because you are only checking the reference, not each value in the array.

One way this is important in React is that now if your component state (or entire app state) is represented by an immutable data structure, you can just check for reference equality in the shouldComponentUpdate() method when deciding to re-render that component. If the reference is equal, you are guaranteed that the data backing that component hasn't changed and you can return false, telling React not to re-render.

What's surprising is that people have made these data structures also very memory efficient. You can see David Nolan talking about how they work and memory comparisons here: https://www.youtube.com/watch?v=mS264h8KGwk

Note that you can have cheap equality detection by using e.g. flags or revision counters and so on (but there are limits due to JavaScript's nature..). The problem is that JavaScript currently doesn't support this properly out of the box, but Object.observe is coming in the next standard.

I personally like the safety benefits of immutability. You can give objects to functions and not worry about the functions changing the object. The only way to guarantee that otherwise is to clone the object, but that takes quite a bit of performance.

The difference is that if you use Object.observe to track changes, then you don't have the ability to hold onto the previous version of an object and know what the old value was.

This is important for animations where you want to animate from the object's old value to the new value.

The objects passed to the Object.observe callback function describe both the new value and the old value, FWIW.

Isn't this library probably cloning?

If you watch the linked video above you will see how clever persistent data structures mean that nothing is ever cloned or copied.

Good to know. I never got into immutable collections because I presumed they would result in things being allocated all the time (and that that would be expensive).

I'd guess this library clones on modification, but I was talking about cloning the object on every function call.

Persistent data structures allow for the appearance of cloning (you have a new thing) but are much more efficient behind the scenes (you don't actually copy everything). So if you find yourself doing a lot of defensive cloning, then Mori or this library would probably be a lot faster for you.

Absolutely! If anyone is reading this, I recommend Eric Lippert's 11 part series on immutability in C#. The concept apply everywhere, of course.


Immutable objects in general or immutable objects in JavaScript specifically?

A general example would be good :)

It is a guarantee to you that the data structure has not been tampered with by the time it reaches you.

    // pseudo code

    var fruitBasket = new FruitBasket('banana, 'apple)


    // here
    assert(fruitBasket.length == 2)
By the time we reach "here", we know that none of the functions since the inception of fruit basket have tampered with it.

This may not be a big deal if you are disciplined about never modifying your source data, but this moves the guarantee from a soft one to a mechanical one.

And then there's all this typical stuff about functional programming blah blah... But that's the most practical example I can think of.

If everything is immutable, it also makes it a lot easier to parallelize tasks, because you can easily deduce a dependency graph.

In multi-threaded environments, immutable objects are thread-safe. When multiple threads are sharing data, you would normally have to worry about synchronization. However, immutable data structures guarantee that the shared data will not change, so synchronization issues simply go away.

That said, I'm not sure how this applies to JS, which is single-threaded.

Even though it's single-threaded (not with web threads maybe?), it can still be asynchronous, which is where the issues really lie.

The original npm package "immutable" was https://github.com/hughfdjackson/immutable (npm install immutable@1.4)

Did Facebook buy the npm name from hughfdjackson, did he give it to them, or did NPM switch ownership?

There's an acknowledgement in the README.


Sounds like more of the pieces of React/Flux are being opensourced. :-)

Interesting to see references to TypeScript. Is it prevalent at FB?

This isn't used internally by Facebook at this point. The author mentioned this after your post. https://news.ycombinator.com/item?id=8108741

To clarify your point, immutable isn't in production at FB yet; we don't know about TypeScript.


But even if it is not being used yet - do we know if Facebook uses Typescript (or some extension of it)?

If you want "pieces" with similar functions to what React has then there is the very modular Mercury too: https://github.com/Raynos/mercury

If anyone is interested in functional collections for Common Lisp or Java, have a look at FSet: http://www.ergy.com/FSet.html

So you can use the Clojurescript/Om/Reagent/React model without Clojurescript, nice for people who don't want or can't jump into Clojurescript right now.

I wish the README would tell us a little bit more about how to use it in conjunction with React. Especially the logic for shouldComponentUpdate() would be interesting.

From my understanding, it makes shouldComponentUpdate() very simple:

return nextProps.value !== this.props.value;

If the object is different, it has changed.

Yes, that's how for I've got, too. But then you have a problem if your React component has multiple attributes. Suddenly your check beckomes much larger:

  return (nextProps.attr1 !== this.props.attr1 ||
          nextProps.attr2 !== this.props.attr2 ||
          nextProps.attr3 !== this.props.attr3 ||
          nextProps.attr4 !== this.props.attr4);
Which would result in a lot of boilerplate code if you had to do this for every single attribute in every component in your application. It would be much simpler to just do this check:

  return nextProps !== this.props;
But that's impossible if you're using JSX, because the JSX compiler automatically inserts a normal JavaScript object into your code, instead of an immutable object from this library.

So from my point of view this immutable-js library and JSX seem incompatible, which is kind of weird since Facebook apparently uses JSX internally for their React code.

PureRenderMixin[1] is doing this without having you to enumerate all the props

[1] http://facebook.github.io/react/docs/pure-render-mixin.html

MyComponent({realProps: immutable props}) comes to mind..

In a flux application you typically always ask flux for the state, though, so it becomes this.getFlux().myThing.

I'm not familiar with React, but why would that not be the case without this lib? {} !=== {}

EDIT: Maybe I understand. You're saying a parent component might do something like: foo.bar = 'blah'; and then doing a simple equality check foo === foo is not enough, you have to check that the properties have not changed either. Is that correct?

Exactly. If you have deeply nested data, multiple properties that may change, or arrays, then it becomes quite difficult (and/or slow) to test if there's a difference.

I don't understand this whole immutable thing... Why not just avoid changing the data once created? Why do you need a library?

Because a data structure is useless if you can't change it. You do want to 'change' the data. Immutable structures let you take a data structure and 'modify' it, getting another structure. Anything with a pointer to the first one has a guarantee that it won't ever change.

With C-style language structures immutable structures wouldn't be much use. With Clojure, for example, immutability is one of a few mutually complementary features of the language in which it makes perfect sense.

Immutability doesn't just mean 'doesn't change'.

Not good enough. When the time comes that you do need to add something, then what? Clone it? In languages that allow you to freeze a collection, adding something to a frozen collection is often a O(n) operation. Most of the collections in this library provide operations that grow collections in (almost) constant time.

> Clone it?

Yes, isn't that enough?

Let's say you want to create a set by recursing along a million long sequence, collecting the unique values. This would be done with an operation called `reduce`.

The time and storage complexity of that would be exponential if each step you had to clone a copy of the output list. With a persistent data structure, as much of the structure as possible is re-used.

And a list is the simplest possible structure. What about a tree?

Using `reduce` is, I would say, in the top 5 techniques used in functional programming. And it would be severely hobbled without efficient persistent immutable structures.

No, it's time and memory inefficient. When you add something to immutable data structure, a new structure is created but internally it shares the data with a structure it was derived from.

I have to admit, I find it odd that we name immutable data structures based on what they don't do instead of what they do.

Of course, what they do is efficient persistence. You want a snapshot of the data? You want a non-volatile reference? You've already got one! That's the feature. They should be called persistent data structures.

You can have an efficiently persistent data structure that isn't immutable.

For example, a rope that reorganizes internal nodes to maximize structure sharing.

That is true! In fact, an earlier version of this library was called "persistent-js", and I know this is a pretty subtle distinction - I found it easier to talk to people about when I talked about it via immutability and so I ended up naming the library that way.

I'm sure you are aware in many cases they are indeed called persistent data structures: http://en.wikipedia.org/wiki/Persistent_data_structure

React's documentation also discuss their immutability helpers [1]. This new library looks like a better way of implementing this though. Will the documentation be amended to discuss use of immutable-js?

[1] http://facebook.github.io/react/docs/update.html

React immutability helpers pros:

- Work on regular JavaScript Object and Array so it's easy to get started and integrate to your app

- You don't need to wrap objects inside of Immutable to use it and you can use `obj.key` and `arr[0]` syntax for accessing them instead of `obj.get('key')` and `arr.get(0)`


- There are no safe-guards if you mutate the objects and it'll likely break your app in subtle ways. This kind of discipline is very hard to get in big projects

- You are stuck with primitive data structures (instead of map hash trie), so it's not going to be as performant and memory efficient.

Also relevant (and not included in this collection) is the following functional red-black tree implementation:


Can someone give me a good example of where immutable data structures are better than mutable ones. I know that you are more likely to mess up and have side effects with mutable data structures, but so far all I have heard is theory.

The big win is they enable lazy evaluation: you can defer the evaluation of an expression until it's needed, or discard the result and compute it again later, or cache the result of an expression and not need to re-evaluate, a flexibility which can be useful in many situations.

You also get undo/redo stacks basically for free, since you simply maintain references to the previous n immutable states.

As Om demonstrates, they also make possible significant optimizations of code that takes various view of the same underlying data, like rendering routines.

Well, for one thing you don't need to make defensive copies all over the place. Look at OBJ05-J and OBJ06-J, for example. Some of these copies can be removed via optimization, but not always.

What's the use of immutable data collections if you have to constantly revert to mutable arrays to render anything? Any gain from the persistent collection is lost in wasteful Array creating (Poor gc)

Hmm, it seems to be written in TypeScript, but the source is only provided in JavaScript.

I actually started by writing it in TS and later moved to just a TS declaration file and a raw JS source. This means it can be used smoothly in both environments (similar to using a definitelytyped resource).

What made you go back to plain javascript?

For an ambitious project this is probably still a better approach. But did you try using the --declaration switch?

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact