JavaScript: Does It Mutate? (doesitmutate.xyz)
130 points by plurby 7 months ago | 77 comments

The greatest sinners are methods that both mutate and return their result. It's possible for people to use some of these for years and not realize they're mutating. This is something Python has largely got right.

> The greatest sinners are methods that both mutate and return their result.

Go's `append` builtin is guilty of exactly that sin. And worse, it does not always mutate its parameter in-place, so things get hellish if you start aliasing/shallow-copying slices because depending on the size of the underlying array, appends could overwrite the other slice's data or make the two slices diverge entirely.

append has a good reason for that: it can potentially reallocate the slice and return a new pointer. This is an expression of the underlying truth of the implementation that they chose not to abstract over, a decision I personally appreciate.

That's just a language that is insufficiently powerful to express the underlying truth. In C++ or Rust, the same construct could be expressed without loss of efficiency and without the leaky abstraction.

Can you elaborate?

Not the original poster, but in C++ you would be using a container, likely a vector so the internal data could move, but because the container is updated the caller can be unaware.

The abstraction leaks a bit. The result of a data() or c_str() call cannot be trusted after the vector is manipulated.

When passing a vector by reference to another function you effectively get a double pointer (abstracted behind a reference and a container) so the inner one can be moved.

I'm not familiar with Go, but if it is like C in this respect, append would have been clearer with a double pointer and no return value.

I made that mistake with `sort`. There was a nasty bug that I tried to fix, when I realised that `sort` actually mutates the array and returns it as well. That was a pretty good lesson to learn.

list.sort returns None in Python, and I'm sure it's been that way for a long time. sorted is the function you mostly want, as it returns a new list each time.

As someone who writes JavaScript full time, this is my biggest gripe with the language. Folks sorting arrays inline without first copying them is one of the things that I pretty consistently call out in code reviews.
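To make the review comment concrete, here is a minimal sketch of the pitfall and the usual copy-first fix. The variable names are made up for illustration.

```javascript
const scores = [3, 1, 2];

// sort() reorders the receiver in place and also returns that same array.
const sortedInPlace = scores.sort((a, b) => a - b);
console.log(sortedInPlace === scores); // true: same object, not a copy

// Copy first (slice() or spread) when the caller's array must stay untouched.
const original = [3, 1, 2];
const sortedCopy = original.slice().sort((a, b) => a - b);
console.log(original);   // [ 3, 1, 2 ]
console.log(sortedCopy); // [ 1, 2, 3 ]
```

The copy is shallow, so objects inside the array are still shared between the two arrays.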

I'm sure that those cases are where you didn't want the mutation, but there are certainly performance and efficiency concerns that are directly satisfied by having the sorting operate on the original array.

This is especially so when considering that such operations can be occurring in the UI thread (and not a web worker) where excessive computation can interfere with the responsiveness of the application.

It would be even better if it was a linting/compile or at least a runtime error to assign a variable to the result of one of those functions.

Python lets you do it, and you get "None" as the result, which makes such mistakes slower to discover.

This would require some way to mark a function as returning or void.

A clean solution would be arity of return values, optionally with a ≥ constraint on the original number of returned values vs bound variable names. E.g. a function returning two values can be bound to zero, one, or two variables, but not three. A function returning 1 can be bound to only 1 or none, and if you return no values, you cannot be bound to anything.

Aside: it is a common misconception that Python has multiple return values: it does not. It can return a tuple, which can be automatically destructured into multiple values:

  a, b = (1, 2) # this could have been a function
and it has syntactic sugar for tuples which allows leaving off the parentheses:

  >>> 1, 2
  (1, 2)
Which, combined, makes it look like multiple return values:

  def foo():
    return 1, 2
but, it really isn't:

  >>> foo()
  (1, 2)
  >>> type(foo())
  <type 'tuple'>
so this would be hard to achieve in Python. :(

Well, couldn't it be like "no return inside the function body" --> no LHS assignment allowed?

Sure :) but it would require a deep change in Python language semantics, without which this isn't possible.

You'd have to carry that information in the AST somehow. From a call site, a function object is just an object with certain properties. "Did it have a return or not?" is not information which exists on that object. Only the return value is (when you call it), and if it is None, you don't know if that came from "return None" or from the default behaviour.

Python could enrich the function object to include that information, and it would essentially be "supporting 0 or 1 return values", instead of "n return values". But adding that no-assignment rule would be backwards incompatible. At that point, might as well go all the way and support any number of return values :)

I feel like these types should be explicitly declared with two different keywords like "func" and "sub". Force one to return a value and prevent the other. This should shame people into writing functions without side effects and make reading code clearer.

Ada uses "function" and "procedure." I think they've relaxed the rules slightly, but it used to be procedures have no return values and functions may not modify their arguments.

This is probably orthogonal though. A function might not return anything and still have side-effects (and vice versa).

That said, a "pure" modifier that ensures no side-effecting code is allowed inside a function would be good indeed.

(There could be purity levels as well, e.g. absolutely pure (doesn't even log, write to a file, or use a random number generator inside) vs. "doesn't mutate program state" (which is a different kind of the same purity concept, I think).)

Well, that's how it works in Erlang... I'm only half-joking - the functions in Erlang are guaranteed to have no side-effects other than sending messages to other processes. So, if some computation never sends or receives a message, it is guaranteed to be side-effect free[1]. This is much weaker a guarantee than the function being pure, but in practice, it makes code reuse just a bit easier and writing stealthily mutating functions just a bit harder, which has real advantages.

[1] Not really, though. There's the process dictionary, and there's no way to statically or dynamically mark a function as not using it. The use of it, however, is discouraged to the point that most popular Erlang books only mention it in passing close to the end, or not at all.

Yeah, effect systems in general are a really cool idea that I wish we saw more of. Haskell and its derivatives are the only semi-popular languages that have some sort of effect system, and those are just a few points in the design space that admittedly aren't perfect.

Realize in C and JS and most languages, the standard equality '=' operator, does exactly this.

For example, console.log(2*(a = 10)) would print 20 while setting a to 10

In C and JS assignment is an expression which returns the value being assigned. This makes the semantics of things like `a = b = 10` trivial.
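A small sketch of what that means in JS (variable names are arbitrary):

```javascript
let a, b;

// An assignment evaluates to the value that was assigned...
const doubled = 2 * (a = 10); // a is now 10, doubled is 20

// ...and it associates right to left, so this reads as a = (b = 5).
a = b = 5;

console.log(doubled, a, b); // 20 5 5
```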

In Pony, though, this is averted, as the assignment there returns the old value of a variable being assigned:

    var a = 10
    var b: Int
    b = a = 20
    a == 20 and b == 10   // very true
It's a very interesting approach I wish I saw explored more often.

> Realize in C and JS and most languages, the standard equality '=' operator, does exactly this.

> For example, console.log(2*(a = 10)) would print 20 while setting a to 10

`=` is the assignment operator, not the equality operator. The equality operator is `==` in C and `===` in JS.

You are correct, my mistake.

Most languages do not have that feature.

Even the C-like languages usually have some comment by their designers claiming that C made a mistake there and that they are not adding it to their language.

In other words, the jQuery convention.

So you hate all builder-object methods that return the builder for further building? Ok.

This isn't the builder pattern, since these methods directly mutate the final result rather than a distinct builder object.

Also, it's perfectly possible to have a non-mutating builder that returns an entirely new builder instance with the given configuration at each step.
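As a rough sketch of that idea in JS: `makeBuilder`, `withOption` and `build` are hypothetical names invented for this example, not any library's API.

```javascript
// Hypothetical builder: every step returns a fresh, frozen builder.
function makeBuilder(config = {}) {
  return Object.freeze({
    withOption(key, value) {
      // Copy the existing config instead of assigning into it.
      return makeBuilder({ ...config, [key]: value });
    },
    build() {
      // Hand out a copy so callers cannot reach the builder's state.
      return { ...config };
    },
  });
}

const base = makeBuilder().withOption("method", "GET");
const withTimeout = base.withOption("timeout", 500);

console.log(base.build());        // { method: 'GET' }
console.log(withTimeout.build()); // { method: 'GET', timeout: 500 }
```

Because `base` is never changed, it can be reused as a starting point for several variants. The cost is one small allocation per step.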

Yes. For the reason stated and also because they imply an easy/cheap/straightforward way of taking a copy of one of those objects that actually doesn't work. Does the caller of your function expect you to take that object it passed to it as "configuration" and mutate it?

Which is interesting given how easy it is to mutate in Python.

Easy, yes. And more importantly, explicit, unlike the JS examples in this article.

This is only because good precedents were set by the built-in functions. Mutating functions return None, and only non-mutating functions return something non-trivial. It's purely convention, but therein lies the power of the core language setting a good example.

That's OpenSSL for you.

That's why I really like the ! in Ruby. If used in a function name, it indicates that the function will mutate its arguments. It's so straightforward compared with just not knowing, or having to use references or pointers.

That's not really true. First, it's entirely a convention and isn't always followed. Second, there are many many methods which mutate objects and don't have a bang. For example, most of the methods which mutate arrays.

Yeah, the second part of my comment was wishful thinking. It would be cool if it worked similarly to how Golang capitalizes function names to make them public.

As someone who has not worked with Ruby for very long, I have also stumbled upon the "!" methods. It is a great convention but, as already noted, not consistently used.

Have a look at the insert method on array:

  insert(index, obj...) -> ary
From the signature and the convention, one could think that it returns a new array with the object(s) added at the specified index. However, insert mutates the array and returns a reference to the just-mutated array, probably for chaining.

However, a convention that is not completely followed in the standard library like this does more harm than good, as it can confuse newcomers. And you cannot change it due to backwards compatibility.

I didn't know whether to respond here or to the parent above, but ! in Ruby's stdlib does not indicate mutation but a "dangerous version". Often that is the same thing, but I believe all stdlib methods with a bang also have a non-bang version, which usually does not mutate the receiver. It could also be something like File.read vs File.read!. I forget if Ruby has those, but Elixir does, and there they indicate that one returns an error- or ok-prefixed tuple while the dangerous version just raises an error if, for example, the file isn't present.

True. Unfortunately I think that means ! just becomes very ambiguous since it's not precisely defined. ("Dangerous" means different things to different people.)

you better follow the convention then. just like with the ? methods.

Julia does this too. It's brilliant.

Side note: I just tried solving an actual problem in julia for the first time. That language does a lot of stuff right, at least for my definition of right.

I also commented below, but the idea here was not actually mutation but safe and dangerous versions of methods. E.g., insert is mentioned below, and it has no safer version, so it has no ! version. Elixir also adopted this convention from Ruby, but there is never any mutation in Elixir, of course, as BEAM doesn't allow it.

This is something Ruby inherited from Scheme. Scheme library devs are, however, a lot more militant about it than Ruby devs, so everything that mutates has a ! after it.

It's one of the small things about Scheme I love.

foo! for mutating.

foo? for predicates.

foo->bar for type conversions.

Pytorch does something similar, except with an underscore at the end of the name. As the arguments are tensors, it's a really handy convention and the API communicates whether a method or function modifies an argument.

I don't believe this was always the case, so it's probably caused a lot of deprecation warnings for some people.

This is why I really like Rust's "mut" keyword that enforces whether a function argument or method target is mutable. :-)


Under "instances" there is a list of mutators, accessors, and iterators that I find more useful since it is grouped and alphabetized.

Another pitfall I have seen devs fall into is assuming .filter() will return the original array if the callback returns true for every element. In fact it always returns a copy, even if all elements pass the filter.
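A quick sketch that shows it (names are illustrative only):

```javascript
const items = [1, 2, 3];
const kept = items.filter(() => true); // every element passes

console.log(kept);           // [ 1, 2, 3 ]
console.log(kept === items); // false: a new array each time

kept.push(4);
console.log(items.length);   // 3: the original is unaffected
```

The copy is shallow: the new array is a different object, but any objects stored in it are the same ones the original holds.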

I recommend the npm package deep-freeze in conjunction with unit testing, makes this way easier to debug.
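For anyone who has not seen the technique, here is a hand-rolled sketch built only on the standard `Object.freeze`. The `deepFreeze` helper below is a hypothetical stand-in written for this example, not the package's actual code, and `fixture` is made-up test data.

```javascript
// Hypothetical helper: recursively freeze plain objects and arrays.
function deepFreeze(value) {
  if (value !== null && typeof value === "object" && !Object.isFrozen(value)) {
    Object.freeze(value);
    Object.values(value).forEach(deepFreeze);
  }
  return value;
}

const fixture = deepFreeze({ tags: ["b", "a"] });

let caught = null;
try {
  fixture.tags.sort(); // sort() tries to write in place
} catch (err) {
  caught = err;
}

console.log(caught instanceof TypeError); // true
console.log(fixture.tags);                // [ 'b', 'a' ]
```

Built-in mutators such as sort() throw on a frozen array, so a unit test that feeds frozen fixtures to the code under test fails loudly at the mutating call. Plain property assignments to a frozen object only throw in strict mode; in sloppy mode they are silently ignored.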

I have implemented quite a few things in an immutable way by now. Easy to test, easy to spot bugs, easy to work with. Come back a few months later, and the purity has been removed from many things, despite clear comments on why it is immutable.

I welcome any effort to make developers aware of immutability.

One simple thing to help here is to have a "mutability" flag marked on every method in MDN.

Yeah, similar to how lodash makes it explicit in the docs. That would be great.

Nice idea, but for a website that tells you about what functions mutate, its indicator for this property is rather small.

I'd maybe even throw out all non-mutating functions in the first place.

If it's on that page it mutates, the end.

It should be noted that you can still mutate in these. Like it's not at all recommended, but you have access to the original array in many of the functional "non-mutating" Array methods, for example: https://codepen.io/konsumer/pen/bKKYJO?editors=0010

This can make things very hard to troubleshoot.

And whether it mutates is kind of a surprise. Like reduce doesn't, even though it works similarly to forEach, map, and every.

No, all of them do: https://codepen.io/anon/pen/bKKaBy?editors=0010

All of these pass the original array as the last argument, reduce()'s callback just takes one more argument, hence the confusion in the original codepen. It would indeed be highly surprising if some of these functions passed a copy of the original array.
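A minimal sketch of the two callback signatures, independent of the codepens linked above (variable names are arbitrary):

```javascript
const source = ["a", "b"];
const seen = [];

// forEach, map, every, filter, ...: callback(element, index, array)
source.forEach((element, index, array) => {
  seen.push(array === source);
});

// reduce: callback(accumulator, element, index, array)
source.reduce((acc, element, index, array) => {
  seen.push(array === source);
  return acc;
}, null);

console.log(seen); // [ true, true, true, true ]
```

In both cases the last argument is the original array itself, so a callback that writes to it mutates the array being iterated.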

ah, yeah, forgot acc param in reduce. updated.

I use those functions a lot and I'm still impressed by how fast they are run by current JS engines (especially V8).

I did a layer over strings/arrays/objects to get nearly the same behavior everywhere. An example of those functions (in my custom programming language) https://p.sicp.me/h4MJE.js and the compiled code https://p.sicp.me/ctr6M.js.

just use ramda ¯\_(ツ)_/¯

Or Haskell, or a lot of other languages.

It's so empowering to be able to answer this question with an instantaneous "no, of course not, it can't be". People that never experienced it have no idea.

There is another version of this question, "now how do I change this thing? I designed my program so carefully to be almost completely pure, but now I want to try this other thing and I'll have to rewrite the code. To avoid that I will use this other hack that is unreadable and has sub-optimal performance" (Maybe I'm just doing it wrong?)

There are two different large worlds in computing.

In one of them it matters if your computer will stop for 10ms every couple of minutes, or if you use an extra 20MB of memory. This is the world of C, Rust and assembly, and you'd be stupid to try high-level code structures here.

The other world is where mostly everybody lives. Here having your code feature complete and correct is overwhelmingly more relevant than putting it above the 95th percentile in performance, and you would be stupid to keep adding complexity just because it's sub-optimal.

It should be very clear what world you are in, because there are actually very few borderline problems.

It's not about performance, at least not 95th percentile performance, by far. It's almost never about that, and it's not what I was talking about. I'm totally fine with less than 20% almost always, and often performance just doesn't matter at all (as you say). But - if an architecture is wrong, sometimes the best you can get without rewriting everything could be like 1%. Maybe the requirements can simply not be fulfilled this way. In any case, my comment was mostly about the development process. I don't want to start a heated discussion (I've started too many of them in the past) so all I will say is what I care about is control, in multiple ways.

Looks like I misunderstood your comment (but it is sometimes about performance; there are people in that performance-above-everything world).

Yes, you are going into it wrong. What I can see on that comment is that:

- You expected to write a "mostly pure" program in a pure language. It won't let you do that; you will write a pure program. You also do not need to be careful in doing it. The most obvious thing you get from a strict language is carelessness: you don't have to care about things that the language does not allow. You focus on (re)writing your program.

- As a consequence of the first point, the more strict (on anything, including mutability) a language is, the easier it is to change old code, because you can assume a lot of stuff about it, and because you care less about making mistakes.

- Specifically about mutability, it does not take any control away. Mutable values are exactly as expressive as immutable ones. You can have mutability semantics on an immutable language, and most try to make it as near a first class syntax as possible, but the one change is that you will have to make it explicit, one way or another.

for sure. Immutability is awesome (or I guess more accurately, state minimisation and explicit management of the remainder).

That's my feeling too, would rather not have to trust myself to remember any of this.

Or Lodash-FP...


Looking at a site like this makes me appreciate a language like Clojure more.

I was quite surprised a while ago when I learned that `sort` in Common Lisp can mutate somehow randomly.

> The sorting operation can be destructive in all cases.


Somewhat relatedly, a very common mistake in Lisp is modifying literals; a lot of people don't realise that there's a difference between writing '(a b c) and writing (list 'a 'b 'c), but QUOTE's description in the spec plainly says "The consequences are undefined if literal objects (including quoted objects) are destructively modified." To demonstrate the difference in practice (DELETE is the destructive version of REMOVE):

    CL-USER> (defun abc-without (letter) (delete letter '(a b c)))
    CL-USER> (abc-without 'a)
    (B C)
    CL-USER> (abc-without 'b)
    (A C)
    CL-USER> (abc-without 'c)
    (A)
This behaves strangely because it modifies the constant list (A B C) rather than consing up a new one on each invocation (like calling LIST or COPY-SEQ would), so by the end the list it passes to DELETE has been reduced to (A).

You'll find people much better suited to explain this on HN, but I'd guess it's about efficiency. CL was defined long ago and one of the goals was for it to be fast. It really is, but there is a pervasive mutability in most of it because of this, with functions like nreverse, rplaca, nconc and so on.

Add something like this to your library and you are fine:

  (defun isort (sequence predicate &key (key #'identity)) 
    (sort (copy-seq sequence) predicate :key key))

Three other questions of similar importance are "which exceptions might it throw?" "what are its exception-safety guarantees?" and "is it thread-safe?" (they do not all apply in all cases, of course.)

Re: "is it thread-safe?", the answer is yes. JavaScript is single-threaded and, since SharedArrayBuffer was disabled in all major browsers in response to Spectre, there's no concept of shared memory.

TL;DR: sort and reverse are the tricky mutators in my opinion. Everything else is more obvious.

It'd be useful to filter by whether methods mutate.

It's clear at first glance to someone familiar with the language that the site is about JavaScript methods, but for the benefit of those who are not very familiar, it would be nice if the site called that out in the title or a subhed.

Thanks, we've updated the title. We don't always need to call out such things, but when a headline has broader enticement than what it actually delivers then we're into misleading or clickbait territory.

I work in computational genomics so programming languages in general was not even the first thing I thought of.

That was exactly my first thought. Do we just assume JavaScript these days? Not even a title or a two-liner about what is going on.

Why should the site need to call that out? It's about JavaScript.

A better approach would be to edit the title here, and any other place linking to it.

Well, you'd hope the read-only functions like `every`, `map` etc don't mutate.

For the ones that do, it might be a good idea to provide examples of how to accomplish the same thing but without mutating. For example, show how to use `concat` where you'd use `push`.
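For instance, a sketch of the `push` vs `concat` pairing suggested above (variable names are made up):

```javascript
const list = [1, 2];

// push() appends in place and returns the new length, not the array.
const newLength = list.push(3); // list is now [1, 2, 3]

// concat() and spread leave the receiver alone and return a new array.
const viaConcat = list.concat(4);
const viaSpread = [...list, 4];

console.log(newLength); // 3
console.log(list);      // [ 1, 2, 3 ]
console.log(viaConcat); // [ 1, 2, 3, 4 ]
console.log(viaSpread); // [ 1, 2, 3, 4 ]
```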
