
The Iterate-and-Mutate Programming Anti-Pattern - dogweather
https://dogsnog.blog/2020/04/23/the-iterate-and-mutate-programming-anti-pattern/
======
tcgarvin
Using common functional tools can be a great way to simplify code like the
example here, but as with most programming techniques, it can easily backfire
in real-life scenarios.

There have been a number of times I've seen junior developers who are high on
functional/immutable paradigms commit mazes of mapping, filtering, rolling and
unrolling, resulting in something that's hard to understand and hard to debug.

Even in the example here, while it's obvious what the replacement code _does_,
it's less clear what it's _intending_. I actually had to refer back to the
original code to understand that the thing we were trying to generate were IDs
of some sort.

~~~
ken
I've seen a-million-and-one junior developers screw up for-loops, too, with
overly complex code that's hard to understand and debug. That's in no way
unique to FP.

I would need at least 3 more pieces of data -- the other (experience_level,
programming_style, bug_rate) tuples -- in order to judge whether this
constitutes a pattern which is best avoided.

~~~
baron816
> I've seen a-million-and-one junior developers screw up for-loops, too, with
> overly complex code that's hard to understand and debug.

Everything can be abused. I see people doing side effects in their mapper
callbacks, deeply nesting ternaries and switch and if/else statements,
mutating far off objects, and just generally making code way more complex than
it needs to be. Usually, this is from junior devs, but not always.

In the right hands, functional programming in JS can be very powerful. It can
be easier to reason about (linear data flow, declarative code), and more
performant (immutable data structures, transducers). In the wrong hands, it
can be even more opaque than imperative, procedural code.

------
qqssccfftt
Around the middle of last decade, a significant chunk of programmers decided
the following:

\- For loops are evil

\- Lambda overhead doesn't exist

\- Nor does the overhead of looping over the same thing N times

Whilst I kind of broadly agree with the article, sometimes you just need a for
loop.

~~~
hackingthenews
Stream overheads exists, but I have never understood why.

Take the example of Java:

Why are loops faster than using streams (in most cases)? Shouldn't the
optimizer/jit be able to see that most streams (even chains of map, filter and
reduce etc) are equivalent to simple loops and just optimize away the overhead
(in the cases where loops are (obviously) faster; non-parallell, serial
computing)?

~~~
Jtsummers
That's exactly what Series does in Common Lisp.

[0]
[https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node347.html](https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node347.html)

~~~
hackingthenews
Also Lazy data structures can achieve this.

And C++ (and rust probably) compilers have no issue doing these optimizations
as well.

------
andrekandre
i thought this was going to be about mutating an array that you are iterating
(which i can understand calling and anti-pattern)

but this is about looping and appending to a new (already empty) array, and he
calls that an anti-pattern...

im not so sure, seems more like "i like this style, so eveything else is an
anti-pattern"

------
kazinator
If you have to walk over some space that is not easily mapped over (perhaps
consisting of disjoint spaces), and collect/produce items according to various
criteria, it doesn't necessarily match, the "prepare a bag, and collect into
it" is definitely a better approach.

He's just working in wretched language without macros.

TXR Lisp:

    
    
      (build
        (each ((i (range 1 5)))
          (let ((id `car-@i`))
            (when (car-empty carid)
              (add id)))))
    

_build_ starts an environment for implicit procedural list construction. The
local function _add_ is visible for adding an item to the bag. The macro
yields the constructed list when it terminates.

The _build_ environment provides _deque_ semantics. Items can be added on both
ends, and also deleted. A depth first search can be succinctly expressed using
the closely related _buildn_ :

    
    
      ;; breadth-first traversal of nested list;
      (defun bf-map (tree visit-fn)
        (buildn
          (add tree)
          (whilet ((item (del)))
            (if (atom item)
              [visit-fn item]
              (each ((el item))
                (add el))))))
    

This example is given in the documentation: [https://www.nongnu.org/txr/txr-
manpage.html#N-01346AAA](https://www.nongnu.org/txr/txr-
manpage.html#N-01346AAA)

------
jimmaswell
Not an anti-pattern at all, especially if more complicated stuff is going on.
I'm not bending over backwards to contort every for loop into the other form.
Not everything needs or benefits from map/select/etc.

------
pritambaral
FWIW, better than both models shown in TFA, I _much_ prefer the declarative
model:

    
    
      (loop for i from 1 to 5
            for car-id = #?"car-${i}"
            when (car-is-empty car-id)
              collect car-id)

------
meheleventyone
Ones is more explicit and the other is more terse but I honestly can’t
understand the argument that one version is more likely to cause bugs than the
other. Is this a simple version for illustration and if so what do the more
complicated versions look like?

~~~
lowdest
The point is to not modify the object that you are looping over from within
the loop. It is very easy to create bugs this way.

~~~
Supermancho
> not modify the object that you are looping over from within the loop

That isn't happening. The loop is over an integral range (1..5 in both
examples). That integer is being hashed in the example and the hash is being
used for a SIDE EFFECT. Internally, the select() is doing the exact same
thing, but there is no label on the value and it's being returned out of the
ostensible container function.

This is not a good example of why this pattern is bad nor of why side effects
are bad.

------
deepsun
What if car_is_empty() function is actually a heavy call to a service that
needs error handling, monitoring, logging etc. E.g. log needs to contain what
car was that that we omitted. It's not that easy anymore with functional
programming.

~~~
dgb23
You would avoid doing this in either style but instead build a query or batch.

A good argument for the second example is readability. The first one re-
invents map & filter which are well known and quickly understood.

------
mannykannot
Let's take a look at the arguments presented here.

> Code that mutates (changes) a variable is harder to reason about, and so
> easier to get wrong.

This is true when the mutation is "off the page" \- in some other scope - but
there is no mutation going on in this example that is hard to understand.

> An explicit 'for' loop is another common source of bugs.

The C-style 'for', where one manipulates a control variable which nominally
denotes those elements of a set which one is concerned with, is very prone to
error because of this indirection, but _both_ versions of this code do this!

> The original snippet is longer because it’s doing boring, low-level work.
> More, boring, code is another source of bugs.

Maybe, but it is not demonstrated here. 'Boring' is an odd adjective here: how
often have you seen the source code for a complete application, let alone
anything larger, that you just can't put down? real-world code is about as
exciting as a balance sheet, and for good reason.

> Long code with low-level operations like for loops and pushing values onto
> arrays is not expressive.

Expressiveness might be a feature of a language, but it is not something
programs inherit merely by being written in a particular language. Programs
are not poetry, and what we want from them is clarity. Expressiveness may help
with that, but this example does not demonstrate it (maybe the author should
have given us an example in APL?)

> The shorter, functional version has no variables, no mutation, and no
> explicit looping. It’s much shorter and easy to understand, once you’ve seen
> this style.

More of the same, with a touch of the Emperor's New Clothes thrown in to
suggest that if we do not get it, we are just not sophisticated enough to
understand.

Don't get me wrong: mutation _is_ the cause of a lot of problems, and I would
prefer for procedural languages to have variables const/immutable by default,
but this sort of article just does not make the case. There must be several
thousand articles like this written every year, all pretty much the same, with
the author never asking whether their example actually makes the case they are
claiming (I would guess because they are so convinced that they are right,
they don't ask the question, and the existence of so many prior articles of
the same sort (which they are effectively plagiarising) makes it seem
obvious.)

To be fair, a decade ago I was in complete agreement with the precursors to
this article.

------
_bxg1
I _usually_ see this pattern as a red flag, but as with anything, there aren't
_zero_ legitimate use-cases for it.

One use-case is when you lack proper iterators. In JavaScript, the
map()/filter() version can be an order of magnitude slower because it's
creating and copying a new array at each step. This can be a worthy trade-off
for readability in certain cases, but in hot paths converting to the iterate-
and-mutate version is one of the most common optimizations I go in and make
later on.

Another use-case is readability (sometimes). reduce(), in particular (going
from a collection of values to a single value), is notorious for being hard to
follow. Of course mapping and filtering are known for usually being very
intuitive. But as a baseline, a step-by-step for loop is always a certain
degree of _kind of_ intuitive. YMMV, but there are cases where it's the more
readable option.

It's probably still worth identifying this as an anti-pattern, so long as you
qualify "anti-pattern" with the fact that tradeoffs always need to be
considered in context.

~~~
baron816
> This can be a worthy trade-off for readability in certain cases, but in hot
> paths...

Hot paths aren’t the common case. You should always default to the more
readable syntax (map/filter) and then refactor once you’ve determined where
your bottlenecks are.

Transducers are also an option to get the map/filter syntax without losing and
performance, but I think most will unfortunately shy away from those.

I do agree that reduce is over used. Reducers as a pattern is really powerful
(with transducers), but no one uses them right.

~~~
_bxg1
> You should always default to the more readable syntax (map/filter) and then
> refactor once you’ve determined where your bottlenecks are

That's exactly what I said:

> in hot paths converting to the iterate-and-mutate version is one of the most
> common optimizations I go in and make later on

> Transducers are also an option to get the map/filter syntax without losing
> and performance

All you need are iterators. Python, Rust, Haskell, most languages support
performing these functions with iterators. There are even iterator libraries
for JS, if you don't mind an extra dependency. But vanilla JS does not support
this.

------
renewiltord
What language runtimes support simple data parallelism for `map`/`filter` with
low syntax cost?

Java lambdas do have instantiation cost but I'd wager that for the majority of
my code low-cost data parallelism beats that. Also, a lot of the theory here
is that the JIT could kill off the instantiation and convert to just calls.
Why doesn't it do that?

Rust, perhaps?

~~~
imtringued
The JVM can trivially inline virtual function calls and lambdas. However, it
will add a sanity check that the function/lambda you're executing matches the
inlined section and if not will use an indirect jump or call as a fallback.

~~~
renewiltord
So if I `List.of(1,2,3,4,5,6,7).stream().parallel().map(x => x * 2)` or the
equivalent, that will inline a left-shift the same way it does it here? That's
pretty cool.

------
sbussard
It would be incredibly helpful to have tips for solving leetcode problems in a
functional way. There seems to be a category of problems that are most
conveniently/efficiently solved using mutation, though I'm personally not a
fan.

~~~
Jtsummers
I've never actually looked at leetcode problems, can you give examples of ones
that seem to be best done using mutation?

~~~
sbussard
Off the top of my head I can only think of the one where you need to merge two
linked lists in sorted order. It seems like most fall into being iterative or
recursive. The iterative solutions tend to involve mutations. Doing them
functionally seems like a bit of extra overhead. I'd love to be wrong about
that because FP seems to be mathematically superior

------
valuearb
Isn’t unreadable code an anti pattern?

~~~
Noumenon72
Pipelines like this tend to come in two flavors:

1\. Long, unreadable, and undebuggable (since you can't step through them)

2\. Ones you wish could have just been a Python list comprehension:

    
    
        return [f"car-#{n}" for n in range (1,6) if car_is_empty(n)]

~~~
pritambaral
Your list comprehension example is incorrect: car_is_empty() takes the id
string, not number.

~~~
Noumenon72
I'm sorry, I actually didn't get what the code was trying to do. Result should
have been named "empty_car_ids" to signal.

What is this "select" function, why isn't it called filter? I actually don't
know how to Google this because the HTML select gets in the way.

I'd really do it in two lines, because I want to get the first half right and
then the second, and be able to inspect the first half independently of the
second while debugging. The pipelines don't let you do that.

    
    
        car_ids = [f"car-#{n}" for n in range (1,6)]
        empty_car_ids  = [id for id in car_ids if car_is_empty(id)]

~~~
valuearb
If I were to teach a professional programming class, I’d automatically fail
the most concise entries, whether they worked or not.

------
Supermancho
These examples aren't in the same language. Not sure how this is a quality
illustration.

~~~
Supermancho
Downvoted? Really? How is this article any different from:

    
    
        Why write all this async javascript? use PID ! {msg}

------
crimsonalucard
It's not an anti pattern. It's actually a zero cost abstraction. Iteration
using recursion causes the stack to fill up.

You could say reduce can work but ultimately reduce needs to be implemented
using recursion or a for loop. Reduce is a higher level function and therefore
not part of the argument.

That being said I do know about Tail recursive optimization. Most language
implementations do not support that though.

