
Mutable Algorithms in Immutable Languages, Part 1 - zik
http://tel.github.io/2014/07/12/mutable_algorithms_in_immutable_languges_part_1/
======
joe_the_user
Hmm,

Now, this and other articles about the difficulty of writing standard
algorithms in functional languages make me want to see FP's pitch be
_unbundled_. The thing is that FP is pitched as a useful thing to learn to get
insight into programming, improve your mental fitness, see things at a
higher level, and so forth. But it's also pitched as a way to be more
productive and effective as a programmer. Effectively, it's sold as both an
exercise cycle and a real bicycle: a way to improve yourself and a good way to
get somewhere.

Now, I'll acknowledge that functional programming probably is great for the
higher-level view of programming and such. But the challenges outlined in this
article make me doubt that FP is a productivity tool, a system that will allow
some large number of programmers to take a large step up in productivity.

I mean, if FP doesn't make simple programming faster and stands in the way
of a large number of algorithmic approaches, it seems your productivity would
be consistently hobbled. Is there a counter-argument to this problem?

~~~
DerpDerpDerp
I do work in Haskell.

At the end of the day, you still write some stateful, imperative code.

It's just that your utility/work functions are pure, which means that you have
predictable side-effects: you've stuck them all in one (imperative) place, or
perhaps a couple places, but you're guaranteed to know where they are because
the type system enforces notating which code has the ability to cause side
effects. (Makes it easy to grep on type signatures, for example.)

Most of my workload is dominated by high level concerns: correctness, large-
scale data flow, coordination of state, and security.

The purity of functions means that I can express all of the
constraints/solutions to those problems in pieces, which I then compose with a
clear data flow between them (since they can't have shared, mutable state),
and only at the end lift them to operate on the state of the system (in a
stateful, imperative block).
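
A toy sketch of the shape I mean (the names and domain are made up; the point
is just that the types say where the side effects can be):

    -- Pure work functions: no side effects are possible, so the data flow
    -- between them is explicit and they compose freely.
    parseAmount :: String -> Maybe Int
    parseAmount s = case reads s of
      [(n, "")] -> Just n
      _         -> Nothing

    applyDiscount :: Int -> Int
    applyDiscount n = n - n `div` 10

    -- The only code that can touch the outside world is code whose type
    -- says so (IO here), which is what makes it easy to grep for.
    main :: IO ()
    main = do
      line <- getLine
      case parseAmount line of
        Just n  -> print (applyDiscount n)
        Nothing -> putStrLn "couldn't parse that"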

This way of programming gives me much more confidence in replacing pieces of
the code without worrying about the side effects, because the side effects are
predictably related to that code.

At the end of the day, that predictability of side effects makes the majority
of my work MUCH easier, even if it adds complexity to some low level data
manipulation.

I easily write 10x the high level code I do low level code, so a 20% savings
on that code, even if I double the complexity of low level code, is a net
savings on my time/effort.

tl;dr: Functional programming is a productivity booster if most of your work
is of a high level nature, even if it makes low level manipulations harder to
do. Interfacing with low level imperative code can be (and is) regularly done,
to get the best of both worlds.

~~~
ScottBurson
This is certainly an intriguing answer, and makes me want to try Haskell
sometime, but it also makes me wonder if I can't get the same benefits, with a
little effort and discipline, in a "semi-functional" language such as
something in the ML family or even (if I can live without static typing) Lisp.

In fact, I've always made a point, when working in a semi-functional language,
of writing as much code functionally as possible. My experience is that it has
the same benefits as you describe. So I don't know how much more there is to
be gotten by actually switching to Haskell. My guess is, most of the
additional benefit would come from the type system. But maybe I would be
motivated to write even more code functionally.

~~~
progman
I just started working with Haskell seriously, and after some trouble I
discovered exactly the same development approach as DerpDerpDerp. However,
since I am not yet experienced, I wish I had an "observer" to print debug
messages (I know that there is a debugger in EclipseFP, but I prefer Emacs for
performance reasons). By "observer" I mean a compiler pragma that lets me
print Haskell values without violating the functional nature of the code.

By the way, you don't need to live without static typing in Lisp. Shen
provides a powerful type-safe layer on top of Lisp.

[http://www.shenlanguage.org/learn-shen/types/types_functions...](http://www.shenlanguage.org/learn-shen/types/types_functions.html)

~~~
taejo
Sounds like you're looking for the Debug.Trace module
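
For instance (a tiny made-up example), `traceShow` prints a value from inside
pure code without changing its type:

    import Debug.Trace (traceShow)

    -- traceShow prints a value (to stderr) when the expression is forced,
    -- without changing the result, so you can peek inside pure code.
    step :: Int -> Int
    step x = traceShow x (x * 2 + 1)

    main :: IO ()
    main = print (map step [1, 2, 3])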

~~~
progman
Thanks, I'll take a look at it.

------
tel
After someone asked for it in /r/haskell I added a link to the complete code
for this post.

[https://github.com/tel/tel.github.io/tree/master/public/code...](https://github.com/tel/tel.github.io/tree/master/public/code/MutableImmutable/Part1)

Thus far there are no dependencies on outside libraries (though eventually
I'll have to use `containers`) so you can just drop it into GHCi and play
around!

------
taeric
For a fun mutable algorithm to test this with, I'd be interested in seeing an
implementation of Knuth's Dancing Links.

For that matter, I'm curious about any performance comparisons between
algorithms such as the DLX and any "FP" equivalent. Does this already exist?

~~~
jerf
Sticking to traditional computer science (i.e., ignoring constants, caching,
and other details, and confining ourselves to Big-O analysis), it is easy to
show, or indeed easy to simply _see_, that any mutable algorithm can be
represented in a purely functional way by creating a balanced tree that
represents memory and can be written to and read from by index. All reading
and writing operations that treat the tree as simply an expanse of RAM are
O(log n), vs. O(1) for direct RAM access. Therefore, pure functional programs
can simulate any impure algorithm with a slowdown _no worse than_ a log n
factor.
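
Sketched very roughly (an illustration of the idea only, not tel's code),
memory becomes a balanced tree keyed by address:

    import qualified Data.Map.Strict as M

    -- A purely functional stand-in for RAM: addresses are Ints, and every
    -- read or write walks a balanced tree, so each access is O(log n).
    type Ram v = M.Map Int v

    readRam :: Int -> Ram v -> Maybe v
    readRam = M.lookup

    writeRam :: Int -> v -> Ram v -> Ram v
    writeRam = M.insert

    -- "Mutating" address 0 three times just threads successive trees along.
    demo :: Maybe Int
    demo = readRam 0 (writeRam 0 3 (writeRam 0 2 (writeRam 0 1 M.empty)))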

Again, let me emphasize this is a _worst-case_ analysis. Many common
algorithms are equivalent regardless, and that's part of the reason tel
specifically names Union/Find as such an algorithm, as it's actually somewhat
rare to encounter one where there isn't a Big-O equivalent algorithm that is
pure. Pure functional algorithms often require some modestly clever
amortization to get there, but that's perfectly valid both in theory and in
practice (many beloved "imperative" structures have amortized complexities,
too, including such basics as a hash table).

In practice pure functional can sometimes come with larger constants; whether
they hit you in practice depends on your use case and, often, sensitivity to
garbage collection pauses.

In light of all that, I think it would be fair to say that your question is
somewhat ill-defined. You can really only compare particular algorithms
against each other, because there's no trivial equivalence between
"imperative" and "pure functional" algorithms. Plus, the barrier between the
two in practice is quite fungible... especially in a garbage-collected
imperative language nothing stops you from using a "functional" algorithm, and
every practical "functional" language will give you a way of running
"imperative" algorithms directly. (Yes, even Haskell. But tel is building up
to that. I won't give it away yet. Stay tuned.)

~~~
taeric
My question may have been ill defined, but your answer was awesome. :)

I have to confess your "easy to show/see" isn't immediately obvious to me yet.
However, that is as likely because I haven't tried hard to see it as anything
else. I'm reading this post in between other things I'm already getting done
poorly, and I fear digging into this would not help.

I _am_ interested, though, so more pointers and explanations would be greatly
appreciated.

~~~
dllthomas
Regarding the "easy to see":

1) Take your algorithm that involves updates to memory.

2) Split it up so that "memory" is represented by an ADT - in the original
imperative setting, it's logically a hash table - O(1) read and write.

3) Replace that ADT with a binary tree. Now you have O(lg(n)) read and write.

4) If the previous algorithm was O(f(n)), it can't do any particular thing
more than O(f(n)) times, including memory access. So in the worst case, you've
made O(f(n)) things take O(lg(m)) times as long (where m is the size of the
memory), so the new algorithm must be in O(f(n) * lg(m)).
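
In code, steps 2 and 3 might look something like this (a sketch with made-up
names):

    import qualified Data.Map.Strict as M

    -- Step 2: the algorithm talks to "memory" only through this interface.
    class Memory mem where
      readCell  :: Int -> mem v -> Maybe v
      writeCell :: Int -> v -> mem v -> mem v

    -- Step 3: back the interface with a balanced tree, turning each O(1)
    -- access into an O(lg n) one.
    newtype TreeMem v = TreeMem (M.Map Int v)

    instance Memory TreeMem where
      readCell k (TreeMem m)    = M.lookup k m
      writeCell k v (TreeMem m) = TreeMem (M.insert k v m)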

~~~
taeric
Ah, I think I see what you mean. It seems the constants being ignored could be
massive. Are there any comparisons of how things perform in practice?
(Similar to how heapsort is typically not used, even though it has among the
better Big-O bounds, right?)

~~~
dllthomas
Right, this particular approach is most interesting as an easy upper bound.

Constants are always highly situation dependent. If you are replacing a single
memory lookup with a tree traversal, that's going to be a huge difference. If,
for some reason, access to your mutable variables is already an expensive
operation, it might not make much difference at all. If you need to take
periodic snapshots of your world state, the mutation-free version might come
out way ahead by sharing portions of the tree that don't need to be copied.

~~~
taeric
I'm not sure how the periodic snapshot would "come out ahead" with the
mutation free version. Seems the best you could claim is it wouldn't be as far
behind as one might think. Unless periodic equals every change. In which case
I would expect they could be equal. (That is, the extra work required to make
it "mutation free" is extra work. Unless all of the extra work is required, it
is hard to see how that version would "come out ahead.")

And this is why I particularly asked about the DLX algorithm. It is
specifically made for rapid backtracking. Reading [1] briefly shows that it
was even parallelized to speed it up ("parallelized" is a gross simplification,
of course). It is a very interesting read on methods for making a heavily
mutation-based algorithm parallel.

[1] [http://did.mat.uni-bayreuth.de/wassermann/allsolutions.ps.gz](http://did.mat.uni-bayreuth.de/wassermann/allsolutions.ps.gz)

~~~
dllthomas
_' I'm not sure how the periodic snapshot would "come out ahead" with the
mutation free version.'_

A snapshot of freely mutated memory is an O(N) copy.

A snapshot of an immutable tree is a single O(1) pointer copy (you just need
to save the root).
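
Concretely (a toy illustration):

    import qualified Data.Map.Strict as M

    main :: IO ()
    main = do
      let v0 = M.fromList [(0, "a"), (1, "b")]  -- the snapshot: just keep this name
          v1 = M.insert 1 "c" v0                -- the new version; v0 is untouched
      print (M.lookup 1 v0)                     -- Just "b"
      print (M.lookup 1 v1)                     -- Just "c"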

Doing a full copy every change would be tremendously costly (substantially
more than the penalty for walking the tree on that change, and probably
overwhelming the overhead of walking the tree on reads).

Doing a full copy every hojillion steps would of course amortize to cheap (and
probably the overhead from walking a tree for reads and writes would overwhelm
it).

Anything real will of course fall somewhere between. As I said, constants are
tremendously context dependent.

Note that this (of course) doesn't _speed up_ the mutation free version - but
if you _have the constraint_ of wanting regular (or otherwise cheap) snapshots
then using the mutation-free version can be the cheapest way of doing that.

I don't know the details of the DLX algorithm, or adaptations to it, well
enough to say much about it in particular off the top of my head. I'd love to
dig into it at some point, but I've unfortunately got higher priorities
presently.

~~~
taeric
This is assuming a very naive snapshot of a mutated memory block. I would
assume that if you were doing something that needed snapshots of each instant
of the program, you would come up with a much more sane algorithm for getting
that done.

It would probably use many of the same tricks as the immutable structures.
Which is why I would assume they would be equal. (That is, I realize that
immutable structures don't do a full copy on every "change." Depending on the
structure and the change, they don't even do a copy at all.)

Consider, we basically just described how git works, no?

~~~
dllthomas
Well, certainly nothing stops you from using the immutable version and calling
it mutable (it just happens to do no mutation!). But my point is that it's a
nontrivial modification compared to other approaches to enabling snapshotting,
and it's a good tool to have in your belt for that kind of situation.

Particularly interestingly, genuinely "persistent" data structures (as used by
Okasaki in Purely Functional Data Structures - which is a fantastic read and
tons of fun) can give amortized bounds that hold even in the presence of a
malicious consumer of the data structure picking which operations to run
against which historical versions of the object.

~~~
taeric
I meant more of the same strategies. Data sharing and the like. In a mutable
language this can be done by simply updating the head pointer easily enough.
In a non-mutable language this is tougher in some respects. This is pretty
much the thing that tripped up a ton of folks back in the early days of
Java. "Why doesn't x.replaceAll('a', 'b') change the value of x?" was not an
uncommon mistake to encounter.

DLX is actually a great example of this sort of thing, as the whole point is
that the operation to remove an item from a doubly linked list can be reversed
to add the item back just from looking at the item that was removed.
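
In Haskell terms the trick would look roughly like this (an ST sketch with
made-up names, nowhere near a full DLX):

    import Control.Monad.ST
    import Data.STRef

    -- A doubly linked node whose left/right pointers are mutable.
    data Node s = Node { left :: STRef s (Node s), right :: STRef s (Node s) }

    -- Unlink x: its neighbours forget about it, but x keeps pointing at them.
    unlink :: Node s -> ST s ()
    unlink x = do
      l <- readSTRef (left x)
      r <- readSTRef (right x)
      writeSTRef (right l) r
      writeSTRef (left r) l

    -- Relink x: because x still remembers its old neighbours, undoing the
    -- removal is just the mirror image of unlink.
    relink :: Node s -> ST s ()
    relink x = do
      l <- readSTRef (left x)
      r <- readSTRef (right x)
      writeSTRef (right l) x
      writeSTRef (left r) x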

And again, consider the way that git works. Nobody would call C and the
techniques it uses immutable, but the underlying data structure is.

More directly put, I am not calling for all data structures in a classical
"mutation based program" to be mutable. I am curious about some of the more
famous mutation based algorithms and if there are good comparisons to the
analogous versions.

There was a great post here a few weeks ago about the blind spot in functional
programmers in building trees. Having just seen "threaded trees" for what I
think was the first time, I have to confess it took me longer to build one
than I would have thought, mainly because I was trying to hold on to some of
the "immutable" methods of programming.

~~~
dllthomas
_" I meant more of the same strategies. Data sharing and the like."_

Certainly it is _possible_ to find alternative constructions that work.
Occasionally these may still be faster. However, I strongly contest that it's
"tougher" in a non-mutable context. Specifically:

 _' In a mutable language this can be done by simply updating the head pointer
easily enough. In a non-mutable language this is tougher in some respects.'_

This is wrong. The hard part about this is making sure old things pointed at
by existing snapshots don't change. If your data is immutable, you get that
for free.

Moreover, in terms of complexity of the system, (mutating algorithm + a bunch
of stuff to capture the mutations) is likely to be messier than the
nonmutating algorithm (which is sometimes cleaner than the mutating version to
begin with, but certainly not always).

Also, note that you've moved to talking about "mutable languages"; we had been
talking about _algorithms_.

 _' This is pretty much the thing that tripped up a ton of folks back in the
early days of Java. "Why doesn't x.replaceAll('a', 'b') change the value of
x?" was not an uncommon mistake to encounter.'_

Which is clearly a problem with "non-mutable languages"? The problem there is
that the Java _paradigm_ had been strongly mutation oriented and then they
dropped an _incongruous_ mutation-free "method that is really more of a
function" in there. Clarity and consistency are important in any setting.

 _" DLX is actually a great example of this sort of thing, as the whole point
is that the operation to remove an item from a doubly linked list can be
reversed to add the item back just from looking at the item that was
removed."_

But you _can't_ do that if you might be sharing those lists with someone
else. The point is that the constraints imposed by immutability are _often_
the most effective means of addressing other constraints, and so study of
these things is quite valuable. This thread has never been "Haskell is much
better because it doesn't let you mutate anything!" - both because I've
already acknowledged that in many settings the mutable versions of algorithms
are preferable and because Haskell _does_ let you mutate things (you just have
to be explicit about what).

 _" And again, consider the way that git works. Nobody would call c and the
techniques they use immutable, but the underlying data structure is."_

As an aside, you can write C with very little mutation going on, if that's
what you want to do. I've not looked at the git source, so I have no idea the
degree to which they do.

As I said, though, that's an aside - my main point here is that immutable
data, and algorithms working with it, are valuable and there are situations
where they are the best solution even where mutation is "allowed" and even
where a mutation-heavy version might be preferred in a slightly different
setting.

 _" There was a great post here a few weeks ago about the blind spot in
functional programmers in building trees."_

A cursory search isn't turning this up - it sounds interesting. Do you have
the link?

~~~
taeric
I think we are still talking past each other. So, first the link you asked
for.
[https://news.ycombinator.com/item?id=7928653](https://news.ycombinator.com/item?id=7928653)
If I am misrepresenting it, apologies. And let me know. :)

I think where I'm talking past you is that I am perfectly fine with
mutation-based algorithms using immutable data structures. That is, union find
can easily be done using a standard Scala immutable Vector for the array. The
only caveat is that each "mutation" has to be of the form
"x = x.updated(index, value)".

So, the question I give you is do you consider that a mutation based algorithm
or not? I would, as the heart of the algorithm is still based on the updates
of the array. You are just safe in knowing that any place you have let a
reference to it leak out is never going to change. This is both good and bad,
of course. Depending on what you are doing.

Stated differently, I don't think we have been distinguishing mutation-based
algorithms that use immutable data structures from mutation-based data
structures. (That is to say, I have not been concerning myself with that
distinction.) So, if you consider it an immutable algorithm as soon as an
immutable data structure enters the mix, then yes, most of what I've been
saying is nonsense.

That seems unnecessarily restrictive to me, as changing things so that each
update to the underlying structure also requires updating a pointer is much
less of a change than, for example, the story that is at the root of this
discussion.

For Git, this is roughly what it does. If you do a repack, the new pack is
only used once it is done. They rebuild the entire pack, then update the
reference to the active pack. If you cancel the process at any point, the old
pack is still good and still works. The process of building the structure is
heavily "mutation" based, but once it is made, nothing is ever changed.

And you should look into the DLX made parallel. It is very different from how
algorithms are typically parallelized in the popular literature.

------
tlarkworthy
Looking forward to seeing where this goes. This well-received paper [1]
discusses converting reference & set type mechanics into persistent and
partially persistent data structures. The model is quite similar to what is
going on here. It would be great to write algorithms and convert them to
immutable, mutable, or partially mutable at will.

[1]
[http://www.cs.cmu.edu/~sleator/papers/Persistence.htm](http://www.cs.cmu.edu/~sleator/papers/Persistence.htm)

~~~
tel
I was considering implementing something similar to the Overmars methods
mentioned on page 90. Maybe I'll read this paper and see if I can tack it on
at the end... Thanks for the link!

------
JoshTriplett
This seems like a reimplementation of ST using an unspecified State Mem.

~~~
tel
It's not _yet_, but—to cheat a bit here—eventually we're going to use `Mem`
to get quite intimate with `ST` and show why it's important to use `ST`
perhaps more often than people even anticipate.

If you'd like, try to instantiate something along the lines of `Mem (State
(IntMap v))`. It's slightly trickier than it looks due to some typing issues
(which don't matter, but I'll save the reveal). It's also likely to be
slightly buggy.

------
transfire
Oh, that is rich.

~~~
tel
Does it not seem possible to you?

Hopefully by part 3 I'll be able to poke very precisely at the seam between
mutable and immutable algorithms and unlock how to pick between them with
greater refinement.

~~~
rwosync
I'm glad that you've taken the time to write more. Your posts have been very
articulately stated, especially the one on type systems.

~~~
tel
Thanks! It means a lot to hear. It's also quite motivating :)

