Sticking to traditional computer science (i.e., ignoring constants, caching, and other details, and just doing Big-O analysis), it is easy to show, or indeed easy to simply see, that any mutable algorithm can be represented in a purely functional way by creating a balanced tree that represents memory and can be written to and read from by index. All reading and writing operations that treat the tree as simply an expanse of RAM are O(log n), vs. O(1) for direct RAM access. Therefore, pure functional programs can simulate any impure algorithm with a slowdown no worse than a log n factor.

Again, let me emphasize this is a worst-case analysis. Many common algorithms are equivalent regardless, and that's part of the reason tel specifically names Union-Find as such an algorithm; it's actually somewhat rare to encounter one where there isn't a Big-O-equivalent algorithm that is pure. Pure functional algorithms often require some modestly clever amortization to get there, but that's perfectly valid both in theory and in practice (many beloved "imperative" structures have amortized complexities too, including such basics as the hash table).

In practice, pure functional can sometimes come with larger constants; whether they hit you depends on your use case and, often, your sensitivity to garbage collection pauses.

In light of all that, I think it would be fair to say that your question is somewhat ill-defined. You can really only compare particular algorithms against each other, because there's no trivial equivalence between "imperative" and "pure functional" algorithms. Plus, the barrier between the two in practice is quite permeable... especially in a garbage-collected imperative language, nothing stops you from using a "functional" algorithm, and every practical "functional" language will give you a way of running "imperative" algorithms directly. (Yes, even Haskell. But tel is building up to that. I won't give it away yet. Stay tuned.)
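For concreteness, here is a minimal sketch (in Python, with hypothetical names) of the tree-as-RAM idea: a path-copying binary trie keyed by the bits of the address. Every read and write touches O(log n) nodes, and every write returns a new root while old roots still see the old contents.

```python
# A persistent binary trie simulating mutable RAM. Writes copy only the
# O(log n) nodes on the path to the leaf; everything else is shared
# between versions. (Illustrative sketch, not a tuned implementation.)

class Node:
    __slots__ = ("left", "right", "value")
    def __init__(self, left=None, right=None, value=None):
        self.left, self.right, self.value = left, right, value

DEPTH = 16  # supports 2**16 addresses

def read(node, addr, depth=DEPTH):
    # Descend by the bits of the address, most significant first.
    for bit in range(depth - 1, -1, -1):
        if node is None:
            return None
        node = node.right if (addr >> bit) & 1 else node.left
    return None if node is None else node.value

def write(node, addr, value, depth=DEPTH):
    """Return a NEW root; the old root still sees the old contents."""
    if depth == 0:
        return Node(value=value)
    bit = (addr >> (depth - 1)) & 1
    left = node.left if node else None
    right = node.right if node else None
    if bit:
        return Node(left, write(right, addr, value, depth - 1))
    return Node(write(left, addr, value, depth - 1), right)
```

Usage: `r1 = write(None, 5, "a")` gives one version of "memory"; `r2 = write(r1, 5, "b")` gives another, and `read(r1, 5)` still returns `"a"`. The old version is never invalidated, which is exactly the property the snapshot discussion below turns on.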

 My question may have been ill-defined, but your answer was awesome. :)

I have to confess your "easy to show/see" isn't immediately obvious to me yet. However, that is as likely because I haven't tried hard to see it as anything else. This post is squeezed in between other things I'm trying to get done (poorly already), and I fear digging into this would not help.

I am interested, though, so more pointers and explanations would be greatly appreciated.
 Regarding the "easy to see":

1) Take your algorithm that involves updates to memory.

2) Split it up so that "memory" is represented by an ADT. In the original imperative setting, it's logically a hash table: O(1) read and write.

3) Replace that ADT with a balanced binary tree over the m memory cells. Now you have O(lg(m)) read and write.

4) If the previous algorithm was O(f(n)), it can't do any particular thing more than O(f(n)) times, including memory accesses. So in the worst case, you've made O(f(n)) things take O(lg(m)) times as long, and the new algorithm must be in O(f(n) * lg(m)).
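The steps above can be sketched directly: write the algorithm against a memory ADT (step 2), then swap the backing store from an O(1) hash table to a path-copying tree (step 3) without touching the algorithm itself. The names here are hypothetical, and the tree is an unbalanced BST for brevity (a balanced tree is what actually gives the O(lg m) bound).

```python
# The same "imperative" algorithm run against two memory ADTs:
# an O(1) dict and a path-copying (immutable) binary search tree.

class DictMem:                       # step 2: memory behind an ADT
    def __init__(self): self.d = {}
    def get(self, k): return self.d.get(k)
    def put(self, k, v):
        self.d[k] = v                # mutates in place
        return self

class TreeMem:                       # step 3: same interface, no mutation
    def __init__(self, root=None): self.root = root
    def get(self, k):
        n = self.root
        while n is not None:
            if k == n[0]: return n[1]
            n = n[2] if k < n[0] else n[3]
        return None
    def put(self, k, v):
        def ins(n):                  # copies only the search path
            if n is None: return (k, v, None, None)
            key, val, l, r = n
            if k == key: return (k, v, l, r)
            if k < key:  return (key, val, ins(l), r)
            return (key, val, l, ins(r))
        return TreeMem(ins(self.root))

def reverse(mem, n):
    """Reverse mem[0..n-1] 'in place' through the ADT (step 1)."""
    for i in range(n // 2):
        a, b = mem.get(i), mem.get(n - 1 - i)
        mem = mem.put(i, b).put(n - 1 - i, a)
    return mem
```

Either backend produces the same answer; only the cost per `get`/`put` changes, which is step 4's point.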
 Ah, I think I see what you mean. It seems the constants being ignored could be massive. Are there any measurements of how these things compare in practice? (Similar to how heapsort is typically not used, even though it has among the better Big-O bounds, right?)
 Right, this particular approach is most interesting as an easy upper bound.

Constants are always highly situation-dependent. If you are replacing a single memory lookup with a tree traversal, that's going to be a huge difference. If, for some reason, access to your mutable variables is already an expensive operation, it might not make much difference at all. And if you need to take periodic snapshots of your world state, the mutation-free version might come out way ahead by sharing the portions of the tree that don't need to be copied.
 I'm not sure how the periodic snapshot would "come out ahead" with the mutation-free version. It seems the best you could claim is that it wouldn't be as far behind as one might think. Unless "periodic" equals "every change", in which case I would expect they could be equal. (That is, the extra work required to make it "mutation free" is extra work. Unless all of that extra work is required anyway, it is hard to see how that version would "come out ahead.")

And this is why I particularly asked about the DLX algorithm. It is specifically made for rapid backtracking. Briefly reading [1] shows that it was even made parallel to speed it up ("made it parallel" is a gross simplification, of course). It is a very interesting read on methods for parallelizing a heavily mutation-based algorithm.
 DLX might be a fun algorithm to explore too. Union-Find is one I was more familiar with, but I think you're correct that DLX cannot have a reasonable, direct "pure" translation.
 'I'm not sure how the periodic snapshot would "come out ahead" with the mutation free version.'

A snapshot of freely mutated memory is an O(n) copy. A snapshot of an immutable tree is a single O(1) pointer copy (you just need to save the root).

Doing a full copy every change would be tremendously costly (substantially more than the penalty for walking the tree on that change, and probably overwhelming the overhead of walking the tree on reads). Doing a full copy every hojillion steps would of course amortize to cheap (and the overhead of walking a tree for reads and writes would probably overwhelm it). Anything real will of course fall somewhere in between. As I said, constants are tremendously context-dependent.

Note that this (of course) doesn't speed up the mutation-free version, but if you have the constraint of wanting regular (or otherwise cheap) snapshots, then using the mutation-free version can be the cheapest way of doing that.

I don't know the details of the DLX algorithm, or adaptations to it, well enough to say much about it in particular off the top of my head. I'd love to dig into it at some point, but I've unfortunately got higher priorities presently.
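The O(1)-snapshot point can be made very small. In this sketch (hypothetical names, with a cons-cell list standing in for any immutable structure), a "snapshot" is literally saving the current root pointer, and every saved version stays fully readable; by contrast, snapshotting a mutable buffer means copying all of it.

```python
# Snapshots of an immutable structure are O(1): just keep the old root.
# Here the structure is a persistent linked list built from cons cells.

def push(stack, x):
    # Returns a new version; the old version is untouched and shared.
    return (x, stack)

def to_list(stack):
    out = []
    while stack is not None:
        out.append(stack[0])
        stack = stack[1]
    return out

history = [None]          # snapshot log: one root pointer per version
s = None
for x in [1, 2, 3]:
    s = push(s, x)
    history.append(s)     # O(1) "snapshot": save the root

# Every historical version is still fully readable.
assert to_list(history[2]) == [2, 1]
assert to_list(history[3]) == [3, 2, 1]
```

A mutable array would need an O(n) copy at each of those snapshot points to get the same guarantee.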
 This is assuming a very naive snapshot of a mutated memory block. I would assume that if you were doing something that needed snapshots of each instant of the program, you would come up with a much saner algorithm for getting that done.

It would probably use many of the same tricks as the immutable structures, which is why I would assume they would be equal. (That is, I realize that immutable structures don't do a full copy on every "change." Depending on the structure and the change, they may not copy at all.)

Consider: we basically just described how git works, no?
 Well, certainly nothing stops you from using the immutable version and calling it mutable (it just happens to do no mutation!). But my point is that it's a nontrivial modification compared to other approaches to enabling snapshotting, and it's a good tool to have in your belt for that kind of situation.

Particularly interesting, genuinely "persistent" data structures (as used by Okasaki in Purely Functional Data Structures, which is a fantastic read and tons of fun) can give amortized bounds that hold even in the presence of a malicious consumer of the data structure picking which operations to run against which historical versions of the object.
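To make the "malicious consumer" point concrete, here is the classic two-list functional queue, sketched in Python with cons cells (hypothetical names). Its O(1) amortized bound holds for a single-threaded sequence of operations, but a consumer who repeatedly dequeues from the same saved version can re-trigger the expensive reversal every time; Okasaki's contribution is using laziness and memoization so the bounds survive exactly that kind of abuse.

```python
# Two-list functional queue: enqueue onto 'back', dequeue from 'front',
# reversing back into front when front runs dry. Purely functional:
# every operation returns a new queue and never touches the old one.

def empty():
    return (None, None)              # (front, back) as cons lists

def enqueue(q, x):
    front, back = q
    return (front, (x, back))        # O(1) cons; shares the old tail

def dequeue(q):
    front, back = q
    if front is None:                # O(n) reversal, O(1) amortized
        while back is not None:      # ...in ephemeral (one-timeline) use
            front = (back[0], front)
            back = back[1]
    return front[0], (front[1], back)

q = empty()
for x in [1, 2, 3]:
    q = enqueue(q, x)

v = q                                # keep an old version around
a, _ = dequeue(v)
b, _ = dequeue(v)                    # replay against the SAME version:
assert a == b == 1                   # correct, but the reversal reran
```

The replay at the end is the attack: each `dequeue(v)` redoes the reversal, so the naive amortized accounting breaks under persistence even though correctness doesn't.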
 I meant more of the same strategies: data sharing and the like. In a mutable language this can be done easily enough by simply updating the head pointer. In a non-mutable language this is tougher in some respects. This is pretty much the thing that tripped up a ton of folks back in the early days of Java; "Why doesn't x.replaceAll('a', 'b') change the value of x?" was not an uncommon mistake to encounter.

DLX is actually a great example of this sort of thing, as the whole point is that the operation to remove an item from a doubly linked list can be reversed to add the item back just from looking at the item that was removed.

And again, consider the way that git works. Nobody would call C and the techniques it uses immutable, but the underlying data structure is.

More directly put, I am not calling for all data structures in a classical "mutation based program" to be immutable. I am curious about some of the more famous mutation-based algorithms and whether there are good comparisons to the analogous pure versions.

There was a great post here a few weeks ago about a blind spot functional programmers have in building trees. Having just seen "threaded trees" for what I think is the first time, I have to confess it took me longer to build one than I would have thought, mainly because I was trying to hold on to some of the "immutable" methods of programming.
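That remove-then-restore trick is small enough to show directly. This is a sketch of the "dancing links" move at the heart of DLX (names are mine, not Knuth's): unlinking a node leaves the node's own pointers intact, so relinking needs nothing but the removed node itself, which is what makes backtracking so cheap.

```python
# The dancing-links move: unlink a node from a circular doubly linked
# list, then restore it using only the node's own (untouched) pointers.

class DNode:
    def __init__(self, value):
        self.value = value
        self.prev = self.next = self  # single node is its own circle

def insert_after(node, new):
    new.prev, new.next = node, node.next
    node.next.prev = new
    node.next = new

def unlink(node):
    node.prev.next = node.next       # splice the node out...
    node.next.prev = node.prev       # ...but node keeps its own prev/next

def relink(node):
    node.prev.next = node            # the reversal: the removed node
    node.next.prev = node            # remembers exactly where it goes

def values(head):
    out, n = [], head.next
    while n is not head:
        out.append(n.value)
        n = n.next
    return out

head = DNode("head")
a, b, c = DNode("a"), DNode("b"), DNode("c")
for n in (c, b, a):                  # build head -> a -> b -> c
    insert_after(head, n)

unlink(b)
assert values(head) == ["a", "c"]
relink(b)                            # restored from b alone
assert values(head) == ["a", "b", "c"]
```

Note that `unlink` does mutate its neighbors, which is exactly why a direct "pure" translation of DLX is awkward: the cheap undo depends on the removed node pointing back into live, shared, mutable structure.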