
Recursion and Tail Calls in Go - signa11
http://www.goinggo.net/2013/09/recursion-and-tail-calls-in-go_26.html
======
evancordell
> Recursion is great for algorithms that perform operations on data that can
> benefit from using a stack, FILO (First In Last Out). It can be faster than
> using loops and can make your code much simpler.

I can't think of a scenario where recursion results in a faster algorithm than
an iterative form. Almost always, recursion makes an algorithm simpler to
express and understand, while the iterative form is almost certainly more
performant at the expense of a little opacity.

I enjoyed reading all of the Go details even if I think that the takeaway
should've been "write recursive algorithms in an iterative form if performance
is a concern" (in Go).
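To make the tradeoff concrete in Go (the language the article covers), here's a minimal sketch of the same algorithm both ways; the function names are my own, not from the article:

```go
package main

import "fmt"

// sumRec computes the sum of a slice recursively; each call
// consumes a new stack frame, and Go does not eliminate tail calls.
func sumRec(xs []int) int {
	if len(xs) == 0 {
		return 0
	}
	return xs[0] + sumRec(xs[1:])
}

// sumIter does the same work in a loop, using constant stack space.
func sumIter(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	xs := []int{1, 2, 3, 4}
	fmt.Println(sumRec(xs), sumIter(xs)) // both print 10
}
```

The recursive version reads closer to the mathematical definition; the loop avoids the per-call overhead, which is the point of writing recursive algorithms iteratively when performance matters in Go.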

~~~
Zikes
Can't recursion do multithreading whereas iteration cannot?

~~~
Jtsummers
It doesn't particularly matter. The loop:

    
    
      a, b = some input arrays of identical length
      c = the output array
      for(int i = 0; i < c.length; i++)
        c[i] = a[i]+b[i]
    

can be easily run in parallel: each iteration of the loop kernel is
independent of every other iteration. And with OpenMP, a #pragma could make
that loop run in parallel without any difficulty (it's been a long time since
I've used it and I've forgotten the actual directive, but it's easy).
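Since the thread is about Go: the same independence property lets that loop be parallelized with goroutines instead of an OpenMP pragma. A sketch (one goroutine per element is wasteful in practice; real code would split the slice into chunks):

```go
package main

import (
	"fmt"
	"sync"
)

// vectorSum adds a and b element-wise. Each index is written by
// exactly one goroutine, so there are no data races.
func vectorSum(a, b []int) []int {
	c := make([]int, len(a))
	var wg sync.WaitGroup
	for i := range c {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c[i] = a[i] + b[i] // independent of every other iteration
		}(i)
	}
	wg.Wait()
	return c
}

func main() {
	fmt.Println(vectorSum([]int{1, 2, 3}, []int{4, 5, 6})) // [5 7 9]
}
```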

    
    
      vector_sum([],[]) -> [];
      vector_sum([A|Atail],[B|Btail]) -> [A+B|vector_sum(Atail,Btail)].
    

(it's been a while since I erlanged, probably messed up the syntax)

Each iteration should be independent, but the way we've constructed it, it's
not easily transformed to run in parallel. In fact, it's a bad construction
anyway, because Erlang _does_ have TCE and that recursion isn't in tail-call
form.

EDIT: Also, iteration and recursion are transformable to each other. If you
have a language with tail-call elimination, then recursive code with tail-
calls should be equivalent to code using more conventional loop structures.

If no data needs to be retained between the recursive calls (consider this
probably bad code):

    
    
      int[] vector_add(int[] A, int[] B, int[] C, int index) {
        if(index == C.length) return C;
        C[index] = A[index]+B[index];
        return vector_add(A, B, C, index+1); // index++ would pass the old value
      }
    

assuming A,B, and C are all initialized correctly outside this call, it's
fundamentally identical to (with TCE, actual assembly may not be the same
without a bit more care taken in defining the function):

    
    
      int[] vector_add(int[] A, int[] B, int[] C) {
        for(int i = 0; i < C.length; i++)
          C[i] = A[i] + B[i];
        return C;
      }
    

That's a trivial transformation. Now, if we need to somehow retain some data
(like an accumulator) it's not much harder:

    
    
      int factorial(int N) {
        int result = 1;
        for(int i = 1; i <= N; i++)
          result *= i;
        return result;
      }
    
      int factorial(int N, int result) {
        if(N == 0) return result;
        return factorial(N-1, result*N);
      }
    

Call the second one as factorial(N, 1); you'd likely have a wrapper to
abstract that out, so any user only has to call factorial(N).
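In Go that accumulator-plus-wrapper pattern looks like the following sketch (names are mine); note that Go does not perform tail-call elimination, so the recursive form still grows the stack, which is the article's point:

```go
package main

import "fmt"

// factorialAcc carries the running product in an accumulator,
// which puts the recursive call in tail position.
func factorialAcc(n, result int) int {
	if n == 0 {
		return result
	}
	return factorialAcc(n-1, result*n)
}

// factorial is the wrapper that hides the accumulator from callers.
func factorial(n int) int {
	return factorialAcc(n, 1)
}

func main() {
	fmt.Println(factorial(5)) // 120
}
```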

What if the recursion isn't a tail-call, how can we transform it? (bad
factorial):

    
    
      int bad_factorial(int N) {
        if (N == 0) return 1;
        return N * bad_factorial(N-1);
      }
    

Assuming we want to keep this stack usage, the equivalent code with a loop
would be something like:

    
    
      int bad_factorial(int N) {
        int result = 1;
        Stack<Integer> values = new Stack<>();
        for(; N > 0; N--)
          values.push(N);
        while(!values.isEmpty())
          result *= values.pop();
        return result;
      }
    

That stack we've created replaces the program stack that was used implicitly
in the recursive version. I won't guarantee the performance or assembly is
identical (they won't be) but they're on the same order for both execution
time and memory use.
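The same explicit-stack transformation in Go, using a slice as the stack (a sketch with my own names, not code from the article):

```go
package main

import "fmt"

// stackFactorial replaces the implicit call stack of the non-tail
// recursive version with an explicit slice used as a stack.
func stackFactorial(n int) int {
	var values []int
	for ; n > 0; n-- {
		values = append(values, n) // push, as the recursive calls would
	}
	result := 1
	for len(values) > 0 {
		result *= values[len(values)-1] // multiply by the popped value
		values = values[:len(values)-1] // pop
	}
	return result
}

func main() {
	fmt.Println(stackFactorial(5)) // 120
}
```

Heap-allocated slice frames stand in for the stack frames the recursion would have used, so memory use stays on the same order.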

~~~
Zikes
Ahh, that makes sense. Thanks!

