> Why don't languages give tail calls their own syntax rather than making it an optimization?

Grumpy old man answer: because if you need "special syntax" to express the fact that your recursively defined function can technically be evaluated in constant stack space, you probably should be writing it as an iterative function to begin with.

The whole point about recursion as a software engineering paradigm is that it's a clearer way to express the problem and thus less likely to be mistakenly implemented. Special syntax is the opposite of clear.

Iteration ain't that hard either folks, and it has the notable advantage of mapping more obviously to the behavior of actual machines we use to execute our code.




> because if you need "special syntax" to express the fact that your recursively defined function can technically be evaluated in constant stack space, you probably should be writing it as an iterative function to begin with.

Why? I find recursive functions are often simpler to understand and write than iteration. Why does the compiler need to stop me from doing so? You write "technically" as though it were just an obscure technicality that it's possible to implement a function call without taking up excessive stack space, but to my mind it's a fundamental fact.

> Iteration ain't that hard either folks, and it has the notable advantage of mapping more obviously to the behavior of actual machines we use to execute our code.

In what sense is this true? Recursive function: update these variables (in registers or memory) and jump back up to this instruction. Iteration: update this variable and jump back up to this instruction.
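
Concretely, a minimal sketch in C (assuming gcc or clang; at -O2 both of these typically compile to the same update-the-variables-and-jump loop):

    #include <assert.h>

    /* Tail-recursive sum: each "call" just updates the two variables
       and jumps back to the top. */
    static long sum_rec(long acc, long n) {
        if (n == 0) return acc;
        return sum_rec(acc + n, n - 1);   /* tail call */
    }

    /* The iterative twin: the same updates and the same jump,
       spelled as a loop. */
    static long sum_iter(long n) {
        long acc = 0;
        for (long i = 1; i <= n; i++)
            acc += i;
        return acc;
    }

    int main(void) {
        /* n kept small so this also finishes without tail-call
           elimination, i.e. at -O0 */
        assert(sum_rec(0, 10000) == sum_iter(10000));
        return 0;
    }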


> Why? I find recursive functions are often simpler to understand and write than iteration.

For simple recursive algorithms, that's true. And those tend to overlap well with ones which don't require assisted tail call tagging or whatever.

Those that do, however, tend to be pretty complicated. And the set of programmers who can understand an iterative implementation of them is IMHO rather larger than the set who can get their heads around the tail call analysis to know how (and when!) to make it work.


It is an obscure technicality: your code only works because a compiler optimization makes it work. If a different compiler executed your code exactly as written, it would crash. Put another way, two pieces of logically equivalent code will succeed or crash depending on whether they are written in such a way that the compiler can recognize an optimization.

I'm in favor of having special syntax because compilers shouldn't really be in the business of 'do what I mean not what I say'.


Despite the name "tail call optimization", the proper implementation of tail calls is not "just" an optimization. Preserving a stack frame for a function which has nothing left to do is incorrect behavior on the part of the compiler which negatively impacts both the time and space complexity of the resulting object code. The code, as written, does not require a stack frame. If the compiled object code crashes it is not because it was executed "exactly as written", but because the compiler inserted something beyond what was written.

The Scheme approach of mandating the use of tail calls for every call which occurs in "tail position" is the correct one, in my opinion.


> nothing left to do is incorrect behavior on the part of the compiler

But it's not, though. By this token, failing to optimize anything could be considered incorrect behavior. Obviously, not turning on optimizations makes your program take longer and use more memory. When a function call is made, its variables are pushed onto the stack and popped when the function returns. Doing something different when the compiler detects certain properties of your code is the definition of an optimization, which at least in theory shouldn't be relied on to produce correct code.

A more correct design, I think, would be having a 'replace' keyword for function calls instead of 'return' that causes the call to replace the stack frame of the caller. Then it is no longer an optimization but specified behavior. Now you have the best of both worlds. The compiler can even warn you when you might want to use 'replace' rather than 'return'.
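
As it happens, Clang (and GCC 15) ship something close to this, as a statement attribute rather than a keyword. A minimal sketch, assuming a compiler with 'musttail' support:

    /* 'musttail' is roughly the proposed 'replace' in attribute form:
       the annotated call must reuse the caller's stack frame, and it
       is a compile-time error if the compiler cannot guarantee that. */
    static long countdown(long n) {
        if (n == 0) return 0;
        __attribute__((musttail)) return countdown(n - 1); /* O(1) stack, guaranteed */
    }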


You suggest literally the same solution that the person you were arguing with suggested!


> Obviously not turning on optimizations makes your program take longer and use more memory.

Yes, but it shouldn't take an algorithm which only requires constant space as written and turn it into object code which requires linear space.

Update: Elaborating on this point somewhat—Consider the case of a simple loop with a local variable. The variable is specific to the body of the loop; each iteration, in effect, has its own copy of the variable. What if a compiler implementer decided, for the sake of simplicity, to allocate block-local variables with alloca(), so that new memory is reserved from the stack for each loop iteration, and only freed when the enclosing function returns? Perhaps an optional optimization is provided -floop-variable-space-optimization which "lifts" the variable out of the loop, allowing the memory to be reused and restoring constant space complexity. Would you still consider this an acceptable implementation on the grounds that "obviously not turning on optimizations makes your program ... use more memory"? I think not.
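
To make the hypothetical concrete, here is roughly what that imaginary compiler's output would look like if you wrote it by hand (a sketch; no real compiler does this, which is the point):

    #include <alloca.h>

    /* The naive strategy from the paragraph above: each iteration's
       local is carved out of the stack with alloca() and reclaimed
       only when the function returns, so a loop that is O(1) space as
       written becomes O(n) space as compiled. */
    void naive_loop(int n) {
        for (int i = 0; i < n; i++) {
            int *local = alloca(sizeof *local); /* stack grows every iteration */
            *local = i;
        }
    } /* all n allocations are freed only here */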

> When a function call is made, its variables are pushed onto the stack and popped when the function returns.

That is nothing more than an unfortunate, naïve implementation choice common among compilers which do not implement tail calls properly. Nothing in the definition of a function or a function call requires the use of a stack. What is required is only a continuation—a reference to the code to be executed when the function is finished. When there is a chain of calls F -> G -> H, and the G -> H call is in tail position, the natural continuation for that call is the same continuation which was provided to G. A compiler which does not implement tail calls properly, however, interposes an extra, unnecessary continuation for the call to H, turning a (potentially) constant-space function into a linear one.
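
In code, with hypothetical F, G and H just to fix the picture:

    /* G's call to H is the last thing G does, so H can inherit the
       continuation (return address) that F handed to G and return
       straight to F. A compiler without proper tail calls makes H
       return to G first, purely to visit G's leftover stack frame. */
    static int H(int x) { return x + 1; }
    static int G(int x) { return H(x * 2); }  /* tail position: nothing follows the call */
    static int F(int x) { return G(x) + 3; }  /* NOT a tail call: the "+ 3" still has to run */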

> A more correct design, I think, would be having a 'replace' keyword for function calls instead of 'return' that causes the call to replace the stack frame of the caller.

Most modern languages make it simple to determine when a function call is in tail position, and in such cases I feel that a special calling syntax would be an unnecessary distraction. In Scheme, for example, the cases where a tail call is required are obvious from the structure of the code and are, moreover, spelled out in the language standard. However, I will grant that languages like C and C++ which occasionally depend on the compiler adding hidden code at the end of some functions to run destructors or implicitly transform the result type might benefit from such a keyword. It would, at any rate, be an improvement over the current situation where proper tail calls are deemed an optional optimization.


> your code only works because a compiler optimization makes it work. If a different compiler executed your code exactly as written it would crash.

But it's completely arbitrary that you consider this "a compiler optimization" as opposed to "the way the language behaves." The languages you're used to work the way you describe: if the stars are aligned, a particular compiler may do TCO as an extra optimization.

But there is absolutely no reason it has to be this way. It is perfectly possible for a language standard to mandate that TCO must happen and to specify in an easily understood way when it must happen. As Scheme did over 40 years ago.


How is a keyword (such as OCaml's 'rec') less clear than rewriting the whole function in an iterative form, with junk like temporary loop variables?

> mapping more obviously to the behavior of actual machines we use to execute our code.

There's a reason we usually don't write in assembly.


Should the compiler do what you mean or do what you say? Is it good design to have a system where your code only works because the compiler was able to recognize an optimization? Why not make it explicit?


> Why not make it explicit?

Because we want programming languages to hide hardware complexity behind more elaborate semantics. Programming languages are for humans, not for machines.


Structured looping constructs are rather removed from how a machine works - not that this is inherently a bad thing. The way it actually works in a physical machine is via jumps into code that has already run, i.e. self-reference, i.e. recursion.


My way of saying the same thing is that there is surprisingly little difference between the recursive and iterative models when you start breaking things down.

Indeed, if I were to add an explicit tail syntax to some language, it would probably involve the "goto" keyword.


A loop can be turned into branch-based code with an absolutely trivial transformation (e.g. evaluate the test, branch to the end if false, branch back to the start at the end), so I don't quite know what you mean by that. 3GL compilers were available with working looping syntax within years of the first programmable computers. FIVE decades later, and as this thread proves, we still don't have general algorithms to turn recursive solutions into iterative ones at the compiler level.
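
That trivial transformation, written out by hand:

    /* A while-style loop lowered to explicit test-and-branch. */
    int sum_to(int n) {
        int acc = 0;
    top:
        if (!(n > 0)) goto done;  /* evaluate the test, branch to the end if false */
        acc += n;
        n -= 1;
        goto top;                 /* branch back to the start */
    done:
        return acc;
    }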

It's true that it's more abstracted than machine code. It's absolutely not true that it's equivalently complicated to tail recursion detection.


> It's absolutely not true that it's equivalently complicated to tail recursion detection.

Tail recursion doesn't need to be “detected”. The correct implementation of all tail calls, recursive or otherwise, is to overwrite the current activation record, rather than save it.
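
Note that this covers mutual tail calls too, which mere self-recursion detection would miss. A sketch, again leaning on Clang's 'musttail' attribute to make the requirement explicit:

    /* "Recursive or otherwise": these two functions tail-call each
       other, and with each call overwriting the current activation
       record they run in constant stack space for any n. */
    static int is_odd(unsigned long n);

    static int is_even(unsigned long n) {
        if (n == 0) return 1;
        __attribute__((musttail)) return is_odd(n - 1);
    }

    static int is_odd(unsigned long n) {
        if (n == 0) return 0;
        __attribute__((musttail)) return is_even(n - 1);
    }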



