This call-to-continuation business sounds pretty bad for performance, though: all local state needs to be copied. I guess one could just pass along a pointer to a memory blob or arena, which is not too different from using the same stack pointer before and after the call, so hmm... I wonder what the downsides really are.
Simply to call a helper function and have it return, you need to create a lexical closure and pass it to that helper as the return continuation. Since everything is allocated on the stack, the closure is fairly cheap: its environment is already on the stack, so it amounts to much the same thing as a frame link pointer plus a return address. The difference is that the callee can run out of stack and have to invoke a stack reset. During the stack reset, the previously cheaply allocated return continuation/lambda is moved to the heap, along with others like it up the call stack.
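To make the control flow concrete, here is a rough Python sketch. It is only an illustration: `STACK_LIMIT`, `StackReset`, `call`, `trampoline`, and `sum_to` are names I made up, the "stack" is just a counter, and Python closures already live on the heap, so the copying step is not modeled at all. What it does show is that every helper call takes a closure as its return continuation, and that a reset throws the whole stack away while the pending call (and the continuations it holds) survives.

```python
STACK_LIMIT = 50  # hypothetical size of the simulated stack, in calls
state = {"depth": 0, "resets": 0}


class StackReset(Exception):
    """Raised when the simulated stack is full; carries the pending call."""

    def __init__(self, thunk):
        super().__init__("stack reset")
        self.thunk = thunk


def call(f, *args):
    """Call f(*args), or unwind to the trampoline if the stack is full."""
    if state["depth"] >= STACK_LIMIT:
        # Package the pending call and discard every frame above the trampoline.
        raise StackReset(lambda: f(*args))
    state["depth"] += 1
    return f(*args)


def trampoline(thunk):
    """Run thunk, restarting on a fresh (simulated) stack after each reset."""
    while True:
        state["depth"] = 0
        try:
            return thunk()
        except StackReset as reset:
            state["resets"] += 1
            thunk = reset.thunk


def sum_to(n, k):
    """Compute 0 + 1 + ... + n in continuation-passing style."""
    if n == 0:
        return call(k, 0)
    # The lambda is the return continuation: it captures n and the caller's k,
    # playing the role of a frame link pointer plus return address.
    return call(sum_to, n - 1, lambda partial: call(k, partial + n))


result = trampoline(lambda: sum_to(1000, lambda x: x))
print(result)  # 500500
```

In a real implementation of this scheme the interesting cost is in the reset: the live closures have to be found and copied off the stack, which this sketch gets for free from Python's heap.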