tl;dr we implemented it, it works, but we are not shipping it because we believe it would hurt developers to lose some stack frames from Error.stack in existing sites, among other issues.
Our approach was to make our debugger still show those frames, and to observe that in the year since we've had tail calls in builds, we haven't seen a single compatibility issue from the change to error.stack behavior.
I program in Lua, which does to TCO and I've never found it to be an issue with debugging. Now, that could be because of the ways I use TCO---I tend to use it in a recursive routine:
function mainloop()
local packet = receive()
if not packet then
syslog('warning',"error")
return mainloop()
end
process(packet)
return mainloop()
end
(mainly because Lua does not have a 'continue' statement, and this is the best way I've found to handle that) or in state machines:
function state_a(state)
return state_b(state)
end
function state_b(state)
return state_c(state)
end
function state_c(state)
if somecondition()
return 'done'
else
return state_a(state)
end
end
The thing to remember is that a TCO is a form of GOTO. And with GOTO, you have no stack entry, but given that this is a controlled GOTO, I don't see much wrong with it. Do most programmers find TCO that confusing? Or is it the perception that most programmers will find TCO confusing? Have any real studies been done?
If the main worry is information loss when debugging, why not figure out a mechanism that's 98% accurate but much more efficient than a full redundant shadow stack?
For example, you could compact the stack every 200 frames it grows, but never remove frames in the top 50 or bottom 50. How often in practice would that give you a misleading view of the stack? (Assume that each frame keeps track of how many tail frames are omitted after it. If needed, assume that tail frames within X distance of a non-tail frame will not be omitted ever.)
One of the things I like most about WebKit is that several of you reliably write awesomely informative commit messages.
(I hope Felix sees this particular commit sometime; I think he might feel a little flattered! Also, wow, that's an excellent summary by Peter Bex.)
Another is that you keep looking into old quiet dark corners of language nerdery and actually make use of the good ideas lurking there (notably while retaining the "no performance regressions EVER" tyranny).
I think there are some neat ideas Chicken Scheme's compiler too.
Also, did you have a raiding party on T when doing DFG? (cf. Olin Shivers at http://www.paulgraham.com/thist.html starting with the paragraph, "This brings us to the summer of 1984. The mission was to build the world's most highly-optimising Scheme compiler." and notably also the paragraphs starting "Richard Kelsey..." and "Norman Adams...". . Always take ideas from Shivers, at least if they're faster in practice. Also, sorry for the several edits. I forgot how good this overview was, and how much meat is in it.)
It probably could be, but we deliberately don't do it, because:
1) I'm not aware of complaints about the change to error.stack behavior from users or developers. I don't know of an app that broke because of the change to error.stack. I don't know of an app whose telemetry got messed up because of the change to error.stack. So, we don't have a real-world test case that would be improved or fixed by integrating ShadowChicken into error.stack. We're not going to impose any overhead, or spend time trying to optimize that overhead, if it isn't going to benefit anyone.
I've heard lots of hypothetical fears about error.stack changing, but I haven't seen a real-world example of the change in error.stack behavior being harmful. If you know of an app that breaks because of our error.stack change, please let us know!
2) Philosophically, we view the feature as PTC (proper tail calls), not TCO (tail call optimization). If it was an optimization then we'd want it to be hidden from the user. But that's not what PTC is: it's a guarantee to the user about how the stack will behave. Therefore, we make error.stack precisely reflect PTC. We go to great lengths to emulate PTCs in some cases to make this work, for example if the optimizing JIT is involved. For example:
function foo() { ... } // say that this is inlined
function bar() { return foo(); } // this tail-calls foo. say that this is inlined
function baz() { return bar() + 1; } // say that our top-tier JIT compiles this
In this case, foo and bar will sort of cease to exist since all that really matters for execution is the code that the JIT generated for baz, which now also includes the code for foo and bar. Inlining is super careful about error.stack. In this case, our optimizing JIT's inlining data will include complete information about the call stack (baz->bar->foo) but will flag the bar frame as tail-deleted so that error.stack will only show baz->foo.
So, instead of making ShadowChicken hide PTC from error.stack, we actually have an entirely separate set of features to make error.stack precisely reflect the reality PTC. On the other hand, if you open the inspector, we want ShadowChicken to show you the tail-deleted frames and to flag them appropriately. The inspector integration WIP is here: https://bugs.webkit.org/show_bug.cgi?id=156685
TL;DR. JSC doesn't try to lie to its clients about PTC. PTC is part of the language, so we precisely reflect PTC's behavior in error.stack and in the inspector (the tail-deleted frames show up but are flagged as such).
(EDIT: I changed the definition of bar and baz above because my original example didn't have the tail calls that I wanted.)
I don't have such numbers, and I'm not sure they would be relevant to this discussion.
We already know that debugging isn't the issue. ShadowChicken solves the debugging problem and other VMs could do it, too. ShadowChicken is just one possible algorithm in a much larger family of chicken algorithms.
The only way that PTCs are observable outside the debugger - beyond making you run faster and use less memory - is error.stack. Hence the challenge: find me a website that uses error.stack in such a way that PTC breaks that website. Surely if the other VMs are so against PTC on the grounds that it will break websites, they will be able to tell us about a website that broke in Safari because of PTC.
Even if were the worst engine since IE4, it's Not Chrome(tm) and can help when chrome dev tools fail (or - more likely - I fail at working with chrome dev tools and need a fresh perspective.)
It's also good practice to run the profilers at least occasionally, because (a) performance in Safari is relevant and I've been bitten by idiosyncrasies where one browser took >5x than another (in all directions. And (b), as above, just by being different they may add useful information.
tl;dr we implemented it, it works, but we are not shipping it because we believe it would hurt developers to lose some stack frames from Error.stack in existing sites, among other issues.