It's fairly easy to make them non-brittle, because it's very easy to rig a testing framework to run both the hand-optimized "jet" and the pure code, and compare them.
Standardizing performance is a subtler and more interesting point. It's a fairly safe bet that anything in the kernel that needs to be is jet-propelled. Above that layer, who knows? Good old normative text may handle it.
But in general, the feeling should be like the difference between hardware GL and software GL. The developer doesn't see it and the user sees it only as fast versus slow.
What about when the pure code is too slow to run? Or when the pure code is wrong but the jet is right - so that the nook spec is now not sufficient to determine semantics.
Let me bury that criticism though by saying that this is the most beautiful piece of work I have seen in some time. If more people were willing to be a little crazy we might not be quite as tangled in spaghetti.
One, it takes a good bit of work to boot Arvo beyond a merely correct nock interpreter. The Hoon type system, for instance, is enormously painful if not jet-propelled.
Two, the kind of bug you're describing is in fact the nastiest class of bug. The best way to get around it is to always make sure you write the pure code first. But this isn't possible in a variety of circumstances - and it sure wasn't possible when making Hoon first compile itself. So, as inevitably in the real world, there is no substitute for not screwing up.