If you use undocumented instructions, you're not following "do only operations that are well-defined by the level below". The existence of undocumented instructions doesn't invalidate the concept of correctness by contract.
You don't seem to realize that testing and proofs are just two faces of the same thing. A test is just an empirical proof of something. Both are rooted in the same logic and assumptions. In fact, in some cases, a test can be exhaustive and then it's as good as a proof. Tests usually have the disadvantage of being inexhaustive, and having to work their logic within the language.
I'm getting to the point that security software, such as an OS kernel, that is vulnerable to timing and speculative execution attacks, is that way not only in spite of being proven, but in spite of testing also.
It's not the case that proofs will fail to reveal that problem, but testing will.
No amount of testing will reveal a problem missed by proof methods, if both the testing and proving are rooted in the same false assumptions.
(Assumptions like "the hardware's protections mechanisms are sound, so that no data flows are observable to an unauthorized domain, so we just have to test the software itself is good.")
Erratum: testing, of course, exercises every layer you depend on, whereas proof methods typically assume that the things below conform to their contracts. That is where proofs are weak. A proof will show that your code is correct; a test will reveal a compiler bug, CPU bug, faulty RAM, ...
You don't seem to realize that testing and proofs are just two faces of the same thing. A test is just an empirical proof of something. Both are rooted in the same logic and assumptions. In fact, in some cases, a test can be exhaustive and then it's as good as a proof. Tests usually have the disadvantage of being inexhaustive, and having to work their logic within the language.
I'm getting to the point that security software, such as an OS kernel, that is vulnerable to timing and speculative execution attacks, is that way not only in spite of being proven, but in spite of testing also.
It's not the case that proofs will fail to reveal that problem, but testing will.
No amount of testing will reveal a problem missed by proof methods, if both the testing and proving are rooted in the same false assumptions.
(Assumptions like "the hardware's protections mechanisms are sound, so that no data flows are observable to an unauthorized domain, so we just have to test the software itself is good.")