I'm torn over whether I should be proud or disappointed that my highest-voted comment is a pun. On HN, and in a thread that is fairly serious considering it's about modern CPUs catching fire (though I heard Intel was independently working on that catch-fire feature with their newest 28-core release).
I'm pretty sure this attack doesn't apply to AMD, being built on the original Meltdown attack that was Intel-specific. So what'll happen is Intel will change their chips to not do prefetching without also doing a permissions check, like AMD does. Meltdown solved.
It's not built on Meltdown. It's not about violating permissions; it's about treating the contents of a page table entry as valid even when the page is not present, or (in the case of EPT, which is worse) treating a guest-physical address as a host-physical address.
However, unlike Meltdown it cannot access data that is not already in the L1 cache.
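Concretely, the leak primitive ends up looking like the usual Flush+Reload gadget. The sketch below is only my rough understanding, not a real PoC: fault suppression (TSX or a signal handler) is left out, and it assumes the attacker has already arranged a not-present PTE whose frame bits alias victim data sitting in L1d.

    /* Hypothetical L1TF-style transient read; illustration only. */
    #include <stdint.h>
    #include <x86intrin.h>             /* _mm_clflush, __rdtscp */

    #define LINE 64
    static uint8_t probe[256 * LINE];  /* Flush+Reload probe array */

    /* 'ptr' maps through the doctored, not-present PTE. */
    static void transient_leak(volatile uint8_t *ptr)
    {
        volatile uint8_t *p = probe;
        for (int i = 0; i < 256; i++)
            _mm_clflush((const void *)&p[i * LINE]);

        /* Architecturally this load faults on the cleared V bit; on
         * affected cores it transiently forwards stale L1d data, and
         * the dependent access leaves a cache footprint indexed by
         * the leaked byte.  (Fault suppression omitted here.) */
        uint8_t secret = *ptr;
        (void)p[secret * LINE];
    }

    /* Recovery: the one fast-to-load probe line reveals the byte. */
    static int recover_byte(void)
    {
        volatile uint8_t *p = probe;
        unsigned aux;
        int best = 0;
        uint64_t best_t = UINT64_MAX;
        for (int i = 0; i < 256; i++) {
            uint64_t t0 = __rdtscp(&aux);
            (void)p[i * LINE];
            uint64_t dt = __rdtscp(&aux) - t0;
            if (dt < best_t) { best_t = dt; best = i; }
        }
        return best;
    }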
I mean, the V bit and the rest are just PTE permissions. It's literally the same root cause as Meltdown: page faults are handled asynchronously on Intel hardware, and speculation continues past them.
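For reference, here's roughly what those bits look like on x86-64 (my paraphrase; the helper names are made up):

    #include <stdbool.h>
    #include <stdint.h>

    /* x86-64 page-table entry bits relevant here.  Bit 0 is the
     * Present ("V") bit; the others are the permission checks that
     * Meltdown-class attacks speculate past. */
    #define PTE_PRESENT   (1ULL << 0)           /* V: entry is valid */
    #define PTE_WRITABLE  (1ULL << 1)           /* R/W               */
    #define PTE_USER      (1ULL << 2)           /* U/S               */
    #define PTE_NX        (1ULL << 63)          /* no-execute        */
    #define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL /* physical frame    */

    /* L1TF's twist: even with PTE_PRESENT clear, affected cores
     * transiently use these frame bits as an L1d lookup key. */
    static inline uint64_t pte_frame(uint64_t pte)
    {
        return pte & PTE_ADDR_MASK;
    }

    static inline bool pte_allows(uint64_t pte, bool user, bool write)
    {
        if (!(pte & PTE_PRESENT)) return false; /* the check L1TF skips */
        if (user && !(pte & PTE_USER)) return false;
        if (write && !(pte & PTE_WRITABLE)) return false;
        return true;
    }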
The root cause is the same, but it's a different kind of page fault, and the effect is that you cannot read data that is not already present in the cache. On the other hand, Meltdown doesn't break through the guest-host barrier when EPT is active.
Yes, deep down they happen for the same reason, but then so does Spectre as well.
They're both rooted in how page faults are asynchronous at a uarch level on Intel, and not on any of the other vendors' parts. Neither this nor Meltdown applies to AMD or ARM.
There is a lot of legacy cruft in x86, but it's the devil we know. After decades of use, we are still discovering vulnerabilities in a platform thought to be well understood.
The closest alternative would be ARM. In any case, it's a massive undertaking.
It is not obvious to me that the design warts of x86_64 (wholeheartedly agreed, I don't like x86_64 either) are the cause of the security problems we're seeing. Other architectures also have speculative execution and multiple rings. It's a lot easier to avoid vulns that are a consequence of increased performance demands when performance is not the primary reason people pick your platform.
On the contrary, there are SPARC, MIPS, PA-RISC, POWER and a whole heap of others that were perhaps written off prematurely. Need to move quickly tho' while some vestiges of expertise still remain.
SPARC, Tilera and Parallella are unfortunately gone. I had high hopes for the latter two, esp. since grid CPUs are perfect for machine learning. Much better than GPUs.
No, these sorts of faults can occur in any high-performance chip. Intel is bearing the brunt only because they're the fastest. But Meltdown also affected an Apple core, if I recall correctly, and Spectre nailed them all.
This isn't as simple as "Intel/x86 sucks, let's go use SPARC". The causes run much deeper and the necessary fixes may or may not be architecturally elegant or simple.
Last I checked, the Mill is vaporware. They have some interesting patents, a private simulator, parts of a compiler toolchain, and that's it. At least with OpenPOWER and RISC-V, you can buy processors and even complete systems.
It's probably time for a new architecture that isn't so convoluted with decades of optimizations and iterative improvements.
It's also time for a computer system with one and only one general-purpose processor (no tiny CPUs in storage or "system management" or every other device).
Probably something like a programming language/OS/computer system written from scratch, with a CPU based on current GPU designs.
You won't make any CPU of reasonable performance without speculative execution and all the rest. You're limited by data dependencies, and the only way to break them is to "cheat".
Unless you're willing to run on the equivalent of a Cortex-M0, you have to live with it.
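To make "limited by data dependencies" concrete, a toy example:

    #include <stddef.h>

    struct node { struct node *next; long payload; };

    /* A true data dependency: each load's address comes from the
     * previous load, so even a wide out-of-order core cannot start
     * iteration N+1 before iteration N's load returns. */
    long sum_list(const struct node *n)
    {
        long sum = 0;
        while (n) {            /* serialized on memory latency */
            sum += n->payload;
            n = n->next;
        }
        return sum;
    }

    /* Independent accesses: the addresses are known up front, so a
     * speculative out-of-order core can keep many loads in flight.
     * Same work, very different throughput on a modern CPU. */
    long sum_array(const long *a, size_t len)
    {
        long sum = 0;
        for (size_t i = 0; i < len; i++)
            sum += a[i];
        return sum;
    }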
Delay slots are merely the pipeline of the uarch peeking through; it's bad practice because you'll probably want to change the pipeline depth at some point. Other uarchs have pipelines too, but they hide them from the public API.
VLIW only removes the logic to detect data dependencies; it doesn't work around the actual need to wait for data to be ready.
None of this has much to do with speculative execution, which is guessing which way a branch will go. You simply can't have what would be considered a modern computer without it.
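A toy example of where that guessing pays off: the branch below depends on a load that may still be in flight, and without speculation the pipeline would just sit there waiting for it.

    /* The compare needs data[i]; rather than stall until the load
     * completes, the core predicts the branch and keeps executing,
     * rolling back only on a misprediction. */
    long count_over(const int *data, long n, int threshold)
    {
        long count = 0;
        for (long i = 0; i < n; i++) {
            if (data[i] > threshold)  /* predicted before data arrives */
                count++;
        }
        return count;
    }

Feed it random vs. sorted data and you can watch the predictor lose and win, respectively.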
There is nothing particularly convoluted in either Meltdown or this new attack. This optimization could have happened on any other architecture.
The legacy parts have either been disabled in 64-bit mode, or they are implemented in microcode. Other architectures are not simple either; ARM64 has incredibly complicated paging, for example.
At which point do we agree the performance increases over the last 20 years have been built on sand and move elsewhere?