MDS: Microarchitectural Data Sampling side-channel vulnerabilities in Intel CPUs (mdsattacks.com)
256 points by sirmc 8 days ago | 111 comments

Some additional pages by Intel describing mitigation techniques for non-HT domains (including the new overload of the VERW instruction): https://software.intel.com/security-software-guidance/insigh...

Details of which steppings of which processors are affected by which CVEs: https://software.intel.com/security-software-guidance/insigh...

They advise using only lfence, as the compiler vendors do. I advise using a full mfence instead when clearing secrets: load/store ordering can be violated in the caches, and since clearing secrets is done rarely, it needs to be reliable. Thankfully MDS only leaks small amounts of data, and modern keys are much larger. But adding a simple verw for the tiny non-cache buffers doesn't hurt either.

There are 4 separate vulnerabilities in MDS, not just the one reported in the ZombieLoad paper. They each have CVEs.

Chrome Browser response here: https://www.chromium.org/Home/chromium-security/mds

> Linux users should apply kernel and CPU microcode updates as soon as they are available from their distribution vendor, and follow any guidance to adjust system settings.

Canonical says that they have those for 14.04/16.04/18.04 [1]. But possibly more interesting is the fact that this disclosure has been so well synchronized. How do the relevant players decide what the threshold is for informing other tech companies? How does everyone know what policies the constituent companies use to prevent early disclosure, or unintended disclosure to 'somewhat-less-trusted' employees? Is this all coordinated by US-CERT?

[1] https://blog.ubuntu.com/2019/05/14/ubuntu-updates-to-mitigat...

As with Spectre/Meltdown, L1TF et al, Intel chooses who to loop in to their disclosure.

All of it is tightly controlled under an embargo. Who they choose to involve is entirely their decision, and is likely based on previous experience with those parties and their likelihood of leaking. Intel doesn't want these kinds of things to leak before official communication is done, or it's pretty much guaranteed to impact their stock price.

This time around has gone much smoother than the previous ones, though L1TF was pretty good too. L1TF was a little rough with the patching side of things because the patches were finalised a little late.

The various distributions and companies knew that the embargo was due to end at 10am pacific, and were probably (like us) refreshing the security advisories page on Intel's site waiting to pull the trigger on all the relevant processes, like publishing blog pages etc.

Well, practice makes perfect... by 2020 the process of disclosing CPU vulnerabilities should be pretty streamlined, if the pace doesn’t slow down.

Wow, ChromeOS decided to disable hyperthreading entirely? That seems like a pretty drastic mitigation. I wonder if that's just a short term solution or if they're planning to leave it that way indefinitely.

OpenBSD preemptively did the same thing [0] in 6.4, released nearly a year ago.

[0]: https://news.ycombinator.com/item?id=17350278

Hyper-Threading has been a source of security concerns for a decade now, and vulnerabilities in existing HT implementations have been trickling out over the last few years. Unlike Management Engine or TrustZone, at least we can disable Hyper-Threading (for a 30% performance hit).

Also, HT is not such a great performance win - on a few different 4-core/8-thread machines I had access to, loading all 8 threads to "100% CPU" (whatever that means) usually only delivered 20-30% faster computation than with HT off (4-core/4-thread) - which is in line with your 30% number.

And that's an improvement - some 15 years ago, with similar computational loads, most of my tests ran 10-20% faster with HT off (2 cores / 2 threads) than with HT on (2 cores / 4 threads) - there just wasn't enough cache to support that many threads.

A 20-30% increase is a BIG increase for a hardware feature, though. The cost of hyperthreading in transistors mostly amounts to the larger total register set. The whole point is the rest of the decode/dispatch/execute/retire pipeline is all shared.

How is 20%-30% not a great performance win? If I tell you today there's this One Simple Trick that you can do on your computer to instantly gain access to 20%-30% more performance, would you do it in a heartbeat?

What do you think is a good performance improvement then?

(and to the two other responses)

If your workload is already well parallelized, then yes, 20% is quite significant. However, parallelizing properly over 8 threads rather than 4 has its own costs.

The thing that bothers me most is that 800% CPU and 500% CPU on this processor are roughly equivalent, at about 5x100% CPU; that makes everything very hard to reason about when planning capacity.
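That mismatch can be put into a tiny, entirely hypothetical model: assume the second hyperthread of a busy core contributes only ~25% of a real core's throughput, and convert reported "percent CPU" into real-core equivalents.

```python
# Hypothetical capacity model (the 25% HT gain is an assumption, not a
# measured number): logical-core "percent CPU" overstates real capacity
# because the second hyperthread of a core adds only a fraction of a core.
def effective_cores(logical_busy_pct, physical_cores, ht_gain=0.25):
    """Convert e.g. "800% CPU" on a 4c/8t box into real-core equivalents."""
    busy_threads = logical_busy_pct / 100.0
    primary = min(busy_threads, physical_cores)         # fills real cores first
    siblings = max(busy_threads - physical_cores, 0.0)  # overflow onto HT siblings
    return primary + siblings * ht_gain

print(effective_cores(800, 4))  # "800% CPU" -> 5.0 real-core equivalents
print(effective_cores(500, 4))  # "500% CPU" -> 4.25
```

With those assumed numbers, "800%" and "500%" really do land within about 15% of each other in real throughput, which is why naive CPU graphs mislead capacity planning.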

I think you’re misunderstanding what HT is. It’s not true parallelism, it’s just hiding latency by providing some extra superscalar parallelism. You can’t expect it to give you actual linear improvements in performance because it’s just an illusion.

I understand that very well. But none of the standard tools that measure CPU usage understand it, and most people don't either.

If I had a nickel for every time I had to explain why "You are at 50% CPU now, but you can't actually run twice as many processes on this machine and get the same runtime", I'd be able to buy a large Frappuccino or two at Starbucks.

Perhaps I'm uninformed though - is there a tool like htop that would give me an idea of how close I am to maxing out the CPU?

No there isn’t. But if you understand it I don’t get why you think 20% isn’t a good performance boost, especially considering the rate of return for power and area in silicon.

Because many people believe it is a 100% improvement, plan/budget accordingly, and then look for help.

As far as silicon/power it is nice, but IIRC (I am not involved in purchasing anymore) HT parts used to cost over 50% more for that 20% in performance, back when non-HT parts were still common.

What a strange way to measure the benefits of a performance optimization: "how people will perceive it and then ask me for help".

You ignored the price issue, which was measurable and real, but also:

It used to be my job. Does "because people fall for deceptive marketing, waste money, and then waste my time trying to salvage their reputation" sound better?

> loading all 8 threads to "100% CPU" (whatever that means)

What application?

Lots of numerical computations and simulations.

The security concern is remote code execution via JS, and sharing processor time with other people you don't trust, right?

It should be up to the VM-as-a-service and browser vendors to flush the cache properly.

No. The security concern is attackers reading data they shouldn’t. The article explains how.

“Microarchitectural Data Sampling (MDS) is a group of vulnerabilities that allow an attacker to potentially read sensitive data.”

That is way more serious than stealing cycles.

Yeah, but I didn't understand how this is different from Spectre, except with different caches.

Still it's fine with no JS and no shared processor time, right?

Right. If you run no foreign code you are safe.

From a brief read, I think it reads in-flight data that is not necessarily cached, so flushing the cache won't help, unfortunately.

One CPU per process makes a lot more sense, especially now that we have so many specialized CPUs in our machines anyway.

Yeah, I get the feeling that sharing processor time with strangers is not viable without specialized hardware.

I think it isn't viable with non-deterministic (in time) hardware behavior. This means dedicated caches, or no caches at all. Dedicated guaranteed memory speeds and latencies. Dedicated processing units. The untrusted code cannot be affected by other code, otherwise the other code leaks its usage patterns across.

Decade and a half, even. If I remember right, the first CVE for an HT security flaw was summer 2005.

I announced it publicly 14 years ago yesterday.

This one? https://nvd.nist.gov/vuln/detail/CVE-2005-0109

Dang: "Hyper-Threading technology, as used in FreeBSD and other operating systems that are run on Intel Pentium and other processors, allows local users to use a malicious thread to create covert channels, monitor the execution of other threads, and obtain sensitive information such as cryptographic keys, via a timing attack on memory cache misses."

Also, found elsewhere:

"According to Linus Torvalds and others on linux-kernel this is a theoretical attack, paranoid people should disable hyper threading"

Yes. Intel dismissed it at the time, saying that "nobody would ever have untrusted code running on the same hardware on which cryptographic operations are performed".

30% performance hit? I'm sure that heavily depends on the workload... and I'm also sure you lose performance when HT is on, depending on the workload as well.

I've seen this claim made for routers and other low intensity low latency workloads.

That would make sense. My understanding is that with a 100%-pegged CPU, hyperthreading won't be super beneficial, as hyperthreads aren't real cores, just smarter scheduling. You can't really schedule 100% load better. For latency-sensitive applications it makes more sense, though: the CPU isn't pegged, you just want a faster response.

> You can't really schedule 100% load better

Sure you can. You can do math while another HT is waiting for memory. Sometimes you can even multiplex use of multiple ALUs or one HT can do integer and another can do floating point.

It's actually under high multithreaded load that HT shines, especially if that load is heterogenous or memory latency bound.

I too was once under the misapprehension that HT was "just smarter scheduling", until I took a university course in microarchitecture that explained how Simultaneous Multithreading actually works in terms of maximising utilisation of various types of execution units. I wonder why "smarter scheduling" became a common understanding.
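The latency-hiding idea can be captured in a toy model (all numbers are assumptions, nothing like real hardware): if a thread stalls on memory some fraction of cycles, a second SMT thread can issue into those otherwise-wasted cycles.

```python
# Toy SMT model: each cycle, each resident thread is either ready to
# issue an op or stalled waiting on memory. The core does useful work
# whenever at least one thread is ready. The 40% stall rate is made up.
import random

def utilization(threads_per_core, stall_prob, cycles=100_000, seed=0):
    rng = random.Random(seed)
    busy = 0
    for _ in range(cycles):
        # Issue one op this cycle if *any* resident thread is ready.
        if any(rng.random() > stall_prob for _ in range(threads_per_core)):
            busy += 1
    return busy / cycles

single = utilization(1, stall_prob=0.4)  # ~0.60: one thread, 40% stalled
smt = utilization(2, stall_prob=0.4)     # ~0.84: sibling fills stall cycles
```

With these made-up numbers SMT lifts execution-unit utilization from roughly 60% to roughly 84%; real gains are smaller because the two threads also compete for caches and execution ports, which this toy model ignores.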

Wouldn't hyperthreading also be more power-efficient compared to running a second core?

Disabling hyper-threading is highly unlikely to produce a 30% performance hit. Most highly optimized software disables or avoids hyper-threading because doing so increases performance.

Hyper-threading tends to benefit the performance of applications that have not been optimized, and therefore presumably are also not particularly performance sensitive in any case.

In highly-parallel workloads like rendering (ray tracing) where pipeline stalls due to loads happen quite regularly, it's fairly easy to get 20-35% speedups with HT.

In music production and C++ code compilation I get a pretty reliable +25% perf boost with HT on (this was not the case a few gens ago though).

Am I reading correctly that this has been under embargo for over a year?

It was discovered June last year according to this timeline, so a lot of people must've successfully kept mum for a long time: https://mdsattacks.com.

Maybe we should start to seriously question the value of such long embargoes. This is coordinated disclosure; if the vendor refuses reasonable coordination (and it seems Intel does, with such delays, and also because it still silos the security researchers way too much), then fuck them and publish (probably not immediately, but certainly not after a year...)

It seems that broadly the same principles have been found independently by tons of teams. Expecting that well-financed actors have not explored that field and/or not yielded any similar results at this point is completely insane.

Meaning, given the high level of technicality required, it's even doubtful that the embargo protected anybody; there might be no attacker (and I postulate there never will be) who simply waits for third-party disclosure before writing their own exploits in this class. On the other hand, typical security providers monitoring threats in the field might remain unaware of such vulnerabilities for a long time.

Now here, arguably, the first countermeasures are similar to those for L1TF, so hopefully sensitive operators have already disabled HT. However, it is not very cool to leave them unaware of this additional (and slightly different) risk for such a ridiculously long period.

Also: does Intel have competent people working on their shit anymore? They know the fundamental principle: speculative execution on architecturally out-of-reach data, followed by a fault and a subsequent extraction, via covert channels, of modified microarchitectural state that is never rolled back. The broad microarchitecture is widely known, so do they really expect that third-party security researchers won't find all the places where they were sloppy enough to speculatively execute code on completely "garbage" data? Or were they themselves unable to do a proper comprehensive review, despite having access to the full detailed design (and despite a dedicated team having been created for that)? Either way, this is not reassuring.

I'm not really sure what the question is supposed to be. You could discover an Intel vulnerability and give them a 90 day timeline, or, for that matter, do what the Metasploit founders would have done and just immediately code up an exploit and publish it with no warning. All of these are viable options and all have precedent; it's up to the people discovering the flaws to make their own decisions.

It's particularly weird in this case to suggest that the embargo didn't help anyone, since (1) nobody appears to have leaked these flaws and (2) the cloud providers all seem to have fixes queued up.

Intel claims to have discovered some of these flaws internally, and this is a bug class we've known about (for realsies) for a little bit over a year now, in a class of products for which development cycles are themselves denominated in multiple years, so I'd cut them a bit of slack.

It really depends. Think about it this way, would you rather have an undisclosed vulnerability go untreated and undetected (as far as everyone knows) for an extra year, or suddenly disclose it to the rest of the world before all major interested parties (big companies, chip vendors, etc) have workarounds and mitigations techniques, so actual malicious attackers can exploit it before the countermeasures are ready?

In an ideal world, you would disclose everything and let everyone know so they can take measures against it, but in reality there may be less damage in letting the vulnerability stay undisclosed for a few more months while everyone plans to patch it and releases fixes as it gets disclosed.

I do agree that almost a whole year is a very long time, though.

Considering the June/July initial reporting, the stacking of evidence from related exploits, and the release the following May, it looks more like 9 months plus some slack due to multiple reports. That doesn't sound like "they kept waiting indefinitely" so much as proper due diligence.

Right, and it takes time to build and comprehensively test a fix.

Anything on the CPU level that needs to be done in microcode is incredibly complex, and hard to test.

Nearly a year, at least, judging by when the CVE numbers were assigned.

Better Intel page on the MDS vulnerability is here: https://www.intel.com/content/www/us/en/architecture-and-tec....

Interesting point: "MDS is addressed in hardware starting with select 8th and 9th Generation Intel® Core™ processors, as well as the 2nd Generation Intel® Xeon® Scalable processor family." Looks like my 8700K isn't on the list though.

According to the researchers in the paper[0] this is not true.

> We have verified that we can leak information across arbitrary address spaces and privilege boundaries, even on recent Intel systems with the latest microcode updates and latest Linux kernel with all the Spectre, Meltdown, L1TF default mitigations up (KPTI, PTE inversion, etc.). In particular, the exploits we discuss below exemplify leaks in all the relevant cases of interest: process-to-process, kernel-to-userspace, guest-to-guest, and SGX-enclave-to-userspace leaks. Not to mention that such attacks can be built even from a sandboxed environment such as JavaScript in the browser, where the attacker has limited capabilities compared to a native environment.

[0] https://mdsattacks.com/files/ridl.pdf

I searched the paper and it doesn't seem to falsify what I linked to, but I'll have to dig deeper into the research. "Recent Intel systems" isn't specific enough.

Page 16 in the slides[1] lists vulnerable processors, 8700K is one of them

[1] https://mdsattacks.com/slides/slides.html

edit: This is mentioned in the paper as well, on page 8

In a Dutch article (https://nos.nl/artikel/2284630-nederlanders-vinden-beveiligi...), one of the researchers says "het aantal mensen bij bedrijven als Intel die zich op dit niveau met beveiliging bezighoudt, is echt op de vingers van twee handen te tellen." = There are 10 or fewer people working on security at this level at companies like Intel. This sounds very hard to believe to me. With the previous attacks there surely are bigger teams working on this kind of stuff?

There's other people working on it outside of Intel, too. https://mdsattacks.com/ if you look at the list of people you'll see there's dozens of folk that independently found and reported the same vulnerabilities.

Bigger is definitely not better for this kind of stuff at least as far as team sizes.

There are probably fewer than 1000 people in the world capable of finding these kinds of vulnerabilities. Sounds about right to have 10 at Intel.

Out of curiosity, where would the others be?

Universities, three-letter agencies, and private or government actors. At least that's what I would guess; maybe also a bunch at antivirus developers.

How much more would you expect? 10 people is already pushing the two-pizza limit.

People don't necessarily need to be in one big team. Lots of things that are important can be worked on by more than 10 people. (Surely Google and Facebook each have more than 10 people working on security).

Is there evidence that so few people are working on security at Intel?

The overview page, https://cpu.fail/ , is on Hacker News as https://news.ycombinator.com/item?id=19911715 .

What does "UC" in "Meltdown UC" mean?

Microcode most likely.

Thanks! We've merged these.

And unmerged them. See https://news.ycombinator.com/item?id=19912588.

Sorry for the chaos but this was a weird edge case. The marketing maybe went overboard this time?

Also for MDS: https://www.intel.com/content/www/us/en/security-center/advi...

I like how Intel prominently thanks their own employees for finding the bugs and later simply acknowledges the existence of any independent reporters with zero thanks.

Interesting. This could mean that Intel actually discovered and thus knew about all those security-holes before the non-Intel researchers did.

Or they're just being awkwardly disingenuous here, that's also a possibility.

Funny that you don't see their contact email address if you don't have JavaScript running, which is exactly one of the doors for this kind of vulnerability, as they mention on their site.

End-user security, in the web browser context: do I understand correctly that if my browser were to only ever execute JavaScript in bytecode form (without compilation to native code), it would be safe from these kinds of exploits?

Presuming the bytecode interpreter would be "slow enough" and "jittery enough" and "indirect enough" to hamper any attempts at exploiting subtle timing+memory layout bugs like that?

IIRC, Konqueror (of KDE) had a reasonably fast bytecode JS engine. I wish the browser were still undergoing fast development; it was my daily driver for many years.

AFAIK, there are techniques to detect and denoise minuscule timing differences over millions of samples, and the fundamentals of most techniques apply to interpreters as well, so it is not solid protection.

That said, it would make things harder in practice since you’re introducing an extra indirection level and just making everything slower.
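The averaging idea can be sketched in a few lines with synthetic numbers (these are made-up values, not real measurements): a 0.5-unit mean difference buried under noise with a standard deviation of 20 is invisible sample-by-sample, yet separates cleanly once enough samples are averaged.

```python
# Synthetic demo: recovering a tiny timing signal from heavy jitter.
# All timings are simulated; the 0.5 gap and sigma of 20 are assumptions.
import random
import statistics

rng = random.Random(42)
N = 200_000
fast = [rng.gauss(100.0, 20.0) for _ in range(N)]  # e.g. a "cache hit" path
slow = [rng.gauss(100.5, 20.0) for _ in range(N)]  # e.g. a "cache miss" path

# One sample tells you nothing: the 0.5 gap is 1/40th of the noise.
# But the standard error of the mean over N samples is 20/sqrt(N) ~= 0.045,
# so averaging makes the gap stand out by roughly ten standard errors.
print(statistics.mean(slow) - statistics.mean(fast))  # close to 0.5
```

This is why an interpreter's extra jitter only raises the number of samples an attacker needs; it doesn't remove the underlying signal.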

As for interpreters in modern browsers, I’d be surprised if there’s no way to entirely disable the JIT somehow... since most JIT implementations I have seen have an interpreter fallback for debugging and easier portability to new CPU architectures.

[edit: I finally finished reading everything. It seems like these new leaks can be triggered from JS as they still fundamentally reduce to "read time for memory access"]

For Spectre, simply having attacker-directed control flow was sufficient - so logically almost any scripting language could be exploited.

Same goes for most of the TLB attacks.

Others required native code because they needed to use specific instructions (which aren't going to be emitted intentionally by any compiler, JIT or otherwise).

The interactive guide on that page is an effective way to visualise the components in relation to the attacks.

Does anyone know what this would have been built with?

Am I reading this right? It looks like this bug lets you read any part of an intel chip's processing payload in-flight?

That's gnarly if true.

It seems that we need to move away from clever, complicated low-level micro-optimizations that rely on mangling instructions and just use more cores. That should allow for a simpler security model.

There are plenty of scenarios where synchronization overheads between cores dwarf the performance gain, but OoO execution can help.

But maybe instead of having more cores, we should expose the different execution units within a CPU core to the architectural level? That however brings back memories of Itanium, and the general fact that compilers just can't do static scheduling well enough.

I've started to think Itanium might have been sort of on the right track but ahead of its time and in some ways poorly executed.

I still don't think so. Exposing these microarchitectural concerns to the architectural level limits flexibility. In order for compilers to efficiently schedule multiple execution units, the compiler needs to know the exact latency of all instructions. That may be doable for arithmetic, but varies greatly from one generation of processor to the next. And compilers definitely cannot know the latency of a load: from a few cycles in L1 cache, to a few thousand cycles in DRAM, to millions of cycles if there's a page fault. And these things vary a lot, not just between processor generations but within the same processor generation.

Again, Intel CPUs only.

Can I ask for my money back? Intel should refund 30% of the cost of all vulnerable CPUs then... because disabling HT effectively reduces the claimed performance specs.

Foreshadow/L1TF is the only prior problem of this nature that is unique to Intel. Meltdown bugs were also found in ARM, IBM POWER, and mainframe designs, and Spectre hits all of those and AMD as well.

For me as a home user, taking a performance hit of any kind in response to threats which haven't yet been seen in the wild simply isn't good math.

I don't think that anybody can know whether this is true, since exploitation leaves little evidence. Even before this is witnessed in the wild for the first time, you can't really know which secrets of yours have already been exfiltrated.

Everything that can't be fixed with a ten minute phone call to my bank is already public knowledge thanks to Experian, so I really don't have anything left to fear.

You have no conversations that you'd prefer not be sold on the darknet? With friends, family, therapists, doctors, lawyers, consultants?

No pictures of your kids that they might not want spilled into a searchable database and used for machine learning to sell them things later in life?

No private or symmetric keys which might be used to impersonate you or eavesdrop on you later?

No in-progress documents which you aren't ready to publish?

No conversations with political allies that you might not want the state to peruse?

No intimate conversations with sexual partners?

If that's true, then I think you have a very different attack surface than most people. I think most people are willing to take a small performance hit not to open up access to much of the data that goes across their CPU, which is not an exaggeration for the combination of attacks which have been published against Intel CPUs over the past 3 years.

If someone wants to leverage speculative-execution vulnerabilities to get that sort of information off of my PC, it's not a problem that can be solved by yet another security patch. Don't reduce my PC's performance for the sake of somebody else's security concern.

At the end of the day the only secure computer is one that's turned off and locked up in a supply closet.

Not on any x86 device, no. Not that I'd be a particularly easy target since I use NoScript with a whitelist and keep my router's firewall very strict. I suppose someone could come at me with a malicious Steam game.

"Arguing that you don't care about the right to privacy because you have nothing to hide is no different than saying you don't care about free speech because you have nothing to say." -E. Snowden

Did you miss the "for me" part? I highly encourage everyone to install the patch, especially since the more people do the less I'll need it.

If that is so, please leave your email and password here...

I'd really like to be given a choice, at least. My gaming PC is used exclusively for gaming, so it needs to be performant, but does not need to be secure.

If running Linux you can disable the meltdown/spectre mitigations with the nopti option [1].

1. https://yux.im/posts/technology/security/disable-meltdown-an...

Here is a similar windows tool: https://www.grc.com/inspectre.htm

'nopti' only disables the Meltdown mitigation. To disable all the mitigations (including Spectre, Meltdown, L1TF, and now MDS) you can use mitigations=off on newer kernels.
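For reference, on recent kernels the per-class mitigation status is exposed under sysfs, and `mitigations=off` goes on the kernel command line; a sketch for a stock GRUB setup (paths assume Debian/Ubuntu defaults, adjust for your distro):

```shell
# Inspect per-vulnerability mitigation status (one file per class):
#   cat /sys/devices/system/cpu/vulnerabilities/*

# /etc/default/grub -- disable all speculative-execution mitigations,
# then run `sudo update-grub` and reboot:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash mitigations=off"
```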

If you use Steam, it’s in the best interests of you and probably Valve not to worry about attacks to steal your library or get you banned.

Your public clouds at AWS probably use them too... and they won't disable HT ;)

It certainly doesn't feel good. But if the home market remained unpatched with a public POC, they would be attacked. The most likely avenue is by writing malicious web pages to steal bitcoin wallets, etc.

Didn't they already have a proof of concept Spectre or Meltdown exploit via a web page? Malicious ads seem like the best way to spread ransomware, etc.

No. There has been nothing shown that would exploit in real-world conditions. All Spectre exploits have required assistance from the target.

It is funny how ChromeOS is the most ridiculously secure of the commonly available operating systems. It is not as if you can do much other than surf the internet with it.

It makes me chuckle to think that my not-so-computer-literate friend, whom I gave a Chromebook, is protected from anyone snooping in on YouTube and Hotmail running on this toy machine (designed for 9-year-olds). There really is nothing to hide there. Meanwhile, people doing important work on proper computers are properly vulnerable to this new theoretical Hyper-Threading attack.

I will be interested to find out if there is a drop-off in performance on ChromeOS, e.g. YouTube stuttering whilst the WhatsApp Web tab updates itself with a new message. If nobody complains, then why did we need Hyper-Threading in the first place?

You can run Android and Linux apps on ChromeOS. And with PWAs and WebAssembly maturing, the difference between native apps and web apps is getting smaller and smaller. Many developers use it for work. A lot of dev work in the enterprise isn't done locally anyway.

> It is not as if you can do much other than surf the internet with it.

You can run Android apps and run Linux programs.

Spectre/Meltdown are sacred cows in HN. Any comment that tries to bring a realistic view of them is immediately voted into the floor.

Hyper-Threading was an Intel stop-gap reaction to the Athlon 64 X2, which was a REAL dual core, to buy them time while the Pentium D was created (and later laughed off the market). We finally got an "OK" dual core from Intel when they decided to hack Pentium III-derived cores together and call it the Core Duo, and with the Core 2 Duo they finally caught back up to AMD (by hacking amd64 instructions onto those cores) and were able to start taking market share back. Nothing interesting happens between then and Threadripper, but now we would be back to eating popcorn and watching the rest of the fight... but the fight is over and everyone is in the other arena watching ARM and WebKit demolish the incumbent platforms, winner-take-all style.

> Hyper-Threading was an Intel stop-gap reaction to the Athlon 64 X2

No. Hyper-Threading was introduced in February 2002. The original single-core Athlon 64 came in September 2003. The X2 came in 2005.



Calling the Core Duo a Pentium III core, especially when talking about microarchitectures, is a slight misrepresentation. Of course it was much closer to, and more of a derivative of, the PPro descendants. But while P6 did not vary much between the PPro and the Pentium III, before reaching the Core Duo it went through the Pentium M and was then enhanced further. So yes, it looks more like a Pentium III than a Pentium 4, but it was certainly not just "hacked-together Pentium 3 cores".

Also, NetBurst was not that bad. It was a dead end, yes, but in some markets it could compete with what AMD had.

Plus implementing SMT is not necessarily extremely easy compared to SMP, especially when you evolve designs.

And anyway, Intel shipped HT way before AMD shipped the Athlon 64 x2...
