
Reading privileged memory with a side-channel - brandon
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
======
hyperion2010
An analogy that was useful for explaining part of this to my (non-technical)
father. Maybe others will find it helpful as well.

Imagine that you want to know whether someone has checked out a particular
library book. The library refuses to give you access to their records and does
not keep a slip inside the front cover. You can only see the record of which
books you have checked out.

What you do is follow the person of interest into the library whenever they
return a book. You then ask the librarian for a copy of the books you want to
know whether the person has checked out. If the librarian looks down and says
"You are in luck, I have a copy right here!" then you know the person had
checked out that book. If the librarian has to go look in the stacks and comes
back 5 minutes later with the book, you know that the person didn't check out
that book (this time).

The way to make the library secure against this kind of attack is to require
that all books be reshelved before they can be lent out again, unless the
current borrower is requesting an extension.

There are many other ways to use the behavior of the librarian and the time it
takes to retrieve a book to figure out which books a person is reading.

edit: A closer variant. Call the library pretending to be the person and ask
for a book to be put on hold. Then watch how long it takes them in the
library. If they got that book they will be in and out in a minute (and
perhaps a bit confused); if they didn't take that book, it will take 5 minutes.
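
The timing test in the analogy is the same trick the real attacks play on the CPU cache (hit = fast, miss = slow). A toy Python sketch of it, with made-up "fetch" delays standing in for cache timings:

```python
import time

SLOW_FETCH = 0.05   # "walk to the stacks": book was reshelved (cache miss)
FAST_FETCH = 0.001  # "right here on the desk": just returned (cache hit)

def checkout(title, recently_returned):
    """The librarian: fast if the book is still sitting on the desk."""
    time.sleep(FAST_FETCH if title in recently_returned else SLOW_FETCH)

def was_it_borrowed(title, recently_returned):
    """Infer the 'secret' purely from how long the request takes."""
    start = time.perf_counter()
    checkout(title, recently_returned)
    return time.perf_counter() - start < (SLOW_FETCH + FAST_FETCH) / 2

returned = {"Moby Dick"}  # the person of interest just returned this one
print(was_it_borrowed("Moby Dick", returned))      # True
print(was_it_borrowed("War and Peace", returned))  # False
```

Nothing here reads the library's records directly; the answer leaks entirely through elapsed time.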

~~~
314
Your analogy is more apt for side-channel attacks in general. Here is a more
specific version for Meltdown:

A library has two rooms, one for general books and one for restricted books.
The restricted books are not allowed out of the library, and no notes or
recordings are allowed to be taken out of the restricted room.

An attacker wants to sneak information out of the restricted room. To do this
they pick up a pile of non-restricted books and go into the restricted room.
Depending on what they read in there, they rearrange the pile of non-restricted
books into a particular order. A guard comes along and sees them; they are
thrown out of the restricted room and their pile of non-restricted books is
put on the issue desk, ready to be put back into circulation.

Their conspirator looks at the order of the books on the issue desk and
decodes a piece of information about the book in the restricted room. They
repeat this process about 500,000 times a second until they have transcribed
the secret book.
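
A toy model of that encode/decode step, with the pile ordering as the covert channel (book names and the "secret" value are made up for illustration):

```python
import itertools

BOOKS = ["A", "B", "C", "D", "E"]  # the pile of non-restricted books

def encode(secret, books):
    """Inside the restricted room: rearrange the pile based on the secret.
    5 books give 5! = 120 distinct orderings, so ~7 bits per trip."""
    perms = list(itertools.permutations(books))
    return list(perms[secret % len(perms)])

def decode(pile, books):
    """Conspirator at the issue desk: recover the secret from the order."""
    perms = list(itertools.permutations(books))
    return perms.index(tuple(pile))

pile_on_desk = encode(42, BOOKS)    # guard throws us out; the pile remains
print(decode(pile_on_desk, BOOKS))  # 42
```

The guard confiscates nothing of value: the information lives only in an arrangement of books that were never restricted themselves.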

~~~
lma21
What is the analogy behind being able to go into the restricted room?

~~~
314
The restricted room is the part of the machine behind the protection. Memory
reads are not checked at the time of access; they are checked when the
instruction retires.

~~~
AlphaSite
On Intel, at least. This isn't a property intrinsic to superscalar processors;
other architectures check it while the access is in flight or while it's in
the issue queue, preventing this side channel.

------
jotux
Papers describing each attack:

[https://meltdownattack.com/meltdown.pdf](https://meltdownattack.com/meltdown.pdf)

[https://spectreattack.com/spectre.pdf](https://spectreattack.com/spectre.pdf)

From the spectre paper:

>As a proof-of-concept, JavaScript code was written that, when run in the
Google Chrome browser, allows JavaScript to read private memory from the
process in which it runs (cf. Listing 2).

Scary stuff.

~~~
ErikAugust
I've thrown the C code in the Spectre paper up if anyone wants to feel the
magic:
[https://gist.github.com/ErikAugust/724d4a969fb2c6ae1bbd7b2a9...](https://gist.github.com/ErikAugust/724d4a969fb2c6ae1bbd7b2a9e3d4bb6)

~~~
diyseguy
I thought it was supposed to be exploitable by JavaScript? If you can get to
the machine and run C code, well, that doesn't seem like an exploit?

~~~
porjo
From the Spectre whitepaper:

> In addition to violating process isolation boundaries using native code,
> Spectre attacks can also be used to violate browser sandboxing, by mounting
> them via portable JavaScript code. We wrote a JavaScript program that
> successfully reads data from the address space of the browser process
> running it.

The whitepaper doesn't contain example JS code, however.

~~~
diyseguy
This whitepaper describes the JavaScript exploit in Section IV. I'm struggling
to understand it though:
[http://www.cs.vu.nl/~herbertb/download/papers/anc_ndss17.pdf](http://www.cs.vu.nl/~herbertb/download/papers/anc_ndss17.pdf)

~~~
diyseguy
This too was provided as a proof of concept (without explanation):
[https://brainsmoke.github.io/misc/slicepattern.html](https://brainsmoke.github.io/misc/slicepattern.html).
I'm not sure what I'm looking at though

~~~
diyseguy
This is the first implementation in Javascript I have seen so far:
[http://xlab.tencent.com/special/spectre/js/check.js](http://xlab.tencent.com/special/spectre/js/check.js)

------
mrmondo
"AMD chips are affected by some but not all of the vulnerabilities. AMD said
that there is a "near zero risk to AMD processors at this time." British
chipmaker ARM told news site Axios prior to this report that some of its
processors, including its Cortex-A chips, are affected."

- [http://www.zdnet.com/article/security-flaws-affect-every-int...](http://www.zdnet.com/article/security-flaws-affect-every-intel-chip-since-1995-arm-processors-vulnerable/)

* Edit:

From [https://meltdownattack.com/](https://meltdownattack.com/)

Which systems are affected by Meltdown?

"Desktop, Laptop, and Cloud computers may be affected by Meltdown. More
technically, every Intel processor which implements out-of-order execution is
potentially affected, which is effectively every processor since 1995 (except
Intel Itanium and Intel Atom before 2013). We successfully tested Meltdown on
Intel processor generations released as early as 2011. Currently, we have only
verified Meltdown on Intel processors. At the moment, it is unclear whether
ARM and AMD processors are also affected by Meltdown.

Which systems are affected by Spectre?

Almost every system is affected by Spectre: Desktops, Laptops, Cloud Servers,
as well as Smartphones. More specifically, all modern processors capable of
keeping many instructions in flight are potentially vulnerable. In particular,
we have verified Spectre on Intel, AMD, and ARM processors."

~~~
buryat
That article links a commit [1] that contradicts this statement

> AMD processors are not subject to the types of attacks that the kernel page
> table isolation feature protects against. The AMD microarchitecture does not
> allow memory references, including speculative references, that access
> higher privileged data when running in a lesser privileged mode when that
> access would result in a page fault.

And Axios [2] that Zdnet quotes gave a comment from AMD:

> "To be clear, the security research team identified three variants targeting
> speculative execution. The threat and the response to the three variants
> differ by microprocessor company, and AMD is not susceptible to all three
> variants. Due to differences in AMD's architecture, we believe there is a
> near zero risk to AMD processors at this time. We expect the security
> research to be published later today and will provide further updates at
> that time."

And a comment from ARM:

> Please note that our Cortex-M processors, which are pervasive in low-power,
> connected IoT devices, are not impacted.

[1]
[https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/...](https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=694d99d40972f12e59a3696effee8a376b79d7c8)

[2] [https://www.axios.com/how-the-giants-of-tech-are-dealing-wit...](https://www.axios.com/how-the-giants-of-tech-are-dealing-with-a-massive-chip-vulnerability-2522206367.html)

~~~
ethbro
My read is that vulnerable processors generally have to:

1. Have out-of-order execution

2. Have aggressive speculative memory load / caching behavior

3. Be able to speculatively cache memory not owned by the current process
(either kernel or otherwise)

4. Have deterministic ways of triggering a speculative load / read to the
same memory location

2 is probably the saving grace in ARM / low power land, given they don't have
the power budget to trade speculative loads for performance (in the event
they're even out of order in the first place).

Caveat: I'm drinking pretty strong Belgian beer while reading through these
papers.

------
endymi0n
Hard to find a good spot for this, but: Thanks to anyone involved! From
grasping the magnitude of this vulnerability to coordinating it with all major
OS vendors, including Open Source ones that do all of their stuff more or less
„in the open“, it was almost a miracle that the flaw was leaked „only“ a few
days before the embargo - and we‘ll all have patches to protect our
infrastructure just in time.

Interestingly, it also put the LKML developers into an ethical grey zone, as
they had to deceive the public into thinking the patch was actually fixing
something else (they did a good and right thing there, IMHO).

Despite all the slight problems along the way, kudos to all of the White Hats
dealing with this mess over the last months and handling it super gracefully!

~~~
pacavaca
Consider how many other such "gray" patches could already be in the kernel ;)

------
tonmoy
I'm not that savvy with security so I need a little help understanding this.
According to the google security blog:

> Google Chrome

> Some user or customer action needed. More information here
> ([https://support.google.com/faqs/answer/7622138#chrome](https://support.google.com/faqs/answer/7622138#chrome)).

And the "here" link says:

>Google Chrome Browser

>Current stable versions of Chrome include an optional feature called Site
Isolation which can be enabled to provide mitigation by isolating websites
into separate address spaces. Learn more about Site Isolation and how to take
action to enable it.

>Chrome 64, due to be released on January 23, will contain mitigations to
protect against exploitation.

>Additional mitigations are planned for future versions of Chrome. Learn more
about Chrome's response.

>Desktop (all platforms), Chrome 63:

> Full Site Isolation can be turned on by enabling a flag found at
> chrome://flags/#enable-site-per-process.
> Enterprise policies are available to turn on Site Isolation for all sites,
> or just those in a specified list. Learn more about Site Isolation by
> policy.

Does that mean if I don't enable this feature using chrome://flags and tell my
grandma to do this complicated procedure I (or she) will be susceptible to
getting our passwords stolen?

~~~
013a
It probably means if you want mitigations right now, you can flip that flag.
Otherwise wait for Chrome to auto-update with new versions that have
mitigations enabled by default.

~~~
mintplant
Would I be correct in assuming a browser-level mitigation isn't necessary if
you're running a patched OS?

~~~
bpye
The OS patch stops you reading kernel space from user space trivially (i.e.
without eBPF in the Project Zero example). You can still cause leakage from
the same context; for example, the V8 JIT can read all of the process's
memory. Without site isolation that can include data on other web pages,
passwords, cookies, etc.

------
tytso
From a recently posted patch set:

Subject: Avoid speculative indirect calls in kernel

Any speculative indirect calls in the kernel can be tricked to execute any
kernel code, which may allow side channel attacks that can leak arbitrary
kernel data.

So we want to avoid speculative indirect calls in the kernel.

There's a special code sequence called a retpoline that can do indirect calls
without speculation. We use a new compiler option -mindirect-branch=thunk-extern
(gcc patch will be released separately) to recompile the kernel with
this new sequence.

We also patch all the assembler code in the kernel to use the new sequence.

~~~
khc
Link?

~~~
language
Text and patch start here:
[https://lkml.org/lkml/2018/1/3/780](https://lkml.org/lkml/2018/1/3/780)

Also, see Linus' response here:
[https://lkml.org/lkml/2018/1/3/797](https://lkml.org/lkml/2018/1/3/797)

~~~
Thev00d00
Ahh Linus, never change.

------
nlh
"Before the issues described here were publicly disclosed, Daniel Gruss,
Moritz Lipp, Yuval Yarom, Paul Kocher, Daniel Genkin, Michael Schwarz, Mike
Hamburg, Stefan Mangard, Thomas Prescher and Werner Haas also reported them;
their [writeups/blogposts/paper drafts] are at"

Does anyone have any color/details on how this came to be? A major fundamental
flaw exists that affects all chips for ~10 years, and multiple independent
groups discovered them roughly around the same time this past summer?

My hunch is that someone published some sort of speculative paper / gave a
talk ("this flaw could exist in theory") and then everyone was off to the
races.

But would be curious if anyone knows the real version?

~~~
sprkyco
[https://cyber.wtf/2017/07/28/negative-result-reading-kernel-...](https://cyber.wtf/2017/07/28/negative-result-reading-kernel-memory-from-user-mode/)

A failed attempt in July, which is being credited as the earliest work via
[https://twitter.com/lavados/status/948700783259811847](https://twitter.com/lavados/status/948700783259811847)

~~~
tdullien
Jann Horn's results & report pre-date the blog post though. The topic was
"ripe", so to speak, so multiple parties investigated it at roughly the same
time.

~~~
ehsankia
Yeah, the blog post says they knew since June 2017, with that blog post being
from July.

> This initial report did not contain any information about variant 3. We had
> discussed whether direct reads from kernel memory could work, but thought
> that it was unlikely. We later tested and reported variant 3 prior to the
> publication of Anders Fogh's work at [https://cyber.wtf/2017/07/28/negative-result-reading-kernel-...](https://cyber.wtf/2017/07/28/negative-result-reading-kernel-memory-from-user-mode/).

------
baybal2
>running on the host, can read host kernel memory at a rate of around 1500
bytes/second,

I kinda get how it works now. They force a speculative execution to do
something with a protected memory address, and then measure the latency to
guess the content. They did not find a way to continue execution after a page
fault, as the rumors claimed.

The fact that a speculative execution branch can access protected memory, but
cannot commit its own computation results to memory, has been known on ia32
since Pentium 3 times.

It was dismissed as a "theoretical only" vulnerability without possible
practical application. Intel kept saying that for 20 years, but here it is,
voila.

The ice broke in 2016 when Dmitry Ponomarev wrote about the first practical
exploit scenario for this well-known ia32 branch prediction artifact. Since
then, I believe, quite a few people were trying every possible instruction
combination for use in a timing attack, until somebody finally got one that
works, which was shown behind closed doors.

Edit: Google finally added a reference to Ponomarev's paper. Here is his page
with some other research on the topic:
[http://www.cs.binghamton.edu/~dima/](http://www.cs.binghamton.edu/~dima/)

------
rarudduck
Azure's response: [https://azure.microsoft.com/en-us/blog/securing-azure-custom...](https://azure.microsoft.com/en-us/blog/securing-azure-customers-from-cpu-vulnerability/)

This part is interesting considering the performance concerns:

"The majority of Azure customers should not see a noticeable performance
impact with this update. We’ve worked to optimize the CPU and disk I/O path
and are not seeing noticeable performance impact after the fix has been
applied. A small set of customers may experience some networking performance
impact. This can be addressed by turning on Azure Accelerated Networking
(Windows, Linux), which is a free capability available to all Azure
customers."

~~~
boulos
Disclosure: I work on Google Cloud.

If you run a multitenant workload on a linux system (say you're a PaaS or even
just hosting a bunch of WordPress side by side) you should update your kernel
as soon as is reasonable. While VM to VM attacks are patched, I'm sure lots of
folks are running untrusted code side by side and need to self patch. This is
why our docs point this out for say GKE: we can't be sure you're running
single tenant, so we're not promising you there's no work to do. Update your
OSes people!

~~~
mike_hearn
No offence intended as I'm sure it's a bit of a madhouse there right now, but
is your statement really correct? I read the Spectre paper quite carefully and
it appears to be unpatchable. Although the Meltdown paper is the one that
conclusively demonstrated user->kernel and vm->vm reads with a PoC, and
Spectre "only" demonstrated user->user reads, the Spectre paper clearly shows
that any read type should be possible as long as the right sort of gadgets can
be found. There seems no particular reason why cross-VM reads shouldn't be
possible using the Spectre techniques and the paper says as much here:

 _For example, if a processor prevents speculative execution of instructions
in user processes from accessing kernel memory, the attack will still work._

and

 _Kernel mode testing has not been performed, but the combination of address
truncation /hashing in the history matching and trainability via jumps to
illegal destinations suggest that attacks against kernel mode may be possible.
The effect on other kinds of jumps, such as interrupts and interrupt returns,
is also unknown_

There doesn't seem to be any reason to believe VM to VM attacks are either
patched nor patchable.

My question to you, which I realise you may be unable to answer - how much
does truly dedicated hardware on GCE cost? No co-tenants at all except maybe
Google controlled code. Do you even offer it at all? I wasn't able to find
much discussion based on a 10 second search.

~~~
boulos
Sorry for the confusion.

 _I_ have been most focused on people being concerned that a neighboring VM
could suddenly be an attacker. You're right that the same kind of thing that
affects your JavaScript engine as a user affects say Apache or anything that
allows requests from external sources. However, that _situation_ already has a
much larger attack surface and people in that space should be updating
themselves whenever there's any CVE like this.

My concern was that the Azure announcement made it sound like they've done the
work, so nothing is required. That's not strictly true, even though providers
have mitigated one set of attacks at the host kernel layer, so I wanted to
correct that.

------
webaholic
Someone correct me if I understood this wrong. The way they are exploiting
speculative execution is to load values from memory regions they don't have
permission to access into a cache line, and when the speculation is found to
be false, the processor does not undo the write to the cache line?

The question is, how is the speculative write going to the cache in the first
place? Only retired instructions should be able to modify cache lines AFAIK.
What am I missing?

Edit: Figured it out. The speculatively accessed memory value is used to
compute the address of a load from a memory location which the attacker has
access to. Once the mis-speculation is detected, the attacker will time
accesses to the memory which was speculatively loaded and figure out what the
secret key is. Brilliant!
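
That recovery step can be sketched in a few lines of Python. This is only a simulation of the logic (the cache is a set and the "timings" are made-up constants); a real attack uses cache flushes and cycle counters:

```python
PAGE = 4096
cache = set()  # which probe-array offsets are currently "hot"

def speculative_gadget(secret_byte):
    # Architecturally this access is rolled back after the squash,
    # but the cache line it pulled in stays resident.
    cache.add(secret_byte * PAGE)  # touches probe_array[secret_byte * 4096]

def access_time(offset):
    return 1 if offset in cache else 100  # fake cycles: hit vs miss

def recover_byte():
    # Time one page per possible byte value; the fast one is the secret.
    timings = [access_time(i * PAGE) for i in range(256)]
    return min(range(256), key=timings.__getitem__)

speculative_gadget(0x53)    # pretend 0x53 was read from kernel memory
print(hex(recover_byte()))  # 0x53
```

One page per possible byte value is what makes the probe array 256 * 4096 bytes: each candidate value maps to its own cache line, so a single timing sweep identifies the byte.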

~~~
violinist
Important to note that at this point they're only reading one bit at a time
from kernel memory, but it could probably be changed to read more. Exactly how
many branches it could compare before the mis-speculation is detected is not
discussed, and that could be an area for large speedups in the attack.

------
adriancooney
Wow, what a find for the Project Zero team. This team and idea can only be
described as a success, well done.

------
wslh
Does Google have the best security team in the world? It seems like Google
security is in a completely different league. I can't imagine how this impacts
companies handling fiat money or cryptocurrencies in the cloud, like Coinbase
on AWS.

~~~
gervase
Project Zero is very well known for things exactly like this. Partially, it's
because they are incredibly talented, but there are also talented people in
academia and in other security consultancies. The biggest difference with
Project Zero is that their _primary_ [0] goal is altruistic: find
vulnerabilities, and let people who can fix them know (vs publishing papers,
securing paying clients, auctioning zero-days, etc), in the interests of
making the internet and computing as a whole a safer place.

[0] Their secondary goals are to protect Google products and services, and to
provide excellent PR in line with what we're discussing right here.

~~~
kzrdude
Hmm, that's tricky. These awesome findings didn't exactly provide net value
for Google, not even over the not-so-short term (the next 10 years?). They've
created a large problem for Google! :-)

~~~
ehsankia
Yes and no. The cost of some malicious party figuring it out and using it on
Google would potentially have been far greater than anything this could cost.

------
chrisb
"These vulnerabilities affect many CPUs, including those from AMD, ARM, and
Intel, as well as the devices and operating systems running them."

Curious. All other reports I've read state that AMD CPUs are not vulnerable.

~~~
MBCook
See the Twitter thread here:
[https://twitter.com/nicoleperlroth/status/948678006859591682](https://twitter.com/nicoleperlroth/status/948678006859591682)

(Edit: there are 9 posts total, go to her user page to see them all)

Seems there are two issues. One, called Meltdown, only affects Intel and is
REALLY bad, but the kernel page table changes everyone is making fix it.

The other, dubbed Spectre, is apparently common to the way all processors
handle speculative execution and is unfixable without new hardware.

I’d like to know more about that but I haven’t seen anything yet.

Whoever discovered this stuff on Google’s team deserves some sort of computer
security Nobel prize.

~~~
Cyph0n
That's not even close to a thread...

You can see all the tweets here (courtesy of @svenluijten):
[https://twitter.com/i/moments/948681915485351938](https://twitter.com/i/moments/948681915485351938).

~~~
PuffinBlue
The linked thread suggests that Spectre doesn't have _any_ mitigation.

> The business/economic implications are not clear, since eventually the only
> way to eradicate the threat posed by Spectre is to swap out hardware.

Is this fully accurate? Is there no software mitigation available now?

From [0], the above may be true:

> There is also work to harden software against future exploitation of
> Spectre, respectively to patch software after exploitation through Spectre .

There is 'work'? No current patch? So Spectre is unpatched?

This point doesn't seem to be being highlighted but appears particularly
important.

[0] [https://meltdownattack.com/#faq-fix](https://meltdownattack.com/#faq-fix)

~~~
Cyph0n
Yes, from my understanding, Spectre is an architectural-level flaw in the so-
called speculative execution unit. In other words, Spectre will only be fixed
once Intel, AMD, and ARM redesign the unit and release new processors. Given
the timelines of CPU design, this will take 5-10 years at least.

On the positive side, the flaw is very difficult to exploit in a practical
setting.

~~~
noncoml
> On the positive side, the flaw is very difficult to exploit in a practical
> setting.

Is it?

"As a proof-of-concept, JavaScript code was written that, when run in the
Google Chrome browser, allows JavaScript to read private memory from the
process in which it runs"

~~~
martinald
So is this fixable or not?

~~~
noncoml
Not really.

See: [https://blog.mozilla.org/security/2018/01/03/mitigations-lan...](https://blog.mozilla.org/security/2018/01/03/mitigations-landing-new-class-timing-attack/)

------
static_noise
So, as I gather, one of the main culprits is that the unwinding of
speculatively executed instructions is done incompletely. That is something
the people implementing the unwinding must have noticed and known. Somewhere,
the decision must have been made to unwind incompletely for some reason
(performance/power/cost/time).

As for the difference between AMD and intel. (From other posts here, not this
one.) The speculative execution can access arbitrary memory locations on intel
processors while this is not possible on AMD. This means that on intel
processors you can probe any memory location with only limited privileges.

As for the affected AMD and ARM processors I'm none the wiser. How are they
affected? Which models are affected? Does it allow some kind of privilege
escalation? The next days will surely stay interesting.

~~~
cesarb
You can't unwind completely. Once the cache is full, to load something on the
cache, it has to evict something else. You might be able to evict what you
just loaded, but you can't undo the earlier eviction.
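
cesarb's point, in a toy LRU cache model (a made-up 2-line, fully associative cache; real caches are set-associative but the irreversibility is the same):

```python
from collections import OrderedDict

class Cache:
    """Tiny fully-associative cache with LRU eviction."""
    def __init__(self, size=2):
        self.size, self.lines = size, OrderedDict()

    def load(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)        # refresh LRU position
        else:
            if len(self.lines) >= self.size:
                self.lines.popitem(last=False)  # evict least-recently-used
            self.lines[addr] = True

c = Cache(size=2)
c.load("A"); c.load("B")         # cache now holds {A, B}
c.load("speculative_line")       # speculative fill evicts A
c.lines.pop("speculative_line")  # "unwind" by evicting what we just loaded...
print("A" in c.lines)            # False: A's eviction cannot be undone
```

Even after the speculative line is removed, the earlier eviction of A remains observable (A is now slow to access), which is exactly the residue the attacks measure.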

~~~
static_noise
Only if your speculative reads do cause irreversible side-effects on those
caches. You could implement them in a way that doesn't modify the caches...
but that would be complicated and probably use more power and have lower
performance.

~~~
webaholic
One of the main reasons for speculative execution is to fetch data into the
caches ahead of them being needed. If you don't modify the cache, then you
throw that away.

May be one way would be to use a smaller, separate cache for speculative
execution and then copy that value to the regular cache once speculation is
confirmed? This would add a one cycle latency for cache-to-cache transfer but
there might be better ways.
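
A sketch of that idea (purely illustrative; real designs along these lines would also have to handle coherence, sizing, and the timing behavior of the side buffer itself):

```python
class SpeculationAwareCache:
    """Speculative fills land in a side buffer and only merge into the
    architectural cache when the speculation is confirmed (retires)."""
    def __init__(self):
        self.cache = set()        # architecturally visible cache state
        self.spec_buffer = set()  # fills from in-flight speculation

    def speculative_load(self, addr):
        self.spec_buffer.add(addr)

    def retire(self):
        # Speculation confirmed: commit the fills (the cache-to-cache copy).
        self.cache |= self.spec_buffer
        self.spec_buffer.clear()

    def squash(self):
        # Mis-speculation: discard the fills, leaving the cache untouched.
        self.spec_buffer.clear()

c = SpeculationAwareCache()
c.speculative_load("secret_dependent_line")
c.squash()
print("secret_dependent_line" in c.cache)  # False: no trace survives
```

On a squash, the architectural cache is bit-for-bit what it was before the speculation, so there is nothing left for a timing probe to measure.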

~~~
caf
This might actually _improve_ performance because it would prevent actually-
hot data being evicted from the cache in favour of cold data that was loaded
in a not-taken speculated branch.

------
richadams
[https://spectreattack.com/](https://spectreattack.com/)

Information site with some more information, and links to papers on the two
vulnerabilities, called "Meltdown" and "Spectre" (with logos, of course).

([https://meltdownattack.com/](https://meltdownattack.com/) goes to the same
site)

~~~
hrpnk
Both domains were registered on 2017-12-22. Given the planned disclosure on
January 9th that Google mentions, MS and others coding patches silently [1],
and the early reports [2] of kernel patches, does this mean that, due to
coding in the open, the whole disclosure procedure was vastly accelerated?

I wonder how the timing relates to New Year and many companies having holidays
in CW1.

[1] [https://lists.freebsd.org/pipermail/freebsd-security/2018-Ja...](https://lists.freebsd.org/pipermail/freebsd-security/2018-January/009651.html)

[2]
[https://news.ycombinator.com/item?id=16046636](https://news.ycombinator.com/item?id=16046636)

~~~
_delirium
Accelerated, but not vastly. Google's post says "We reported this issue to
Intel, AMD and ARM on 2017-06-01", so the embargo still ended up holding for 7
months, even with it ending a week early. The domain registration dates of
2017-12-22 seem to be just when Google started to prepare for releasing the
publicity materials, not when the vulnerability was discovered.

~~~
blattimwind
The Google Security Blog post actually says that the open development did not
cause the early breakdown of the embargo in the last 1-2 hours, but

> We are posting before an originally coordinated disclosure date of January
> 9, 2018 because of existing public reports and growing speculation in the
> press and security research community about the issue, which raises the risk
> of exploitation. The full Project Zero report is forthcoming.

------
tarruda
It seems that Richard Stallman is not so paranoid after all:

> I am careful in how I use the Internet.

> I generally do not connect to web sites from my own machine, aside from a
> few sites I have some special relationship with. I usually fetch web pages
> from other sites by sending mail to a program (see
> [https://git.savannah.gnu.org/git/womb/hacks.git](https://git.savannah.gnu.org/git/womb/hacks.git))
> that fetches them, much like wget, and then mails them back to me. Then I
> look at them using a web browser, unless it is easy to see the text in the
> HTML page directly. I usually try lynx first, then a graphical browser if
> the page needs it (using konqueror, which won't fetch from other sites in
> such a situation).

Ref: [https://stallman.org/stallman-computing.html](https://stallman.org/stallman-computing.html)

------
debt
Speculative execution seems like something that would be intuitively insecure
even to a layperson (relative to the field, of course).

I'm wondering, was this vulnerability theorized first and later found out to
be an actual vulnerability? Or was this something that nobody had any clue
about?

I'm only saying this, because from a security perspective, I imagine somewhere
at some point very early on someone had to have pointed out the potential for
something like speculative execution to eventually cause security problems.

I just don't understand how chip designers assumed speculative execution
wouldn't eventually cause security problems. Is it because chip designers were
prioritizing performance above security?

~~~
mark-r
Speculative execution isn't supposed to leak information; if the speculative
instructions aren't supposed to execute, all traces of them should be rolled
back. I'd be curious to see what the details of this bug really are. I'm not
sure how much will be disclosed in the interests of keeping exploits from
popping up.

~~~
dboreham
"all traces" includes timing differences in execution of non-privileged code,
which it turns out are not rolled back.

~~~
blattimwind
Or side-effects by loading data into the cache hierarchy.

------
im3w1l
I don't think this is the last we have seen of side channels; it's just a
ridiculously hard problem to get right. And for that reason I can't feel too
angry at the processor makers.

And I certainly expect to see more things like this (but at least hopefully
with lower bandwidth).

------
anonfunction
AMD put out an announcement:

[https://www.amd.com/en/corporate/speculative-execution](https://www.amd.com/en/corporate/speculative-execution)

------
aeleos
Wow, so Intel comes out and says "what is all the panic about, there is
nothing wrong" (despite knowing about this), then Amazon drops the "we are
updating everything right now" bomb, and then Google drops the mother of all
CPU bugs. In a previous thread someone was asking if it really is all that
bad, and at this point I think it's safe to say that yes, it is.

------
partiallypro
So, is AMD affected or not? This seems fairly important. The Google blog post
sort of contradicts itself in this regard. AMD itself has said:

"The threat and the response to the three variants differ by microprocessor
company, and AMD is not susceptible to all three variants. Due to differences
in AMD's architecture, we believe there is a near zero risk to AMD processors
at this time."

So either AMD is lying or Google's blog post is wrong. Granted AMD's statement
is a bit muddled, not sure if they mean they aren't susceptible to all THREE
variants (as in only 1/3) or they aren't susceptible to ALL three variants (as
in none of them.)

~~~
AsyncAwait
It seems like it is not affected by the most serious bug, but may be by a
lesser one.

~~~
partiallypro
That's what I'm thinking, effected by Spectre, but not by Meltdown. But more
clarity would be appreciated on Google and AMD's front. I mean from a pure PR
angle, AMD has a lot to gain if they can clear the air more.

------
adrianpike
Can someone with a little more experience this low-level let me know if this
is as bad as I think it is?

Because this looks real bad:

> Reading host memory from a KVM guest

~~~
trevyn
"We wrote a JavaScript program that successfully reads data from the address
space of the browser process running it."

Yeah, it's pretty bad.

~~~
madez
A perfect occasion to invite others into my current exercise of using the web
without JavaScript.

~~~
userbinator
...and for those of us who leave JS off by default except for a few very
trusted sites, the bar for turning on JS on a site that asks to just went up a
lot higher.

------
bloorp
So is speculative execution just inherently flawed like this, or can we expect
chips in 2 years that let operating systems go back to the old TLB behavior?

~~~
AndrewBissell
Yeah I was wondering this myself. Even if there's some fiddly hardware fix to
make speculative execution secure, how much of its performance gains will we
have to give up to get there?

~~~
mtanski
Speculative execution as a concept is not inherently flawed. My take is that
the results of illegal speculation should never be leaked in a visible way.

~~~
bloorp
As I read through the meltdown paper, it looks really difficult to have the
security we want and the performance we want at the same time. It's pretty
crazy, but here's my limited understanding:

There's a huge shared buffer between the two threads: 256 pages of 4K each.
One thread (speculatively) reads a byte of kernel memory, literally any byte
it wants, and then reads from the one page in that buffer whose index
corresponds to the byte value it just read, pulling that page into the cache.
At some point the CPU determines that the thread shouldn't have been permitted
to access the kernel memory location and rolls back all of that speculative
execution, but the cached page isn't affected by the rollback.

The other thread iterates through those 256 pages, timing how long it takes to
read from each one, and the page Thread A accessed will have a shorter timing
because it's already cached. It now knows one byte of kernel memory that it
shouldn't. That's just one byte, but the whole process is so fast that it's
easy to go nuts on the whole kernel address space.

So what would the fixes be? Disable speculative execution? Only do it if the
target memory location is within userspace, or within the same space as the
executing address? Plug all of the sideband information leak mechanisms? I
dunno.

~~~
rasz
Keep a small pool of cache lines exclusive to speculative execution, discard
when non taken, rename affected cache lines (like register renaming so no
copy) when taken.

~~~
ant6n
Also, separate BTB for each process and privilege level.

~~~
rasz
Yes, and this would have the bonus effect of actually gaining IPC in
multi-process loads.

------
blattimwind
So this confirms the suspicion that the bug allows VM-to-VM disclosure of
memory, which would conclusively explain the rush.

------
rconti
What are the odds that the NSA already knew about this? Roughly 100%?

~~~
InclinedPlane
This is a toughie. These bugs are basically very difficult to mitigate
completely without fixes at the hardware level. One might imagine the NSA
being coy and patching their own OS's et al to the degree they can while
working to exploit the bug in the wild. However, the reality is that this bug
is almost worse for the NSA than for most other folks, because they have the
most to lose if their security is breached. And they have _a lot_ of machines
out there. The idea of a bug of this severity that leaves no traces is
probably leaving a lot of people at the NSA in cold sweats right now. Meaning
that if they did discover it before other researchers it's questionable
whether they would have tried to exploit it vs. driving towards the most rapid
possible mitigation and fix.

------
TheAlchemist
I believe most crypto exchanges are running in the cloud. What could possibly
go wrong ?

~~~
umanwizard
I just sold all my altcoins for BTC on Binance as soon as I saw this and
transferred them to gdax. Hopefully I can sell them for USD on gdax and
transfer to a real bank before they get hacked.

~~~
candl
Why would you do that? If you are concerned for the security of your coins,
you should have moved them to a wallet you own that is not hosted on an
exchange. The bank you transfer your dollars to is just as likely to get hit
by the exact same vulnerability. In addition you have to pay a fee to move
your coins, then to wire the dollars to your bank account. Moving from crypto
to fiat is also liable to taxation. If the sole goal is to secure your coins
then I don't think that the whole process is worth the hassle. Moving them to
a private wallet would suffice.

~~~
mwgalloway
The majority of coins on Coinbase are in cold-storage and crypto on Coinbase
is insured against this type of breach. I personally wouldn't panic to get my
coins out.

~~~
user5994461
There was an announcement not long ago saying they are not insured.

~~~
mwgalloway
Can you reference that? The CoinBase support page currently says crypto is
insured against security breaches on their end.

~~~
umanwizard
Crypto in their hot wallets, or crypto in cold storage?

------
ateesdalejr
So, basically CPUs will execute instructions inside a branch even if the
branch condition will eventually evaluate to false. Does the CPU do this to optimize
branch instructions? The results of instructions that are executed ahead of
time are stored in a cache. How exactly does this exploit read from the cache?
I understand it uses timing somehow but I'm not quite sure exactly how that
works. (I mostly do software.)

~~~
rocqua
It's a timing attack against the cache. The speculative execution might need
to do a read, which means something would need to be evicted from the cache.
This makes a subsequent read of that evicted address slower.

This way you can detect things based on speculative execution. I don't know
how they go from that to reading memory though.

~~~
pwg
> I don't know how they go from that to reading memory though.

That was the second bit of the example source code:

unsigned long index2 = ((value&1)*0x100)+0x200;

This creates one of two different addresses, depending upon the value of bit
zero of the memory location being attacked. The two different addresses are
farther apart than the size of a cache line.

> unsigned char value2 = arr2->data[index2];

This actually does the read from one of the two different addresses (which
results in the value located at one of them becoming resident in cache). Note
that the value returned here is a "don't care" item.

Then, after everything unwinds from the speculation, the follow on code on the
real path would read from both of the two possible addresses that were put
into "index2". The read that returns data faster must have been in cache.
Knowing which one was in cache, you now know the value of bit zero of the
target address location.

Repeat the same block of code for bits 1-7 and you'll have read a whole byte.
Continue and you can read as much as you like. You just gather data very
slowly (the article mentioned about 2000 bytes per second).

~~~
rocqua
Ah, that makes sense, thanks!

I was thinking of something similar but with a branching operation, but that
would get screwed by branch prediction.

------
AndyNemmity
First implementation I've seen on twitter.

[https://twitter.com/pwnallthethings/status/94869396135866777...](https://twitter.com/pwnallthethings/status/948693961358667777)

------
cfeeley
One of the meltdown paper writers evidently has a sense of humor since
"hunter2" [0] is one of the passwords they use in their demonstration [1]

[0] [http://bash.org/?244321](http://bash.org/?244321)

[1]
[https://meltdownattack.com/meltdown.pdf](https://meltdownattack.com/meltdown.pdf)
(page 13, figure 6)

~~~
f2f
hunter2 is the industry's accepted PoC password.

~~~
pit2
wasn't it dolphin?

~~~
swarnie_
A previous client used this, i always wondered.

------
Havoc
So what exactly are they going to do about spectre? Seems pretty unstoppable
from what I can see.

Can they disable speculative exec completely for sensitive boxes or is this
too baked in?

~~~
Filligree
There's no mitigation. We'll need new CPUs.

Meanwhile, don't ever run untrusted code in the same process as any kind of
secret. Better yet, don't ever run untrusted code.

~~~
hinkley
I wonder what fraction of data inside a kernel is really ‘private’.

Obviously we want 100% of the data in the kernel not to be writeable, but if
only a small amount shouldn’t be accessible at all then maybe the long term
solution is to handle that data in a special way. Something that makes using
_it_ slower but doesn’t make every other syscall suffer as much as a
consequence.

Or maybe the solution is to prioritize moving more and more code into
userspace.

~~~
kuschku
Well the good news is that now microkernels can take over. With KPTI (also
known as FUCKWIT), a syscall is now as expensive as a context switch to
another userland process.

Of course, that means now monolithic kernels run just as slow as microkernels.

~~~
JdeBP
I suggest some reading first, starting with (but not limited to):

* [http://blog.darknedgy.net/technology/2016/01/01/0/](http://blog.darknedgy.net/technology/2016/01/01/0/) ([https://news.ycombinator.com/item?id=10824382](https://news.ycombinator.com/item?id=10824382))

* [https://news.ycombinator.com/item?id=10483467](https://news.ycombinator.com/item?id=10483467)

~~~
kuschku
I've read them — but the important factor is that Linux with KPTI is now doing
a full context switch between userland and kernel, which is the same cost as
switching to another userland process to handle the syscall (which is exactly
what a naive microkernel would do).

I've always been a proponent of microkernels, and this is another situation
that might help with this.

(Personally, I've been affected by the failures of monolithic kernels way too
often. When a simple OpenGL or WebGL program manages to hang your GPU driver,
parts of the kernel, and all DMA operations in the kernel, and your system
becomes unusable, then reasonable isolation would be preferable)

------
wakkaflokka
Can someone more knowledgeable than me in regards to this vulnerability tell
me:

1. How to best protect my local personal data from being subject to this?

2. Whether I should seriously consider pulling all my cryptocurrency off of
any exchanges?

~~~
avaika
from my understanding:

1:

- install security updates for your OS
- if a patch isn't ready yet: disable JavaScript in your browser by default
and enable it only for resources you trust; otherwise just skip the page
- execute third-party code with extra caution; any suspicious code should go
away (even if it's not inside a VM)

2: as long as it's stored in a wallet on your own hardware which you fully
control, it should be safe enough

------
zitterbewegung
So how much legal liability are they exposed to due to this security flaw?

Since this affects legacy systems that may not be able to be upgraded it seems
like this issue will be around for a very long time.

~~~
userbinator
_Since this affects legacy systems that may not be able to be upgraded it
seems like this issue will be around for a very long time._

It also only affects "legacy systems" which routinely run untrusted code. If
it's something like e.g. a server in a bank, chances are everything running on
it has already been accounted for. This isn't like e.g. Heartbleed where you
could just connect to any open server and read its memory --- you have to
somehow get your code to run on it first.

~~~
djsumdog
Really makes the case against going to the "cloud" (using hosted VM solutions)
versus just using colocated servers running VMWare that you fully own and
administer.

------
perennate
I can't understand this paragraph from [1]:

> Cloud providers which use Intel CPUs and Xen PV as virtualization without
> having patches applied. Furthermore, cloud providers without real hardware
> virtualization, relying on containers that share one kernel, such as Docker,
> LXC, or OpenVZ are affected.

I take it to imply that hypervisors that use hardware virtualization are not
affected. However, the PoC that reads host memory from a KVM guest seems to
contradict this.

Is it because on Xen HVM, KVM, and similar hypervisors, only kernel pages are
mapped in the address space of the VM thread (so a malicious VM cannot read
memory of other VMs), but on these other hypervisors, pages from other
containers are mapped? Yet the Xen security advisory [2] says:

> Xen guests may be able to infer the contents of arbitrary host memory,
> including memory assigned to other guests.

Relatedly, what sensitive information other than passwords could appear in the
kernel memory? I'd expect that at the very least buffers containing sensitive
data pertaining to other VMs may be leaked.

[1] [https://meltdownattack.com/](https://meltdownattack.com/) [2]
[https://xenbits.xen.org/xsa/advisory-254.html](https://xenbits.xen.org/xsa/advisory-254.html)

~~~
caf
The kernel memory map generally includes the 'direct map' of all physical
memory - so, everything that is resident is potentially at risk.

------
nickysielicki
> Meltdown breaks all security assumptions given by address space isolation as
> well as paravirtualized environments and, thus, every security mechanism
> building upon this foundation.

> On affected systems, Meltdown enables an adversary to read memory of other
> processes or virtual machines in the cloud without any permissions or
> privileges, affecting millions of customers and virtually every user of a
> personal computer.

------
rtpg
Reading over this.... it sounds like ultimately the exploit in Linux still
only works thanks to being able to run stuff in the kernel context through
eBPF?

The first section states that even with the branch prediction you still need
to be in the same memory context to be able to read other process's memory
through this. But eBPF lets you run JIT'd code in the kernel context.

I guess this JITing is also the issue with the web browsers, where you end up
getting access to the entire browser process memory.

But ultimately the dangerous code is still code that got a "privilege
upgrade"? the packet filter code for eBPF, and the JIT'd JS in the browser
exploit?

So if our software _never_ brought user's code into the kernel space, then we
would be a bit safer here? For example if eBPF worked in... kernel space, but
a different kernel space from the main stuff? And Site Isolation in Chrome?

~~~
caf
No. For that attack, the code that is speculatively executed does need to be
in the target context, but that _doesn't_ mean the code has to be
attacker-supplied (that just makes it easier).

It's also possible to use existing code in the target context as the
speculative execution path if it has the right form (and this is what P0's
Variant 2 POC does, in that case by poisoning the branch predictor in order to
make it speculatively execute a gadget that has the right form).

------
krylon
I should point out up front that I am by no means an expert on CPU design,
operating systems, or infosec.

But I just remembered that _years_ ago the FreeBSD developers discovered a
vulnerability in Intel's Hyperthreading that could allow a malicious process
to read other processes' memory.[1]

To the degree that I understand what is going on here, that sounds very
similar to the way the current vulnerabilities work.

For a while, back then, I was naive enough to think this would be the end of
SMT on Intel CPUs, but I was very wrong about that.

So I am wondering - is this just a funny coincidence, or could people have
seen this coming back then?

[1] [http://www.daemonology.net/hyperthreading-considered-
harmful...](http://www.daemonology.net/hyperthreading-considered-harmful/)

------
makomk
The ARM whitepaper is also worth a read in terms of how it affects them and
mitigations on that platform: [https://developer.arm.com/support/security-
update](https://developer.arm.com/support/security-update)

------
KenoFischer
I'm really amazed by the simplicity of the meltdown gadget. After the initial
blog post I played with a few variants, but always got the zeroed out register
in the speculative branch. I guess what people (including me) were looking for
here was some other side channel or instruction that did not have this
mitigation in place (e.g. I had hoped a cmpxchg would leak whether the target
memory address matches the register to compare with). The shl/retry loop makes
a lot of sense if you instead assume that the mitigation was implemented
improperly and can race subsequent uops. I really can't imagine why this data
ever made it to the bypass network to be available to other uops.

------
alkonaut
I wonder if the whole thing with enormously complex CPUs requiring deep
pipelines which in turn requires complex speculation etc was a design mistake?
Is there an alternative history where mainstream CPUs are equally fast with a
dumber/simpler design?

~~~
richardwhiuk
Not that we currently know about. RISC instead of CISC is better here, as it
shortens the pipeline, but even RISC processors perform speculative execution
because of the cost of waiting until a branch is fully resolved.

~~~
alkonaut
What about more radically different designs? E.g Mill or others?

~~~
mike_hearn
The Mill does prediction:

[https://millcomputing.com/docs/prediction/](https://millcomputing.com/docs/prediction/)

It has to. The problem is the speed of light here, not a simple slipup by a
CPU designer.

~~~
alkonaut
So all that needs to be done is make 64GB L1 on the die...

~~~
richardwhiuk
Not possible - the physical size of 64GB (even at nm scale) means that the
time it takes for a signal to traverse it causes memory to take a long time to
access, meaning you need a L0 cache to maintain performance.

~~~
ant6n
We need to go 3d

------
intsunny
Since no one has yet posted Amazon AWS security bulletin:

[https://aws.amazon.com/security/security-
bulletins/AWS-2018-...](https://aws.amazon.com/security/security-
bulletins/AWS-2018-013/)

------
kodablah
[https://github.com/IAIK/meltdown](https://github.com/IAIK/meltdown) 404's. I
assume this is by intention? So full disclosure, but missing the code? Or is
it somewhere else?

~~~
richardwhiuk
Due to early embargo lifting, I expect not everything's been publicized yet

------
bit_logic
According to the page, Project Zero only tested with AMD Bulldozer CPUs. Why
didn't they use something based on Zen/Ryzen? It's not clear if the 3 issues
affect Zen/Ryzen or not.

~~~
Havoc
Ryzen is affected by spectre but not meltdown by the looks of it

------
Unklejoe
Just an idea that I had:

Since these exploits seem to rely on taking precise timing measurements (on
the order of nanoseconds), could we eliminate or restrict this functionality
in user space?

The Spectre exploit uses the RDTSC instruction, and this can apparently be
restricted to privilege level 0 by setting the TSD flag in CR4.

I know it would kind of suck, but it might be better than nothing.

I would think that most typical user applications wouldn't require that
accurate a time measurement. If they do, then maybe they can be whitelisted?

~~~
voidmain
Denying access to timers is kind of practical for browser JavaScript, and
should and will happen. But it's not practical for native processes, because
_shared memory multithreading_ provides as high precision a timer as anyone
could ask for: just increment a counter in a loop in a different thread.

In fact, the practical JavaScript attacks use this method (using
SharedArrayBuffer) and the browsers are disabling this (new, little used)
feature as a mitigation. But I'm afraid hell will freeze over before
mainstream operating systems deny userspace access to clocks, threads, and
memory mapped files, which is a lower bound on what it would take to make the
attack much harder.

------
geertj
What is the reason that Intel would allow speculative instructions to bypass
the supervisor bit and access arbitrary memory? That seems the root cause for
Meltdown.

Is it that the current privilege level could be different between what it is
now, and what it will be when the speculative instruction retires? If so then
that seems a thin justification. CPL should not change often so it doesn't
seem worth it to allow speculative execution for instructions where a higher
CPL is required.

~~~
humanjvm
IIUC, these speculative instructions respect the current supervisor bit which
was set by the previous faulting instruction.

------
cmurf
_There are 3 known CVEs related to this issue in combination with Intel, AMD,
and ARM architectures. Additional exploits for other architectures are also
known to exist. These include IBM System Z, POWER8 (Big Endian and Little
Endian), and POWER9 (Little Endian)._

[https://access.redhat.com/security/vulnerabilities/speculati...](https://access.redhat.com/security/vulnerabilities/speculativeexecution)

------
anonu
How come this wasn't discovered sooner?

It would seem to me that all the really smart people who designed super-scalar
processors and all the nifty tricks that CPUs do today - would have thought
that these attacks would be in the realm of possibility. If that's the case -
who's to say these attacks haven't been used in the wild by sophisticated
players for years now?

Seems like the perfect attack. Undetectable. No log traces.

------
Darthy
Could somebody please coin a name for this? Wikipedia currently calls it
"Intel KPTI flaw", but that is very vague. It's quite difficult to talk about
something without a simple easy-to-remember name.

Edit: has been settled, it's
[https://en.wikipedia.org/wiki/Meltdown_(security_bug)](https://en.wikipedia.org/wiki/Meltdown_\(security_bug\))
.

~~~
frio
[https://spectreattack.com/](https://spectreattack.com/) :).

------
Pyxl101
Is there any information available about whether the Linux KPTI patch
mitigates the ability to use eBPF to read kernel memory?

I'm asking because eBPF seems to execute within the kernel, and KPTI seemed to
be about unmapping kernel page table when userspace processes execute.

Are there any mitigations to the eBPF attack vector?

~~~
brendangregg
sysctl -w kernel.unprivileged_bpf_disabled=1

I use eBPF all the time, but I never use it as non-root, so I haven't needed
unprivileged bpf anyway.

update: that eBPF vector was already fixed, and another safety measure is
already being considered
[https://lkml.org/lkml/2018/1/3/895](https://lkml.org/lkml/2018/1/3/895)

------
j_coder
Isn't it possible for the kernel to patch all clflush instructions when the
software is loaded, keeping a circular list of all evicted addresses that
would be evicted again on the interrupt that happens when the protected
address is read? That way the timing attack would not be possible.

~~~
koverstreet
Self-modifying code (which exists) would take a massive performance hit. Any
time a page is marked +X, the kernel would have to mark it -W, and then on
page fault the kernel would have to check if userspace was changing something
to a clflush instruction.

Oh, and x86 has variable-length instructions - the same byte stream can decode
as different instructions depending on where you start - so I doubt it's
possible at all on x86 without a massive performance hit (you'd have to keep
track of every jump instruction in the entire address space...)

~~~
j_coder
You are right.

The best approach would be to evict all user-space pages from the cache when
an invalid page access happens, if the page fault was caused by software
trying to read/write kernel-space pages.

Massive performance hit, but only for misbehaving software. Normal software
would not pay the performance cost of the current solution.

The kernel could even switch to the unmapped-kernel-pages solution if too many
read/write attempts occur.

------
zzzcpan
Does anyone know what kind of isolation still can work after all the patches?
Let's say we want to host users' processes or containers and some of them
could be pwned. I see Google claiming that their VMs are isolated between the
kernel and each other.

------
qaq
Are extensions like 1password vulnerable do they run in the same process as js
from a page?

~~~
omeid2
Process is irrelevant for both Meltdown and Spectre.

~~~
qaq
Process is relevant for what you can get to from V8 VM.

------
ionforce
Is this saying that AMD is affected? Is this the same as the Intel bug
reported earlier?

~~~
JL2010
Of the variants of the attack that can leak privileged memory, AMD is only
impacted if a non-default kernel configuration is enabled: "BPF JIT"

~~~
DannyBee
This is not correct. That's just what the POC chose to implement.

------
muxator
From
[https://meltdownattack.com/meltdown.pdf](https://meltdownattack.com/meltdown.pdf),
page 12:

> Thus, the isolation of containers sharing a kernel can be fully broken using
> Meltdown.

------
delaaxe
Can someone show me an example of JavaScript code running in a browser that
would display a password stored in kernel space?

Websites like the Guardian report that this is now the case but I don't
understand how that's possible.

~~~
mdavidn
The kernel maps itself into the address space of each process as an
optimization to increase the performance of system calls. So yes, it is
possible.

~~~
delaaxe
So, which functions would you have to call? How would you read the secrets?
You can't do any kind of pointer magic in JS (nor system calls).

------
j_coder
Looks like the information was somewhat publicly available since the middle of
last year on [https://cyber.wtf/2017/07/28/negative-result-reading-
kernel-...](https://cyber.wtf/2017/07/28/negative-result-reading-kernel-
memory-from-user-mode/) and
[http://www.cs.binghamton.edu/%7Edima/micro16.pdf](http://www.cs.binghamton.edu/%7Edima/micro16.pdf).
Also similar methods from 2013 paper [http://www.ieee-
security.org/TC/SP2013/papers/4977a191.pdf](http://www.ieee-
security.org/TC/SP2013/papers/4977a191.pdf) (timing side channel attacks).

Any reason for the panic now? Any known malware using it?

~~~
jpatokal
No. This was all scheduled to be released on January 9th, but the release was
sped up after people started connecting dots.

 _We are posting before an originally coordinated disclosure date of January
9, 2018 because of existing public reports and growing speculation in the
press and security research community about the issue, which raises the risk
of exploitation._

[https://security.googleblog.com/2018/01/todays-cpu-
vulnerabi...](https://security.googleblog.com/2018/01/todays-cpu-
vulnerability-what-you-need.html)

~~~
j_coder
I know it was scheduled, but the information in those links was public prior
to the scheduled disclosure. A hacker could have figured out the problem by
reading the available information before Google Project Zero did.

------
bung
Will patches for this eventually trickle down to things like LineageOS?

~~~
phoe-krk
LineageOS is based on official Android sources. The moment the official
Android kernel is patched, LineageOS will use the patch.

~~~
londons_explore
I like the way you say _moment_, as if the Android kernel were a single thing
and not a hodgepodge of hundreds of different kernels across tens of companies.

------
DarronWyke
Thanks to incidents like these, I'm very happily employed. One of the perks of
working in infosec.

I hereby nominate 2018's song to be Billy Joel's _We Didn't Start the Fire_.

------
trendia
Does this vulnerability affect Linux only, or any operating system?

~~~
saemil
The issue is with the chip. So, it should impact any OS running on the chip.
This would include Windows as well as macOS running on Intel chips.

------
Splendor
Do we know how news of this got out before the disclosure date?

~~~
NelsonMinar
See this blog post, which is some very informed speculation based on public
Linux kernel patch activity.
[http://pythonsweetness.tumblr.com/post/169166980422/the-
myst...](http://pythonsweetness.tumblr.com/post/169166980422/the-mysterious-
case-of-the-linux-page-table)

~~~
fishywang
I couldn't find it in the blog post or the Compute Engine Security Bulletin,
does anyone know which version of Linux Kernel contains the mitigation?

~~~
blattimwind
4.14.11 is the only stable kernel as of this writing.

------
evibeefi
This sounds really bad. I wonder: will this have major implications for
consumers other than slowed-down devices?

------
gruez
does this mean the embargo is lifted?

~~~
ipsin
[https://security.googleblog.com/2018/01/todays-cpu-
vulnerabi...](https://security.googleblog.com/2018/01/todays-cpu-
vulnerability-what-you-need.html)

Yes, this explains why it was lifted.

------
ebonassi
Should we start thinking seriously about adopting homomorphic encryption in
virtualized environments?

------
Havoc
No wonder they were rushing this.

------
jasonlfunk
As a side topic, are we really in a place that even vulnerabilities need
branding and websites?

~~~
simias
Why not? Those big security vulnerabilities are going to be discussed in years
to come, might as well come up with something a little more catchy than
CVE-2017-5753. I guess they could've gone with more descriptive names.

At least "spectre" and "meltdown" will be memorable even for non-technical
people (who should probably be aware of the issue even if they don't
understand the technical details). "Bounds check bypass" and "branch target
injection" probably sound like random words strung together to most people.

------
hollerith
Thanks again to the geniuses who arranged things so that almost anyone can
write code that I must run just so I can use the internet to find and to read
public documents

(unless I undergo the tedious process of becoming a noscript user or something
similar).

------
solotronics
best for now to get your crypto coins off the exchanges if you have them there

------
swampthinker
"Testing also showed that an attack running on one virtual machine was able to
access the physical memory of the host machine, and through that, gain read-
access to the memory of a different virtual machine on the same host."

Holy shit.

~~~
static_noise
This basically kills cloud computing for anything sensitive using shared
hardware. In the short term this will actually be good for cloud providers
because the demand for dedicated instances will shoot up as there is no short-
term alternative.

~~~
untog
The short term answer is to patch the servers and swallow the 30% performance
cut. Still likely cheaper than dedicated servers.

~~~
djsumdog
Which could mean huge sales for Intel, or even AMD, if Amazon, DigitalOcean,
Linode and others want to rush to get that lost performance back.

Going to AMD would be incredibly expensive as you'd be replacing nearly
everything, but if Intel gets new chips out in a reasonable amount of time,
they might actually make a killing on this.

------
azurezyq
link for details for that from Project Zero:

[https://googleprojectzero.blogspot.com/2018/01/reading-
privi...](https://googleprojectzero.blogspot.com/2018/01/reading-privileged-
memory-with-side.html)

~~~
AnimalMuppet
Interesting. Quoting a fair-sized chunk for context:

> So far, there are three known variants of the issue:

> Variant 1: bounds check bypass (CVE-2017-5753) > Variant 2: branch target
> injection (CVE-2017-5715) > Variant 3: rogue data cache load (CVE-2017-5754)

> During the course of our research, we developed the following proofs of
> concept (PoCs):

> A PoC that demonstrates the basic principles behind variant 1 in userspace
> on the tested Intel Haswell Xeon CPU, the AMD FX CPU, the AMD PRO CPU and an
> ARM Cortex A57 [2]. This PoC only tests for the ability to read data inside
> mis-speculated execution within the same process, without crossing any
> privilege boundaries.

> A PoC for variant 1 that, when running with normal user privileges under a
> modern Linux kernel with a distro-standard config, can perform arbitrary
> reads in a 4GiB range [3] in kernel virtual memory on the Intel Haswell Xeon
> CPU. If the kernel's BPF JIT is enabled (non-default configuration), it also
> works on the AMD PRO CPU. On the Intel Haswell Xeon CPU, kernel virtual
> memory can be read at a rate of around 2000 bytes per second after around 4
> seconds of startup time. [4]

> A PoC for variant 2 that, when running with root privileges inside a KVM
> guest created using virt-manager on the Intel Haswell Xeon CPU, with a
> specific (now outdated) version of Debian's distro kernel [5] running on the
> host, can read host kernel memory at a rate of around 1500 bytes/second,
> with room for optimization. Before the attack can be performed, some
> initialization has to be performed that takes roughly between 10 and 30
> minutes for a machine with 64GiB of RAM; the needed time should scale
> roughly linearly with the amount of host RAM. (If 2MB hugepages are
> available to the guest, the initialization should be much faster, but that
> hasn't been tested.)

> A PoC for variant 3 that, when running with normal user privileges, can read
> kernel memory on the Intel Haswell Xeon CPU under some precondition. We
> believe that this precondition is that the targeted kernel memory is present
> in the L1D cache.

If I'm reading this right, then the only PoC that works against ARM is the
first one, which lets you read data _within the same process_. Not too
impressive. (Yes, I know that I'm assuming they tried to run all the PoCs
against all the processors. But the "Tested Processors" section lower down
leads me to believe that they did in fact do so.)

The third and fourth PoCs seem to be Intel-specific.

~~~
makomk
The paper from the other people who discovered this says the same thing: "We
also tried to reproduce the Meltdown bug on several ARM and AMD CPUs. However,
we did not manage to successfully leak kernel memory with the attack described
in Section 5, neither on ARM nor on AMD." The general-purpose attack that
leaks kernel memory, the one that KAISER fixes, only seems to work on Intel
CPUs. Intel's press release was misleading.

~~~
AnimalMuppet
Well... reading further, below the details of the third POC, they say "Our
research was relatively Haswell-centric so far. It would be interesting to see
details e.g. on how the branch prediction of other modern processors works and
how well it can be attacked."

So it seems like they tried it on AMD and ARM, but they tried much harder on
Intel. That's less reassuring than my initial reading.

------
feelin_googley
In 1-2 words, IMO, the problem is "over-optimisation".

It is perhaps beneficial to be using an easily portable OS that can be run on
older computers, and a variety of architectures.

Sometimes older computers are resilient against some of today's attacks _to the
extent those attacks make assumptions about the hardware and software in use_.
(The same is true for software.)

When optimization reaches a point where it exposes one to attacks like the
ones being discussed here, then maybe the question arises whether the
optimization is actually a "design defect".

What is the solution?

IMO, having choice is at least part of any solution.

If _every user is effectively "forced" to use the same hardware and the same
software_, perhaps from a single source or small number of sources, then that
is beneficial for those sources but, IMO, counter to a real solution for
users. Lack of viable alternatives is not beneficial to users.

------
pjf
More details at [https://googleprojectzero.blogspot.com/2018/01/reading-
privi...](https://googleprojectzero.blogspot.com/2018/01/reading-privileged-
memory-with-side.html)

------
masterleep
I wonder what this sentence in the Google product status page
([https://support.google.com/faqs/answer/7622138](https://support.google.com/faqs/answer/7622138))
means, particularly what the inter-guest attack refers to:

"Compute Engine customers must update their virtual machine operating systems
and applications so that their virtual machines are protected from intra-guest
attacks and inter-guest attacks that exploit application-level
vulnerabilities"

~~~
arianvanp
What I understand is that the hypervisor of GCE has already been patched, so
another customer running on the same machine can't exploit you. However, if
you are running KVM or something yourself on a cloud instance (a VM in a VM),
then you should patch that.

------
shaklee3
Intel has released a statement for the codename Meltdown bug:

[https://newsroom.intel.com/news/intel-responds-to-
security-r...](https://newsroom.intel.com/news/intel-responds-to-security-
research-findings/)

~~~
AndyNemmity
Again conflating the issue to include AMD. This feels so disingenuous.

------
infinity0
> We have some ideas on possible mitigations and provided some of those ideas
> to the processor vendors; however, we believe that the processor vendors are
> in a much better position than we are to design and evaluate mitigations,
> and we expect them to be the source of authoritative guidance.

Intel: "Recent reports that these exploits are caused by a “bug” or a “flaw”
[..] are incorrect."

So much for "authoritative guidance", fuck these guys.

~~~
Someone1234
Arm also claims it is working as intended:

> Arm recognises that the speculation functionality of many modern high-
> performance processors, despite working as intended, can be used in
> conjunction with the timing of cache operations to leak some information as
> described in this blog.

I personally don't agree, but I guess they're trying to avoid needing to issue
a recall for over ten years worth of CPUs?

~~~
wahern
Then surely you must also argue that all data-dependent, side-channel attacks,
such as key recovery attacks against some cryptographic algorithm
implementations, are the fault of the hardware.

Unlike Intel, ARM and AMD are implicated only where the attacker can inject
code or data (specifically data that is manipulated by pre-existing vulnerable
code) into the target address space. The particular kernel exploits require
injection of a JIT-compiled eBPF program, as they said they were unable to
locate any suitable gadgets in existing compiled kernel code. I wouldn't rule
out gadgets being found in the future, but much like cryptographic software
timing attacks, the proper fix is to refactor sensitive software logic to be
data independent. There's no way to implement an out-of-order, superscalar
architecture and protect against this stuff simply because of the nature of
memory hierarchies. All you can do is 1) ensure that privilege boundaries are
obeyed (like AMD and ARM do, but Intel notably doesn't), and 2) provide
guaranteed, constant-time instructions that programmers and compilers can
reliably and conveniently leverage. Unfortunately, all the hardware vendors
have sucked at providing #2 (much timing-resilient cryptographic software
relies on implicit, historical timing behavior, not architecturally guaranteed
behavior), but it nonetheless still requires cooperation by software
programmers, making it a shared burden.
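To make the "data independent" point concrete, here is a minimal sketch of the kind of refactoring meant above: a constant-time buffer comparison whose running time doesn't depend on where the buffers first differ, unlike `memcmp`, which typically returns at the first mismatch and so leaks the mismatch position through timing. The function name is invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Constant-time equality check: touches every byte regardless of
 * input, accumulating differences with XOR/OR instead of branching
 * on data. Timing therefore depends only on n, not on the contents. */
int ct_equal(const uint8_t *a, const uint8_t *b, size_t n) {
    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= a[i] ^ b[i];   /* nonzero iff any byte differs */
    return diff == 0;          /* 1 if equal, 0 otherwise */
}
```

Note this only guards the timing of the comparison itself; as the comment above says, it relies on the compiler and hardware not reintroducing data-dependent behavior, which is exactly the architectural guarantee vendors haven't provided.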

Also, FWIW, basically everybody outside the Linux echo chamber has known that
eBPF JIT and especially unprivileged eBPF JIT was a disaster waiting to
happen. This is only the latest exploit it's been at the center of, and the
2nd in as many months. The amount of attention and effort that has gone into
securing eBPF is remarkable, but at the end of the day even if you could
muster all the best programmers for as much time as you wanted it's still an
exceptionally risky endeavor. Everything we know about the evolution of
exploits screams that unprivileged eBPF JIT is an unrelenting nightmare. But
it's convenient, flexible, and performant, and at the end of the day that's
all people really care about, including most Linux kernel engineers. The
nature of the Linux ecosystem is that even if Linus vetoed unprivileged eBPF
JIT (optional or not), vendors would have likely shipped it anyhow. It's an
indictment of the software industry. Blaming hardware vendors (except for the
Intel issue) is just an excuse that perpetuates the abysmal state of software
security.

~~~
cthalupa
>The particular kernel exploits require injection of a JIT-compiled eBPF
program, as they said they were unable to locate any suitable gadgets in
existing compiled kernel code

Did they say that?

I don't see anything saying they were unable to, just that they didn't bother
to because it would take effort.

>But piecing gadgets together and figuring out which ones work in a
speculation context seems annoying. So instead, we decided to use the eBPF
interpreter, which is built into the host kernel - while there is no
legitimate way to invoke it from inside a VM, the presence of the code in the
host kernel's text section is sufficient to make it usable for the attack,
just like with ordinary ROP gadgets.

~~~
caf
The quote you have is for exploiting Variant 2, the post above yours was
talking about Variant 1. For Variant 1, the authors say:

 _To be able to actually use this behavior for an attack, an attacker needs to
be able to cause the execution of such a vulnerable code pattern in the
targeted context with an out-of-bounds index. For this, the vulnerable code
pattern must either be present in existing code, or there must be an
interpreter or JIT engine that can be used to generate the vulnerable code
pattern. So far, we have not actually identified any existing, exploitable
instances of the vulnerable code pattern; the PoC for leaking kernel memory
using variant 1 uses the eBPF interpreter or the eBPF JIT engine, which are
built into the kernel and accessible to normal users._

For Variant 1, the "vulnerable code pattern" they're looking for has to be of
a very specific type; it's not a run-of-the-mill gadget. It has to load from
an array with a user-controlled offset, then mask out a small number of bits
from the result and use that as an offset to load from another array, where we
can then time our accesses to that second array.
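That pattern can be sketched in a few lines of C. This is a hypothetical illustration of the shape described above, not code from the paper; the array names and the 512-byte stride are invented for the example (a stride of at least one cache line per possible byte value is what makes the second access timeable):

```c
#include <stddef.h>
#include <stdint.h>

size_t  array1_size = 16;
uint8_t array1[16];
uint8_t array2[256 * 512];   /* one 512B-strided slot per possible byte value */

uint8_t victim_function(size_t index) {
    if (index < array1_size) {             /* bounds check the CPU may mispredict */
        /* Under misprediction this body executes speculatively even when
         * index is out of bounds, so array1[index] can be a secret byte,
         * and the dependent load leaves that byte's slot of array2 in the
         * cache, where the attacker can later find it by timing. */
        return array2[array1[index] * 512];
    }
    return 0;
}
```

Architecturally the out-of-bounds path returns 0 and nothing is leaked; the whole attack lives in the microarchitectural side effect of the mis-speculated load.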

However, they also go on to say:

 _A minor variant of this could be to instead use an out-of-bounds read to a
function pointer to gain control of execution in the mis-speculated path. We
did not investigate this variant further._

Which seems much less reassuring.

~~~
cthalupa
Gotcha! Thanks for setting me straight here.

------
erikb
Someone should honestly do a press release like "Intel Bug not actually Intel
only", or give this thing a neutral name to search for.

~~~
AndyNemmity
Variant 2 and Variant 3 are Intel only. They are the most concerning as they
break VM isolation.

------
dgomesbr
Great, the embargo was in place and Google went ahead and disclosed anyway,
essentially saying "here we are, disclosing this" (because they've already
patched).

~~~
londons_explore
3 or 4 people had bits of demo code up on twitter earlier today.

I implemented it myself simply based on the clues in the press release from
AMD explaining why they weren't vulnerable. I don't even have a computer
security background.

~~~
contrarian_
So the vulnerability likely isn't something nobody thought of, it's just that
nobody seriously expected the CPU vendors to make the mistake of speculating
across multiple loads and actually leaving observable modifications in the
caches.
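Those "observable modifications in the caches" are read out with a timing probe, typically flush+reload. Here is a minimal x86-only sketch of that measurement primitive, using compiler intrinsics (`_mm_clflush`, `__rdtscp`); the names and the 64-byte probe buffer are assumptions for illustration, not anyone's actual exploit code:

```c
#include <stdint.h>
#include <x86intrin.h>   /* _mm_clflush, _mm_mfence, __rdtscp (x86 only) */

static uint8_t probe[64] __attribute__((aligned(64)));  /* one cache line */

/* Time a single load, bracketed by timestamp-counter reads. */
static uint64_t timed_load(volatile uint8_t *p) {
    unsigned int aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*p;                        /* the load being measured */
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}

uint64_t cached_time(void) {
    (void)timed_load(&probe[0]);     /* first access warms the line */
    return timed_load(&probe[0]);    /* second access should hit the cache */
}

uint64_t flushed_time(void) {
    _mm_clflush(&probe[0]);          /* evict the line back to memory */
    _mm_mfence();                    /* make sure the flush completed */
    return timed_load(&probe[0]);    /* this access should miss */
}
```

On typical hardware `flushed_time()` is noticeably larger than `cached_time()`; repeating the measurement per candidate cache line and thresholding the difference is how an attacker recovers which line, and hence which secret byte, a speculative access touched.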

Note that even speculating across multiple loads could lead to observable
side-effects by measuring memory bandwidth to differentiate between loads of
accessible and silent page fault addresses. [1]

An interesting question is whether the CPU would also speculate on loads from
mapped PCI device regions, as that could be also detectable in many different
ways.

[1]
[https://eprint.iacr.org/2016/613.pdf](https://eprint.iacr.org/2016/613.pdf)

> Both hardware thread systems (SMT and TMT) expose contention within the
> execution core. In SMT, the threads effectively compete in real time for
> access to functional units, the L1 cache, and speculation resources (such as
> the BTB). This is similar to the real-time sharing that occurs between
> separate cores, but includes all levels of the architecture. [...] SMT has
> been exploited in known attacks (Sections 4.2.1 and 4.3.1)

------
kbwt
The papers take a while to get to the point. I nearly fell asleep re-reading
the same statements before they finally arrived at it: speculative execution
of buffer overflows.

Could have been said more concisely. Sadly, this seems to be the norm with
academic texts.

~~~
pacavaca
It gives all the required context, much needed for an "average" engineer to
understand it. Without that, most of the people, except the microchip
engineers, would have to read about the related topics first anyways. I
personally was surprised at how understandably everything was explained.

