AMD processors are not subject to the types of attacks that the kernel
page table isolation feature protects against. The AMD microarchitecture
does not allow memory references, including speculative references, that
access higher privileged data when running in a lesser privileged mode
when that access would result in a page fault.
Disable page table isolation by default on AMD processors by not setting
the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI is set.
My wild guess is that you can read a good portion (if not all) of the memory (or a significant subset or trace of it) of the whole computer from unprivileged userspace programs.
but (as he notes elsewhere) there are plenty of other channels in an Intel processor for information to leak...
Note that under Linux x64 IIRC the whole physical memory is mapped in kernel pages... playing with some adjusting variables, if this theory is correct, I don't see why we could not read all of it. Might be the same under Windows.
I've not checked in depth yet but it could match with all the technical facts we have: very important bug for which the semi-rushed workaround with high perf impact will be backported; affect general purpose OSes but IIRC does not affect some hypervisors (I guess they already do not map at all the pages of other systems while one is running), does not affect AMD (or maybe at least not this way and KPTI can not fix it for them) because of their microarch, involves data leak.
That has been known since Pentium 3 times; I wonder why nobody thought of this as a wonderful exploit target before.
>I'm 90% convinced there is no way Intel managed to close all the side-channels on such a complex architecture
This is why one should not use Verilog, and should run formal validation of anything and everything related to HDL code.
Tooling support for VHDL is lacking though; any big company with in-house formal verification must have their own tooling...
Ideally, a much closer to complete verification should be done in addition to simulation, like a mathematical proof that register content will never be like set A if inputs are set B
The top comment said:
> This basically adds another set of tools to the architectural-level attack toolbox. From reading this I expect we'll see some interesting developments in the future.
And of course there's an obligatory comment beginning with "this cannot possibly work... "
It's a good point about TLB and VIPT, but I don't think this closes the whole class of potential issues; if too much speculative execution is performed (even just a little bit, even just when you use some obscure instructions) on anything data-dependent that has been speculatively loaded before privilege checking, I guess it's possible to recover the data.
Edit: switched to link without AD cancer
Something like this: https://cyber.wtf/2017/07/28/negative-result-reading-kernel-...
- edit - I checked Ryzen ISA, they support AVX2. So probably AVX512 is the issue.
 I do hope they rename that flag to something like X86_BUG_NEEDS_PTI as the current name is bloody abysmal: way too broad and ambiguous, as if this is the only insecurity that ever has or ever will impact x86.
Edit: AWS Spurious Budget Email Barrage: https://www.reddit.com/r/aws/comments/7ndvli/anybody_get_spu...
I had a lingering account that cost me 3 cents a month. All of a sudden I get an email freaking out about all these alarm thresholds I'm blowing past. I just outright closed my entire account because trying to delete the offending S3 entities would fail without an error message.
Some discussion can be found on reddit https://www.reddit.com/r/sysadmin/comments/7ndlk7/aws_anyone...
nope, still empty table, deleted it and went to bed. glad I wasn't the only one who got that.
I wasn't sure what was going on, so I nuked everything, cancelled my account, and will hope for the best.
One tiny Ubuntu instance I barely used and then put on standby. Then terminated.
Radioactive decays and cosmic particles flipping bits give an upper bound for reliability. You are not going to see low-background packages and rad-hard chips in your iPhone.
If it works 99.9999999999% of the time, then it has a failure rate of 0.0000000001%, or 1E-12. Considering that a modern CPU executes approximately 1E9 operations per second, and that regular HDDs have a worst-case BER of 1 in 1E14 bits, 1E-12 is actually rather horrible and the actual error rate of computer hardware is much better than that.
Imagine if a CPU calculated 1+1=3 every 1E12 instructions. At current clock rates, that's a fraction of an hour. Computers simply would not work if CPUs had such an error rate.
I picked the 1E-12 number arbitrarily, but it's quite illustrative of the reliability computers are expected to have, despite their flaws.
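To make the arithmetic in this comment concrete, here is a small sketch using only the figures quoted above (a 1E-12 per-operation failure rate and roughly 1E9 operations per second). The function name is made up for illustration, and the model assumes errors are independent and occur at a fixed rate per operation, which is a simplification.

```python
from fractions import Fraction

def mean_seconds_between_errors(error_rate_per_op, ops_per_second):
    """Mean time between errors, assuming a fixed, independent per-operation error rate."""
    return 1 / (error_rate_per_op * ops_per_second)

# Figures from the comment: 99.9999999999% success means a 1E-12 failure rate.
FAILURE_RATE = Fraction(1, 10**12)
OPS_PER_SECOND = 10**9  # "approximately 1E9 operations per second"

seconds = mean_seconds_between_errors(FAILURE_RATE, OPS_PER_SECOND)
minutes = seconds / 60

print(f"one error every {float(seconds):.0f} s (~{float(minutes):.1f} min)")
```

Under those assumptions the result is one wrong operation roughly every 1000 seconds, which is the "fraction of an hour" mentioned above.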
Amazon is hosting life-safety applications in EC2. Commodity x86 hardware is grossly negligent for that environment.
Of course, you’ll also find a tendency there towards specified hardware. They bend or break it to use COTS x86 machines, but—as I think I heard from a comment here last week—nearly nobody ever specified wanting AMT in the initial design, so it’s pretty weird that we’re all buying and deploying it.
I wonder if home systems are equally vulnerable, or if there is something about data center system design or facilities that make them more susceptible?
I ask because I had a couple of home desktop Linux boxes once, without ECC RAM, that were running as lightly loaded servers. I ran a background process on both that just allocated a big memory buffer, wrote a pattern into it, and then cycled through it verifying that the pattern was still there.
Based on the error rates I'd seen published, I expected to see a few flipped bits over the year (if I recall correctly) that I ran these, but I didn't catch a single one.
Later, I bought a 2008 Mac Pro for home, and 2009 Mac Pro for work (I didn't like the PC the office supplied), and used both of those to mid 2017. They had ECC memory, and I never saw any report when I checked memory status that they had ever actually had to correct anything.
So...what's the deal here? What do I need to do to see a bit flip from radioactive decay or cosmic rays on my own computer?
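For reference, the kind of background checker described in this comment (allocate a buffer, write a pattern, keep re-verifying it) could be sketched roughly as below. This is not the commenter's actual program: the pattern byte, function names, and sizes are all assumptions chosen for illustration, and a Python process gives no guarantee that each pass really re-reads DRAM rather than a CPU cache.

```python
import time

PATTERN = 0xA5  # arbitrary alternating-bit pattern (assumption, not from the comment)

def make_buffer(size_bytes):
    """Allocate a buffer and fill every byte with PATTERN."""
    return bytearray([PATTERN]) * size_bytes

def find_flipped_bytes(buf):
    """Return the offsets of bytes that no longer hold PATTERN."""
    if buf.count(PATTERN) == len(buf):  # fast path: nothing changed
        return []
    return [i for i, b in enumerate(buf) if b != PATTERN]

def watch(size_bytes, passes, pause_seconds):
    """Re-verify the buffer `passes` times; return how many corrupted bytes were seen."""
    buf = make_buffer(size_bytes)
    total = 0
    for _ in range(passes):
        for offset in find_flipped_bytes(buf):
            print(f"offset {offset}: expected {PATTERN:#04x}, found {buf[offset]:#04x}")
            buf[offset] = PATTERN  # restore so the same flip is not counted twice
            total += 1
        time.sleep(pause_seconds)
    return total

if __name__ == "__main__":
    # Tiny demonstration run; a real experiment would use far more memory and time.
    print("corrupted bytes seen:", watch(size_bytes=1 << 20, passes=3, pause_seconds=0.0))
```

A long-running version would simply use a much larger buffer and a longer pause between passes.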
Personal machines are typically limited by what your senses can handle. There are few of them for starters. They idle a lot. If many pieces failed inexplicably it's not likely to be something you are personally paying attention to with your senses.
(I have personally observed RAM and disk failures on personal machines anyway. And I have seen stuff in my dmesg indicating hardware faults on my personal desktops, but rarely in a way that I would notice in actual use without looking at dmesg.)
I was told once that today's concrete has a much higher background radiation than brick and mortar from before the 50s. There is also more steel in data centres.
However, I'm not at all sure if background radiation of building materials is even in the right order of magnitude to matter here. Probably not.
The reason the modern stuff is more radioactive is the massive number of atmospheric nuclear weapons tests conducted starting in 1945. I imagine the concrete has the same issue.
This is what I read online. A typical hardware component is enclosed in an aluminum case, so really it's gamma radiation that is of concern, right?
I don't know if the sort of radiation they give out is of much risk to computers however.
This isn't a probability question. I don't understand why it's being discussed as if it were.
The fact that DRAM older than a few years is effectively immune to RH suggests it is possible to manufacture such. Yes, it will cost more, but I think many would be willing to pay for it like they used to, for none other than the assurance of having more reliable memory.
ECC helps, and can be done at full rate, but isn't a complete solution for all possible problems. And anything you do in hardware at full RAM speed is expensive.
Also, DDR3/4 DRAM is glacially slow in latency terms, it's far from clear that there would be appreciable slowdown. There are already big latency compromises in the standardized JEDEC protocols that are not inherent in DRAM - it would be very two-faced for DRAM vendors to say they'll only trade off latency over backward compatibility or tiny cost savings, but not over correctness.
There are many types of users and some don't care if it fails once in a blue moon. Hell, 99% might be good enough if the price was right.
If this last one can defeat ASLR, imagine leaking bit by bit from a co-hosted VM, to extract secrets of other cloud customers… This is the reason anyone serious about cloud security will reserve instances so that they won't share physical hardware with other customers (think EC2 dedicated instances).
Also, if you're interested in this type of thing: Armv8.4-A adds a flag … indicating that you want the execution time of instructions to be independent of the data.
Now the primary source seems to have been edited (why?)… But webarchive still has it:
Data Independent Timing
CPU implementations of the Arm Architecture do not have to make guarantees about the length of time instructions take to execute. In particular, the same instructions can take different lengths of time, dependent upon the values that need to be operated on. For example, performing the arithmetic operation ‘1 x 1’ may be quicker than ‘2546483 x 245303’, even though they are both the same instruction (multiply).
This sensitivity to the data being processed can cause issues when developing cryptographic algorithms. Here, you want the routine to execute in the same amount of time no matter what you are processing – so that you don’t inadvertently leak information to an attacker. To help with this, Armv8.4-A adds a flag to the processor state, indicating that you want the execution time of instructions to be independent of the data operated on. This flag does not apply to every instruction (for example loads and stores may still take different amounts of time to execute, depending on the memory being accessed), but it will make development of secure cryptographic routines simpler.
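As a software-level analogy for the concern in the quoted text (a routine whose running time depends on the data it processes), here is a minimal sketch contrasting an early-exit comparison with one that always touches every byte. The two function names are made up for illustration. Python itself makes no timing guarantees, so this only shows the structure of the idea; `hmac.compare_digest` is the standard-library function intended for this purpose.

```python
import hmac

def leaky_equal(a, b):
    """Early-exit comparison: running time depends on where the first mismatch is."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns sooner the earlier the inputs differ
    return True

def accumulate_equal(a, b):
    """Compare every byte regardless of mismatches, folding differences into one value."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # no data-dependent branch inside the loop
    return diff == 0

if __name__ == "__main__":
    secret, guess = b"s3cret-token", b"s3cret-tokeX"
    print(leaky_equal(secret, guess), accumulate_equal(secret, guess))
    # In real code, prefer the standard-library helper:
    print(hmac.compare_digest(secret, guess))
```

Even the second version relies on the underlying instructions taking data-independent time, which is the gap a processor-level flag like the one described above is meant to address.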
The scope seems limited to ALU, so not really related to the TLB thing we have here. Also, it's still very far away, I'm not sure its predecessor Armv8.3-A is even shipping to customers yet.
TLB flushes for syscalls would be absolutely brutal for many performance-critical applications.
(Unless I have entirely the wrong end of the stick about this?)
As far as I know PCID hidden entries in the TLB result in the same page fault as non-existent entries, so the page fault handling becomes constant timed.
I dunno, it sounds like it might be easiest to go ahead and backport PCID along with these patches. It touches a lot of the same code, so trying to split it out might just create more problems.
FWIW, the compile flag that Gentoo enabled activates a seriously busted GCC feature, and I'm a bit surprised that Gentoo gets away with it in user code.
For more on the considerations that underpin stack probing, see http://jdebp.eu./FGA/function-perilogues.html#StackProbes for starters.
Regardless, it seems like RMW (or even just a store) should be fine as long as the stack pointer is adjusted before the probe.
Edit: never mind. It's the -fstack-check flag mentioned in the other comments.
So yes, It's a terminally broken compiler from hell. I assume gentoo
has applied some completely broken security patch to their compiler,
turning said compiler into complete garbage.
The patch to "fix" it is explicitly disabling -fstack-check for the kernel build. I believe that will go out in 4.14.11 (it is not in 4.14.10).
"Tests show that simple ECC solutions, providing single-error correction and double-error detection (SECDED) capabilities, are not able to correct or detect all observed disturbance errors because some of them include more than two flipped bits per memory word"
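To illustrate why SECDED breaks down beyond two flipped bits, here is a toy sketch using an extended Hamming (8,4) code: 4 data bits, 3 Hamming parity bits, and 1 overall parity bit. Real ECC memory uses wider codes (typically 64 data bits plus 8 check bits), so this is a scaled-down assumption for illustration only, and all names are made up.

```python
from functools import reduce
from operator import xor

DATA_POSITIONS = (3, 5, 6, 7)   # Hamming positions that carry data
PARITY_POSITIONS = (1, 2, 4)    # Hamming parity positions; index 0 is overall parity

def encode(data_bits):
    """Encode 4 data bits into an 8-bit SECDED codeword (a list of 0/1)."""
    code = [0] * 8
    for pos, bit in zip(DATA_POSITIONS, data_bits):
        code[pos] = bit
    for p in PARITY_POSITIONS:
        code[p] = reduce(xor, (code[i] for i in range(1, 8) if i & p and i != p))
    code[0] = reduce(xor, code[1:])  # overall parity over positions 1..7
    return code

def decode(code):
    """Return (status, data_bits). The decoder assumes at most two bits were flipped."""
    code = list(code)
    syndrome = reduce(xor, (i for i in range(1, 8) if code[i]), 0)
    overall = reduce(xor, code)
    if syndrome == 0 and overall == 0:
        status = "clean"
    elif overall == 1:
        code[syndrome] ^= 1  # syndrome 0 means the overall parity bit itself flipped
        status = "corrected"
    else:
        status = "double error detected"
    return status, [code[pos] for pos in DATA_POSITIONS]

def flip(code, *positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(code)
    for pos in positions:
        out[pos] ^= 1
    return out

if __name__ == "__main__":
    word = encode([1, 0, 1, 1])
    print(decode(flip(word, 6)))        # one flip
    print(decode(flip(word, 3, 6)))     # two flips
    print(decode(flip(word, 1, 2, 4)))  # three flips
```

With one flip the decoder repairs the word; with two it reports an uncorrectable error; with three it reports "corrected" while returning wrong data, which is the failure mode the quoted sentence points at.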
It's a hardware level thing. Essentially, when you start rapidly flipping a single bit, that starts to 'leak' some current to the adjacent physical bits. This then allows you to flip a single bit. Especially if you can control bits on both sides of your target.
It's like you are using the bits you can control to 'simulate' an actual stray cosmic ray.
I'm due to refresh my gaming PC, and I was going to go with Intel again as they've not been a problem. However, if Intel chips are going to incur the same 5% - 50% performance hit on Windows, I might end up investing in AMD hardware instead.
How big the impact will be, I don't know - but I wouldn't be surprised if it was a couple percent (effectively ruining the single-threaded performance boost Intel has in gaming over AMD before accounting for overclocking).
(Or maybe they already do as much as possible in userspace and then batch kernel calls?)
Is it linked and if so does that mean AWS have already patched?
A decade ago everyone knew that shared hosting was for hobby sites and stuff that didn't really matter.
Maybe some more people will learn that lesson.
If you do the math like “we have 1000 cores and 2048 GB of RAM and 10 TB of RAID’ed SSD” and then plug that in to the GCP calculator... it’s going to be at minimum 1.5-2x your bare metal cost.
That’s not even including bandwidth which is pretty much free at bare metal hosts unless you’re doing a lot of egress.
The calculus changes when you realize that you’re over-provisioned on the bare metal side for a variety of reasons: high availability, “what if”, future growth that’s more medium term than short, etc.
Then you scale back the numbers you’re plugging into the calculator and things are still expensive but now within reason.
Couple that with things like global anycast region aware load balancer, firewalls (an in-line 10GigE highly available firewall costs a lot of money), ability to spin up hundreds of cores in 5 seconds and the value proposition becomes clearer.
It still depends on your work load, but there’s a lot more to consider than just straight up monthly cost.
I use GCE for DNS, Storage, CDN (for fronting storage backed files), dynamic workloads that can run on preemptible instances, and scalable instances to serve published static content, but I use dedicated servers for databases, elasticsearch, redis, and application servers fronting those things.
We keep looking at GCP waiting for the pricing to make sense and still trying to figure out how people run low latency Postgres on there. :)
(I work on GCE)
> It is just as easy to automate [..]
It's really not. As someone who's done provisioning automation at 2 companies, this is hard. Hardware is difficult, every new generation of hardware introduces new challenges in the provisioning and the more hardware configurations you need to support (and different vendors, all kinds of PCI plug-in cards etc), the more likely things go wrong. It takes a full team to build, maintain and debug this. It takes a couple of hours to build a GUI that calls the GCP API's to provision an instance for you, assuming you even need to do this instead of just using the Cloud Console directly. Sure, you pay for it, but now you have 4-10 engineers freed up to do something that provides actual value to your business.
> [..] if you plan well [..]
If. But that's really hard. Capacity planning and forecasting is complicated and the smaller a player you are, the harder it'll be for you to get a decent vendor contract with significant discounts and to be able to adjust and get to hardware quickly outside of your regularly forecasted buy-cycle. On the other hand, it's not your issue in the cloud. You request the resources and as long as you have the quotas, you'll get it (with rare exception).
> [..] and more secure [..]
I severely doubt that. In most cases, though you can host your stuff in certified DC's you'll still be in a colocation facility. Most cloud providers have their own buildings or rent complete buildings at a time. No one else but them has access to those grounds. Aside from that, take a look at what Google for example does on GCP to ensure that their code and only their code can boot systems, how they control, sign and verify every step of the boot process. I've yet to see anyone do that and I doubt most companies that do bare metal have even thought of this or have the knowledge to even execute on this.
Aside from all of this, cloud isn't competing with just providing you compute. VM's (GCE, EC2) is just the onboarding ramp. The value is in all the other managed services they offer that you no longer need to build, maintain, scale and debug (global storage and caching primitives, really clever shit like Spanner or Amazon RDS/Aurora, massively scalable pub/sub and load balancing tiers, autoscaling, the ability to spawn your whole infrastructure or your service on a new continent to serve local customers in a matter of minutes etc). If all you're using cloud providers for is as a compute provisioning layer, then you're doing it wrong.
Yes, but you will hit all the same problems with different hardware generations, different configs with different limitations, etc. If anything GCE and AWS have more complex offerings than most bare metal hosts. And you have all the same maintenance issues as you run stuff over time and hardware and software updates get released.
> Capacity planning and forecasting is complicated
AWS and GCE certainly don't make it easier. And if you can't capacity plan accurately on cloud and take advantage of spot pricing and auto-scaling then you will be paying 10X price, which describes most smaller players.
> I severely doubt that [bare metal is more secure]
I am saying that shared hosting is fundamentally insecure. No matter what else you do, if you let untrusted people run code on the same server that is a huge risk that assumes many, many layers of hardware and software are bug free.
> cloud isn't competing with just providing you compute
I agree on this. But not all of those services work as well as advertised either.
I haven't hit any issues with hardware generations. At worst what I've had to do is blacklist a GCP zone b/c it misses an instance type I need. In most cases I don't need to care and images that can boot are provided and maintained by the respective cloud provider, so you can build on top of that. I don't need to source or test components together, or spend hours figuring out why this piece of hardware isn't working well with that one. Or why this storage is slower than the other disk with the same specs from a different vendor. I don't need to lift a finger or deal with any hardware diversity issues, I just do an HTTP POST and less than a minute later I have a GCE instance available to me. Though in most cases I don't even do that, I just instruct GKE to schedule containers for me. I also don't need to worry about any hardware renew cycles, deal with failing hardware, racking and expansion of my DCs and what not.
The reason GCP and AWS have more complex offerings is b/c they can afford to provide it. Due to their scale they can shoulder the complexity of letting you choose from a vast array of different hardware configurations, which usually also results in better utilisation for them. Most people can't, which is why bare metal host options are much more constrained. And as a consequence why a lot of resources are wasted b/c it's especially hard to find someone supporting small instance types for just bare metal.
> AWS and GCE certainly don't make it easier.
To me they do. I don't need to deal with the hardware. I don't need to plan buying cycles, account for production cycles and chip releases by manufacturers and factor in how that's going to affect supply, or how an earthquake in Taiwan will make it prohibitively expensive for me to get the disk type I normally want to. I still need to do capacity planning, but I can tolerate much bigger fluctuations in those, and people's usage patterns, in the cloud than I can on bare metal. Unless I want to have hundreds of machines sitting idle, just in case I might need them.
But the best thing is, if I get it wrong in the cloud, I can correct, in a matter of minutes if I want to. Too big instance types? OK, I'll spin up smaller ones, redeploy and tah-dah my bill goes down. Sure you could do that on bare metal, assuming you can even get to a right/small enough instance type, but it's far from this easy in most cases.
> And if you can't capacity plan accurately on cloud and take advantage of spot pricing and auto-scaling then you will be paying 10X price, which describes most smaller players.
But then we're back down to trying to use the cloud just for compute, which is not what you should be doing and not where the value of a cloud offering comes from.
> I am saying that shared hosting is fundamentally insecure.
Though that's definitely true, security isn't black or white; it's not secure vs. insecure. Something that you might consider an unacceptable risk (theoretical or practical) might be entirely fine for someone else. There are definitely cases in which this would be of major concern, but for most people it really isn't. Aside from that, as both hardware designs are changing and software mitigations are deployed we're able to achieve stronger and stronger isolation. Eventually, for all intents and purposes, this will be solved.
> if you let untrusted people run code on the same server that is a huge risk that assumes many, many layers of hardware and software are bug free.
This still holds true even if you only let your people run code on the same instance (unless you're also only running a single process/app per server?). It becomes a bit more problematic but there's also a lot more research in this area going on than a few years back. We're discovering issues, sure, but we're also getting better and better at mitigating them.
> But not all of those services work as well as advertised either.
True. Every cloud provider could do better. But then, I'd like to see anyone attempt and succeed at what AWS, Google and Microsoft (or smaller shops like Digital Ocean, Rackspace) etc are doing, at their scale and with a staggeringly diverse portfolio of services and high SLAs. All taken care of for you, so you can actually assemble their primitives into useful things for your business, instead of needing to spend months and multiple teams to build the building blocks in the first place (and then also the cost of continued development and maintenance of these capabilities, and of course adding more and more of these capabilities yourself as your organisation's needs evolve).
Cloud vendors don't guarantee colocation of your resources unless you specifically arrange that. And for that matter, often you specifically don't want co-location, because you want redundancy and migration.
Does the author refer to ReactOS, or has Microsoft really open-sourced parts of the NT kernel?
Hell there's still 20 year old code running in the Linux kernel
Please don't buy into the idea that embargoes and coordinated disclosure are sacred. They tend to just reinforce existing power structures, sometimes in an unethical (or at least unfair) way.
I'd expect the incentives to be a bit more complicated than that, and I'm also a bit skeptical that either is all that good of a solution. I'd also like to see how exactly "proactive" and "reactive" are being used here, is it about push vs pull for vulnerability notifications, or about hiring their own security researchers, or... ?
They're an attempt to minimize harm, by getting things patched while minimizing information leaked to blackhats.
Just because giving preference to groups with a better reputation and more market share isn't "fair", doesn't mean it's automatically wrong. Now, if you can show that it actually doesn't help . . .
The faster the information gets out the better.
People are willing to make these assumptions because they correspond more closely to reality.
Prior to disclosure there will always be the possibility that some users are being exploited without their knowledge.
Therefore disclosure always improves the situation by giving those who could have been exploited without their knowledge the choice to take mitigation steps.
All of the "responsible disclosure" nonsense is just PR by companies who want to avoid the most obvious mitigation, which is for customers to stop using their products.
And what are those mitigation steps?
At every step along the way, there's been a choice of "Well, we could own the hardware and incur overhead costs, or we could trust someone else and pay our share of lower overhead. It'll mean giving up some control, but it'll save us a few bucks."
Or maybe it goes like "Well, we could develop with practices that result in more robust code, but we'd be slower to market."
There's definitely a sidetrack of "If we crank up the clock too much, all sorts of things get wibbly and we can no longer guarantee that the outputs match the inputs, but we don't actually have ways of doing it correctly at these speeds. The press will slam us if we don't keep pace with Moore's law, how could we launch a product with only marginal speed gains?"
And pretty often I think it sounds like "The ops staff says they're overworked and we need to add people or we risk an incident, but Salesman Bob says we can actually fire most of them if we put our stuff in BobCloud."
At every step along the way, someone made a conscious choice to do the insecure thing. The folks with their eye on security were dismissed as naysayers, and profit was paramount. And because these practices became so common, they became enshrined in market norms and expected overhead costs.
> A provider can't dedicate hardware for every single customer.
A provider absolutely could dedicate hardware for every single customer, that's literally how every provider operated before virtualization. It just wasn't as profitable as virtualization.
The story of the Three Little Pigs was supposed to teach about the importance of robust infrastructure. Nobody should be surprised when the wolf shows up. And every pig had the choice to build with sticks or bricks, it would just take more work or cost more.
And I see your message as saying "Are you serious? Build with something other than straw?! But we already own so much straw! All the pigs have straw houses, won't someone think of the pigs?"
Meanwhile the bankrupt brick vendor's assets have been auctioned off, and the wolves are salivating.
Not everyone is all in on cloud infrastructure. What about people who are right now deciding whether or not to move critical data to the cloud? Should their security be compromised by hiding the truth about a known exploit in order to "protect" people less concerned about security who already put their data at risk?
If this vulnerability applies regardless of where it is, what is the mitigation then? Move your machine to where? Your suggestion is not practical and you know it. The right mitigation is one that actually will fix the vulnerability. No one shut down machines and migrated everything to another distro because Debian had a bug in generating private keys a decade ago, even if the bug was a zero-day.
edit: I think the mirai botnet(s) had some sort of power struggle? avoiding propagating knowledge about it prevents it from being easily exploited by more actors.
Embargo is simply a way to make sure the huge, rich cloud providers don't have their reputation tarnished at the expense of everyone else. "Stay with bigco, we fix things before everyone finds about it"
I understand (and agree) that the system admins/owners should also be able to mitigate through knowledge, but it's a dilemma that I think is better resolved by the other solution
(In this case it's apparently a complex issue, but history has shown that there are surprisingly easy-to-exploit bugs/issues: see Heartbleed and Shellshock, which was apparently very quickly exploited.)
This guy is not releasing an exploit implementation, he is just pointing to the existence of a potential exploit that has a patch in development. He can't even figure out exactly what it is.
The only people who would be able to code this exploit would be the ones who already figured it out before this guy.
If no one talks about a security problem that makes it disappear
Obscurity actually is a layer of security. The mistake is when people are dependent upon it.
For your analogy to apply here, it would be the manufacturer of the door lock having a master key stolen and then not telling anyone about it until they have a new lock for you to buy from them. In the case of a lock, I would want to know that the lock is useless even if there was no solution, so I can mitigate the risk, not simply continue locking it believing it to be secure.
The difference is a lock-pick can't exploit 20,000 doors in 20 minutes.
Here is a good hint to when something is not being embargoed: there is a paper and a public demonstration.
I consider this newsworthy because many very good security engineers in my feed agree it is.
This fixes something big that Intel could not fix by microcode update...
What ARM64's system does provide is a much simpler way to do a PTI-style pagetable split by twiddling the high address register at entry and exit.
This will be merged for 4.16, when there is no 4.15 release yet. No idea what your cloud computing companies run but it's not 4.15-dirty, and backporting this monster is a great recipe for a nightly emergency when it goes OOPS.
edit: it isn't even merged yet.
I agree, they're very scary. Also, watch out: a kernel with PTI on will not function in KVM-emulated secure boot mode until a KVM fix gets backported as well.
I bet you within the foundations of AVX512 lies a nasty one that can't be patched with microcode update.
This time it could be an AVX512 instruction (Intel only) that leaks kernel addresses in one way or another.
I was talking from an ISA perspective.
E.g., clflush may be implemented differently by Intel and AMD, but it has the same effect on system RAM, hence a shared exploit.
I just made a bet that I could guess something out of ISA only.
Going macro to describe what may be the issue. I'm just doing guesswork here.
I was not implying Intel has the same implementation as AMD, nor was I making a case for "this is like rowhammer".