I have been reading through all of AMD's documents, and I cannot find what mode of AES SEV (or SME) uses. I find it extremely odd that this is not called out in any of AMD's documents, and frankly a bit worrisome.
For the record, "A Comparison Study of Intel SGX and AMD Memory Encryption" claims that a modified version of AES-ECB is what SEV uses, BUT their reference links to AMD's whitepaper, which does NOT say anything about the mode, so I do not consider it to be a trustworthy resource.
Two subtle but important features of these modes are:
1. the tweak is XORed both before and after the permutation
2. the tweak is derived from a secret and the address in a secure way
(I don't know if AMD chose such a secure mode, or used some insecure homebrew tweaking scheme)
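For reference, those two properties can be sketched in a few lines. This is a toy illustration only: the keyed Feistel network below stands in for the real AES permutation, and the tweak derivation is a made-up example for the XEX-style construction, not AMD's actual (undocumented) scheme.

```python
import hashlib

BLOCK = 16  # 128-bit blocks, as in AES

def toy_permutation(key: bytes, block: bytes) -> bytes:
    # Stand-in for the AES block cipher: any keyed pseudorandom
    # permutation works for the construction. A real implementation
    # would of course use AES itself.
    # 4-round Feistel on two 8-byte halves, round function = SHA-256.
    l, r = block[:8], block[8:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + r).digest()[:8]
        l, r = r, bytes(a ^ b for a, b in zip(l, f))
    return l + r

def derive_tweak(tweak_key: bytes, address: int) -> bytes:
    # Property 2: the tweak depends on a *secret* key and the physical
    # address, so an attacker can't predict or relate tweaks.
    return hashlib.sha256(tweak_key + address.to_bytes(8, "little")).digest()[:BLOCK]

def xex_encrypt(key: bytes, tweak_key: bytes, address: int, plaintext: bytes) -> bytes:
    t = derive_tweak(tweak_key, address)
    # Property 1: XOR the tweak both before and after the permutation.
    x = bytes(a ^ b for a, b in zip(plaintext, t))
    y = toy_permutation(key, x)
    return bytes(a ^ b for a, b in zip(y, t))

# The same plaintext stored at two different addresses yields
# different ciphertexts, unlike plain ECB.
k, tk = b"cipher-key......", b"tweak-key......."
p = b"16-byte block..."
c0 = xex_encrypt(k, tk, 0x1000, p)
c1 = xex_encrypt(k, tk, 0x2000, p)
assert c0 != c1
```

Plain ECB fails exactly this test: identical blocks encrypt identically everywhere, which lets an attacker spot and move known ciphertexts around memory.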
Is that documented in an actual AMD source, or even talk?
Am I missing something obvious?
Will it prevent Google from being able to have root access to the VM?
From my understanding it does not seem to protect from Google. If they are still able to get root access to the VM, it does not matter whether the memory is encrypted or not.
The only thing that I see is the case of a Spectre/Meltdown-style vulnerability, where the isolation of the RAM fails...
I mean, they have physical access to the hypervisor host machines, where they could do anything they like to them, e.g. tap the JTAG pins of the CPU.
Insofar as you assume that the attack here is “the NSA compels Google to gather evidence against you”, the lack of just being able to log into the VM doesn’t really change much.
As for physical attacks, Google is ultra paranoid about physical access to DCs, and I think we can quickly agree that rogue employees and outsiders would have little chance of successful attack given the outrageous (and secret) methods that Google employs. Remember, this is one of the most-attacked organizations in the world, they've had decades (plural) to enact defenses and test them, and a successful attack would cost them over $10 billion - there's a virtually unlimited budget for physical defense. Circa 2020, I'd put Google's physical intrusion defenses up against most military installations.
Namely, two at most.
For most people (and especially businesses) this is a totally unrealistic security aim. If what you're worried about is the government using its legal ability to compel people and businesses to provide evidence, then there's ~no product or service from that country that you could realistically use.
But the promise of things like homomorphic encryption is that you can do a computation on a truly untrusted substrate, i.e. you can trust computations performed by an untrusted adversary, and also know that they didn't learn anything about those computations. It's a technical solution to security/privacy, not a contractual one.
The ideal that everyone's hoping for is that there's a way to get that same kind of technical guarantee from cloud compute providers, without needing a layer of maths that makes Monte Carlo quantum simulation look fast.
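As a concrete (if wildly insecure) illustration of the idea: textbook unpadded RSA is multiplicatively homomorphic, so an untrusted party can multiply two ciphertexts without ever seeing the plaintexts. A toy sketch with tiny primes; real fully homomorphic schemes use lattice-based constructions and entirely different machinery.

```python
# Toy only: unpadded RSA with tiny primes. Never use this for anything.
p, q = 61, 53                      # real RSA uses ~1024-bit primes
n = p * q                          # 3233
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

c1, c2 = enc(6), enc(7)
# The "cloud" multiplies ciphertexts it cannot read...
c_product = (c1 * c2) % n
# ...and only the key holder learns the result:
assert dec(c_product) == 42
```

The multiplicative property falls out of modular arithmetic: (m1^e · m2^e) mod n = (m1·m2)^e mod n. FHE schemes extend this to both addition and multiplication, which is what makes arbitrary computation on ciphertexts possible (and currently very slow).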
Any other expectation of protection from the state is limited, based on probability, the seriousness of the matter, and your potential culpability. Your employees, service providers and others can be required to provide information without informing you. In extreme cases, agents will pose as utility, security or building management.
...and homomorphic encryption would stop all of those attacks. Presuming it's a homomorphically encrypted substrate for an autonomous agent, making its own "evaluations" of the data it can perceive from the outside world (à la a smart contract with access to an oracle) rather than simply trusting data from the insecure domain that happens to be signed with the right key.
This is also, y'know, the security architecture that allows nuclear submarines to avoid being subverted by an enemy nation that has temporarily gained control of the White House. The sub's commander needs to know not only that they've received the order, but also that the world really looks like one where such an order would be legitimately given. The isolated secure agent, speaking to an insecure principal, needs not only proof of their credentials, but also needs to independently verify their claims about the state of the world. (And, if they can do that, the system is often architected such that the principal won't even communicate in the moment, but instead has just left flowchart-like orders in advance, involving various dead-man's-switch timers and so forth.)
It may, but how many business processes are as well thought out as nuclear missile submarines?
If a nation state and/or corporation that runs the infrastructure decides to exfiltrate your data, legally or not - the power dynamic is totally asymmetrical.
Individuals and smaller businesses can only do so much within their power to protect themselves, and at some point accept the compromise that "perfect" security and privacy are not possible.
(Although, I can't help but think that there may be technical solutions as yet unimagined, which could level the playing field.)
Of course the question is basically moot, because unless they have some sort of third-party append-only log of actions they perform on the hypervisors, how would anyone ever know? Yes, AMD this and Intel that, but since Google is building their own computers, and since it's close to impossible to verify that you are in fact running in the secure enclave, you should assume you aren't.
Naturally, if you come up with a structure where you can incentivize Google to remain honest (e.g. somehow make it evident to the world if they access or tamper with your stuff), then it becomes safe to delegate running VMs to them.
Again, of course, the same problem comes up if you try to do it in-house. How do you verify the security staff at your data center is honest? CCTV? Who watches the watchers?
The software would run your VM, and provide some kind of API which your VM could query to be sure it was running in a secure enclave, managed by Intel's signed software. The result of the API could be signed with Intel's key.
I should note that this by itself is a fairly hilarious proposition.
But SGX has been broken multiple times, and since people love breaking SGX (it's such an "all the security eggs in one basket" design), it will almost certainly be broken again in the future.
Furthermore, SGX is fairly disliked, inasmuch as many people consider its security and boundary implications equivalent to those of the Management Engine.
This gets you close to the security of a physically dedicated server, without the expense of actually buying/leasing a full server.
Google probably has root by default via an agent, but you can remove that. Google can probably run single user mode to change your root password, but you can change your bootloader/kernel to forbid that. Google can probably mount your disk images and just read them, but you can use full disk encryption to avoid that.
Tell me this: will Google indemnify you against all your losses proportional to the amount they are to blame?
i.e. if you lose $50 million because you relied on Google's "confidential VM" and an investigation shows it's 100% because Google didn't protect the VM, do you get a year's worth of fees back or $50MM?
(disclosure: I was at AWS 2008-2014, and VMware's cloud business 2014-2016; opinions my own, no disclosure of secrets, etc etc)
Most SLAs only indemnify you for the paltry infrastructure costs in relation to the downtime or accident.
Few SLAs are more extensive, covering for example data loss.
AFAIK, no SLA exists that extensively covers this type of loss (perhaps with the exception of heavily customized contracts, sometimes part of a military contractor deal, where the numbers are out of this world and there are huge margins for clauses and exceptions everywhere).
They don't exist for two reasons, IMHO:
1) it's very complicated to define metrics related to the amount to be indemnified, and
2) it's hard to keep that amount current, based on how the data changes, or the architecture changes.
It would be interesting to see "insurance" for this kind of thing. I once thought of such a product, just after I left AWS, and for a little while I entertained the idea of launching a startup to go after that market.
A few comments on this topic from 44 days ago: https://news.ycombinator.com/item?id=23362073
Now insurance, that sounds like a much more promising facet for protection. But still I wonder about the incentives: SLAs are offered by the original company and mean that they have skin in the game. (Well, they already have skin in the game as you're a customer and if they mess things up you will look less favourably upon them as a provider, but the SLAs should in theory provide a more direct financial incentive to avoid breaking things.) Insurance would be offered by a disinterested third party¹, and thus lose that SLA incentive.
¹ Insurance in general relies upon this for scale and safety; I can’t imagine trusting any provider offering first-party insurance: they’d likely go bankrupt the first time anything went wrong.
Negotiated SLAs may be able to take into account customer business losses (I don’t know, I’ve never seen a negotiated one), but standard SLAs never include that.
And a company the size of Amazon or Google probably doesn't actually buy business insurance; they are larger than many insurance companies and will find it cheaper to just accept most losses in their current regulatory regime.
AWS seems to have turned part of the cloud operating model they are supposed to be responsible for back onto the user and no one questions it.
You can also set up workflows such as your client owning the encryption key that encrypts data held by you, which they can revoke at any time. Slack has a similar system, and I was asked about the same by a large financial institution. I expect to see more of this in future.
Sounds like a great target for ransomware crews!
We'd definitely notice.
If you use a CMK, you can write custom policies which are both far more restrictive and shouldn't change as much as other resource types. That means that I could, say, have a policy which says the key-admin role/group/user is the only principal in the account which can update the key settings at all (even Administrator can't do that then), the writer role is the only one with kms:Encrypt, and the reader role is the only one with kms:Decrypt. No matter what S3 access you have, if you aren't one of those roles you won't be able to use the encrypted data. This is probably used in a scenario something like "central group A provisions the KMS key, devops group B creates lots of other resources using that key".
You can add conditions, too — "anyone in our account can encrypt, decrypt requests can only come from this IP address or VPC", "only requests from this AWS service are accepted" (i.e. that compromised EC2 instance can't use it), "access to data encrypted with this key can only happen in the two regions we approve of", "this particular encryption context must be used on all requests", etc.
That adds an extra layer of defense: if I compromise a user, even one with some level of administrative access, who doesn't have the CMK access all I can get are errors or encrypted data rather than the raw data. If you're careful you can architect environments where a person can deploy code without direct access to secrets or a system can stream data through to an encrypted store (if you are storing PII, this can be a huge difference between “all of our users” and “only the ones who used the system during this time period”). In some cases these can be bypassed (i.e. a CMK might not allow Administrator to access it directly but they could possibly issue credentials for a user who does have access) but you're preventing generic attacks which just scrape up everything that compromised credentials have access to, and hopefully increasing both the level access required and the likelihood of producing an audit alert.
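As an illustrative sketch of the kind of key policy described above (the account ID, role names, and VPC ID are all hypothetical; check the current KMS key-policy and condition-key documentation before copying anything):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "KeyAdminOnly",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111122223333:role/key-admin"},
      "Action": ["kms:Put*", "kms:Update*", "kms:ScheduleKeyDeletion"],
      "Resource": "*"
    },
    {
      "Sid": "WritersEncrypt",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111122223333:role/writer"},
      "Action": "kms:Encrypt",
      "Resource": "*"
    },
    {
      "Sid": "ReadersDecryptFromVpcOnly",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111122223333:role/reader"},
      "Action": "kms:Decrypt",
      "Resource": "*",
      "Condition": {"StringEquals": {"aws:SourceVpc": "vpc-0abc123"}}
    }
  ]
}
```

One caveat: a key policy that grants the account root principal nothing at all can leave the key unmanageable, so real policies usually include an additional statement for that; it's omitted here for brevity.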
Self hosting is not about confidentiality. For nearly all categories of "confidential data", I would much, much rather have it on a major cloud platform than running in some closet somewhere or in some random colocation center, all other circumstances being equal.
Self hosting is about how much you want to be in control, regardless of your capabilities to actually be in control.
Beyond a certain scale you can go build your own datacenter (or, smaller: rent a whole rack cabinet in a datacenter) and start exploiting economies of scale.
A lot of people don't realize that nowadays you can pack tens of cores and literally terabytes of ram in a 2u server.
For most organizations, however, it's hard to justify investing millions of dollars up-front in the hope that at some point you'll be saving enough to make that pay off. If that's not your core business it's often easier and safer to outsource it so, for example, you don't end up with a data center full of 50% utilized hardware which you bought to have capacity for growth which wasn't quite what you expected — or a big crunch when you have more demand than capacity and now need to double that investment to handle [currently] 10% of your usage.
Well, if you have your bills and an estimate of how much building and operating a datacenter would cost, it should be easy to do the calculation.
Btw one should not dismiss so easily the work of datacenter companies. They often have very high security standards and practices.
And this means that you don't necessarily have to build a datacenter from the ground up. You can start saving by just renting one or two rack cabinets and start putting your own hardware in there.
At some level of usage those costs are lower than the savings but that line has been going up for years, especially for anyone who needs PCI, HIPAA, FEDRAMP, etc. where there’s a ready package available covering a lot of it.
- CPU: 128 cores, 256 threads (2 sockets)
- RAM: Up to 2TB RDIMM or 4TB LRDIMM (16 channels)
- Avg. power at 100% load: 750W
Standard rack size is 45U:
- CPU: 5760 cores, 11520 threads
- RAM: 180 TB
- Power: 33kW
You might need to sacrifice 1U or 2U for switches.
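Sanity-checking those totals: they assume one server per U, i.e. 45 of them. With the 2U chassis mentioned upthread you'd fit ~22 and should roughly halve these figures.

```python
servers = 45                     # one server per U in a 45U rack
cores = 128 * servers            # 5760
threads = 256 * servers          # 11520
ram_tb = 4 * servers             # 180 TB, using the 4 TB LRDIMM option
power_kw = 750 * servers / 1000  # 33.75 kW; the "33 kW" above rounds down

assert (cores, threads, ram_tb) == (5760, 11520, 180)
```

Power is worth double-checking against what your colo will actually deliver: 33+ kW per rack is far above the 5-10 kW that many facilities provision by default.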
Large companies have been announcing HUGE savings, and small companies would be able to save a LOT too... such a pity, all the cloud abstraction creates lazy teams IMO, and lazy companies... (again IMO, I know this won't be a popular view, because this audience is exactly the cloud-happy audience, but if you achieve self-criticism, self-hosting / colo etc. is probably a better fit for 99% of cases)
I'll happily accept that you can pay less money for the same amount of power, but security isn't free. You don't only outsource a considerable amount of performance and reliability engineering to $MAJOR_CLOUD_PROVIDER, but also a lot of security engineering. Doing a lot less of that is cheaper for sure, but is that worth the cost? I'd argue that for most (not all: most), it isn't.
At ever-growing scales the equation will eventually tip in your favor, but you have to either be working at a very substantial scale for that, or you simply must not care about an important portion of the tasks that the major cloud provider picks up for you. That is fine, by the way, but you have to be sure that it is actually a conscious decision and you're not simply forgetting to do that work, or doing it poorly.
1) There is often an assumption that the people behind cloud services are smarter than anyone else and don't make mistakes.
In reality, they are still humans. Big names attract some bright people, but not everyone is a genius with good teamwork skills.
2) Cloud companies have a much harder problem to solve. They have two hostile fronts: the outside world and their clients. They need to protect themselves from malicious clients and keep clients separated from each other.
They offer generic services for everyone, so there is functionality unused by your use-case. In a self-hosted setup (the infra as a whole, not just inside your virtual machine) you can disable/uninstall things, filter aggressively at the network perimeter, and so on.
Self-hosting doesn't only mean "running in some closet somewhere or in some random colocation center". I can't say how things are done in the US, but in my country the government has several DCs/server rooms for governmental agencies. There is on-premises hosting too, sometimes with very good physical security.
From that point, yes, you open up your self-hosting to the world in a (hopefully) limited fashion and restrict access to your cloud management (hopefully) to a much narrower scope. But by default, a box in your building starts completely secure, and your AWS box starts accessible to anyone on the planet with your AWS password.
Zero-trust is not without merit, by any means. It is good to not assume there are no cracks in your walls, and you should indeed use as much internal security as possible wherever you can.
But you know what's really quite silly? Deciding to fill your moat in with dirt and knock over your castle wall because you think it's possible for someone to get in anyways.
You had better believe I'm going to use the latest authentication and encryption tools between machines that I can to ensure nobody can listen in from a stray network connection... and that I'm also going to put all of it behind a firewall.
Yeah, lock your doors inside your castle, but for heaven's sake, the moat and the castle walls still help. Defense-in-depth is a concept I swear everyone forgot when clouds became a thing.
My favorite example is my Google Voice account. It has a different area code (out of state) than my real phone number. I get a lot of spam calls, almost all through Google Voice, and I know not to answer them, because nobody legitimate calls me from the area code my Voice number is from.
Google has state-of-the-art artificial intelligence and spam-filtering capabilities; these are arguably Google's two most sophisticated advantages. And they are completely ineffective at blocking spam calls. If Google Voice gave me the ability to create my own filter rules, I could write a one-line rule that would drop any call from that one area code, and I would have perfect spam filtering for my account.
This isn't an example about Google Voice, but about the difference between generalized technologies that cloud providers use versus configurations you can apply yourself that are custom tailored. Obviously, Google can't block everyone in that area code as a spam filtering method... many people legitimately have that area code. But for my phone, it would be a good rule and would be nearly 100% effective.
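That "one-line rule" might look like this (the area code is hypothetical; the point is just that a trivial, user-specific rule can beat a provider's general-purpose model for one user's traffic):

```python
# Hypothetical: the out-of-state area code my Voice number happens to have.
VOICE_AREA_CODE = "512"

def should_drop(caller: str) -> bool:
    # Drop any call whose number shares my Google Voice area code;
    # nobody legitimate calls me from there.
    return caller.removeprefix("+1").startswith(VOICE_AREA_CODE)

assert should_drop("+15125550123")       # spam: matches the Voice area code
assert not should_drop("+12065550123")   # legitimate: different area code
```

A provider can't ship this rule because it would wrongly block millions of legitimate callers for everyone else, which is exactly the general-vs-custom point being made.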
Which is to say, my engineering regiments will always be more capable than my cloud provider's engineering regiments, because mine know my system and my customers and my use cases. I'm paying engineering regiments either way, so I might as well pay my own.
I think I lost track of what you're trying to discuss now. I'm not arguing cloud providers are a security "layer" in any sense, just that they take responsibility for some things you otherwise need to do yourself. If you got that from my post I apologize. Even if I said something like this, I don't know how your Google Voice example (which is an application/service) applies to cloud infrastructure.
> Which is to say, my engineering regiments will always be more capable than my cloud provider's engineering regiments, because mine know my system and my customers and my use cases.
Good for you if true, but I've personally never seen an environment where such confidence on the part of infrastructure engineers has held up. At least not from a security perspective.
> I'm paying engineering regiments either way, so I might as well pay my own.
If it turns out the equation favors you, then great, those companies exist. But I don't think the equation favors many, at least not when including all the items you need to have for self hosting.
I tried to explain the concept above, but it's that whether it's an application/service or a cloud platform, its tooling has to be designed for the entire customer base. Often, a far stupider solution can be far more effective if it only has to apply to one use case.
> such confidence
Don't get me wrong: Nobody's perfect and everyone has security holes. But things like all of the public S3 bucket fiascos should remind you that the cloud is, by default, open to everyone, and people become incredibly overconfident that Amazon or Google or Microsoft will keep them safe.
> If it turns out the equation favors you
It almost always does. When I do something in house, I am paying for hardware, software, and engineers. When I do something on the cloud, I am paying for hardware I don't own, software I don't own, engineers who work for someone else, and a healthy profit margin for one of the five most valuable companies on the planet.
Cloud is a narrowly-effective solution for startups which can't size out their solution themselves fast enough, and short-time peak loads. For everything else, you should probably not cloud.
Imagine a world where the cloud produces more data at a rate that exceeds the pipes leaving the cloud. You are quite literally locked into computing within that same cloud.
lmfao... but jokes aside, the "cloud" is just a series of datacenters with less choice of brand, you do realize this? There is no such "cloud" as you say... the Internet allows exactly for decentralized data (traffic from cloud-1a-b1-c2 to cloud-2b-5x-3h is not all "in the cloud"; it's the same as datacenterA to datacenterB... proximity still has an effect, as do all other network conditions...)
Don't let the marketing sandcastle trap you in!
SEV is not, to me, a convincing security model. It was tried a long time ago, it doesn't work. SGX uses small enclaves for a reason. Hacking your average Linux box is quite easy already, which is why compromised passwords flow like water.
With the SEV threat model the cloud provider is no longer on your side. They are no longer defending you against threats, they are a threat. That's why you want encrypted RAM. But you're being threatened by one of the world's most advanced security organisations, a company that literally pays a large team to do nothing but locate zero day exploits all day, every day. And they swap zero days with other major tech firms too. So they have access to bugs in your OS before you do, and this is structural, there's no fix for it.
Worse, they recommend you use Google-controlled, 'hardened' OS images. But that makes no sense because you're trying to defend yourself against Google.
Finally, it's very unclear to me that the Linux kernel is going to accept bug reports of the form "the hardware behaved in arbitrary incorrect ways because it was redefined by a malicious hypervisor on the fly". The kernel isn't designed to deal with a malicious hypervisor. It's not going to be checking the results of things that might be hypercalls to check the hypervisor didn't hand back invalid results. In the past this has caused big problems with userspace apps trying to run on malicious kernels: the kernel was able to break in to the encrypted memory space immediately because the kernel could manipulate syscall returns in ways the app didn't expect.
But let's put OS hacking to one side.
SEV has a very poor track record on security. There were dozens of bugs in its firmware in past revisions: ordinary C-type buffer overflows. Then there were basic crypto bugs: you could send the firmware an invalid point on the elliptic curve and it wasn't checking for that, things like this. Perhaps Google has audited AMD's firmware now and knocked out all these problems. Perhaps not. Who knows: they mention working with AMD on performance but not security.
Moreover, SEV doesn't have anything to say on the topic of side channel attacks. And unlike with SGX, because the whole point of SEV is you use existing software and operating systems, there's also no fix for this problem. Normal kernels and apps aren't designed to resist a compromised hypervisor. SGX enclaves are designed for-purpose so you can argue that they're the smallest piece of app logic that needs to process your data and you can go to town on securing it, whilst the bulk of the app handling resource management, connections, scheduling, etc, is blinded by cryptography. Enclaves expect the host to attack them because they were written that way. They can do things in a less efficient but more side-channel proof way. With SEV this is theoretically an option (could run some specially written hardened OS), but, it's not how it's being advertised so nobody will do it.
If you're using a cloud provider, you ultimately have to trust that provider is doing what they claim. After all, you have to use their management control plane to configure SEV, and that control plane could always be lying about whether SEV is actually working.
So SEV isn't intended as a defense against Google as an organization. What it can do, is provide a layer of defense against rogue hardware administrators, as well as other tenants that might be sharing the physical machine.
If you don't do this then it provides no protection. The host can break in by just telling you SEV is in use when it's really not.
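This is why the check has to be cryptographic rather than a yes/no flag from the host. A toy sketch of the verify-a-signed-report flow (field names are hypothetical; real remote attestation uses vendor-signed quotes, certificate chains, and nonces against replay, and the shared HMAC key below merely stands in for "a key only the CPU vendor holds"):

```python
import hashlib
import hmac
import json

# Stand-in for the vendor's signing key; in reality the guest only
# holds a *public* key and the vendor signs with an asymmetric key.
VENDOR_KEY = b"stand-in-for-vendor-signing-key"

def sign_report(report: dict) -> bytes:
    blob = json.dumps(report, sort_keys=True).encode()
    return hmac.new(VENDOR_KEY, blob, hashlib.sha256).digest()

def verify_report(report: dict, signature: bytes) -> bool:
    blob = json.dumps(report, sort_keys=True).encode()
    expected = hmac.new(VENDOR_KEY, blob, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

# The guest queries the attestation API and checks the signed answer:
report = {"sev_enabled": True, "tcb_version": 7, "nonce": "d3adb33f"}
sig = sign_report(report)
assert verify_report(report, sig)

# A host lying about SEV being enabled can't forge the signature:
tampered = dict(report, sev_enabled=False)
assert not verify_report(tampered, sig)
```

Without verifying such a report against a key the host doesn't hold, "SEV is on" is just an unauthenticated claim from the very party you're trying to defend against.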
Consider that most of the side channel attacks being exploited against SGX also work across process and across VM domains. They tend to get advertised as SGX specific because that's the juiciest, newest and coolest target to hack. But they can break arbitrary CPU enforced protection domains.
Some of these side channel attacks are Intel specific. Many of them aren't: they're to do with how CPUs are designed, which is why Spectre et al affect AMD as well.
Whilst SGX gets a lot of focused attention from researchers exploring side channels, it has turned out to be pretty robust against the more ordinary kinds of attacks that felled a lot of prior systems, including multiple generations of SEV. Nobody has ever found basic cryptography or C programming bugs in the system enclaves, for example. I thought that would happen at least once - never did. All the bugs have been people reverse engineering CPU internals to a much greater degree than ever done before.
One reason they do this is because SGX is patchable in the field. A remote client can tell if the CPU microcode and SGX stack were updated to close vulnerabilities. Intel call this TCB recovery. So, it's kinda 'ethical' to research SGX bugs because you aren't breaking anyone's equipment.
AMD SEV has sadly not had a working equivalent of TCB recovery in prior versions. There was an attempt at such a mechanism but it can't stop downgrade attacks, so doesn't really work. Researchers have managed to totally break SEV such that the CPU generation itself had to be discarded and replaced, not just once, but multiple times. That's the worst case scenario for hardware roots of trust. Hopefully the new gen chips won't suffer any similar fate.
Given the fact that SGX has always been renewable/patchable, that all bugs found in it were really hard-core low level CPU design bugs of the type that AMD have also had, and that it has a stronger security posture to begin with (less code in enclaves than a whole OS), I'd say overall it's doing well. Now SEV is playing in the major leagues I expect to see more research on AMD chips: it'll be interesting to see what they come up with.
Unless Google Cloud specifically offers an insurance policy against security breaches (spoiler: they won't), the most you can expect is to be refunded your hosting fees.
That being said, you can always buy your own insurance to protect against that kind of risk, and _the insurance provider_ may mandate using a technology like Confidential VM in order to qualify for insurance coverage.
The best you can get from vendors/service providers are SLAs, where damages are meant to be mildly punitive (amount x number of affected customers can add up and hurt the provider.)
A better alternative is insurance. There are many insurers that offer contingent business interruption coverage for failure of cloud infrastructure, among other causes.
I'm just kidding
The Empire brand is so much bigger and more valuable
> With the beta launch of Confidential VMs, we’re the first major cloud provider to offer this level of security and isolation while giving customers a simple, easy-to-use option for newly built as well as “lift and shift” applications.
How is Google's offering different from the Confidential Compute Microsoft already offers?
Google's offering is more intuitive. Google Cloud already stores VM data encrypted on disk and handles decryption in order to start the VM. But once the data is in memory it's unencrypted and could be read by other processes. On a bare-metal machine there may be more than one VM using portions of the same processor, with physical access to the same memory range, so a compromise at a higher level would affect other customers. The new offering supports encrypted memory isolation for a virtualized VM. It works at the VM level, or 'whole operating system', meaning there is no need to write any special code to take advantage of it. You just tick a box. The tech is by AMD, not Intel.
Both of these options support different use-cases, IMO. Microsoft's confidential compute allows you to manage untrusted applications on the same host with a high level of control. You can prove to other machines running the same app that you're doing this 'securely.' The Google solution doesn't give you the same level of granularity but is much, much easier to use for those who just want to take advantage of better memory protection and integrity checks.
My thoughts on this are mixed though because:
1. While the products are clearly very different -- Intel's SGX tech has already had numerous security vulnerabilities, and that doesn't make me very optimistic that AMD will have magically solved those issues.
2. The general advice in finance for highly sensitive data is not to use VMs, period, since privilege escalation on one VM could potentially lead to access to the bare metal and hence to the other VMs. Some of these risks still seem relevant even if memory protection is being used. I.e., it's better not to use VMs if you care about security. Trying to attract more highly sensitive data to 'the cloud' makes me nervous, to be honest.
3. I like the concept in general. Even though it's not a silver bullet it's nice to be able to have access to this option.
If you want something to run in SGX, you will likely need to rewrite the software (you can’t call syscalls directly in SGX since you cannot trust their results anyway).
SEV’s security model is weaker (no integrity), but lets you use essentially normal VM images.
Disclaimer: I work at Google in this space.
MPX was an instruction extension that was never widely adopted and that Intel has deprecated.
But it’s worth looking at exactly what it means to attempt isolation of programs on shared hardware. SEV is an interesting way of working on it.
It seems pretty clear that your data could never be modified or read, but there’s nothing preventing starvation of resources or side-channel attacks to leak encrypted values.
Of course I also always argue against using the cloud for anything sensitive, as "the cloud is really just someone else's computers". Albeit with a fancy provisioning API and some proprietary services adjoining it.
This is too simplistic: employing that argument obligates you to show how you're mitigating the same threats on your own, especially with regard to ops and security staffing. I have considerably more confidence in any major cloud provider having robust internal monitoring than the typical corporate VMware deployment, and that even extends to bare metal unless you can air-gap it — if you get a bare metal server from AWS, Azure, Google, etc. they've still put more work into the firmware, management interfaces, etc. than most IT groups do, and those are very juicy attack surfaces.
For me, I can say plainly: "this piece of equipment has these access controls, both physical and virtual, and we have various radio frequency dampening systems", etc.
For you, you can think about outsourcing that responsibility.
There's no "right" answer, some cloud providers may indeed have much stricter access controls than I could ever have (for instance, budgets may require my servers to exist in a physically shared space, albeit in my own racks; those racks being porous to allow airflow). But ultimately you will never have more control than if you have complete ownership and audit capability of all systems.
I'm sure many people have lived in the same regulatory hell that I have, and I wouldn't argue that the regulatory hell is easier in the cloud or otherwise. I would instead argue that if I were the CIO, I would sleep better knowing I had done my job rather than outsourced the responsibility and washed my hands of it, which is what you're effectively doing. Even if you trust the cloud provider, and even if they've shown good faith, it's no longer your domain to oversee.
But I can see how you read it that way.
I would definitely challenge you on 'wrong for a significant number of people', because if you're focusing on security then it's likely a core principle, and therefore you need to understand it and be able to effectively argue your case.
And whether you agree with my position or not doesn't matter for that last point to be true.
And it's often cheaper, has better performance, and has no lock-in to a cloud provider.
The more logical alternative (without abandoning the cloud entirely) would be to use the cloud provider's dedicated instance functionality (GCP's terminology for this is "sole-tenant nodes"), but these are much more expensive than virtual machine instances, especially if you don't need the capacity of a dedicated node. At some point, you or your bosses are going to be asking if the security is _really_ worth the premium.
SEV-enabled VMs can provide a convenient middle ground -- more protection than just a hosted VM instance, but since you're still sharing physical resources, the cost is closer to a VM than a dedicated instance.
If SEV VMs are considered the equivalent of dedicated instances from a compliance perspective, this could open the door to cloud hosting for a variety of industries that were unable to do shared hosting before. However, that 'if' remains to be seen.
The good news about colo'd equipment is that it's dirt cheap. You can have millions of customers running on a few poweredge nodes with full redundancy and capacity to spare.
As someone that actually manages "a few PowerEdge nodes," you're overstating their capability and oversimplifying what it takes to run a production-grade system with millions of users.
I mean, ops isn't simple no matter where you're hosting it, but it's not any harder than when I worked at AWS shops.
If you need to protect your data, all of those services are no longer usable.
This stuff is new, so I think there's going to be a lot of confusion about what it means and how it works. But the basic concept is easy. SEV and SGX don't mean anything except that you can remotely audit a piece of software that's running by checking the hash of the code that was loaded, and that code can derive private encryption keys unavailable (in theory) to any other piece of software on the system, and that its memory is encrypted.
In SEV that piece of code is effectively the entire OS. In SGX it's a much smaller piece of code, more like a single shared library.
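To make the remote-audit idea concrete, here's a toy sketch of measure-and-attest, with an HMAC standing in for the secure processor's signing key. All names here are made up for illustration; this is not AMD's or Intel's actual protocol, just the shape of it:

```python
import hashlib
import hmac
import os

# Hypothetical: the guest owner knows the hash of the image they asked to boot.
EXPECTED_MEASUREMENT = hashlib.sha256(b"my-audited-os-image").hexdigest()

def attest(loaded_image: bytes, platform_key: bytes):
    """Toy stand-in for the secure processor: measure the loaded code and
    authenticate the measurement with a key no guest or host software can read."""
    measurement = hashlib.sha256(loaded_image).hexdigest()
    signature = hmac.new(platform_key, measurement.encode(), hashlib.sha256).digest()
    return measurement, signature

def verify(measurement: str, signature: bytes, platform_key: bytes) -> bool:
    """Guest-owner side: check the signature is genuine, then compare the
    measurement against the hash of the image they intended to run."""
    expected_sig = hmac.new(platform_key, measurement.encode(), hashlib.sha256).digest()
    return hmac.compare_digest(signature, expected_sig) and measurement == EXPECTED_MEASUREMENT

platform_key = os.urandom(32)  # stands in for the hardware root of trust
m, s = attest(b"my-audited-os-image", platform_key)
assert verify(m, s, platform_key)          # the code we expected booted
m2, s2 = attest(b"backdoored-os-image", platform_key)
assert not verify(m2, s2, platform_key)    # a swapped image is caught
```

The point of the hardware root of trust is that the host can report anything it likes, but it cannot forge the signature over a measurement it didn't actually load.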
With a Google-managed database system, you don't know precisely what software it is. These clouds aren't open source. Even the hypervisors aren't open source, as far as I know. Even when based on an open source product there are proprietary patches. And they change constantly.
That means that if your VM starts up and you get a hash back saying it's the OS you wanted to start, that's great. But the moment you open a connection to some other server, even if it's all running on AMD hardware, all you get back is a hash, and moreover (IIRC, with SEV) a random hash that's only useful if you're actually the one that started the remote VM. With SGX you get the actual code hash and/or code-signing key even if you didn't start it, which is a bit better.
What does that hash actually mean? Unless you can reproduce the build of whatever you're talking to and audit the code to look for back doors, and keep auditing it as it changes, it means basically nothing. That's why SGX focuses on really tiny pieces of code - less attack surface, but also less audit churn and easier reproducibility.
Ultimately to have guarantees your data is private in a cloud, you need either to audit the software stacks and rely on hardware roots of trust, or you can use clever forms of encryption like FHE. But just ticking a check box by itself does nothing at all.
SEV is a hardware function that provides real-time memory encryption. That's all. Maintaining full confidentiality beyond the boundaries of the processor's memory controller is not within the scope of SEV.
Google Cloud's Confidential Computing platform _may_ offer a broader solution with some type of cryptographic guarantee, using SEV as a component of the solution. I don't know how far they are, or if that's the direction that they're looking to take the platform.
However, if you're looking for a platform with remote attestation that the computing environment is fully trusted, SEV is not sufficient. You would need a traditional TEE for that.
> just ticking a check box by itself does nothing at all.
Taken directly from Google's blog post:
"Confidential VMs can help all our customers protect sensitive data, but we think it will be _especially interesting to those in regulated industries._"
It provides additional hardware-backed protection when sharing a physical machine with other tenants, which among other things, can make shared hosting a possibility for security-conscious environments that previously prohibited it.
If that doesn't sound like a useful feature, or you feel that it's theater, then you're probably not the target market for the feature.
You're demonstrating my point for me. That isn't all, by any means. SEV is primarily implemented in firmware, and provides a form of measured boot and remote attestation. Don't take my word for it:
"AMD Secure Processor. Provides cryptographic functionality for secure key generation and key management."
This is literally the second feature of two that it advertises as part of SEV.
Those parts are critical, and SEV doesn't really mean anything without them. RAM encryption is only useful if you don't trust the owner of the host hardware. But if you don't trust the host, you can't assume they switched on RAM encryption or booted the OS you asked for into the VM; you have to check it. That's what the remote attestation lets you do.
I work in regulated markets! And yes, it's true, there's a lot of regulators that can be satisfied with security theatre. The weakness of regulator understanding of technology isn't, by itself, a reason to consider SEV without RA useful.
Powered by AMD. I wonder who will leverage this next.
Something like heartbleed would still happily decrypt and transmit confidential data.
Something like speculative side channel attacks would still speculate on the unencrypted memory right?
Rowhammer would still flip bits, but now one flipped bit would turn an entire 128-bit block into garbage when decrypted? It seems like that would at least make Rowhammer a lot harder to exploit into a privilege escalation. ECC memory already gave some limited protection here.
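A quick way to convince yourself of the block-garbling effect: the toy block cipher below (a homemade 8-round Feistel standing in for AES, illustrative only, not anything AMD ships) shows that a single flipped ciphertext bit scrambles the whole 128-bit block on decryption:

```python
import hashlib

def _round(half: bytes, key: bytes, rnd: int) -> bytes:
    # Round function: hash of key, round index, and half-block.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(block: bytes, key: bytes, rounds: int = 8) -> bytes:
    l, r = block[:8], block[8:]
    for i in range(rounds):
        l, r = r, _xor(l, _round(r, key, i))
    return l + r

def decrypt(block: bytes, key: bytes, rounds: int = 8) -> bytes:
    l, r = block[:8], block[8:]
    for i in reversed(range(rounds)):
        l, r = _xor(r, _round(l, key, i)), l
    return l + r

key = b"k" * 16
plain = b"sixteen byte blk"            # one 128-bit block
ct = bytearray(encrypt(plain, key))
assert decrypt(bytes(ct), key) == plain
ct[0] ^= 0x01                          # Rowhammer-style single-bit flip
garbled = decrypt(bytes(ct), key)
assert garbled != plain
# diff counts how many of the 128 plaintext bits changed: far more than one.
diff = sum(bin(a ^ b).count("1") for a, b in zip(plain, garbled))
```

Because a block cipher diffuses every input bit across the whole block, an attacker flipping one physical bit loses the fine-grained control that Rowhammer privilege escalations depend on.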
This wouldn't prevent a Spectre attack or similar cache-based attack, as the memory would be decrypted at that point. However, it would mitigate attacks like Meltdown or Rowhammer.
Edit: If each VM has its own key, then they couldn't read each other's memory with Meltdown. I think that must be the angle.
No. A particular VM's memory is decrypted by that VM's key.
Assuming that AMD's CPUs were vulnerable to a hypothetical attack similar to Meltdown, either the host or a guest would be able to dump the machine's memory, but the memory contents belonging to other VMs would be encrypted and unintelligible.
That makes sense, it gives hardware protection from privilege escalation between VMs on the same host. Be it through a hardware exploit or hypervisor vulnerability.
Given that AWS and Azure also use AMD and Intel equipment, I'd expect them to introduce similar functionality. (AWS is probably closer as SEV support is built into Linux, whereas Windows has no support for it as far as I can tell.)
Only SEV-SNP is supposed to address it, but only on new silicon which doesn't exist yet, and which probably not even Google has.
So why is Google releasing this feature if it is so flawed?
If true, it is even more disappointing.
SEV is a breakthrough in the sense that it's essentially transparent to the guest environment. Earlier technologies like Intel SGX or ARM TrustZone have a lot of performance limitations, and applications need to be explicitly developed to support them.
Intel is working on a similar technology -- Total Memory Encryption (TME) -- but they haven't released it to market yet.
TXT never really took off. There were a few problems, and I don't think SEV has actually solved them, beyond adding the memory encryption that TXT lacked.
One is that the hypervisor is ... well, it's still a hypervisor. It can play a lot of games with the operating system, and hardware is limited in what it can do to stop that. TXT's solution was to "measure" the hypervisor so you could audit it. That doesn't work for Google/AWS/Azure, who all use proprietary hypervisors, so you need to place a lot of trust in your chip and kernel that they can resist arbitrary malicious behaviour by the most privileged piece of software on the system, one that controls all hardware access. That's very difficult. For instance, the hypervisor controls access to the system clock.
Another is that it was very hard to make the operating system secure. Heartbleed being just one example of what can go wrong. So the trusted computing community concluded around this time that placing an entire operating system into your 'trusted computing base' doesn't really work. It's trying to run before you can walk. If you can't make the operating system reliably secure against remote attackers then trying to make it secure against the far harder adversary of someone who controls your hardware stack seems futile.
That's why Intel's equivalent isn't transparent: it's basically like loading a shared library, where the library gets encrypted RAM. But when you try to write an enclave that's really secure, you realise that there's a lot the host machine can do to make a mess of things. I don't think that changes much if it's "just" the hypervisor that's malicious instead of the hypervisor and kernel. The solutions end up looking the same: you want to minimise your attack surface, and you need to think carefully about clocks and time sequencing, side-channel attacks, etc.
Given that RAM encryption is literally the core function of SEV, any functionality that lacks it is by definition dissimilar.
But the core technology is basically the same concept. You get a protected memory space (to some degree of protection), you can derive keys linked to the loaded code hash, and you can do remote attestation to set up a Diffie-Hellman handshake with the remote protected domain. All that stuff is identical between TXT and SEV.
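A hedged sketch of that shared idea, using a toy DH group (a Mersenne prime, far too weak for real use) and an HMAC standing in for the hardware signature. The thing to notice is that the attestation report binds the code measurement to the handshake key, so you end up talking to the code you measured:

```python
import hashlib
import hmac
import secrets

# Toy group: Mersenne prime 2**127 - 1 with generator 3 -- insecure, illustrative only.
P = 2**127 - 1
G = 3

def dh_keypair():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)

def attested_report(measurement: bytes, dh_public: int, platform_key: bytes) -> bytes:
    # The secure processor authenticates the code hash together with the
    # guest's ephemeral DH public key, binding the handshake to the code.
    msg = measurement + dh_public.to_bytes(16, "big")
    return hmac.new(platform_key, msg, hashlib.sha256).digest()

# Guest side
platform_key = b"\x01" * 32                  # stand-in for the hardware root of trust
measurement = hashlib.sha256(b"guest code").digest()
g_priv, g_pub = dh_keypair()
report = attested_report(measurement, g_pub, platform_key)

# Owner side: verify the report, then complete the handshake
expected = hmac.new(platform_key, measurement + g_pub.to_bytes(16, "big"),
                    hashlib.sha256).digest()
assert hmac.compare_digest(report, expected)
o_priv, o_pub = dh_keypair()
owner_secret = pow(g_pub, o_priv, P)
guest_secret = pow(o_pub, g_priv, P)
assert owner_secret == guest_secret          # shared key, bound to the measured code
```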
And that's not necessarily a big deal. If you trust Google or AWS to hold all your business and customer data, no problem (and if your customers transitively have that trust). But I think there's a lot of denial about this fact: the cloud has all your data and all your customers data. Fixing that is really, really hard. It's not anywhere near as simple as Google are claiming in this announcement, certainly not "tick a box and it's switched on".
Or do you mean those resources physically located in someone else's data center running on their hardware and software platforms among their infrastructure and only assigned to your task as-needed? That's a bad assumption.
Aren't Amazon's Graviton 2 processors specified to do this too?
> I don't understand, and couldn't get any information from the article either.
See this wiki article for more info on this class of technology: https://en.wikipedia.org/wiki/Data_in_use
It just encrypts memory pages and registers with a key that the host and other guests can't get.
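Leaving aside the thread's open question about the exact mode AMD uses, the kind of address-tweaked (XEX-style) construction people assume here can be sketched like this. The Feistel permutation and the tweak derivation below are homemade stand-ins, purely illustrative, not AMD's actual algorithm:

```python
import hashlib

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _perm(block: bytes, key: bytes, decrypt: bool = False) -> bytes:
    # Toy 16-byte permutation (8-round Feistel), a stand-in for AES.
    l, r = block[:8], block[8:]
    for i in (reversed(range(8)) if decrypt else range(8)):
        f = hashlib.sha256(key + bytes([i]) + (l if decrypt else r)).digest()[:8]
        if decrypt:
            l, r = _xor(r, f), l
        else:
            l, r = r, _xor(l, f)
    return l + r

def tweak(key: bytes, addr: int) -> bytes:
    # Hypothetical tweak derivation: keyed hash of the physical address.
    return hashlib.sha256(key + addr.to_bytes(8, "big")).digest()[:16]

def encrypt_at(plain: bytes, key: bytes, addr: int) -> bytes:
    t = tweak(key, addr)                 # XEX: tweak XORed before and after
    return _xor(_perm(_xor(plain, t), key), t)

def decrypt_at(ct: bytes, key: bytes, addr: int) -> bytes:
    t = tweak(key, addr)
    return _xor(_perm(_xor(ct, t), key, decrypt=True), t)

key = b"K" * 16
block = b"same 16B content"
c1 = encrypt_at(block, key, 0x1000)
c2 = encrypt_at(block, key, 0x2000)
assert c1 != c2                              # no ECB-style repeats across addresses
assert decrypt_at(c1, key, 0x1000) == block
assert decrypt_at(c2, key, 0x1000) != block  # moved ciphertext decrypts to junk
```

This is why the tweak matters: plain ECB would let an attacker spot identical blocks and relocate ciphertext between addresses, and an address-bound tweak defeats both.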
Modifying the page tables to establish a cryptographic merkle tree would fix the first attack, and SEV-ES fixes the secrecy attack from the second paper. Unfortunately a change to page table structure may make it impossible to run unmodified kernels in the VM.
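A minimal sketch of the hash-tree idea, assuming SHA-256 over 4 KiB pages (parameters made up for illustration): any page the hypervisor tampers with changes the root, so an integrity check over one small value covers all of memory.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(pages) -> bytes:
    """Root hash over a list of memory pages; any modified page changes it."""
    level = [h(p) for p in pages]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])      # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

pages = [bytes([i]) * 4096 for i in range(8)]   # eight fake 4 KiB pages
root = merkle_root(pages)
pages[3] = b"\xff" * 4096                        # hypervisor tampers with one page
assert merkle_root(pages) != root                # tamper detected at the next check
```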
I think it is impossible to prevent a hypervisor from fingerprinting what's running on a child VM; there are too many timing and power attacks to ever mitigate that, which was the attack vector on SEV-ES.
Intel is working on a competing technology -- Total Memory Encryption -- but AMD beat them to market by quite a bit.
That being said, AMD SEV is transparent to the underlying applications and users, so anyone with an interest in protecting memory from certain classes of attacks (e.g., Meltdown, Rowhammer) can benefit.
Other technology in this space (e.g., Intel SGX, Arm TrustZone) requires the application to explicitly support the secure enclave, so their usefulness to a typical end-user is much more limited, and as such, they aren't really used much other than to enable DRM.