The Cloud Conundrum: S3 Encryption (secwale.com)
53 points by daitya on Jan 16, 2023 | 61 comments



SSE-S3/SSE-KMS does provide defense against a few things:

* Lost/stolen hard disks from AWS datacenter

* Attacker or rogue AWS employee who has access to low level storage but not keys.

The article is correct that this misses many important situations but it is still better than nothing.


The biggest benefit is compliance. Many customers have to satisfy external regulators or internal compliance teams who mandate encryption at-rest. Those requirements are typically designed for relatively low-security on-premises data-centers and basic systems, rather than the kind of secure facilities and erasure coding schemes that AWS maintains, but always-on encryption makes the checkbox easy.

Another often overlooked benefit of at-rest encryption is that it can enable crypto-shredding; delete the object key and the object can no longer be recovered. That's faster, more efficient, and better for the environment compared to other kinds of "secure erasure". Of course if you want robust crypto-shredding you do have to take some care about where the keys might be cached and for how long, but you can't do it without encryption at-rest.
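
A minimal sketch of the idea in Python, assuming a toy in-memory key store (nothing here reflects S3's actual implementation; names are illustrative):

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key_store = {}  # stand-in for a real KMS / key database

    def put_object(object_id: str, plaintext: bytes) -> bytes:
        # Encrypt each object under its own data key.
        key = AESGCM.generate_key(bit_length=256)
        nonce = os.urandom(12)
        key_store[object_id] = key
        return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

    def get_object(object_id: str, blob: bytes) -> bytes:
        key = key_store[object_id]  # raises KeyError once shredded
        return AESGCM(key).decrypt(blob[:12], blob[12:], None)

    def crypto_shred(object_id: str) -> None:
        # Destroying the key makes the (much larger) ciphertext unrecoverable,
        # with no need to overwrite it -- modulo the caching caveat above.
        del key_store[object_id]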


Author here. Agree, I cover these in the article. Crypto-shredding might be AWS's motivation for enabling the feature as a zero-click default. But customers don't get much beyond the compliance checkbox, and even that is questionable as more and more auditors become aware of cloud security controls.


These are far less likely outcomes than your root account being compromised, an IAM fuck-up, a dumb assume-role configuration, or something using an assumed IAM role being compromised and having verbatim access to the bucket.

The feature is there for paperwork reasons.


Less likely? Yes. However, these are real, legitimate risks that need to be mitigated! This is not just some faff that needs to get put on paper.

Think about your risk model: there are a lot of employees at your cloud provider, and a lot of high-profile customers. There's a defense in depth approach here. Metal detectors, security guards, and searches to enter & exit data centers. Data encrypted when on-disk. Keep the keys in some KMS system. Physically isolate the KMS system, use HSMs, restrict permissions to the KMS.

Without these measures, you might end up with a system where ordinary customer support representatives, software engineers, data center technicians, or system administrators / SREs can exfiltrate customer data on a whim. At a large cloud provider like Amazon, Google, or Microsoft, that's not a small attack surface. That's a lot of people. The idea is to make it extremely hard for one of these employees to exfiltrate customer data without somehow leaving a record of it which ultimately gets the employee fired and prosecuted.

Customers will make mistakes too--screw up their permissions, fail to encrypt their data, or whatever. The idea behind defense in depth is that any one security measure can probably be circumvented, but multiple independent security measures are harder to circumvent.

It's really easy to be cynical about security but I have worked at cloud providers and the threats are real.


I'm fully aware of those risks. The issue for me is a matter of realism.

A combination of poor default policies across AWS services, AWS's shared-responsibility model delegating this risk to clients, and the industry's terrible default security posture makes actually solving this problem like traipsing across a field covered in poo.

An analogy: you have to tick the box to put your front-door key under the doormat. The burglar knows where the hole in your fence is and knows you might keep your keys under the doormat. Plus your kids left their keys on the GitHub bus.

Of course the SOC2 audit says you ticked the box and you tell everyone your house is a castle.


It's there so folks can check their list and go "Yep, it's encrypted at rest. Yay us!"


SOC2 here we come!


Author here. Exactly, that's the biggest risk!


"less likely" risks are still worth protecting against!

I still put my seat-belt on, even though there's an airbag in the car.


If your root account is compromised, it is game over anyway.


>>The article is correct that this misses many important situations but it is still better than nothing.

I am not really sure it's better than nothing. What it does do, IMO, is muddy the waters a bit for casual users and perhaps make it seem like the customer doesn't need to do as much securing themselves, because it is now 'encrypted by default'. That gives a false sense of security, and as others have pointed out, it really just protects against extremely rare edge cases, i.e. the lost hard disk.


That's kinda like securing a door while not having walls


Welcome to security certifications


There is a huge amount of data in S3 and EC2. Other cloud providers such as GCP hold similarly large amounts of data. Governments are no doubt very interested in these gold mines.

I am curious whether cloud providers such as AWS give governments default access to data stored in their data centers (to all data, and by default, in a successor to the PRISM program, not on a case-by-case basis under subpoenas, which is a different program).

If so, KMS doesn’t help, and client-side encryption is the only way to protect the data.


Or a CloudHSM if you trust the certification: https://aws.amazon.com/cloudhsm/


What's always been unclear to me is how much I actually gain from providing my own KMS key for use with S3 if, at the end of the day, my private key is available to the S3 service.

I would only use client side encryption/decryption if I had really sensitive data and I truly couldn’t trust AWS with it.


As you said, using a key they generated (but not the default key) doesn't protect you from AWS-side attacks.

The main advantage is more control over the key & permissions on your side of the shared-responsibility model.

The AWS default key has a fixed policy and is wide open to your account, so it's not suitable for:

- Using different keys for different situations/data within the account; this can help with defense in depth and keep least-privilege mistakes from escalating. Example: a Lambda accidentally has overly permissive S3 permissions while you're using the default key; it could potentially read more S3 buckets/files than intended. With different keys, it couldn't decrypt the data in other buckets (see the sketch below).

- Using different accounts. Default keys are only usable within your account; you can't grant other accounts access to use them (which might sound odd, but there are use cases where you'd consider doing this).

- Some AWS services can't use default keys; e.g., the default key policy can't be used by CloudWatch Events to enqueue to an encrypted SQS queue. (The workaround is adding a resource policy to the queue, but I personally do not like resource-based policies -- I like all my permissions in IAM roles & policies so there's one place to audit and control.)

All this said, I often leverage the default keys heavily, as using your own KMS key is more expensive.
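
As a sketch of that first point, here's what distinct per-bucket CMKs look like with boto3 (the bucket names and key aliases are made up for illustration):

    import boto3

    s3 = boto3.client("s3")

    # Each bucket's objects are encrypted under a different customer-managed
    # key. A principal with s3:GetObject on both buckets but kms:Decrypt on
    # only one key can still read only one bucket's data.
    s3.put_object(
        Bucket="example-app-data",             # hypothetical bucket
        Key="report.csv",
        Body=b"...",
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId="alias/example-app-data",  # hypothetical alias
    )
    s3.put_object(
        Bucket="example-app-customer-pii",     # hypothetical bucket
        Key="customers.csv",
        Body=b"...",
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId="alias/example-app-pii",   # hypothetical alias
    )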


Assuming for a moment that AWS's security at all matches the public API.

There are two ways to grant access to a KMS key: as a policy grant on a user/role, and as a policy on the KMS key itself.

If you don't have access to the key, then reading the S3 object doesn't work.

So I have to trust that AWS's security here works as advertised: that the S3 API itself can't read the key unless granted that access.
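
To make the two grant paths concrete, a hedged boto3 sketch (the account ID, role name, and key ID are all hypothetical, and a real key policy also needs statements for the key's administrators):

    import json
    import boto3

    iam = boto3.client("iam")
    kms = boto3.client("kms")

    # Path 1: identity-based grant, attached to the caller's role.
    iam.put_role_policy(
        RoleName="example-reader",  # hypothetical role
        PolicyName="allow-decrypt",
        PolicyDocument=json.dumps({
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Action": "kms:Decrypt",
                "Resource": "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE",
            }],
        }),
    )

    # Path 2: resource-based grant on the key itself. Note this call
    # replaces the whole key policy; "default" is the only allowed name.
    kms.put_key_policy(
        KeyId="EXAMPLE-key-id",  # hypothetical key id
        PolicyName="default",
        Policy=json.dumps({
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::111122223333:role/example-reader"},
                "Action": "kms:Decrypt",
                "Resource": "*",
            }],
        }),
    )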


> If you don't have access to the key, then reading the S3 object doesn't work.

I'm still not convinced. It just seems like theater. S3 has its own access control. If you don't have access to the S3 object, then reading the S3 object doesn't work ... Requiring additional access to the key seems redundant -- theater.

P.S. I'm not arguing against encryption at rest. I'll take it, but using my own key doesn't seem to offer too much over using one of AWS's, at least with respect to low-level access.


AWS isn't monolithic internally and we use permissions boundaries between systems like S3 and KMS (in fact they are mediated by IAM). There's some more detail in my talk from re:Invent - https://www.youtube.com/watch?v=kNbNWxVQP4w .

For a case like this, the upshot of it is that with permissions on the KMS key, S3 is effectively locked out and can't fetch it. KMS is a hermetic system with no operator access with HSMs at its root, so that provides a pretty meaningful difference. It also means that controls can be handled by different customer teams, and that a single change on a shared CMK can render an unbounded number of S3 objects inaccessible quickly.


Useful. Thanks.

> KMS is a hermetic system with no operator access with HSMs at its root, so that provides a pretty meaningful difference.

So, if I don't use a CMK (Customer Managed Key), what am I getting instead? I assumed (perhaps naively) that KMS would still be used under the hood.

> a single change on a shared CMK can render an unbounded number of S3 objects inaccessible quickly.

I feel like you could achieve that with an IAM policy. I guess the key can apply to a wide range of unrelated AWS services, so it acts like a ~meta policy.


If you use the default AWS-managed key rather than a CMK, it's largely the same level of protection; the main difference is that you don't control the rotation schedule of the top-level key.

And yep, a CMK can span many services and still give you a single point of control.

But compliance is still the major thing. Many regulators, auditors and compliance authorities are just happier with KMS being the point of control. Even though AWS operates both KMS and the services, KMS is a neat, compartmentalized, hermetic system with HSMs at its root. So it's quicker and neater to evaluate its surface area, TCB, and so on, and it's a familiar pattern to finance and health regulators especially.


Default keys don't allow policies, if I recall. No access control.


You can't set a Key policy, but you control access via the IAM Policy associated with the caller.


It's protection against mistakes where you've accidentally granted more access than was intended.

For instance, if someone uses a wildcard resource "acme-prod-widget-*" in a new policy but forgets that widget also has (say) a customer-pii bucket.

Well, if acme-prod-widget-customer-pii is encrypted with a customer-pii-specific KMS key, then there's no accidental leak of that data.
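
A hedged sketch of the failure mode (the policy document is illustrative, not from any real system):

    # The over-broad statement: the wildcard silently matches the
    # customer-pii bucket as well as the intended ones.
    overly_broad_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::acme-prod-widget-*/*",
        }],
    }
    # With SSE-KMS, reading acme-prod-widget-customer-pii additionally
    # requires kms:Decrypt on its dedicated key, which this policy never
    # grants -- so the mistake doesn't leak the PII.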


I understand that, but I guess my question is: other than another layer of access control to prevent mistakes, what am I actually getting?

I know AWS has fixed this now, but in years past we paid a ton of money in KMS requests from S3 for these types of configurations, and we asked ourselves what this was really buying us.

At the end of the day I have to assume some AWS employees have access to some or all keys in KMS.


No AWS employee has access to the keys directly from KMS. It's a hermetic system with no operator access like that. KMS Keys are released to AWS services for use based on IAM permissions and grants, and a time-bounded cryptographic pattern we call Forward Access Sessions ... where we end-to-end verify that the requesting service has a recent and legitimate signed request from the customer.

KMS also has the capability to support an external trust store (https://aws.amazon.com/about-aws/whats-new/2022/11/aws-kms-e...) ... where AWS holds no key material at all.


It’s a bit like saying I don’t have access to keys in my Yubikey. Sure, but I can decrypt data with those keys.

If I'm right, S3 sends encrypted data encryption keys to KMS, and KMS sends back the decrypted data encryption keys. So, although S3 has no access to the master key, it has access to the data keys in RAM for the customer to use. With a "bucket key," it goes further, storing those kinds of keys on disk.


It's more than what a Yubikey typically does. There are also can't-be-bypassed audit logs of the key usage, and the manner in which S3 is granted access to the data encryption key is very fine-grained; KMS won't allow the S3 systems to decrypt just anything.

The key is stored in memory while it's being used to encrypt/decrypt; that's unavoidable, but humans don't have access to that. Bucket keys are a bit different: there's a per-bucket key which has a similar scheme, and per-object keys are derived from it as needed. Together with random nonces/IVs, it ends up being a bit of a mini multi-party scheme.
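
For readers unfamiliar with the pattern, a hedged sketch of envelope encryption using the public KMS GenerateDataKey/Decrypt APIs (the alias is made up; S3's internal machinery is far more involved):

    import os
    import boto3
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    kms = boto3.client("kms")

    # Encrypt path: KMS returns a plaintext data key plus the same key
    # encrypted under the master key. Only the encrypted copy is persisted.
    resp = kms.generate_data_key(KeyId="alias/example-cmk", KeySpec="AES_256")
    nonce = os.urandom(12)
    ciphertext = AESGCM(resp["Plaintext"]).encrypt(nonce, b"object bytes", None)
    stored = (resp["CiphertextBlob"], nonce, ciphertext)  # what lands on disk

    # Decrypt path: send the encrypted data key back to KMS. The plaintext
    # data key exists only transiently in memory; the master key never leaves KMS.
    encrypted_key, nonce, ciphertext = stored
    data_key = kms.decrypt(CiphertextBlob=encrypted_key)["Plaintext"]
    plaintext = AESGCM(data_key).decrypt(nonce, ciphertext, None)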


> Next, at best SSE-S3 adds a defense in depth protection against a physical loss, theft or confiscation of an AWS hard drive storing your data. Think crazy scenarios like a tornado or fire, followed by more chaos and somehow the AWS hard drive landing at Goodwill. If the data on it is unencrypted, game over. As you can imagine, the likelihood of this happening is about the same as that of the United States winning a cricket world cup.

It's incredibly easy to dumpster-dive old hardware from datacenters (or waste-management facilities). Even if their policy is to demagnetize and shred every hard drive out of every dead machine, that assumes they always follow their policy correctly. Humans are flawed (and subject to bribes).

I have friends who run K8s clusters on hardware they took from the dumpsters behind datacenters. (Not Amazon datacenters, but ones where the government has an entire floor dedicated to them)


Early on with AWS, when I was migrating a complex, sensitive system from self-hosted servers, there was a requirement not to leave data at rest in S3 unencrypted, and there was no suitable S3 feature for this.

Since I had to write the AWS API client from scratch anyway (fringe language, and we tried to avoid C memory problems), I also implemented transparent encryption at the same time. (Using an off-the-shelf, vetted implementation of appropriate encryption algorithms, and in such a way that, when handling large data, it could run on other cores if there was no accelerator.)

Of course, as the article points out, the EC2 instances (or whatever needs to access the data within the objects in S3) by definition need access, but at least that access is restricted more closely to what actually needs it.
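
A minimal sketch of that kind of transparent client-side encryption, in Python rather than the commenter's fringe language (key handling is deliberately simplified; a real system would wrap the key with a KMS or similar):

    import os
    import boto3
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    s3 = boto3.client("s3")
    key = AESGCM.generate_key(bit_length=256)  # stays with the client, never sent to AWS

    def put_encrypted(bucket: str, obj_key: str, plaintext: bytes) -> None:
        # Encrypt before upload; S3 only ever sees ciphertext.
        nonce = os.urandom(12)
        body = nonce + AESGCM(key).encrypt(nonce, plaintext, None)
        s3.put_object(Bucket=bucket, Key=obj_key, Body=body)

    def get_decrypted(bucket: str, obj_key: str) -> bytes:
        blob = s3.get_object(Bucket=bucket, Key=obj_key)["Body"].read()
        return AESGCM(key).decrypt(blob[:12], blob[12:], None)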


AWS has access to the AWS-managed SSE-KMS keys, since AWS manages those keys on behalf of customers.

I am wondering if AWS has access to data encrypted with the SSE-CMK keys as well?

They state that the SSE-CMK keys reside in tamper-proof, FIPS-validated HSMs and that not even Amazon has access to them. But that's a bit misleading, because they use envelope encryption. Furthermore, the encryption and decryption still happen server-side.

They could have phrased it better: the only advantage of SSE-CMK keys over SSE-KMS keys is that, with the former, users can define key policies (providing another access-control layer, similar to IAM) and monitor usage.


Author here. Technically, they can. However, if this threat is part of a customer's risk profile, i.e., they cannot risk even the possibility of AWS having access to their data, then they should use client-side encryption.


People here say that this doesn't secure anything. But in addition to encryption by default, AWS announced that as of April 2023 it will enforce secure S3 configuration (no public access) by default. So things are definitely getting better when you put these announcements together.

https://aws.amazon.com/blogs/aws/heads-up-amazon-s3-security...


Well it’s kind of insane that it has been open by default, resulting in SOOOO many leaks.

They might as well have open SSH access on all EC2 instances using root + “password”


That's not entirely fair - the default has been to ALLOW people to set buckets to public and/or to set public ACLs on individual objects. Both of these require the customer to Do Something in order to grant public access. The default has NOT been "public access for all" ever, that I am aware of.


This definitely dates back to when offloading images and other static assets from your (era-appropriate) Apache or Varnish server to S3 was the optimization of the day. It's obsolete now, but at the time there were no low-friction, lower-cost CDNs available, such as CloudFront or Cloudflare.


CloudFront as the CDN, backed by S3 for your static assets, is still a very common pattern on AWS.


It is, but my point refers to the time when there was no CloudFront.


I'm surprised they did not always encrypt data at rest; Google does.


Legacy? Google's public cloud operation is relatively recent.


GCS launched publicly in 2010 [1]. That's still 4 years after S3, but it's not "recent".

[1] https://en.m.wikipedia.org/wiki/Google_Cloud_Storage


I wonder if the real reason for encrypting is that it allows cheaper deletion. You can save the IO of deleting an entire file and just delete the encryption key. (aka cryptographic erasure, see [1])

[1] https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.S...


One problem with unencrypted drives is that if there's a hardware failure, they're more expensive to dispose of. Normally Amazon overwrites the data, degausses the drive and then destroys it, but if, say, the electronics are faulty, the first step becomes more difficult.


Not this. A customer's bytes are likely spread across millions of hard drives. At the filesystem layer, the regions of the disk those bytes occupied still need to be reclaimed.

What you're describing makes sense for a service like EBS, where a single customer's bytes are limited to a small number of disks in well-defined locations.


My speculation is that they now have Nitro encryption on all (or enough) of the servers supporting S3, so there's no marginal cost for them anymore: it's inline ASIC encryption rather than incremental CPU cycles.


This also implies that if the encryption key is that easy to dispose of, then it's also easy to somehow lose it or screw it up, turning your valuable client data into a random stream of bytes, doesn't it?


They can dispose of it in the same way they dispose of the actual core data (i.e., write over it X times, and ensure the disk is shredded when it's EOL). With cryptographic erasure you can apply exactly the same erasure process to far fewer bytes and get the same effective result.


"Users" in this case are supposed to be software engineers. Thus, I don't see how Amazon's claim can be misleading. Server-side encryption is very valuable, especially against breaches at the cloud provider.

I would expect any mid- to senior-level engineer to understand the implications and add further security measures (local encryption before upload, etc.) as needed.


Breaches at cloud providers have so far tended to be entirely software-based, at least for the big ones. Encryption at rest doesn't help prevent software bugs.


There is also a danger to encrypting your data with a non-built-in KMS key: the key can be deleted, and your data can become irrecoverable. KMS has a waiting period of 7-30 days when a key is scheduled for deletion, but it is up to you to alert yourself to the fact that someone deleted it. We have AWS Config alerts on this that notify us via various means.
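
For reference, deletion is always scheduled rather than immediate, and it can be cancelled inside the window. A hedged boto3 sketch (the key ID is hypothetical):

    import boto3

    kms = boto3.client("kms")

    # Deletion is scheduled, never immediate; the waiting period (7-30 days,
    # minimum shown here) is the window in which an alert can save you.
    kms.schedule_key_deletion(KeyId="EXAMPLE-key-id", PendingWindowInDays=7)

    # If it was a mistake, cancel before the window elapses.
    kms.cancel_key_deletion(KeyId="EXAMPLE-key-id")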


I believe all of the AWS S3 SDKs except PHP's support client-side encryption; you could also roll your own. In my mind that is more secure than S3 SSE or KMS, because you are the only one with the key.


Encryption is easy. Key management is hard.


I was amazed when I read the other week it wasn't already encrypted at rest by default.


You have been able to enable it at the bucket level for many years now. This applies to all new objects in the bucket, so if you enabled it at bucket creation time, it was effectively "by default."
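
For anyone who hasn't done it, a hedged sketch of that bucket-level opt-in via boto3 (the bucket name and key alias are made up):

    import boto3

    s3 = boto3.client("s3")

    # Default encryption for the bucket: every new object gets encrypted
    # without callers changing anything.
    s3.put_bucket_encryption(
        Bucket="example-bucket",  # hypothetical bucket
        ServerSideEncryptionConfiguration={
            "Rules": [{
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/example-cmk",  # hypothetical alias
                },
                "BucketKeyEnabled": True,  # cuts KMS request costs
            }]
        },
    )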


AWS will now encrypt all new data in its Amazon S3 storage service by default. Huge announcement, secure default for the win, sure, but it gives a false sense of security.


AFAIK, it only helps against hackers/insiders who steal the raw files. If you grant access to your bucket, it isn't doing anything. As always with Amazon (and other services, really): you really should understand the service. Using one just because you read "encryption" is unwise.


Wouldn’t your example apply to any encrypted hard drive as well? If you provide access to it encryption is not doing anything.


It's all about layers of protection.

This layer of server side encryption is designed to protect against an attack where someone breaks into an AWS data center, pulls a disk drive out of a storage system that is hosting S3 objects, and then tries to read the data off of that disk drive. Without the key (which is obviously stored separately, on a separate system) the disk will be useless to them.


Which is like the least likely attack to happen.



