Encrypting Postgres Data at Rest in Kubernetes (crunchydata.com)
108 points by plaur782 83 days ago | 40 comments

These automatically encrypted disks with managed keys in the cloud are nice for checking the "encrypted at rest" checkbox in security audits, but I think they add little security. In most scenarios I can think of, both the data and the keys will be accessible to the attacker. And the ones where the attacker would have access to only the encrypted data seem very unlikely, like physical access to the data center with knowledge of where the data is physically stored. But I would be gladly proven wrong.

> And the ones where the attacker would have access to only the encrypted data seem very unlikely, like physical access to the data center

This is the primary purpose of encryption at rest. It is not unlikely. Have you ever worked in an average datacenter? Some of them have very little physical security and don't monitor employees' access to cages. And if you have a cage on the same floor as another customer, all it takes is a fake name badge and a clipboard to walk up to someone with their cage open and say you're doing a routine inspection. Walk into the cage, pop out a drive, put it under your clipboard, walk away. Ask physical security pentesters how "difficult" it is to steal from a datacenter. And let's not even get into dumpster diving, where clients regularly toss entire servers without erasing disks into dumpsters.

Like someone else said, it also is a good practice in multi-tenant configurations where a misconfigured storage or compute backplane could expose data to the wrong tenant.

Any case studies related to AWS or Azure?

But that is exactly the point. You want to ensure that disks later sold by your cloud provider do not contain your data in plaintext. That's the whole point of encryption at rest. It's not supposed to protect you against an attacker with root access to your server.

A good, fairly recent example from the city I live in, of hard drives being sold that were (accidentally) absolutely chock full of unencrypted customer PII: https://www.google.com/amp/s/globalnews.ca/news/4476625/ncix...

But encryption at rest does protect you from insider or attacker access with full privileges on the servers where the data is stored. The data would necessarily be visible to a root user on the system where the data is being decrypted by an application, but that's not necessarily the same node.

The big cloud providers destroy old drives.

Not everything is physical access though. Suppose someone comes up with a way to force S3 to read random storage blocks, or to make a virtualized storage device read past its boundaries in the underlying storage, or to intercept another VM's ring buffer in the hypervisor. It's an entirely different scenario to read plaintext vs read something encrypted.

Even if all it protects is against the scenario of the cloud provider forgetting to wipe a disk, that's worth it.

That's a good point. I prefer to encrypt my files before they go to S3, it's easy to do in applications or using Minio as a gateway.

Minio can do that transparently?

I think so, but I'm not sure about their long-term plans for it.

In most cases, you're trusting other organizations to dispose of the disk, ensure they can be securely re-used (1) and that their employees will generally behave correctly and ethically.

But still, in general, I agree. I fill in these security questionnaires maybe once every couple of months, and I'm starting to see a clear shift from encryption of data-at-rest to encryption of data-in-use. Which mostly just feels like an even more tedious form of security theater.

(1) https://news.ycombinator.com/item?id=6983097

Agreed, although it does mean that if you release the volume you aren't relying on the cloud provider to wipe it before reuse (AFAIK they generally do that, AWS definitely does but it's possible other providers might not, especially if you're on some more custom OpenShift style setup).

GCE persistent disks are always encrypted anyway, with a customer-provided key if the customer chooses, or an automatic ephemeral key otherwise. I think the only storage option in GCE that doesn't feature customer encryption keys is locally-attached SSD, where you'd need to layer your own crypto on top. Definitely don't rely on cloud providers to wipe your data.

> But I would be gladly proven wrong.

All I can do is support your position. Our customers are interested in the idea of encryption-at-rest of any sensitive business data that we have on disk at any time.

I work in banking, so I am ethically bound to provide the most brutally-honest takes regarding security models to our customers. When developing proposals for encryption-at-rest we have to cast everything in the worst possible light.

We started with looking at things like DPAPI but that is almost a joke for our use cases.

We are more worried about something along the lines of - A senior IT administrator has access to all application servers with full admin/root and can trivially dump customer databases to the dark net for crypto. When modeling for this scenario, we have to consider both physical and digital controls. Digital controls alone will never provide a comprehensive solution for this kind of adversary. Our approach looks something like this:

- The customer has to install HSM(s) in their environment. This would typically be something living in a nested secure network or a USB dongle directly attached to each server.

- These secure environments/items must not ever be physically accessible unless at least 2 employees are present at the same time.

- Each of our application servers which transact PII must be configured to talk to the HSMs. Multiple HSMs provide redundancy in a production environment, mitigating downtime risk.

- The cryptoscheme is a key-wrapping approach, where working keys are managed per user work item. When a work item is requested for the first time, the HSM has to unwrap its working key (which is stored encrypted on the application server). There is a bit of a mild tradeoff between security and performance here. Hypothetically, we could require the HSM to encrypt the payload itself on every I/O, but for our product that would quickly saturate the capacity of even the most expensive HSMs.

- The policy is that an untouched work item has its key zeroed out in memory after X seconds. Working keys that timed out of memory are marked for rekey on next retrieval.

We have also modeled for remote debugger scenarios, and believe that simple firewall and other access controls can mitigate most of those vectors. This is very much a defense-in-depth strategy, so I am only presenting a small piece of that puzzle here.
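The working-key lifecycle described above (unwrap on first request, zero the key after X idle seconds, mark it for rekey) could be sketched roughly like this. This is only an illustration under assumptions: `WorkingKeyCache`, `fake_unwrap`, and the 30-second timeout are hypothetical names and values, the "HSM" is an injected callable rather than a real device, and Python can only zero a `bytearray` on a best-effort basis (copies may linger elsewhere in memory).

```python
import time
from typing import Callable, Dict, Set


class WorkingKeyCache:
    """Holds unwrapped per-work-item keys and zeroes them after an idle timeout."""

    def __init__(self, unwrap: Callable[[bytes], bytes], idle_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._unwrap = unwrap              # stand-in for the HSM unwrap call
        self._idle_seconds = idle_seconds  # the "X seconds" from the policy
        self._clock = clock
        self._keys: Dict[str, bytearray] = {}
        self._last_used: Dict[str, float] = {}
        self.needs_rekey: Set[str] = set()

    def get(self, item_id: str, wrapped_key: bytes) -> bytearray:
        self.evict_idle()
        if item_id not in self._keys:
            # First request (or first after a timeout): one round trip to the HSM.
            self._keys[item_id] = bytearray(self._unwrap(wrapped_key))
        self._last_used[item_id] = self._clock()
        return self._keys[item_id]

    def evict_idle(self) -> None:
        now = self._clock()
        idle = [i for i, t in self._last_used.items()
                if now - t >= self._idle_seconds]
        for item_id in idle:
            key = self._keys.pop(item_id)
            for n in range(len(key)):      # best-effort in-place zeroing
                key[n] = 0
            del self._last_used[item_id]
            self.needs_rekey.add(item_id)  # the caller rekeys on next retrieval


# Demo with a fake clock and a fake "HSM" that just reverses bytes (NOT encryption).
now = [0.0]
hsm_calls = []


def fake_unwrap(wrapped: bytes) -> bytes:
    hsm_calls.append(wrapped)
    return wrapped[::-1]


cache = WorkingKeyCache(fake_unwrap, idle_seconds=30, clock=lambda: now[0])
key = cache.get("item-1", b"wrapped-key")
cache.get("item-1", b"wrapped-key")  # served from memory, no second HSM call
now[0] = 31.0
cache.evict_idle()                   # idle for 31s: key zeroed, item flagged
```

The actual rekey step (generating a fresh working key and re-wrapping it through the HSM) is left out; the sketch only records which work items need it.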

You're kind of missing the point. The whole point of data at rest encryption is: if this drive dies and gets thrown into a dumpster, I don't want some random shmuck to pull it out, put it in a server, run a recovery program against it, and get my customer's SSNs.

Now, I would hope Amazon shreds drives that are declared dead, but I wouldn't want to risk my business on it.

It has almost nothing to do with protection against a targeted attack and everything to do with chain of custody.

So many developers think encryption means it's completely fine to have unrelated security failures.

Encryption at rest is a protection against physical theft, that's it.

Encrypting a hot relational database is madness, but I've seen several bad attempts at it anyway.

I agree in the general case but I think there are uses for it that only go slightly beyond the fully automatic version. For example selectively encrypting less frequently needed sensitive fields can give you more granular access control and logging, especially if you have to check in with another service to perform the decryption instead of having the key available locally.

Am I the only one enforcing a strict no database in kubernetes policy?

You're not the only one, obviously. But that stance is not really relevant anymore. Just about every database as a service company will run their databases in kubernetes now.

Honest question: Why is that? What is wrong with your DB running in Kubernetes and what do you suggest as alternative?

I consider anything in kubernetes disposable. I want to be able to lift and shift anything in a new cluster without any migration in a matter of minutes.

Anything "stateful" like a database breaks this paradigm.

I have nothing against databases used as a cache that can be "re-filled" upon re-creation, but I believe anything holding business-critical data should be held outside of a kubernetes cluster. Why? Because being one command away from deleting your StatefulSet, Helm release, etc. scares the shit out of me.

You can of course minimize the risk with correct RBAC and ensure proper backup/restore migrations, but that requires a lot of staff and effort I can't spare.

So until I can be reassured that I have all the tooling to rapidly recover from any catastrophic failure/mishap, and that all this tooling is tested monthly, I enforce using managed database services.

if you set a retain policy on persistentvolumes it will prevent your volumes from being deleted even when you delete the owning objects. cloud providers will keep the virtual drive in those cases even if you delete everything

regardless, it’s the wrong thing to fear. this is at the level of logging in every user as root on your servers and databases because proper user management would require extra staff and efforts you can’t spare.
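For anyone wondering what the retain policy mentioned above looks like in practice, here is a rough sketch. It builds the manifests as Python dicts and prints them as JSON, which `kubectl` accepts alongside YAML. The StorageClass name and the `ebs.csi.aws.com` provisioner are assumptions for illustration; swap in whatever your cluster uses.

```python
import json

# Patch body for an existing PersistentVolume. It could be applied with
# something like:  kubectl patch pv <pv-name> -p '<the JSON printed below>'
pv_patch = {"spec": {"persistentVolumeReclaimPolicy": "Retain"}}

# Dynamically provisioned volumes inherit the policy from their StorageClass,
# and the default there is "Delete".
# Name and provisioner are hypothetical, chosen only for this example.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "retained-db-storage"},
    "provisioner": "ebs.csi.aws.com",
    "reclaimPolicy": "Retain",
}

print(json.dumps(pv_patch))
print(json.dumps(storage_class, indent=2))
```

With `Retain`, deleting the claim leaves the PersistentVolume in the `Released` state with its data intact, so it has to be reclaimed or cleaned up manually.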

There are databases where individual nodes are disposable. A lot of managed dbs aren’t exactly zero ops and are hard to service without downtime too

Performance is a common reason. There are architectural incompatibilities between high-performance database engines and the way Kubernetes is designed to work. Kubernetes does not respect the traditional contract between the OS and the database engine, transparently interfering with that interface in adversarial ways that degrade important optimizations. Ironically, the performance loss can be substantially worse than running databases in a properly configured virtual machine -- virtual machines have some overhead but they otherwise don't interfere with the proper functioning of this software. Kubernetes wasn't designed to efficiently support the syscall and hardware utilization patterns of I/O intensive applications and this is evident throughout, even requiring non-standard hacks to set things up in Kubernetes that really shouldn't be necessary.

Deploying databases in Kubernetes is fine for many applications, I've done both. Not every application that uses a database is data intensive.

Do you have any specific examples of Kubernetes interfering with this contract? I've not heard of this kind of behaviour before.

All high-performance database kernels do full kernel bypass e.g. they control their storage hardware, CPU affinity/context-switching, memory, network, etc explicitly. For all practical purposes Linux turns into little more than a device driver. This enables integer factor gains in performance via various optimizations. Ironically, it also makes the code simpler because behavior is explicit.

Linux is specifically designed to support this type of usage. The necessary syscalls were added decades ago, originally to support databases. Kubernetes intercepts these syscalls because they break its abstractions; while they appear to function like the underlying kernel syscall, the resultant behavior is not the same and generally unsuitable for these types of database architectures. The practical effect is degraded and unpredictable performance because it violates invariants that core optimizations rely on.

This has been kicked around by Kubernetes people for years, including within my own orgs because we use a lot of Kubernetes. No one has ever been able to make this type of software achieve comparable performance, even when we've used a lot of hack-y workarounds. Kubernetes was not designed to allow software to interact with the Linux kernel in this way. Consequently, this type of software is deployed on VMs or bare metal in practice, even if everything else is on Kubernetes.

Sounds like parent misunderstood how PDs work in k8s or maybe is referring to THPs - still doesn’t make a ton of sense

No. You are not alone. There are still a number of organizations that are cautious about deploying databases in Kubernetes. That said, various third party surveys as well as anecdotal evidence of what we are seeing at Crunchy Data suggests that deploying databases on Kubernetes is increasingly common. The degree to which it makes sense often depends on whether or not the organization is standardizing around Kubernetes for their deployment model more generally.

I'm hoping these kinds of policies continue to be phased out.

The Kubernetes world has changed a lot in the past few years in ways that make databases-in-k8s more appealing. Such as:

- Kubernetes "eating the world", meaning some teams may not even have good options for databases outside k8s (particularly onprem).

- Infrastructure-as-code being more prevalent. Since you already have to use k8s manifests for the rest of your app, adding another IaC tool to set up RDS may be undesirable.

- The rise of microservices, where companies may have hundreds of services that need their own separate data stores (many of which don't see high enough traffic to justify the cost of a managed database service).

- Excellent options like the bitnami helm charts: https://github.com/bitnami/charts or apparently Vitess (haven't used it myself): https://vitess.io/

Obviously if the use-case is a few huge, highly-tuned, super-critical databases, managed database services are perfect for that. But IMO a blanket ban might be restricting adoption of some more modern development practices.

I recently wrote a piece [0] on why I believe you should run your databases in Kubernetes. FYI.

[0]: https://thenewstack.io/kubernetes-will-revolutionize-enterpr...

No, other people also need consistent performance and a simpler system that works in practice.

I have a strict policy of not running my own database at all. Does that fit within your policy, or does it violate your policy if my database-as-a-service vendor uses a container orchestration platform?

No it does not. I don't run databases in kubernetes because I don't trust myself to be able to recover from disasters.

How a database-as-a-service vendor runs its services is none of my business as long as they deliver the performance I need and working backup/recovery procedures.

why is that?

Because resource intensive stateful workloads with persistent data are basically k8s’s Achilles heel. It’s not that k8s can’t handle it, it’s just that you get pretty much no benefits from k8s and so the extra configuration overhead is rarely worth it compared to running an external db cluster.

Was hoping for something a little more profound than "use an encrypted storageclass for your volumes".

Yes, what they explained is entirely based on an encrypted EBS volume. Nothing related to Kubernetes apart from the `encrypted: true` in the StorageClass.
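For reference, a StorageClass along those lines might look roughly like the sketch below, again built as a Python dict and printed as JSON for `kubectl`. This is not taken from the article: the helper name, the class name, the `gp3` volume type, and the choice of the `ebs.csi.aws.com` provisioner are all assumptions made for illustration.

```python
import json
from typing import Optional


def encrypted_storage_class(name: str, kms_key_id: Optional[str] = None) -> dict:
    """Build a StorageClass manifest that asks for encrypted EBS volumes."""
    # StorageClass parameter values must be strings, hence "true" and not True.
    params = {"type": "gp3", "encrypted": "true"}
    if kms_key_id:
        # Optional: use a specific KMS key instead of the account default.
        params["kmsKeyId"] = kms_key_id
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",
        "parameters": params,
        "volumeBindingMode": "WaitForFirstConsumer",
    }


manifest = encrypted_storage_class("encrypted-gp3")  # hypothetical class name
print(json.dumps(manifest, indent=2))
```

Any PersistentVolumeClaim that names this class would then get an encrypted volume, which supports the point above: the encryption is handled entirely by the storage backend, not by Kubernetes or Postgres.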

Tangentially related: what's state-of-the-art for data protection & access control for small organizations? One runs into the "someone's gotta be trusted with the master keys" problem there so early & often that all the "big" solutions feel silly. Do small shops just farm this out via SaaS and hope their provider's doing the right thing?

(the answer back in the day, and perhaps still, was just "they don't really worry about it at all, and hope nothing goes wrong")

This post was really laying it on a bit thick on the marketing, with three mentions of their own products before even finishing the introduction. I know that’s the point of most of these posts but then when the content was also a product, it’s too much. Pass.
