One thing I think is missing from this write-up is a walkthrough of how the restore process will work with encrypted data under pgsodium.
Namely, what will happen when you first restore some data into a new Postgres instance that booted with its own randomly generated root key (the wrong key), and how are you supposed to patch in the correct key so you can start reading secrets again?
Also, how does the decrypted view look if you try to read it with the wrong key loaded?
Do you have to worry about a race condition where you boot an instance with some encrypted data but forget to put the key file in place, end up with a new random key, save some new data, and now have a mix of rows encrypted with two different keys? Or will the whole subsystem block if there's data stored that can't be decrypted with the resident key?
> Namely, what will happen when you first restore some data into a new Postgres instance that booted with its own randomly generated root key (the wrong key), and how are you supposed to patch in the correct key so you can start reading secrets again?
We restore your original key into new projects. There is also WIP on accessing the key through the API and CLI.
> Also, how does the decrypted view look if you try to read it with the wrong key loaded?
The decryption will fail (pgsodium will throw an error).
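For anyone wondering what that looks like in practice, here's a minimal sketch assuming the Supabase Vault's `vault.create_secret()` helper and `vault.decrypted_secrets` view (the secret value and name are made up):

```sql
-- Store a secret; the Vault encrypts it on write
SELECT vault.create_secret('sk_live_abc123', 'stripe_key');

-- With the correct root key loaded, the view decrypts transparently
SELECT name, decrypted_secret FROM vault.decrypted_secrets;

-- With the wrong root key loaded, the same SELECT raises a pgsodium
-- decryption error rather than returning garbage (exact error text
-- varies by version)
```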
> Do you have to worry about a race condition where you boot an instance with some encrypted data but forget to put the key file in place, end up with a new random key, save some new data, and now have a mix of rows encrypted with two different keys? Or will the whole subsystem block if there's data stored that can't be decrypted with the resident key?
There's no race in the system; your key is put in place by us before the server boots.
Thanks for the feedback! I'll put some more thought into your question about authenticating that a key is the original before you use it.
Thank you for the quick reply! I’m not a Supabase customer so apologies if the questions don’t make sense in your context.
But I think it would help to understand: if Supabase is fully managing key backup and recovery internally, how exactly does that work?
Ultimately the whole value of TDE at the database layer comes down to two things IMO, which are flip sides of the same coin:
1) Being able to store your database backups in less trusted locations,
2) actually keeping the secret data secret, which amounts to keeping that encryption key secured at a much higher level than the database backup itself.
In the end it’s just key vaults all the way down, isn’t it!
> But I think it would help to understand: if Supabase is fully managing key backup and recovery internally, how exactly does that work?
Supabase persists and protects your key, and we will provide API and CLI access to retrieve it securely. This is a pre-release, so we haven't worked out all the use cases yet, but those are the basics for the MVP.
> 1) Being able to store your database backups in less trusted locations,
Yes. Using Transparent Column Encryption, you control on a column-by-column basis which data is stored encrypted, giving you more fine-grained control over your data.
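As a sketch of what that column-by-column control looks like with pgsodium's Transparent Column Encryption (table, column, and key names are made up, and the exact label syntax may differ across pgsodium versions):

```sql
-- Create a server-managed key; pgsodium returns a row whose id is a UUID
SELECT * FROM pgsodium.create_key(name => 'card_key');

CREATE TABLE customers (
  id bigserial PRIMARY KEY,
  credit_card text
);

-- Label the column so pgsodium encrypts it transparently on write;
-- the UUID below is a placeholder for the key id created above
SECURITY LABEL FOR pgsodium ON COLUMN customers.credit_card
  IS 'ENCRYPT WITH KEY ID 7095aea1-0e11-4ec9-8f64-8a0d9daf0578';
```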
> 2) actually keeping the secret data secret, which amounts to keeping that encryption key secured at a much higher level than the database backup itself.
Yep, we don't have all the answers there; keeping the root key out of SQL is a big one. Maybe requiring MFA to access the key even via the API; there are a lot of possibilities. Thanks for your feedback, these are all going into my notes for an upcoming release.
I’m really impressed with everything Supabase does, but…
They market themselves as the “open source alternative to Firebase”. Which is great, mainly because you don’t have to worry about vendor lock-in (to an extent).
Yet one of the main selling points of Firebase (at least in my humble opinion) is that you don’t have to concern yourself at all with implementation details and stuff like that. The learning curve is small, you get a database without having to think about databases.
Yet everything I read about Supabase is heavily centered around Postgres, it seems like you really need to know the ins and outs of the database. I wouldn’t really feel comfortable adopting Supabase without taking a class in Postgres first.
I’m wondering if Supabase plans to stay “low level” or give a higher level of abstraction to those who want it.
Edit: just want to clarify, I’m not saying “sql bad”, I’m saying there’s a not-so-small market (mostly beginners) who would see this as a big adoption barrier, which I think is understandable. I don’t know if Supabase wants to (or even should) cater to both markets.
My experience is that Firebase requires you to understand the ins and outs of Firebase, which has no real equivalent. Firebase is notorious for pathological cases and performance cliffs and other "gotchas"; it isn't magic. Knowing what's going to perform poorly or become unmaintainable or otherwise cause problems requires you to have either prior knowledge or done something wrong and learned the hard way. At least with Supabase, if you know about Postgres, you can bring that knowledge with you.
Exactly, you do have to gain some understanding of Postgres yes, but it's SQL at the core which IMO is what you want 90%+ of the time, and you're not locked into their platform. When your company gets larger and you're ready to start wasting VC money on db admin and other problems that have already been solved, you can rip out Supabase and all the SQL will still work.
It’s funny reading that comment from the other side of the fence. I’ve not looked closely at Supabase so I have no real opinion on it, but hearing someone say that you need to know Postgres to work with it is reassuring to me.
Edit: don't take that as a criticism, just more of an observation that there's a target audience for which it probably hits a sweet spot.
Honestly, only to the extent that you need to set up your schema. But if your queries aren't too complicated, you can just use the client which is fairly straightforward.
This is true, but for me the transparent abstraction over Postgres is actually a big plus, though I can see that people who don't know Postgres or SQL would be a little intimidated. I will say that Postgres is the best SQL DB I've worked with and has become my go-to.
In my experience there's no free lunch when it comes to high level abstraction over complicated systems. Also, having the option to draw upon the mountain of docs and info on the net about Postgres is nice to have in your back pocket. Of course the tradeoff is that you need to know SQL but I think that's a fair tradeoff.
I would like to see some more improvements to the Supabase JS client API, but I hope they don't hide the fact that there's a relational DB under the hood, and that they keep allowing advanced access to the underlying Postgres API.
I could see them making a NoSQL Supabase over something like a Mongo-type DB, as AWS does with DocumentDB, or even over Postgres JSONB fields. That would be a nice feature. You could probably get a lot of mileage out of Postgres JSONB fields.
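As a rough sketch of the JSONB route (table and field names invented for illustration):

```sql
-- A document-style table: one JSONB column holds the whole record
CREATE TABLE documents (
  id bigserial PRIMARY KEY,
  body jsonb NOT NULL
);

-- A GIN index makes containment queries fast
CREATE INDEX ON documents USING gin (body);

INSERT INTO documents (body)
VALUES ('{"user": "alice", "tags": ["pro"], "plan": {"tier": "free"}}');

-- Query by nested fields, Mongo-style
SELECT body->'plan'->>'tier' AS tier
FROM documents
WHERE body @> '{"user": "alice"}';
```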
I haven't used Firebase much except for toying around with it, but I think it's certainly a good option as a simple NoSQL DB for simplicity and speed of ramping up. The only thing with Firebase is that the cost is prohibitive at larger scale, and you're going to be coupled to them when you get to that point, so it could come as a rude awakening when your app starts to get a lot of users.
You just learn Postgres/SQL as you go. And I've gotten much better at it (schema design, functions, querying) after adopting Hasura (similar idea as Supabase). It's an investment that will pay off for any developer and will outlast whatever cool framework of the month.
But yeah, there's room for more higher-level abstractions on top of SQL databases. Metabase actually has a nice UI for building queries. Maybe something like this would be useful in Supabase: https://www.metabase.com/docs/latest/questions/query-builder...
Yes. What I'm saying is the Metabase query builder is cool. Could be really useful for making a database view this way (generating the schema change in the background).
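Something like this is presumably what such a builder would emit behind the scenes (a hand-written sketch with invented table names, not actual Metabase output):

```sql
-- The kind of view a visual query builder could generate for you
CREATE VIEW recent_customer_orders AS
SELECT c.id, c.name, count(o.id) AS order_count
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.created_at > now() - interval '90 days'
GROUP BY c.id, c.name;
```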
With Firebase you have a team managing the service uptime.
When I last checked, Supabase is a group of processes that you manage yourself.
This means that:
- A. If something goes wrong or you need to customise something, it would be quite complex to fix as you have all these different processes and code bases to understand. The sum of depended-on lines of code for all the open source code bases in Supabase would be massive.
- B. You are tightly locked in. Once you code against the Supabase APIs you will not be able to move your app off of it. Other APIs lock you in too, but because Supabase does so many things you would need to replace a lot of functionality all at once to move away.
Supabase developer here. Firebase has a team managing service uptime. Supabase has a team managing service uptime. If you self-host Supabase, you have to manage uptime yourself. You can't self-host Firebase.
Regarding lock-in, you're pretty much right here, but this is going to be true of your entire stack. If you choose to develop your frontend in React, or Angular, or Vue, you're going to be locked into that framework.
"...because Supabase does so many things..." is a good thing, IMHO. You can choose to use any or all of our product, and each piece you choose is open-source. If, say, you choose to use Supabase Storage, and you have an issue with it, you can switch to something else but still use Database, Auth, and Functions without bringing down your entire project.
At some point you're married to something, right? At least you can run Supabase self-hosted, even if that has warts.
You just can’t do that with Firebase
Though I’d argue that people overthink the value in being able to self host “just in case”. If it's ever truly a concern for you, you should use more vendor-agnostic solutions.
> people overthink the value in being able to self host “just in case”.
This. I'm guilty as charged here over the years. As I've grown older I've realized a few things. Nothing is, or ever will be perfect. Nothing lasts forever, so trying to build for what might happen in the future usually hampers what you do in the present. (IOW, don't worry about what might happen. Just build with what you have now and do the best you can. If what you build lasts until the next wave comes and makes it all obsolete, call that a win.)
If Supabase goes away at least your schema and data are still in Postgres.
What happens if Firebase goes away? Or you outgrow the NoSQL model (which you will).
What happens when you get acquired by big Java corp? They're going to toss aside your web layer and rewrite it in some old version of Java. But they will keep your data model and that's easier to do with SQL.
Supabase developer here. True to an extent, but at least with the data, it's PostgreSQL, which you could take somewhere else. Or you could easily port it to another brand of SQL and do something else with it.
And as far as being "locked into the software", isn't that pretty much true of your entire stack? Once I choose to develop in React, I'm locked into that, right?
Agreed. I used Supabase for a fairly simple project and felt like I had to know a lot about Postgres to implement anything. If you’re building something yourself, I feel like Firebase is still the safer bet. I’m guessing Supabase really shines when you’re building a startup or have a team.
In my humble opinion, if you're a software engineer in the modern world, then learning Postgres is about as fundamental to your job as learning to dribble would be to a job as an NBA basketball player. It is the just the foundation of almost everything else.
I agree 'software engineer' is too broad, but it'd definitely take non-trivial effort (and perhaps some otherwise pointless resigning) to avoid it in domains/companies/roles that could use it, or similar alternatives.
It's bullshit like this that makes me hate boomers and/or stuck-up and/or snobby and/or ignorant software engineers, who, in the end, maybe aren't actually snobby, but just ignorant.
YoU cAn Go YoUr EnTiRe CaReEr AnD nOt UsE iT!!!
Sure, this is true if:
- you don't work for / build / care about apps that have a persistence layer and serve more than about... let's say 20K daily users
- you don't care about performance
- you are confused
Postgres over:
- Mongo: Postgres gives you ACID guarantees, whereas with Mongo at scale you aren't sure you've saved ANYTHING; there are multiple blog posts and humorous videos about this, I leave hunting them down to your discretion (see the transaction sketch after this list)
- MySQL: don't even get me started; it doesn't have any sort of plugin possibilities and is slower, performance-wise, in literally ANY benchmark
- LiteDB: I know it's the Hacker News hipster rage, but seriously, you're going to run your entire backend through IO on a single file? OK, enjoy that one
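To make the ACID point concrete, a minimal sketch (table name invented): either both updates commit, or neither does.

```sql
BEGIN;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- If anything above fails (constraint violation, crash, etc.), the
-- whole transaction rolls back: no half-applied transfer
COMMIT;
```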
Sorry for the rant, I know it's not conducive to the Hacker News mentality, but I've heard this rage and poking fun at Postgres so many times, and nearly all of it has absolutely NOTHING to do with Postgres' technical performance and much more to do with ego or some bullshit affiliation with some company, and I'm sick of all of it and finally laying down the law:
Postgres is one of the BEST (if not THE BEST, bar none) databases currently available.
> Postgres is one of the BEST (if not THE BEST, bar none) databases currently available.
I would certainly expect the best database out there to be relatively straightforward to scale out. Postgres isn't. As a former SRE, redundancy > performance (for the differences we're talking about).
I'm so excited for Supabase. As soon as they move Realtime Subscriptions out of alpha / beta, I will replace Firebase on all new projects. The Firebase / Firestore analog - Snapshot Listeners - give your application a real-time backend for free and simplify state management drastically, since your subscriptions are your store.
Supabase being built on SQL is interesting to me. I love PSQL, and the row-level security rules are incredible. But the historical SQL vs. NoSQL debate involves the trade-offs of Consistency, Availability, and Partition Tolerance [0]. With Firebase (and typically NoSQL) you lose Consistency, and you get a bit of redundancy by virtue of using onWrite listeners as opposed to Joins. That model scales really well since it's amenable to seamless sharding. What will scaling a Supabase backend look like?
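On the RLS point, for anyone who hasn't seen it, a minimal sketch of a row-level security policy (table name invented; `auth.uid()` is assumed to be the Supabase helper that returns the current JWT's user id):

```sql
CREATE TABLE todos (
  id bigserial PRIMARY KEY,
  owner uuid NOT NULL,
  task text
);

ALTER TABLE todos ENABLE ROW LEVEL SECURITY;

-- Each user can see and modify only their own rows
CREATE POLICY "own rows only" ON todos
  USING (owner = auth.uid());
```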
Hmm... I feel like secrets are the one thing I don't want to be in Postgres... because I want to store my Postgres credentials in the secrets vault! And I certainly don't want to have to update the configuration for every service which accesses my secrets vault every time I upgrade my Postgres database (and the access URL changes).
IMO nobody's doing secret management for small companies / products particularly well, so there's definitely a niche to be filled here. But I'm not quite convinced this is it...
As a security guy, I'm always worried about secrets living in Env variables because it's an easy place for them to leak. (Many loggers will automatically log env vars, for example.)
That's why many services, like Kubernetes, have moved away from this model by either serving the secrets up in a runtime-mounted file (like /var/secrets.yaml) or by requiring you to make an explicit API call (SecretsManager.readSecret("foo")).
From a security perspective, those paths require a much more difficult exploit like full Remote Code Execution (RCE) in order to leak values.
The downside is that it requires modifying application logic to migrate away from Env vars though. Usually it's pretty easy, but if you have tons of legacy code I'm sure that often presents a challenge.
> Hmm... I feel like secrets are the one thing I don't want to be in Postgres... because I want to store my Postgres credentials in the secrets vault! And I certainly don't want to have to update the configuration for every service which accesses my secrets vault every time I upgrade my Postgres database (and the access URL changes).
Password storage is a somewhat different problem. If you're checking passwords, you just need to know the password is authentic, not the actual password itself, so it's common to use hashing and salting techniques for this (pgsodium exposes all of the libsodium password hashing and short hashing functions if you want to dig further). For Postgres logins specifically, your best bet is to use SASL with SCRAM auth.
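A minimal sketch of that hash-and-verify flow using pgsodium's libsodium bindings (exact signatures may vary by pgsodium version):

```sql
-- Hash at signup (the salt and cost parameters are embedded in the
-- returned hash), then verify at login against the stored value
WITH signup AS (
  SELECT pgsodium.crypto_pwhash_str('correct horse battery staple') AS stored_hash
)
SELECT pgsodium.crypto_pwhash_str_verify(
  stored_hash,
  'correct horse battery staple'
) AS password_ok
FROM signup;
```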
Secret storage is more about encrypting and authenticating data whose value is useful for you to know. For example, you need the actual credit card number to process a payment (waves hand; this is a broad subject, and some payment flows do not require knowledge of the CCN), but you want to make sure that number is stored encrypted on disk and in database dumps. That's the use case the Vault is hitting.
We also have some upcoming support for external keys that are stored encrypted, so for example you can store your Stripe webhook signing key encrypted in pgsodium and reference it by a key id that can be passed to `pgsodium.crypto_auth_hmacsha256_verify()` to validate a webhook callback, instead of using the raw key itself.
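Presumably something along these lines (a sketch: the signature, payload, and key id are made up, and the key-id argument shape is my assumption):

```sql
-- Verify a webhook signature against a server-managed key referenced
-- by id, so the raw signing key never appears in SQL
SELECT pgsodium.crypto_auth_hmacsha256_verify(
  decode('9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08', 'hex'),
  convert_to('{"event":"payment_intent.succeeded"}', 'utf8'),
  42  -- key id for the stored signing key (illustrative)
);
```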
Ideally, you could have a Postgres instance specifically dedicated to secrets - I don't see why you should couple sensitive and non-sensitive data. Many OSS services like HashiCorp Vault do just that: you give Vault a backend (which can be a Postgres DB, just like the one Supabase is offering) and it's going to use that to save the secrets.
You could then use (e.g.) OpenID to connect to the specific instance of Supabase with those secrets from your application
We are considering running the Vault in Trusted Execution Environments (TEEs), which are similar to encrypted VMs, where the memory traffic to the CPU is encrypted until it hits the processor. We're still investigating this possibility, but it would make for a more secure cloud environment for sure. Of course AWS charges quite a premium for them!
HashiCorp Vault is always my go-to, even for small companies. It seems like too much, but it's really not. A single instance is scalable enough to handle quite a bit of traffic.
Another good alternative, if you need something more SaaS-y, is the 1pass API product.
I felt the same, but it was too hard to find people who knew how to operate Vault, so we abandoned it; it was too risky to have such a critical part of our infra without an abundance of talent out there.
I'm confused about why secrets management is considered secure. Maybe I'm missing something.
Why is letting a third party manage your secrets secure? If that third party gets compromised, they now have access to all your secrets. Amazon or other companies' employees can also view your secrets.
If your server gets compromised, the secrets that are accessible via that server are also compromised. Isn’t that the same impact as just keeping the secrets on your server? Maybe worse if your permissions are broad. You’re merely adding an extra step to get the secret from your secret management.
Speaking for EnvKey (mentioned above—I’m the founder), we use client-side end-to-end encryption to address this concern. Secrets cannot be accessed on an EnvKey server.
I’m biased, but I share your skepticism of secrets management services that don’t use end-to-end encryption. It’s not a wise choice for either the service provider or its users.
If I need access to a decryption key to read my secrets or to provide my secret to a process, I still have to manage my decryption key, which means I might as well use that process to manage my secret.
...and you managing your own secrets is way better than a third party?
Wake up, people: it's all the same types of servers managing the same types of passwords with the same types of security layers; not one is better than the other! Nobody has a 'secret sauce' for storing your passwords.
What I don't understand (perhaps I haven't found the right docs to read) is how to safeguard the secret if a client machine of the secret is compromised. Say I have a web server that's connecting to the database, and the database credentials are stored in some separate vault. If someone gets access to the web server machine, can they not access the value from there?
So I've actually spent about a year of my life working to solve this exact problem. Specifically: How do you prevent a single point of failure from leaking everything sensitive in a database.
It turns out that it's a pain in the rear, but it's possible. You can read through the docs about the design on the site[0].
The parts that I haven't implemented yet, and that limit its utility in production, are around searching the encrypted data (requires a second vault using asymmetric encryption) and some more in-depth disaster recovery (secure token recovery).
If you give a database client access to the decrypted secrets, then they have them. What the client will not have access to is the hidden root key, not accessible from SQL, that pgsodium uses to encrypt and decrypt data.
The Vault will not prevent someone who has login access to your database and the right grants (or superuser) from decrypting the data. If someone is in this position they are fully compromised and the Vault is not protection against that (nor is anything else really).
In particular, if an attacker has a Postgres superuser login, they can essentially act as the OS process owner and could possibly get around the process hardening we already employ to reduce that risk, but again, the Vault is not designed to protect against a full superuser exploit. You must carefully guard database login access.
However, the secret data that is stored on disk, in WAL logs, and in database dumps is encrypted. This way you are assured that your secrets are encrypted at rest. The Vault also lets you use standard Postgres privilege access control (via GRANT/REVOKE) to control access to the decrypted data.
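Presumably that looks something like this (a sketch against the Vault's decrypted view; the role name is invented):

```sql
-- Lock the decrypted view down by default...
REVOKE ALL ON vault.decrypted_secrets FROM PUBLIC;

-- ...then grant read access only to the role that actually needs it
GRANT SELECT ON vault.decrypted_secrets TO payments_service;
```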
I wasn't talking just about pgsodium or the Vault product, but about similar products in general.
I understand the point of the database client having access to the database key and not the key to the secret vault. So in this case other secrets in the vault are essentially protected. But let's say I really have just this one secret to protect; in that case, is the vault fairly pointless?
Is it essentially that if a client uses KeyX for some purpose, then a compromise of said client will lead to KeyX leaking, and there's really no way to protect it?
The Supabase Vault is encryption at rest: the column is stored encrypted in the database, in WAL streams, and in backup dumps. This is usually more efficient than dealing with full disk encryption, and it allows you to control who sees decrypted data on a role-by-role basis using normal Postgres security GRANTs.
With full disk encryption you also only get encryption on that one disk: if you are doing WAL shipping, the disk you are storing the DB on may be encrypted, but the WAL files you ship will not be, so you have to make sure those files are encrypted through a full chain of custody. With the Vault, the data starts off encrypted before going into the WAL stream. Downstream consumers would need to also acquire the hidden root key to decrypt it. We're working on making that process seamless but also secure.
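To make the at-rest point concrete, a sketch assuming the Vault's two-level layout of a `vault.secrets` table plus a `vault.decrypted_secrets` view:

```sql
-- The underlying table stores ciphertext; this is the representation
-- that lands on disk, in the WAL stream, and in pg_dump output
SELECT secret FROM vault.secrets;

-- The companion view decrypts on read, gated by normal Postgres GRANTs
SELECT decrypted_secret FROM vault.decrypted_secrets;
```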
All data goes in _a_ database; we're just providing an extension in case you put sensitive data in your own. Developers often store sensitive data, and this extension ensures that it's encrypted at rest so that it doesn't leak into logs and backups.
Specifically for Supabase customers, we have another extension called pg_net, which can send database changes to external systems asynchronously (called “database webhooks”). One of these systems could be, for example, AWS Lambda, but to do that we will need a Lambda execution key. Vault allows users to safely store this key inside their database, and because it’s co-located with the data the payload can be sent immediately via a trigger (and end-to-end encrypted).
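A rough sketch of that flow (pg_net's `net.http_post` plus a Vault lookup; the table, URL, secret name, and trigger wiring here are all illustrative):

```sql
CREATE OR REPLACE FUNCTION notify_lambda() RETURNS trigger AS $$
BEGIN
  PERFORM net.http_post(
    url := 'https://example.execute-api.us-east-1.amazonaws.com/hook',
    body := to_jsonb(NEW),
    headers := jsonb_build_object(
      'Authorization',
      -- fetch the Lambda execution key from the Vault at call time
      (SELECT decrypted_secret FROM vault.decrypted_secrets
       WHERE name = 'lambda_execution_key')
    )
  );
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_webhook
  AFTER INSERT ON orders
  FOR EACH ROW EXECUTE FUNCTION notify_lambda();
```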
Vault will expose a lot of libsodium functions that are useful to developers - encrypting columns, end-to-end encryption, multi-party encryption for things like chat apps, etc
Sorry, can you help clarify your comment? Do you mean that it's better to not call this "Supabase Vault" and just say "Secrets Management available in Supabase" ?
I figured there would be a comment like the one to which you responded, but didn't expect it to be the bottom one, downvoted to obscurity. Vault is an already heavily used word, with HashiCorp being the big player with it, and Ansible a second. There are a lot of words that could be used, and it is kind of a shame that one already associated with a big player in the secrets management game was the one used here.
For what it's worth, I think keeping it named Vault helps with exactly your intent: it signals that the product is a secrets management product (or something used to store extremely valuable data).