I think the most interesting part of this is the PaaS disaggregation. Heroku built an exceptionally good Postgres service. They could not have done that with multiple DBs. Even their redis is pretty meh.
People like us (Fly.io) will end up either building very mediocre DB offerings or collaborating with DB companies (like yours: https://www.crunchydata.com/products/crunchy-bridge/) to ship stuff that's substantially better than RDS. I'm looking forward to it. Down with mediocre DB services.
Oh, thanks for the props, Kurt. The idea of PaaS disaggregation is definitely something I've been pondering for a bit, and I think I'm on the same page. At Heroku we were very opinionated about only Postgres for the longest time. We were fortunate to build out the add-on ecosystem to give you more options, and explicitly did not want to run them ourselves for the longest time. I'm not sure how lucky we were vs. good, but surely spreading ourselves out to be platform + Postgres + everything else would not have resulted in the same experience.
It turns out running a PaaS is a lot of work. Running a DBaaS is also a lot of work. If you've ever dealt with database corruption, you know it's not something you can hand-wave and handle the same way across all databases. You need deep expertise in it, or else you say "it's not my problem" and offer a poor customer experience. Personally, it feels like we can do better on service quality as platform/database providers, so doing one thing really well feels like a good direction to head in for a bit (at least that's what we're betting heavily on at Crunchy Data by focusing deeply and purely on Postgres).
Genuinely curious: this is the first time I'm hearing of Crunchy Data in a "versus RDS" context.
Is there a pricing and feature comparison for RDS vs. Crunchy Data? An honest tradeoff comparison.
We don't have anything published, but here's the basic summary of why us:
- We give you full Postgres superuser access, so fewer restrictions.
- Quality of support, whether it's the Postgres basics of indexing or you've found a crazy bug deep in Postgres. Our team contributes a lot to upstream Postgres itself and can go as deep as we need to, but in general quality of support is a big differentiator for us.
- In some cases we've been able to win on price to performance, just from the mix of Postgres experience coupled with our experience running on AWS/other clouds.
- Not locked into a single cloud; you can go from AWS to Azure and vice versa with the click of a button.
There are more details, and more coming, particularly around the developer experience and proactively improving your database for you. But that's the high-level pitch.
I'm curious about what "substantially better than RDS" means. RDS has been good enough for me for quite a while. Does it only matter once you get to a certain scale?
Craig here from Crunchy, the company he's referring to. Not sure what he has in mind, but having built a lot of Heroku Postgres in the early days, I definitely have thoughts on what can make a database great. There is a big gap between what most developers know and what you need to know to efficiently run Postgres. Without tipping too much of our hand, we're focused deeply on building an amazing developer experience for Postgres. Some things we're thinking about are how we can actively detect N+1 queries (common with almost every ORM: Rails, Django, etc.) and notify you about them. We already have some big differences, like shipping with connection pooling built in so you can easily scale to tens of thousands of connections. Really, any production Postgres setup should be running with pgbouncer, whereas on a lot of providers it's either not an option or you're left to your own devices.
Good enough may be absolutely fine for a lot of people, but no lock-in to a single cloud, better developer tooling, proactive alerts and recommendations, quality support all feel like an opportunity to be better.
Kurt may have entirely other things in mind, and I'd be all ears if there is low-hanging fruit in terms of features or experience that would make Postgres even better for folks.
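To make the N+1 idea above concrete, here is a minimal sketch of one way such detection could work: strip literals out of each query to get its "shape", then flag shapes that repeat many times within a single request. This is an illustration only, not how Crunchy Data or any other provider actually does it; the names `normalize` and `find_n_plus_one` and the threshold of 5 are assumptions made up for the example.

```python
import re
from collections import Counter

# Matches single-quoted string literals and standalone numeric literals.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def normalize(sql):
    """Reduce a query to its shape: literals become '?', whitespace collapses."""
    return " ".join(_LITERAL.sub("?", sql).split()).lower()


def find_n_plus_one(queries, threshold=5):
    """Return {shape: count} for query shapes seen at least `threshold` times.

    `queries` is assumed to be the SQL issued while serving one request.
    """
    counts = Counter(normalize(q) for q in queries)
    return {shape: n for shape, n in counts.items() if n >= threshold}


# One query for the list, then one query per row: the classic ORM pattern.
request_log = ["SELECT * FROM posts LIMIT 10"] + [
    f"SELECT * FROM authors WHERE id = {i}" for i in range(1, 11)
]
suspects = find_n_plus_one(request_log)
print(suspects)
```

A real detector would also need to group queries per request or transaction and ignore legitimately repeated statements; this sketch skips both.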
Some examples of the things I've missed around developer experience for a database, that Craig and the team made possible at Heroku Postgres, include:
- fork: ever had one of those "why does this bug only exist in production?" problems? It was so trivial to fork the DB and run your tests/hypothesis/whatever without the risk of actually impacting production. Same thing for _really_ testing a migration script or load test.
- follow: a similarly easy approach for getting a read replica which is super useful for generating reporting.
- dataclips: "hey, can you tell me X?" sure, and here's a URL to the results that you can refresh if you need an updated number in the future. So great for adhoc queries.
All of these are obviously doable with RDS and/or other solutions too. But the time taken to do any of the above was often measured in seconds, at most minutes. It's difficult to communicate just how impactful those kinds of improvements are to your workflow. It's like it subconsciously gives you permission to tackle whole new problems, build better solutions, get answers to questions you never thought to ask before. Because the barrier to entry is so low, you just do these things. You don't sit around wondering if you could.
A great developer experience around a database (one that goes beyond setup and basic ops) is a severely underappreciated thing IMO.
> - fork: ever had one of those "why does this bug only exist in production?" problems? It was so trivial to fork the DB and run your tests/hypothesis/whatever without the risk of actually impacting production. Same thing for _really_ testing a migration script or load test.
This sounds great! How does it work, though? Is it using some special Postgres feature, or btrfs snapshots, or something else completely?
Craig (the poster I jumped in to reply to) would know the specifics better than I ever did. My recollection is:
- restore from the latest snapshot (there was one whether you’d configured a custom backup schedule or not)
- replay the write ahead log over the top to catch the restore up to the point in time you asked for/when you ran the command. At least some part of this process leveraged WAL-E, which was a tool largely developed by Heroku employees and open sourced.
This was a decade or more ago though. The state of the art of postgres has moved on and I assume the team would tackle it differently if they were doing it today.
It's leveraging pretty native Postgres tooling that restores the base backup from within Postgres, then replays the WAL to the exact point in time you specify. With snapshots and other mechanisms you may get a database "up" sooner, but we've seen that when we follow that approach it takes so long for the PG cache to warm up that you effectively still have a useless database even though it's "up". Further, depending on how you do it, Postgres itself will have to go through crash recovery, which I've seen take over 10 hours on some providers.
Doing the native approach in Postgres isn't perfect, but we've focused on getting the developer experience for it down so you can use your database and it "just works", and if something goes wrong you understand how to roll back seamlessly.
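As a rough illustration of the "restore the base backup, then replay WAL to a point in time" flow described above, here is a toy planner that picks which backup and which WAL segments a fork would need. It is a sketch under simplifying assumptions: real Postgres tracks WAL by LSN and timeline rather than wall-clock ranges, and the actual replay is driven by settings such as `restore_command` and `recovery_target_time`. The function name `plan_restore` and the data layout are hypothetical.

```python
from datetime import datetime, timedelta


def plan_restore(base_backups, wal_segments, target):
    """Pick the newest base backup at or before `target`, plus the WAL to replay.

    base_backups: datetimes at which base backups completed
    wal_segments: (start, end) datetime pairs, one per archived segment
    Returns (base_backup_time, [segments to replay in order]).
    """
    candidates = [b for b in base_backups if b <= target]
    if not candidates:
        raise ValueError("no base backup exists before the target time")
    base = max(candidates)
    # Keep every segment that overlaps the window between base and target.
    replay = [(s, e) for s, e in wal_segments if e > base and s < target]
    return base, sorted(replay)


day1 = datetime(2024, 1, 1)
backups = [day1, day1 + timedelta(days=1)]  # one backup per day
wal = [
    (day1 + timedelta(hours=6 * i), day1 + timedelta(hours=6 * (i + 1)))
    for i in range(8)  # eight 6-hour segments covering two days
]
target = datetime(2024, 1, 2, 13, 30)
base, replay = plan_restore(backups, wal, target)
print(base, len(replay))
```

The point of the sketch is the shape of the work: the further the target is from the last base backup, the more WAL has to be replayed before the fork is usable.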
RDS runs pretty well! It's just irritating to use.
The good DBaaS give me a lot more power. This is true for Heroku PG, PlanetScale, Supabase, and Crunchy Data. Some of them let me fork a DB to run a PR against, some give me app level features that save me code, etc.
Most modern hosted DBs also let you run your own replicas.
I'm not really complaining about how well RDS works when your app is connected to it and it doesn't failover/go down for maintenance/etc. It works fine as a DB backend. That's just a baseline I don't think is very valuable anymore.
We’re using aiven.io and are quite happy, although it's hard for me to compare. You can port across clouds, which is reassuring if we need to switch. Otherwise, their support was helpful debugging a couple of DB issues (in our own code). I wonder how they compare in this matrix, if anyone knows?
aiven.io is quite good. They went broad instead of deep, so they're not as good at Postgres as the Postgres specific companies. But they're probably better Postgres than a PaaS can build by themselves.
It’s useful to know that others do Postgres better. We use their Redis as well, so in some ways breadth is useful for them and for us.
At the time we picked Aiven, we couldn’t find any Redis-specialized hosting with instances in GCP Europe, if I recall. So their breadth also helps in terms of locating near the customer's servers (important for latency).
You can easily use pg_dump to do a "vanilla" backup to S3. It's a managed DB service, but if you wanted to run your own you can extract your data and move to a new DB. The lock-in is not "complete"; you are acting like you can't even extract your data.
"Only export your data with pg_dump" is one of those misfeatures that makes RDS mediocre. They don't really expose much of the power of the underlying DB.
You cannot SSH into an RDS machine, so you need to get another EC2 machine and pg_dump over the network. The connection breaks; yes, this has happened to us multiple times.
RDS makes it very inconvenient to do anything other than use their managed services.
Because RDS backup data storage is VERY EXPENSIVE, even compared to S3. This is very deliberate.
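For anyone doing the pg_dump-over-the-network route described above, a minimal sketch of a dump wrapper with retries for dropped connections might look like the following. The helper names (`pg_dump_command`, `run_with_retries`, `dump`), the retry count, and the delay are assumptions for illustration; the pg_dump flags used (`--format=custom`, `--no-owner`, `--file`, `--dbname`) are standard ones.

```python
import subprocess
import time


def pg_dump_command(dsn, out_path):
    """Build the argv for a custom-format dump, which pg_restore can read."""
    return [
        "pg_dump",
        "--format=custom",
        "--no-owner",
        f"--file={out_path}",
        f"--dbname={dsn}",
    ]


def run_with_retries(run, attempts=3, delay_seconds=0.0):
    """Call `run()` until it succeeds; re-raise the error after the last attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return run()
        except (subprocess.CalledProcessError, OSError):
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


def dump(dsn, out_path):
    """Run pg_dump from a machine that can reach the database (e.g. an EC2 host)."""
    cmd = pg_dump_command(dsn, out_path)
    return run_with_retries(
        lambda: subprocess.run(cmd, check=True), delay_seconds=30
    )
```

Note that each retry restarts the dump from scratch, since pg_dump cannot resume, so this only helps with occasional connection drops. Copying the resulting file to S3 is a separate step (for example with `aws s3 cp`) and is left out here.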
Ah that explains seamless migration from Heroku and support for Heroku postgres on fly.io. I appreciate not building subpar services just to show 'Yeah we have that too'.
My current strategy is to use Heroku to test the waters. If the validation is successful, I then migrate to fly.io.