Heroku Postgres is now based on AWS Aurora (heroku.com)
211 points by mebcitto 11 months ago | 142 comments



A warning about Aurora: it's opaque tech. I've been on a project that switched to it on the recommendation of the hosting provider, and we had to switch away because it turned out that it does not support queries requiring temporary storage, i.e. queries whose working set exceeds the memory of the instances.

It manifested as the Aurora instances using up their available (meagre) memory, then thrashing and taking everything down. Apparently the instances did not have access to any temporary local storage. There was no way to fix that, and it took some time to understand. After reading what little material I could find on Aurora, my personal conclusion is that Aurora is perhaps best thought of as a big hack. I think it's likely there are more gotchas like that.
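
For anyone trying to confirm whether they're hitting the same thing: plain Postgres (and, as far as I know, Aurora Postgres) tracks spill-to-disk per database in pg_stat_database, so a quick check looks something like this (standard views, nothing Aurora-specific):

  -- How much has each database spilled to temp files since the last stats reset?
  SELECT datname, temp_files, temp_bytes
  FROM pg_stat_database
  ORDER BY temp_bytes DESC;

  -- Sorts/hashes that exceed work_mem are what spill; EXPLAIN (ANALYZE, BUFFERS)
  -- shows them as "external merge Disk: ...".
  SHOW work_mem;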

We moved the database back to a simple VM on SSD, and Postgres handled everything just fine.


We’ve generally been happy with Aurora, but we run into gotchas every so often that don’t seem to be documented anywhere and it’s very annoying.

Example: in normal MySQL, “RENAME TABLE x TO old_x, new_x TO x;” allows for atomically swapping out a table.
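
For reference, the full swap pattern looks roughly like this (table names illustrative):

  -- Build the replacement table off to the side
  CREATE TABLE new_x LIKE x;
  -- ... populate new_x ...

  -- Then swap it in; stock MySQL documents the multi-table RENAME as atomic
  RENAME TABLE x TO old_x, new_x TO x;
  DROP TABLE old_x;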

But since we moved to Aurora MySQL, we very occasionally get reports landing in the bug tracker with “table x does not exist”, suggesting this is not atomic in Aurora.

Is this documented anywhere? Not that I’ve been able to find. I’m fine with there being subtle differences, especially considering the crazy stuff they’re doing with the storage layer, but if you’re gonna sell it as “MySQL compatible” then please at least tell me the exceptions.


The first result on Google shows that Aurora certainly does have temporary local storage https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...


I believe this issue is (or was) real. There are important differences in how Aurora treats temporary data. Normal postgres and rds postgres write it into the main data volume (unless configured otherwise). Aurora, however, always separates shared storage from local storage, and it's not entirely clear to me what this local storage physically is for non-read-optimized instance types. The only way to increase it is to increase the instance size. [1][2] This is indeed frustrating, because with postgres or rds postgres you just increase the volume and that's it.

Luckily, since November 2023 there are also r6gd/r6id instance classes with local NVMe drives for temp files. [3] This should in theory solve the problem, but I haven't tried it yet.

[1] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...

[2] https://www.reddit.com/r/aws/s/sIhBQhsG80

[3] https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-au...


I think Aurora has to go through the same development process as every database. They changed essential patterns in the database, and there are severe side effects that need to be addressed. You can see the same with Aurora Serverless and the changes in V2; there were some quite quirky issues in the first versions.


Calling it a hack is pretty unfair. The log storage engine is a huge innovation; in my experience it makes large MySQL/pg clusters much more reliable and performant at scale in a variety of different ways.

It has a couple of quirks, but on balance it feels like the future - the next evolution of what traditional rdbms are capable of.

But if you don’t have scale or resiliency needs it probably doesn’t matter to you.


Isn't Aurora mainly about its unique handling of logging and replication, which leads to high availability and fast recovery? If you switch to a VM, how do you handle availability in multiple locations, and backups? If database checkpoints are good enough for you, it sounds like Aurora is overkill in the first place.


We’re currently struggling with switching from RDS to Aurora. The replica lag times are absolutely bonkers long for the simplest of writes.


Aurora has essentially constant replica lag times, it’s one of the best features. Should be around 30-50ms always, are you seeing different?
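
If you want to sanity-check the numbers, Aurora Postgres exposes per-replica lag in-engine; something like the query below (column names from memory, worth double-checking against the docs), plus there's the AuroraReplicaLag CloudWatch metric:

  -- Run on any instance in the cluster: lag per replica, in milliseconds
  SELECT server_id, replica_lag_in_msec
  FROM aurora_replica_status();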


7-20 seconds depending on location.


Location? Are you doing multi region?


I’m going to preface this with: I’m not an OPS person. But yes, multi region. The main writer instance is in us-east-1 and it performs excellently when hitting this region. We have read replicas in us-west-2 and some in Europe/EMEA and Asia/Pacific.

When hitting one of these with a write you end up with massive delays. The 7 seconds and below tends to be from us-west-2 and the higher numbers are from our Japanese users.

Our OPS team has struggled to figure out why the delays happen. There are some code fixes we could probably do (i.e. always write to the writer), but as team lead for the development side the deadline is too close and I don’t want to rewrite core parts of the app to split reads and writes. They engaged AWS support, so I’m hoping something is just misconfigured, or maybe this just isn’t the use case for Aurora.


Oh. Yeah the log file system and thus the consistent replica latency are local to a region.

It sounds like you might be using global Aurora with write forwarding? That’s pretty new and not something I have experience with, sorry. AFAIU though it’s a whole different thing under the hood.


> It sounds like you might be using global aurora with write forwarding

Yes I believe this is what they chose. Honestly I’m going to leave it up to them and aws support. I have other fish to fry to get the functionality finished.


You could just point all your apps at the us-east-1 database in the meantime. Adds some latency, but better than 7-30s or whatever.


Can you elaborate, which exact timings are bad?


I’ve added more details to a sibling comment. I’m not sure I can add much more. I’m not an OPS person, just team lead on the development side.


Having previously been on several managed PostgreSQL providers and now on AWS Aurora -- Aurora has been pretty great in terms of reliability and performance with large row counts and upsert performance.

However, Aurora isn't cheap and is at least ~80% of our monthly AWS bill. I wonder how it is cheaper than Heroku's previous offerings? Is it Aurora Serverless v2 or something like that to reduce cost? Aurora billing is largely around IOPS, and Heroku's pricing doesn't seem to reflect that.


Heroku Postgres has always been priced on platform convenience with very high margins. It's been many years now so I don't remember the exact numbers, but I moved a few databases from Heroku to AWS and reduced my DB costs ~90% (magnitude ~900/mo --> ~100/mo) for roughly the same specs. They probably have a lot of margins to eat into before they need to adjust prices.


I am not seeing the margins in this $5/mo instance but I could be wrong!


We're using the highest tier Postgres instance at my work for one of our legacy Heroku apps and it costs thousands over what we'd pay for the equivalent on AWS directly.


Sure, but those are not related to Aurora or this post.


Um, what? It's literally what we're talking about haha


This post is talking about plans that are at most $20/month. I don't believe the other Heroku plans are on Aurora.


According to https://elements.heroku.com/addons/heroku-postgresql the instances they're using for this tier have zero bytes of RAM, so presumably that's where they're getting most of their cost savings from.


I'm assuming this means that they are not providing any sort of guarantee on the amount of RAM available and packing these instances as tightly as they can.


I like to think they just aren't installing any RAM in the servers and running all the databases out of L3 cache


I know Salesforce has a huge AWS presence. That said, is it possible they are doing multitenancy? I don't know myself.


“Amazon Aurora Serverless is an on-demand, autoscaling configuration for Amazon Aurora. It automatically starts up, shuts down, and scales capacity up or down based on your application's needs.”


Try the Aurora IO-optimized, it's a (relatively new) game changer price-wise.

I'm migrating a 1tb database to it right now because I'm paying too much for iops even on the regular rds postgres.

I'm also quite sure this is what heroku must be using or they would be out of business because of the pricing mismatch.


This, Aurora normal is useless now


How much do you expect the price to drop?


Everything on Heroku is billed with a huge margin, plus as they're probably a partnered customer by now, their pricing is a fraction of the average AWS customer's pricing. I've been at companies on both sides of the partner pricing list and the difference is huge.


Aurora has treated us well. We make a self-hosted product that requires Postgres; our sales/customer engineering folks just started telling people to use Aurora, and it hasn't caused any problems despite the fact that all of our tests run against stock Postgres. Can't complain. Though a VM with Postgres would be plenty for our needs, and cost thousands of dollars less a month. But, HA is nice if you want to pay for it.


Aurora has a new configuration option that changes billing from iops to higher storage costs. Might be what this is using.


Yeah, that’s what we use as well but I don’t think that addresses the underlying instance cost? I’m not familiar with Serverless v2 though, if that’s what this is using.


The instance cost is not much different than normal Heroku compute.


Just curious, does Aurora scale down at all in price, i.e. if I have a test instance that's hardly ever used, does it ever end up being cheaper than a classic RDS instance?


Disclaimer: I work at xata.

Xata is (like Heroku) based on Aurora, but offers database branching and has different pricing. That should be ideal for lightly-used test instances, because you only pay for storage, and 15GB are included in the free tier.


It scales to zero, so costs nothing when it's not in use...


Can you share which configuration scales to $0? I am not aware of that being possible. Even the serverless option has a base ACU rate.


v1 of Serverless did scale to 0, but that's no longer an option


You're thinking of Aurora Serverless, but the typical Aurora customer isn't using the Serverless offering. Additionally, the original version of Aurora Serverless scaled to 0, but v2 doesn't.

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...


>> but the typical Aurora customer isn't using the Serverless offering

Just wondering, why is that?


AWS can't bill you for 0 usage.


Wouldn't RDS or EC2 be even less dynamic in terms of pricing?


A bit pricier, some companies have a steady load, scaling isn't instantaneous, etc.


that has not been my experience.


Came here to say this. Aurora is good but also very expensive. If your queries are not very well tuned you will pay through the nose for I/O that would be unnoticeable in other Postgres implementations. For a very modest installation I saw DB bills go down from $3,000 a month on Aurora to about $100 for self managed Postgres.


Use io optimized


>> Aurora isn't cheap and is at least ~80% of our monthly AWS bill.

Why don't you run your own Postgres?

It's not hard - why pay such a premium for the Amazon version?


“Not hard” is very relative. Is it hard to run a Postgres database? No. Is it hard to set up monitoring, automatic failover (without data loss), managed upgrades, tx wraparound protection, clustering and endpoint management, load balancing… and probably a bunch of other things I’m not thinking of? Yes.
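
To pick one concrete item off that list: checking for transaction ID wraparound is just a small query against the standard catalogs, but someone has to know to run it and alert on it, e.g.:

  -- Transaction ID age per database; autovacuum has to freeze tuples
  -- long before this approaches ~2 billion, or the cluster stops accepting writes.
  SELECT datname, age(datfrozenxid) AS xid_age
  FROM pg_database
  ORDER BY xid_age DESC;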


If you’re using Aurora and not RDS you’re probably outside of the zone where rolling your own Postgres is easy.


So figure it out. I don’t understand why “ugh this is hard, I’ll pay someone else” has become the norm. You’re working in one of the most technically advanced fields in the world; act like it.


Most people aren't doing anything advanced. Also this has nothing to do with not wanting to do "hard things", that's ridiculous, it's a postgres cluster, you're not doing a PhD in math. People do it because there's limited time and no business advantage to operate postgres clusters. Use the time on what your business actually does.


> that's ridiculous, it's a postgres cluster, you're not doing a PhD in math.

It's not as difficult as a PhD (I assume; I only got as far as an MS), but based on what I've witnessed, it's up there in complexity. There are dozens of knobs to turn – not as many as MySQL/InnoDB to be fair, but still a lot – things you have to know before they matter, etc.

> People do it because there's limited time and no business advantage to operate postgres clusters. Use the time on what your business actually does.

I've seen this argument countless times for SaaS anything. I don't think it's accurate for a database. Hear me out.

For most companies, the DB is the heart. Everything is recorded there, nearly every service's app needs it (whether it's a monolith or microservice-oriented DBs), and it's critically important to the company's survival. Worse, the skills necessary for operating your own DB overlap heavily with those needed to use a DB well, by which I mean if you're good at things like DB backup automation, chances are you're also good at query optimization, schema design, etc.

It's that latter part that seems to be missing from many engineering orgs. "Just use Postgres," people say; "just add a JSONB column and figure out the schema later," but later never comes. If your business uses a DB, then you do not have the luxury of running one poorly. Spend a few days learning SQL, it's an easy language to pick up. Then spend a few days going through the docs for your DB, and try the concepts out in a test instance. Your investment will be rewarded.


You're at a more surface level than what I'm talking about. Your advice at the end is just common sense for anyone using any tool. It doesn't mean you should spend time implementing your own custom backup process with the ability to restore to a specific point in time, configurable down to the minute. The amount of work needed to operationalize Postgres in the same way and expose it to other teams in a company is big enough that you won't get an Aurora-like experience in less than a quarter, even with a full team. What could they be doing for your product instead?

All I'm saying is it has nothing to do with difficulty. In my job, for example, we self-hosted HBase, which is a beast compared to Postgres, and implemented custom backups etc., all because there was no good vendor for it. Postgres is much simpler and we always just used RDS, then switched to Aurora for the higher disk limits when it launched. If there's a good enough vendor, you're just stroking your ego re-implementing these things when you could move on to the actual thing the business wants to release.

I've also seen senior engineering leads "proving" that self-hosting "saves money", but then, comparing two companies working on the same type of problem in the same industry with a similar feature set, on one side we had 5 people maintaining what at the other company took 6 teams of 4-8 people. So it depends whether you'd like a lot of your labor focused on cutting costs or on increasing revenue. And they never include the cost of communicating with the extra 5 teams and the increased complexity and slowness to release things this creates, while it's also harder to keep databases on current versions, backup processes are more flimsy, etc.

Ps: we got rid of hbase, do yourself a favor and stay away


> Your advice at the end is just common sense advice for anyone using any tool.

Common sense isn't so common. Across many separate companies, I've met only a handful of devs who care at all about how the DB works, what normalization is, or will read the docs.

> It doesn't mean you should spend time implementing your own custom backup process with ability to go back to a specific point in time, configurable in 1 minute.

If by implement you mean write your own software, no, of course not. Tooling already exists to handle this problem. Off the top of my head, EDB Barman [0] and Percona XtraBackup [1] can both do live backups with streaming, so you can back up to a specific transaction if desired, or a given point in time.

Or, if you happen to have people comfortable running ZFS, just snapshot the entire volume and ship those off with `zfs send/recv`. As a bonus, you'll also get way more performance and storage out of a given volume size and hardware thanks to being able to safely disable `full_page_writes` / `doublewrite_buffer`, and native filesystem compression, respectively.

> If there's a good enough vendor, you're just stroking your ego re-implementing these things when you could move on to the actual thing the business wants to release.

Focusing purely on releasing product features, and ignoring infrastructure is how you get a product that falls apart. Ignoring the cost of infrastructure due to outsourcing everything is how you get a skyrocketing cloud bill, with an employee base that is fundamentally unable to fix problems since "it's someone else's problem."

> Ps: we got rid of hbase, do yourself a favor and stay away

HBase and Postgres are not the same thing at all. If you need the former you'll know it. If people convince management that they do need it when they don't, then yeah, that's gonna be a shitty time. The same is true of teams who are convinced they need Kafka when they really just need a queue.

My overall belief, which has been proven correct at every company I've worked at, is that understanding Linux fundamentals and system administration remains an incredibly valuable skill. Time and time again, people who lack those skills have broken things that were managed by a vendor, and then were hopelessly stuck on how to recover. But hey, the teams had higher velocity (to ship products with poor performance).

[0]: https://pgbarman.org

[1]: https://www.percona.com/mysql/software/percona-xtrabackup


Have you ever been paid to do work before? There’s a price at which a business will prefer to pay to have SaaS/PaaS solve a problem. Allocating engineering hours to setting up and maintaining a Postgres cluster has a cost. You’ll want someone senior on it. Their time could be well over $100/hour. And that’s assuming your business is small enough to only need one DBA part time. A business that’s spending a ton on Aurora might need 3 specialists. Now you’re talking about hundreds of thousands of dollars per year. It could be better to just pay AWS.

However, at large scales cloud won’t make sense anymore. They do have a markup and eventually what you’re paying in markup could instead buy you a few full time employees.


> Have you ever been paid to do work before?

Yes, many times, which is why I've developed this opinion.

> However, at large scales cloud won’t make sense anymore. They do have a markup and eventually what you’re paying in markup could instead buy you a few full time employees.

The issue is once you've finally realized this stuff matters, and have hired a DB team, I can practically guarantee that your schema is a horror show, your queries are hellish, and your product teams have neither the time nor inclination to unwind any of it. Your DB{A,RE}s are going to spend months in hell as they are suddenly made the scapegoats for every performance problem, and are powerless to fix anything, since their proposals require downtime, too much engineering effort, or both.

Hence my statement. Learn enough about this stuff so that when you do hire in specialists, the problems are more manageable.


You need to do things that are appropriate for a small company when you’re a small company. And then if you become a large company you change things to suit your new scale.

All of the troubles you described sound like bad management. I’m sorry if you’ve had to go through that. DBAs that are setting up a replacement are going to need time to do that right and expectations need to be set that this is a tricky problem.


Gonna use this next time I'm proposing my pet store should host an on-prem k8s cluster with a psql cluster on it !


Is it less true for other cloud stuff?


You clearly don’t know what aurora is or does if you think people can just run their own. It’s not a regular Postgres setup, and nothing exists that’s equivalent for self hosted.


  Product      Storage  Max Connections  Monthly Pricing
  Essential-0  1 GB     20               $5
  Essential-1  10 GB    20               $9
  Essential-2  32 GB    40               $20
The pricing looks quite competitive, although I'm not sure what the prior rates were.

10 years ago I spent 10x+ per month for 32GB (RAM) Heroku Postgres instances, IIRC they were around $400/mo, maybe even more.


> 10 years ago I spent 10x+ per month for 32GB (RAM) Heroku Postgres instances, IIRC they were around $400/mo, maybe even more.

Aren't you comparing RAM vs Storage there? The pricing chart here says nothing about RAM.


Heroku product here: the Essential 0 and 1 plans replace the older row-limited Heroku Postgres mini/basic plans at the same price points, with better perf in a lot of scenarios and a storage limit instead of a row limit - forcing people to denormalize to stay under a row count wasn't ideal with the old mini/basic limits. The Essential-2 plan is a new option for a larger pre-prod/test/small-scale DB above what we offered before.

We're expanding the Aurora-backed offerings to include larger dedicated DBs in the relatively near future as well.

Gail Frederick, our CTO, talked a bit more about it at a high level during re:Invent 2023: https://www.youtube.com/watch?v=fZLcv7rwj7Y&t=1955s


Why did I think you were leading the Apex product team?


I was for quite a few years - moved to Heroku in Q3 last year for an interesting opportunity. Apex is in good hands with Daniel Ballinger (and I stay in touch with a bunch of team if that helps).


These "Essential" tiers are bare bones instances for toys/mvps, they're much different than the bigger ones. No replication, 99.5% uptime target, no maintenance windows etc.


That's $750/month now, I think: https://elements.heroku.com/addons/heroku-postgresql#pricing (Standard 4)


I'm curious how the Essential plans work, given that Aurora pricing starts higher than that in monthly costs. It is probably databases in a shared multi-tenant Aurora instance, and then the single-tenant plans that are currently in pilot give you the full Aurora instance. That also explains some of the limitations and the low connection limits.


Genuine question, who even uses Heroku anymore?

A VPS (Hetzner, etc.) + a managed Postgres DB (Supabase / AWS / etc.) or a local one might be more than enough these days.


We do, although we're in the middle of moving our entire Heroku Postgres spend over to Crunchy Data [1].

We were getting close to one of the big jumps on the standard pricing of Heroku Postgres, and we would have had to basically double our monthly cost to lift the max data we could store from 1.5TB to 2.0TB. On Crunchy Data, that additional disk space will be like 1% more rather than 100% more.

While investigating Crunchy, I ran some benchmarks, and I found Crunchy Bridge Postgres to be running 3X faster than Heroku Postgres.

Heroku seems to be working on some interesting new things, but I feel burned by the subpar performance and lack of basically any new features over many years. I don't know if the new Aurora-based database will be faster than Crunchy, but the benchmarks they're talking about sound like they're finally about to catch up. We also get better features on Crunchy, such as logical replication, which is still not available on Heroku.
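
(For anyone unfamiliar: logical replication here means the built-in Postgres publish/subscribe mechanism, roughly the following, with illustrative names and connection string:)

  -- On the source database
  CREATE PUBLICATION app_pub FOR ALL TABLES;

  -- On the destination database
  CREATE SUBSCRIPTION app_sub
    CONNECTION 'host=source.example.com dbname=app user=replicator'
    PUBLICATION app_pub;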

The experience for deploying apps and having add-ons is still pretty easy, but we'll see how that improves. HTTP2 support is still in beta.

1. https://www.crunchydata.com


My experience with going from Heroku Postgres to Crunchy Data (specific Crunchy Bridge) has been really good. Their product has been absolutely rock solid but what really made the difference was their support. They provided a huge amount of pre-sales support while I planned the move (and even suggested mitigations for the problems I was having with Heroku Postgres to make moving less urgent). Post-sales support has been just as good, though mostly I don’t even have to think about the database hosting anymore.

I also moved my app hosting to NorthFlank from Heroku and have been really happy with that as well. It’s got the features I always wanted on Heroku (simple things like grouping different types of instances together into projects really helps) plus again excellent responsive support.


Our experience of moving from Heroku to CrunchyBridge has been very similar - excellent help with the migration including jumping on a call with us during the switchover to resolve a broken index.

Would strongly recommend them to anyone looking to move off Heroku.


I was a bit concerned about the cut-over from the old database on Heroku, and really wanted to minimise downtime. So they helped me produce a step by step plan, test as much of it as possible, then had an engineer join me on Zoom while I made the switchover. They were even able to accommodate doing it in the early morning in my timezone to minimise the impact. Ended up with maybe 5 mins of downtime, which I was very happy with.


I'm working in a new startup, and I tried several "easy" solutions: AWS Lightsail, Heroku, Crunchy.

I settled up on AWS ECS :)

My main issue with Heroku was that they have not changed anything in _years_. No support for gRPC, no IPv6, and simple VPC peering costs $1200 a month.


Yeah, the lack of HTTP/2 support has been a long-standing issue with Heroku.

They just shipped HTTP/2 terminated at their router [0], and have it on their roadmap [1] to support HTTP/2 all the way through. But it seems like it's at minimum a few months off.

(As for VPC peering: the moment you need that, it sorta feels like Heroku is no longer the right place to be, even ignoring the costs.)

[0] https://blog.heroku.com/heroku-http2-public-beta [1] https://github.com/orgs/heroku/projects/130


They just shipped it, but it's still beta. So... I wouldn't consider that shipped yet.


+1 - recommend crunchy. I ran a substantial oracle to postgres project recently and crunchy were great.


Update to this: we've switched over our staging database, and the call to do that couldn't have been more productive or more pleasant.

I got to talk to someone who was knowledgeable about Postgres, who answered various questions I had, who offered a few pieces of insight I wouldn't have thought of, etc.

Compared to every single support interaction I've had with Heroku over 10 years, this was light years more friendly, informative, and productive.

I am so happy we're switching. Way to go Crunchy!


And one more update: we switched in production, and the first 24 hours have been smooth.


I do, as do several startups I advise.

Despite interesting competition, my feeling is that the Heroku of 2024 remains... Heroku.

I feel this way even though -- depending on how you segment -- the list of "interesting" competitors is quite long at this point: Render, Railway, Northflank, Fly.io, Vercel, DO App Platform, etc.


I revisit Heroku alternatives every ~6 months and I am shocked at how un-ergonomic they still are. After Salesforce paused Heroku development I switched to a DO VPS + Ansible Container & GitHub Actions for any project that doesn't need infinite scale, but I'd go back to literally any Heroku clone.

It's crazy how the ergonomics still just aren't there.


Yes, completely agree; I'm equally surprised by the poor DXes.

(And: bugs. I'm also surprised by the kinds of issues I run into on some of those sites in my list -- problems that, even if not show-stopping, feel like revealing indicators of quality.)


> Yes, completely agree; I'm equally surprised by the poor DXes.

Any specifics/examples? I find it hard to imagine those "big name" companies/platforms you just mentioned don't have entire teams dedicated to hyper-optimizing experience.


Having a dedicated team doesn't mean anything about the end result.


IME the Heroku of 2024 is Render.


Can you elaborate a bit more on why Render is good? We are on Heroku and I have evaluated alternatives every 6 months since the Heroku/GitHub outage 2 years ago [1], but I don't see how Render is better. 2 years ago Render Postgres did not have PITR; now they have built it, but Render's Postgres offering is even more expensive than Heroku's, and queries run a bit slower on similar-spec machines based on my tests. I also don't like that Render charges per seat in addition to infra cost.

[1] https://status.heroku.com/incidents/2413


10yo+ B2B SaaS company we're still on Heroku. I think the value prop is particularly good for B2B SaaS and probably less so for consumer products. Our margin per customer is so much higher than the infra cost it just never makes sense to spend money on devops instead of building features.

That said it does feel a bit like a ghost town, I'm always happy to hear when someone is doing something over there.


A few years ago I was considering Heroku for something new. But then I learned that Heroku Postgres's HA offering used async replication, meaning you could lose minutes of writes in the event that the primary instance failed. That was a dealbreaker.

That was very surprising to me. Most businesses that are willing to pay 2x for an HA database are probably NOT likely to be ok with that kind of data loss risk.

(AWS and GCP's HA database offerings use synchronous replication.)


Noticed this too. The master failover is marketed as a strict upgrade, and the "async" part is only in the fine print. Many would actually prefer the downtime over losing data. A user who's experienced with DBs should think to check on this, but still.


I’ll just share our experience with Hetzner from earlier this week to spare everyone the learning experience. Their prices are indeed incredible. However, we spun up a VPS in their Hillsboro, OR data center only to find out that our IP address was blocked by Cloudflare, so there was no way for us to connect to our error logging or transactional email providers. We also found this thread indicating that their entire IP range for that DC is widely blocked [0]. So, not really acceptable for professional use, IMO, unless you just need compute.

0: https://news.ycombinator.com/item?id=39638849


Cloudflare doesn’t block any IP addresses. It’s their customers that do.


So in that case both Mailgun and Sentry were blocking our IP. When we googled the error message, it appeared to be a Cloudflare message of some kind. But anyway, I wouldn’t touch them with a ten foot pole for anything web related.


Probably those who don't want to set up uptime alerts, fine-tune configs, or set up backups and restores (which are essential because sooner rather than later someone always deletes a few rows/tables), and who want to focus on the business.


It's really easy to be SOC 2 compliant for a small SaaS on Heroku. We'd need to grow in customers and dev resources to pull it off on raw AWS. I am looking for options though, because Heroku is increasing their prices.


VPS is dandy when the application is fresh. Once it starts getting long in the tooth and the VPS OS needs to be updated… nightmare fuel.


I use render.com. Heroku is stuck in the past.


Me. Haven't found anything that's just as easy to use. The supposed Heroku replacements like Fly.io weren't.


For a lot of use cases, nothing beats « git push » and tada your app is deployed.


Very easy to do with GitHub actions or the equivalent in gitlab and probably other competitors by now I imagine.

If you wanted to spend money on something else, circleci and others help manage ci/cd as well.


Not to mention every cloud provider has something built in (AWS CodeBuild, GCP Cloud Build, etc.)


That’s a pretty standard feature nowadays


me. tried to move to render but it's been a headache for some key things that i need. my heroku setup is dialed in so it makes it a no brainer versus the time i've wasted trying to get render to fit my use case. right now i'm using both for two different services, and will consider moving off both once i get enough customers.


What are you trying to do on Render? We have some stuff in the works and I'd love to find a way to help.


Nothing sophisticated. My woes might be because I'm bad at devops.

But I spent several hours fighting with a DNS change, trying to host my marketing website as Cloudflare pages site from my root domain (with DNS managed by Cloudflare), and then wildcard subdomains routed to a Render server. I couldn't get it to work no matter what configs I tried. My root domain marketing site is proxied through Cloudflare and I was trying to get the wildcard subdomains as DNS-only, and I suspect this was the problem but idk. In other words, the Cloudflare pages marketing site is https://bookhead.net and I wanted my customer's subdomains to route to the Render server like https://forlornbooks.bookhead.net/ (I still get the error since I haven't finished my migration to Heroku). The subdomains worked with no problem until I tried to setup a separate marketing site at the root domain.

Also, I had a hard time setting up SSH with a containerized server. It was a weird DX that was a bit confusing to document so I can remember later. Can only imagine how confusing it might be if I ever have teammates. The Render CLI looks promising, though.

These are only the most recent issues. Seems like y'all improved the headaches I ran into the time I tried.


Then you have to manage a VPS. If you want to do ops, you’re not Heroku’s target customer.


Hetzner doesn't offer any SLAs, which is important for some people.


RDS is such a depressing database option. It does not matter how much money you throw at it, its performance will always be limited by the awful disk IOPS. Luckily these days you can easily run PG on EC2 (or simply use CRDB).

Weird for Heroku to ignore this huge efficiency opportunity.


Have you tried io2 in this context? I haven't had the chance yet.


io2 is generally better than io1, one advantage is that you can scale storage size and IOPS independently. That being said, RDS with io2 is still worse than an ec2 instance with nvme (a lot worse)


Yes, agreed, every bare metal server probably beats EBS, but then a lot of manual work is needed.


(CEO of Neon.tech)

Aurora is one of the few real innovations in the database space recognized by SIGMOD: https://sigmod.org/sigmod-awards/citations/2019-sigmod-syste...

It provides a lot of benefits to the user and also a ton more to the service provider. Specifically, you don’t overprovision storage or compute. Plus, at least theoretically, you can provide infinite IO throughput at the storage level.

There have been a couple more iterations on the design since. Microsoft separated the transaction log from storage: https://www.microsoft.com/en-us/research/uploads/prod/2019/0...

Neon added an object store and branches so you can integrate backups and add a Time Machine.

PolarDB separated memory from compute - this makes serverless compute more nimble and unties memory and CPU.


I'm surprised to read this, given that both Heroku and Salesforce more broadly have hired what felt like a good number of PostgreSQL committers.


Which Postgres committers work at either of those companies?


Happy Aurora Postgres Serverless customer here. Be sure to use pgbouncer (self-hosted, but it needs minimal babysitting) if you intend to use it in a serverless environment (and even if you are not, the benefits of not having to worry about connection pool exhaustion are still worth it). AWS's proxy won't work too well with prepared statements under connection pooling (they trigger something known as connection pinning).
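
A quick way to see whether pooling is pulling its weight is the standard activity views, for instance:

  -- How many server-side connections exist, and what are they doing?
  SELECT state, count(*)
  FROM pg_stat_activity
  GROUP BY state
  ORDER BY count(*) DESC;

  -- The hard ceiling you're pooling to stay under
  SHOW max_connections;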

EDIT: and yes, it's not cheap


After Heroku pulled their stunt in India when the RBI changed some credit card rules, Heroku is pretty much history for me. They screwed small customers and got rid of them, saying they couldn't charge credit cards under the new regulations. However, they continued to entertain large customers from India.

Never recommending them to anyone anymore


You severely underestimate what a clusterfuck of credit card abuse comes out of India. Heroku used to lose millions a year.


Care to shed some more light?


Same - I've been slowly migrating to Render (https://render.com) as my new favourite.


We're building on top of Aurora at https://xata.io.

Currently our Aurora instances are in private beta. If you're interested in trying it out, drop me an email: richard@xata.io


Can you still see all instances with their IDs on a shared cluster when you connect?


Ah, I wish they would rather choose Neon... to get all the cool functions developers need but avoid going all the way proprietary.


Aurora is probably the best managed database solution out there but it is not cheap.


RDS beats it in almost every aspect. Running your own on native NVMe blows both of them out of the water for performance and price.

I am not a fan of Aurora. I don’t get the appeal at all. I’ve tried MySQL and Postgres varieties; it’s just expensive for no reason.


I'm not sure what you mean by "running your own on native NVMe". Are you talking about using the managed AWS Relational Database Service or something else? Aurora can also use instance types with NVMe.

Anyway, that's also ignoring the features that Aurora offers, which is why people pay more for it. The ability to have multi-AZ deployments and auto-scaling of (what can be cross-region) read replicas make it very resilient and it's dead simple to operate what would normally be considered advanced features of a DB cluster.

If you just need a managed Postgres or MySQL traditional single instance and none of those extra features, then obviously you would not need to pay the premium for Aurora. RDS exists for that reason.


I was referring to running your own DB on hardware with NVMe drives. Obviously you lose every nicety of managed services, but tooling exists to replace it, and you gain stupid amounts of performance.

RDS Multi-AZ Cluster gives you much of the advantages of Aurora, but with higher performance and more tuning capabilities, though you are limited to 3 nodes. Tbf 3 nodes is almost certainly enough for most companies. A few hundred thousand QPS would be easily handled by that.

Re: cross-region read replicas, eh… if you’ve somehow managed to ensure that every single aspect of your app is capable of withstanding the loss of an entire region – including us-east-1, since most of the control plane functions are there – then sure, maybe. But do you need it? If an entire AWS region drops out, half of the internet goes with it, and you can just blame that. I doubt the small possibility of higher uptime is worth the literal doubling in monthly costs.


What makes RDS better than Aurora?


To be clear, my comment stated RDS is better "in almost every aspect." Aurora is better at one [0] thing – storage scaling. You do not have to think about it, period. Adding more data? You get more storage. Cleaned out a lot of cruft? The storage scales back down.

Aurora splits out the compute and storage layers; that's its secret sauce. At an extremely basic level, this is no different from, for example, using a Ceph block device as your DB's volume. However, AWS has also rewritten the DB storage code (both MySQL/InnoDB and Postgres). InnoDB has a doublewrite buffer, redo log, and undo log. Postgres has a WAL. Aurora replaces all of this [1] with something they call a hot log. Writes enter an in-memory queue, and are then durably committed to the hot log, before other asynchronous actions take place. Once 4/6 storage nodes (which are split across 3 AZs) have ACK'd hot log commit, the write is considered persisted. This is all well and good, but now you've added additional inter-process latency and network latency to the performance overhead.

Additionally, the storage scaling I mentioned brings with it its own performance implications. If you're doing a lot of writes, you'll encounter periodic performance hits as the Aurora engine allocates new chunks of storage.

Finally, even for reads, I do not believe their stated benchmarks. I say this because I have done my own testing with both MySQL and Postgres, and in every case, RDS matched or beat (usually the latter) Aurora's performance. These tests were fairly rigorous, with carefully tuned instances, identical workloads, realistic schema and queries, etc. For cases where pages have to be read from disk, I understand the reason – the additional network latency of the Aurora storage engine seems to be higher than that of EBS. I do not understand why a fully-cached read should take longer, though.

As a further test, I threw in my quite ancient Dell servers (circa 2012) for the same tests. The DB backing disk was on NVMe over Ceph via Mellanox, so theoretical speeds _should_ be somewhat similar to EBS, albeit of course with less latency since everything is in a single rack. My ancient hardware blew Aurora out of the water every single time, and beat or matched RDS (using the latest Intel instance type) almost every time.

[0]: Arguably, it's also better at globally distributed DB clusters with loose consistency requirements, because it supports write forwarding. A read replica in ap-southeast-1 can accept writes from apps running there, forward them to the primary in us-east-1, and your app can operate as though the write has been durably committed even though the packets haven't even finished making it across the ocean yet. If and only if your app can deal with this loosened consistency, you can dramatically improve performance for distant regions.

[1]: https://d1.awsstatic.com/events/reinvent/2019/REPEAT_Amazon_...


Ouroboros eating its tail


People still use heroku?


Google has 2 Postgres implementations: Cloud SQL and AlloyDB. How do they compare against AWS Aurora, for the heroku scenario, i.e., multi-tenant database?


Here is the English translation: "Amplify Might Be AWS's Worst Service, Bar None". Confusing documentation, a mix of old and new systems, and it made a mess of my AWS account.

To put it simply, over the past two days, I attempted to deploy a full-stack assignment on AWS services. The front end was written in React, using Vite as the framework. For such Single-Page Apps (SPAs), I personally prefer using specialized services like Netlify or Cloudflare Pages for deployment, as these services offer very robust CI/CD services, allowing for one-click deployment and automatic updates, saving a lot of hassle.

Initially, I planned to manually deploy on AWS using the S3 + CloudFront model (since it was just a one-time assignment), but later I discovered that AWS has a service very similar to Netlify called Amplify, which also offers CI/CD one-click deployment services. Amplify goes even further by including user directory services, allowing for one-click registration and login via related components.

It sounds great, but you only realize how problematic it is after using it. After some research, my initial deployment method was to upload the code to GitHub and then click the deploy button in the Amplify interface. This is also the deployment method I use most often with Netlify.

However, I later found something wrong. The key issue was that applications deployed this way using Amplify couldn't directly use Amplify's UI components to access Cognito user directory services. After much searching, I found that Amplify has an Amplify CLI initialization command to create a new CI/CD project in the Amplify service, which also deploys additional resources like Cognito.

It seemed feasible, so I did it. Then I found some issues. The initial "issues" were just on the AWS account management level: after deploying the project via Amplify CLI, my AWS account quickly filled up with a bunch of "things"—the reason "things" is in quotes is that Amplify created a lot of fragmented resources, including but not limited to CloudFormation, IAM roles, etc., even creating two Cognito identity pools for me—it's hard not to call them "junk." Moreover, most of these resources have names that are impossible for humans to remember or distinguish, and there are no explanations or grouping features to tell you what these things are for.

If it were just like this, it wouldn't seem to impact the development process, right? The biggest problem is that the local debugging and production environment apparently don't use the same configuration files, and when I was cleaning up the automatically created resources in my AWS account earlier, I somehow deleted the roles calling the Cognito user pool in the production environment, causing the production environment to be unable to access the two user pools created by Amplify, constantly throwing 400 errors.

After several rounds of "deploy-delete-redeploy-redelete," I decided to start over and look for the related documentation again. Later, I found that Amplify has a set of documentation outside of AWS's own documentation system, and this documentation recommends a deployment method: clicking the deploy button on the GUI webpage—yes, you heard it right, the same deployment method I used initially.

So, how do you deploy additional components/services like Cognito this way? Amplify's answer is configuration files. As long as you create a folder for configuration files in the root directory of your project and write the corresponding configuration files in it, the cloud will automatically create the resources you need in AWS once it reads them.

It sounds reasonable, right? Then you go to find the part about configuration files in the documentation... What's going on? Why can't I find anything in the search box in the documentation? There's not even a sample configuration file! Algolia indexing service can't be this bad, right?

Searching for "defineAuth" in the Amplify official documentation returns mostly irrelevant information.

Is my search method incorrect? I entered keywords like "site .amplify.aws defineAuth" in the Kagi.com search engine but couldn't find any examples or explanations of configuration file items. At this point, I'm completely convinced that the Amplify documentation is garbage. Fortunately, the API documentation of the Amplify framework is quite good, at least reducing my urge to buy a ticket to the US and blow up Amazon's headquarters while guessing the configuration file items...

Also, Amplify has a UI that is completely different and more modern than other AWS services. The discrepancy is still a minor issue; the main problem is that if you create a project using the (slightly outdated) Amplify CLI, and then try to configure the back-end services like Cognito it deployed on the webpage, you'll enter an old interface. That is, once you click in, you see a slightly ugly but familiar interface, yet it feels completely disconnected from the previous Amplify interface...

So now I understand why I hadn't heard of Amplify before—it's really hard to use. Complete integration is indeed an advantage, but even being born with a silver spoon doesn't excuse Amplify's messiness, simply throwing everything together and telling users "it just works." Users look at it, wondering what on earth all these things are, and then you hand them a manual that looks fancy but has zero information. Users, flipping through this tome with no useful information, can only throw this pile of stuff into the historical junk heap behind them in frustration.



Amplify is the worst!!!!


I don't know if it's still the case, but a few years ago all major cloud providers were easily giving away thousands of dollars in cloud credits. I expect them to stop this soon, since smaller cloud players build on top of them and offer better DX, and startups prefer to work with these smaller companies despite free credits from the larger players.


> startups prefer to work with these smaller companies despite free credits from larger players

Says who? My experience is the opposite - tending towards too much reliance on the main providers because of the credits


As a startup boy we are happily chewing through hundreds of thousands in GPU credits across all major cloud platforms + lambda labs.

And once those credits run out we are planning to expand our owned training hardware. Currently we just have 3x L40S but would expand to 32x L40S. I’m excited to now be a sys admin in addition to a full stack web dev.


Vercel is going strong - $25.5M in 2022, $100M in 2024. Netlify is currently at $30M. Add Supabase, Render, Railway, ...


AWS alone is $100B/yr now


Largely enterprise spend. Startups are a different market segment. They initially have small budgets, and eventually fail or grow large enough to move to different products.





