
Amazon Aurora Postgres: First Thoughts - brandur
https://www.linkedin.com/pulse/amazon-aurora-postgres-first-thoughts-eric-green/
======
koolba
> What's the point of Aurora? Aurora does have a couple of positives. You can
> create additional read replicas virtually instantly, since they're just
> pointed at the same shared block storage.

If I understand this correctly, they're using the same block store backing the
primary database for the replicas. If that's the case, then wouldn't an issue
with the block store hose both simultaneously? Or are they just jump-starting
the replica's copy of the data from the primary's block store until it's fully
replicated?

> And from a management point of view, Aurora makes the database
> administrator's job far simpler since you no longer have to closely monitor
> your tablespaces and expand your block storage as needed (and reallocate
> tables and indexes across multiple tablespaces using pg_repack) in order to
> handle growing your dataset.

This is indeed very cool. Scaling from 1 MB, to 1 GB, to 1 TB, and beyond with
no manual intervention is truly amazing. The storage pricing is a net win for
Aurora: you only pay for what you use, at the same price as gp2 EBS volumes
($.10/GB/month), whereas gp2 volumes have to be preallocated.
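As a back-of-envelope sketch of that pricing difference (using only the $0.10/GB/month figure quoted above; the dataset and volume sizes below are made up for illustration):

```python
# Hypothetical storage cost comparison: Aurora bills only the bytes
# actually stored, while a gp2 EBS volume is billed at its full
# provisioned size regardless of how much of it is used.
GP2_PRICE_PER_GB_MONTH = 0.10  # $/GB/month, as quoted above


def aurora_monthly_cost(used_gb: float) -> float:
    """Pay only for what is actually used."""
    return used_gb * GP2_PRICE_PER_GB_MONTH


def ebs_monthly_cost(provisioned_gb: float) -> float:
    """Pay for the whole preallocated volume."""
    return provisioned_gb * GP2_PRICE_PER_GB_MONTH


# A 100 GB dataset sitting on a preallocated 1 TB gp2 volume:
print(f"Aurora: ${aurora_monthly_cost(100):.2f}/month")
print(f"gp2:    ${ebs_monthly_cost(1024):.2f}/month")
```

The gap only matters when you over-provision headroom, which is exactly what you tend to do on EBS to avoid emergency volume resizes.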

~~~
lozenge
The block store is related to EBS, which is what AWS's other postgres option,
RDS, also uses. An EBS system failure is rare and could potentially take down
both services, and EC2 as well.

~~~
_msw_
[https://media.amazonwebservices.com/blog/2017/aurora-design-considerations-paper.pdf](https://media.amazonwebservices.com/blog/2017/aurora-design-considerations-paper.pdf)

""" These are each replicated 6 ways into Protection Groups (PGs) so that each
PG consists of six 10 GB segments, organized across three AZs, with two
segments in each AZ. A storage volume is a concatenated set of PGs, physically
implemented using a large fleet of storage nodes that are provisioned as
virtual hosts with attached SSDs using Amazon Elastic Compute Cloud (EC2). """

The storage nodes are using local SSD instance storage, not EBS.
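The fault-tolerance math behind that layout can be sketched roughly as follows (the 6-copy/3-AZ layout is from the quote above; the 4/6 write quorum and 3/6 read quorum are the figures Aurora's design papers describe, and the code itself is just illustrative):

```python
# Illustrative sketch of Aurora's protection-group quorums: each 10 GB
# segment is kept as 6 copies, 2 per AZ across 3 AZs. Aurora's papers
# describe a 4/6 write quorum and 3/6 read quorum, which tolerates the
# loss of an entire AZ for writes, and "AZ + 1" for reads/repair.
AZS = 3
COPIES_PER_AZ = 2
TOTAL_COPIES = AZS * COPIES_PER_AZ  # 6
WRITE_QUORUM = 4
READ_QUORUM = 3


def can_write(failed_copies: int) -> bool:
    return TOTAL_COPIES - failed_copies >= WRITE_QUORUM


def can_read(failed_copies: int) -> bool:
    return TOTAL_COPIES - failed_copies >= READ_QUORUM


# Losing one whole AZ (2 copies) still allows writes:
assert can_write(COPIES_PER_AZ)
# Losing an AZ plus one more node (3 copies) blocks writes, but reads
# still succeed, so the volume can be repaired from surviving copies:
assert not can_write(COPIES_PER_AZ + 1)
assert can_read(COPIES_PER_AZ + 1)
```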

------
stingraycharles
"Aurora Postgres doesn't have a filesystem for its tablespace. It has a block
store. Instead, Aurora Postgres instances that are doing large sorts or
indexing large files use local storage, which is currently 2x the size of
memory. That is, if an Aurora database instance has 72gb of memory, you only
have 144gb of temporary space. Good luck sorting that 150gb table."

That seems like a problem that's easy to overcome. Allow people to configure
the local storage size and you have a reasonably fair solution.

It still has its limitations for really large datasets, but that's a
limitation you have with vanilla Postgres too.

~~~
brlewis
Also "good luck sorting..." makes it sound like such sorting is impossible.
It's just slower. [https://madusudanan.com/blog/all-you-need-to-know-about-sorting-in-postgres/#DiskMerge](https://madusudanan.com/blog/all-you-need-to-know-about-sorting-in-postgres/#DiskMerge)
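The disk path that link describes is a classic external merge sort: sort memory-sized chunks into runs on disk, then k-way merge the runs. A minimal sketch of the idea (illustrative only, not Postgres's actual tuplesort code):

```python
import heapq
import tempfile


def external_sort(values, chunk_size=4):
    """Sort input too large for memory: write sorted fixed-size runs
    to temp files, then k-way merge them with a heap. This is, in
    spirit, how Postgres spills a sort to disk when work_mem is
    exceeded."""
    runs = []
    chunk = []

    def flush():
        if not chunk:
            return
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(f"{v}\n" for v in sorted(chunk))
        f.seek(0)
        runs.append(f)
        chunk.clear()

    for v in values:
        chunk.append(v)
        if len(chunk) >= chunk_size:
            flush()
    flush()

    # heapq.merge streams the runs back in order without loading them all.
    streams = ((int(line) for line in f) for f in runs)
    return list(heapq.merge(*streams))


print(external_sort([9, 1, 8, 2, 7, 3, 6, 4, 5]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The slowdown relative to an in-memory quicksort is the extra disk I/O per run, which is brlewis's point: slower, not impossible.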

~~~
znep
I'm not clear what you are trying to suggest with that link. The whole point
of the article's claim about sorting is that in Aurora, when Postgres spills a
sort to disk, it only has access to a relatively small local filesystem that
runs out of space and can't use the Aurora storage that actually backs the
database.

------
rpedela
Is Postgres RDS any different? Isn't it also run on EBS?

~~~
craigkerstiens
Postgres RDS is indeed different. Yes, it runs on EBS, but Aurora in a sense
isn't truly Postgres. What they've done is fork vanilla Postgres and tweak the
storage layer to work on the block-based storage they've built just for it.
It's why you see it advertised as "Postgres compatible" vs. just being
standard Postgres.

This article is definitely an interesting one as it highlights some of those
trade-offs that they now need to account for.

~~~
stingraycharles
Agreed. I haven't had the opportunity to experiment with Aurora yet, but I
knew there would be issues. It's nice to get a field report; now I have a
better idea of what to expect.

------
ugh123
>I'm currently managing about 2 billion rows in Postgres.

Not sure what type of data he's storing, but 2 billion seems a bit much for
one or a few tables. Hopefully he's thought of sharding those into multiple
tables by date or key.

~~~
CodeWriter23
We don’t do that in Postgres any more. We use table partitioning.
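With declarative partitioning (`PARTITION BY RANGE` and friends, since Postgres 10), the server routes each row to the right child table for you. A tiny sketch of the routing idea it automates (the `events` table name and monthly scheme are made up for illustration):

```python
from datetime import date


# Illustrative sketch of range partitioning by month: Postgres's
# declarative partitioning does this routing inside the server; this
# just shows how a row maps to its partition.
def partition_for(event_date: date, table: str = "events") -> str:
    """Route a row to its monthly partition, e.g. events_2017_11."""
    return f"{table}_{event_date.year}_{event_date.month:02d}"


print(partition_for(date(2017, 11, 21)))  # events_2017_11
```

The win over hand-rolled sharding is that queries against the parent table get partition pruning automatically, and dropping an old month is a cheap `DROP TABLE` on one child.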

