To be clear, my comment stated RDS is better "in almost every aspect." Aurora is better at one [0] thing – storage scaling. You do not have to think about it, period. Adding more data? You get more storage. Cleaned out a lot of cruft? The storage scales back down.
Aurora splits out the compute and storage layers; that's its secret sauce. At an extremely basic level, this is no different from, for example, using a Ceph block device as your DB's volume. However, AWS has also rewritten the DB storage code (for both MySQL/InnoDB and Postgres). InnoDB has a doublewrite buffer, redo log, and undo log; Postgres has a WAL. Aurora replaces all of this [1] with something they call a hot log. Writes enter an in-memory queue and are then durably committed to the hot log before other, asynchronous actions take place. Once 4 of the 6 storage nodes (which are spread across 3 AZs) have ACK'd the hot log commit, the write is considered persisted. This is all well and good, but now you've added extra inter-process and network latency to the performance overhead.
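To make the write path concrete, here's a minimal sketch of that 4-of-6 quorum rule. This is purely my own illustration of the idea (not Aurora's actual code): the commit is gated on the fourth-fastest ACK from six storage nodes, two per AZ, which is exactly where the extra network round trips enter the picture.

    import concurrent.futures
    import random
    import time

    # Six storage nodes, two per AZ across three AZs (illustrative names).
    NODES = [f"az{az}-node{n}" for az in (1, 2, 3) for n in (1, 2)]
    WRITE_QUORUM = 4

    def append_to_hot_log(node: str, payload: bytes) -> str:
        # Stand-in for one network round trip plus a durable hot-log append.
        time.sleep(random.uniform(0.0005, 0.005))
        return node

    def quorum_write(payload: bytes) -> float:
        start = time.monotonic()
        with concurrent.futures.ThreadPoolExecutor(max_workers=len(NODES)) as pool:
            futures = [pool.submit(append_to_hot_log, node, payload) for node in NODES]
            acks = 0
            for _ in concurrent.futures.as_completed(futures):
                acks += 1
                if acks >= WRITE_QUORUM:
                    # Considered persisted at the 4th ACK; the slowest two
                    # nodes catch up asynchronously.
                    return time.monotonic() - start
        raise RuntimeError("write quorum not reached")

    print(f"commit latency: {quorum_write(b'row') * 1000:.2f} ms")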
Additionally, the storage scaling I mentioned brings its own performance implications: if you're doing a lot of writes, you'll hit periodic performance dips as the Aurora engine allocates new chunks of storage.
Finally, even for reads, I do not believe their stated benchmarks. I say this because I have done my own testing with both MySQL and Postgres, and in every case RDS matched or beat (usually the latter) Aurora's performance. These tests were fairly rigorous, with carefully tuned instances, identical workloads, realistic schemas and queries, etc. For cases where pages have to be read from disk, I understand the reason – the network latency of Aurora's storage layer seems to be higher than that of EBS. I do not understand why a fully-cached read should take longer, though.
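If you want to reproduce the fully-cached-read comparison, this is roughly the shape of it. It's a sketch, not my exact harness: it assumes a pgbench-initialized database, and the endpoint and credentials are placeholders.

    import statistics
    import time
    import psycopg2  # pip install psycopg2-binary

    conn = psycopg2.connect(host="db.example.internal", dbname="bench",
                            user="bench", password="changeme")
    conn.autocommit = True
    cur = conn.cursor()

    # Indexed point lookup against the standard pgbench schema.
    QUERY = "SELECT abalance FROM pgbench_accounts WHERE aid = %s"
    KEYS = range(1, 10_001)

    # Warm-up pass: pull every page the query touches into the buffer cache.
    for key in KEYS:
        cur.execute(QUERY, (key,))
        cur.fetchone()

    # Timed pass over the same keys: with a warm cache there should be no I/O,
    # so what remains is client-to-DB network latency plus executor overhead.
    samples = []
    for key in KEYS:
        t0 = time.perf_counter()
        cur.execute(QUERY, (key,))
        cur.fetchone()
        samples.append(time.perf_counter() - t0)

    print(f"p50 {statistics.median(samples) * 1e6:.0f} us, "
          f"p99 {statistics.quantiles(samples, n=100)[98] * 1e6:.0f} us")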
As a further test, I threw in my quite ancient Dell servers (circa 2012) for the same tests. The DB backing disk was on NVMe over Ceph via Mellanox, so theoretical speeds _should_ be somewhat similar to EBS, albeit of course with less latency since everything is in a single rack. My ancient hardware blew Aurora out of the water every single time, and beat or matched RDS (using the latest Intel instance type) almost every time.
[0]: Arguably, it's also better at globally distributed DB clusters with loose consistency requirements, because it supports write forwarding. A read replica in ap-southeast-1 can accept writes from apps running there, forward them to the primary in us-east-1, and your app can operate as though the write has been durably committed even though the packets haven't even finished making it across the ocean yet. If and only if your app can deal with this loosened consistency, you can dramatically improve performance for distant regions. (A rough sketch of what that looks like from the app's side is at the end of this comment.)
[1]: https://d1.awsstatic.com/events/reinvent/2019/REPEAT_Amazon_...
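Re: [0], here's roughly what write forwarding looks like from the application's side. Treat it as a sketch under my assumptions: the endpoint, credentials, and table are hypothetical, and it assumes Aurora MySQL's write-forwarding session setting (aurora_replica_read_consistency); check the docs for the exact knob and semantics on your engine version.

    import pymysql  # pip install pymysql

    # The app only ever talks to the local reader endpoint in ap-southeast-1;
    # the replica forwards the INSERT to the primary in us-east-1 for us.
    conn = pymysql.connect(
        host="my-global-db.cluster-ro-abc123.ap-southeast-1.rds.amazonaws.com",
        user="app",
        password="changeme",
        database="appdb",
    )
    with conn.cursor() as cur:
        # Opt this session into write forwarding with the loosest read
        # consistency; this is the trade-off described in [0].
        cur.execute("SET aurora_replica_read_consistency = 'EVENTUAL'")
        cur.execute("INSERT INTO events (user_id, kind) VALUES (%s, %s)",
                    (42, "click"))
        conn.commit()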