Amazon Aurora: Parallel Read Ahead, Faster Indexing, NUMA Awareness (amazon.com)
63 points by Zenfinch on Sept 2, 2016 | hide | past | favorite | 28 comments

We've switched 6 or 7 of our production databases to Aurora now. It is much faster than MySQL/Percona Server. Even highly optimized MySQL installs with FusionIO cards struggled to keep up before we switched to Aurora.

Aurora is expensive, but it's a huge instant speedup and easily worth the cost for anyone with a production application that occasionally hits performance limitations on traditional MySQL.

That said, it is based on MySQL 5.6.10, so we are missing some features like online DDL from 5.6.17. Many bugs from 5.6.10 have been resolved upstream but are still present in Aurora. [0] It's also subject to the usual limitations of RDS (no SUPER, no access to the binary database files, no innobackupex).

[0] https://www.percona.com/blog/2015/11/16/amazon-aurora-lookin...
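To make the version gap above concrete, here is a minimal sketch of checking whether a server's reported version predates a given upstream release. The helper names `parse_version` and `predates` are hypothetical, and the sketch assumes the version string starts with a dotted numeric part (for example `5.6.10` or `5.6.10-log`); it says nothing about what Aurora itself reports.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract the leading dotted numeric part of a MySQL-style version string."""
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"no numeric version found in {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def predates(version: str, required: str) -> bool:
    """Return True if `version` is strictly older than `required`."""
    # Tuples compare element by element, so (5, 6, 10) < (5, 6, 17).
    return parse_version(version) < parse_version(required)


if __name__ == "__main__":
    # The two versions mentioned in the comment above.
    print(predates("5.6.10", "5.6.17"))
```

A check like this only tells you the base version is older; whether a particular upstream fix has been backported is something the version number alone cannot answer.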

Cool, although I sure wish they had gone with Postgres. I can't live without the occasional JSONB anymore.

I know MySQL has made some half-hearted attempts to make headway on this front, but it has completely changed the way I model certain parts of my data.

I second this. I'd switch over from RDS to Aurora in a heartbeat if AWS built a postgres frontend for it.

Curious what is the key feature of Aurora on Postgres that you'd want?

PostGIS compatibility.

Unless you mean that the other way around, in which case I don't know what key features of Aurora I'm missing (besides price & performance).

Aurora is part of RDS. It's just a different DB instance type.

> I can't live without the occasional JSONB anymore.

Some can't live without Aurora anymore ;) - they start their projects with MySQL (even though they prefer PostgreSQL) to be Aurora-compatible, just in case they need it.

I'm curious what are the reasons some people "can't live without Aurora anymore". I would love to hear specific benefits of Aurora, regarding performance or other aspects.

The comparisons I could find didn't seem very favorable to Aurora (mainly, the claimed 5x improvement does not show up in the benchmarks).

I moved a 60 GB database to it last year that we had set up as Multi-AZ.

We cut our costs by about 40% after the switch and saw about a 20% boost in speed. RAM usage was lower too.

One of the biggest savings is that you don't need Multi-AZ, since your data is stored redundantly. You don't have to worry about failover because the restart time is < 1 minute anyway. The other thing is that I no longer have to keep an eye on disk usage (or pay for unused space), since storage just grows automatically as needed.

The man-hours we used to put into actively watching and tuning that DB have just vanished, as has almost the entire sysadmin burden. That project was already on MySQL, which made it an easy move.

I wish they'd do something similar for PG too.

The big win for me is administration, not performance.

Laaaaazy. :P

Aurora being a closed-source fork of MySQL is a real problem in my opinion.

Look at all the comments here about new features and bug fixes introduced upstream but missing in 5.6.10 (online DDL, JSON, etc.).

We already have Oracle's MySQL, MariaDB, WebScaleSQL, MyRocks (Facebook's MySQL with RocksDB and DocStore), Percona Server for MySQL, and now Aurora. Each version has its own features and peculiarities. The ecosystem is too fragmented.

For context: faster indexing and NUMA awareness are in MySQL 5.7. The parallel read-ahead is a Facebook patch.

Aurora is based on MySQL 5.6.

So is Aurora just a MySQL storage engine (e.g., InnoDB), or a modified version of MySQL under the hood?

Its API is based on MySQL, but it doesn't share internals with it.

They had a deep dive into it during the last re:Invent.


Aurora is a modified MySQL 5.6.10. It shares several bugs with 5.6.10 that have already been fixed in later versions of MySQL. My understanding is that it mostly remains unchanged other than the massive enhancements Amazon has made to the I/O performance.

Sure it does. You can see this from the diagnostic commands (InnoDB status, mutexes, etc.).

Really? Werner Vogels said they built a totally different logging and storage layer.

Also a good bit of video of Vogels talking about Aurora.


Logging and storage being new does not mean it does not share a lot of other internals.

Do you think that Aurora is a reason to start a new project with MySQL instead of Postgres? (Aurora starts at r3.large instances.)

I'm not sure. compose.com, among others, offers PostgreSQL hosting with auto-scaling. But I don't know whether they can match Aurora's 64 TB databases.

Does "MySQL compatible" mean that moving from MySQL to Aurora is transparent to the application, including all the MySQL (InnoDB?) peculiarities, which many regard as bugs?

This could be really important, because some applications end up relying on MySQL oddities even with well-intentioned developers.

Any equivalent for Postgres?

CitusDB is an option for scaling out PostgreSQL workloads, but it's not a direct equivalent to Aurora. That's both good and bad in different ways. E.g., it's good that CitusDB is not a fork of PostgreSQL as Aurora is a fork of MySQL. It's bad that CitusDB as pure software can't offer the same kind of high-performance, auto-replicated storage layer that Aurora does.

I remember Ed Kemmet saying that NUMA optimizations are patent-encumbered and presumably licensed. That's probably part of the reason this is closed source.

Does this support column-oriented database designs?

It's still MySQL

No, but Redshift does.
