
Aurora - New MySQL-Compatible Database Engine - ak217
http://aws.amazon.com/blogs/aws/highly-scalable-mysql-compat-rds-db-engine/
======
bcantrill
(Disclaimer: I work for an AWS competitor.)

To me, there is a very interesting contrast to be had between this
announcement and Microsoft's announcement: it feels like Microsoft is
discovering the business value of being open at the same time that Amazon is
living in the time warp of proprietary everything. Has Microsoft internalized
that open source is (or can be) a differentiator in the cloud? Amazon is
clearly still oblivious to it -- and it will be very interesting to see if
this service generates fear of vendor lock-in...

~~~
mallipeddi
Amazon is way ahead of everyone else in the cloud space. Their closest rivals
(Microsoft & Google) are pretty much copying AWS. Most of Azure's services are
equally proprietary and I doubt that will ever change.

This just leaves the other smaller players (DigitalOcean, Joyent, Rackspace,
etc) who are mostly offering something akin to EC2 and then partnering with
other vendors to offer the missing pieces on top (frankly what other choice do
they have? They can't get into a race with Amazon/Google on who can build the
most no. of services - they will never win that race).

~~~
zzzeek
ignore OpenStack at your peril.

~~~
jpgvm
I worked on and around Openstack for 18+ months. I think it is pretty safe to
ignore at this point.

<rant>

There might come along something that is actually good, but Openstack isn't
it. The architecture is just bad, it will never be as reliable as something
like AWS. (not because of scale, purely because of lack of error handling
capabilities)

Reality is these sorts of orchestration systems need to be written by
specialists. The vast majority of Openstack was written by people that admit
they have no clue about systems level programming. This is what hype does, it
forces a whole bunch of bad programming and architecture down everyone's
throats.

The marketecture and hype machine have done horrible things to Openstack. Not
only have vendors riddled the thing with lockin and crap code that needs to be
supported, but they have pushed entire projects that should have been shot in
the head. _cough_ Ceilometer _cough_

There are security, performance and plain availability problems everywhere,
mostly embedded deep in the architecture.

Dumb decisions like "All systems must be Python" when Python is clearly not
suited to a large number of the things they want to do is painful. As is the
general "not invented here" syndrome and boys club that leads to certain
libraries or patterns being pushed over others, usually to the peril of the
project due to the 'blessed' thing being incomplete and unproven.

</rant>

I don't intend on returning to Openstack if I can avoid it, unfortunately my
skill-set does tend towards that sort of thing so we will see how I go.

~~~
kiennd
Why do you say that when a lot of large enterprises are very actively
investing in it? Can you give more details?

------
andr
Reading between the lines, it sounds like they've rewritten InnoDB's storage
layer and made it multi-server aware w/ consensus.

------
jasondc
I think the problem AWS is seeing is how they are being commoditized (e.g. you
can just run your database on the cheapest hosting provider), so profits will
move towards 0. They will need to add more services like this (Aurora,
DynamoDB, etc.) to ensure AWS isn't a commodity (and you can't just easily
switch to DigitalOcean).

~~~
Cieplak
I think that the AWS vendor lock-in won't come from single features such as
databases from specific vendors, but rather the ecosystem of Route53, elastic
load balancers, S3, EBS, AMIs, Glacier, RDS, CloudFormation templates, SQS,
security groups, libraries like boto, auto scaling groups, and everything else
besides instance provisioning. Of course you can implement most of these
yourself, e.g., HAProxy for ELBs, rabbitmq for SQS, but all that requires
application code changes and significantly more configuration management.

~~~
freshflowers
So far I've been able to avoid vendor lock-in on the code level (besides S3,
but the S3 API is simple and has become pretty much a standard for object
storage).

On the infrastructure level, the "lock-in" part is not so much the technology,
all of that is relatively easy to replace. But it would take me two additional
FTE's to configure and manage everything AWS offers as a service ourselves.

But that applies to a lot of things these days, AWS just happens to be a one
stop shop for a growing range of commodity services.

------
sciurus
More product details:
[http://aws.amazon.com/rds/aurora/details](http://aws.amazon.com/rds/aurora/details)

and Frequently Asked Questions:
[http://aws.amazon.com/rds/aurora/faqs/](http://aws.amazon.com/rds/aurora/faqs/)

------
AaronFriel
Ouch. This pricing is pretty rough for SaaS sold on the premise of cloud
scalability.

At $200/month for the entry level, their lowest price is many times what the
cheapest geo-replicated "SQL engine as a service" from Google or Microsoft is.
I'm not sure how the performance differs, but I am guessing theirs are no
slouches.

Microsoft offers "SQL Database" geo-replicated for as low as $15/mo., and it
scales up from there. Not sure about performance, but it would be apples to
oranges (MySQL versus SQL Server) and difficult to compare. I wonder what the
TPC numbers are, but apparently the TPC organization doesn't allow publishing
that yet.

Google offers "Google Cloud SQL", also geo-replicated, and their cheapest
pricing is between $10 and $18 a month.

~~~
gaadd33
I don't think this is targeted towards those who only need to store a small
amount of data and have very low performance needs. If you are already
spending $500/mo on an RDS instance, then it sounds like this would be a great
solution. If you are spending $10/mo on a micro instance for your database
server, I don't think this is targeted towards you.

------
frozenport
Does this introduce more latency compared to running a LAMP stack on EC2 or
does it target larger databases? My database is about 300 megabytes.

~~~
Xorlev
You probably aren't the target audience if your database fits in RAM.

~~~
frozenport
You can see notable changes in performance if your database gets fragmented,
and tweaking the my.cnf file seems like a massive distraction from my
actual work. Having somebody handle these and improve performance for long
queries would be wonderful.

~~~
jcampbell1
With a database that small, performance problems shouldn't ever be
noticeable. Are you running the DB on EBS?

1) If you are having trouble with tweaking my.cnf, give the perl script
`mysqltuner` a go. It is pretty good.

2) The biggest improvement would be moving the storage to SSDs. I personally
prefer Linode to DO, but both are cheap, and good.

My linode servers can do full text non-blocking backups at about 100MB per
second, so depending on your outage recovery requirement, you may be able to
get away with a simple cron job.

------
LeonD
Is a relational workload that requires four-nines-plus availability like
Amazon is touting really something an organization would be willing to run in
the cloud? Especially since that category includes a lot of personally
identifiable data that legally can't be trusted to a third party. Wonder what
kind of use cases they're aiming for.
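
For a sense of scale, "four nines" can be turned into a yearly downtime budget. This is a minimal sketch, assuming a 365-day year and treating availability as a plain uptime fraction. The function name is made up for illustration, and this is only arithmetic, not how Amazon measures or guarantees anything.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Allowed downtime per year for a given uptime fraction (e.g. 0.9999)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, a in [("three nines", 0.999),
                     ("four nines", 0.9999),
                     ("five nines", 0.99999)]:
        print(f"{label}: {downtime_minutes_per_year(a):.2f} minutes/year")
```

Under those assumptions, four nines leaves a bit under an hour of downtime per year.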

~~~
techdebt5112
people should be roughly as comfortable with this as with operating mysql on
EC2 or using RDS, and I think that's quite common.

------
state_machine
"Aurora" is already the name of a cloud infrastructure management product, an
open-source Apache project no less:
[http://aurora.incubator.apache.org/](http://aurora.incubator.apache.org/)

Using an existing name for a product in a similar space is just confusing and
hurts everyone.

~~~
cr3ative
To be fair, I just searched for "aurora software" and "aurora open source" and
got nothing for that project on the first page of results.

A lot of the good names are already taken; there's going to be overlap, and
they're very different products.

~~~
covi
Name-clashing with an Apache project really surprises me. Also, consider the
fact that Aurora stems from Mesos, which Amazon's cloud folks (although DB
people are doing this one) supposedly have to know.

------
slik33
Why mysql? Why not postgres engine?

~~~
ahachete
No one is saying that Aurora is built on MySQL, I'd personally like to know :)
It's the protocol (drivers) compatibility that has been claimed so far.

I'd agree with you from a technical perspective it would have been better to
offer Postgres compatibility, rather than MySQL. But I believe the intent is
to target both a bigger market (as of today) and challenge Oracle.

~~~
BenoitP
> tables that I create in Amazon Aurora must use the InnoDB engine

It hasn't been claimed, but the article is filled with MySQL comparisons and
references. I would not be surprised if it was a MySQL fork.

------
loco77
4 times faster than MySQL on the same platform, how did they pull that off?

~~~
sciurus
Here is what they claim in the FAQ:

"Amazon Aurora delivers significant increases over MySQL performance by
tightly integrating the database engine with an SSD-based virtualized storage
layer purpose-built for database workloads, reducing writes to the storage
system, minimizing lock contention and eliminating delays created by database
process threads. Our tests with SysBench show that Amazon Aurora delivers over
500,000 SELECTs/sec and 100,000 updates/sec, five times higher than MySQL
running the same benchmark on the same hardware."

~~~
brendangregg
I'd love to see more details (active benchmarking). At the very least, what is
the CPU and disk utilization during the benchmark? It'll shed some light on
how this was done: a 5x improvement in cycles per query? Or better caching?

People will benchmark this themselves ASAP. If you do, try to include some
basic system metrics. The output of "vmstat 10", "iostat -x 10", and some
"pidstat -t 1" would be a great start. This may only be possible for the MySQL
benchmark, if Aurora is only visible via an API, and the database system can't
be accessed directly (?).
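
A minimal sketch of capturing those metrics alongside a benchmark run, assuming a Linux host you can log into, with `vmstat` (procps) and `iostat`/`pidstat` (sysstat) installed. The log file names, the 60-second duration, and the helper names are placeholders for illustration.

```python
import subprocess


def build_commands(duration_s: int = 60) -> dict:
    """Return {log file: command} pairs sized to cover roughly duration_s seconds."""
    tens = max(1, duration_s // 10)  # number of 10-second samples
    return {
        "vmstat.log": ["vmstat", "10", str(tens)],
        "iostat.log": ["iostat", "-x", "10", str(tens)],
        "pidstat.log": ["pidstat", "-t", "1", str(duration_s)],
    }


def collect(duration_s: int = 60) -> None:
    """Run all samplers concurrently, each writing to its own log file."""
    procs = []
    for path, cmd in build_commands(duration_s).items():
        out = open(path, "w")
        proc = subprocess.Popen(cmd, stdout=out, stderr=subprocess.STDOUT)
        procs.append((proc, out))
    for proc, out in procs:
        proc.wait()
        out.close()


if __name__ == "__main__":
    # Start this just before kicking off the benchmark.
    # Note: the first vmstat/iostat sample reports averages since boot.
    collect(60)
```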

~~~
sciurus
To get the most comparable hardware setup for the comparison, I think you
would want to run both Aurora and MySQL via RDS. RDS does not give you access
to the underlying EC2 instance, just to metrics via CloudWatch.

------
zippergz
Has anyone been able to find specific pricing on this? All of the language
I've seen about it has been vague. One of the things that's kept me from using
RDS for toy projects is that it basically amounts to running another instance
on top of what I already have. For a "real" project, that's a drop in the
bucket, but for toys and prototypes, it can be a significant chunk of the cost
(so I usually end up just running my own db on the same instance as the web
server for those projects).

~~~
sciurus
Pricing is at
[http://aws.amazon.com/rds/aurora/pricing/](http://aws.amazon.com/rds/aurora/pricing/)

The cheapest is a db.r3.large for $0.29 / hr

~~~
zippergz
Ouch. Thanks.

~~~
cjg_
Just slightly more expensive than the same instance with MySQL on RDS ($0.24)
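
To put the hourly rates quoted above in monthly terms, here is a quick sketch. It assumes about 730 hours in a month (365 * 24 / 12) and covers instance hours only, leaving out storage and anything else billed separately. The function name is just for illustration.

```python
HOURS_PER_MONTH = 365 * 24 / 12  # 730, an averaged month (assumption)


def monthly_cost(hourly_rate: float, instances: int = 1) -> float:
    """Instance-hours cost for one month, rounded to cents."""
    return round(hourly_rate * HOURS_PER_MONTH * instances, 2)


if __name__ == "__main__":
    aurora = monthly_cost(0.29)  # db.r3.large on Aurora, rate quoted in this thread
    mysql = monthly_cost(0.24)   # same instance with MySQL on RDS, rate quoted in this thread
    print(f"Aurora: ${aurora:.2f}/mo, MySQL RDS: ${mysql:.2f}/mo")
    print(f"Aurora premium: {0.29 / 0.24 - 1:.0%}")
```

On those assumptions the gap is roughly $212 versus $175 a month per instance, a premium of about 21%.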

~~~
meritt
Yes but "Storage is automatically replicated across three AWS Availability
Zones (AZs) for durability and high availability, with two copies of the data
in each Availability Zone."

Is that included in the server cost? Or will you actually be paying ~3x the
server + storage costs?

~~~
dk8996
The pricing needs to be clearer. For example, we run MySQL RDS in production
with two AZs enabled and pay 2x the price, so if it only costs a bit more
than MySQL RDS and you get 3 AZs for the same price, this is good. Moreover,
there are costs that come with storage (database size), and this thing seems
to be smart about allocating space efficiently.

~~~
Wouter33
As I understood this, your database storage is replicated to multiple places,
but it still runs on one instance. It does not have failover to those
replicated places. You'll have to launch multiple instances for that just like
RDS. So cheaper, no.

------
donmaq
I'm interested in transaction support for e-commerce. Magento users can
experience significant performance issues on MySQL, and the upper tier are
always interested in more RDBMS performance, eg via NewSQL solutions.

So what can Aurora do for that workload? Do they support multi-table
transactions and referential integrity across all 3 Availability Zones?
Similarly, they mentioned durability targets; what are their targets for
consistency (i.e. ACID)?

------
i_have_to_speak
This is beginning to feel like how it was when Intel was competing with AMD --
evolutionary improvements at first, which just kept on coming, increasing in
size and impact, until AMD just faded out of the high-perf server market.

------
dschiptsov
So, this is a Xen VPS (Amazon uses Xen as far as I remember) with a MySQL 5.6
database, with a custom storage engine that stores data in some Dynamo-based
storage. Any code could access it via MySQL protocol (bindings to
libmysqlclient.so). Fine.

~~~
jakozaur
I appreciate your technical insight, but glueing those technologies together
is far from trivial. Making it easy to manage and reliable is even harder.

I trust your best intention, but even Dropbox was dismissed on HN as user-
friendly rsync.

~~~
arnehormann
MariaDB enabled Cassandra as a storage engine:
[https://mariadb.com/kb/en/mariadb/documentation/storage-engi...](https://mariadb.com/kb/en/mariadb/documentation/storage-engines/cassandra/cassandra-storage-engine-overview/)

------
skj
oh neat, AWS following GCP instead of the other way around.

~~~
loco77
Your profile says you work at Google...

~~~
skj
I certainly do. On cloud no less. The reality is that Amazon got to market
several years before Google, so GCP had some catching up to do. That trend is
changing, and this is one indication.

~~~
loco77
If you're going to post your subjective opinion on something involving your
employer vs a competitor it's best to include a disclaimer in future.

~~~
skj
What was the subjective opinion? "neat"? The rest was factual.

~~~
loco77
The subjective opinion was about "following" when Amazon has had relational
database services for years, and the latest engine they say was under
development for 3 years, not exactly something in response to goog.

Irrespective of that you shouldn't be posting as joe public when in fact you
are a Google employee playing cheerleader for Google products on threads about
your competitor's products. Keep it classy.

~~~
skj
We're not talking about a relational database, we're talking about a mysql-
wire-compatible database. Google BigQuery has been around for a while too.

Adding "disclaimer: googler" to every post I make on HN seems pretty obnoxious
to both me and anyone reading. I just didn't feel that "neat" + a fact
qualified. Clearly opinions on that differ, and I'll probably just post less
in the future.

~~~
loco77
It's only necessary in cases where there is an obvious conflict of interest.

~~~
skj
That's fair. It's certainly not my intention to astroturf.

------
himanshuy
What is the use of a cloud relational database service? I thought everybody is
going for Document Based databases.

~~~
wmf
If your data is fundamentally relational you're better off putting it into a
relational DB.
[http://www.sarahmei.com/blog/2013/11/11/why-you-should-never...](http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/)

~~~
ranman
Just want to point out that the headline is sensational and there are many
good use cases for document stores.

~~~
pessimizer
The headline doesn't mention document stores, it mentions MongoDB.

~~~
ranman
Which (as a document store) has plenty of use cases... and even when you read
the article it's mostly about how they didn't understand what they were doing
going into things... It's not a compelling argument.

"But this stuff wasn’t obvious at all. The MongoDB docs tell you what it’s
good at, without emphasizing what it’s not good at. That’s natural. All
projects do that. But as a result, it took us about six months, a lot of user
complaints, and a lot of investigation to figure out that we were using
MongoDB the wrong way."

The film/actor/career example they wanted to do could have been solved pretty
easily with a document that had an _id for an actors collection and a name...
that's just the document paradigm... I'm a huge fan of both postgres and
mongodb and I know that's not a popular opinion around HN. I just get tired of
seeing clickbait headlines being cited as if the criticism within is correct.

