What happens here in the event of a network partition? There are several possible failure modes, if I’m reading the diagram right. First, a network failure between the Aurora node and the shared storage layer.
I would expect that to result in a failed transaction and rollback, right? That’s not really all that available though, is it?
What about a network partition between availability zones? Maybe your local node thinks it has a quorum because it can no longer see a different AZ? What happens when the partition is healed? How do the nodes reconcile? What if you’re working with multiple AZs and you have application nodes in each AZ writing to a cluster of nodes that all think they have a quorum because they can’t see each other?
What happens to those writes after the partition? Is every transaction terminated and rolled back because us-east-1 can’t see us-west-3?
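To make the quorum question concrete: with a strict-majority rule, two disjoint sides of a partition can never both hold a quorum, so "every side thinks it has a quorum" only happens if the rule is weaker than that. A minimal sketch, assuming a hypothetical cluster where every voter counts equally (the function names are made up, and this is not a description of how Aurora actually decides):

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """True if `reachable` voters (counting this node) are a strict majority."""
    return reachable > cluster_size // 2


def sides_with_quorum(partition_sizes):
    """Given the size of each isolated group, return the groups that may keep writing."""
    total = sum(partition_sizes)
    return [size for size in partition_sizes if has_quorum(size, total)]


# A 5-node cluster split 3/2: only the 3-node side may accept writes.
print(sides_with_quorum([3, 2]))
# A 4-node cluster split 2/2: neither side has a majority, so writes stop everywhere.
print(sides_with_quorum([2, 2]))
```

Note the second case: an even split leaves nobody with a quorum, which is consistent but not available.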
If they aren’t, how do the nodes achieve consensus? Are we doing a MongoDB thing here and just tossing everything after the last shared state? And even if that does work out,
What’s going on? And how is it achieved?
Look, if you need transactions at all, you most likely need isolation level: serializable. And that’s incredibly hard to get right.
Many systems really don’t need any of that at all, but when you need it, it has to be right.
This doesn’t address any of the hardest parts of distributed systems. It doesn’t address why I would take any application at all that stores data in an RDBMS and use this.
I’m sure everyone there on that team is working hard. But people have to stop writing about distributed systems like this.
CAP is a simple, generic academic model. Researchers and implementers moved on from that a long time ago.
In the case of Google, they have partition-free networks for newer databases, so CAP rarely applies.
Galera/Percona Cluster has solutions for all the above also. Generally multi-master writes continue, and the behind server either catches up or gets automatically rebuilt.
MySQL 8 has something equivalent to Galera.
I would recommend testing (or waiting) a year for multi-master Aurora to bake. In the meantime, you can use Percona Cluster if you can't wait.
You can learn more about the state of the art in multi-region, multi-master RDBMSs here:
Source: DBA who's operated Percona Cluster for 5+ years. Contact me if you need extreme database performance or availability.
No, asynchronous networks cannot physically be partition-free. They are actually always partitioned.
> Galera/Percona Cluster has solutions for all the above also
They don't have solutions to all of the above.
C: Stay consistent but not available
A: Stay available but not consistent
There is no other option.
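Spelled out as code, the choice looks roughly like this. A minimal sketch, assuming a hypothetical node that knows whether it can reach a majority (`Mode`, `handle_write`, and the return strings are made up for illustration, not any real product's API):

```python
from enum import Enum


class Mode(Enum):
    CP = "stay consistent"
    AP = "stay available"


def handle_write(mode, can_reach_majority, local_log, record):
    """Decide what a node does with a write, depending on the partition state."""
    if can_reach_majority:
        local_log.append(record)
        return "committed"
    if mode is Mode.CP:
        # Consistent but not available: refuse the write on the minority side.
        return "rejected"
    # Available but not consistent: accept locally, reconcile (or lose it) later.
    local_log.append(record)
    return "accepted-locally"


log = []
print(handle_write(Mode.CP, False, log, "x=1"))
print(handle_write(Mode.AP, False, log, "x=1"))
```

Whatever a vendor calls its conflict handling, during a partition every write ends up in one of those two branches.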
Even ignoring Amazon, MySQL/Percona's MHA story has had a history of buggy, weak, imprecise promises.
I guess Google-level special sauce gets a pass, since they have published the most about specially clocked hardware and how it changes DB design.
You have some real chutzpah, sir. I respect that. You are as wrong as can possibly be, but this is a bold move.
> At launch, Aurora Multi-Master supports two node clusters in a single Region. Support for more writer nodes and placement of writers in multiple Regions is planned for future releases.
Great to see more high availability options, but yet again it's just in the same region, when we see the East region go down entirely more often than single availability zones. At least this time they mention that limitation in the announcement and an intention to fix it later. In the meantime I'll stick to our multi-region Galera setup.
I have a lot of spreadsheets and PDFs sitting in a silo, and I want to be able to build a searchable MySQL database from those documents.
Is it possible? How?
Here is an example for spreadsheets, but I think spreadsheets are easy, because there are some solutions like MySQL Workbench.
But for PDFs, which have unstructured and semi-structured data and vectors, it's hard to find a good solution to extract/convert them into MySQL.
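For the spreadsheet half, the general shape is: export to CSV, flatten each row into text, load it into a table, and query it. A minimal sketch under loud assumptions: `sqlite3` from the Python standard library stands in for MySQL so the example runs without a server, the `documents` table and the sample data are made up, and PDFs are not covered here because they need a separate text-extraction step first.

```python
import csv
import io
import sqlite3


def load_rows(conn, source_name, csv_text):
    """Store every CSV row as one searchable text record; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents "
        "(source TEXT, row_no INTEGER, content TEXT)"
    )
    reader = csv.reader(io.StringIO(csv_text))
    rows = [(source_name, i, " ".join(cells))
            for i, cells in enumerate(reader, start=1)]
    conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", rows)
    return len(rows)


def search(conn, term):
    """Return (source, row_no) pairs whose content contains `term`."""
    cur = conn.execute(
        "SELECT source, row_no FROM documents "
        "WHERE content LIKE ? ORDER BY source, row_no",
        (f"%{term}%",),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for a real MySQL connection
load_rows(conn, "inventory.csv", "item,qty\nwidget,4\ngadget,7\n")
print(search(conn, "gadget"))
```

On an actual MySQL server you would swap in a driver connection and would probably want a FULLTEXT index on the text column instead of `LIKE`.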
This faux-indignation puffery has to change. It's 2019.