
How MySQL is able to scale to 200M QPS – MySQL Cluster - andrewmorgan
http://highscalability.com/blog/2015/5/18/how-mysql-is-able-to-scale-to-200-million-qps-mysql-cluster.html
======
jasonmp85
> MySQL Cluster is a scalable, real-time in-memory, ACID-compliant
> transactional database

To what degree can something be called durable if it's in-memory?

> It is possible to choose how to store data; either all in memory or with
> some on disk (non-indexed data only). […] Disk-based data can be used to
> store data with less strict performance requirements, where the data set is
> larger than the available RAM.

It's great to have all the concepts spelled out in this article, particularly
the way nodes act as transaction managers when needed, synchronously replicate
to a "buddy", and have management capabilities for recovering from partition
events; however, it's almost misleading to use the MySQL name: this appears to
be first and foremost an in-memory solution with some afterthought given to
disk-based durability.

~~~
MichaelGG
VoltDB replicates to _k_ nodes so you can choose the level of durability.
Combined with UPSes in separate parts of the datacenter, it's durable in some
senses. (I'd be more concerned with a kernel or app bug that crashes the
system, if there isn't a "save RAM on unclean boot" mechanism.)

This only applies to the commercial version, though. The open source one
ditched that feature.
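
The k-replication idea is simple enough to sketch. This is a rough illustration of the general concept, not VoltDB's actual implementation: a write is acknowledged only after k replicas (beyond the primary) confirm it, so acknowledged data survives the loss of up to k nodes.

```python
# Hedged sketch of k-safety style replication (illustrative only, not
# VoltDB's real code): acknowledge a write only once k replicas confirm it.

class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value
        return True  # confirmation back to the writer

def replicated_write(primary, replicas, k, key, value):
    """Apply the write on the primary plus at least k replicas."""
    if len(replicas) < k:
        raise RuntimeError(f"need {k} replicas, only {len(replicas)} alive")
    primary.apply(key, value)
    acks = sum(r.apply(key, value) for r in replicas[:k])
    return acks == k  # acknowledged only when all k confirm

primary = Replica()
replicas = [Replica() for _ in range(2)]
assert replicated_write(primary, replicas, k=2, key="x", value=1)
assert all(r.data["x"] == 1 for r in replicas)
```

With k=0 this degenerates to a single unreplicated copy, which is why the choice of k matters for durability.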

~~~
jhugg
VoltDB, like MySQL Cluster and several other in-memory systems, uses snapshot
checkpoints and logging to maintain power-loss protection. VoltDB is a bit
different because of how it uses determinism + logical logging, but the point
is that most of these systems don't have to lose data on power loss.
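
The snapshot-plus-log recovery idea can be sketched in a few lines. This is a generic illustration, not VoltDB's or NDB's actual on-disk format: periodically snapshot the in-memory state, log every command applied after the snapshot, and on restart rebuild state as snapshot plus replayed log.

```python
# Hedged sketch of snapshot + command-log recovery (generic, not any
# vendor's real format): state after a crash = last snapshot, with the
# logged commands since that snapshot replayed on top.

import json

def apply_command(state, cmd):
    # A "logical" command: re-executing it reproduces its effect.
    op, key, val = cmd
    if op == "set":
        state[key] = val
    elif op == "incr":
        state[key] = state.get(key, 0) + val
    return state

def recover(snapshot_json, log_lines):
    state = json.loads(snapshot_json)           # 1. load the last snapshot
    for line in log_lines:                      # 2. replay commands since it
        apply_command(state, json.loads(line))
    return state

snapshot = json.dumps({"a": 1})
log = [json.dumps(["incr", "a", 2]), json.dumps(["set", "b", 5])]
assert recover(snapshot, log) == {"a": 3, "b": 5}
```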

~~~
MichaelGG
Sure but those are mostly backup, right? If a system could log to disk at a
fast enough pace, then it'd not really be a main memory DB, right? It'd be
essentially the same as a normal DB with a really good RAM cache. I've got to
imagine there are some VoltDB users going fast enough to make logging not 100%
loss proof.

~~~
jhugg
Actually no. The key innovation is that data is stored in a memory friendly
way, then logged to disk in an append-only, log friendly way. VoltDB can be
fast and persistent; the catch is that your data must fit in memory.
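
A minimal sketch of that split, assuming nothing about VoltDB's internals: the same write hits two structures with very different access patterns, a random-access in-memory map that serves reads, and a sequential append-only file that provides durability. Sequential appends are the access pattern disks handle fastest, which is why persistence need not slow the in-memory store down.

```python
# Illustrative sketch (not VoltDB code): memory-friendly layout for reads,
# log-friendly append-only file for durability.

import os, tempfile

class LoggedStore:
    def __init__(self, log_path):
        self.mem = {}                        # memory-friendly: random access
        self.log = open(log_path, "a")       # log-friendly: append only

    def put(self, key, value):
        self.log.write(f"{key}={value}\n")   # sequential append to disk
        self.log.flush()
        self.mem[key] = value                # in-memory update

    def get(self, key):
        return self.mem[key]                 # reads never touch disk

path = os.path.join(tempfile.mkdtemp(), "store.log")
s = LoggedStore(path)
s.put("x", 1)
s.put("x", 2)
assert s.get("x") == 2
with open(path) as f:
    assert f.read() == "x=1\nx=2\n"
```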

~~~
MichaelGG
Oh really? I'm not sure how I missed that all this time. Is it a newer
feature? I don't recall it in the H-Store paper or in any of the docs I read.
In fact I recall sync vs. async logging being a decision/tradeoff. And there
used to be a rather large emphasis on choosing a good k factor, and it
certainly seemed to be about durability. Otherwise, it's just another normal
HA thing, no big deal? As in, k=0 is a perfectly fine setup, if logging is
enabled?

~~~
bballard
Command logging has worked this way since VoltDB 2.0. Sync mode needs hardware
that can support more frequent fsyncs, such as a disk controller with BBWC
(inexpensive and does the job) or a server-class SSD, and should be configured
to write the command log and snapshots to separate spindles. Async mode is the
default setting so that, out of the box, VoltDB will perform as expected in
the cloud, on a VM, or on a laptop.
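
The sync/async trade-off comes down to when the fsync happens. A hedged sketch of the general idea (not VoltDB's configuration API): sync mode fsyncs before acknowledging each transaction, while async mode fsyncs on a timer, so a crash can lose whatever was acknowledged since the last fsync.

```python
# Generic illustration of sync vs. async command logging: sync mode makes
# the command durable before acknowledging; async mode leaves a window of
# acknowledged-but-unsynced commands at risk until the periodic fsync.

import os, tempfile

class CommandLog:
    def __init__(self, path, sync):
        self.f = open(path, "a")
        self.sync = sync
        self.unsynced = 0                   # commands at risk on power loss

    def commit(self, cmd):
        self.f.write(cmd + "\n")
        self.f.flush()
        if self.sync:
            os.fsync(self.f.fileno())       # durable before we acknowledge
        else:
            self.unsynced += 1              # durable only at next fsync
        return "ack"

    def periodic_fsync(self):
        os.fsync(self.f.fileno())
        self.unsynced = 0

path = os.path.join(tempfile.mkdtemp(), "cmd.log")
sync_log = CommandLog(path, sync=True)
assert sync_log.commit("set x 1") == "ack" and sync_log.unsynced == 0
```

Sync mode's extra fsync per commit is exactly why it wants BBWC or a server-class SSD underneath.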

------
cengizIO
I'm sorry but MySQL Cluster is pure refined salty horse shit. Nothing more.
We're currently at the office trying to recover our production data from its
filthy hands.

~~~
no1youknowz
Preach it loud and true brother!

I had MySQL clustering in the lab with some serious live traffic for about a
month. Unless you have a team of DBAs whose job is just to monitor it and keep
it going MacGyver-style, seriously, forget it.

There are other, better options out there, with fewer headaches!

------
jandrewrogers
A modern and properly implemented database engine should be bottlenecked on
network saturation for these types of workloads even if backed by disk
storage, never mind in-memory. These workloads are trivially distributable, so
scale-out is expected unless you are doing something really wrong.

In that sense, similar "look how fast my simple lookups are" benchmarks
published by other database vendors are really a function of their hardware
budget for the benchmark assuming the implementation is competent. Queries
with complex constraints or joins would be a more interesting indication of
implementation scalability and performance.

~~~
toomuchtodo
I'd be interested to see a benchmark test suite for ACID-compliant databases
with, as you mentioned, complex constraints and joins. Get as close to real-
world workloads as possible.

I don't want to see the quarter mile, I want to see the road rally.

------
no1youknowz
I couldn't care less.

I found a bug and 5 months later it's still not resolved.

[https://bugs.mysql.com/bug.php?id=75293](https://bugs.mysql.com/bug.php?id=75293)

Maybe start fixing things people may want to use, instead of claiming "yeah we
can do that".

I no longer use MySQL, I have upgraded to something more "enterprise" and more
useful.

------
mrbonner
> distributed transactions based on 2-phase commit.

I just stopped reading there.

~~~
Xorlev
That's the way Megastore does it too. Optional 2PC for transactions that span
entity groups. So what's wrong with 2PC?

~~~
takeda
If you have a distributed system and need to create a transaction across
multiple nodes 2PC has some shortcomings[1], the 3PC is addressing this.

[1] [https://en.wikipedia.org/wiki/Three-phase_commit_protocol#Motivation](https://en.wikipedia.org/wiki/Three-phase_commit_protocol#Motivation)
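
The shortcoming in question is the blocking problem, which a few lines make concrete. This is a deliberately minimal illustration (real protocols add timeouts and recovery): once a participant votes "prepared" it must hold its locks until it hears commit or abort, so a coordinator crash between the two phases leaves it blocked.

```python
# Minimal sketch of the classic 2PC blocking problem (illustrative only):
# a coordinator crash after phase 1 leaves prepared participants holding
# locks with no decision.

class Participant:
    def __init__(self):
        self.state = "init"

    def prepare(self):
        self.state = "prepared"   # locks are held from this point on
        return "yes"

    def finish(self, decision):
        self.state = decision     # "commit" or "abort" releases locks

def two_phase_commit(participants, coordinator_crashes_after_prepare=False):
    votes = [p.prepare() for p in participants]           # phase 1: prepare
    if coordinator_crashes_after_prepare:
        return None                                       # nobody hears phase 2
    decision = "commit" if all(v == "yes" for v in votes) else "abort"
    for p in participants:                                # phase 2: decide
        p.finish(decision)
    return decision

ps = [Participant(), Participant()]
two_phase_commit(ps, coordinator_crashes_after_prepare=True)
assert all(p.state == "prepared" for p in ps)   # blocked, locks still held
```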

~~~
frazerclement
Note that the 2PC in Ndb is not quite the same as that used in the usual
'2PC is blocking' arguments:

    
    
      1.  The transaction coordinator (TC) role is 
          automatically taken over in case of the 
          failure of the coordinator during commit.  
          The new coordinator uses the surviving 
          participants' states to quickly decide 
          on Commit or Abort so that locked 
          resources are released.
    
      2.  As the participants are part of an integrated 
          system, failure of both the coordinator and 
          e.g. the only participant to be aware of the 
          commit decision can be handled correctly.  
          In this case the transaction is quickly 
          aborted by the new coordinator and on recovery 
          the old participant will 'undo' its local
          committed-state before rejoining the cluster.
    
      3.  Participants' local commit decisions are not 
          coupled directly to lock release / visibility 
          of those decisions.  Locks are released only 
          when the participant-failure-durability of 
          the commit decision is assured (e.g. all 
          replicas have committed).
    
      4.  Ndb has a third phase (The complete phase) 
          which releases locks on replicas.  This is 
          not the same as the algorithm referred to 
          as three phase commit, so we don't call it 
          three phase commit.  Also, the third Complete 
          phase is not part of the critical commit 
          path - the commit acknowledgement is sent 
          to the application prior to the start of 
          the third phase, so it does not affect 
          the user's experienced commit latency.
    

Not all '2PC' implementations are created equal.
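
Point 1 above can be sketched in a few lines. This is an illustration of the takeover idea only, not NDB's actual protocol: when the coordinator fails, the new coordinator polls the surviving participants; if any of them has already committed, the decision must have been commit, otherwise it is safe to abort and release locks quickly.

```python
# Hedged sketch of coordinator takeover (illustrative, not NDB code):
# the new TC decides commit/abort purely from surviving participants'
# states, so locks are released without waiting for the failed node.

def takeover_decision(surviving_states):
    """surviving_states: per-participant state, e.g. 'prepared' or 'committed'."""
    if any(s == "committed" for s in surviving_states):
        return "commit"   # the commit decision was already durable somewhere
    return "abort"        # no survivor saw a commit: abort and release locks

assert takeover_decision(["prepared", "committed"]) == "commit"
assert takeover_decision(["prepared", "prepared"]) == "abort"
```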

------
ksec
Is there anything similar for Postgres?

~~~
takeda
I suppose [http://www.postgres-xl.org/](http://www.postgres-xl.org/)?

