
Clarification on “Call Me Maybe: MariaDB Galera Cluster” - crivabene
https://www.percona.com/blog/2015/09/17/clarification-call-maybe-mariadb-galera-cluster/
======
infinotize
Besides being a super well-written and technically interesting series, the
Call Me Maybe blogs have been very revealing about how different
organizations respond to criticism. Especially considering that all of the
target applications are open source, the project maintainers should be
profusely thankful that someone has taken the time to do such a thorough
analysis (presumably much deeper, at least on consistency behavior, than
anything the maintainers themselves have done) to reveal bugs whose fixes
should ultimately make each project that much stronger.

~~~
dbarlett
I really like jerf's take on it [1]:

    How a project performs today tells you the zeroth derivative of its
    location. Looking at the commit log tells you about the first derivative.
    How people react to Call Me Maybe when it's about their product gives you
    a lot of information about the second derivative.

[1]
[https://news.ycombinator.com/item?id=10082099](https://news.ycombinator.com/item?id=10082099)

Edit: username

~~~
seiji
A common response to these has been: "But you're using it wrong!!!!"

If your product is so complex or so ill-specified that full time testers can't
make it work correctly, it's probably difficult for your _developers_ to even
understand or fix the problems.

We ignore how much of software development has become "it works for me under
all of my default assumptions as the developer of the product—ship it,"
especially when faced with business deadlines and business management focused
around business objectives (and maybe not so much around software quality or
correctness).

------
tptacek
_I do not quite like the usage of the word “corrupted” here. For me, the more
correct word be to use is “inconsistent”._

Aren't we talking about situations in which a database tracking account
balances creates money out of thin air, or vaporizes it unexpectedly?

I feel like Aphyr is always at pains to talk about the real-world implications
of these findings --- not just how bad they are in sensitive applications, but
also the kinds of places you can get away with these "inconsistencies".

~~~
mattzito
I think the Percona writer is focusing on "corruption" in terms of how I think
most database folks imagine "corruption": where a series of commands will
erase a block of data, or make a block of data unrecoverable.

I agree that "inconsistent" is probably more mechanically accurate, but since
in this case the side effect of the inconsistency is that you can't trust the
contents of a block of data, it seems like a distinction without a difference.

~~~
pygy_
Yup, inconsistency is a kind of data corruption, provided you expect it to be
in a consistent state.

The emotional value of the latter is much stronger, though, and it sounds
worse to the uninitiated, which is probably why Percona folks are trying to
spin it that way.

~~~
StillBored
For me, "inconsistent" tends to imply a silent error, and "corruption" tends
to imply one that is immediately noticed.

I know that isn't accurate, but the tendency for it to be true actually makes
the word "inconsistent" upset me more than "corruption". I know how to fix bad
data; trying to track down a soft error that is hard to reproduce is the stuff
of my nightmares.

------
MichaelGG
Am I missing something, or does this not address the main issue the original
article raised: the documentation is simply incorrect. It claims to support
SNAPSHOT ISOLATION but does not. The company knows this, and even this article
says the behaviour "is totally expected".

Seems like the first response should be to fix the docs and not claim
capabilities beyond what's implemented.

(Also it was pretty clear from the original article that corrupted data meant
the balances were incorrect, not that the file was corrupted like a bad
checksum.)

~~~
fipar
Disclaimer: I work for Percona.

The docs are on the galeracluster.com page, which is owned and maintained by
another company, so there's no way we could fix those. A staff member from
that company (and one of the Galera authors) replied on the original 'Call me
maybe' post indicating they would fix the docs, though.

I think the 'corruption vs. inconsistency' debate could seem like nitpicking,
but anybody who has worked long enough on databases has a very specific
concept for each word, and, given that transaction processing (distributed or
otherwise) is such a complex topic, it does not help to use the wrong
terminology.

~~~
sergiosgc
Corruption is read by database people as "your data is unreadable";
inconsistency as "your data is wrong".

The article implies that inconsistent is better than corrupt, and that mixing
up the terms paints a worse picture. The assumption is flawed. Inconsistent is
as bad or worse than corrupted. It is a silent failure, and silent failures
are worse than visible ones.

~~~
bakhy
i'm kinda surprised by the prevalence of that attitude. in theory, yes, you
are right. but that's really a bit of programmer purism. in practice, a DB
that stops working can mean a sudden stop in doing business, which is a huge
catastrophe. money lost every second.

inconsistency is horrible, and theoretically the same thing, you are right!
but it's something you might be able to fix, without the whole system failing,
with maybe just a limited number of people knowing what happened and not the
whole company and/or all customers. this is kinda fucked up, but i think it
counts. (and PS, accountants find these errors. it's their job, and they have
experience enough not to trust any proof of correctness, and check everything
twice.)

~~~
tptacek
An account database that makes new money out of thin air is a huge
catastrophe!

I honestly don't understand how this is even a live issue for debate. The
"inconsistency" we're talking about is terrifying.

~~~
bakhy
they are both a catastrophe, i never implied that inconsistency is not one.
i'm merely involved in semantic bickering here :) both are bad in their own
way, but which is worse will depend on the case. bear in mind that programmers
make mistakes all the time. they can also cause such effects. which is why we
often do things in a way where incorrect data can be corrected later.
particularly in anything related to accounting, where the practice has existed
for centuries. i'm really not interested in claiming that one catastrophe is
better than the other, but a data corruption in a database _can_ sometimes be
more problematic.

edit - sorry, i just noticed i actually did say, in another thread here, that
corruption is worse. i would like to take that back now :D

------
rushi_agrawal
Below are Aphyr's tweets on the same subject (with the language slightly toned
down). I'm surprised I didn't find them mentioned here on HN. :) Nevertheless,
they're precise and to the point:

Buncha people giving me <filth> for calling data written through an invariant
violation "corrupted state", like somehow it's not garbage.

If TCP checksums don't work right we don't call the packet "inconsistent." We
call it corrupt. If a disk shuffles your file's bits? Corrupt.

I use the word corrupt to emphasize that not only has the system <messed> up,
but you have _no way to know_ your data is now <messed> up.

~~~
inyourtenement
I don't think you need to censor quotes on HN.

------
PeterZaitsev
There are many approaches to data consistency, including, for example,
eventual consistency, not only in databases but in parallel programming
generally. Different CPU architectures, for instance, have for years provided
different memory consistency models, trading off performance against
usability for application programmers.

It is important, however, that the behavior in this case is clearly
documented.

I think there is decent documentation describing how transactions work in
InnoDB: [http://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-model.html](http://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-model.html)

Galera would benefit from clearer documentation about exactly what data
consistency model it provides.

At the same time, Percona XtraDB Cluster and MariaDB Cluster can be used to
build reliable applications, assuming you're writing to their consistency
model correctly.

------
moomin
I think he's trying to define "corrupted" data as completely inaccessible data
(because of file corruption, presumably). However, if I zip up PaulGraham.jpg
and get a picture of Iron Man when I unzip it, I think I'd be within my rights
to call the data corrupt.

------
creshal
> If you use this in a real life, the more obvious way to write these
> transactions is:

Is it? Do ORMs really do that, or is it one of those "SQL was designed to be
used this way, but nobody using SQL read the design documents" cases?

~~~
dack
I think in general when writing applications, you should assume that any data
you keep in memory between SQL queries could become stale and change before
the next query/update.

Any time you have to update multiple individual records and rely on
calculating state from more than one of them at once, alarm bells should
start going off. Yes, you can start dropping into special transactions, but
there's also a potential opportunity for a better design.

Unfortunately, all of that feels like a huge amount of accidental complexity.
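
One common way to avoid that trap, sketched here with Python's stdlib sqlite3 rather than MySQL/Galera (the table and helper are hypothetical), is to push the arithmetic and the invariant check into the UPDATE statement itself, so no stale in-memory value is ever involved:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")

def withdraw(conn, account_id, amount):
    # The read (balance), the check (balance >= amount), and the write all
    # happen inside one statement, so there is no window for staleness.
    cur = conn.execute(
        "UPDATE accounts SET balance = balance - ? "
        "WHERE id = ? AND balance >= ?",
        (amount, account_id, amount),
    )
    return cur.rowcount == 1  # False means the guard rejected the withdrawal

assert withdraw(conn, 1, 60)      # 100 -> 40
assert not withdraw(conn, 1, 60)  # rejected, only 40 left
(balance,) = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()
print(balance)  # 40
```

This doesn't remove the need to understand your isolation level, but it shrinks the read-then-write window that the multi-statement pattern opens up.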

~~~
bboreham
> Yes, you can start dropping into special transactions

Transactions aren't "special" in SQL. You expect that reads and writes within
a transaction are kept consistent, unless you have deliberately chosen a
weaker isolation level.

~~~
dack
Well, SERIALIZABLE isn't the default in Oracle, Postgres, or MySQL: it's READ
COMMITTED, READ COMMITTED, and REPEATABLE READ respectively (unless the
documentation I just looked up is out of date).

I'm just saying you still can't rely on it by default, and you have to start
reading the details of the isolation levels. If you're doing that all the
time, it might be a reason to rethink the design (though there are of course
some exceptions).

------
AlisdairO
Edit: ignore this, it's addressing a completely different situation, and I
clearly didn't read the article well enough. The code in the article writes
all of the locations it reads, so true SI ought to keep you safe. My
apologies!

Yet another edit: huh, it seems that InnoDB in RR doesn't rollback when you
write to a row that's been written since you started the transaction. TIL.

--------

It's worth noting here that (AFAIR) Oracle's 'SERIALIZABLE' (actually SI)
level suffers from this exact write skew vulnerability, so MySQL/MariaDB is
not alone in this issue. As pointed out, SELECT FOR UPDATE is a commonly used
remedy.

What it comes down to is that while re-reading data in a given transaction
under SI will give you the same result, _it doesn't guarantee that the data
in the DB itself has stayed constant_. If you want to guarantee that the data
won't change, you need to lock it.

IIRC this also applies to PostgreSQL's REPEATABLE READ level.
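
The write-skew anomaly described here is easy to model without any database at all. A toy sketch in plain Python (hypothetical accounts x and y, with the application invariant x + y >= 0): both transactions read the same snapshot, each validates the invariant locally, and because their write sets are disjoint, first-committer-wins validation lets both commit:

```python
# Invariant the application wants: the two balances must sum to >= 0.
db = {"x": 100, "y": 100}

# Both transactions take their snapshot at the same moment.
snap1 = dict(db)
snap2 = dict(db)

# T1 withdraws 150 from x; against its snapshot, 200 - 150 >= 0 holds.
t1_writes = {}
if snap1["x"] + snap1["y"] - 150 >= 0:
    t1_writes["x"] = snap1["x"] - 150

# T2 withdraws 150 from y; its snapshot also says 200 - 150 >= 0 holds.
t2_writes = {}
if snap2["x"] + snap2["y"] - 150 >= 0:
    t2_writes["y"] = snap2["y"] - 150

# SI validates only the rows each transaction wrote. The write sets
# ({"x"} and {"y"}) are disjoint, so both commits pass validation.
db.update(t1_writes)
db.update(t2_writes)

print(db, "sum =", db["x"] + db["y"])  # {'x': -50, 'y': -50} sum = -100
```

Locking the rows you read (e.g. SELECT FOR UPDATE) forces the two transactions to conflict on the same rows, which is why it works as a remedy.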

~~~
inversionOf
_it doesn't guarantee that the data in the DB itself has stayed constant_

By definition, snapshot isolation is supposed to guarantee that all reads see
consistent, committed data as of the start of the transaction, and that the
commit will fail and roll back if any data altered within a snapshot
isolation transaction has already been changed. The response by Percona,
unless I am reading it wrong, actually agrees that they are not truly
implementing snapshot isolation. They then argue that the tester should have
tested something totally different, but that doesn't address the snapshot
isolation not actually being snapshot isolation.

As for locks, snapshot isolation and MVCC exist precisely to avoid a sea of
locks. The solution to a broken snapshot isolation level isn't simply to
demand locks manually and programmatically. While that may work, it
completely undermines the whole reason for SI.
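
The first-committer-wins rule described above can be sketched in a few lines. This is a toy model (the class and its fields are invented for illustration, not any real engine's implementation): every key carries a version, and a commit aborts if any key the transaction writes has changed since its snapshot was taken:

```python
class ToySIStore:
    """Toy snapshot-isolation store with first-committer-wins validation."""

    def __init__(self, data):
        self.data = dict(data)
        self.versions = {k: 0 for k in data}  # bumped on every commit

    def begin(self):
        # A transaction is a frozen snapshot plus the versions it saw.
        return {"snapshot": dict(self.data), "seen": dict(self.versions)}

    def commit(self, txn, writes):
        # Abort if any *written* key was committed under us since begin().
        for key in writes:
            if self.versions[key] != txn["seen"][key]:
                return False  # rollback
        for key, value in writes.items():
            self.data[key] = value
            self.versions[key] += 1
        return True

store = ToySIStore({"x": 100})
t1, t2 = store.begin(), store.begin()
assert store.commit(t1, {"x": t1["snapshot"]["x"] - 30})      # x -> 70
assert not store.commit(t2, {"x": t2["snapshot"]["x"] - 30})  # aborted
print(store.data)  # {'x': 70}
```

Note that validation covers only the write set; a transaction that merely read x would never be aborted, which is exactly the gap write skew slips through.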

~~~
AlisdairO
EDIT: per grandparent comment edit, this comment is based on a misreading of
the article and should be ignored. My apologies!

-------

> By definition, snapshot isolation is supposed to guarantee that all reads
> are the consistent, committed data as of the begin transaction, and the
> commit will fail and rollback if any data altered within a snapshot
> isolation transaction was already changed

Absolutely. What it doesn't guarantee is that data that you read and then use
to update a different location has remained constant - which is what write
skew is in the first place. What InnoDB and PostgreSQL call 'REPEATABLE READ'
and Oracle calls SERIALIZABLE are in fact snapshot isolation.

edit: to clarify, Aphyr's original article describes RR as preventing basic
write skew, which mainstream MVCC-based RR implementations simply don't do.
Postgres does prevent it when using SSI (the SERIALIZABLE level), and lock-
based implementations (RR on DB2 and SQL Server) do also.

You can argue this both ways: the MVCC-based RR implementations do conform to
the (fuzzy) letter of the ANSI SQL standard. They don't conform to Adya's
formal definitions, but in fairness many of them existed before those
formalisations did :-).

------
eis
A lot of defensive talking around technical terms, mixed with a bunch of typos
and topped off with unfair attacks.

Not very classy. And Aphyr's point still stands, I think: in default mode it
is easy to get corrupt data with Galera Cluster. That InnoDB on a single
instance can have the same problem makes it all the more troubling, and I'm
glad I moved away from MySQL a long time ago.

------
sagichmal
> Following that conclusion is using Galera cluster may result in “corrupted”
> data. I do not quite like the usage of the word “corrupted” here. For me,
> the more correct word be to use is “inconsistent”.

But Aphyr never once uses the term "corrupted data", or the word "corrupted".
If you're going to quote an article, it's important to be precise.

This response feels panicked, or at least rushed. And it really misses the
point. You can't, or shouldn't, try to explain away these types of findings as
irrelevant, or just an issue of semantics. Instead, I'd hope by now that
technical folks on the receiving end of a Call Me Maybe analysis would have
learned that there's precisely one correct way to respond: acknowledge the
faults, clarify relevant documentation, and file (and link to) issues in
public issue-trackers that will address the problems.

HashiCorp, CoreOS, and arguably Elastic played it correctly. Aerospike,
Mesos, and now Percona didn't. Shame.

~~~
mbrutsch
> But Aphyr never once uses the term "corrupted data", or the word
> "corrupted". If you're going to quote an article, it's important to be
> precise.

As opposed to "pedantic"? Here's a quote from the article:

> The probability of data _corruption_ scales with client concurrency, with
> the duration of transactions, and with the increased probability of
> intersecting working sets.

------
wglb
How is _inconsistent_ different than _just plain wrong_?

This may be another example of reactions to Aphyr's reports telling about the
mental model of the developers.

------
bakhy
doesn't this "solution" here actually still leave wide open the possibility of
_reading_ inconsistent data?

