
Mnesia and CAP - mzehrer
https://medium.com/@jlouis666/mnesia-and-cap-d2673a92850
======
rdtsc
Unrelated, but maybe interesting: WhatsApp was running Mnesia and made it work
fantastically. Especially with such a small team of engineers.

Here is a video of Rick Reed talking about how they did it (hint: it is not
one monolithic Mnesia cluster).

[http://www.youtube.com/watch?v=c12cYAUTXXs](http://www.youtube.com/watch?v=c12cYAUTXXs)

~~~
lostcolony
Yeah, you can build scalability into Mnesia, but it's not there by default.
It's also dependent on what kind of data and persistence guarantees you want;
if it's mostly transient data, or it's acceptable to lose some data in the
one-in-a-million case, Mnesia out of the box is probably fine. If you need stronger
guarantees, you either have to do a lot of work, or you should investigate an
alternative solution.

------
antirez
If Mnesia claims to be AP or CP, and it's not, there is a problem, but the
following sentence is wrong IMHO: "In short, it is broken with respect to the
CAP theorem."

Basically, if it is _by design_, it is perfectly fine to build a database
system that tries to be neither Available nor Consistent (capitalized to mean
A and C of CAP, not other kinds of availability or consistency).

C of CAP is very strict, and requires consensus (because we assume we are
partition tolerant). A of CAP is very strict, and requires even a single
isolated node to be able to reply to requests.

As long as the behavior of the system is well specified in the documentation,
a database system can make the conscious tradeoff of being neither AP nor CP
without being "broken". Whether that is a good idea is a matter of design and
use cases, but the point is that CAP does not capture the whole set of
tradeoffs of a database system, so in certain systems there are other
real-world reasons to give up the maximum theoretical availability or
consistency.

So certain real-world systems may have a specific set of liveness and safety
properties, well documented, and designed for a given set of use cases and
implementation goals.

~~~
rubiquity
Mnesia predates the CAP theorem, so perhaps that is why the author feels it
doesn't specifically fit into AP or CP. You can't design your system around
something that doesn't exist yet! Though the qualities that CAP tries to
represent were probably still on the designers' minds, albeit in a less
formalized way.

------
michalzee
A lot of good points; the most important one to remember is that nowadays
Mnesia is good only for storing configuration.

~~~
lostcolony
I've used it even for fairly small amounts of user data (a few gigs max). What
we had, though, was a load balancer out front that would ensure only one node
was ever receiving writes (lazily switching on failure), and were relying on
it solely for fault tolerance (which, despite the article, Mnesia can do just
fine at, with certain caveats; what it can't do is scale).

In the event of a network partition, the load balancer keeps channeling all
writes to a single node, and the system rather loudly lets us know that a node
disappeared (every node loudly reports that a/some node(s) disappeared),
ensuring quick redress. It's possible that during such an event the node being
written to will go down, causing the load balancer to switch to the
partitioned node(s) and thereby creating an inconsistency, but this can only
happen over a -very- small time window, as the facility this was deployed into
is manned 24 hours a day. Any inconsistency is thus confined to roughly half
of the time between the initial split and its being addressed: if the failure
that makes the load balancer switch happens before the midpoint, we could take
the switched-to node as most canonical, and if it happens after, we could take
the switched-from node as most canonical. So even without an express mechanism
to resolve inconsistencies, we're losing very little data, and for such a rare
occurrence (multiple years running in multiple locations and we've yet to see
it happen), that's an acceptable risk.
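
To make the "only one node ever receives writes, lazily switching on failure"
idea concrete, here is a rough Python sketch of the routing rule. It is not
the actual load balancer config; `SingleWriterRouter`, `is_up` and the node
names are hypothetical, and the health check is assumed to be supplied by the
caller.

```python
class SingleWriterRouter:
    """Send every write to one active node; switch only when it fails."""

    def __init__(self, nodes, is_up):
        if not nodes:
            raise ValueError("need at least one node")
        self.nodes = list(nodes)
        self.is_up = is_up            # hypothetical health check: node -> bool
        self.active = self.nodes[0]   # the single node receiving writes
        self.switches = []            # (old, new) pairs, in switch order

    def route_write(self):
        """Return the node that should receive the next write."""
        if self.is_up(self.active):
            return self.active
        # Lazy failover: look for a replacement only once the active node
        # is seen to be down, and never fail back on our own afterwards.
        for candidate in self.nodes:
            if candidate != self.active and self.is_up(candidate):
                self.switches.append((self.active, candidate))
                self.active = candidate
                return candidate
        raise RuntimeError("no healthy node available for writes")
```

With two nodes "a" and "b", writes go to "a" until its health check fails,
then to "b", and they stay on "b" even after "a" comes back. The recorded
switch history is the kind of information you would look at when deciding by
hand which node to treat as canonical after a split.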

It just depends on what your requirements are. The author is bang on that
Mnesia should not be used for scaling (and in general probably shouldn't be
used unless you want full replication on every node, or are relying on some
other mechanism to reconstruct/deconflict data). But in an environment where
split brains are infrequent, where you've got Nagios or similar running to
alert you to a split brain, where you've got people on call, where scalability
is not a concern, and where in the (extremely) unlikely event you actually
lose some data it's okay or easily addressed, Mnesia can be a reasonable
solution, since it's so easy to get running and quite performant.

