
MariaDB acquires Clustrix - sahin-boydas
https://techcrunch.com/2018/09/20/mariadb-acquires-clusterix/
======
mr_pickles
Alright, startup employees, for your education, let's run through some numbers
I have as a stockholder in Clustrix.

In 2010, they raised $12M in their Series B at ~$100M post-money valuation.
Things were looking alright.

In 2013, they raised $16.5M in their Series C, and then shortly thereafter
$10M in an unusual Series D. That funding round reverse-split the outstanding
stock 26-to-1 and converted all existing shares, preferred or otherwise, to
common stock. What was left was $10M in new preferred stock, and $20M in
existing common stock! New post-money valuation: $30M. This down round ended
up being a 30x dilution for existing shareholders. If you had a tenth of a
percent of $100M before, now you had a hundredth of a percent of $30M. Yowza!
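
To make the arithmetic concrete, here is a minimal sketch using only the figures quoted above (a 0.1% stake at a $100M valuation versus a 0.01% stake at $30M). The helper name `stake_value` is made up for illustration, and the ownership percentages are the rough ones from this comment, not exact cap-table numbers.

```python
from fractions import Fraction

def stake_value(ownership, valuation):
    """Paper value of a stake: ownership fraction times company valuation."""
    return ownership * valuation

# Figures taken from the comment; Fraction avoids float rounding noise.
before = stake_value(Fraction(1, 1000), 100_000_000)   # 0.1% of $100M
after = stake_value(Fraction(1, 10_000), 30_000_000)   # 0.01% of $30M
shrink = before / after

print(f"before: ${int(before):,}")
print(f"after:  ${int(after):,}")
print(f"paper value shrank roughly {float(shrink):.0f}x")
```

The ownership percentage falls 10x and the valuation falls a bit more than 3x, which together give the roughly 30x figure.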

After that bath, the board amended the charter so they stopped mailing out
these notices. I don't know what's happened since, but I'll find out soon
enough.

I feel bad that the company wasn't successful. It really was a great team and
an impressive technical feat.

~~~
greglindahl
VCs do deals all of the time, and us startup people do only a few, so I'm
hardly an expert... but this down round sounds unusually friendly to common
stockholders.

------
Shelnutt2
First they picked up InfiniDB a while back and have been working to mainline
"MariaDB ColumnStore". Now comes the acquisition of Clustrix and discussion of
mainlining that too. It looks like MySQL's separation of the storage engines
is paying off in their ability to keep one interface while allowing
significantly different backends to meet different workload requirements.

There has been a lot of work expanding the storage engine API so ColumnStore
can be mainlined, and work so Spider can progress. I imagine that progress has
shown them that they are likely to be able to integrate a much larger and
less connected (to existing MariaDB) code base like Clustrix. I can only hope
they decide to open source (BSL license?) the Clustrix solution eventually.

Between MyRocks (replacing TokuDB), ColumnStore, Spider and eventually
Clustrix, it seems that MariaDB is trying to make the case that they can handle
any size workload being thrown at it.

~~~
karmakaze
I had high hopes for MyRocks, then I got a chance to use it. The limitations,
mainly being 5.6 and no coexistence with InnoDB, made me reevaluate TokuDB, and
it was a better choice for a write heavy, low update, workload especially with
interval flushing (non-fsync durable) commits.

~~~
zepearl
I am also using MariaDB + TokuDB with " _a write heavy, low update,
workload especially with interval flushing (non-fsync durable) commits_ " =>
do you maybe have a short list of the limitations of MyRocks in MariaDB in
this area?

I tried to use MyRocks (never used it before) in MariaDB some months ago but
could find almost no docs and ultimately didn't understand which
parameters were supposed to be set, how, and under which conditions...

~~~
sethhochberg
IMO this is one of the biggest issues with the alternative storage engines for
MySQL-family databases... we've also experimented with TokuDB for log-like
data but found that, ultimately, the shortage of detailed documentation and
operational issues like needing to develop homegrown tooling for things like
backups overpowered the performance benefits.

InnoDB isn't perfect, but it _is_ exhaustively documented and pretty
well-understood, with a great set of related tools from Percona, etc., for
simplifying operations. That goes a long way.

Recently we've switched back to using InnoDB for ingestion on one of our
write-heavy tables and aggressively archiving the data out of it and into
ClickHouse (InnoDB deals with the high volume of concurrent inserts, data is
loaded into ClickHouse in large batches for querying). Compared to Toku
or RocksDB, ClickHouse is refreshingly well-documented and it's easy for us to
make consistent backups with ZFS snapshots.

------
newnewpdro
Clustrix had a great team and technology but nearly everyone responsible for
building the product left the company long ago. Even Sergei, the founding CTO,
eventually left for Dropbox a few years ago.

It'll be interesting to see if this acquisition results in the more valuable
Clustrix bits becoming libre software.

------
sciurus
I was unfamiliar with Clustrix. It looks like they've been around a while
(YC06) and have some sophisticated technology
([http://docs.clustrix.com/display/CLXDOC/Distributed+Database...](http://docs.clustrix.com/display/CLXDOC/Distributed+Database+Architecture)).
A horizontally scalable drop-in replacement for MySQL is nothing to sneeze at.

[https://www.clustrix.com/](https://www.clustrix.com/)

~~~
ddorian43
I think it's in-memory. There's also mysql-ndb-cluster, which is free.

~~~
meguest
I spent a lot of time at a previous job supporting an NDB cluster and I can
attest first hand that it is awful.

Single node failures cause entire cluster shutdowns, the cluster then takes
forever to recover, and recovery must be done in a specific order. In fact,
just thinking about it makes me anxious.

~~~
ddorian43
What version was this?

------
mr_pickles
Early Clustrix employee here. Holy cow that took a long time! 12 years since
the company was founded and finally they exit. I can't wait to exchange my
illiquid startup stock for... stock in a _different_ private company.

~~~
newnewpdro
So it was a 100% stock purchase? Bummer man.

~~~
TylerE
Beats the stock turning into toilet paper.

~~~
pinewurst
But it turned from one brand of toilet paper into another, with indeterminate
scratchiness.

------
misterbowfinger
Couldn't really parse out the website for what Clustrix actually is. Is it
basically a leader node that distributes writes to key-value stores, and then
the reads figure out where the data is by a partitioning scheme, with the
benefit of MySQL protocol? Similar to CockroachDB?

~~~
mr_pickles
It was a fault-tolerant, fully distributed relational database which was
compatible with MySQL's variant of SQL. There were no key-value stores
involved.

Tables (and indexes) were automatically partitioned and replicated as needed,
completely under the covers.

Queries (reads and writes) were distributed to the nodes where the data
resided, in parallel.

Scaling the system was as simple as adding new nodes. Data was automatically
rebalanced to take advantage of the new capacity.

Failure recovery was automatic too. If a disk or node failed, the data
involved would be reconstructed from replicas and moved elsewhere with no
interruption in service and no failed transactions.

It was a pretty impressive system, which predated Google Spanner. But, in the
early days, you had to run their custom hardware to get it. There was no cloud
version.
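
For anyone trying to picture the partition/replicate/rebalance behavior described above, here is a toy sketch. It is not Clustrix's actual algorithm (that isn't described in this thread); it swaps in rendezvous hashing, a generic placement technique, and the node names, key names, and `replicas_for` helper are all hypothetical.

```python
import hashlib

def score(key: str, node: str) -> int:
    """Deterministic pseudo-random weight for a (key, node) pair."""
    digest = hashlib.sha256(f"{key}|{node}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def replicas_for(key: str, nodes: list[str], copies: int = 2) -> list[str]:
    """Rendezvous hashing: the `copies` highest-scoring nodes hold the row."""
    ranked = sorted(nodes, key=lambda n: score(key, n), reverse=True)
    return ranked[:copies]

keys = [f"row-{i}" for i in range(1000)]
three = ["node-a", "node-b", "node-c"]
four = three + ["node-d"]

# Scaling out: only rows that land on the new node have to move.
before = {k: replicas_for(k, three) for k in keys}
after = {k: replicas_for(k, four) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} rows changed placement after adding node-d")

# Node failure: rows that were not on node-b keep their placement, and
# rows that were still have one surviving copy to rebuild from.
survivors = ["node-a", "node-c"]
recovered = {k: replicas_for(k, survivors) for k in keys}
```

The point of the sketch is only that placement can be computed rather than configured, so adding or losing a node changes where a subset of rows live without the application being involved.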

~~~
misterbowfinger
Thanks! Is there a diagram that shows how it works? I'm still having trouble
visualizing it.

~~~
nehcsivart
Not a diagram, but here is an informational video by Clustrix:
[https://www.youtube.com/watch?v=PUq1fYZlNPs](https://www.youtube.com/watch?v=PUq1fYZlNPs)

The video is from almost 5 years ago, but the high-level idea discussed is
still true today.

------
Rafuino
What does Clustrix offer that MariaDB Cluster within its MariaDB TX offering
does not?

~~~
erulabs
I believe the main (and only?) purpose of Clustrix is sharding, which MariaDB
Cluster doesn't provide, with the proviso, I suppose, that multi-master is
strictly -not- the same as sharding.

~~~
mjw00
Not exactly. With Clustrix, your application isn't aware of the data
partitioning. Instead, the query compiler creates a multi-stage program that
forwards tuples from one partition directly to the next, with only the result
set coming back to the node the application is connected to. Applications can
connect to any node within the cluster and issue queries (usually this is done
through a load balancer).
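
A rough way to picture the multi-stage idea, as a toy sketch only: the modulo partitioning rule, the two tables, and every function name below are invented for illustration and say nothing about how the real query compiler works. Stage one scans each node's local orders and forwards matching rows straight to the node that owns the related user; stage two does the lookup there; only finished result rows go back to whichever node the application connected to.

```python
NODES = 3  # hypothetical cluster size

def owner(key: int) -> int:
    """Toy partitioning rule: a row lives on node (key mod NODES)."""
    return key % NODES

users = {n: {} for n in range(NODES)}    # per node: user_id -> name
orders = {n: [] for n in range(NODES)}   # per node: (order_id, user_id, amount)

def insert_user(user_id, name):
    users[owner(user_id)][user_id] = name

def insert_order(order_id, user_id, amount):
    orders[owner(order_id)].append((order_id, user_id, amount))

def scan_orders(node, min_amount):
    """Stage 1: filter local orders, forward each hit to the user's node."""
    for order_id, user_id, amount in orders[node]:
        if amount >= min_amount:
            yield owner(user_id), (order_id, user_id, amount)

def lookup_user(node, row):
    """Stage 2: runs on the node that owns the user row."""
    order_id, user_id, amount = row
    return (order_id, users[node][user_id], amount)

def run_query(connected_node, min_amount):
    """Only finished result rows arrive at the connected node."""
    inbox = {n: [] for n in range(NODES)}
    for node in range(NODES):
        for target, row in scan_orders(node, min_amount):
            inbox[connected_node].append(lookup_user(target, row))
    return sorted(inbox[connected_node])

insert_user(1, "alice")
insert_user(2, "bob")
insert_user(4, "carol")
insert_order(10, 1, 50)
insert_order(11, 2, 5)
insert_order(12, 4, 70)

print(run_query(connected_node=0, min_amount=20))
```

The application never says where a row lives, and connecting to a different node returns the same rows.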

