
Progress in performance and scalability with CockroachDB - awoods187
https://www.cockroachlabs.com/blog/cockroachdb-2dot1-performance/
======
brod
I've just released a small product using CockroachDB; in retrospect it was
_probably_ my favourite technical decision. Previously I'd used it as a toy
and tested deployment strats but was skeptical (new tech and all that), but
now that it's ticking along in the wild I'm very impressed across the board.

~~~
zzzcpan
What does your infrastructure look like? Do you deploy it in a single
datacenter or even in the same rack on a couple of servers?

------
continuations
A few questions:

1) >631851 tpmC

How many servers are needed to achieve this throughput?

2) >4 terabytes of unreplicated, frequently accessed data

4TB unreplicated data? Does that mean if a single node goes down you'll lose
data (EDIT: I meant losing availability, not data)? That kinda ruins the whole
point of having a distributed database.

3) If I'm reading the KV benchmarks correctly, it takes 5 nodes to achieve 100k
tpm. That's 20k tpm per node. That's 333 tps per node. This is a 95% point
read benchmark. Why is the tps (333 tps) so low? Is that normal?

4) How does CockroachDB compare to other distributed databases such as TiDB,
FoundationDB, ScyllaDB?
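
For reference, the per-node arithmetic in question 3 can be checked with a short sketch. It uses only the figures quoted above (100k tpm across 5 nodes); the variable and function names are made up for illustration and nothing here comes from the benchmark itself.

```python
# Per-node throughput implied by the figures quoted in question 3.
TOTAL_TPM = 100_000        # cluster-wide transactions per minute, as read off the KV benchmark
NODES = 5                  # node count quoted in the question
SECONDS_PER_MINUTE = 60


def per_node_tps(total_tpm: float, nodes: int) -> float:
    """Convert cluster-wide transactions/minute into transactions/second per node."""
    return total_tpm / nodes / SECONDS_PER_MINUTE


tpm_per_node = TOTAL_TPM / NODES
tps_per_node = per_node_tps(TOTAL_TPM, NODES)

print(f"{tpm_per_node:.0f} tpm per node")   # 20000 tpm per node
print(f"{tps_per_node:.0f} tps per node")   # 333 tps per node
```

This only confirms the division; whether the 100k figure is really per minute rather than per second is exactly what the question is asking.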

~~~
manigandham
ScyllaDB is still the fastest at a key/value workload with per-query
consistency settings and quorum reads/writes across multiple regions. If you
need high-performance and low-latency, ScyllaDB wins. They are close to v3.0
which will have global secondary indexes and materialized views to improve
data model flexibility. FoundationDB is also key/value but much lower-level
and well proven for reliability. I don't have much experience with it, and the
latest release just introduced multi-regional capabilities, but the general
tooling and documentation are still rough and it would take more effort to
build a higher-level querying layer or client library.

TiDB is interesting, but missing more features from MySQL than CRDB is missing
from PostgreSQL, so it's effective if you want sharding on mysql but will need
a few more releases before it gets polished. Vitess and Citus are good options
if you just want sharding on top of existing mysql or postgres with full query
support within a shard. There's also Yugabyte, which is a multi-model
Redis/Cassandra/SQL offering with multi-regional capabilities.

CRDB is a great product with some of the easiest operations (although key
management is a nightmare that they do not have a good plan for). It's fast
enough for point-lookups and makes it easy to distribute and replicate your
data across zones and regions. All nodes are part of a single cluster so read
and write latencies will be high for global deployment, with the enterprise
version having a workaround for local regional reads using pinned covering
indexes. That works, but further lowers write performance.

It also has trouble with large transactions and the middle ground between OLTP
and OLAP with heavy joins. Good choice if you need easy scalability and SQL
interface over performance and complex queries.

~~~
morgo
Hi! I work for PingCAP, the company behind TiDB and come from previously
working on MySQL.

The gap of features missing is documented here:
[https://www.pingcap.com/docs/sql/mysql-compatibility/](https://www.pingcap.com/docs/sql/mysql-compatibility/)

I would rate compatibility as actually pretty good: all but one SQL mode is
supported (which is a feat in itself), and most of the SQL functions are
supported.

There are some exceptions though, some of which are addressable (missing
functions) and some that are not (often a property of being an optimistic
system).

We try to be as transparent as possible on this, which might be part of the
reason why you feel there is a lot missing?

If you have specific examples, I would be happy to clarify. We also have a
course designed for MySQL DBAs, which is designed to make the adoption easier:
[https://www.pingcap.com/tidb-academy](https://www.pingcap.com/tidb-academy)

~~~
manigandham
Good to see the progress, I was looking at the roadmap page:
[https://github.com/pingcap/docs/blob/master/ROADMAP.md](https://github.com/pingcap/docs/blob/master/ROADMAP.md)

Views and CTEs are probably the biggest missing pieces now.

~~~
morgo
The technical design for views was recently completed, and I expect to see
them added soon :-)

Window functions & CTEs are only very recent features in MySQL 8.0 (TiDB is
5.7 compatible). Nonetheless, they are important for HTAP workloads, and I'm
looking forward to seeing them too.

------
evrydayhustling
This is a crazy multiple! Anyone from the Cockroach team up for sharing what
the key innovations were that are driving the improved performance?

~~~
awoods187
I'm the author. We've introduced transactional write pipelining (covered in a
forthcoming blog post), load-aware rebalancing, and completed general
performance tuning which all contribute to our improved performance numbers.

------
Confiks
I was wondering, quite unrelated to the article, if anyone knows if
CockroachDB would be suited for small databases (and comparably modest
computing/memory resources). I very much like its distributed properties, but
only have a simple table of usernames and corresponding cryptographic
material. Is CRDB easy to run and manage?

~~~
manigandham
Sqlite would be my first recommendation, unless you need client/server access.

~~~
yellowapple
I think the GP's stated need for replication would preclude SQLite unless
one's willing to write one's own replication system.

~~~
manigandham
Where's the stated need for replication?

~~~
yellowapple
"I very much like its distributed properties"

------
gigatexal
I’m currently evaluating this as an alternative to vitess + percona mysql. But
strict serializability has its limitations.

~~~
eloff
What do you mean by "strict serializability has its limitations"? Do you need
something stricter? Or do you have a need for something weaker for some reason?

~~~
gigatexal
In the same way single threaded has limitations.

------
qaq
Now we just need some benchmarks on a reasonable size dataset like 100TB and
up

~~~
nawfalhasan
Not an easy benchmark..

~~~
qaq
One would imagine they are testing with much larger datasets internally.

------
nawfalhasan
I'm totally new to cockroach so I have 2 questions..

1. Is there a managed service of this db where it auto scales, does geo
replication etc all by itself?

2. Is there any really good book on cockroachdb?

~~~
orangechairs
We released a managed version at the end of October, with auto-scaling,
geo-replication, etc. -->
[https://www.cockroachlabs.com/product/managed/](https://www.cockroachlabs.com/product/managed/)

Not sure that there are any books on it yet.

~~~
manigandham
The managed service doesn't autoscale, it's provisioned capacity by cores. We
just did a call about it.

~~~
orangechairs
Our managed service is currently provisioned by cores. We automatically add
nodes to your cluster based on your usage. You can also request to add more
nodes if you anticipate spikes.

------
qaq
At what point is it cost effective to run CRDB vs PostgreSQL?

------
anticensor
Someone should create a fork with a work-safe name. CockroachDB brings
connotations of cockroaches, which are known for eating almost everything and
living almost everywhere.

~~~
mlevental
you know what's ironic? that in every thread there's one of you people
complaining about the name - you're all just like cockroaches! no matter how
successful cockroachdb becomes, no matter how technically impressive the
product becomes, the naysayers never die.

can you imagine 20 years ago someone complaining that google wasn't safe for
work because it had a silly name?

newsflash dummy: stop saying/thinking/repeating stupid things like this and
it'll stop being the case that everyone is so conservative that silly names
are inadmissible.

~~~
h1d
That attitude does not fly with your boss.

Why limit the adoption for no good reason by choosing a weird name
intentionally?

~~~
mlevental
do you not understand perpetuation? "doesn't fly with my boss" ---> "won't
fly when I'm boss". also how about having a conversation on the merits? does
that fly with your boss? does with mine.

