
CockroachDB 19.2 - irfansharif
https://www.cockroachlabs.com/blog/cockroachdb-19dot2-release/#
======
ralusek
Some comments I have regarding the documentation. It seems like there is a lot
describing how to set up a cluster and how to do various SQL operations that
most people probably know how to do.

I think the information that should be presented much more clearly is:

1.) How do we have to partition our data/what restrictions are in place?
Basically, what considerations are necessary when designing the data model due
to the constraints of the technology?

2.) What functionality that we expect from RDBMS do we give up when working
across partitions? Can foreign keys exist across partitions? Can joins work?
Inner/Right joins?

~~~
jseldess
Head of Docs and Training at Cockroach Labs here. Thanks for this feedback,
ralusek. It's spot on, and we're planning to create much more direct and
prescriptive guidance and best practices for working with CockroachDB across
multiple regions. That's when placement of data (via geo-partitioning or other
approaches) is crucial for reducing network latency.

In case you haven't seen them, for now, we have docs on some data placement
patterns for multi-region deployments:
[https://www.cockroachlabs.com/docs/stable/topology-
patterns....](https://www.cockroachlabs.com/docs/stable/topology-
patterns.html#multi-region-patterns). We believe the first and third are best
for most cases. This tutorial also walks through the impact of using those
patterns in a cluster spread across 3 regions of the US:
[https://www.cockroachlabs.com/docs/stable/demo-low-
latency-m...](https://www.cockroachlabs.com/docs/stable/demo-low-latency-
multi-region-deployment.html)

But we'll definitely continue working on better guidance here.

~~~
zzyzxd
Last month I tried to deploy cockroachdb in Kubernetes and I felt the
documentation wanted to treat me like a 5-year-old.

I don't think your product documentation has to explain what Kubernetes is
or its terminology[1].

The worst part is that it does not tell cluster admins what needs to be
set up in the Kubernetes cluster at all. Instead, it wraps a bunch of `kubectl`
commands in a Python script and says "just download the script and run it"[2].
I know a simple `python setup.py` command is easier for a novice to just try
out. But it can really give seasoned Kubernetes admins a headache... Our
clusters enforce GitOps and nothing can bypass pull requests. Running some
random `kubectl` commands is simply impossible. I ended up spending my day
reading and translating the script into Kubernetes object definitions by hand
and I didn't enjoy it...

1\. [https://www.cockroachlabs.com/docs/stable/orchestrate-
cockro...](https://www.cockroachlabs.com/docs/stable/orchestrate-cockroachdb-
with-kubernetes-multi-cluster.html#kubernetes-terminology)

2\. [https://github.com/cockroachdb/cockroach/blob/master/cloud/k...](https://github.com/cockroachdb/cockroach/blob/master/cloud/kubernetes/multiregion/setup.py)

~~~
tyingq
I'm curious why you would want to deploy CockroachDB in a k8s cluster. Not
being critical... genuinely curious. Since it has its own idea of a cluster, it
sounds complex to me. Especially since the typical geographically wide
CockroachDB cluster would likely span outside a region-centric k8s cluster.

~~~
BobVawter
(Cockroach Labs developer here) In CockroachDB parlance, a "cluster" is just
some number of cockroach binaries that have local storage and can connect to
one another via TCP/IP. It's entirely feasible to run a multi-node cluster on
a single laptop by just starting several instances of cockroach bound to
different port numbers.
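
As a rough illustration of the single-laptop setup described above, here is a minimal Python sketch that only builds the command lines for several local nodes on different ports. The helper name `local_cluster_commands`, the port numbers, and the store directory names are hypothetical; the flags (`--insecure`, `--store`, `--listen-addr`, `--http-addr`, `--join`) are assumed from the `cockroach start` CLI of this era and may differ in other versions.

```python
def local_cluster_commands(num_nodes=3, base_sql_port=26257, base_http_port=8080):
    """Build one `cockroach start` argv per node, all bound to localhost.

    Assumption: each node needs its own store directory, SQL/RPC port,
    and HTTP port, and every node is given the same --join list.
    """
    join = ",".join(f"localhost:{base_sql_port + i}" for i in range(num_nodes))
    commands = []
    for i in range(num_nodes):
        commands.append([
            "cockroach", "start", "--insecure",
            f"--store=node{i + 1}",                          # hypothetical directory name
            f"--listen-addr=localhost:{base_sql_port + i}",  # distinct port per node
            f"--http-addr=localhost:{base_http_port + i}",
            f"--join={join}",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is started by accident.
    for argv in local_cluster_commands():
        print(" ".join(argv))
```

Each printed command would be run in its own terminal. Depending on the version, a one-time `cockroach init` against one of the nodes may also be needed before the cluster accepts queries.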

The Kubernetes concept of a "cluster" is a much broader term, encompassing the
compute nodes and a lot of control-plane software to make all of the magic
happen.

Fundamentally, running CockroachDB on a Kubernetes cluster abstracts away the
process of getting the cockroach binary running and offers a lot of
convenience to the human operator vis-a-vis reliability and service discovery.

We like to say that CockroachDB is "Kubernetes native" in that you can easily
build a CRDB cluster using only the basic k8s building blocks, without
requiring a separate operator program to manage the deployment.

You can `kubectl apply` this config and get a fully-functioning cluster.
[https://github.com/cockroachdb/cockroach/blob/master/cloud/k...](https://github.com/cockroachdb/cockroach/blob/master/cloud/kubernetes/cockroachdb-
statefulset-secure.yaml)

Utilities like Helm et al. are certainly easier than managing a bunch of YAML
configs, but they are entirely optional.

Some other CockroachDB+Kubernetes synergies to consider:

1) When using a StatefulSet and PersistentVolumes, a CockroachDB node will
easily survive being rescheduled off of its underlying host (e.g. due to
maintenance or hardware failure) with no human effort needed.

2) All cockroach instances are, from the perspective of a client, homogeneous.
That is, a client can send a SQL query to any member of a CockroachDB cluster
and get a meaningful response. This maps exactly onto the k8s Service
abstraction.

3) Federated k8s clusters and multi-region network fabrics do exist, although
they're not exactly common yet. CockroachDB can maintain its clustering across
"non-uniform network architectures" that exist within- and cross-region.

------
rubyn00bie
Is the primary advantage of CockroachDB over FoundationDB primarily in
supporting SQL out of the box? The multi-region support seems pretty banger at
first glance.

P.S.

... and I had a tirade here about the licensing information being largely
absent on their FAQ or product pages, but I did find this after a google
before posting:

[https://www.cockroachlabs.com/blog/oss-relicensing-
cockroach...](https://www.cockroachlabs.com/blog/oss-relicensing-cockroachdb/)

which... from what I can tell at my cursory inspection, seems okay to me. I'd
be pretty pissed too assuming a giant SaaS company came by, forked their
product, and then resold it with improvements they didn't upstream. Mostly,
because it's just fucking rude. One could have a good business relationship
with another company with both prospering, but instead you take the short-n-
shitty road for no reason...

With that said, it'd be nice if you put that in yer FAQ.

~~~
jaytaylor
> Is the primary advantage of CockroachDB over FoundationDB primarily in
> supporting SQL out of the box.

That, plus FoundationDB pulled the rug out from under all their users at the
time Apple purchased them. I cannot trust that team/product again; it would be
incredibly and blasphemously foolish. At least CDB likely won't pull any such
stunt.

Disclaimer: I was personally affected by the FDB team making this choice back
in 2015 when events unfolded. Still leaves a bitter taste.

~~~
sporkland
The situation is completely different these days. Back then they were a small
startup. The DB space is incredibly challenging especially because engineers
tend to want things for free and are pretty demanding. An acquisition seemed
like a likely outcome. To partially offset that risk they offered source code
escrow to prospective customers.

Now it's an opensource project backed by Apple. And from what I can tell the
team is massively different these days. I'm not sure why the obvious
conclusion is "we can't trust them".

I've heard the "I got burned by FDB's purchase, so I'll never use it again"
narrative multiple times. It's largely confusing to me, as it doesn't feel
connected to the current realities.

------
outside1234
Can we actually back up the database with the community edition yet?

Last I checked this was impossible.

~~~
gigatexal
Backups are still an enterprise feature, but doing a dump is still supported in
the non-paid version.

~~~
outside1234
Yeah, that is not a real solution for a database of any real size, which is
the only reason to use CockroachDB in the first place.

~~~
gigatexal
I know. It’s a bummer.

I’ve not tried to do block level backups for our setup in production.

~~~
outside1234
I have - it doesn't work sadly. :(

------
ainar-g
Excuse me if that has been discussed numerous times, but I distinctly remember
CockroachDB 2.0 coming out, and it being a big deal. Where are releases 3.0 to
19.0? Was the versioning scheme changed? How? Why?

Other than that, I _really_ wish I could use CockroachDB at "$COMPANY". Our
architects, unfortunately, deem it “too new” and “unproven”. Bah.

~~~
a-robinson
They switched to calendar-based versioning earlier this year with the 19.1
release, for reasons explained in this blog post:
[https://www.cockroachlabs.com/blog/calendar-
versioning/](https://www.cockroachlabs.com/blog/calendar-versioning/)

~~~
ainar-g
Thanks for the link!

“““ We wanted to find a solution that would minimize frustration (and time-
consuming meetings) internally, while setting the correct expectations with
users around quality and stability. ”””

Honestly, this doesn't explain much. If anything, hearing “FooDB 19.1” makes
me think of stuff like React 16 or Chrome 69. That is, of “hip dudes” who
“live fast and bump major versions”. On the other hand, hearing “FooDB 2.16”
would make me think “Yep, this thing seems stable as Perl or Linux”. The
meetings point also doesn't explain anything. Go simply does a minor version
bump every six months. Why couldn't Cockroach Labs just do that?

Oh well, who cares, really, as long as the product is great.

------
gigatexal
Pro-tip: if at all possible do not run this on top of Ceph. Bare metal SSDs or
VMs with SSDs.

~~~
SEJeff
From experience I take it? Was Ceph running on any of those?

~~~
gigatexal
Yes it is how we run it in production and sadly it’s not great.

On my own laptop I’ve found it to be maybe 85% or so as quick as normal single
mode Postgres. That said things aren’t great when the storage layer is doing
consensus and replication and then the database is as well.

------
ralusek
Glad to see Cockroach Cloud, I signed up for the beta.

I'm only really interested in an elastic, managed program with transparent
pricing. As a contractor with a fair amount of exposure to various startups
around the bay, I find these to be relatively ubiquitous criteria.

~~~
pakshi
CockroachCloud PM here -- we hear you on the transparent pricing. Cloud
pricing is transparent, see here
[https://www.cockroachlabs.com/docs/cockroachcloud/stable/coc...](https://www.cockroachlabs.com/docs/cockroachcloud/stable/cockroachcloud-
create-your-cluster.html#step-2-select-the-cloud-provider)

Glad to hear you signed up for the beta. We've been letting people off in
batches and hope you can use CockroachCloud for your apps! If you have any
feedback on the beta product, do let us know!

------
greatjack613
Is this the new version that has the new license terms that prevent amazon
doing an "elastic" on them?

------
Kognito
Waiting for the comments about the name.

On a more serious note, looking forward to trying this new release out. I
found previous versions really straightforward to get up and running with on
Kubernetes but performance was always lacklustre.

~~~
irfansharif
You can find our more recent performance numbers on
[https://www.cockroachlabs.com/docs/v19.2/performance.html](https://www.cockroachlabs.com/docs/v19.2/performance.html)

~~~
aaronbwebber
Is that saying that you are comparing Cockroach running on 81 c5d.9xls vs
Aurora running on 2 r3.8xls? I get that part of what you are trying to show is
that Cockroach will scale far past what any single-master system can, but it
feels pretty lame to run a test comparing transaction throughput on 2900 cores
vs 64 cores and a data set size comparison on 81 hosts vs 2.
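
The core counts quoted above can be checked with a quick calculation. This is only a sketch: it assumes "c5d.9xls" and "r3.8xls" mean the AWS `c5d.9xlarge` and `r3.8xlarge` instance types, and that they have 36 and 32 vCPUs respectively, per AWS's published specs.

```python
# Assumed vCPU counts per AWS instance type (not stated in the thread itself).
VCPUS = {"c5d.9xlarge": 36, "r3.8xlarge": 32}


def total_vcpus(instance_type, count):
    """Total vCPUs for `count` instances of the given type."""
    return VCPUS[instance_type] * count


cockroach_vcpus = total_vcpus("c5d.9xlarge", 81)  # 81 * 36 = 2916
aurora_vcpus = total_vcpus("r3.8xlarge", 2)       # 2 * 32 = 64
ratio = cockroach_vcpus / aurora_vcpus            # 45.5625

print(cockroach_vcpus, aurora_vcpus, ratio)
```

Under those assumptions the CockroachDB side has roughly 45 times as many vCPUs, which is consistent with the "2900 cores vs 64 cores" figure in the comment.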

The sysbench metrics seem like a much fairer comparison, and CockroachDB
looks great in those metrics as well, so I don't really get why you are
leading with a comparison that looks really sketchy at first glance.

~~~
awoods187
Hi- product manager from CRL here. You are right--we hope to demonstrate that
CockroachDB is built to scale horizontally. Deployments of CockroachDB can
grow by easily adding more nodes to the cluster which in turn linearly scales
throughput. We have posted the most recent published Aurora numbers as a
comparison to demonstrate how architecture can influence scale.

We also hear your point about efficiency as tpmC (throughput) alone isn't
sufficient to compare systems without taking hardware into account. TPC-C asks
users to provide a tpmC per dollar amount. We conducted this price comparison
previously in this blog post
[https://www.cockroachlabs.com/blog/cockroachdb-2dot1-perform...](https://www.cockroachlabs.com/blog/cockroachdb-2dot1-performance/).
These results are even lower in 19.2 because we can achieve greater tpmC with
fewer nodes.

This page wasn't meant to be competitive; we simply showed Aurora as a reference
point. Since we want to focus on CockroachDB we will remove the Aurora
comparison.

------
cglace
How does CockroachDB compare to YugabyteDB?

~~~
manigandham
CockroachDB is built from scratch in Go, uses Raft + RocksDB as a distributed
key/value store, and then implements the Postgres data structures and wire
protocol on top. Focuses on strong consistency and durability (which was the
origin of its name). Can run active/active in multiple regions. Lacks full
compatibility with Postgres.

Yugabyte is a multi-model database built in C++, uses Raft + RocksDB as a
distributed key/value store, and then implements a proprietary document-store
model on top called DocDB. This is exposed as a Redis API, Cassandra API, and
now PostgreSQL RDBMS interface by using the actual Postgres code for the SQL
layer. Much higher compatibility with Postgres with all data types, most
advanced SQL, and even some extensions that work without problems. Not as
advanced on the distributed side and uses more of a primary/replica setup.

Both are great options for a distributed scalable RDBMS with Postgres.

~~~
bogomipz
Thanks for this overview. Is Cockroach an open-source version of Google's
Spanner as well?

~~~
irfansharif
Not technically open-source (BSL-licensed), but yes, source available on
[https://github.com/cockroachdb/cockroach/](https://github.com/cockroachdb/cockroach/).

------
ainar-g
Heh, they didn't seem to test their University website[1] before publishing it.
There is a gaping:

    
    
      <!-- End Google Tag Manager -->
      Additionally, paste this code immediately after the opening <body> tag:
      <!-- Google Tag Manager (noscript) -->
    

In the source code. That “<body>” isn't escaped either, which makes the whole
thing borked-up.
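
For illustration, a minimal sketch of the escaping that would keep such a literal "<body>" from being parsed as markup, using Python's standard `html.escape`. How the page was actually generated or fixed is not known; this only shows the general technique.

```python
import html

# The stray instruction text that ended up in the page source.
snippet = "Additionally, paste this code immediately after the opening <body> tag:"

# html.escape replaces &, <, and > (and quotes by default) with character references.
escaped = html.escape(snippet)
print(escaped)
# Additionally, paste this code immediately after the opening &lt;body&gt; tag:
```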

This is why we need XHTML.

[1]
[https://university.cockroachlabs.com/catalog](https://university.cockroachlabs.com/catalog)

~~~
thatnerd
Thank you, we didn't get that resolved beforehand. Should be fixed now.

~~~
ainar-g
Wow, that was quick. Thank _you_ :-)

------
sandstrom
Awesome to see a new release! We've followed Cockroach for a while, would be a
good fit for our workload.

However, I think it's sad that their pricing is so opaque.

Even after a few emails back and forth with their sales people it still feels
fairly arbitrary, and since it's not transparent we don't know if they're just
making numbers up based on what they think we're willing to pay.

I find this sales tactic off-putting. In theory it allows them to "not leave
any money on the table", as they would probably frame it, but I'm curious how
many deals they will lose simply because people don't want to be held hostage,
or don't want to put up with the sales-hassle.

With AWS and Google Cloud (their main competitors), we know that prices won't
be jacked-up.

———

An interesting example of this mistake is Mapbox.

They had a limited free tier, where most use-cases would be "talk to sales for
pricing". We had a frustrating 2 months talking with their sales people,
finally getting a reasonable deal. But we were very close to just ditching
them for Google Maps (even though I liked Mapbox's product better and was
willing to pay more for it), just because the price negotiation and all the
cheap sales tactics were so time-consuming.

Then, 6 months later, Mapbox CEO posted this blog post:

[https://blog.mapbox.com/new-
pricing-46b7c26166e7](https://blog.mapbox.com/new-pricing-46b7c26166e7)

    
    
        [excerpt, from Mapbox CEO announcing pricing/sales rethink]
        
        How we price our tools has a significant impact on how those tools get used 
        and what gets built with Mapbox. We were pricing some of our APIs wrong — 
        making things confusing and restrictive. Rolling out new pricing took 
        close to five months; informed by the stories of our builders 
        whose curiosity and insight gave us a lot of honest feedback. 
        
        Here’s what we were doing wrong:
        - Unpredictable pricing at scale
        - Development slowed by price modeling and negotiating volumes
        - Unintuitive billing units
        - Hard to compare measurement to our friends at Google
        - Confusion around when commercial plans are needed
    
        Our goal with this change is to reduce the friction to build with 
        our tools and to allow our team to help developers use maps and locations
        more creatively. As we designed this new pricing, we kept our key 
        pricing principles in mind:
        
        - Predictable and aligned with metrics our customers already measure
        - Clearly defined discount tiers as businesses scale with no surprises
        - Product usage measured in a way that’s clear to all involved
        - Generous free tier to encourage building and make it easier to get started
    

I experienced all those downsides, which almost pushed us to leave Mapbox, so
very happy to see them reconsider (for their own sake).

Perhaps CockroachDB could unlock similar benefits by making their
pricing/sales more open.

~~~
dvasdekis
Agree. My current firm has a policy of not working with companies that don't
post public prices, which sadly writes off CockroachDB for us. It's a shame
because we have a major use case for it.

~~~
pakshi
CockroachCloud PM here. CockroachCloud pricing is transparent. You have the
option of three different node sizes in GCP and AWS:
[https://www.cockroachlabs.com/docs/cockroachcloud/stable/coc...](https://www.cockroachlabs.com/docs/cockroachcloud/stable/cockroachcloud-
create-your-cluster.html#step-2-select-the-cloud-provider)

------
nwmcsween
Somewhat related but why not have an object store like ceph but expose the
guarantees of the data (cap or whatever) and just interact with objects? Bonus
points for making the API pluggable.

------
heelix
Anyone use the cross data center clustering in the real world? Got a bunch of
people considering this - a cluster spanning two data centers as the way to do
transactional updates.

------
yRetsyM
CockroachDB seems to be getting a lot of attention and marketing cycles
lately... What has made it the focus of late?

~~~
erdaniels
Marketing

~~~
robbrown451
Probably has to do with their decision to name it after the cockroach. I mean,
if they had called it MaggotDB or BotFlyDB or TapewormDB, people might find it
unappealing.

~~~
ixtli
Those aren't nearly as resilient though. Now, if you made a version of sqlite
that replicated itself inside your program then you could use botfly ...

~~~
fnordsensei
The successor will have to be named TardigradeDB.

~~~
SEJeff
You can't even kill it with Boring Company's flamethrower! Tardigrades are
virtually indestructible as going into space and coming back won't always kill
them.

------
suyash
Love the product name!

