
Ask HN: What Is the Value of CockroachDB? - lokiju
I don't mean this sarcastically. I keep hearing that this technology is great and distributed, but I can't really see the use case in current enterprises.

Is there something I am missing? Why would someone switch their backend databases for something that does not even offer data warehousing or other data management tools?
======
neilhan
Disclaimer: I am from PingCAP, the company that built TiDB.

CockroachDB is well suited for applications that require reliable, available,
and correct data with millisecond response times, regardless of scale. It
belongs to the category of NewSQL, which promises to combine the benefits of
an RDBMS (strong consistency) with those of NoSQL (scalability), mainly
through new architecture patterns and efficient SQL storage engines. TiDB and
CockroachDB are two of the leading NewSQL databases. Each implementation has
its own take on how to ensure strong consistency with a scalable architecture.

The following situations are hints that a switch may be worth considering:

1\. RDBMS is becoming the performance bottleneck of your backend service

2\. The amount of data stored in the database is overwhelming

3\. You want to run complex queries on a large amount of data that cannot
fit on one machine without manual sharding

4\. Your application needs full ACID transactions on data distributed across
multiple machines.
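
To make "manual sharding" in point 3 concrete, here is a rough Python sketch of the routing logic an application ends up owning when it splits a single-node RDBMS by hand. The shard names and the `shard_for` and `shards_for_query` helpers are hypothetical and only illustrate the idea; they are not part of any database's API.

```python
import hashlib

# Hypothetical shard list; in a hand-sharded setup the application owns this mapping.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]


def shard_for(user_id, shards=SHARDS):
    """Pick a shard by hashing the key (sha256 is stable across processes)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def shards_for_query(user_ids):
    """Group keys by shard. A query touching several shards has to be fanned
    out and merged by the application, with no cross-shard ACID guarantee."""
    plan = {}
    for uid in user_ids:
        plan.setdefault(shard_for(uid), []).append(uid)
    return plan
```

A distributed SQL database moves this routing, and the cross-shard transaction handling from point 4, out of the application and into the database itself.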

There are multiple NewSQL choices too. Engineers often compare TiDB and
CockroachDB, so here are some considerations before you decide.

TiDB is compatible with the MySQL protocol while CockroachDB is compatible
with PostgreSQL. You can directly connect to TiDB server using any MySQL
client and benefit from the MySQL ecosystem. Plus TiDB is well adopted and
trusted by 1000+ large scale Internet companies and banks.

Currently, CockroachDB is not suitable for Online Analytical Processing (OLAP)
while TiDB is a Hybrid Transactional and Analytical Processing (HTAP) database
that supports both OLTP (Online Transactional Processing) and OLAP workloads.
So with TiDB, a typical ETL (Extract, Transform and Load) process that moves
data to a different database for analysis is no longer required, which lets
you create new value for your users more easily and quickly.

Hope the above can help you a bit.

~~~
lokiju
Thanks, very helpful.

I have a feeling these technologies will be much more mainstream in 2-3 years

~~~
neilhan
No problem at all:)

Indeed! With digitalisation accelerating in the current situation and datasets
growing to a large scale, we see huge demand for services like this.

------
jasonhansel
It's relational, it scales horizontally, and it's free. That combination is
surprisingly hard to find.

~~~
zem
i'm not a database person, but does postgres not scale horizontally?

~~~
kasey_junk
No. Read replication approaches linear horizontal scaling (though global
latencies can lead to weird edge cases).

Write replication requires external products.
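
A toy Python sketch of why that is, assuming a single-primary setup with streaming read replicas. The `ReplicatedCluster` class and the node names are made up for illustration, and the read detection is deliberately naive:

```python
import itertools


class ReplicatedCluster:
    """Toy model: one primary takes every write, replicas share the reads."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # Fall back to the primary for reads if there are no replicas.
        self._next_reader = itertools.cycle(self.replicas or [primary])

    def route(self, statement):
        # Naive check, good enough for the sketch: only SELECTs count as reads.
        is_read = statement.lstrip().upper().startswith("SELECT")
        return next(self._next_reader) if is_read else self.primary
```

Adding replicas spreads the reads over more machines, but every write still lands on the one primary, so write throughput stays bounded by a single node.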

~~~
jasonhansel
Specifically, Postgres extensions that add write sharding are generally either
(a) non-FOSS commercial products or (b) not very well maintained.

------
BruiseLee
From my experience, CockroachDB is super simple to set up. You can pretty much
follow their tutorial and get a working cluster in no time. So I would say
it's a good solution for non-experts. Now if you are already running Postgres,
then I don't know why you would want to switch.

------
kasey_junk
The locality features are huge. Being able to do database operations near
where your users are is a big win for latency in global environments.

Pairing that with edge compute like fly.io is a killer combination. That you
can use most of your normal Postgres libraries with it makes it an easy
transition.

~~~
lokiju
I see, thanks. The uses for locality seem a little niche though no? Are the
benefits that much greater that it would be noticeable to users?

~~~
kasey_junk
You can see hundreds of ms of latency per request based purely on distance.

Tokyo - NYC is ~175 ms and that’s a good route. Compound that across all the
requests your app does.
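
As a back-of-the-envelope sketch in Python: the 175 ms figure is the one above, while the 10 sequential queries per page and the ~5 ms in-region round trip are assumed numbers picked only for illustration.

```python
def total_network_wait_ms(round_trip_ms, sequential_requests):
    """Time spent purely on the wire when requests are issued one after another."""
    return round_trip_ms * sequential_requests


# Assumed workload: 10 database round trips to serve one page.
far = total_network_wait_ms(175, 10)   # database on the other side of the world
near = total_network_wait_ms(5, 10)    # database in the same region (assumed ~5 ms)
print(far, near)  # 1750 50
```

Under those assumptions the distant database costs 1.75 seconds of pure network wait per page, against 50 ms for a nearby one.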

------
cdbattags
“Shard your Postgres DB without the typical devops headaches.”

------
heelix
Keep in mind, this is the only part of that elephant I've interacted with. The
key things for us were that it was transactional and had cross-data-center
replication. You could update a node in data center A and have it replicate
down to the other data center with minimal latency.

------
jaymce
NOTE: I am on the product marketing team at Cockroach Labs.

Cockroach Labs has been building CockroachDB for over five years and has
reached a level of maturity where many enterprise and community customers are
gaining value from the database.

We consider it a cloud-native database as it was architected with the same
principles as a distributed platform like Kubernetes. It is built for scale
and resilience, is shared-nothing, and consists of a single binary across all
nodes with a consistent API. We have also gone to great lengths to ensure it
is wire-compatible with PostgreSQL and implements a large portion of standard
SQL syntax. It is used as a relational database.

The key points we typically talk about fall into four areas that we lovingly
refer to as CRLS (an acronym for Cockroach Labs):

Consistency - the database implements serializable isolation across
distributed nodes within a cluster, with acceptable latencies in both local
and globally dispersed deployments.

Resilience - the database replicates data across nodes in the cluster, and you
can configure survivability at the table level so that it tolerates the loss
of a node, zone or even region. It is an active-active system that is always
on, and data is always available. Further, you can implement online schema
changes and rolling upgrades in production without downtime.
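
The replication protocol is not spelled out here. Assuming majority-quorum replication (the way Raft-style consensus works), the link between replica count and survivability can be sketched in Python as below. The function names are hypothetical and this is plain arithmetic, not a CockroachDB API.

```python
def tolerated_failures(replicas):
    """Replicas that can be lost while a strict majority is still reachable."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return (replicas - 1) // 2


def is_available(replicas, failed):
    """Data stays readable and writable only while a strict majority is up."""
    return replicas - failed > replicas // 2
```

Under that assumption 3 replicas survive one failure and 5 survive two. Surviving the loss of a whole zone or region additionally requires that the replicas be spread so that no single zone or region holds a majority.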

Scale - scaling the database is accomplished by simply spinning up a node and
pointing it at the cluster. The database will take care of redistributing
replicas to incorporate the new node. This is basically auto-sharding, and it
adheres to the survivability policy described above.
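
A toy Python sketch of what "redistributing replicas" can look like. This is a simplified greedy balancer written for illustration, not CockroachDB's actual allocator, and the `rebalance` function and node and range names are hypothetical.

```python
def rebalance(placement):
    """placement maps node name -> set of range ids it holds a replica of.

    Repeatedly moves one replica from the fullest node to the emptiest node
    until their replica counts differ by at most one. A node never receives
    a range it already holds, so replicas of a range stay on distinct nodes.
    """
    nodes = {name: set(ranges) for name, ranges in placement.items()}
    while True:
        fullest = max(nodes, key=lambda n: len(nodes[n]))
        emptiest = min(nodes, key=lambda n: len(nodes[n]))
        if len(nodes[fullest]) - len(nodes[emptiest]) <= 1:
            return nodes
        # Non-empty: the fullest node holds at least two more ranges.
        movable = nodes[fullest] - nodes[emptiest]
        moved = min(movable)  # deterministic choice for the sketch
        nodes[fullest].remove(moved)
        nodes[emptiest].add(moved)
```

Starting from three nodes that each hold replicas of the same three ranges, adding an empty fourth node and calling `rebalance` shifts replicas onto it while every range keeps three replicas on three different nodes.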

Locality - Unique to CockroachDB, you can also tie data to any particular node
in the cluster. You define this at the table level. The database uses KV at
the storage layer to accomplish this. Each table is represented as ordered KV
pairs using the PK for the table for the Key. You work the location into the
key and the distribution policy will ensure data is written to explicit nodes.
This is used to ensure low latency access of data and/or to tie data to a
location for compliance and privacy requirements. With online schema changes,
you can manipulate the PK in production and rearrange where data is physically
stored in a live production cluster.
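
Here is a rough Python sketch of the idea, under the assumption that rows are stored as ordered key-value pairs keyed by the primary key. The key layout, the `PLACEMENT` table and the node names are all hypothetical; this is not CockroachDB's real key encoding or its configuration API.

```python
import bisect

# Hypothetical pinning policy: key prefix (table, region) -> nodes allowed to store it.
PLACEMENT = {
    ("users", "eu"): ["eu-node-1", "eu-node-2", "eu-node-3"],
    ("users", "us"): ["us-node-1", "us-node-2", "us-node-3"],
}


def row_key(table, region, row_id):
    """Primary key with the location as its leading column; tuples sort in order."""
    return (table, region, row_id)


class OrderedKV:
    """Minimal ordered key-value store: keys are kept sorted."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix):
        """Keys sharing a prefix are contiguous, so one region is one key range."""
        start = bisect.bisect_left(self._keys, prefix)
        rows = []
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            rows.append((key, self._values[key]))
        return rows


def nodes_for(key):
    """Look up which nodes may hold the row, based on its (table, region) prefix."""
    return PLACEMENT[key[:2]]
```

Because the region leads the key, all rows for one region sit next to each other in key order, which is what makes it possible to pin that contiguous range to specific nodes.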

There are numerous other capabilities in the database. We have a complete UI,
deliver distributed backup/restore, change data capture and have implemented a
cost-based optimizer that uses locality.

It is a relational database that you can deploy as a service on Kubernetes
and/or in any cloud.

There are countless more capabilities and we invite you to check out one of
our features we are most proud of… our documentation!

Also, please check out our customer page - yes, these are high level but there
are some stories there.

~~~
lokiju
Thanks for the info - very helpful. Are there any specific use cases that you
can point to? I don't doubt that the tech is robust, just that the use cases
aren't too obvious right now.

~~~
radub
There are quite a few case studies / customer stories on the website:
[https://www.cockroachlabs.com/customers/](https://www.cockroachlabs.com/customers/)

------
closeparen
It's an OLTP database, not a warehouse.

------
cocktailpeanuts
Every time CockroachDB has been on HN, the conversation has always been about
their name ("what a shitty name for an enterprise software" vs. "who cares if
it works well"), but I have also been wondering the same question.

Are they getting adoption? And what's their unique value proposition?

~~~
atrilumen
I think the name is probably hurting adoption.

The creature it refers to is pretty universally considered repulsive. I'm no
psychologist or anything, but I would expect that to have an undue influence
on even the most rational person.

~~~
muzani
I don't know anything about it, but my first impression is that it's as robust
as a cockroach, able to take a lot of damage.

~~~
conistonwater
The main metaphor I'm familiar with is that cockroaches always come back even
as you try to kill more and more of them, rather like zombies, and never in a
positive way.

------
irfansharif
what "data warehousing"/"data management" tools are missing?

~~~
diehunde
In the case of OLAP queries, it's performance[1], which is not surprising
since it's an OLTP database AFAIK.

I don't know what he means by "data management"

[1] [https://www.cockroachlabs.com/docs/stable/frequently-asked-q...](https://www.cockroachlabs.com/docs/stable/frequently-asked-questions.html#when-is-cockroachdb-a-good-choice)

~~~
lokiju
What I mean is - and I get that these are somewhat different technologies -
snowflake offers a more seemingly comprehensive platform for data by
aggregating it and focusing the product on that aggregation for future use.

Compared to CockroachDB, which seems like a niche SQL replacement?

------
redis_mlc
As a DBA, I would not use a database product without 5 years of reported
production success. Additionally, most companies don't need yet another
product to do training and support on.

