CockroachDB 2.0 Performance Makes Significant Strides

atombender · on March 29, 2018

Looks very promising! We've looked at Cockroach for a particular project, and we've been concerned that performance wasn't good enough.

Cockroach performance seems to scale linearly, but single-connection performance, especially for small transactions, seems rather dismal. Some casual stress testing against a 3-node cluster on Kubernetes showed that small transactions modifying a single row could take as much as 7-8 seconds, where Postgres would take just a few milliseconds.

The documentation recommends that you batch as many updates as possible, but obviously that doesn't work for low-latency applications like web frontends that need to be able to do small, fine-grained modifications.

thanatos_dem · on March 30, 2018

7-8 seconds? Something definitely sounds misconfigured. I've been running a 1.1.x cluster for quite a while and I've never seen a single row transaction take that long. And even the slowest queries took at most ~500ms, and that was with:

  - Replication factor increased to 5x (rather than the 3x default)
  - 8 indexes on the table being modified which also needed to be updated
  - Nodes spread across North America, incurring higher RTT latency between nodes
  - Relatively high contention on the data triggering client-side retries
  - HDD's as the storage medium (RockDB is optimized for SSDs)

jetrink · on March 29, 2018

7-8 seconds seems extremely long. Human beings performing the raft consensus algorithm using paper and pencil over Skype wouldn't be much slower than that. Are you sure everything was working correctly?

lacker · on March 29, 2018

I don't know about you, but it would take me a lot longer than 7 seconds to perform the raft consensus algorithm with paper and pencil.

danicgross · on March 30, 2018

Who are you to judge? Did you win the Putnam or something?

Gigablah · on March 30, 2018

I think people are generally allowed to judge themselves without industry credentials...

ComputerGuru · on March 30, 2018

That was tongue-in-cheek (that still had no business on HN), a reference to @lacker's two Putnam awards.

grzm · on March 30, 2018

The Putnam is also the source of one of the best burns on HN: https://news.ycombinator.com/item?id=35079

tyingq · on March 29, 2018

" small transactions modifying a single row could take as much as 7-8 seconds"

That's surprising. I wasn't expecting CockroachDB to be really fast, given the constraints they work within. But that sounds more like a bug or config error. Unless perhaps you mean a really high number of processes trying to update the same row at the same time? Like a global counter or something?

atombender · on March 30, 2018

Indeed, the stress test updates just one row, which mirrors certain write patterns in our application. I just started this testing, so we'll see what happens when I extend it to more than one row.

orangechairs · on March 30, 2018

[cockroach employee here] Have you hit us up on Gitter or Stack Overflow to help debug and tune? We'd also love to learn more about how you're using K8s, what your setup looks like, surprises you're running into with it, etc.

atombender · on March 31, 2018

Thanks, I'll do that when I get back to testing.

welder · on March 30, 2018

Did you use the 2.0 beta version or the latest stable release? They improved performance a lot in the 2.0 beta released this month.

atombender · on March 30, 2018

I used 1.1.6. Looking forward to testing 2.0.

jnordwick · on March 29, 2018

> low-latency applications like web frontends

...

atombender · on March 29, 2018

We have a collaborative, Google Docs-like application that currently issues a write every time someone types into a text field. Now, clearly it's suboptimal and something that should be optimized to batch the updates, but on the other hand, with Postgres we've had zero incentive to make such an optimization, because it's able to handle thousands of writes per node in real time with no queuing happening on the client. I don't expect this from Cockroach, but I would definitely want low latency.

thanatos_dem · on March 30, 2018

Lordy, relational databases are not the way to go for that problem... With a single shared resource (document), you're going to be encountering write conflicts left and right.

Have you explored implementing a CRDT based solution like WOOT instead?

atombender · on March 30, 2018

Definitely. The application is conceptually a transaction log of field/subfield patches, which would lend itself to something like an LSM, and we're looking at possible alternatives.

CRDTs could be a solution, but from what I gather they require too much context information to be viable for a text editing application. Our app currently uses something similar to OT.

jacquesm · on March 30, 2018

He's saying it works. Why change it if it works?

Groxx · on March 30, 2018

Why write conflicts? Contention, sure, but contention isn't an issue until you have literal tons of it.

thanatos_dem · on March 30, 2018

From OP's description

  > issues a write every time someone types into a text field

With more than a handful of people, this is getting into conflict territory pretty rapidly, especially if the document is structured as a single row (hopefully it's more granular than that). Time for some back of the envelope maths:

Assuming that an average person types at around 200 words per minute (number pulled from https://www.livechatinc.com/typing-speed-test/#/), that's a character every 300ms on average. With 10 people editing the document, that's a character every 30ms on average, which can easily lead to conflicts if they're all trying to update the same resource.

dalore · on March 30, 2018

Perhaps it's event sourcing based. Every time someone types into a field it writes a row that something was typed which is a record of what was typed. Play it back and you have the full document with no conflicts.

gmueckl · on March 30, 2018

With the caveat that the stored events are not conflicting with each other. So a central instance needs to check each event for validity before allowing it into the log. The check cannot be parallelized easily without introducing races between event insertions.

dalore · on April 1, 2018

No event sourcing architecture I know of advocates that at all.

One of the strengths of event sourcing is that you can fix things after the fact. Say you got a wrong event like you suggested that conflicts. You don't check it at the time before allowing it in log. You notice the wrong event, delete it, and replay the log.

You can always replay the log to get you current point of time.

Groxx · on March 30, 2018

If you're talking about "insert char X at location Y" leading to undesirable changes: sure. But that has nothing to do with the DB type, and everything to do with how changes are resolved. CRDT / etc can all be written on top of any DB.

Updating a DB every 30ms should be trivial. Heck, you should be able to grab an exclusive row lock, double check your state, and write your change without even coming close - 100% conflict or deadlock free, regardless of the number of writers, simply by using a DB as it is designed to be used. In this case, by using the biggest reasonable hammer available: make everything sequential at the DB level. You can absolutely build other systems on a relational DB that don't have that limitation.

zzzcpan · on March 29, 2018

Poor UI choices these days do not provide feedback and reload, hence making low-latency necessary to even be tolerated by users.

segmondy · on March 29, 2018

I like what Cockroach is doing, I'm rooting for them to grow and survive. Unfortunately the only time I hear about it is when they post blogs. I never hear about it from other people.

AlimJaffer · on March 29, 2018

raises hand we're using them extensively. They're our database of choice that we've paired with Nakama[1] which is an open-source, distributed server. Have nothing but great things to say about the database itself in terms of growing performance and the team behind it :). They've been great to us since day-1.

[1] github.com/heroiclabs/nakama

netghost · on March 29, 2018

What kind of workload are you using it for? What's been your biggest win while using it?

AlimJaffer · on March 29, 2018

A couple of our use-cases include: good KV access (stored user data etc.) and listing blocks of data that has been pre-sorted on disk at insert time (leaderboard records, chat message history etc.). As well, the clustering technology is particularly useful at scale. We work in the games space with some very large games in production, which allows us to spread the load across multiple database nodes and offers us peace of mind regarding redundancy.

tomerbd · on March 30, 2018

What do you mean by KV access isn't it a relational data store? do you store the value as a blog? json? if so how do you then do queries on that value's values?

staz · on March 30, 2018

IIRC it's a KV store which offer a relational data store interface for usability/compability

wilbeibi · on March 29, 2018

The thing I really don't get is why CockroachDB is avoid benchmarking with it's rival tidb (https://github.com/cockroachdb/docs/issues/1412). tidb already pretty mature, used in many big companies (Let's say, Didi, which on the similar scale data with Uber, and banks).

Even if I like CockroachDB's pg sql more, it would be helpful to have the comparison/benchmark to show something more.

atombender · on March 29, 2018

TiDB looks promising, but it doesn't have serializable transactions at all, which makes it something of an apples-to-oranges comparison at the moment when it comes to OLTP.

TiDB has a weird kind of variation on "read committed" where you get phantom reads (though they're not called that in the documentation, which is actually ambiguous on this point). This is a problem for apps that expect consistency.

siddontang · on March 30, 2018

Siddon Tang from TiDB here. Thanks for your interest in TiDB! TiDB uses Snapshot Isolation (https://en.wikipedia.org/wiki/Snapshot_isolation), which is similar to Repeated Read in MySQL. It doesn’t allow Read Phantom but can’t avoid Write Skew. IMO, this can work well for OLTP in most of the cases. If you really care about serializability, you can use `select for update` to promote the Read as Write explicitly like other databases do.

TiDB supports READ COMMITTED isolation which is not the same as MySQL, but it is just designed for some special cases for TiDB itself and it is not recommended for external users.

atombender · on March 31, 2018

Thanks, that's helpful!

gregwebs · on March 30, 2018

The documentation states that phantom reads are not possible (under the default isolation level or repeatable read): https://github.com/pingcap/docs/blob/master/sql/transaction-...

manigandham · on March 29, 2018

TiDB is much more complicated to run with several moving pieces. Also missing a lot of standard relational features as you can see from the roadmap: https://github.com/pingcap/docs/blob/master/ROADMAP.md

jinqueeny · on March 30, 2018

As a distributed database system, the highly-layered architecture of TiDB is not a disadvantage because it makes TiDB easier to debug and diagnose when issues occur. The complicity resulted from the separated modules can be easily tackled by automatic deployment tools and microservice framework. For example, TiDB can be easily deployed either using Ansible: https://pingcap.com/docs/op-guide/ansible-deployment/ or Docker Compose: https://pingcap.com/docs/op-guide/docker-compose/

TiDB has been widely adopted by many users (https://github.com/pingcap/docs/blob/master/adopters.md) in production because it support the best features of both RDBMS and NoSQL. It is quickly evolving and iterating based on users’ requirements which are prioritized and listed on the Roadmap.

manigandham · on March 30, 2018

Sure, I get the advantages of layers in architecture, especially the growing trend of separating compute and storage.

However there's a difference between architecture and deployment, and having everything in a single package makes operations much easier. CockroachDB also uses a KV storage layer (using RocksDB) with SQL on top.

gregwebs · on March 30, 2018

TiKV uses Rocks also which is not distributed. TiKV is distributed and used by TiDB. TiSpark also uses TiKV directly which starts to demonstrate some of the advantages of the layered architecture.

manigandham · on March 30, 2018

I get it. Both databases are layered, but I'm saying deployment is still easier with a single package/binary instead of having TiDB, TiKV and Placement Driver instances.

siddontang · on March 31, 2018

TiDB provides many some ways to let the deployment easier. If you just want to have a try on the single machine, we provide docker-compose (https://github.com/pingcap/tidb-docker-compose). But if you don't like docker, the single binary may be easier.

For the business users, we support ansible and K8s deployment, both can help them run the TiDB easily and quickly in production. The deployment is not a problem so far.

IMO, for the distributed system, if you need to operate many instances (10, 100 or even more), what you pay attention to most is not whether the instance is one single binary or not, but is how to operate all these easily. At that time, there is no much difference.

summarity · on March 30, 2018

CockroachDB's SQL implementation is still heavily unoptimized for actual business queries (subqueries, unnest, (recursive) CTE, schema changes). Features like this either take way too long or are not even implemented yet. Some of them are in the 2.1 milestone though. Let's hope.

I'd really love some kind of distributed-for-performance database using the exact optimizer and query planner of SQLite plus std plugins (FTS5, JSON, transitive_closure, spatial). Something like a mix between Bloomberg's comdb2 (which uses a modified SQLite frontend) and rqlite (distributed-for-safety).

Note: You can save most of the shortcomings of CDB on the SQL client side today, but don't underestimate the time it takes to implement CDB-specific workarounds...

jimbokun · on March 30, 2018

"I'd really love some kind of distributed-for-performance database using the exact optimizer and query planner of SQLite plus std plugins..."

Why do you think the identical optimizer and query planner would work in a distributed environment, with no changes from the single server implementations?

summarity · on March 30, 2018

Sure there have to be (minor) changes and some removals, but at the core that's exactly what Comdb2 did, and it works pretty well.

If Comdb2 7 ever reaches stable, and BLP invests some time into deployment, ops and monitoring tools/docs, it'll be a strong competitor.

etaioinshrdlu · on March 29, 2018

Project idea: globally hosted / managed CockroachDB that lets developers quickly start building small apps cheaply or free using this database.

This database has the potential to dethrone Spanner in a major way.

joris · on March 30, 2018

That’s on their roadmap: https://www.cockroachlabs.com/docs/stable/frequently-asked-q...

SoulMan · on March 30, 2018

My Org/team is too conservative to use this is they have to hire ops and too froogle to use spanner.

kevincox · on March 30, 2018

Why do you think that hosted cochroachdb would be cheaper then hosted spanner? Google has been optimizing spanner performance for years so I would expect that it will be cheaper to run for quite a while. Of course the markup can be different but I wouldn't expect it to make a huge difference.

tyingq · on March 30, 2018

Maybe training? Spanner's lack of UPDATE/INSERT/DELETE requires a team that's trained pretty specifically on how it works.

ZeikJT · on March 30, 2018

Frugal, froogle was the old name for google shopping.

SoulMan · on April 13, 2018

Thanks. Frugal is exactly what I meant.

thinkloop · on March 30, 2018

I just started playing with spanner. The API is nice. Simple txs are a bit slow - seconds instead of milliseconds - ok for my use-case. I don't hear much about it though, few articles on HN, I was wondering how mature it was and how widely used it is. Is it common knowledge that it is used by a lot of orgs?

etaioinshrdlu · on March 31, 2018

De-throne in the sense of being the top-tier multi-master consistent transactional DB. I doubt too many people use Spanner outside of Google in reality. Google would likely eventually adopt cockroackdb if it was clearly better long term.

Gigablah · on March 30, 2018

There’s arguably nothing to dethrone, Spanner is too cost prohibitive for small developers in the first place.

qaq · on March 29, 2018

How is this meaningful without detailed setup description? http://www.tpc.org/tpcc/results/tpcc_results.asp?print=false... Looking at this list of results one wonders what those results actually mean?

ElijahLynn · on March 29, 2018

At the bottom of the article it says that information is coming:

"Note: We have not filed for official certification of our TPC-C results. However, we will post full reproduction steps in a forthcoming whitepaper."

qaq · on March 29, 2018

I guess will have to wait. The progress they made is obviously impressive but would really help if one could understand the overhead vs conventional RDBMS 5X might be OK 20X not so much.

espadrine · on March 30, 2018

The minimum latency will still be at least ~5ms (worse if nodes are very far apart) and especially bad for contended reads (because of the way they use clocks for serializable operations).

A traditional RDBMS does not have to worry about split brain decisions, but it can hardly do multi-master in the intuitive way.

pinars · on March 29, 2018

I think you can still drive some insights. I clicked on the TPC-C results you shared and read their executive summaries.

The Oracle on SPARC cluster (at the top, 2010) performs 30.2M qualified tx/min vs the 16K tx/min in this blog post. The Oracle cluster also costs $30M, which is clearly higher than the Cockroach cluster's cost.

That said, the TPC-C benchmark is new to me. Happy to update this comment if I'm misreading the numbers.

(Edited to incorporate the reply below.)

arjunnarayan · on March 29, 2018

A short note that the total cost of that SPARC cluster was $30 million. You're not misreading those numbers, but it requires a little context.

We're focusing today on our improvements over CockroachDB 1.1, using a small-ish cluster. We'll be showing some more scalability with larger clusters in the coming weeks. If you've found CockroachDB performance slow in the past, you will be pleasantly surprised with this release!

pinars · on March 29, 2018

Sure thing. I was primarily answering the question above - in terms of how the numbers in the TPC-C benchmark fit in. I updated my comment to reflect the cost.

I think what's interesting with TPC-C is that you can sort the results based on performance or price/performance. On the price/performance metric, SPARC looks expensive. Dell has a $20K SQL Anywhere cluster that can do 113K tx/min.

I wonder if anyone tried to run these benchmarks on the cloud and how one would calculate total cost of ownership there now.

qaq · on March 29, 2018

you do realize it's ancient hardware thats $300-400 USD on ebay now.

tyingq · on March 29, 2018

http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=11...

Yeah, but 1700 cores worth. That's still a lot of $300 boxes. Like qty 53 Sparc T3-2's for example. Which seem to be $1200 to $2k on eBay. And unsupported, end of life, etc.

I'd compare CockroachDB's number to some more recent result with a similar number of cores. (If you can find one)

guipsp · on March 30, 2018

Minor correction: They used 54 SPARC T3-2. You can see exactly which components they used in http://c970058.r58.cf2.rackcdn.com/fdr/tpcc/Oracle_SPARC_Sup...

tyingq · on April 1, 2018

Not feeling too bad about my back of the napkin guess being off by one server at 53 vs 54. :)

qaq · on March 29, 2018

I meant dell boxes Sparc based boxes retain some value

jamesblonde · on March 30, 2018

The open-source (GPLv2) MySQL Cluster (distributed in-memory database - not InnoDB, but NDB) got 200m reads/second (and about 60m writes/sec) on commodity servers: http://mikaelronstrom.blogspot.se/2015/03/200m-reads-per-sec...

My guess is that the benchmark setup would cost about 1m dollars to install (3 racks of commodity servers). The software is free. Naturally, Oracle aren't pushing this, when they charge 10s of millions for Oracle rack :)

Jweb_Guru · on April 1, 2018

TPC-C is much, much harder to do well on than that benchmark, even if you are using special hardware and even if you are using "tricks" like intentionally misreading the TPC-C invariants in order to avoid coordination. It's not directly comparable to what you posted; while TPC-C performs joins, reads, insertions, and updates, including ones that necessarily must cross partition boundaries some percentage of the time, and specifies invariants that must be maintained in order to qualify as a correct run, the benchmark you posted (flexAsynch) does only reads or only writes in the tested configuration--both options can be trivially executed without any coordination at all on as many nodes as you feel like. As such, it's more of a stress test of the overhead of the implementation than it is an indication of the actual performance you will get out of a normal workload on such a cluster.

And while I'd like to say that MySQL Cluster is nonetheless exhibiting very impressive performance, I can't really say that; they are using expensive networking and hardware that most people don't have available, microoptimizing both the client and server sides, and using a low-level API specifically designed for doing well in these sorts of benchmarks, but they still lag far behind the per-core performance of state of the art key / value stores in similar circumstances. For example, MICA can do 80 million key/value requests per second of a comparable size to the ones they listed on a single commodity server, with only regular NiCs and 10GiB ethernet (and in fact can saturate the Ethernet link). Granted, MySQL Cluster is a full-fledged database and MICA just does key/value, but I can pretty much guarantee you that on real requests MySQL Cluster's performance collapses, and in multi-master mode it's known to be inconsistent anyway.

If you really need hundreds of millions of k/v requests per second, you'll pay a lot less buying three servers and hiring all the researchers who wrote the MICA paper to manage your key / value store than you will buying MySQL Cluster :P Or, if you want a real database, you can play the same game with the many, many database systems that do much better than Oracle's cluster on TPC-C; the same person who wrote the MICA paper released one last year about Cicada, which can handle over 2 million serializable TPC-C transactions per second on a single 28-core server. Or you can try Calvin, which can do around 500,000 georeplicated TPC-C transactions per second on a commodity EC2 100-node cluster (2.5 compute units each back in 2014) and can operate on top of any existing key-value store. The database world has advanced a lot in the past ten years, and people who really need performance have no shortfall of options that aren't Oracle.

jamesblonde · on April 3, 2018

Great, thanks for the in-depth reply. Just saw it now. I agree that TPC-C is much harder than flexasynch, which is just a KV benchmark. Here's a different benchmark. NDB got around 8m ops/sec on 8 datanodes when performing 1.2m HDFS ops/sec in https://www.usenix.org/conference/fast17/technical-sessions/... . That is not a KV workload. It is, in fact, a very mixed workload with every transaction being a cross-partition transaction, lots of batched PK ops, lots of partition-pruned index scans. Transactions containing lots of ops. Full-table scans. Index scans. And get this, it's open-source. Research has, of course, advanced since NDB, but nothing open-source still comes close to it for some (sometimes complex) OLTP workloads - anything subscriber oriented. Not TPC-C workloads, they suck on NDB.

I'd never heard of MICA, will read up on it. Calvin, though, is extremely limited. They tried to build CalvinFS on top of Calvin, but it didn't work. Subtree ops didn't scale (at all), as the DB wasn't flexible enough. Features and maturity sometimes matter, and NDB has both.

trengrj · on March 30, 2018

Looking at the TPC-C page all the benchmarks seem quite old and only reflect commercial databases. Do you have any recent TPC-C benchmarks for OLTP databases such as Postgres, MySQL, and Cassandra so I can compare with CockroachDB?

mikaelronstrom · on April 9, 2018

Open source DBMS normally provide numbers using some open source variant of TPC-C. One such implementation is DBT2. TPC-C according to the standard is extremely expensive to setup and is only interesting for databases with massive disks. However most open source DBMS runs DBT2 with no wait time, this means that also an in-memory DBMS can report numbers on a TPC-C-like benchmark (e.g. DBT2).

MySQL Cluster (NDB) did such benchmarks a few years ago where we ran 2.2M transactions per minute using DBT2. This meant executing millions of SQL queries per second towards the cluster and most of the CPU load is in the set of MySQL servers.

Currently I am using the load phase of this benchmark to test loading massive amounts of data into the new 7.6 version of MySQL Cluster using machines with 1 TB of memory.

Rafuino · on March 29, 2018

How much and what kind of memory and storage (SATA SSD, NVMe SSD, HDD?) is included in the 3 nodes used for testing? This benchmarking is really interesting but the next level is to understand the cost per tmpC measured. Memory especially and storage is a big component of cost these days.

arjunnarayan · on March 29, 2018

Short answer: 3 n1-highcpu-16 GCE VMs with Local SSDs attached. We're working on a complete disclosure document, with comprehensive reproduction steps to replicate all our numbers. This document should be out in a couple of weeks. We want to walk you through, command by command, on how to reproduce these numbers, and verify the results for yourself.

Rafuino · on March 29, 2018

Thanks for the short answer. Would be good to know how many local SSDs are attached though for the 850 warehouse scenario. The TPC-C documentation says each warehouse maintains 100,000 items in their stock, but I can't surmise from that how much storage is required to hold 850 warehouses' worth of data. I'm impatient though so let me try to work through the #s myself. I'm using GCP's monthly reserved pricing in the US-Iowa region as a reference as of today's pricing.

A n1-highcpu-16 GCE VM costs $289.84/month. Local SSDs are added at 375GB per drive, and they cost $30/month at $0.08 per GB. I highly doubt you could fit the ~1250 warehouses (what got you the peak TPM-C) on 375GB local SSD, but I have to make assumptions here! So, now you're paying $319.84 per instance per month, or $949.52 for 3 of these instances.

At 16,150 TPC-C, you're paying roughly $0.06 per TPC-C, or, looking at it the other way, you're getting 16.83 TPC-C per dollar spent each month. Is that good? I don't know!

Now, the really interesting question is, is that TPC-C/$ on CRDB 2.0 actually better than TPC-C/$ on CRDB 1.1? The answer lies in how many local SSDs you have to provision to reach that peak throughput. Peak is at ~1300 warehouses on CRDB 2.0, and ~800 warehouses on CRDB 1.1.

Does anyone with more knowledge here know how much storage you need per warehouse in the TPC-C test?

jordanlewis · on March 29, 2018

Each warehouse requires about 80 megabytes of storage, unreplicated. 1250 warehouses * 80 MB * 3-way replication = 300 GB, which comfortably fits in a 3-node cluster with 1 local SSD each.

Rafuino · on March 29, 2018

Thank you! My Google-fu couldn't find me the answer

baconomatic · on March 29, 2018

I'd love to hear from someone who has implemented this in production. Seems like really cool tech, but haven't had a chance to use it on a project yet.

welder · on March 29, 2018

Using it in production currently with dual-write and dual-read to compare perf. I'll do a write-up showing how Cockroach performs to Citus and Cassandra for my use case.

qeternity · on March 29, 2018

We use Citus and Memsql (big data analytics use cases). How does Cockroach handle joins and other OLAP style queries?

arjunnarayan · on March 29, 2018

CockroachDB is not (yet) ready for use on OLAP-style workloads. Our performance work has focused on OLTP workloads so far. That said, we do great on OLTP joins (which is a stressed in the TPC-C workload).

manigandham · on March 29, 2018

You're not going to get better performance for OLAP than MemSQL's columnstore and in-memory rowstore for reference tables to join.

Citus is great if you want the Postgres interface but is still using standard rowstore tables. CockroachDB is similar with rowstore performance but with added distributed consensus overhead. They are both much better for OLTP and sharding. CockroachDB also provides easy high-availability and replication.

ams6110 · on March 29, 2018

MemSQL is one of those "ask for a quote" products. What are some rule of thumb estimates for what it costs?

manigandham · on March 29, 2018

Licensed by total RAM of all nodes. $25k/year minimum license now, but you should still talk to them if you're a small company. Regardless of price, I highly recommend the product as one of the most polished data warehouses available for on-prem/self-managed operations.

ddorian43 · on March 29, 2018

$25K/year/box from previous comments

qeternity · on March 29, 2018

Yeah this is what we do. Citus is our single source of truth and powers a few interactive apps and admin panels. These sync hourly to our Memsql cluster which is cstore + ref tables and works amazingly.

shaklee3 · on March 29, 2018

kdb+?

manigandham · on March 29, 2018

Sure, but it's far more expensive and not as generally usable as the mysql-flavored MemSQL for common data warehouse scenarios. Performance will be similar but there are differences in functionality like kdb's asof joins that can't really be compared.

kdb+ is much better for numeric/financial analysis apps, especially when used with the integrated query language and interpreter environment.

shaklee3 · on March 31, 2018

The statement was that you will not get better performance than memsql, not about which is cheaper.

manigandham · on March 31, 2018

As written: "Performance will be similar but there are differences in functionality like kdb's asof joins that can't really be compared."

Also the original post only mentioned MemSQL, Citus, and CockroachDB. With those, MemSQL is the fastest.

no1youknowz · on March 29, 2018

I'm using MemSQL's columnstore myself and the performance is nothing short of amazing. I migrated from Citus DB to it for OLAP workloads.

But for what reasons are you using Citus as well? Would like to know if I am missing something or hear another perspective.

Can you explain your use case? Thanks

xstartup · on March 29, 2018

We use clickhouse cluster with 1000 nodes and 50000 GB clickstream data.

_wmd · on March 29, 2018

That's only 50gb per node. Why do you/Clickhouse need so many nodes?

tedmiston · on March 30, 2018

Maybe some space is dedicated to replication? Or for query execution temp space like Redshift. Or he could be trying to keep everything in memory.

didip · on March 29, 2018

Please post your write-up on HN! I would love to read CockroachDB performing in real world.

some_account · on March 29, 2018

Please do, that would be a very popular read I think.

baconomatic · on March 29, 2018

Please do!

smnscu · on March 29, 2018

Works great, just a tad slow. Hopefully this improves things. Deploying with Kubernetes is pretty seamless as well.

evrydayhustling · on March 29, 2018

Great stuff. I appreciated being educated about TPC-C, and the whole spirit of not focusing on vanity benchmarks!

itsdrewmiller · on March 29, 2018

Same here, but in educating myself more I found that TPC-C seems to be a somewhat obsolete metric compared to TPC-E (see https://stackoverflow.com/questions/9246939/what-is-the-diff...). Why use the old one here?

edit: Looking into it even further, I agree with the co-author's response here that TPC-C is still an appropriate metric. TPC-E is different and newer but still not as widely used.

arjunnarayan · on March 29, 2018

I don't think it's true to claim that TPC-C is obsolete and subsumed by TPC-E. They are both different OLTP benchmarks, with different characteristics. TPC-C is more write heavy, TPC-E is far more read heavy. It's true that TPC-E is newer, but doesn't deprecate TPC-C (the way TPC-A, for instance, is now deprecated).

We chose TPC-C because it's far more understood than TPC-E in 2018. We wanted to provide understandable benchmarks that can be put into context with other databases. Other databases report TPC-C numbers, so we choose to do so as well.

tyingq · on March 29, 2018

It seems not used much anymore. Follow that link (http://www.tpc.org/tpcc/results/tpcc_results.asp?print=false...) and sort by either score, or price performance. The vast majority of top results are a decade old or more. I couldnt find anything less than 5 years old without going to second/third pages.

And the top results are usually crazy high number of cores clusters. The Sun example was over 1700 cores.

makmanalp · on March 29, 2018

The problem is that I think it costs money and red tape to submit results and vendors run their own, and you kinda have to take their word on it or reproduce them yourself.

tyingq · on March 29, 2018

That makes sense. Probably TPC-C died after Oracle basically killed off Sybase and Informix. No more well funded competition to keep up the pace. And no multitude of RISC vendors trying to fend off Linux/X86.

The open source databases didn't play that game, so TPC-C became irrelevant.

Too bad there isn't a good way to directly compare the healthy survivors.

sheeeep86 · on March 29, 2018

I dont like when companies are not transparent about the pricing of their product. If you have a price page, show the price, so that Í can decide if this is relevant for me or not ...

true_religion · on March 30, 2018

It's not relevant to you.

Enterprise pricing generally basically scales with the size of your company/budget and how much trouble they think you'll be worth as a customer.

As a rule of thumb, it starts at just above 1000 USD per unit, and goes up from there.

Many contracts are bespoke orders especially when you're dealing with a small company, so you can't have transparency since there isn't a single product.

nhumrich · on March 30, 2018

I would usually agree with you, but cockroach is so new, I doubt they have any type of fixed price. They probably work it out on a 1-by-1 basis.

ahmedalsudani · on March 29, 2018

The only thing the Enterprise offering gives you is priority access.

mjibson · on March 29, 2018

Enterprise allows access to various features like distributed backup and restore.

ahmedalsudani · on March 29, 2018

Ah, my mistake. I stand corrected.

skybrian · on March 29, 2018

I wonder how far apart those three nodes are and how much the latency between them matters?

d0ugie · on March 30, 2018

Hadn't heard of Cockroach but based on the article, this thread and the rest of their site it sounds at least worth installing on a few hobby nodes if only to get familiar with the behavior and configuration should a need arise - like Cassandra was years ago when I had already on my own learned the gist of it, sort of a road not taken relative to my then-firm's usual prescriptions (MySQL and Mongo), it turned out to be perfect for my team's needs (paperwork to get permission to use it notwithstanding). Thanks for posting and good luck!

jb1991 · on March 29, 2018

Nice pun there. Cockroaches do indeed have a habit of making significant strides.

Asdfbla · on March 30, 2018

Does someone have more information about how they implemented serializable in such a way that, as they claim, performance isn't negatively impacted? Seems pretty hard to achieve that.

elvinyung · on March 29, 2018

Since you only have 3 nodes, doesn't that mean every range is replicated to every node? Doesn't that make joins trivial (i.e. no different from non-distributed joins)?

d4l3k · on March 29, 2018

Yeah, though from what I understand this benchmark is measuring both transactional read and write performance rather than just join performance.

Transactional writes are likely the slowest thing since they need to talk to all replicas.

elvinyung · on March 29, 2018

Actually hmm, do reads need to talk to all replicas in this case (serializable isolation)?

ComputerGuru · on March 30, 2018

As I understand it, reads only need to talk to the lease owner, which in turn bypasses raft since write consensus guarantees atomicity of after completion of write intents. Cockroach tries best-effort to have the lease owner and the raft leader be the same.

some_account · on March 29, 2018

Congratulations to the cockroach team for putting out an awesome product :)

Would be great to see how it compares against postgres in similar scenarios.

strict9 · on March 29, 2018

Before clicking the comments link, I always know what to expect in HN comment section for a CDB post announcing their latest milestone or feature:

A lot of congrats and excitement, questions about who uses it in a production environment, very specific use-case questions, and of course the name.

Weird how predictable the response to one company/tech always is.

misterbowfinger · on March 29, 2018

So. For me, personally, I don't care about the name. I generally care that it's great tech, and it clearly has a great team behind it.

However....

If I worked at CockroachDB, and I saw the negative feedback around the name, I'd take it to heart. At the end of the day, the name is marketing for the hard work of their engineers, and marketing for the engineers that want to use this DB (remember, they need to sell it to their managers who may not be technical).

This issue can show up in unexpected ways. For example, for cloud providers like Compose (IBM company), would they be comfortable with putting "CockroachDB" on the front page? They might if it's good enough, but it's at least a consideration (i.e. another meeting, another stakeholder to convince).

Or how about an enterprise company that's going through due diligence, and when their client asks them about their tech stack do they say "CockroachDB" or do they obfuscate the name by saying "It's a high-performance distributed database". That's a crucial moment to market CockroachDB, and it could get lost. As sad as it is, saying that you're using MySQL "because Oracle" is a point of leverage for some sales people.

Is the name worth it? Asking honestly.

latenightcoding · on March 29, 2018

People complaining about the name and how they are never going to be able to use it in production because of how gross cockroaches are is definitely the most recurring point. I think it worked well for them, since everyone remembers the name, specially with all the distributed stores coming out lately.

ngsayjoe · on March 30, 2018

Sometimes i wonder the evolution of cockcroaches' grossness has something to do with its high survivability?

johnmarcus · on March 29, 2018

came here to comment on the name.

beamatronic · on March 29, 2018

I came here to upvote comments about the name

pieterhg · on March 30, 2018

Great stuff but this name really doesn’t work. Make it a name with positive connotations.

as1mov · on March 30, 2018

While we are changing names for petty reasons, let's rename Python to something else, since a sizeable part of the population has a phobia of snakes.

expliced · on March 30, 2018

I think it's a great name once you get the.. uh.. pun behind it.

arbitrage · on March 30, 2018

What is the pun?

mmilano · on March 30, 2018

I agree, the unfortunately clever/punny name will detract from potential consideration based on illogical human subconscious thinking.

ardit33 · on March 29, 2018

[flagged]

dang · on March 30, 2018

We detached this subthread from https://news.ycombinator.com/item?id=16710517 and marked it off-topic.

dexterdog · on March 29, 2018

Cockroaches are considered pretty durable right? In the 80s I remember the line was always that after the nukes landed there would only be cockroaches and twinkies left.

That's not a bad thing to say about a database.

crispinb · on March 29, 2018

This is the sort of obtuse insistence on narrow denotational semantics that makes everyone avoid engineers at parties ;)

dexterdog · on March 29, 2018

It works like a charm for me, or wait, what's the opposite of a charm?

crispinb · on March 29, 2018

Denotationally speaking, an engineer?

benchaney · on March 29, 2018

I think that is where the name comes from. It isn't narrow denotational semantics.

crispinb · on March 29, 2018

Ignoring the connotation is a textbook case of it!

jrs95 · on March 29, 2018

And TwinkieDB would be a copyright violation! ;)

tyingq · on March 29, 2018

WaterBearDB maybe?

awalton · on March 29, 2018

But you could also call it, "BombproofDB", "NukesafeDB", "GeodistDB" or something that gets the same idea across.

rifung · on March 29, 2018

But cockroaches are known for being extremely hardy! That seems like a good quality for a database..

Perhaps you should replace serious with pretentious or shallow?

mlevental · on March 29, 2018

>I can't see serious engineers working on a company named "Cockroach".

You used to work at a company called Yammer <rolls eyes>. God forbid they're not called tech.ai.io-ify.

I think it's really funny that this comes up almost every time there's a post about CockroachDB. There were also a lot of people commenting on https://news.ycombinator.com/item?id=16693253 about foul language and such. I also remember being at a big conference and one of the speakers being a little cavalier and dropping an obscenity for emphasis - in the meetup comments people were so deeply offended. And let's not forget people's constant flagellation around brainfuck.

Make no mistake: this is the flavor of conservatism and hypocrisy that tech is home to: pretend to be liberally minded but lash out whenever something is just slightly divergent.

dang · on March 30, 2018

Please don't bring in someone's personal details or history as ammunition in an argument. That breaks HN's rules.

https://news.ycombinator.com/newsguidelines.html

fwgwgwgch · on March 30, 2018

On that note how would I go about deleting all my comments without associating them with an email? (not keen on sending you an email). If you can, please delete all my comments

awalton · on March 29, 2018

People have been complaining about "The GIMP" for literally decades... (edit: in case this isn't clear, Spencer Kimball, the CEO of CockroachLabs, also created "The GIMP" at Berkeley).

Sadly, names do matter. Cockroach seems to be a great DB from my poking at it, but there's definitely a visceral reaction some people have to the name (myself included) that has to be overcome first.

crispinb · on March 29, 2018

True enough. It's a really bad name.

On the other hand, someone would have to be astonishingly thick (or at best cavalier about their business) to take that seriously into account when deciding whether or not to use it.

andrewflnr · on March 29, 2018

They need a cute mascot, with a name. Then when people go "ew, it has a bug in it's name !!1!" we can say, "Aw, what's wrong with bugs? You're making Ricky cry."

capsulecorp · on March 29, 2018

I don't think its just that it has a bug in its name. I would venture to guess that if the name was Wasp DB, this issue wouldn't exist. It has more to do with the disgust trigger that many people have when they think of cockroaches.

andrewflnr · on March 30, 2018

Ricky can't help that he was born a cockroach. :(

tyingq · on March 29, 2018

Maybe it's strategic and the eventual not-free commercially supported version will have a more palatable name?

daxorid · on March 29, 2018

There's nothing wrong with the name. It's quite good, in fact.

api · on March 29, 2018

What drug was the person who drew that graphic on?

evrydayhustling · on March 29, 2018

Heavy doses of Hieronymous Bosch? https://en.wikipedia.org/wiki/Hieronymus_Bosch

2474 · on March 29, 2018

Ask him.

https://www.dalbertbv.com/about/

GrayShade · on March 29, 2018

Windows 7 had some similar wallpapers (Scroll down):

https://blogs.msdn.microsoft.com/e7/2009/05/02/a-little-bit-...

ngsayjoe · on March 29, 2018

[flagged]

unethical_ban · on March 29, 2018

That's unfortunate, but I hope your post isn't suggesting they shouldn't have named their product as they did.

Should people who don't like large numbers not use Google? Should people who fear fire not use Firebase? Should people who don't like coffee not use Java? Moreover, should those people suggest the names be changed due to their phobia?

We could number all database servers. Server 1, Server 2, Server 3, Server 4... but 4 is unlucky in China, so we can't use that.

tdb7893 · on March 29, 2018

Almost no one fears coffee or large numbers in the same way people fear cockroaches and fire also doesn't generally elicit near the same reaction (there are even types of fires that people react positively to, like camp fires, and many people like the smell of fire). They are allowed to name it whatever they want but it's hard to imagine a name with more negative feelings associated without getting vulgar or ridiculous.

atomical · on March 29, 2018

Are there people or cultures who have a fondness for cockroaches?

soperj · on March 29, 2018

Wall-E

ngsayjoe · on March 29, 2018

No not at all im just giving my feedback of why I couldn’t use a great product sadly due to my phobia suffering from its name that many ppl might face the same but don’t bother to give feedbacks.

(PS: Yes if your target customers are Chinese you should avoid using number 4 especially in real estate)

sorokod · on March 29, 2018

I think it depends on on how common the negative sentiment is. VomitDB anyone?

GraffitiTim · on March 29, 2018

Just so you know, it’s possible to overcome a phobia. In fact my startup Fearless helps people to just that, using VR. We have a module for cockroaches specifically. It goes very gradually, starting with a cartoon drawing, and progresses at your own pace. If you’re interested, http://FearlessVR.com

Also, I understand your complaint about the name, as I’ve encountered many, many people with all sorts of phobias at varying levels of extremity. It’s more common than most people think.

evgen · on March 29, 2018

On behalf of everyone else who puts up with this constant repeated BS for every story about CockroachDB, we get it. Now shut up already and move on. I am one of the downvoters because at this point literally NO ONE cares about your complaint and this constant background whine contributes nothing. If you can't get over the name of a product then just keep it to yourself.

mike-cardwell · on March 29, 2018

You have a pretty extreme phobia if just seeing the word used for naming the thing that you're afraid of, is a problem for you.

jaebinyo · on March 29, 2018

I share your sentiment on its name..

ekovarski · on March 29, 2018

Granted, phobias are no fun and can be debilitating but seeing as the product has been around for 3 yrs, I don't think they have any plans on renaming it.

Quick question, and don't take it the wrong way as I am truly trying to understand the extend of your commitment of not using it, but what if you were to receive a once in a lifetime job offer and after you start the team decided to switch to this database, would you quit? Your not physically working with any cockroaches so does the phobia extend to even just hearing and/or saying the term? Thx

ngsayjoe · on March 29, 2018

Yes just hearing or reading the word “cockcroach” gives me a cringe. I tried to google the word “cockroach phobia” just now and damn google shows a bunch of cockroach images that made me press back button immediately and couldn’t read a damn thing.

Im not sure what will happened to my job but given an equal choice i would choose alternatives, job or database.

CyberDildonics · on March 29, 2018

Why don't you just write a browser script extension to edit out your trigger word? It might be easier than expecting a database company to change the name of what they make.

ekovarski · on March 29, 2018

I like this idea, I wonder if other individuals with similar phobias would benefit from it. Wasn't there one, or perhaps multiple ones, to change variations of Trumps name at one point?

manigandham · on March 29, 2018

It can also be called CRDB.

johnmarcus · on March 29, 2018

I will not use this product based on it's name alone, it give me jeepers. Petty? Damn straight it's petty. Doesn't make it less real though.

farseer · on March 29, 2018

[flagged]

jrs95 · on March 29, 2018

It's just a pun about it being nearly indestructible. Pretty good name imo

wufufufu · on March 29, 2018

Shame. We've got a CockroachDB infestation in both our East and West datacenters and it's been critical in reliably scaling thus far.

Sometimes a CockroachDB egg sac will go down, but we can spawn another which will hatch very quickly.

The only downside is when our ORM burrows deeply into human ears, causing pain and hearing loss.