CockroachDB 1.0 (cockroachlabs.com)
811 points by hepha1979 on May 10, 2017 | 347 comments

I really like the fact that the CockroachDB team recently did a detailed Jepsen test with Aphyr. The follow-up articles from both CockroachDB and Aphyr explaining the findings are very interesting to read. For those who might be interested -



> CockroachDB is a distributed, scale-out SQL database which relies on hybrid logical clocks

I was curious what "hybrid logical clocks" meant and found the linked paper a bit over my head. I found this layman's description more approachable:


Apparently Google used GPS/atomic clocks to keep time synced:

>> To alleviate the problems of large ε, Google's TrueTime (TT) employs GPS/atomic clocks to achieve tight-synchronization (ε=6ms), however the cost of adding the required support infrastructure can be prohibitive and ε=6ms is still a non-negligible time.

And CockroachDB created more of a hybrid version that works on commodity hardware.
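For intuition, here's a toy sketch of the HLC update rules from the Kulkarni et al. paper. This is my own simplification, not CockroachDB's actual implementation: each timestamp is a physical component plus a logical counter that breaks ties when physical clocks are close.

```go
package main

import "fmt"

// hlc is a toy hybrid logical clock: the highest physical time seen
// so far, plus a logical counter for ordering events that share it.
type hlc struct {
	wall    int64 // highest physical clock reading observed
	logical int32 // tie-breaker within the same wall tick
}

// update advances the clock for a local or send event, given the
// current physical clock reading pt.
func (c *hlc) update(pt int64) (int64, int32) {
	if pt > c.wall {
		c.wall, c.logical = pt, 0
	} else {
		c.logical++
	}
	return c.wall, c.logical
}

// recv merges a timestamp received from another node, so causality
// is preserved even when physical clocks disagree.
func (c *hlc) recv(pt, remoteWall int64, remoteLogical int32) (int64, int32) {
	switch {
	case pt > c.wall && pt > remoteWall:
		c.wall, c.logical = pt, 0
	case remoteWall > c.wall:
		c.wall, c.logical = remoteWall, remoteLogical+1
	case c.wall > remoteWall:
		c.logical++
	default: // equal wall components: take the larger logical, plus one
		if remoteLogical > c.logical {
			c.logical = remoteLogical
		}
		c.logical++
	}
	return c.wall, c.logical
}

func main() {
	var c hlc
	c.update(100)         // local event at physical time 100 -> (100, 0)
	c.recv(99, 100, 5)    // message from a node also at wall 100 -> (100, 6)
	w, l := c.update(100) // physical clock hasn't advanced -> (100, 7)
	fmt.Println(w, l)     // prints: 100 7
}
```

The point being: the wall component stays close to physical time, while the logical counter supplies the ordering that a raw (possibly skewed) clock can't.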

Distributed systems programming sounds endlessly challenging as you are always balancing trade-offs.

You might find our post[1] on atomic clocks, or rather on doing without them, particularly interesting.

[1]: https://www.cockroachlabs.com/blog/living-without-atomic-clo...

Hey guys, I'm a fellow developer of distributed systems here.

First of all I think what you are doing is great.

My question is what's the point of clocks at all? The current time is a very subjective matter, as I'm sure you know; the only real time is at the point when the cluster receives the request to commit. Anything else should be considered hearsay.

Specifically the time source of any client is totally meaningless since as you say further in the discussion that client machine times can be off by huge margins.

If you accept that then one has to accept the fact that individual machines within the cluster itself are prone to drift too, although one can attempt to correct for that I appreciate.

Wouldn't you agree, though, that what matters more is an order based on the bucketed time of arrival (with respect to the cluster)?

I don't see how given network delays anyone can be totally sure A is prior to B, atomic clocks or not.

What is important is first to commit.

[edit] Yes would love to talk privately about this topic @irfansharif

When a single system is receiving messages, you pick an observed order of events that meets some definition of fairness, and you stick with it all the way through a transaction. By pretending A happens before B (even if you're not entirely sure) you can return a self-consistent result. And once you have that you can simplify a lot of engineering and make a lot of optimizations, so that the requests aren't just reliable but also timely.

You throw three more observers in and how do you make sure that all of them observe the requests arriving in the same order? Not even the hardware can guarantee that packets arrive at 4 places in the same order, even if the hardware is arranged in a symmetrical fashion (which takes half the fun out of a clustered solution).

> My question is what's the point of clocks at all?

I would highly recommend reading the link irfansharif posted. It's probably the best primer ever written on the subject.

Yes, I really enjoyed it!

> Specifically the time source of any client is totally meaningless since as you say further in the discussion that client machine times can be off by huge margins.

distributed systems like cockroach shouldn't use the client's conception of current time for anything at all, except possibly to store it (_verbatim_, don't interpret it) and relay it back to the client or to other clients (and let the client interpret it however they want).

Hmm, I'm not sure I completely understand your question or your source of confusion here but unless I'm grossly misunderstanding what you're stating I think we might be conflating a couple of different subjects here. I'm happy to discuss this further over e-mail (up on my profile now) to clear up any doubts on the matter (to the best of my limited knowledge).

Why not simply have the cluster sync a time between themselves? The first node in the cluster gets the time, and as new nodes come online they set their own internal time via the cluster. So in a world with no NTP or atomic clocks the system could continue to operate.

This doesn't take into account when clocks on different systems run at different speeds, or when clocks jump, especially on VMs and cloud instances, which happens all the time.

I don't really get why you would build a distributed database with dependency on wall time (unless you're Google and can stick atomic clock HW on every node). Why not use vector clocks? Am I missing something?

the section on lock-free distributed transactions in our design document[1] should answer your question, specifically the sub-section on hybrid logical clocks.

[1]: https://github.com/cockroachdb/cockroach/blob/master/docs/de...

Thanks! Interesting. http://www.cse.buffalo.edu/tech-reports/2014-04.pdf is the relevant paper on hybrid logical clocks, linked in the faq.

It may be a nitpick, but Google don't stick atomic clocks or even just GPS clocks into every node. Just into every data center. The difference means that it's actually perfectly feasible to do that for very many other companies running DCs or just in colos. The big news was how they used the fact (that times are synchronised with an upper limit to how far the clocks in two nodes will diverge) as a very significant optimization in Spanner, one of their distributed databases.

Building a distributed database that can optionally benefit from the same optimization actually makes a great deal of sense. Your average hobbyist won't care, but spending a few extra kilobucks on hardware in a DC to get big throughput improvements out of your database system is a steal.

That's practically the definition of distributed systems: endlessly challenging, because you are always balancing trade-offs.

The CAP theorem still holds, so we pick which 2 out of 3 to be strengths and where to compromise as little as possible. It's a guaranteed 87.3% effective hair loss formula. I find Quiet Riot helps.

> When a node exceeds the clock offset threshold, it will automatically shut down to prevent anomalies.

If you're planning to run on VMware, be prepared to handle rather dramatic system clock shifts. I've seen shifts of up to 5 minutes during heavy backup windows. Not all customers might be willing to have their nodes go down due to system clock / NTP issues.

employee here.

Yep, we've also had our share of troubles with noisy clocks in cloud environments, so that's something we're very aware of. Further down the road, we're considering a "clockless" mode, which of course isn't clockless, but depends less on the offset threshold: https://github.com/cockroachdb/cockroach/issues/14093

That said, even today, configuring a cluster with a fairly high maximum clock offset is feasible for many workloads.

Do people not run NTP on their VMs?

Or are you saying that you see heavy clock skew despite having NTP in place?

The latter. NTP only checks and corrects clock offsets every so often. If the "hardware"[1] clock undergoes offset shifts at random times because of VM pauses, this won't get fixed until the next NTP sync.

This gets exacerbated in cloud settings where VMs get moved between physical machines or racks, since now it's not just the pause; it's that the clock is now pointing to a new hardware time source. [1] In quotes since it's viewed as a single piece of hardware by the software inside the VM.

Cassandra user here in AWS. Clock drift is a big problem on VMs. NTP is not aggressive enough in these environments to keep clocks relatively in sync. We regularly had several hundred milliseconds of drift between nodes. As Cassandra is extremely clock sensitive, this is a big problem. We ended up using chrony with very aggressive settings to keep things in the sub-ms range for the most part. But it's still possible to get "hiccups" where time will skip, especially if you reboot a VM.
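For anyone curious, "very aggressive settings" means something along these lines. This is an illustrative sketch, not our exact config; server names and values are placeholders to tune for your environment:

```
# /etc/chrony.conf -- illustrative aggressive settings for VMs

# Poll upstream servers every 4-16 seconds instead of the default
# 64-1024 seconds.
server 0.pool.ntp.org iburst minpoll 2 maxpoll 4
server 1.pool.ntp.org iburst minpoll 2 maxpoll 4
server 2.pool.ntp.org iburst minpoll 2 maxpoll 4

# Step the clock whenever the offset exceeds 100ms, at any time (-1),
# rather than slowly slewing toward the correct time.
makestep 0.1 -1

# Permit fast slewing to chase a drifting virtualised clock.
maxslewrate 1000
```

The makestep line is the part that deals with post-reboot or post-migration jumps; stepping is ugly for clock-sensitive software, but less ugly than being hundreds of ms off for minutes.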

Vanilla ntpd makes assumptions about the hardware clock (that its drift is stable) that don't apply to virtualised clocks. Using the tsc clocksource may help as well.

Interesting. I wonder if anyone has documented any best practices for timekeeping in VMs.

VMware has this but it does not appear to have been updated in a while. https://kb.vmware.com/selfservice/microsites/search.do?langu...

I run a lot of VMware:

* Set the ESXi hosts to use five external time sources

* Search for fwenable-ntpd (https://www.v-front.de/2012/01/howto-use-esxi-5-as-ntp-serve...) and download the .vib (do a security audit on it - it's a zip file, I think - to ensure it is what you think it is). Install the .vib, which simply adds an ntp daemon option to the firewall ports. This works on v6.5

* Run ntpd on Linux VMs, pointed at the hosts with the local clock fudge as a fallback

* For Windows VMs in a domain, set the AD DC with PDC emulator role to sync its clock to the host via the VM guest tools, leave the rest alone

* On your monitoring system make sure that it has an independent list of five sources and use plugins like ntp-peer for ntpds and ntp-time for Windows (Nagios/Icinga etc)

With the above recipe, ntpq -p <host> shows offsets less than 1 ms across the board for ntpds after stabilising.
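For the Linux VM step, the ntp.conf looks roughly like this (hostnames are placeholders for your ESXi hosts; the fallback uses ntpd's standard local-clock driver):

```
# /etc/ntp.conf on a Linux VM -- sketch of the recipe above
server esxi-1.example.com iburst
server esxi-2.example.com iburst

# The "local clock fudge" fallback: the undisciplined local clock
# driver at a high stratum, so it only wins when no real source
# is reachable.
server 127.127.1.0
fudge 127.127.1.0 stratum 10
```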

Hijacking my own thread:

I don't suppose anyone knows how to make a Windows NTP server permit queries? Googling does not seem to reveal anything insightful. I know how to do this for ntpd but am stuck with dealing with a Windows NTP server right now.

Why does VMware emulate the hardware clock rather than giving (possibly slightly debounced) access to the real system clock?

I suppose for vMotion purposes. A VM is not tied to a physical machine.

@CockroachDB dev:

Does CockroachDB have a health status page or REST API? (like Nginx/Apache/Redis/Memcached, or a special table like MySQL)

It would be helpful to monitor the CockroachDB database in production.

I see there is a built-in feature, but it only sends that data home to your server for analytics (it can be turned off): https://www.cockroachlabs.com/docs/diagnostics-reporting.htm...

CockroachDB has a rather nice admin interface that monitors the health of the cluster.


There's also a lot of RPC endpoints used for the admin UI that can be queried to get more fine-grained info. However, they're primarily for internal use and might change in the future.



We're still working on integration with other monitoring systems, but the one we've tested the most and documented is prometheus: https://www.cockroachlabs.com/docs/monitor-cockroachdb-with-...

Additionally, you can get some of the same status info on the dashboard using the `node status` command (https://www.cockroachlabs.com/docs/view-node-details.html).
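For reference, a minimal Prometheus scrape config for a cluster looks something like this. Hostnames and ports are placeholders, and while /_status/vars is the metrics endpoint as of 1.0, check the linked docs for the authoritative version:

```yaml
# prometheus.yml -- minimal scrape config sketch
scrape_configs:
  - job_name: cockroachdb
    metrics_path: /_status/vars
    scheme: http   # insecure/demo cluster; use https + tls_config in secure mode
    static_configs:
      - targets: ['cockroach-1:8080', 'cockroach-2:8080', 'cockroach-3:8080']
```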

Pardon the nature of my question, but I'm really interested in what your experience has been so far building a database with Go. Has its runtime (the GC, for example) posed any issues for you so far? Looking at other RDBMSes, languages with manual memory management like C or C++ seem to be the go-to choice, so what were the reasons you chose Go?

I'm quite frankly amazed that Go's runtime is able to support a database with such demanding capabilities as CockroachDB!

We have a post on why we chose Go, from a year and a half ago: https://www.cockroachlabs.com/blog/why-go-was-the-right-choi...

More technically, here's a somewhat random set of thoughts on the subject:

The Go GC is performant and predictable, unlike the JVM GC. We do have some very memory-allocation-conscious code patterns to minimize the performance impact of working in a garbage-collected language runtime, but in the end it's not as bad as you might expect if your expectations are coming from the JVM world.

Library support is good. To quote our CEO, "Most of us on the team have done extensive work with C++ and Java in the past. At Google, C++ was the standard for building infrastructure and there are a lot of good reasons for that. It's fast and predictable. It would be a good choice for Cockroach, except that in the world outside of Google, in open source land, the supporting libraries for C++ are either terrible, incredibly heavyweight, or non-existent. We didn't want to rebuild everything which you take for granted at Google from scratch. It turns out that Go has many of the necessary libraries, and they're straightforward and very well written."

Basically, if Google's internal C++ libraries, tooling, style guides (and the tooling to enforce them) were available externally, we might have gone with C++.

Some of us are fans of Rust, but Rust sadly did not exist in a stable state when CockroachDB started. I'm not sure we would pick Rust were we to start today (tooling is still a concern there), but it would certainly be part of the discussion.

The native support for concurrency in Go is a huge plus. We use thousands of goroutines in CockroachDB, and that's been a huge blessing.

I can answer any more specific questions if you have them.

Why do you think the Go GC is better than any of the JVM options? From what I've seen, while the Go GC is well tuned for low latency, by picking the right JVM GC parameters you can on balance get a better throughput/latency tradeoff. I'm just wondering if you have any reliable benchmarks or evidence to support what you're saying? I don't use either language for work, so I think you might have better information than I do.

I talk about this in the presentation I linked in another subthread (https://www.cockroachlabs.com/community/tech-talks/challenge...). The key to getting good performance out of any GC is to generate as little garbage as possible, and in our experience Go's better use of stack allocation and value types keeps many objects out of the garbage-collected heap. We've found that idiomatic Go programs tend to produce less garbage than similar Java programs, and in the presentation I discuss some tricks we use to get that even lower in critical paths. Admittedly, we're not JVM tuning wizards, so maybe there's more that could have been done on the JVM side.

As I understand it, Java needs a complicated GC implementation because, by design, it produces a huge number of heap allocations -- lots of very short-lived little objects.

Much of Java's GC focus has been on correctly partitioning the heap so that long-lived objects can be less aggressively collected than short-lived ones. (An example of a challenging long-lived object is the entire set of classes used by a program, all of which need to be available to the runtime for reflection. For many bigger apps, the class hierarchy alone takes up many megabytes of RAM!)

Go can make use of the stack to a much larger degree (structs and arrays can be passed by value), and so it can get by with a much less advanced GC. As a result, Go team's main focus has been on reducing pause times more than anything else.
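A tiny illustration of that difference (a hypothetical example, not CockroachDB code): the value-returning version keeps everything on the stack, while returning a pointer forces the struct to escape to the garbage-collected heap on every call.

```go
package main

import "fmt"

// Point is a small value type, cheap to pass and return by value.
type Point struct{ X, Y int }

// byValue: arguments and result live on the stack -- no garbage.
func byValue(a, b Point) Point {
	return Point{a.X + b.X, a.Y + b.Y}
}

// byPointer: returning &p forces p to escape to the heap,
// creating one garbage object per call.
func byPointer(a, b *Point) *Point {
	p := Point{a.X + b.X, a.Y + b.Y}
	return &p
}

func main() {
	v := byValue(Point{1, 2}, Point{3, 4})
	p := byPointer(&Point{1, 2}, &Point{3, 4})
	fmt.Println(v.X, v.Y, p.X, p.Y) // prints: 4 6 4 6
}
```

You can confirm which version allocates with `go build -gcflags=-m`, which reports escape-analysis decisions.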

We got around this by writing our own GC management: https://deeplearning4j.org/workspaces

We write our own GPU algorithms, Java native interface transpiler (eg: we generate JNI bindings) as well as our own memory management.

We've found the JVM to be more than suitable. Granted - we wrote our own tooling and had reasons we can't move (those customers are a neat thing most people don't think about :D)

I understand why you guys did go though. Congrats on pushing the limits of the runtime.

Thank you for the reply! You and the presentation video Ben posted covered pretty much all of my questions, and I'm going to keep an eye on the issue tracker regarding performance to see what interesting things you might run into and how you deal with them in Go!

If you have a moment, "tooling" is pretty vague; which kinds of tools were you worried about with Rust?

(Also, congrats on the 1.0!)

This is more my personal opinion, and perhaps more revealing my ignorance on the existing equivalent tools in the Rust ecosystem, but here is a list of some of the Go tools we use when developing CockroachDB:

1. gofmt and goimports really helps enforce a single uniform style. We don't really care what the style is, as long as it's consistent across our 30 engineers and 200k lines of code. We have hand-rolled more Cockroach-specific linters on top of this as well, but we could do that for Rust too.

2. go tool pprof is a great profiler. Being able to quickly dig into allocations, cpu usage, etc. is great, and we do so regularly. As a result, the overhead of the GC is minimized, since we can rapidly identify and mitigate the allocation overhead with the application of a few known patterns.

Now I don't know what the state of the art of Rust profiling is, but if we were to litigate Rust vs Go starting CockroachDB from scratch today, we'd probably pay close attention to what the answer is here. The Xooglers on this team have a tonne of C++ experience, were very happy with C++ profiling tools, and thought the Go profiler matched up to the best tools they had used previously. If there is a Rust equivalent, this isn't a problem.

3. Consistency of code (in both style, but also patterns used) across third party libraries is a concern. The existence of a single toolchain that enforces a single style in Go really helps keep the whole ecosystem healthy here. Even if tools exist for Rust, if they aren't universally used, that is not as powerful.

I honestly think that Rust would probably be a close contender if we litigated this question today. The TiDB folks use Rust for their KV side, but Go for their query engine, which is an interesting mix. If faced with this decision today, I personally would push for Rust; I'm not a fan of the Go type system's various limitations, which we are running into particularly as we write a more sophisticated query optimizer that has to do more classical programming languages reasoning. But I am one of the most junior engineers on the CockroachDB team, so I'm not sure I would prevail in this fight! :)

Thank you so much for the thorough answer! This is stuff we're always working on, so it's helpful to know about this stuff. Since you're not actively looking, I won't go into all the details, but if you ever are in the future, happy to give you a rundown of the state of the art whenever that is :)

Thank you as well! I will almost certainly reach out to you about this at some point!

(Cockroach Labs co-founder) I gave a talk on exactly this subject at the ACM Applicative conference last year: https://www.cockroachlabs.com/community/tech-talks/challenge...

Overall we've been happy with the choice. The GC is sometimes a performance issue, but it's manageable (and Go gives you better tools to limit the cost of GC than many other garbage-collected languages)

Thanks for the video. Can you please upload the slides somewhere?

We were told that the slides would be uploaded by Applicative (hence we didn't post a copy), but we can't seem to find it on the internet either, so here's a copy from our Google drive: https://drive.google.com/file/d/0ByQnrkOiRT_LMmZ6SXFmbk5wTDA...

Did y'all ever figure out a way to get goroutine leak detection working with parallel tests? (didn't expect to see that linked here)

We have started parallelizing our tests with the new subtest feature: leaktest in the top-level test, t.Parallel in the subtests. This means we only check for leaks in between batches of parallel subtests. This works OK for us for now since our slowest "test" is really a huge data-driven test suite, and that's the only place we're currently parallelizing, although it would be better if we could parallelize more of our tests.

Thanks! Arjun.

What would be the risk with GC?

In an era where hot air and hip DB technologies prevail, I'd like to emphasize the fact that the CockroachDB engineers are consistently honest and down to earth, in all relevant HN posts.

This builds up my confidence in their tech, so much so that even though I had no real reason to try this new DB, I'm gonna find one! :D

Exactly! The confidence that the devs inspire by taking the time to explain the choices behind the tech, makes me want to find a project to test it out on.

Are there published benchmarks for multi-key operations and more complex SELECT statements? I apologize if I missed them.

I'm trying to determine whether there's a place for Cockroach within what I think are the constraints in the database space.

* Traditional SQL Databases

  - Go to solution for every project until proven otherwise.

  - Battle tested and unmatched features.

  - Hugely optimized with incredible single node performance.

  - Good replication and failover solutions.

* Cassandra

  - Solved massive data insert and retention.

  - Battle tested linear scalability to thousands of nodes.

  - Good per node performance.

  - Limited features.

It seems like many new databases tend to suffer from providing scale out but relatively poor per node performance so that a mid-size cluster still performs worse than a single node solution based on a traditional SQL database.

And if you genuinely need huge insert volumes, because of the per node performance you'd need an enormous cluster whereas Cassandra would deal with it quite comfortably.

[Cockroach Labs engineer here working on performance benchmarking]

We have load generators for YCSB (just raw key-value ops in a firehose) and TPC-H (very complicated read-only queries) running right now, and we're about to start running TPC-C queries (moderately complex queries in large volume) as well. You can follow along on our progress here: https://github.com/cockroachdb/loadgen

In the context of your dichotomy, we want to bridge that gap. We want the linear scalability of your second group along with the full feature-set of the first group.

We will be publishing our performance numbers; we haven't so far because the product has been improving rapidly and our numbers were quickly obsoleted, but rest assured, a series of blog posts is coming very soon. Anecdotally, our beta customers are not finding that they need many more CockroachDB nodes than their existing database solutions, even compared to something as high-performance (but inconsistent) as Cassandra.

That's great. Thanks for the response and I'll keep an eye out for the blogs.

How does Cockroach efficiently handle the shuffle step when data is on many nodes on the cluster and has to move to be joined? Does Cockroach need high capacity network links to function well?

I always see companies making the claim of linear speedup with more nodes but surely that can't be the case if the nodes are geographically disjointed over anything less than gigabit links? Perhaps linear speedup with more nodes is only possible over high speed connections? How high is that exactly?

Congratulations to the team on the release! Introducing this kind of database is no easy task - thank you and great job, keep up the good work!

The short story is we do need high capacity network links to function well. By "high capacity" I mean at least double digit megabit links between your datacenters.

A query that inherently requires shuffling because the data is geographically distributed can't get past the bandwidth needs of performing the shuffle. At the very least, with the literal simplest query plan, you're going to need all the raw data to be transported to a single node/datacenter, and I doubt there's a query and network setup where that's more efficient than doing networked shuffles themselves.

I don't think you need gigabit networks, but you're certainly going to want at least 10 megabit links. We have not tried to benchmark scenarios where we are bandwidth constrained, so I can't tell you precisely what the minimums are. All the cloud scenarios we've tested (on GCE, Azure, AWS, DigitalOcean) are constrained on other dimensions (i.e. CPU cores, memory, disk IO).

And thank you :)

That makes sense- I think part of the reason such types of databases are well suited to cloud operations is the guaranteed throughput of the cloud providers own network backbone, which is almost impossible for any single "regular" organization to match, at least for the price. I think we are at a point where doing business without the cloud will become nearly (but not completely) impossible at huge scale with all these features.

Thank you very much for your detailed answer and good luck with the continued rollout!

15 years ago I was working on a similar distributed DB product. At the time, the idea was to send the query execution plan to each node to execute any filtering criteria and trim down the candidate row set. Then compute a Bloom filter on the joining keys on the node with the largest candidate set (using some heuristic statistics), and ship the Bloom filter to the nodes with smaller data sets to greatly reduce the non-matching rows. The rows that survive the Bloom filter are highly likely to be joinable, and are shipped back to the main joining node to perform the final join. A Bloom filter is the perfect compromise between size and speed.

I'd imagine CockroachDB is doing something similar for distributed join.
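To make the pruning step concrete, here's a toy Bloom filter sketch (illustrative sizes and hashing, nobody's production code): the node with the larger candidate set builds a filter over its join keys and ships it out, and the other nodes drop rows that can't possibly match before sending anything over the network.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a toy Bloom filter over string join keys.
type bloom struct {
	bits []bool
	k    int // number of hash functions
}

func newBloom(m, k int) *bloom { return &bloom{bits: make([]bool, m), k: k} }

// hashes derives k bit indices from one FNV-64 hash via double hashing.
func (b *bloom) hashes(key string) []int {
	h := fnv.New64a()
	h.Write([]byte(key))
	sum := h.Sum64()
	h1, h2 := sum&0xffffffff, sum>>32
	idx := make([]int, b.k)
	for i := 0; i < b.k; i++ {
		idx[i] = int((h1 + uint64(i)*h2) % uint64(len(b.bits)))
	}
	return idx
}

func (b *bloom) add(key string) {
	for _, i := range b.hashes(key) {
		b.bits[i] = true
	}
}

// mayContain can return false positives but never false negatives,
// so pruning with it never drops a genuinely joinable row.
func (b *bloom) mayContain(key string) bool {
	for _, i := range b.hashes(key) {
		if !b.bits[i] {
			return false
		}
	}
	return true
}

func main() {
	big := newBloom(1024, 3)
	for _, k := range []string{"k1", "k2", "k3"} { // large side's join keys
		big.add(k)
	}
	// Small side prunes locally before shipping rows over the network.
	var survivors []string
	for _, k := range []string{"k2", "k9"} {
		if big.mayContain(k) {
			survivors = append(survivors, k)
		}
	}
	fmt.Println(survivors) // "k2" survives; "k9" is pruned unless it's a (rare) false positive
}
```

The compromise the parent mentions is exactly this: a few kilobits stand in for the whole key set, at the cost of occasionally shipping a non-matching row.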

haven't come across this idea before, interesting - will definitely have to give it some more thought. our 'distributed joins', so to speak, run through our distributed query execution model (distsql) setting up incremental 'stages' of computation with the results pipelined and plumbed through individual computes. viewing it through this model our implementation more closely resembles the Grace Hash Join[1] algorithm. you might be interested in the PR[2] that landed this changeset, there's a cool visualization in one of the comments[3] showing the query execution plan.

[1]: https://en.wikipedia.org/wiki/Hash_join#Grace_hash_join

[2]: https://github.com/cockroachdb/cockroach/pull/12221

[3]: https://github.com/cockroachdb/cockroach/pull/12221#issuecom...

The Grace hash join approach ships the entire joining key set across the network. Even if each node just gets one partition of it, the aggregate network traffic is the entire set. For a small table that's fine; a large table is going to really tax the network.

I've heard of pushdown techniques including function, predicate and aggregate pushdown in distributed relational engines before.

Another interesting idea I read about (I can't find it anywhere online) was called "join zippering". Basically you first request the cluster to solve a join by querying and streaming the key columns from a join predicate back into the cluster itself to identify which nodes have matches and then streaming the results from each node in parallel, and doing the join in the stream.

This is hard stuff but so cool too :)

> This is hard stuff but so cool too

I agree! we have some semblances of pushdown filtering across aggregations and some other interesting techniques as documented in the RFC[1] that first proposed the distributed execution model.

[1]: https://github.com/cockroachdb/cockroach/blob/master/docs/RF...

I think this is the DB Project of the year in the open source community. Cockroachlabs has done an incredible effort to develop and test a new Database and these guys are giving it for free (I read about the series B raise too ;)), for us to use it.

Thanks for doing this. You're very much appreciated. (BTW I love the name and the logo!!)

There was a great session with Spencer Kimball (CockroachDB creator) and Alex Polvi (CoreOS) at the OpenStack Summit. It's a good overview and demo: https://youtu.be/PIePIsskhrw

there's a second part to this presentation[1] running cockroachdb across 16 (!) cloud vendors.

[1]: https://www.youtube.com/watch?v=nBXXLNIwAoo

CockroachDB looks like a great alternative to PostgreSQL, congrats to the team for doing so much in such a short time. The wire protocol is compatible with Postgres, which allows re-using battle-tested Postgres clients. However it's a non-starter for my use case since it lacks array columns, which Postgres supports [0]. I also make use of fairly recent SQL features introduced in Postgres 9.4, but I'm not sure if there are major issues with compatibility.

[0] https://github.com/cockroachdb/cockroach/issues/2115

I'm an engineer on the SQL team at CockroachDB. We're very aware of our missing support for array column types - and in fact beginning to add support for arrays is one of my team's priorities for the next release cycle.

What kind of other recent SQL features introduced in Postgres 9.4 do you use? Postgres has a ton of features, as I'm sure you're aware, and while we strive for wire compatibility with Postgres it's not a goal of ours to implement support for every Postgres feature out there.

I double checked my codebase and it looks like it's just JSONB, which CockroachDB also doesn't support [0]. Sorry to bother about missing features, but there are really some things that prevent a smooth transition from Postgres.

[0] https://github.com/cockroachdb/cockroach/issues/2969

Yep, JSONB is on our roadmap as well, although it won't come before array column type support. Thanks for the feedback - I'd personally love to see migrations from PostgreSQL to CockroachDB become seamless for more complex use cases as we continue development.

It occurred to me to migrate Odoo ERP to CockroachDB, scaling up the DB is one of our biggest challenges with some of our clients.

However Odoo leans heavily on Postgres, migration would be a lot of work I imagine. The first snag I've hit with CockroachDB is the lack of 'CREATE SEQUENCE'.

Plus, Odoo uses REPEATABLE READ + a hand-rolled system of locks for consistency, I'm not sure how that would play out with CockroachDB. In my experience some of the performance issues come more from long lived locks in the app than from sheer DB performance.

Postgres' network types [0] are very useful, especially when you use the << and >> operators to query for addresses contained in a subnet.

[0] https://www.postgresql.org/docs/9.5/static/datatype-net-type...

JSON/JSONB datatypes, listen/notify, spatial extensions.

JSON/JSONB will come after array support. As far as I know, we don't have any concrete plans at this time to support listen/notify or spatial datatypes.

what's the story for change data capture with CockroachDB? Postgres 9.4 added logical replication, which is incredibly useful for this use case.

Also, we use JSONB fairly extensively -- I see the tracking issue here https://github.com/cockroachdb/cockroach/issues/2969 but no movement.

Regarding change data capture, please see Arjun's answer: https://news.ycombinator.com/item?id=14309173

Make sure to take a look at Debezium: http://debezium.io/

It's a really solid CDC framework which has connectors for PostgreSQL, MySQL and MongoDB.

I'm basically here to ask a similar question: whether this is aimed as a modern alternative to PostgreSQL, since they don't clearly state this in the announcement.

To me, at least for now, it seems more like a SQL-enabled etcd or similar. They aren't currently claiming performance numbers that make it sound suitable for general-purpose relational database scenarios. A SQL-aware etcd-like thing has a lot of appeal though, and I assume the performance work is coming.

It looks like there is still no mechanism for change notification, which in our particular case is the only missing feature that prevents using it as a postgresql replacement.

Does anybody know if this feature is planned in the short or medium term ?

https://github.com/cockroachdb/cockroach/issues/6130 https://github.com/cockroachdb/cockroach/issues/9712

This feature is planned, but I cannot give you a concrete timeline. We want to do this right, and we need other parts in place to do this with high performance, in a transactionally consistent fashion, in the face of high contention, and for arbitrarily complicated "views".

I will say that this is the single feature that I personally am most invested in at the company, so it will happen.

Is Cockroach DB intended for just "big-data" companies? Would a small project run really well with Cockroach DB?

Of course a small database probably won't need a lot of the unique features, but is this aiming to replace PG/MySQL in the small/mid-size projects?

[cockroachdb here] Yes! In addition to being highly scalable, CockroachDB also comes with built-in replication. That means that even with a smaller project that hasn't scaled yet, you still get the benefit of a more resilient database.

Also, CockroachDB is super easy to install and get started with!

I've come across many projects that are easy to get started with, but the main stuff to look for is in the details. Although MySQL might be easy to get into, for example, it takes time to learn the intricacies for query optimizations, and importantly, what to do when SHTF, like when a table gets corrupted.

My question is, in your opinion, what does it take to become proficient in CockroachDB sufficiently enough to be comfortable using it in a high volume, high-uptime-required environment?


I can't speak for others, but at least for me the main attraction of CockroachDB is getting foolproof HA straight out of the box. That is something I think anyone can appreciate regardless of their dataset size.

Note that I haven't actually run CockroachDB yet, so I can't confirm it really delivers on that promise, but I'm hopeful.

"getting foolproof HA straight out of the box"

This is a minimal requirement for any modern database.

No, HA straight out of the box is a minimal requirement. Foolproof HA is not a requirement, since neither MySQL nor Postgres offers "easy" HA setup.

Galera cluster works "out of the box", that might be the closest SQL competitor in that regard.

What is HA?

High Availability

Here we see a cornerstone of HA: redundancy

High Availability

High availability :)

What advantages do I have using Cockroach compared to Postgres, Cassandra, Rethink or MongoDB? (I know that all of them are completely different, that's part of the question)

From the linked website: "CockroachDB provides scale without sacrificing SQL functionality. It offers fully-distributed ACID transactions, zero-downtime schema changes, and support for secondary indexes and foreign keys". Significantly, CockroachDB has had extensive design dedicated to surviving adverse network conditions (see Jepsen references in other posts)

We have a comparison page[1] that might be what you're looking for.

[1]: https://www.cockroachlabs.com/docs/cockroachdb-in-comparison...

Do you have any performance comparisons as well?

So performance is complicated. Right now, we’re performance testing CockroachDB regularly, and everything is out in the open. Everything we do is tracked with a GitHub issue with the “perf:” prefix, if you want to follow along.

Here are all our issues that track performance: https://github.com/cockroachdb/cockroach/issues?utf8=%E2%9C%...

Here’s our open source repository where we keep our load generators: https://github.com/cockroachdb/loadgen

A blog post (well, many) is in the works outlining our performance benchmarking. The situation on the ground is changing fast - our performance has improved rapidly over the past months, and each time we sit down to write a blog post, it quickly becomes obsolete. So trust that we will have a blog post talking about performance very soon.

Anecdotally, our customers are not finding performance to be a bottleneck. I encourage you to set up a Cockroach cluster, and try the various load generators (we've got the standards and a couple other homegrown ones in the repository).

They are targeting MySQL/Postgres users, basically a post-CAP approach to RDBMS. But if you can work with eventual consistency, they are definitely not your first choice.

[Cockroach Labs engineer here.]

Yes, if very low-latency (i.e., P99 latency sub-5ms) reads and writes are critical to your application, CockroachDB should not be your first choice. That said, one of the primary motivations for CockroachDB is that most existing systems don't handle eventual consistency well. In our experience, most developers will eventually write code that assumes a consistent database, either accidentally or intentionally, because it works most of the time. Dealing with eventual consistency is hard.

Rather than "if you can work with eventual consistency, you should look elsewhere," the sentiment we're trying to cultivate is "if and only if your performance requirements can't work with strong consistency, then you should look elsewhere."

I support defaulting to consistency the way you posed it. The main reason: safe-by-default construction has proven more effective for the average programmer over decades. The other approach has caused many disasters.

Meh, this is just PR; nothing is safe-by-default. It's not actually true that people eventually assume strong consistency, because eventual consistency forces a stricter, almost functional way of thinking about state and time; you just can't escape it. It's strong consistency that lets you get sloppy while making you forget how not-simple it is. And it only exists inside the system: if you have clients outside the system, like web browsers, there is no two-phase commit protocol on a button click, so you have to fall back on that same functional way of thinking to at least try not to confuse anyone on retry. Clearly that's not what happens in the wild. It's just too complex.

I don't think anyone goes back from eventual consistency. It's more appropriate for this asynchronous world, easier and more reliable.

Google disagreed on that last part. Their bright engineers kept screwing up with eventual consistency. It's why they built Spanner in the first place followed by F1. So did customers of FoundationDB and Cochroach despite free solutions available for eventual consistency.

So, Im not seeing it so clear cut in favor of eventual consistency.

Google never did, or bothered to do, much work on eventual consistency; they cannot possibly have much experience with it. CRDTs didn't come from them. And you know very well that customers do not care about any of this.

Their cloud storage offered eventual consistency for apps needing a lot of performance when I looked into it. A quick Google of the offerings shows pages describing what tradeoffs are available to customers with each option. So they not only know about it: they implemented it as a product feature. Their internal stores were strongly consistent with high performance, except AdWords on MySQL; that got moved to F1 for strongly consistent high performance. Spanner, which F1 uses, then got offered to cloud customers.

After re-reading the F1 paper, my mistake seems to be thinking they relied on eventually-consistent stuff internally. It appears that was just an option for 3rd party developers in their cloud products. Thanks for the peer review as I found some more stuff double checking. :)

> P99 latency sub-5ms

Has the team cooked up any latency benchmarks for different configurations? E.g. same-rack, same-zone, multi-zone, multi-region?

Not in a rigorous fashion yet, but we will talk about that soon. I've got a couple other comments on this post talking about performance benchmarking: https://news.ycombinator.com/item?id=14308770 and https://news.ycombinator.com/item?id=14308903

I've been following CockroachDB for quite a while. Great job on 1.0.

I've had a question for quite some time though (and I think there is an RFC for it on GitHub): do we still need to have a "seed node" that is run without the --join parameter, or can we run all the nodes with the same command line, with the cluster waiting for quorum to reconcile on its own?

Currently, you need to run one node without --join for the initial bootstrapping (as soon as this bootstrapping is complete, you can and should restart it with --join to get everything into a homogeneous configuration). I was hoping to make some changes here so you could start every node with --join from the beginning, but it was trickier than anticipated, so it didn't make the cut for 1.0. Watch for improvements here in a future release.

Thank you for your answer.

That's okay. For now, I run a simple StatefulSet where each pod checks whether the Service is reachable on port 26257 to determine whether it should join or init the cluster.

It's not as nice as if it was handled by Cockroach itself, but it does the job.

This bootstrapping problem is tricky. We publish kubernetes templates at https://github.com/cockroachdb/cockroach/tree/master/cloud/k... that contain our current best solution for the join/init problem.

Does this work theoretically interplanetary (just asking because for science) ?

No. Once your latency goes beyond single-digit seconds, performance will probably collapse; too many subsystems would time out. In theory it could be made to work (with terrible performance, and extremely long commit-waits due to having to wait until the remote planets get back to you), but I wouldn't architect a planet-spanning distributed database this way. We would probably have to go back to the drawing board and start from scratch.

Thanks for the long answer. Much appreciated. The question came into my mind when reading some of graphics and specifications.

You'd need to give up on consistency, because there is no such thing when the communication time is long compared to the interval between events. In the long run, ACID is dead.

[cockroachdb employee]

Short answer: no.

Long answer: at their closest, Earth and Mars are about 54m km apart; at their furthest, over 400m km, with an average of around 225m km. One-way light latency therefore varies between roughly 3 and 22 minutes.

CockroachDB uses synchronous replication via raft, and that latency would cause problems, as would some other settings like our window sizes and their interaction with timeouts.
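The arithmetic behind those numbers is just distance divided by the speed of light; a quick check in Python, using roughly the distances above:

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum

def one_way_latency_min(distance_km: float) -> float:
    """One-way light-travel time, in minutes, over a given distance."""
    return distance_km / C_KM_PER_S / 60

print(one_way_latency_min(54.6e6))  # closest approach: about 3 minutes
print(one_way_latency_min(401e6))   # furthest: about 22 minutes
```

And that's before Raft does the multiple round trips a commit requires, so multiply accordingly.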

> CockroachDB uses synchronous replication via raft

Deep space aside, I wish the announcement just said that! I came back to HN for insight into the paragraph about "multi-active availability... an evolution in high availability from active-active replication". Marketing... sometimes... I tell you what.

Whoops, sorry about that. If you're looking for more on how it works (rocksdb, raft, distributed transactions across multiple raft groups, etc), you might find the design doc interesting: https://github.com/cockroachdb/cockroach/blob/master/docs/de...

More practically, I note this from Cockroach's document on "Deploy > Recommended Production Settings":

"When replicating across datacenters, it’s recommended to use datacenters on a single continent to ensure performance (inter-continent scenarios will improve in performance soon). Also, to ensure even replication across datacenters, it’s recommended to specify which datacenter each node is in using the --locality flag. If some of your datacenters are much farther apart than others, specifying multiple levels of locality (such as country and region) is recommended."

In short, IIUC, even _planetary_ deployment doesn't come for free (yet). Perhaps I'm just not well-enough versed yet in how people deal with globally-distributed databases, but I'd love to see the docs dig into this a bit more: practical limits of cluster deployment, recommended strategies and tools (if any) to replicate data between clusters, etc.

Distance aside, you don't even have a semblance of simultaneity with multiple reference frames, so not even theoretically possible.

Can someone give a brief pros/cons between Cockroach DB Core and Google Cloud Spanner?

Open source vs not open source. Cockroach is still in its infancy vs Spanner. I'm sure there are a variety of other differences, but they mostly aim to solve a similar problem with slightly different approaches.

Some of the big details relate to not requiring atomic clocks: https://www.cockroachlabs.com/blog/living-without-atomic-clo...

Here's their comparison chart, though naturally it's biased for things-cockroach-does: https://www.cockroachlabs.com/docs/cockroachdb-in-comparison...

(I guess you can't write to Spanner with SQL? That seems like a big difference. No INSERT/UPDATE?)

I'm confused. What's the difference between 'Yes' and 'Optional' in the 'Commercial Version' row on the comparison chart? To me 'Yes' suggests there is only a commercial version, but clearly that's not true for CockroachDB.

Thanks for pointing that out! We will fix that to optional for us :)

[cockroachdb here] Thanks for the great response, bpicolo!

Can Cockroach be plugged into a Rails app where mysql was?

I'd be interested in hearing:

- the backup story

- the replication/failover story

- horizontal scaling story (is it plug and play)

I have ported a somewhat complicated MySQL-based ActiveRecord Rails app to Postgres, and then on to CockroachDB. It works pretty well, so I'd give it a go. We're also committed to supporting ActiveRecord via the Postgres connector, so if you run into any bugs, we will do our best to fix them. I am personally invested in ActiveRecord support. At this point, ORM support on CockroachDB is driven mostly by usage, so please try it!

Your other questions are better answered on the blog post, but quickly:

* CockroachDB core comes with a `dump` command to backup your databases. CockroachDB Enterprise has blazingly fast _incremental_ cloud backup and restore, the kind that you might want for a very large deployment.

* Replication is managed under the hood by sharding the data into many ranges that are each 64MB in size. Each range is replicated using Raft, and if a node goes down, the other replicas scattered across the cluster seamlessly take over and upreplicate a new replica to "heal" the cluster.

* The horizontal scaling is indeed plug and play - just add more nodes to the cluster and they'll automatically rebalance replicas across the cluster with no downtime and no additional configuration.
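To make the mechanics above concrete, here is a toy sketch (not CockroachDB's actual placement logic; the range boundaries and node names are invented) of a sorted keyspace split into ranges, with each range's replicas scattered across the cluster:

```python
import itertools

def place_replicas(ranges, nodes, replication_factor=3):
    """Assign replication_factor nodes to each range, round-robin.

    Toy illustration: a real rebalancer weighs load, disk usage, locality...
    """
    node_cycle = itertools.cycle(nodes)
    return {r: [next(node_cycle) for _ in range(replication_factor)]
            for r in ranges}

# A keyspace split at arbitrary boundaries into ~64MB ranges.
ranges = ["[a, f)", "[f, m)", "[m, s)", "[s, z)"]
nodes = ["n1", "n2", "n3", "n4", "n5"]
for rng, replicas in place_replicas(ranges, nodes).items():
    print(rng, "->", replicas)
```

Adding a node just extends the pool the rebalancer can draw from, which is why scaling out needs no extra configuration.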

Not MySQL, but we've tested and recommend the Ruby pg driver and the ActiveRecord ORM[1] (CockroachDB supports the PostgreSQL wire protocol). It should be 'plug and play' insofar as you simply point to any node in the running cluster when setting up ActiveRecord::Base.establish_connection.

As for our backup story, our doc page[2] on the subject should shed more details.

[1]: https://www.cockroachlabs.com/docs/build-a-ruby-app-with-coc...

[2]: https://www.cockroachlabs.com/docs/backup.html

Very interesting. I have to admit I've seen the product name a few times, but never took the time to have a look. I do have a few questions, though, if any of the engineering team are still around watching the discussion :-)

From the high availability page [1] in the docs:

> Cross-continent and other high-latency scenarios will be better supported in the future.

Do you have a specific timeline in mind? I've been working on an application that needs to be highly-available, and which uses Oracle right now. It seems like you can add all sorts of tools to the mix (RAC, DataGuard, etc), but there are always significant caveats around the capabilities of the resultant system. We're talking 1 to 2 TB of data total, tables of up to 100 million rows with 1 million rows added per day, distributed across three data centers (US, EU, Asia).

And regarding high availability in the context of application deployments, is there any documentation on the locking characteristics of DDL statements? I'm interested in the ability to modify the schema during an application deployment without having to bring down the system or implicitly locking users out. Apologies if I missed it somewhere on the website!

[1] https://www.cockroachlabs.com/docs/high-availability.html

I don't have a specific timeline but it is something we will be focusing on in the following releases.

Regarding DDL statements, this blog post [1] has details. In a nutshell, online schema changes are possible; the changes become visible to transactions atomically (a concurrent transaction either sees the old schema, or the fully functional new schema).

[1] https://www.cockroachlabs.com/blog/how-online-schema-changes...

Congratulations to the team on the release!

Everything under "The Future" really excites me, especially the geo-partitioning features. That is something I'm really looking forward to using!

That might end up being an enterprise feature though.

Will there be a RethinkDB-style realtime changefeed or PostgreSQL's LISTEN/NOTIFY?

I'd also like to know this. PG notify and triggers in general. Any equivalent to DB link?

Yes, change feeds (and triggers) are on the roadmap (though not yet in active development).

I read the announcement, got all excited, then clicked "What's inside CockroachDB Core?" and got rewarded with a 404. Ouch! This itches.

[cockroachdb here] Yeah, we're experiencing some caching issues.

Slightly offtopic, but what do you use for your blog and documentation pages?

The blog and other non-docs pages use hugo (http://gohugo.io/) and the docs use jekyll, but will be ported to hugo soon. We use github pages for hosting with cloudflare in front (for https on a custom domain).


About nine months ago we made the decision to go with RethinkDB for our infrastructure in place of PostgreSQL (at least for live replicated data), but if this existed at the time we'd have seriously taken a look. We're pretty happy with RethinkDB but I plan on still taking a look at this so we have a backup option.

[cockroachdb here] We are big fans of RethinkDB, but also glad to hear that you'll explore CockroachDB. Let us know how it goes, and definitely file any issues / feature requests in our GitHub repo!

It probably scales but how is the performance? If I need to load a couple billion rows and do a dozen joins in some analytics, is that one machine, a dozen, or 100?

Is it more for web apps, analytics, or what? When would I consider switching from e.g. Postgres to CockroachDB?

[Cockroach Labs engineer here]

For just a couple billion rows and a dozen joins, a single node will suffice (with the caveat that you really want at least 3 nodes because CockroachDB is built for replication and fault-tolerance and you're not getting that with a single node cluster), but you'll get linear speedup as you add more machines.

Your performance on a single node should be on the same order of magnitude as doing this in Postgres right now. We are rapidly closing that gap, and intend to close it completely for TPC-H style queries, while retaining the linear performance speedup with more nodes.

The reason this gap isn't already closed is that, for 1.0, we've been focused on transactional performance in distributed, fault-tolerant situations rather than analytics performance. There is a lot of low-hanging optimization fruit in analytics scenarios that we haven't picked yet, and we are just getting started on it.

Hi Cockroach Labs Engineer here,

In the feature FAQ, joins are described as 'functional', which doesn't inspire a lot of confidence, but maybe it's just a perception thing. What exactly does 'functional' mean?

A SQL db without joins sounds a lot like just a NOSQL db with a familiar query dialect.

If you are using Joins in an OLTP setting, everything should work absolutely as you might expect.

"Functional" is our caveat that if you run Joins across your data in an OLAP setting, it will work, but it may not be the most performant Join possible. For example, our query planner does not currently plan Merge-joins even if the appropriate secondary indices exist. So after a point (joining ~billions of rows of data) it no longer is as performant as it could be. Now we expect to roll out this particular fix within 6 months. However, optimizing 4 or 5-way nested Joins in OLAP-cube style settings isn't something we're going to be performant at for years. We need a lot more infrastructure built up before we start solving the kinds of problems revealed by, say, the Join Order Benchmark paper (http://www.vldb.org/pvldb/vol9/p204-leis.pdf).
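For readers unfamiliar with the distinction: a merge join walks two inputs that are already sorted on the join key in a single lockstep pass, O(n + m), instead of the nested-loop alternative's O(n * m). A toy sketch (illustrative only, nothing to do with CockroachDB's planner; it assumes both inputs are sorted on the key):

```python
def merge_join(left, right, key=lambda row: row[0]):
    """Join two row lists sorted on the join key, in a single pass."""
    out, j = [], 0
    for lrow in left:
        # Advance the right cursor until its key catches up with the left key.
        while j < len(right) and key(right[j]) < key(lrow):
            j += 1
        # Emit one pair per matching right row, without rewinding past j.
        k = j
        while k < len(right) and key(right[k]) == key(lrow):
            out.append((lrow, right[k]))
            k += 1
    return out

users = [(1, "ada"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (2, "mug")]
print(merge_join(users, orders))
```

The point of the planner work mentioned above is recognizing when a secondary index already provides that sort order, so the cheap strategy can be chosen.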

Thanks for your response. It sounds like CockroachDB might be an alternative to setting up an RDBMS for read replication once you need many connections.

Should've gone with tardigrade instead as a name, those little bastards can live in space!

I'm struggling to understand how this company has raised $50 million when DB companies with paying customers, like RethinkDB and FoundationDB, had to shut down.

They are gonna earn back $50 million by selling... a backup tool?

I think one major difference is that it's a drop in replacement for certain SQL products, plus a major selling point of NoSQL - good horizontal scaling.

RethinkDB and FoundationDB are great, but require a paradigm shift I think.

Free open source ops tools + enterprise support is a pretty solid business. For recent-ish DB companies see Mongo, Elastic, Redis, MemSQL, etc.

I'm excited to track this project!

Congrats Ben Darnell and team! I am a fan of his work on the Tornado web server!

Thanks v3ss0n!

Does the replication work cross-region, say, US-East and US-West? Or even cross-continent? It sounds like the timing requires very short latency and might not work in these scenarios.

Jepsen test results basically show that latency caused by replica distance won't screw your data. On the other hand, clock drift can stop your system, or even potentially corrupt your data, depending on how fast such incident can be detected/handled and what is your workload/what you are doing.

Yes, it works. Your latency will just be correspondingly higher (due to the speed of light). We are constantly testing a cross-region (i.e. US-East and US-West) cluster and have periodically run tests on cross-continent clusters (US to Asia-Pacific).

In these cases you can help the cluster out by following some of the advice on the "Recommended Production Settings" page (https://www.cockroachlabs.com/docs/recommended-production-se...) around specifying which `--locality` each node is in.

If CockroachDB reads are eventually consistent, how would that affect my SaaS multiuser application? How long, on average, would I have to wait for them to become consistent?

CockroachDB reads are strongly consistent, not eventually consistent. You don't have to wait at all.

How does the speed compare to that of Postgresql and MongoDB?

Does CockroachDB have a streaming API a la RethinkDB changefeeds? This is a killer feature, IMO.

Not yet, but it's on our roadmap.

Just out of curiosity, do you mind elaborating a little bit on why not? It strikes me as something that would be very easy to implement in a database, is there a reason why so few databases have a mechanism to do this?

If it's about maintaining an open connection in order to notify the client, that part makes sense, but at the very least the changefeed itself should be toggleable and easy to query in any DB.

One of the challenges for us in implementing something like LISTEN/NOTIFY comes from our distributed nature: since a table is likely broken up across many nodes, you somehow need to aggregate changes from all of them back into a single change feed wherever the listener is, and in such a way that it doesn't create a single point of failure.
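The fan-in part of that can be pictured with a toy sketch (illustrative only; the events and timestamps are invented): each range emits its own timestamp-ordered change stream, and the listener needs one merged feed.

```python
import heapq

# One timestamp-ordered change stream per range (invented events).
range_feeds = [
    [(1, "INSERT k=a"), (5, "UPDATE k=a")],
    [(2, "INSERT k=m"), (4, "DELETE k=m")],
    [(3, "INSERT k=z")],
]

# heapq.merge lazily merges already-sorted iterables into one sorted stream.
merged = list(heapq.merge(*range_feeds))
print(merged)
```

The sketch sidesteps the actual hard parts the comment describes: the streams arrive asynchronously, so you need to know a range will emit nothing earlier than some timestamp before you can release events in order, and the aggregation point must not become a single point of failure.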

Can someone explain how this is (or can be) better than MariaDB Galera or MySQL Group Replication?

You can't deploy your MariaDB Galera/MySQL Group Replication systems across the Pacific and then expect it to further scale from there.

Congrats on bringing out 1.0! Been following the project and look forward to trying it out!

Say you scaled up to 100 nodes for the holiday season. Is there any way to tell how much storage / how many nodes you have to keep running in order to keep 3 backups and maintain your new post-holiday load?

We don't have any auto-scaling for either scaling up or down, but if you're using a deployment tool such as Kubernetes, I don't see why it wouldn't be fairly easy. It might also be a good idea to add a message in the admin UI if all of your nodes are experiencing high load.

By just looking at your max load over the last 24h, or perhaps the last week, it would be pretty easy to see when to scale down.

That being said, as long as you remove the Cockroach nodes one at a time, it's pretty easy to scale down a Cockroach cluster.

On a three-node cluster, will it survive two nodes going down?

short answer: nope. cockroachdb replicates data for availability and in order to guarantee consistency across the replicas, it uses Raft[1] internally. Raft necessitates a majority of the replicas remain available in order to operate. it ensures that a new 'leader' for each group of replicas is elected if the former leader fails, so that transactions can continue and affected replicas can rejoin their group once they're back online.

[1]: https://raft.github.io/raft.pdf

What are the recommended configurations then? If I want to survive multiple node failures could I have 9 replicas?

raft is premised on overlapping majorities, so to speak. in order to tolerate up to `n` node failures you'd need to run `2n + 1` instances (for nine nodes you'd tolerate up to four node failures).
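The majority arithmetic is a one-liner, but the numbers are worth seeing spelled out:

```python
def tolerated_failures(replicas: int) -> int:
    """Failures a Raft group survives while keeping a majority: floor((n-1)/2)."""
    return (replicas - 1) // 2

for n in (3, 5, 9):
    print(f"{n} replicas tolerate {tolerated_failures(n)} failures")
```

Note the even sizes buy nothing: 4 replicas tolerate the same single failure as 3, which is why odd replication factors are the norm.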

How does it compare to Couchbase with its N1QL?

The main difference is the consistency model:


CockroachDB, in contrast, aims to be strongly consistent. This makes life for the application developer much easier.

Curious why Mac is better supported than Windows. This is obviously something you'd run on a server. Do orgs run Mac servers? Is it just to support dev work for people too lazy to launch a VM? Sorry, Windows/Linux ops person here with very little awareness of Mac ecosystem.

It's not so much a matter of Mac > Windows but rather Mac+Linux+*nix > Windows.

This just comes down to the fact that Windows is a special snowflake that does everything differently. Sometimes for good reasons, but usually not.

Any support for postgres trigram searches?

Now if we could get a 1.0 of TiDB ???

Almost there.

Very disappointed with HN turning into a 4chan/reddit style trolling board about the name. Guys, we get it that you don't like the name. Can we please stop bike shedding and move on? The people at cockroachdb have obviously seen all your messages but decided it's worth keeping the name. What more is there to talk about? Why not talk about the relative technical merits of this DB?

> Can we please stop bike shedding

Unfortunately this is a version of the thing it's trying to stop, as is plain from the below. These balls of mud are immune to negation; they laugh at it and grow stronger.

So I do product marketing for a living, and have launched a whole whack of things with good and bad and boring names. The fact that CockroachDB is consistently on top of HN with each thing they do is pretty strong evidence that they're doing just fine with the name they have, and probably doing even better than if they had a milquetoast tech-startup-esque name.

Also, they know what their sales cycles look like. They hear feedback from actual customers. They have people whose job it is to notice any advantage they could have along the way. And yet! They're still selling stuff, they're at 1.0, and they're still alive — with the name they have.

I think the fact that the name isn't milquetoast is a huge bonus. People are caught off guard by the name, which gets them thinking about it.

I think the dismissal that business people won't look at it because of the name is purely opinion-based. But what do I know?

If the name were Milquetoast it would be awesome, because that was the cockroach from Bloom County. Or, was that the joke and I'm only just getting it late? ;)


Zerg players will love the name. And there might be a small intersection, somewhere, between business people and Starcraft amateurs.

Erlang posts are also consistently at the top of HN.

It's not bikeshedding when the bikeshed's color will actually have concrete effects on adoption. Most people -- i.e. in procurement, management, finance, and others you need to appeal to -- don't want anything to do with cockroaches. The idea disgusts them at a gut level, not something you can talk away.

HN users are giving vital advice, for free. Those who ignore it will have only themselves to blame.

As I say every time this comes up, would you be so dismissive about critics of naming a product PubesDB? Or GonorrheaDB? Or [n-word]DB? Then you agree that disgust-invoking connotations of the name matter, and we're just haggling over the details.

Ubuntu, Mongo, Swagger (edit: Hadoop also) ... they're weird, sure, but they don't evoke the visceral feeling of disgust that cockroaches do.

Not to be an ass but...

It so far appears not to be hurting them. In the slightest.

This "warning" comes from the HN crowd every time something is posted about CockroachDB. I think it's time to LET IT GO.

I, for one, completely disagree with you, but that's because I have a different understanding of the relationship between the business side and engineering. We are already looked at as eccentric and strange people; rarely, if ever, has an absurd technology name caused an issue.

Someone talking about "cockroach" is equivalent to talking about "unicorns" or "git." It's considerably less offensive than talk of "masters" and "slaves." If you think this is such a problem for you, then work on your salesmanship, as I wouldn't hesitate to talk to other departments or investors about this product.

I was a CTO up until I took medical leave this past October and I cannot stress how important salesmanship is to the role. I think your examples of other databases are hyperbole and not the point. You want them to be equivalent but they aren't. This comes down to what you can sell in your organization and if there is merit to it, then selling it should not be a problem.

One last point is other departments don't give a shit what the database technology is called unless it's something to put on their CV. Just call it the "database" as they most certainly will.

> It so far appears not to be hurting them. In the slightest.

I feel like that is tough to judge because the public has only known them by one name, as far as I know. If they had switched to this name from another name and seen no difference, then we could surmise that the name has had no effect.

I disagree that it's tough to judge but that's because they've raised a considerable amount of capital ($53 million over three rounds):


You don't know that they wouldn't have raised more with a better name.

The end goal of a company is not to raise venture funding. So you cannot use "they raised capital" as proof that their name isn't a problem. Their name absolutely will hurt their adoption. Maybe the product is good enough that they'll still be successful, but if so, you would expect them to be even more successful if they didn't have such an off-putting name.

Did I say it was the end goal? It's merely a metric for a young company. What it means is that enough people have decided there is a future: current revenue, growth, and expectations are being met or are substantial. Raising $53 million isn't easy. So I can say capital raised is a metric on which to base a judgement.

Your statement that it "absolutely will hurt adoption" is unqualified and nothing but opinion. And what exactly is "more successful?"

The handful of people who won't try this because of the name won't matter to their bottom line. If it's good enough then for even a large majority of those they'll end up using it anyway.

> And what exactly is "more successful?"

Pretty much any reasonable definition will do. For example, higher adoption is one metric that can be used to define success.

> Your statement that it "absolutely will hurt adoption" is unqualified and nothing but opinion.

It's an opinion that a lot of people share, judging from the HN threads I've seen about CockroachDB. And really, I shouldn't need to defend the idea that having a name that disgusts people will hurt adoption. It's just common sense. The only real question is how much damage will the name do? The better the product is, the more people will forgive things like bad names, but there will definitely be at least some level of damage.

In addition, if there's multiple products in the same category that are fairly close in quality, then subjective things like names will matter more. Maybe CockroachDB is significantly better than the alternatives right now (I really have no idea; this product category isn't something I know anything about), but if so, surely it won't remain "significantly better" forever. Other products will catch up, or other products will be created to compete, and we'll end up with several products that are similar, and once again, naming will become more important.

And finally, you're completely ignoring the fact that a lot of decisions about tech stack aren't actually made by technical people. They're frequently made by managers rather than engineers. And when the decision is made by non-technical people, marketing (e.g. the name) is very important. Heck, even when the decision is made by engineers, marketing is important, because that's how you convince the engineers to spend the time investigating the product to see if it lives up to its claims or does what they need.

Speaking as an engineer, if tomorrow I suddenly have the need for a cloud-native NewSQL database, I'm probably not even going to look at CockroachDB, simply based on the name, unless someone else convinces me that it's clearly superior. I find the name very off-putting and I'd rather not be confronted with the mental imagery of cockroaches any time I use the product.

You're missing the point. Saying "they raised capital" is not a good counterargument to "it will hurt adoption". Your response would be a good counter to "they will never raise capital" or "no one will use this".

You can't know how many VCs didn't fund due to the name or how many tech decision-makers at companies will pass on this product due to the name. That being said, I doubt it will be/was significant in any case.

> I think it's time to LET IT GO.

It will never be let go, because each new person is a new interaction with the system that prompts the same point again.

It's like those '*porn' subreddits. You can explain and explain 'till you're blue in the face why the subs are so named, but there will always be some sniggering discussion when it is introduced to new users no matter how much you try and silence or control for it, because it's based on a natural response.

Capitalize all you like, but that's just how people work. :)

> It's not bikeshedding when the bikeshed's color

Still seems like bikeshedding.

It's not your company. You're (probably) not an equity holder. Have you personally been harmed by the name because your company wouldn't let you adopt it in spite of its technical merits? Are you worried it won't succeed because of the name and thus are fighting on the company's behalf for its survival?

> Most people ... and others you need to appeal to ... don't want anything to do with cockroaches

You're making so much of this up out of thin air.

> giving vital advice, for free

> As I say every time this comes up

As the parent said, the staff have already seen these messages. They have decided to keep the name. Advice is helpful, but once the decision is made, it's not. Let it go.

"CockroachDB..your data will survive."

It is both a negative reaction and is memorable. It is not clear which wins, and it isn't your job to decide. Yes, you have an opinion but you may not be right.

I remember in the mid-2000s thinking that a particular politician couldn't possibly succeed with a Muslim sounding name. Turns out that a lot of people thought that. Yet Barack Hussein Obama managed to become President.

Your opinion has definitely been registered. Continuing to state it has no value.

When people are bike shedding, they aren't doing it to waste time, they think that they are adding value because of `list of bad results of wrong colour here`.

So here's my question to you: could you be wrong about this having "concrete effects on adoption"? And if you are wrong, is this just bike shedding?

And to continue the bike shed metaphor, it's about people ignoring nuclear power plant design whose worst case scenario is nuclear meltdown. For CockroachDB 1.0, what's the equivalent, data loss? So are you discussing something technically trivial (colour is easy to understand) over the design (technically complex) that would prevent data loss? If the answer is yes, aren't you bike shedding like a champion?

tl;dr Bike shedders don't know they're bike shedding and think the discussion is very important.

I elaborated in much (probably too much) greater detail on the relevance of the bikeshedding metaphor here: https://news.ycombinator.com/item?id=14310444

I would appreciate your thoughts.

With respect to your specific point: if we could resolve how much it matters, then yes, that would obviate the debate. But the bikeshedding metaphor doesn't add much there because it's precisely in dispute about how much it matters.

Well, now I feel a bit like an asshole for how abrasive my response was; sorry. Kudos to you for not escalating.

I agree that it resolves to how much it matters, and I guess I disagree with you on how much it matters. How it relates to the bike shedding metaphor is starting to feel like a semantic argument, which is not something I want to continue.

In response to your escalated names like PubesDB... my opinion is that I agree I wouldn't work with them, not because of any internal disgust reaction, but because the name signals a level of maturity that I don't want in my stack. Some people might have the same reaction to Cockroaches.

>Well, now I feel a bit like an asshole for how abrasive my response was; sorry. Kudos to you for not escalating.

I didn't feel it was abrasive at all.

For my part, I'm just upset that I went to such great lengths (in the comment I linked) to unpack where the bikeshed metaphor does or doesn't apply, disentangling the various issues and merging them into a general understanding, right where that comment was needed, and yet that's the one that no one is responding to (what's worse, it was downvoted less than a minute after I posted it).

>In response to your escalated names like PubesDB... my opinion is that I agree I wouldn't work with them, not because of any internal disgust reaction, but because the name signals a level of maturity that I don't want in my stack. Some people might have the same reaction to Cockroaches.

Right, like I said, "we're haggling over the details"; it should be regarded as a question of which names are so disgusting to be out of the question, yet people are dismissing the entire naming issue as "lol emotional primates".

> It's not bikeshedding when the bikeshed's color will actually have concrete effects on adoption.

Not taking a stance either way on the name, but that is the definition of bike-shedding (aka law of triviality). A committee won't vote for my nuclear plant because the bike shed is red. The bike shed's color has concrete effects on adoption.

EDIT: I would just like to acknowledge the irony of bike-shedding bike-shedding.

  > ...but that is the definition of bike-shedding (aka law of triviality)
  > A committee won't vote for my nuclear plant because the bike shed is red.
  > The bike shed's color has concrete effects on adoption.
Not exactly.

  > Parkinson observed that a committee whose job is to approve plans for a 
  > nuclear power plant may spend the majority of its time on relatively 
  > unimportant but easy-to-grasp issues, such as what materials to use for
  > the staff bikeshed, while neglecting the design of the power plant itself,
  > which is far more important but also far more difficult to criticize constructively.
  > -- https://en.wiktionary.org/wiki/bikeshedding
This part is key here:

  > A reactor is so vastly expensive and complicated that an average person cannot
  > understand it, so one assumes that those who work on it understand it. On the
  > other hand, everyone can visualize a cheap, simple bicycle shed, so planning 
  > one can result in endless discussions because *everyone involved wants to add a
  > touch and show personal contribution*.
  > -- https://en.wikipedia.org/wiki/Law_of_triviality
  > -- https://books.google.com/books?id=RsMNiobZojIC&pg=PA317

I need some additional hand-holding here if you don't mind, I don't see the difference.

If I were to rephrase those two excerpts:

  > Parkinson observed that a committee whose job is to approve plans for a 
  > [globally distributed relational database] may spend the majority of its time on relatively 
  > unimportant but easy-to-grasp issues, such as what [the name is],
  > while neglecting the design of the [globally distributed relational database] itself,
  > which is far more important but also far more difficult to criticize constructively.

  > A [globally distributed relational database] is so vastly expensive and complicated that an average person cannot
  > understand it, so one assumes that those who work on it understand it. On the
  > other hand, everyone can [read a name], so planning 
  > one can result in endless discussions because *everyone involved wants to add a
  > touch and show personal contribution*.
edit: formatting

Please tell me we're not having a bikeshedding discussion on the meaning of bikeshedding. :-)

We probably are. ;)

It's so meta it hurts.

Alright, if you really want to unpack the metaphor:

The bikeshed story is to illustrate overemphasis on something that is trivial. It uses the example of a bikeshed color and a committee wanting to spend a lot of time on it because a) they care a little about it, and b) they understand it well enough for hard-headed members to wade into the dispute rather than trust experts.

It's a failure mode -- by stipulation -- because the bikeshed color doesn't matter beyond minor (but real) aesthetic feelings among the committee, which are far outweighed by the cost of high-level personnel devoting time to it. Had they been aware of the general dynamic of these things, they could entirely prevent the loss by moving on; it's purely an internal matter.

The bikeshed model ceases to demonstrate a failure mode if and when the bikeshed color has impacts far beyond things under the control of the committee. For example, if the majority of the world's people had a near-religious devotion to destroying facilities that house a blue bikeshed, and that fanaticism was hard to defend against, this would be a valid reason not to make the bikeshed blue, and would warrant the committee's attention.

I summarize such situations as "that's not bikeshedding", though of course, to be more technically correct, I should say "that situation does not illustrate the avoidable failure mode in the parable of the bikeshed".

Similarly, if adoption matters for more than just that committee -- if they need to convince numerous other committees to adopt the design -- it's likewise "not bikeshedding" because the first committee doesn't have control over all the other ones; with respect to the first, it's an external matter, and they can't stem the loss just by saying "hey, this is trivial".

Now, you are correct that, at a high enough level, this could work as a bikeshedding example, if you could simultaneously get the entire world to collectively agree on the non-importance of aesthetics on technical matters, and on what counts as technical vs aesthetic. Then the world could play the role of that first committee, say "wow, this is trivial", and be done.

But if that were actually feasible, then that should be your product (producing universal agreement on matters where you have a logical proof-of-correctness), not a database!

procurement, management, finance, and others you need to appeal to

They don't need to appeal to any of these suits. Just the technical decision-makers, whose express job it is to choose solutions on their technical merits, not their spurious emotional reactions.

You can't possibly think that's true. If you do you can't have had much experience in buying or selling technology.

Selling, no. Buying, definitely. And names don't influence my decisions. A rose by any other name...

There is a sad fact though is that there are many organizations do not have technical people at the decision making level.

So these suits you speak of won't be able to get past the product name long enough to hear any technical merits of why this technology should ever be considered. That's due to dysfunctional leadership not even having a Chief Technical Officer or Chief Information Officer role at the senior leadership level. A lot of organizations outsource because they don't want to hire/pay for this in house. It also shifts responsibility away, giving the CEO, COO, CFO, etc. the ability to point fingers at an outside entity.

That is a double whammy: internal staff can't sell/justify it to management, and outside IT providers/contractors can't sell it either.

So while they may be surviving with the current name, that does not mean they wouldn't be crushing it in market share with a different name. If they are getting negative comments about the product name, that's a warning that they should do market research to find out how many people would avoid the product because of the name.

But what the hell do I know, I'm making yet another HN comment post.

On the other hand, if I heard of a database called "CockroachDB" gaining ever-greater adoption, I'd pay close attention to it because it was clearly succeeding despite a marketing handicap.

Mongo (similar to 'mongolism', another term for Down's syndrome) has done just fine, despite (or thanks to!) their name.

There was similar criticism about their name in the early days, but it has waned as mongo has grown. This will too.

Wasn't mongo slang for humongous? Like Mongo from Blazing Saddles?

In the UK "mongo" is synonymous with "retard" if I'm remembering correctly.

In Spanish it is also a synonym for retard

> PubesDB

Oh man, that's too much. lolol

"The DB system designed to handle crazy (sometimes even tangly!) growth. No matter how hairy your data is, Pubes can handle it!"

That's just like your opinion man. Maybe they will love it. Having a name that stands out is generally good marketing actually.

Mongo is a derogatory term in spanish. But hey! not english so no problem.

"Deragatory" isn't the problem; the issue is whether it invokes visceral feelings of disgust. Many terms can be used as an insult, but are still tolerable as a name because a) they have non-insulting usages, and b) the emotional response does not rise to the level of "visceral disgust".

The Spanish Wikipedia suggests many usages of the term "mongo", which probably wouldn't persist if the term was so repulsive: https://es.wikipedia.org/wiki/Mongo

My mother tongue is English, so I never made that association. But I've lived all my life in a Spanish speaking country. Mongo is usually used in the context to mean "retard".

Edit: Now that my memory kicked in, it's racist as well.

I'm a spaniard and mongo is an insult. [removed unnecessary snarky comment]

I think SilasX is not suggesting that it isn't a insult, but rather that a word being an insult is not what really matters as far as naming is concerned. What matters, according to him, is whether the word automatically elicits a strong negative emotional reaction. That a lot of words that elicit such a reaction are used as insults is mostly incidental to the argument.

If calling your product RetardedDB or MentallyChallengedDB in a professional setting is OK by his standards because it's not spelled in english, then I'm OK with that. That's what mongo means in spanish, btw.

Okay, there are separate issues going on here; let me try to clarify:

Is "mongo" the equivalent of English "retard", in terms of being a low-class insult that invokes a visceral reaction among the majority of the population?

I didn't believe that at first; if so, why didn't anyone ever put it in Wikipedia? English has "retard" (in the pejorative sense):


And why doesn't it show up in a top-result Spanish dictionary?


If it's merely an insult with numerous other meanings, I don't think it's comparable.

But let's assume it is equivalent to "retard". In that case, I would agree that it shouldn't be used as a name. But you have to pick your battles: all words will have that trait in some language. For my part, I would consider the Spanish-speaking market big enough not to expect them to buy [the equivalent of] RetardDB. So I agree there.

Edit: I agree with the sibling commenter networked's points.

First of all, Wikipedia isn't infallible. Second, genkaos and I grew up on different sides of the world. Culturally different, and yet in our own respective cultures we learned, however wrong it is, the implicit meaning of mongo when used derogatively. It doesn't have several meanings as you pointed out; it has one, which does not mean it is the same for genkaos. As you pointed out in the Wikipedia link, it refers to a certain type of people. When used as an insult towards a person, whether that person is white, Hispanic, black, whatever, it implies that the person is that sort of person and a retard. Thus my comment about it being racist as well.

Now, even if I had associated MongoDB with that explanation, and now that I do remember its inherent meaning in a certain context, I take no offense in it since the people behind MongoDB didn't have that intent. Obviously this is an assumption on my part.

Let us not get derailed from the main point, which is the 'visceral' feelings that cockroachDB has on so many people as you mentioned in several comments. It is true, it happens to me as well. But not the word itself, but when I'm around one. Those feelings of fear, whatever, when around one are irrational. I don't remember the explanation why it's irrational, I've never worked in the field of psychology.

And that's my point. You're only offended because it's in english (and that's fair). But no matter what name you use, it will offend someone. CucarachaDB will fly under the radar.

Maybe you have to be culturally immersed to know those things. Mongo, mongol and mongólico are the terms you should research.

>And that's my point. You're only offended because it's in english (and that's fair).

I specifically said I would be sensitive to the offense it would cause in other languages, at least the major ones.

So you are saying it would be similarly fine to call something NiggerDB?

Just finished my code, I'll commit with git.

Much better to name it something like Oracle, after a mythical seer who gives cryptic, self-contradictory answers open to wild interpretation.

Well, since this discussion has already gone down the tubes, we might as well spend some productive time making fun of other DB names:

- SQLite: SQL database with no sugar. Less calories!

- MySQL: A selfish database.

- IBM DB2: Released in 1983, but never got promoted to DB3. Probably abandoned software?

- Postgresql: Gesundheit!

- CouchDB: A database for lazy people. Part of the NOSQL family, the Zen database family, that achieve SQL by not achieving SQL... like I said, lazy.

- Microsoft Access: It's very accessible. Ironically, most people that use Office don't know what it is, or that it exists, and thus, don't use it.

- dBase: De-bases your data.

- Sybase: Pronounced sigh base, which is the sound people make when you suggest it.

> - MySQL: A selfish database.

MySQL is named after the founder's daughter "My". The fork is named after his other daughter "Maria": https://en.wikipedia.org/wiki/Michael_Widenius#Personal_life

I didn't know this. My intention was never to offend, just to try (and apparently, fail) to be funny.

SQLite: they support a subset of SQL? unacceptable.

MySQL: a proprietary product if ever I saw one.

CouchDB: wow, does that hide bits of data until you search next week?

It's not trolling. It's a legitimate warning and they can choose to ignore the chorus to their own peril. The warnings get louder as they get more resistant to changing their name. Keeping the name for whatever reason IS going to cost them enterprise customers

There's a difference between "legitimate" and "useful". If the top comment on every CockroachDB post was "hey y'all remember that Go's maps aren't thread-safe", that would certainly be a legitimate warning. But at the same time, the CockroachDB team have been coding in Go for years, and they obviously already know that. If those top comments frequently turn into big threads arguing about whether Go's maps should have been thread-safe, the whole thing goes from being questionably useful to seriously annoying. Same thing's happening with the name. They know.

I haven't really seen that many comments about the name, though?

So now the top thread is about how terrible HN is for bikeshedding instead of talking about the actual topic... except this top thread is also not talking about the actual topic. Worth considering, imo.

What's even wrong with the name anyway? It's certainly a lot better than the ridiculous ones like "PostgreSQL" and "MongoDB" and "Redis" (what do these words even mean?).

Since there's a little side riff about the name going on I thought I'd throw in my 2 cents. Personally I love the name. I think it does a great job of conveying the spirit of the project and provides unlimited pun opportunities. Plus it's memorable, just like a real life roach encounter. Unfortunately I'm sure some people will discriminate against your DB on the basis of name alone. That's ludicrous, but that's our species for ya.

At first when I saw yet more name comments on this thread I felt disappointment that people can't leave the subject alone.

But then I realized that as someone who doesn't care about the name, even positively enjoys it, I have a competitive advantage over those people.

Now I feel good again.

Choosing technologies based on first-hand review and first principles rather than things like Gartner magic quadrants, big company brand recognition, feature lists, and "serious" sounding names is a competitive advantage that startups often have over big businesses. The latter are forced by their procurement departments and other forces to use old, inferior, and more costly technology.

On the flip side though if I were in charge of CockroachDB I would look at doing something about the name. Maybe rename it something like "Resilient" as part of the "exit from beta" milestone. It's going to be a serious liability for them selling to the kinds of customers I described above, and unfortunately that's where most of the money is in these devops/infrastructure markets. The key to success is to make a superior product and then figure out how to sell it to pointy haired bosses. The latter often means making it look more boring than it actually is.

Fun factoid: scientists sometimes do this with grant proposals. I've had two scientists independently tell me that they often take cool, fascinating research proposals and "make them boring" to sell them to bureaucrats. "You have to hide all the interesting stuff and make it sound like you are doing boring incremental research. If you talk about anything 'revolutionary' you will never get funded."

I see it as technical people on HN who appreciate the metaphor, versus marketing/business people who can only think of "image".

It's to be expected with the massive infestation of HN by suits and khakis in the last few years.

I think the problem is worse: marketing / business people have convinced the worker that this surface level analysis is all we can expect of anyone. As said by other commenters: if the name of the DB solution influences your choice then you're probably gonna get what you deserve.

(Within reason. Someone on here actually said this argument is reasonable to have "because what would you do if they named it 'n-word'DB." Seriously.)

It appears to me that "marketing/business" people are simply stereotyped in this thread, because surely the complaints come mostly from "tech people".

It's the classic case of everyone saying "I think it's great but <somebody> will complain." Which ends in mindless mediocrity.

Go CockroachDB!

I agree with all your points. That being said MongoDB, Aerospike and Hadoop have all gotten good traction even with their slightly silly names.

To me, cockroaches are such an unbelievably negative association that I don't think I could get over the name and work with this product, because I wouldn't want to be saying cockroach all the time.

Same thing if your database was called BedBug.

> I don't think I could get over the name and work with this product

That puts everyone competing with you at a HUGE competitive advantage. Making technical decisions based on the name of a product is the worst type of decision making.

To me, cockroaches aren't disgusting. And yes, I have used an outhouse in a 3rd world country where cockroaches were swarming up and out... But they just don't disgust me.

It's a bad name because this topic will come up every time it's discussed, forever. It's a distraction from other relevant issues like new features or how it performs.

Strangely, it seems to be helping them. Usually whenever there's an excellent product/article featured on HN, there's not much to say, so there are very few comments. CockroachDB seems like an excellent product, yet the firestorm about their name is fueling discussion, which amusingly might be leading to more upvotes from people who dislike that they're being discriminated against based on their name. It's counterintuitive internet behavior at its finest, similar to everyone complaining that Soylent was a terrible name.

I hope it is excellent and advances the state of the art, but it won't reach its full potential until it has a name people can use when talking to users, customers, and board members.

"Well first we collect all of the data in the Epidemic schema, run it through the Apocalypse pipeline to transform it into something that our Extinction servers can handle, and finally store it in CockroachDB."

As the creator of a moderately popular open source project, I can attest that the name of the project is very important.

A common problem for open source projects is that the name is not recognizable enough (e.g. too technical) or too generic (e.g. a simple English word, which makes it hard to search on Google).

In this case the name evokes negative emotions of fear and disgust which are not what you want to associate with a database.

I wondered for a time if the action movie XXX[0] chose that name because it would be very hard to search online.

[0] http://m.imdb.com/title/tt0295701/

Back in 2000, I used to enjoy an online streaming radio station called echo.com, and as a sort of reward for listening, you could earn Amazon gift certificates.

I tried googling for "Amazon echo gift certificates" but I couldn't quite find what I was looking for.

I miss AltaVista.

There is a tools menu on Google search where you can set a date range. Not perfect, but it helps.

But you'd want your airline named "Virgin" and your morning-after pill named "Plan B"?

The name is, indeed, evocative. Good names don't have to universally convey "positive" emotions.
