foohak42's comments

foohak42 · on Oct 18, 2017

So, how does it compare to cockroachdb?

So far I've seen:
- Apache v2 license
- Aims at compatibility with MySQL, vs. PostgreSQL for CockroachDB

shenli3514 · on Oct 18, 2017

There are some key differences between TiDB and Cockroach.

1. User interface and ecosystem: Although TiDB and CockroachDB both support SQL, TiDB is compatible with the MySQL protocol while CockroachDB chooses PostgreSQL. You can connect directly to a TiDB server with any MySQL client.

2. Architecture: The whole TiDB project is logically divided into two parts: the stateless SQL layer (TiDB) and the distributed storage layer (TiKV). As TiDB is built on top of TiKV, developers are free to use either TiDB or TiKV, depending on their needs. If you only want a distributed key-value database, you can use TiKV alone for higher performance and lower latency.

In short, our system is highly layered and modularized, while CockroachDB is a P2P system. This design is also why we use two programming languages: Go for TiDB, and Rust for TiKV to improve storage performance.
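To make the layering concrete, here is a minimal Python sketch of a stateless front end sitting on a separate key-value store. The class names, the key layout, and the single-column row are assumptions made for illustration only; this is not TiDB's or TiKV's actual API or encoding.

```python
from typing import Dict, Optional


class KVStore:
    """Stand-in for the storage layer: a plain in-memory key-value map."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)


class SQLLayer:
    """Stand-in for the stateless layer: holds no data, only a store handle."""

    def __init__(self, store: KVStore) -> None:
        self._store = store

    @staticmethod
    def row_key(table: str, row_id: int) -> bytes:
        # Hypothetical key layout chosen for this sketch only.
        return f"{table}/{row_id:020d}".encode()

    def insert(self, table: str, row_id: int, name: str) -> None:
        self._store.put(self.row_key(table, row_id), name.encode())

    def select(self, table: str, row_id: int) -> Optional[str]:
        raw = self._store.get(self.row_key(table, row_id))
        return None if raw is None else raw.decode()


store = KVStore()
sql_a = SQLLayer(store)
sql_b = SQLLayer(store)  # a second front end over the same store
sql_a.insert("users", 1, "alice")
row_via_b = sql_b.select("users", 1)                   # read through the other front end
raw_value = store.get(SQLLayer.row_key("users", 1))    # or bypass the SQL layer entirely
```

Because the front ends keep no state of their own, either one can serve the read, and a client that only needs key-value access can talk to the store directly.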

Thanks to this highly layered architecture, we also built another project[1] that runs Apache Spark on top of TiDB/TiKV to answer complex OLAP queries. It takes advantage of both the Spark platform and the distributed TiKV cluster.

3. Transaction model: Even though CockroachDB and TiDB both support ACID transactions, TiDB uses a model introduced by Google’s Percolator. The key feature of this model is that it needs an independent timestamp allocator. Like Spanner, each transaction in TiDB gets a timestamp, which is used to isolate it from other transactions.
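As a rough illustration of what the timestamp allocator is for, the Python sketch below pairs a central allocator with a multi-version store. It deliberately leaves out Percolator's locking and two-phase commit, and every name in it (`TimestampOracle`, `MVCCStore`, `start_ts`, `commit_ts`) is hypothetical rather than taken from TiDB.

```python
import itertools
from typing import Dict, List, Optional, Tuple


class TimestampOracle:
    """Hypothetical central allocator handing out strictly increasing timestamps."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)

    def next(self) -> int:
        return next(self._counter)


class MVCCStore:
    """Keeps every committed version of a key, tagged with its commit timestamp."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[int, str]]] = {}

    def write(self, key: str, value: str, commit_ts: int) -> None:
        self._versions.setdefault(key, []).append((commit_ts, value))

    def read(self, key: str, start_ts: int) -> Optional[str]:
        # A reader sees the newest version committed at or before its start_ts.
        visible = [(ts, v) for ts, v in self._versions.get(key, []) if ts <= start_ts]
        if not visible:
            return None
        return max(visible, key=lambda pair: pair[0])[1]


tso = TimestampOracle()
store = MVCCStore()

store.write("balance", "100", commit_ts=tso.next())
reader_start = tso.next()                           # a transaction starts here
store.write("balance", "80", commit_ts=tso.next())  # a later commit

seen_by_reader = store.read("balance", reader_start)
seen_by_new_txn = store.read("balance", tso.next())
```

The earlier transaction keeps reading the value that was committed before it started, while a transaction that starts later sees the newer write. A single allocator gives all transactions one total order without relying on synchronized wall clocks.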

The model that CockroachDB uses is similar to the one built on the TrueTime API that Google described in its Spanner paper. However, unlike Google, CockroachDB does not rely on atomic clocks and GPS receivers to keep time consistent across data centers. Instead, it uses NTP for clock synchronization, which leaves an uncertain clock error between nodes. To deal with this, CockroachDB adopts the Hybrid Logical Clocks (HLC) algorithm.
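For readers who have not seen HLC before, here is a small Python sketch of the update rules from the published HLC algorithm. It is an illustration only, not CockroachDB's implementation; the class name and the injected `physical_clock` callable are assumptions made so the example can run with fixed clock values.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class HybridLogicalClock:
    physical_clock: Callable[[], int]  # e.g. NTP-synced wall time; injected here
    l: int = 0  # largest physical time seen so far
    c: int = 0  # logical counter that breaks ties within the same l

    def now(self) -> Tuple[int, int]:
        """Timestamp a local or send event."""
        prev = self.l
        self.l = max(prev, self.physical_clock())
        self.c = self.c + 1 if self.l == prev else 0
        return (self.l, self.c)

    def update(self, remote: Tuple[int, int]) -> Tuple[int, int]:
        """Timestamp a receive event carrying the sender's (l, c)."""
        remote_l, remote_c = remote
        prev = self.l
        self.l = max(prev, remote_l, self.physical_clock())
        if self.l == prev and self.l == remote_l:
            self.c = max(self.c, remote_c) + 1
        elif self.l == prev:
            self.c += 1
        elif self.l == remote_l:
            self.c = remote_c + 1
        else:
            self.c = 0
        return (self.l, self.c)


# Node B's wall clock lags node A's by 10 units.
node_a = HybridLogicalClock(physical_clock=lambda: 100)
node_b = HybridLogicalClock(physical_clock=lambda: 90)

sent = node_a.now()
received = node_b.update(sent)
later_on_b = node_b.now()
```

Even though node B's physical clock is behind, the timestamp it assigns to the receive event still orders after the send event, because the tuples compare first on `l` and then on `c`.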

4. Programming language: TiDB uses Go for the SQL layer and Rust for the storage engine layer. Because Go has a garbage collector (GC) and a runtime, we think it would cost us days to tune the performance. Therefore, we use Rust, a static language, for TiKV. Its performance is much better. CockroachDB only uses Go.

[1] Spark on TiKV: https://github.com/pingcap/tispark

ansible · on Oct 18, 2017

4. Programming language: TiDB uses Go for the SQL layer and Rust for the storage engine layer. Because Go has a garbage collector (GC) and a runtime, we think it would cost us days to tune the performance. Therefore, we use Rust, a static language, for TiKV. Its performance is much better. CockroachDB only uses Go.

CockroachDB uses RocksDB for the storage engine on each node, and that's written in C++ (which can be viewed as good or bad).

foohak42 · on Oct 18, 2017

Thanks for the details!

I'm pretty sure CockroachDB uses RocksDB for the underlying storage, so that part is written in C++.

shenli3514 · on Oct 18, 2017

I mean the Raft/MVCC/transaction layers, which sit on top of RocksDB and below the SQL layer.