
TiDB open sources its MySQL/MariaDB compatible data migration tool - ceohockey60
https://github.com/pingcap/dm
======
georgewfraser
If you are considering TiDB for OLAP workloads, you should first evaluate an
OLAP-only data warehouse like Snowflake, BigQuery, or Redshift. A "hybrid"
database sounds like a good idea, until you realize that you probably don't
want ad-hoc analytical queries running against your production database, so
you're going to end up running two systems anyway.

Furthermore, column-stores are so different than row-stores in their ideal
implementation that they are effectively completely different systems with the
same outer interface (SQL). Daniel Abadi wrote a fantastic paper [1] in which
he showed that fast column stores must use completely different strategies at
every layer of query processing.

You should be very skeptical of any database claiming to be good at both
transaction processing and analytical workloads. Such as system would
effectively be two different databases hiding behind the same SQL interface,
and would not be as useful as you might expect, for devops reasons.

[1] [http://db.csail.mit.edu/pubs/abadi-column-
stores.pdf](http://db.csail.mit.edu/pubs/abadi-column-stores.pdf)

~~~
morgo
Morgan from the TiDB team here.

I agree that column and row store have very different characteristics, but
what I think is worth mentioning is that some hybrid solutions actually store
as _both_ row and columnar and have a query optimizer that can pick between
them. For example: Oracle DB In-Memory, SQL Server Columnstore index.

At the same event as this announcement, we also announced that we are working
on TiFlash which will do similar. Stay tuned for a blog post with more details
:-)

~~~
georgewfraser
As described in that paper, it’s not sufficient to simply store a second,
columnar projection of the data to get good performance. You also need a
block-oriented execution engine, which means you effectively have two separate
databases operating side-by-side. This is a huge challenge and it’s not clear
if it’s really worth it, since for logistical reasons you will nearly always
operate a separate data warehouse doing mostly OLAP and production database
doing mostly OLTP.

------
giancarlostoro
Anybody using TiDB and care to share why? Especially if you migrated from
MariaDB / MySQL. I'm curious about the use cases compared to just using
MariaDB.

~~~
jinqueeny
Disclaimer: I work for TiDB.

Here are some links that might be helpful:

\- the typical use cases: [https://pingcap.com/blog/architecture-and-use-
cases-of-a-clo...](https://pingcap.com/blog/architecture-and-use-cases-of-a-
cloud-native-newsql-database/)

\- the key differences between MySQL and TiDB

[https://pingcap.com/blog/5-key-differences-between-mysql-
and...](https://pingcap.com/blog/5-key-differences-between-mysql-and-tidb-for-
scaling-in-the-cloud/)

\- the current case studies: [https://pingcap.com/success-
stories/](https://pingcap.com/success-stories/)

~~~
jinqueeny
One more link:
[https://www.pingcap.com/blog/2017-07-24-tidbbestpractice/](https://www.pingcap.com/blog/2017-07-24-tidbbestpractice/)

------
etaioinshrdlu
Anyone else feel super-conservative about databases? It's such a critical
piece of infrastructure, the last thing you want is for it to fail... I'm
using Aurora MySQL but am interested in alternatives. I can't bring myself to
use them out of bug paranoia.

~~~
SamReidHughes
My stance on clustered databases is that you replace hardware risk with
software risk. So they make sense if they get you better throughput or better
uptime stats, but you've still got to plan for a software bug bringing down
the whole thing.

------
blakesterz
Somehow I've never heard of this before:

TiDB (The pronunciation is: /'taɪdiːbi:/ tai-D-B, etymology: titanium) is an
open-source distributed scalable Hybrid Transactional and Analytical
Processing (HTAP) database. It features infinite horizontal scalability,
strong consistency, and high availability. TiDB is MySQL compatible and serves
as a one-stop data warehouse for both OLTP (Online Transactional Processing)
and OLAP (Online Analytical Processing) workloads.

[https://github.com/pingcap/tidb](https://github.com/pingcap/tidb)

~~~
thanatos_dem
“Infinite horizontal scalability”. Yeah, no. There’s coordination overhead for
adding more nodes. It needs to have an upper bound. It may be a high upper
bound, but infinite is straight up impossible.

It may seem trivial, but claims like that are why I prefer CockroachDB when it
comes to NewSQL systems. They’re at least very upfront with its compromises
and areas needing improvement.

I’ve never see anything from Pingcap explaining the compromises that the DB
makes. A “when should you not use this” type overview, and to me those are the
most important things when deciding on a technology.

It may be good technology, but it’s marketed like modern day snake oil.

~~~
morgo
Morgan from the TiDB team here. Thank you for the feedback, and I agree with
you. We actually took this line out from the same copy in the docs:
[https://pingcap.com/docs/](https://pingcap.com/docs/)

(We must have missed a spot, and I will follow up and make sure it is
addressed).

We try to be transparent about the differences from MySQL. On the
compatibility page, there are a few cases described such as large
transactions, small transactions and single threaded workloads:

[https://www.pingcap.com/docs/sql/mysql-
compatibility/](https://www.pingcap.com/docs/sql/mysql-compatibility/)

------
gigatexal
How does tidb deal with at rest encryption and nested transactions?

~~~
morgo
Morgan from the TiDB team here. We are working on at rest encryption now -
stay tuned.

w.r.t. nested transactions, this is not something that MySQL currently offers
(TiDB is MySQL 5.7 compatible). Sometimes this is emulatable via savepoints,
which is a feature we plan to add in the future.

~~~
gigatexal
How good or bad is the SqlAlchemy support?

~~~
morgo
It should work fine, with one item to note:

I believe SqlAlchemy defaults to READ-COMMITTED, whereas TiDB defaults to (and
recommends) MySQL's default of REPEATABLE-READ.

~~~
gigatexal
Any chance you guys run it against the SQLalchemy test suite?

