SQL in CockroachDB: Mapping Table Data to Key-Value Storage (2015) (cockroachlabs.com)
55 points by yinso on July 9, 2017 | 33 comments

This was a good introduction, but I expected to see some detail regarding the table mapping strategy and concurrency control / isolation levels. In addition to InnoDB and SQLite cited in the article, there is also a lot of good information on the MyRocks wiki on mapping the relational model onto a RocksDB-based storage engine: https://github.com/facebook/mysql-5.6/wiki/MyRocks-record-fo...
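The article and the MyRocks wiki describe the same basic trick: each SQL row becomes one or more entries in an ordered key/value store, keyed by table and primary key. A toy sketch of that idea (the encoding and names here are made up for illustration; neither CockroachDB nor MyRocks uses string-concatenated keys like this in practice):

```python
# Toy mapping of a SQL row onto ordered KV pairs: one entry per
# non-primary-key column, keyed as /<table>/<primary key>/<column>.
# Real engines use carefully order-preserving binary encodings instead.

def row_to_kv(table, pk_cols, row):
    """Encode a row dict as (key, value) pairs addressed by primary key."""
    pk = "/".join(str(row[c]) for c in pk_cols)
    return [
        (f"/{table}/{pk}/{col}", str(val))
        for col, val in row.items()
        if col not in pk_cols
    ]

kvs = row_to_kv("users", ["id"], {"id": 42, "name": "ada", "email": "ada@x.io"})
# kvs == [("/users/42/name", "ada"), ("/users/42/email", "ada@x.io")]
```

Because the store keeps keys sorted, all columns of a row (and all rows of a table) end up adjacent, so a row read or table scan is a contiguous range scan.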

I hadn't considered that most storage engines for SQL databases worked in a "similar" fashion.

[Official containment subthread for complaining about the name "CockroachDB"]

Under the terms of the Unified HN Convention, agreed 2015, every thread about CockroachDB must by law contain a series of complaints about the name of the database. Please post yours below.

To help you get started, here are some prompts you might use:

"My Enterprise CTO will never go for something named..."

"I just think the name sounds really disgusting and off-putting..."

"Marketing a product is at least as important as making a product, and this is bad marketing..."

Have they fixed the performance issues in Atom yet? Last time I checked it couldn't load a 800MB file without locking up.

Yikes. Guilty as charged for unironically making comments like the first two. Going to stare very quietly into nature and reflect.

"For the Swarm!"

"Should've been called RoachDB"

Blizzard called and wants you to talk to their lawyer

I still don't understand why CockroachDB doesn't just use Spark as the compute layer. Spark seems perfectly capable as a platform for performing arbitrary distributed computations on distributed data sources. It even has a full-fledged distributed SQL engine!

As it stands, it seems to me that CockroachDB is mostly just reinventing Spark from scratch, except maybe from a more OLTP-centric perspective.

I'm not an expert with either system, but I think the answer is pretty clear: that's not an efficient solution at all. Spark's approach is: store the data however you want, then spend a ridiculous amount of resources getting it into the format you need. With CockroachDB you also get a ton of guarantees that you don't get with the Spark system you've proposed. Here's their comparison page[1]. If you're familiar with the technology you're suggesting, it should become fairly obvious why Cockroach's nuanced approach to SQL is superior to the Spark SQL solution. For one, Spark SQL doesn't support indices. You can build one yourself, but I'd be surprised if you could get the same general performance out of that system vs. a dedicated one like what's built into CockroachDB.

I know the hype is to "use NoSQL" but the reality is that there's a limited set of cases where the existing solutions truly make sense. Generally speaking, SQL is the best choice for many companies. That CockroachDB is bringing some new ideas to the SQL space is something that should be lauded. And I think they're doing it in a way that can't really be replicated easily or efficiently by current systems (which is perhaps why they're receiving so much attention.)

[1] https://www.cockroachlabs.com/docs/stable/cockroachdb-in-com...
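To make the index point concrete: in a sorted KV store of the kind CockroachDB is built on, a secondary index is just additional keys mapping the indexed value back to the primary key, turning a full scan into a point lookup. A toy sketch (the `SortedKV` class and key layout are invented for illustration, not CockroachDB's actual structures):

```python
import bisect

class SortedKV:
    """Minimal sorted key/value store with O(log n) point lookups."""
    def __init__(self):
        self.keys, self.vals = [], []

    def put(self, key, val):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.vals[i] = val
        else:
            self.keys.insert(i, key)
            self.vals.insert(i, val)

    def get(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.vals[i]
        return None

store = SortedKV()
store.put("/users/42/email", "ada@x.io")     # primary copy of the row data
store.put("/users_by_email/ada@x.io", "42")  # secondary index entry

# Lookup by email is now a point read, not a scan over every user.
pk = store.get("/users_by_email/ada@x.io")
# pk == "42"
```

The cost, of course, is that every write must maintain the extra index keys transactionally, which is exactly the kind of machinery a dedicated engine provides and a bolt-on index over Spark would have to reinvent.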

I... agree. But my argument was not "use NoSQL". I never suggested to use NoSQL. I suggested to use Spark as the compute layer.

Spark has a sophisticated [1] query engine, which seems perfectly capable of pruning partitions and pushing down index scans into data sources. Yes, Spark can't do writes, so you'll still have to build a transactional KV store that sits underneath the query layer, but you won't have to implement SQL from scratch. (This seems to be similar to the approach taken by e.g. SnappyData [2] and Splice Machine [3].)

[1] https://databricks.com/blog/2016/05/23/apache-spark-as-a-com...

[2] http://snappydatainc.github.io/snappydata/

[3] http://doc.splicemachine.com/2.0/GettingStarted/Architecture...
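For what it's worth, the pushdown contract can be sketched roughly like this (loosely modeled on the idea behind Spark's Data Sources API; the class and method names here are hypothetical, not the real API): the engine hands filters to the source, the source applies whichever it can, and reports any residual filters back for the engine to re-check.

```python
class InMemorySource:
    """Hypothetical data source that accepts pushed-down equality filters."""
    def __init__(self, rows):
        self.rows = rows

    def scan(self, filters):
        # This toy source can evaluate every equality filter itself.
        supported = dict(filters)
        rows = [
            r for r in self.rows
            if all(r.get(c) == v for c, v in supported.items())
        ]
        # Anything the source couldn't handle would be returned here for
        # the query engine to apply after the scan.
        residual = {c: v for c, v in filters.items() if c not in supported}
        return rows, residual

src = InMemorySource([{"id": 1, "city": "NYC"}, {"id": 2, "city": "SF"}])
rows, residual = src.scan({"city": "SF"})
# rows == [{"id": 2, "city": "SF"}], residual == {}
```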

SnappyData employee here -- This is essentially what we did. The main difference is that we already had a decade-old transactional K/V store that, over time, morphed into a more full-fledged in-memory database. That is what we integrated with Spark, versus rolling a new database. The SQL layer in this database (GemFire/Geode) already had a number of optimizations we could use to speed up Spark SQL queries, even over the native Spark cache.

Like some of the other comments in this thread, the idea was to provide all the guarantees of an OLTP store (HA, ACID, scalability, mutations, etc.) with the powerful analytic capabilities of Spark.

@elvinyung I am going to try to be as positive as I can. I understand why someone who hasn't written a database, or hasn't focused on the internals, would see Spark as similar to Cockroach. Frankly, they aren't similar at all. Things like isolation levels, data distribution strategies and computational placement are very different in the two systems. That being said, I think both of them are junk. They are the best junk out there, but junk nonetheless. So maybe take my opinion with a grain of salt.

Genuinely curious: what's not junk? (Ideally for use cases similar to what CockroachDB targets.)

Why do you say they are junk?

To be fair, Spark is way too slow to be used as a back-end database system (there's a big gap between "lightning speed computation", as they put it, and a common workload for a database serving data to a user-facing application).

Now of course Spark makes up for it with its great flexibility and scalability, but I do not really see the two technologies as competing ones.

This even without getting into the other parts of the data model (insert, update, delete) that do not exist in Spark (or "kind of" exist), by design.

Spark is built around a query planner that can aggregate multiple data sources and "push down" predicates, ordering and so on into the "leaf" node itself, much in the same way that Cockroach does. But I suspect that while this design is flexible and fast, to get the kind of OLTP-friendly performance that Cockroach is aiming for, a good deal of the distributed execution logic has to be fine-tuned in a way that Spark -- being a general-purpose, backend-agnostic system -- cannot manage. Spark can query things like Hive, Elasticsearch, Cassandra and so on; Cockroach has the luxury of only needing to target its own engine.

Also, last I checked Spark had very limited support for mutating data in SQL; it was designed for queries. Cockroach has to do both.

There's overhead in the flexibility, but it seems that you'll have to do the same kind of tuning all over again if you build your own query engine from scratch. And frankly it seems (at least at a high level) that Cockroach is heading towards building the exact same kind of dataflow engine anyways[1].

Spark has some support for SQL DML, and to me it actually mostly seems like a concern for the underlying KV layer to provide ACID transactions, and expose them to Spark through the Data Sources API.

[1] https://www.cockroachlabs.com/blog/local-and-distributed-pro...

> "except maybe from a more OLTP-centric perspective"

This is a big difference: separating the compute and storage layers buys that flexibility at the cost of a big overhead.

I would actually argue that CockroachDB isn't really a pure OLTP play, nor does it really want to be one. If you read some of their more recent literature like [1], it seems pretty clear to me that they're headed towards building a "good enough" one-size-fits-all [2] database solution, one that can do both OLTP and OLAP on the same data (i.e. HTAP).

Furthermore, it seems to me that query evaluation techniques aren't really that different between OLTP and OLAP workloads, i.e. the difference is mainly in the storage format.
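A toy illustration of that storage-format point: the same table laid out row-wise (cheap whole-record access, the OLTP pattern) versus column-wise (cheap single-attribute scans, the OLAP pattern), while the query on top stays the same:

```python
# Three rows of a tiny table, stored both ways.
rows = [{"id": 1, "amt": 10}, {"id": 2, "amt": 25}, {"id": 3, "amt": 7}]

# Row store: fetching one whole record touches one contiguous entry.
record = rows[1]                        # {"id": 2, "amt": 25}

# Column store: the same data pivoted, so aggregating one column
# never touches the other columns at all.
columns = {k: [r[k] for r in rows] for k in rows[0]}
total = sum(columns["amt"])             # 42
```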

In that sense, I think it follows that Spark is a good fit as the query layer. Notice phrasings like "arbitrary computations close to your data". That's basically exactly what Spark already is. I think there's a lot of synergy in combining it with a strongly consistent distributed storage engine.

Anyways, my $0.02, evidently not a very popular opinion ¯\_(ツ)_/¯

[1] https://www.cockroachlabs.com/blog/local-and-distributed-pro...

[2] https://cs.brown.edu/~ugur/fits_all.pdf

Can anybody give me an example of any dbms that spans the OLTP/OLAP gap properly _on the same data_?

Potentially added constraints:

- ACID for OLTP

- ## TB+

- 30-way analytical joins on complex criteria and multiple data sets with billions of entries

- fast iterations on data prep for analytics, so analysts can make, find and correct errors

- proper workload management (almost no "stupidly designed" queries)



I'm asking because I can't see this working without hardware and software being integrated to allow for it (an appliance). Are there any cloud offerings that live up to this?


Oracle does pretty well running OLTP and analytics workloads concurrently against the same schema. While I don't generally recommend doing so, you can fire off massive scans or complex joins against a production system doing thousands of transactions per second and it handles it just fine.

In my view, one reason that we don't see huge demand for this combination is that the schema that makes sense for analytics is often different from that which makes sense for the online system.

Not that I'm aware of. It's tricky because the requirements for data storage format, query scan method, data distribution, etc. are different for OLTP and OLAP. However, you can replicate the data to both an OLTP and an OLAP database and use both of them at the same time. That's what people usually do. In fact, even if there were a database for this use case, it would probably do the same process internally so that you don't need to do it at the application level.

I'm one of the founders at MemSQL, which does what you describe.

Totally tangential, but I'm curious if you could elaborate a bit on how compiled queries [1] work in MemSQL? Am I correctly guessing that it completely ditches the traditional Volcano-style iterator model, in a manner similar to HyPer [2]?

[1] https://docs.memsql.com/v5.8/docs/code-generation

[2] http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
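For readers unfamiliar with the distinction being asked about, here's a toy contrast between the two models (pure illustration, not MemSQL's or HyPer's actual machinery; Python's `exec` stands in for real native code generation):

```python
def volcano_filter_scan(rows, pred):
    """Volcano style: rows pulled one at a time through per-operator calls."""
    for row in rows:            # each row passes through every operator
        if pred(row):
            yield row

def compile_filter_scan(pred_src):
    """'Compiled' style: fuse the whole pipeline into one generated loop."""
    code = (
        "def q(rows):\n"
        "    out = []\n"
        "    for row in rows:\n"
        f"        if {pred_src}:\n"
        "            out.append(row)\n"
        "    return out\n"
    )
    ns = {}
    exec(code, ns)              # a real engine would emit LLVM IR or C here
    return ns["q"]

rows = [{"a": i} for i in range(5)]
q = compile_filter_scan("row['a'] > 2")
# q(rows) produces the same result as the iterator version, but with the
# operator boundaries (and their per-row call overhead) compiled away.
```

The win in a real system is that predicate evaluation, projection, and aggregation collapse into one tight native loop instead of a chain of virtual `next()` calls per row.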

Looks like a sound design based on a cursory inspection :)

Question(s): Do you offer any appliances? The reason why I am asking is for computationally intense workloads where the same data may be shuffled around multiple times between processors. Can one e.g. set up MemSQL with RDMA over Infiniband?

MemSQL engineer here.

No, we do not offer appliances. We are a software only solution. I do not know of any deployments where RDMA is being utilized today. I'm interested in your use case. If you're so inclined, join chat.memsql.com (my UN is eklhad) and we can converse a bit more rapidly.

Thank you guys for your answers :)

I am charting the landscape of distributed database systems (federated and homogeneous). Node interconnectivity is just one of many potential bottlenecks.

With a sufficiently complex query, redistribution of data by hash must occur a number of times for linear scalability (based on my understanding). Ethernet-based interconnectivity typically suffers from high CPU utilization and various QoS issues for this particular use case. This also seems to apply to Ethernet-based fabric offerings, though I haven't kept up with that field for a couple of years.
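The redistribution step being described is essentially a hash shuffle: before a distributed join on some key, each node re-buckets its local rows by a hash of that key so matching keys land on the same node. A minimal sketch (the function name is made up):

```python
def shuffle_by_hash(partitions, key, num_nodes):
    """Redistribute rows across num_nodes buckets by hash of the join key."""
    buckets = [[] for _ in range(num_nodes)]
    for part in partitions:
        for row in part:
            # Rows with equal keys always hash to the same bucket,
            # regardless of which partition they started in.
            buckets[hash(row[key]) % num_nodes].append(row)
    return buckets

parts = [[{"k": 1}, {"k": 2}], [{"k": 1}, {"k": 3}]]
buckets = shuffle_by_hash(parts, "k", 2)
# Both k == 1 rows now share one bucket, so the join can proceed locally.
```

Every such step moves (in the worst case) the entire intermediate result over the network, which is why the interconnect matters so much for complex multi-join queries.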

If you guys are encountering performance issues connected to either RAM=>CPU loading or data redistribution between nodes, you may want to keep this in mind.

I may get in touch via chat at a later time as I'm slightly more than average interested in HPC database systems :) The more offerings, the better!

Check SnappyData?

Netezza but IBM bought it and killed it.


It's in-memory so you also need TBs of memory in order to be able to use it for analytical workloads.
