
Ex-Googlers Building Cloud Software That’s Almost Impossible to Take Down - eridal
http://www.wired.com/2014/07/cockroachdb/?r
======
eloff
So what's special about this database? What properties does it have that are
superior to the currently available alternatives? This article is all hype and
no substance.

EDIT: GitHub has more information. It's a scalable, ordered key-value store
(think a distributed version of Berkeley DB). Storage is based on RocksDB (a
variant of LevelDB) and consensus is achieved using Raft. The database is
written in Go. It's meant to "tolerate disk, machine, rack, and even
datacenter failures with minimal latency disruption and no manual
intervention."

What's not clear at all is where it sits CAP-wise. It says it's available and
strongly consistent, which would be CA, and that is not an option (especially
for something claiming to be failure tolerant). It must sacrifice either write
availability or consistency in the event of a partition. No idea which way it
leans.

~~~
jbooth
Why are they using RocksDB rather than LMDB?

If it's based on Raft, then it sacrifices availability if there isn't a
quorum of nodes online.
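
To make the quorum point concrete, here is a minimal Python sketch of the majority rule that Raft-style consensus relies on. The function names are hypothetical and this is not CockroachDB code; it only shows the arithmetic of when a group can still accept writes.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest strict majority of a cluster of the given size."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be positive")
    return cluster_size // 2 + 1


def can_commit(cluster_size: int, reachable: int) -> bool:
    """True if enough nodes are reachable to form a majority."""
    return reachable >= quorum_size(cluster_size)


# A 5-node group needs 3 nodes: it survives 2 failures,
# but the minority side of a 3/2 partition cannot commit writes.
print(quorum_size(5))    # 3
print(can_commit(5, 3))  # True
print(can_commit(5, 2))  # False
```

So in CAP terms, a minority partition stays consistent by refusing writes.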

~~~
peterwwillis
Probably because 1. LMDB is limited to logical address space, 2. it has one
big global lock, 3. it's a B-tree, and both of those contribute to the fact
that 4. LMDB is a read-oriented database (performance-wise). I would
conjecture that Rocks could also be 'more easily embeddable', but I'm talking
out my ass there :)

And yeah, you kind of have to sacrifice availability if you want to stay
consistent in the face of write skew...

~~~
jbooth
One big global write lock in LMDB maps pretty well to one single stream of
replicated log entries in Raft, IMO.

And logical address space is still far in excess of what most disks or arrays
can fit, right? 40 bits or so on Linux?

EDIT: 47 bits, for 128TB -- [http://stackoverflow.com/questions/2159456/whats-the-max-fil...](http://stackoverflow.com/questions/2159456/whats-the-max-file-mapping-size-in-64bits-machine)
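
The 47-bit figure can be checked with a couple of lines of Python. This is only a sketch of the arithmetic, using binary units (1 TiB = 2^40 bytes); the helper name is made up for illustration.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def addressable_tib(bits: int) -> int:
    """Bytes addressable with `bits` address bits, expressed in TiB."""
    return (2 ** bits) // TIB


print(addressable_tib(47))  # 128
```

So 47 bits of address space covers 128 TiB, which is the ceiling on what a single process could memory-map.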

~~~
peterwwillis
I don't see how? One global write lock means a single instance can't update
multiple ranges at a time, so determining consensus and writing from multiple
peers would just take a long time for no reason. The whole point of an SSI
MVCC is to get around difficult locks....

If Moore's law holds, a single SSD will outgrow the address space in around 7
years. In four years, an array of eight disks would outgrow the address space.
This is just for a single server. If you want a linearly-scaling, robust
solution for _future requirements_ (like multi-petabyte and exabyte
distributed datastores), there's no reason to lock yourself into technology
that'll be obsolete in half a decade.

(edit: SanDisk says it may release 8TB SSDs next year, also adding "We see
reaching the 4TB mark as really just the beginning and expect to continue
doubling the capacity every year or two, far outpacing the growth for
traditional HDDs")
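
A rough sketch of the doubling arithmetic behind those estimates, in Python. The 1 TB starting capacity and the one-year doubling period are assumptions chosen to reproduce the figures above, not numbers stated in the thread; the function name is hypothetical.

```python
import math


def years_to_reach(limit_tb: float, start_tb: float,
                   years_per_doubling: float = 1.0) -> float:
    """Years until capacity doubling from start_tb reaches limit_tb."""
    if start_tb <= 0 or limit_tb <= 0:
        raise ValueError("capacities must be positive")
    if start_tb >= limit_tb:
        return 0.0
    doublings = math.log2(limit_tb / start_tb)
    return doublings * years_per_doubling


ADDRESS_SPACE_TB = 128  # the 47-bit limit discussed above

# Assumed: one 1 TB SSD, capacity doubling every year.
print(years_to_reach(ADDRESS_SPACE_TB, 1))      # 7.0
# Assumed: an array of eight such disks (8 TB total).
print(years_to_reach(ADDRESS_SPACE_TB, 8))      # 4.0
```

With a slower doubling period (say every two years) the same function gives proportionally longer horizons, so the conclusion depends heavily on the growth rate you assume.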

~~~
t1m
IIRC, current x86-64 chips are limited to 48 bits of virtual address space to
simplify the address translation logic (cheaper to manufacture).

This makes sense for the current generation of storage subsystems, though it
would be misleading to say memory-mapping technology will be "obsolete in
half a decade". The 48-bit limit is arbitrary. Manufacturers have 56-bit
designs on the table right now, and there is nothing stopping them from
implementing full 64-bit virtual address support.

------
Goranek
Just noticed that it's written in Go. Lately more and more databases are
written in Go.

------
mathattack
I see a headline like that thinking someone will take it as a dare. Of course
the source is prone to overhype, so it's worth a grain (or twenty) of salt.

~~~
jahewson
You got your idiom backwards: more salt = more meat and substance in the
story, less salt = less substance.

~~~
jcbrand
That idiom doesn't work the way you think it works :)
[https://en.wikipedia.org/wiki/Grain_of_salt](https://en.wikipedia.org/wiki/Grain_of_salt)

~~~
mathattack
I like the reference. :-) And I've been misusing the idiom for at least 10
years without correction!

------
jflowers45
I like the name: "CockroachDB" - and find it interesting that a bunch of the
guys are working on it while they also work at Square - and also that it's
supposedly based off a Google research paper for "Spanner" which I hadn't
previously read about. Lot of good nuggets here.

------
grogenaut
Why, when I see a Wired article title, do I immediately assume that it is
going to be completely untrue?

------
curiousDog
Spanner truly is remarkable. I've always wondered why Google hasn't opened it
up!

~~~
outside1234
Google does open source when it benefits them. Giving everyone else Spanner
doesn't benefit them.

~~~
cle
It wouldn't benefit anyone. Spanner is most likely tightly coupled with their
internal services and infrastructure.

If you want to know how Spanner works, read the paper. If you want to use it
yourself, you'll need to build it on top of your own infrastructure, just like
Google did.

------
na85
Personally I don't think takedowns are the biggest threat facing cloud users.
By far the larger threat is having your cloud data harvested and used against
you by adversaries such as advertising firms and three-letter agencies.

~~~
lern_too_spel
Neither of those is a realistic problem. (How and why would an advertising
firm use your data against you, anyway?)

Servers go down every day, and data centers go down every month. This project
solves a real problem.

~~~
na85
Targeting me with ads is absolutely an adversarial situation. I and most of
the population would rather not see ads at all. I do not want advertising
agencies building a profile of me so they can sell me stuff. Do you?

Maybe you'll come back with a completely-bogus, typical hackernews-ish
response, deluding yourself, saying that yes you like it when advertisers
target you based on the data they collect about you because of reasons X Y and
Z, something something "better for me". Ads targeted at you do not benefit you
in any way unless you work for an advertising firm.

And are you seriously arguing that the NSA et al. do not peek into cloud
storage? Have you been living under a rock?

~~~
cwyers
Most of the population seems to have determined that, given the tradeoff
between "seeing ads" and "paying more for content," they are more willing to
do the former than the latter.

~~~
icebraining
Given current transaction costs, yes, but we don't really know what would
happen if these were lower (both in money and hassle).

In any case, it's not every day people are given the choice. There's probably
a handful of sites you can pay to remove ads, and the cost is usually much
higher than the value the user would have provided in ad revenue.

