
Hello World - francesca
http://www.cockroachlabs.com/blog/hello-world/
======
mackwic
The design page is the most interesting place to look at:
[https://github.com/cockroachdb/cockroach/blob/master/docs/de...](https://github.com/cockroachdb/cockroach/blob/master/docs/design.md)

~~~
mattikus
Also worth mentioning is the excellent talk[0] given by Tobias Schottdorf at
FOSDEM 2015 about the high-level design and impetus behind CockroachDB.

[0]:
[https://www.youtube.com/watch?v=ndKj77VW2eM](https://www.youtube.com/watch?v=ndKj77VW2eM)

------
marknadal
I'm the author of a distributed database, competing with them. Overall I'm
refreshingly impressed with their design document. I am sad to say that most
other databases don't come anywhere near thinking these things through, except
as an afterthought, so I am glad to see they are making it their priority.

With that said, they seem to be assuming that their clock skew (ε) has a fixed
maximum boundary which is incredibly disconcerting to me as it implies that in
certain (rare and anomalous) network partitions that they'll get data
corruption and fail.

I can see how they, coming from a Spanner background with atomic clocks, might
assume this. But this assumption requires that their database cluster is
always connected, within some heartbeat interval (which they mention) such
that they can trust there exists a maximum bounded ε skew.

So while it seems like a dumb question, I honestly must ask a very trivial
question: how does CockroachDB handle basic network partitions? I assume they
have a good answer to this, but it needs to be clarified in order to answer
the more important issue of anomalous partitions, like split brain. This might
rip the cockroach in half, quite literally, meaning that all other
"guarantees" they give get thrown out the window like linearizability and
global consistency.
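
A minimal sketch of the concern, in Python. It assumes a hypothetical fixed bound `MAX_SKEW` on the offset between any two node clocks (the names and the 250 ms value are illustrative, not taken from CockroachDB). Two timestamps from different nodes can be ordered by clock alone only if they differ by more than that bound; anything closer has to be treated as uncertain. If a partitioned node drifts past the bound, the same check quietly returns wrong answers.

```python
from enum import Enum

# Hypothetical maximum clock skew (epsilon) in seconds; an assumption for illustration.
MAX_SKEW = 0.250


class Order(Enum):
    BEFORE = "before"
    AFTER = "after"
    UNCERTAIN = "uncertain"


def compare_timestamps(a: float, b: float, max_skew: float = MAX_SKEW) -> Order:
    """Order two wall-clock timestamps taken on different nodes.

    The result is only trustworthy if the offset between any two node
    clocks really is at most max_skew.
    """
    if abs(a - b) <= max_skew:
        # Too close together: clock error alone could explain the difference.
        return Order.UNCERTAIN
    return Order.BEFORE if a < b else Order.AFTER
```

With the assumed bound, `compare_timestamps(10.0, 10.1)` is uncertain, while `compare_timestamps(10.0, 11.0)` can be ordered safely.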

~~~
ansible
I believe that if a node in the consensus group exceeds ε clock skew, it will
be kicked out of the group.

As far as network partitions go, a consensus must exist for reads or writes.
If you don't have 3 out of 5 working correctly and talking to each other, then
you are down.
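
The "3 out of 5" rule can be sketched in a few lines of Python. This is only an illustration of a strict-majority check; `has_quorum` is a hypothetical helper, not part of CockroachDB or any Raft library.

```python
def has_quorum(reachable_replicas: int, total_replicas: int) -> bool:
    """Return True if a strict majority of replicas can talk to each other."""
    return reachable_replicas > total_replicas // 2


def serving_sides(partition_sizes: list[int]) -> list[bool]:
    """For each side of a network partition, report whether it may keep serving."""
    total = sum(partition_sizes)
    return [has_quorum(size, total) for size in partition_sizes]
```

For a 5-replica group split 3/2, `serving_sides([3, 2])` lets only the side with 3 replicas continue. An even 2/2 split of a 4-replica group leaves neither side with a majority, so a split brain cannot have two sides that both accept writes.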

~~~
tschottdorf
That's correct, it's a consistent system and so the majority needs to be
involved on mutating writes. Reads typically can read from one designated copy
of the replica directly (bypassing Raft).

------
patorjk
I know this is a small gripe, but the name of the company (even though it
makes sense) makes my skin crawl a little. The name does stick in my memory
though, so I'm not sure if this is a good thing or a bad thing.

~~~
steve918
Let's be frank, it's a terrible, horrible name.

~~~
matt4077
A counterintuitive name can be an excellent choice. Who would'a thunk you
could name an airline "Virgin" and get it off the ground?

~~~
vijayr
That case is different. Virgin started with music records and was already a
huge brand by the time they launched the airline. They launched tons of
products and all of them were called Virgin (mobile, cola etc).

Maybe a better question would be "Who would'a thunk you could name a record
label 'Virgin' and get it off the ground?"

------
AgentME
I love hearing about a database that decides to focus on replication and
self-healing! It drives me nuts how most databases implement a data store
and then leave all the complexities of sharding and replication as an exercise
to the reader, who is busy trying to get other things done.

I've been looking for a database which does sharding and replication
automatically and without throwing away any focus on consistency and
transactions, so I figure I'm likely to use this in the future. I've struggled
to try to find any others meeting these criteria.

~~~
ploxiln
There's a number of datastores that claim to shard and replicate
automatically, with no worries for dev/ops/devops.

They've been lying. Never trust them.

Some datastores can actually do this, but performance per beefy server is less
than you'd expect. You can use Riak but you have to write proper CRDTs. You
can use zookeeper or etcd but those are for small amounts of configuration
data, not for large amounts of customer data.

For all the datastores that claim to do everything automatically and have
great performance, we can thank Aphyr for providing the proof that they don't
live up to their promises, while we just suspect they don't.

I'd suggest trying to use a simpler model, and understand and accept its
failure modes. Maybe your app has to go into read-only mode for a few hours if
there's a server failure, etc.

~~~
AgentME
>I'd suggest trying to use a simpler model, and understand and accept its
failure modes. Maybe your app has to go into read-only mode for a few hours if
there's a server failure, etc.

I'm fine with failure modes like that. I just want it to be automated. I don't
want to come home from a trip and find that my database master has fallen over
and the database slave has been patiently waiting for me to manually promote
it for the last few days. I could probably rig up some cron jobs and shell
scripts to automate this, but this is exactly the kind of thing I want
something else to do for me, hopefully written by people smarter than me.

~~~
toomuchtodo
RDS MySQL does this, with a hot standby master and 60 second failover.
Replicas too. Unless you were looking for free. That's a bit harder.

------
eis

      "Today, we’re launching CockroachDB for everyone. Use it. Build on it. Contribute to it!"
    

Does this mean it's more or less ready? The status in Github hasn't been
updated in quite a while and lists it as Alpha with important parts like raft
consensus still missing.

Can someone (preferably from the team) clarify the current situation?

PS.: CockroachDB is the only distributed DB that I would bet on going forward
and being a solid base for a big distributed DB.

~~~
tschottdorf
not quite ready yet, but the pace has picked up dramatically. We've begun work
on the structured data layer and are whipping up a suite of extensive
acceptance tests (load testing, performance metrics, ...) to iron out all of
the performance issues/bugs that we don't want to be a part of the beta. Raft
consensus, btw, is already implemented. We'll update the README shortly to
give a more concrete estimate of the situation.

~~~
eis
Thanks for the answer, I appreciate all the hard work that is being done on
the project.

So since you guys started working on the structured data layer, does that mean
CockroachDB is going into Beta?

I can't wait to get my hands on a test setup even though it's probably going
to take a long time before I can deploy it into production.

~~~
tschottdorf
The plan is to get out of alpha as soon as possible. I'll leave it to the
founders to throw dates around but we're working hard on getting the technical
core on solid ground, and all the auxiliary stuff (UI, deployment, ...)
required for beta is getting a lot of attention. We'll have hands-on
deployment demos soon and if you follow the project in the coming weeks you'll
probably get a good idea of where things are going.

------
curiousDog
I've been following the development of this project from the beginning and it
has been very interesting to see how they've productized it. IIRC, they all
used to work at Square (and before that in a startup called viewfinder) and
started it on a hackweek.

~~~
brown9-2
They also invented GIMP.

------
dreamdu5t
How does Cockroach Labs plan on making money, with regards to the free
license? Do you plan on providing hosting and support?

------
Mahn
It's worth noting that they are backed by Google Ventures apparently:
[https://twitter.com/GoogleVentures/status/606534505332154369](https://twitter.com/GoogleVentures/status/606534505332154369)

~~~
justincormack
"Aside from Benchmark, Google Ventures, Sequoia Capital, and FirstMark Capital
also participated in the round."

[1] [http://venturebeat.com/2015/06/04/peter-fentons-latest-inves...](http://venturebeat.com/2015/06/04/peter-fentons-latest-investment-is-a-database-startup-called-cockroach/)

------
smrtinsert
The Team page looks like a Silicon Valley promo.

------
tzury
Using a bug name for a software product is an interesting strategy.

~~~
BasDirks
Not unusual:

Ant: [http://ant.apache.org](http://ant.apache.org)

Wasp: [https://www.waspbarcode.com](https://www.waspbarcode.com)

Hornet: [https://www.npmjs.com/package/hornet](https://www.npmjs.com/package/hornet)

~~~
gauravagarwalr
Some of them are the good kind of bugs!

------
exacube
Man, these guys have a lot of stellar engineers with really strong
backgrounds. They also have "free fridays".. which is crazy.

~~~
brazzledazzle
What's "free fridays"?

~~~
tuananh
Sounds like you don't have to work on Friday.

------
EliRivers
Opening line: _Databases are the beating heart of every business in the world_

Well that's not remotely true, is it? Not even close. Is it really a good idea
to lead with something so obviously untrue? If you're trying to convince me of
something (i.e. that this product is good), putting such a jarring, obvious
falsehood right at the start is a bad idea. I'm wondering if they're
deliberately spoofing their own seriousness, but I see nothing else in there
to support that.

~~~
jnpatel
This line didn't bug me so much. Would you have been okay with something
subtly milder? e.g. Databases are _at_ the beating heart of every business in
the world

or even: Data is the beating heart of every business in the world.

~~~
EliRivers
Depending on how generous a reader is feeling, either still not true or a
massive hyperbolic exaggeration that stretches the word "database" far beyond
its actual definition; I'd wonder (in fact, I actually am wondering this) if
they were living in some kind of bubble.

------
mbell
> Cockroach is a distributed key:value datastore _(SQL and structured data
> layers of cockroach have yet to be defined)_ \- emphasis mine

I guess this is interesting, but distributed hard consistency pure K-V stores
have been done before, Zookeeper, etcd, etc. It seems like the vast majority
of the hard work is left to do. I don't want to get into naming arguments, but
I wouldn't really call this a 'database' yet. It doesn't sound like you can do
anything but a key lookup or range query currently, which is incredibly
limiting for most real world applications.

I somewhat question the approach. e.g. why not figure out the hard part first?
i.e. build the `SQL and data layers` on top of zookeeper or etcd then replace
the backend to scale better? I would think this would get a lot more early
adopters. As is, it's a very niche usage case that the alpha fills.

~~~
lobster_johnson
I suspect they _have_ figured out the hard part.

If you look at the documentation (eg., [1]), the design has been rather
carefully thought out; it's just that they're implementing it from the bottom
up.

According to their roadmap [2], they're aiming for KV functionality in 1.0 and
aren't aiming for SQL until past version 2.0 (it's currently alpha).

Given the backgrounds of the technical people involved (including Google, as
this project is inspired by Spanner), they should have a lot of experience
with what they're trying to accomplish.

As for "done before", a core feature of Cockroach is true ACID transaction
support, including snapshot isolation, something no distributed NoSQL database
I know about supports. (ArangoDB does support transaction, but is mostly NoSQL
in the sense of implementing a different query language than SQL.)

[1]
[https://github.com/cockroachdb/cockroach/blob/master/docs/de...](https://github.com/cockroachdb/cockroach/blob/master/docs/design.md)

[2]
[https://github.com/cockroachdb/cockroach/wiki/Roadmap](https://github.com/cockroachdb/cockroach/wiki/Roadmap)

~~~
mbell
> As for "done before", a core feature of Cockroach is true ACID transaction
> support, including snapshot isolation, something no distributed NoSQL
> database I know about supports.

Zookeeper has ACID transactions which I believe are linearizable (which trumps
SI). The downside is the memory only working set, but given how cheap memory
is, I'd still rather have a memory only Zookeeper with a rich query interface
than a large storage data KV store with minimal query interface.

> ArangoDB does support transaction, but is mostly NoSQL in the sense of
> implementing a different query language than SQL

What is your definition of NoSQL?

~~~
lobster_johnson
ZooKeeper is not a general-purpose database. I haven't heard of anyone using it
as one, either.

> What is your definition of NoSQL?

I don't have one, and I think the term isn't terribly useful. But the whole
idea of NoSQL started as an attempt to break free of the _relational_ aspect
of SQL, because things like joins, strict schemas, foreign keys, and
normalization were perceived as getting in the way of distribution. ArangoDB
supports joins (but not foreign keys, because it's schemaless) and an SQL-like
query language, which makes it a lot closer to an SQL database than something
like Redis or Cassandra.

------
danmaz74
Great storytelling, accompanied by a call to action at the end... but right
there at the end a big bold button (or link) is missing, you need to figure
out that the tech details are from the menu. Make it simpler for the reader!

------
mindstab
[https://github.com/cockroachdb/cockroach](https://github.com/cockroachdb/cockroach)

"The highest level of abstraction is the SQL layer (currently not
implemented)."

------
ilya-pi
Would love to see how it performs against Jepsen

------
rudiger
Arguably, the highest level of abstraction will be in the query language
exposed to applications. Will it support SQL?

~~~
mattparlane
From here [0]: "The highest level of abstraction is the SQL layer (currently
not implemented)"

[0]
[https://github.com/cockroachdb/cockroach#architecture](https://github.com/cockroachdb/cockroach#architecture)

------
gauravagarwalr
As a consultant, how am I to ever push for this technology to clients and
their developers?!

~~~
chubs
Just say 'I recommend we use this database from a bunch of ex-Googlers'.

------
paulkonp
Well, at least users don't have to worry about open-source selling out...or do
they?

~~~
andybons
[http://readwrite.com/2015/03/25/apple-foundationdb-github-cl...](http://readwrite.com/2015/03/25/apple-foundationdb-github-closed-source)

~~~
misframer
On that note, it's quite interesting how similar this is to FoundationDB.

------
zenogais
I've been following this project for well over a year now. It's come a long
way, has a long way to go still, but it's pretty exciting as an alternative to
weak consistency stores available now.

~~~
themartorana
Agreed. I'm in the middle of implementing one of the lesser DBs, and have all
of the engineering ahead of me that requires. Unfortunately this doesn't look
smart until 2.0, which is probably years away. Too long to wait for.

I can always scan-and-switch when it's ready.

------
noir-york
Looks awesome! Can't wait for the alpha.

------
posnet
I would be interested to see how it performs against the YCSB+T benchmark.
Many claim SSI but very few achieve it.

------
liyanage
I'm very happy to see that this is released under the Apache license.

------
jmspring
"Fast forward to the mid-2000s, when Google ushered in the NoSQL movement."
... Hubris? Fanboyism? Or not knowing history?

[http://en.wikipedia.org/wiki/NoSQL](http://en.wikipedia.org/wiki/NoSQL)

~~~
bpicolo
They're referring to
[http://en.wikipedia.org/wiki/BigTable](http://en.wikipedia.org/wiki/BigTable)

~~~
jmspring
But google ushered in the era of NoSQL? Seems a bit of hyperbole.

~~~
dmayle
Except it's not. Strozzi used the term to refer to a relational database that
didn't support SQL commands; the first modern usage of the term NoSQL was to
describe the slew of databases that copied Google's Bigtable approach
(CouchDB et al).

------
bra-ket
does it understand SQL?

------
tantalor
Proper title, please.

------
panon
This is what my brain messaged to me:
[https://xkcd.com/927/](https://xkcd.com/927/)

~~~
AgentME
Overuse of that comic is a pet peeve of mine. That comic is about standards,
whose entire point is for everyone to agree on them. That doesn't apply to all
products in general! The point of a database isn't for everyone to use the
same product. It's to store data. It's not ironic in any way that there are
multiple competing database products! It's only standards that have any irony
in that fact!

------
brento
I'm getting a 503

------
Rainymood
When I click on the image I am taken to this page. There seems to be no way to
go back to the main page ...

[https://i.imgur.com/aZPWk6Q.png](https://i.imgur.com/aZPWk6Q.png)

~~~
rexbee
Click the back button in your browser

~~~
Rainymood
Just thought it was quite bad UX design not letting me click the background or
adding an X somewhere ... oh well.

~~~
acloudbuster
That's just an image, though. It's not actually a page. And since you're on
Firefox, the image is centered with a dark grey background, which imitates the
common UI modal design pattern... but it's not.

Now why they would think to make that image a clickable link is beyond me...

~~~
cdcarter
It's a Wordpress default that images are links to themselves.

Why it's a Wordpress default is beyond me...

