
Avoid Vendor Lock-In Risk with Multi-Cloud Deployments - nate_stewart
https://www.cockroachlabs.com/blog/gs-response/
======
daxorid
I can't find any documentation on how, exactly, ACID is achieved across all
these partitions. It's impossible to reason about performance characteristics
in the absence of documentation.

For example, if they use two-phase commit, what is the granularity of locks
(table, row, etc.) during an open transaction? Table locks during two-phase
commit with, say, 50ms coast-to-coast latency would be an _immediate_
deal-killer.

Or, if it's eventually consistent, as most asynchronously replicated
distributed DBs are, they can't claim the A and C in ACID.

I really wish Cockroach had better documentation - not so much how to use it,
but how it works so as to let engineers reason about performance
characteristics.

~~~
knz42
(working for cockroach labs)

As explained in the blog (as linked by another response) and the documentation
([https://www.cockroachlabs.com/docs/stable/strong-consistency.html](https://www.cockroachlabs.com/docs/stable/strong-consistency.html)
and
[https://www.cockroachlabs.com/docs/stable/distributed-transactions.html](https://www.cockroachlabs.com/docs/stable/distributed-transactions.html)),
CockroachDB does not use locks, and write intents in the concurrency control
scheme are expressed per K/V pair, i.e. per table cell.
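To make the granularity point concrete, here is a toy sketch of why per-key
write intents conflict far less often than table-wide locks. This is an
illustration only, not CockroachDB's actual concurrency control; the class
and method names are invented.

```python
# Toy model of per-key write intents: a writer conflicts only with another
# transaction holding an intent on the *same* key, never on the whole table.
class IntentStore:
    def __init__(self):
        self.data = {}     # key -> latest value
        self.intents = {}  # key -> id of the txn holding an intent on it

    def write(self, txn, key, value):
        holder = self.intents.get(key)
        if holder is not None and holder != txn:
            raise RuntimeError(f"write-write conflict on {key} with {holder}")
        self.intents[key] = txn
        self.data[key] = value

    def commit(self, txn):
        # Resolve (drop) all intents belonging to this transaction.
        self.intents = {k: t for k, t in self.intents.items() if t != txn}

store = IntentStore()
store.write("txn-1", "accounts/1/balance", 100)
store.write("txn-2", "accounts/2/balance", 200)  # distinct key: no conflict
store.commit("txn-1")
```

With a table-wide lock, the second `write` above would have blocked behind
the first transaction; with per-key intents it proceeds immediately.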

Meanwhile, I acknowledge that CockroachDB does not yet convey its performance
model well enough for engineers to get an intuition about it. There is
documentation, but it's either too detailed (like the design doc) or too
high-level (most of the docs site). This needs a new sort of documentation,
which we are working on.

~~~
eternalban
From the first link:

> Employs Raft, a popular successor to Paxos.

I don't believe that is correct, and it is not a good meme to promote.
"Successor" implies the 'deprecation' of Paxos, which is certainly not the
case. For the architectural sweet spot where both protocols can be used, it
is fair to say Raft provides the more accessible approach, and that only for
those who insist on rolling their own consensus mechanism.

Paxos and Raft serve distinct architectural concerns in the shared context of
providing a distributed state with formally defined consistency guarantees.
And to the point, it is entirely reasonable to see a distributed system that
utilizes both to address system component specific concerns.

CockroachDB, ironically (in the context of your OP), could in fact benefit
from vendor lock-in, were the candidate cloud provider to offer atomic clocks
and GPS cards in its instances.

~~~
ddorian43
What are the GPS cards for?

~~~
infogulch
A lot of synchronization problems are solved with access to a very accurate
synchronized clock. E.g. with it you can determine a reliable "happens-before"
relationship between all in-flight transactions, among other useful things.

Literally all GPS does is provide a way to determine the time very accurately
in relation to a bunch of synchronized atomic clocks in orbit. (As a side
benefit, knowing the time like this will also give you your position.)

~~~
phamilton
To expand:

Windows for concurrent operations are bounded by clock uncertainty. If you say
you did something at 2:00 and I say I did something at 3:00, you probably did
your thing before me. But if it's 1:57 vs 1:59, there's a strong possibility
my clock is fast or yours is slow. If I know both our clocks are within 5
seconds of the correct time, then I can say your action happened before mine.
So accurate clocks allow you to provide ordering to more events.
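The "within 5 seconds" reasoning above can be sketched directly: a
happens-before claim is only safe when the two timestamps' uncertainty
intervals do not overlap. Timestamps are seconds since midnight and the
function name is invented for illustration.

```python
def happens_before(ts_a, ts_b, eps):
    """True only if A's uncertainty interval [ts_a - eps, ts_a + eps]
    ends strictly before B's begins, so A definitely preceded B."""
    return ts_a + eps < ts_b - eps

EPS = 5.0  # each clock is known to be within 5 seconds of true time

print(happens_before(7200.0, 10800.0, EPS))  # 2:00 vs 3:00  -> True
print(happens_before(7020.0, 7140.0, EPS))   # 1:57 vs 1:59  -> True
print(happens_before(7020.0, 7023.0, EPS))   # 3s apart, intervals overlap -> False
```

The smaller `eps` is, the more pairs of events can be ordered without
falling into the ambiguous overlapping case.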

Additionally, if you do identify operations as concurrent, you can deal with
them explicitly. Dealing with them is doable, but expensive. If your clocks
are in sync within a small margin of error, the rate at which you have to do
expensive conflict resolution is low.

Google's Spanner database uses atomic clocks to keep max clock skew to 7ms.
That eliminates most concurrent conflicts, and dealing with the rest is fast
because conflicts can be recognized quickly. Basically, it waits 7ms on all
writes to see if any other nodes have a conflicting write.

CockroachDB can operate in "spanner mode", but without atomic clocks it uses a
max clock skew of 250ms. This is a much bigger window and identifying
conflicts on write takes a lot longer.

CockroachDB recommends that you don't run in "spanner mode" because of the
performance hit. Instead, the default mode does lazy conflict detection, where
instead of waiting after all writes, it will sometimes wait to read data until
the clock skew window has passed.
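A minimal sketch of the two strategies described above: commit-wait on every
write ("spanner mode") versus waiting only on reads that land inside a recent
write's uncertainty window. This is a simplification with invented function
names, not CockroachDB's actual code; only the 250ms default max offset comes
from the discussion above.

```python
import time

MAX_CLOCK_SKEW = 0.250  # CockroachDB's default max clock offset: 250ms

def write_commit_wait(store, key, value):
    """'Spanner mode': every write waits out the whole uncertainty
    window before acknowledging, so later readers never have to."""
    store[key] = (value, time.monotonic())
    time.sleep(MAX_CLOCK_SKEW)

def write_lazy(store, key, value):
    """Default mode: record the write timestamp and return immediately."""
    store[key] = (value, time.monotonic())

def read_lazy(store, key):
    """Lazy conflict detection: only a read inside a recent write's
    uncertainty window pays the wait; most reads return immediately."""
    value, ts = store[key]
    remaining = (ts + MAX_CLOCK_SKEW) - time.monotonic()
    if remaining > 0:
        time.sleep(remaining)
    return value

store = {}
write_lazy(store, "k", "v")
print(read_lazy(store, "k"))  # waits at most 250ms, then prints "v"
```

The trade-off is visible in where the `sleep` sits: commit-wait charges the
skew to every write, lazy detection charges it only to the unlucky reads.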

Here's a great read on the topic:

[https://www.cockroachlabs.com/blog/living-without-atomic-clocks/](https://www.cockroachlabs.com/blog/living-without-atomic-clocks/)

------
tyingq
I'm not convinced this helps much with lock-in. Assuming you want to do things
like deployments, scheduled backups, monitoring, etc., you're either:

a) Using "lock in" features like CloudFormation, Lambda, CloudWatch, etc., to
do so, such that you use some of the "value" of the cloud.

==or==

b) Rolling your own solutions for these peripheral functions (deployment,
backup, monitoring, etc) and running them on the bare VPS instances. In which
case...why are you on this expensive cloud? You can get a VPS with lower
bandwidth costs elsewhere.

Basically being sort of "half in" on the cloud, where you use it as a
glorified VPS makes no sense to me. Either be all the way in, or all the way
out.

~~~
jchw
Alternatively, you can use tools like Kubernetes and Terraform as abstractions
around cloud environments while still being able to utilize them fairly
effectively. Kubernetes is a clever product for Google to release, because it
makes migrating loads to GCP much easier if you already run them in
Kubernetes.

~~~
tyingq
Similar decision for me though. If you want k8s, use Google or Azure, where
they take care of it for you.

K8s on AWS, at least right now, means you are in the business of being a
plumber. For example, if I remember right, the solid k8s ingress controller
option means ELB, which has issues like the need for cache warming. ALB is
better, but has only bleeding-edge support as an ingress controller in k8s.
The same goes for getting k8s installed on AWS in the first place.

Rackspace or Digital Ocean or similar is a cheaper better place if you want to
roll your own k8s solution.

~~~
jchw
I'm hosting K8S on AWS in production. Kops takes care of most of the details,
though I _do_ understand the low level details of what's going on. Gotta be
able to debug somehow.

Google Cloud Platform is definitely cheaper than AWS, and their hosting of
Kubernetes seals the deal for me. The only problem? Legacy. If I want to link
two virtual machines across cloud environments, it introduces somewhat
unacceptable latency, and throughput will be less reliable. So connecting to
things like extremely busy Redis or AMQP clusters is not really as safe and
downtime is more likely.

Still, I don't feel like Kubernetes is terrible on AWS. It works fairly well.
I'm not using Ingress resources right now, just a bunch of services with their
own ELBs.

~~~
tyingq
Sure, I concede it can run well. It just feels like renting a furnished house
and storing the supplied furniture in the basement so you can use your own.

Why not just rent a cheaper unfurnished home if you insist on bringing your
own...

~~~
jchw
I'm not saying I entirely disagree. I'm mostly suggesting that moving toward
systems like Kubernetes and Terraform help you to reduce lock-in so that you
can pick the best tool for the job. 7 years ago, it made sense to use AWS and
not look back. But I'm sitting here now, and a lot of software in our stack is
pretty tied to AWS when I'd much rather use GCP.

The furniture analogy is a bit flawed. I can get comfortable in a new,
furnished house, but furnishings don't come with vendor lock-in. Cloud
services naturally do, at least the highly proprietary ones. I'm not saying
put the furniture in the basement, I'm just saying throw a slipcover over it
so we don't touch it directly. :)

DigitalOcean meanwhile is a lot more limited. The new stuff they've added,
like firewalls and load balancers, is nice, but the tooling on AWS is more
complete, and I can utilize a fair bit of that tooling from within Kubernetes,
including things like Amazon's certificate provisioning.

Basically, to be clear: embracing GCP or Azure might seem like a solid plan
now, but years down the road you might want some more mobility. If you're
already using things you can bring with you to the next provider, you're
going to be ahead of the curve.

------
sengork
For anyone interested in multi-cloud deployments at the IaaS layer, I
recommend looking at [https://libcloud.apache.org/](https://libcloud.apache.org/)

------
caleblloyd
I think this is a poorly written blog post that completely ignores the
challenges and difficulties that arise when going multi-cloud. Multi-cloud
introduces huge amounts of latency, requires additional load balancing and
routing logic, requires twice as much administration, etc.

Multi-cloud is a huge decision that needs lots of planning and
application-level support. I'd go so far as to say it is an order of magnitude
easier to design an application that can be moved between clouds than it is
to design a multi-cloud application. That still defeats vendor lock-in.

I think CockroachDB is interesting tech, but this blog post reads like a
pretty empty marketing piece.

~~~
jchanimal
It’s true that operating your own cluster (of any technology) spanning cloud
providers is not for the faint of heart. However, at FaunaDB we’ve found
performance of multi-cloud globally consistent operations to be acceptable. We
currently run across AWS and GCP, with Azure on the way, and we have customers
running their own clusters in hybrid configurations.

------
meddlepal
Is cloud vendor lock-in a serious concern for most businesses? It always
feels like a requirement that originates from the technical side rather than
the business side, but it never feels like it's actually that important.

~~~
scaryclam
We have a bunch of work that requires large, process-heavy calculations, so
yes, vendor lock-in is a concern for us: it can mean either huge cost
increases, if the vendor raises pricing for the required instances, or huge
savings, if a competitor can do the same computations for less. We solve this
problem by making sure we can trash servers and spin them up elsewhere if
need be (mostly via Terraform etc.).

I don't think this should apply to anyone running some hosts for web
applications though.

~~~
movedx
Just out of interest, when have AWS increased their prices?

~~~
manigandham
Never.

------
ddorian43
Still waiting for some benchmarks by the team.

What I would be curious about is how much faster the DB would be if it were
written in C++/Rust instead of Java/Go. I know Bigtable claims 3x faster than
HBase, Scylla claims ~10x faster than Cassandra (though also a different
architecture), Trinity ~2x compared to Lucene, etc.

Also: incoming comments about the name of the db.

~~~
knz42
Note that the underlying data store where a lot of the byte churning happens
is RocksDB, written in C++. Also the Go/C++ interface in CockroachDB is coarse
to reduce overheads.

~~~
ddorian43
Given that, I would be curious about RocksDB vs LMDB. There was an issue
about this but it didn't progress:
[https://github.com/cockroachdb/cockroach/issues/14#issuecomm...](https://github.com/cockroachdb/cockroach/issues/14#issuecomment-41171979)

For some reason(s) nobody is using LMDB in distributed DBs (I only know of
ActorDB), while for single-node databases there are many cases that have
switched and been better off.

------
ycmimi
It's like operating systems. In the end, you will have a limited number of
options: AWS, GCP, and Azure, like Windows, Mac, and Linux.

~~~
JohnnyConatus
Yup. And the amount of work it takes to be competent in just one (such as
AWS) makes it nearly impossible, or not worth it, for most people to run
cross-cloud.

------
mslate
Haha cloud lock-in is what they think I'm worried about?

~~~
TruthSHIFT
So... What are you worried about?

~~~
mslate
Database lock-in. Guess that was not clear enough.

~~~
andreimatei1
FWIW, CRDB should be a good fit for people worried about database lock-in.
You'd only be "locked in" if you discover that we're much better than the
alternatives :) CRDB uses the standard SQL language. Also, CRDB speaks the
PostgreSQL wire protocol; to use CRDB you'd be using the Postgres drivers
for different languages. So, more or less, any application that can work with
us should also work with Postgres (and vice versa). Moreover, the Postgres
drivers generally sit behind other higher-level interfaces (ODBC, JDBC), so,
more or less, a CRDB application should also work with any other established
SQL database (and vice versa). CRDB can also export your data in formats
easily importable by other SQL databases.
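As a concrete illustration of the wire-compatibility point: switching between
CockroachDB and Postgres can be as small as a connection-string change, using
an unmodified Postgres driver. The hostnames and credentials below are
hypothetical; 26257 is CockroachDB's conventional SQL port.

```python
# Same stock PostgreSQL driver for both databases; only the DSN differs.
# Hostnames and credentials here are made up for illustration.
CRDB_DSN = "postgresql://app@crdb.example.com:26257/bank?sslmode=require"
PG_DSN = "postgresql://app@pg.example.com:5432/bank?sslmode=require"

def connect(dsn):
    # psycopg2 is the ordinary Postgres driver; no CockroachDB-specific code.
    import psycopg2
    return psycopg2.connect(dsn)
```

An application written against `connect(CRDB_DSN)` should, per the comment
above, also work pointed at `PG_DSN`, modulo feature differences.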

