
Global Multi-Cloud Replication in FaunaDB Serverless Cloud - apervez82
https://fauna.com/blog/global-multi-cloud-replication-in-faunadb-serverless-cloud
======
marknadal
Sigh, the contrast between the current CockroachDB submission and this
FaunaDB one is perfect.

\- CockroachDB starts with an image explaining how their query engine handles
requests.

\- FaunaDB starts with saying they are the only multi-master cloud database.

\- CockroachDB then spends the next 3K words explaining how and why.

\- FaunaDB claims that others are just cross "continental" systems, and that
they are the only "global" ones, with no reasoning to justify the claims.

Yes, FaunaDB being a proprietary hosted service is certainly targeting a
different audience than CockroachDB which is Open Source facing. But it
damages your brand to make untrue claims:

\- Cassandra is a multi-master database you can run across globally
distributed clouds.

\- Heck, even my own system,
[https://github.com/amark/gun](https://github.com/amark/gun) , is a multi-
master database that you can (and I have) run across globally distributed
clouds.

\- CosmosDB has a tunable option for this now, I believe.

I also don't recall their vocabulary being "multi-master" before either,
because that doesn't match the claim of being "Globally Consistent" in the
CAP Theorem sense. Unless they just mean it is sharded? But that is
different.

I'm sure my comment will just be ignored, but I ask you, for your own sake
(and that of database vendors in general), not to make marketing claims like
this. Database vendors are notorious for doing this, and it caused a big
falling out with developers. It felt like, between RethinkDB, me, and others,
we were finally starting to make amends and be open with the
industry/community. I'm not trying to be harsh just to be harsh, I genuinely
mean this: if you make a claim, please back it up - you guys are smart and
hard working, so please just go the extra step to provide the evidence.

~~~
eldenbishop
Good comments and I felt the same way. This stinks of CAP violations and
bullshit. It may have a lot of value but the bullet points are eye rolling.

------
redwood
How does the commercial viability of this company look? It's one thing to bet
on an open source project, or on a hosted version of a standard database
that, in a pinch, you can run somewhere else or even on your own servers. But
a fully managed proprietary cloud offering that isn't standard database
software, that you can't run anywhere else, and that isn't backed by one of
the big clouds... betting on it seems like a huge risk.

~~~
evanweaver
You can run the on-premises edition yourself. That's what our large enterprise
customers do, for this reason and others.

------
jchanimal
My favorite part is that expanding the cluster does "not affect the latency
profile of existing applications: writes to FaunaDB only need to commit to the
closest majority of datacenters to maintain consistency. Currently all data in
FaunaDB Serverless Cloud is replicated to every region, guaranteeing low
latency reads."

This means that as we grow, your apps will get faster and be able to run
closer to your users.
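
The "closest majority" idea quoted above can be sketched roughly like this
(an illustrative model only, not FaunaDB's actual implementation): a write
commits once the nearest majority of replicas acknowledge it, so the slowest,
most distant regions never gate the commit.

```python
def commit_latency(region_rtts_ms):
    """Latency of a majority-quorum write: the write commits once the
    nearest majority of replicas has acknowledged, so commit time is the
    RTT of the slowest member of that *closest* majority, not of the
    farthest replica."""
    majority = len(region_rtts_ms) // 2 + 1
    return sorted(region_rtts_ms)[majority - 1]

# Five regions with these round-trip times: the commit waits only for
# the 3 nearest acknowledgments, not the 200 ms replica.
print(commit_latency([10, 40, 70, 120, 200]))  # 70
```

The point of the model is that adding a far-away region mostly adds read
capacity near new users; the write quorum is still drawn from the replicas
closest to the coordinator.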

~~~
zenithm
CosmosDB lets you set region failover priority if something goes wrong, does
Fauna have a similar model?

How does an application know which region to talk to? Especially once you can
select a subset of regions to be in. Some might not have the data you are
looking for.

~~~
evanweaver
Since FaunaDB is multi-master (or masterless, if you prefer), there is no
failover step per se. Any region in the cluster can receive writes if it is
part of the majority partition, and any region can serve consistent reads even
if it's temporarily in a minority partition.

You don't have to set any priorities, and partition events don't change commit
latency for the cluster majority.

Currently FaunaDB drivers use geo DNS in route53 to automatically find the
closest region, although you can pin to specific regions if you know the
cname. If that region doesn't own the data for the logical database in
question, FaunaDB forwards the request internally.

In the future, drivers will maintain their own ϕ accrual failure detectors and
make faster and smarter routing decisions than DNS can provide.
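
For readers unfamiliar with the term, a ϕ accrual failure detector (from
Hayashibara et al.) scores how suspicious a node's silence is rather than
giving a binary up/down answer. A minimal sketch, assuming exponentially
distributed heartbeat intervals (the published detector uses a normal
distribution; this is simplified for illustration and says nothing about
what FaunaDB's drivers will actually do):

```python
import math

class PhiAccrualDetector:
    """Toy ϕ accrual failure detector: ϕ = -log10 of the probability
    that we'd still be waiting this long for a heartbeat from a live
    node. Higher ϕ means stronger suspicion of failure."""

    def __init__(self):
        self.intervals = []       # observed heartbeat inter-arrival times
        self.last_heartbeat = None

    def heartbeat(self, now):
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_heartbeat
        # P(interval > elapsed) under an exponential model.
        p_later = math.exp(-elapsed / mean)
        return -math.log10(p_later)
```

A driver could then route away from a replica once ϕ crosses some threshold,
reacting in seconds instead of waiting out DNS TTLs.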

~~~
redwood
How do you handle write conflicts?

~~~
evanweaver
Transactions are strictly serialized; the paper explains how this works best:
[https://fauna.com/pdf/FaunaDB-Technical-Whitepaper.pdf](https://fauna.com/pdf/FaunaDB-Technical-Whitepaper.pdf)

We use a single-phase model inspired by Calvin, rather than Spanner's two-
phase model. The tradeoff is that interactive transactions (like in SQL) are
not supported, but overall latency and throughput are much better.
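
The Calvin-style single-phase idea can be sketched like this (a hypothetical
illustration of the general technique, not FaunaDB's code): transactions are
submitted whole, a sequencing layer fixes one global order up front, and every
replica then replays that log deterministically, so no cross-region commit
protocol is needed.

```python
# Sketch of Calvin-style deterministic execution. `sequence` stands in
# for the replicated log that assigns a global serial order *before*
# execution; each replica then applies the log independently.

def sequence(transactions):
    """Assign a global serial order to whole, pre-declared transactions."""
    return list(enumerate(transactions))

def apply_log(store, log):
    """Replay the ordered log; identical input order plus deterministic
    logic means every replica converges on identical state."""
    for _, txn in log:
        txn(store)

def transfer(src, dst, amount):
    def txn(store):
        if store[src] >= amount:   # logic is fixed when submitted
            store[src] -= amount
            store[dst] += amount
    return txn

log = sequence([transfer("a", "b", 30), transfer("b", "c", 50)])

replica1 = {"a": 100, "b": 20, "c": 0}
replica2 = {"a": 100, "b": 20, "c": 0}
apply_log(replica1, log)
apply_log(replica2, log)
print(replica1 == replica2)  # True: no coordination was needed
```

This also shows why interactive sessions are out: the transaction's full
logic must be known before it is sequenced, so a client can't read, think,
then write mid-transaction.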

------
evanweaver
Let us know what other regions and cloud providers you would like to see, like
maybe Digital Ocean, etc. We're excited to keep rolling these out.

~~~
lux
+1 for DO support!

------
jedberg
> FaunaDB Serverless Cloud remains the only multi-master, globally-distributed
> cloud database.

Cassandra or Datastax? Cassandra has been doing this for years.

Or did they mean the only hosted option?

Edit: Made me sound less rude.

~~~
nemothekid
Isn't Google Cloud's Cloud Spanner hosted, multi-master, and globally-
distributed?

~~~
evanweaver
"Cloud Spanner currently offers only regional instance configurations:
replication within one region of the United States, Europe, or Asia. Regional
instance configurations in additional Google Cloud Platform regions will be
added throughout 2017. Multi-region replication (i.e., replication across
multiple geographies) is planned for future release."

------
lux
Sounds very cool. One thing I'd love to see on the pricing page is an
estimation tool where you can enter different values to see a monthly cost
estimate.

~~~
evanweaver
That makes sense. In particular, the minimum price is always wildly better
because of the serverless model (metered, like S3), but you still want to see
where you'll end up with bursts and such, or compare against a static
Postgres or DynamoDB cluster at expected load.

A lot of the benefit comes from not having to manage capacity up and down in
the first place, though. Even if other systems let you do it quickly you still
have to either predict or react to your load "by hand".

