Show HN: FaunaDB, a strongly consistent, globally distributed cloud database (fauna.com)
131 points by evanweaver on March 15, 2017 | 159 comments



I think Fauna is not very good at docs and communication yet, at least judging by confusion from some of the comments and by reading their docs. But launching will probably make them a lot better at it. Here are my notes which may add clarity for some:

Similar to RethinkDB/MongoDB:

* Designed to be great for storing application data. Fields can be dynamically added (schemaless) and their values can be arrays so it is easy to maintain data locality according to your application use patterns.

* Uses a non-SQL query language

* Probably not great for ad-hoc reporting (arguably SQL is a requirement for that)

Unlike MongoDB: supports joins

Unlike RethinkDB: great support for transactions, just not SQL transactions with an open session (which are unnecessary for an application)

Unlike most databases:

* cloud-hosted and pay-for-use (on-premise is on their roadmap)

* claims support for graph data by storing arrays of references

* QoS built-in so you could run a slow analytics query without disrupting your application

Cons

* Unfortunately, just like MongoDB/RethinkDB, they have no real database-level enforcement of schema or foreign-key integrity, but at least foreign keys are on their roadmap.

I am a huge fan of the cloud-hosted pay for use aspect: I wonder why anyone would design a DB today without this in mind. You can transfer your data from a pay-for-use application DB (FaunaDB or Google DataStore) to a data warehouse (Snowflake or Google BigQuery) which is also pay for use and gives you SQL reporting abilities.


Thanks, we realize the docs still have a long way to go and are working to improve them as fast as we can. This is a good summary.

We're definitely aware that the lack of schema definition is a problem for certain use-cases, and solving this is on our roadmap.


You might consider removing the badge on the home page that says "Global Latency 2.8 ms". Unless you really can give me latency across the globe of 2.8 ms, in which case your solution to the speed-of-light problem is quite impressive :)


The confusing latency number (although it is a real number) is gone.


Reads do not require global coordination because of the way the transaction log advances within each datacenter, but I take your point.


I'm not sure if you did take the point intended. You would need a datacentre in every city in the world to get that "global latency".


Yes. Light travels 839 km or 521 miles in 2.8 ms. If you need a round-trip to send a query and get a response, then you would have to be within a few hundred miles of the datacenter, assuming perfect efficiency.
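A quick back-of-the-envelope check in Python, assuming signals travel at the vacuum speed of light with zero processing time:

  C_KM_PER_MS = 299792.458 / 1000                    # speed of light in vacuum, km per millisecond
  round_trip_ms = 2.8
  max_radius_km = C_KM_PER_MS * round_trip_ms / 2    # one-way distance budget for a full round trip
  print(round(max_radius_km))                        # ~420 km, before any fiber or switching overhead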


Could you point us toward the amazing switches and routers that have zero processing latency? We would love to buy them.


And that's in a vacuum, too. The speed of light is lower in optical fiber and copper wire.


Supporting every region in every public cloud infrastructure provider is absolutely the goal. But we can change the little widget to be less confusing, too.


I find this interesting. Do you plan to have every database record replicated across every one of your datacentres? That would be some crazy high availability.


Yes that is the model. Each region has a full copy of the data and has its own internal replication factor. The customer can select which cloud regions and providers to replicate to, and pay accordingly. So you will be able to choose how many replicas and where they are.


How does the cost work out? I don't see any mention of that on the pricing page.

Or is that just built into the data storage pricing?


You will pay per gigabyte-hour, per-site, once we ship the site selection feature.

Eventually you will be able to arbitrage differences in the underlying infrastructure costs across clouds.


That sounds pretty good


You are assuming that the database is being directly accessed from the end user. If the database is being accessed from within an application running in a datacenter, then you only need a database in that datacenter.

2.8 ms to access data within a datacenter is both good and believable. And if I'm building an application, it is a latency figure that matters to me a lot.


If it's "global" then whichever service provider I choose should see the same latency. If my datacentre is in Brisbane and their closest is Sydney the latency I see will be about 40ms.

I wasn't even thinking of the end user.


Yes, OP is assuming that. The popular use-case for a "global DB" is end-users.

And the popular definition of geographic network latency is a function of distance and the speed of light, not the speed at which an app talks to the DB.


you can't say that and also have 'ACID consistency' on your home page.


It seems like one of the big issues with the marketing copy here is some of the word tricks being played:

Any first read of "The first serverless database" carries the implication that the database itself is serverless. Comments from FaunaDB folks on this page clearly indicate that what they mean is that it's the first database for serverless, which is a pretty bold claim, given that Google and AWS and any number of other providers offer databases that are accessible from serverless things. So it essentially boils down to "the first database that's marketed specifically to serverless use cases", which is maybe true but also kind of not a useful trophy to put on the mantle?

This is further muddled by the blog post linked to from the launch announcement (https://fauna.com/blog/escape-the-cloud-database-trap-with-s...), which includes "FaunaDB Serverless Cloud is an adaptive, serverless database". Nobody is reading that and thinking "ah, an adaptive database for serverless apps".

Describing it as "the first active-active multi-cloud database" is possibly true if you mean "the first time a single company has sold a publicly available database-as-a-service running on multiple cloud providers". But the text says "database" where "public database-as-a-service" would be the accurate term, leaving the reader with the impression that no existing databases can be set up on multiple cloud providers in an active-active HA config, which is absurd. Fixing the copy here should be pretty easy, and they're already headed in the right direction with the next bullet point, although it too refers to "database" where it means "database-as-a-service".

It feels like somebody on marketing really wanted to have a list of firsts, so they toyed with definitions of words until they thought they could flex these into being technically accurate. I get the same feel from the closing argument in the linked blog post: "The query language, data model (including graphs and change feeds), security features, strong consistency, scalability and performance are best in class. There is no downside.". I don't think I want to trust a database if the folks designing it couldn't think of any downsides.


Understand. We can be careful to be more accurate even in less technical contexts in the future.

Serverless is supposed to mean: a database with serverless pricing, for serverless applications.

There's always a tension, too, between "does a feature exist somewhere" and "is it actually usable?". For example, you could perhaps run MySQL Cluster across multiple public clouds...but would you want to? It's surprisingly hard to design for true cross-continent global replication, and the additional latency of crossing public clouds makes it even worse.

We've tried to design the best database. We can put in some bugs for you to give it some downsides.


I'm not talking about bugs, though I promise there are some. No offense intended by that: I'm not aware of any bug-free software, especially when we're talking as complex as databases.

But to tell me your database has no downsides at all? No use cases where its consistency model isn't optimal, where an implementation detail means it's not a good fit? Even just skimming the marketing copy, it's clear that if I'm using SQL in my existing app, migrating to FaunaDB has the clear downside that I need to rethink my queries, and that's just the surface layer.


Bugs was a joke.

Analytics is the biggest missing feature...it will come. It's also not cost-effective at the moment for timeseries data.

We've updated the posts in light of some of this feedback.


I'm not aware of any other services which offer pricing that isn't based on number and size of database servers.

Edit: Thanks to all the commenters who corrected me :)


It's entertaining that one can make an innocent HN comment about something esoteric like cloud databases and be quickly corrected by three different Google engineers.

To provide a non-Google example, AWS DynamoDB's pricing is based on throughput (similar to Google Cloud Datastore).


Heh. :)

One technicality: DynamoDB's pricing is based on throughput, as you know, but it's provisioned throughput. (You manage your capacity unit allocations yourself, at least that was the case a little while ago.) You're charged money for how much you provision regardless of whether or not you actually consume the capacity units. Our pricing model (like Fauna's) only takes ops and storage into consideration.


Ah, so just like SDB. I wonder why AWS moved on from that model.


I thank you for the compliment, but I am a lowly PM rather than an engineer, and I don't find cloud databases [1] any more esoteric than the cloud in general. :)

You'll probably find there are a lot of distributed database geeks at Google, me included, so we all get attracted to exciting posts on new distributed systems.

[1]: http://db-engines.com/en/blog_post/68


I can vouch for itcmcgrath's lowliness.


One of the reasons I love HN :)


Firebase's Realtime Database does this, too.

https://firebase.google.com/pricing/

(Disclosure: I work on Cloud Datastore, which is another Google Cloud product; it, too, does pricing based on ops and storage rather than provisioning, as my colleagues note.)


Cloud Datastore does exactly that:

https://cloud.google.com/datastore/pricing

Disclosure: I work on Google Cloud (and sit near the Datastore team!)


Take a look at Cloud Datastore's pricing, which is probably a leader in this area (I work on it).


Hey everybody, today we launched FaunaDB Serverless Cloud, 4 years in the making. FaunaDB is a strongly consistent, globally distributed operational database. It’s relational, but not SQL.

We're excited to open our doors and explain more of our design decisions. Our team is from Twitter, and that experience has deeply informed our interface and architecture. Try it out and let us know what you think.

An on-premises release is coming later this year.


> FaunaDB Serverless Cloud

By "Serverless", do you just mean DBaaS? "Serverless" in the context of a database is kinda weird because the data does have to be stored somewhere. This branding doesn't make much sense to me.

> It’s relational, but not SQL.

Why not SQL? Is there something your query language supports that SQL doesn't?


In answer to your first question, we have a blog post about the provisioning trap here: https://fauna.com/blog/escape-the-cloud-database-trap-with-s...


It still doesn't make much sense to me.

> With pay-as-you-go pricing, your database costs nothing when no one is using it. Combine it with a function-as-a-service provider like AWS Lambda or Google Cloud Functions, and your entire cost structure scales dynamically with usage.

In provisioning terms, how is FaunaDB different from existing alternatives like Google Cloud Datastore? What makes it "Serverless"? Are you just saying PAYG = "Serverless"?


And to see the depth of integration we're aiming for, here is a Python CRUD service example, and see elsewhere in this thread for a link to our production authentication service: https://serverless.com/blog/serverless-fauna-python-example/


It is similar to datastore, but with full-blown database features like transactions, joins, multi-region replication, temporal history, covering and compound indexes, graph queries, stored procedures, etc., and you can run it on-premises if you want (currently in beta).


PM for Cloud Datastore here.

We also have multi-document transactions, self-joins, active-active multi-region replication by default, secondary indexes, composite indexes, a SQL-like query language, PAYG model with no capacity planning, etc, so I think it's a bit disingenuous to say it's "similar to datastore, but with full-blown database features".

Otherwise, congrats on your launch!


I like Google Datastore a lot - I think it is the best of the serverless databases at the moment, assuming you want to build "normal" CRUD-style web applications to store stuff and query it in a way that is reasonably familiar to most programmers. It's great and well thought out.

The weird thing about Google Cloud Datastore is that it essentially has no real search functionality.

I say "weird" because, well, it's Google and you'd think that the search thing would pervade everything.

I recently had need of a serverless database, and I'm a huge fan of AWS and I use AWS Lambda but AWS really lets its side down with its serverless databases DynamoDB and SimpleDB, neither of which, in my opinion, are usable for "ordinary" applications.

So in the end I implemented Google Cloud Datastore in my AWS Lambda functions.

After a couple of days I bailed out on the strategy and installed Postgres on an instance and now I'm using that.

There were several reasons to finally go (back) to Postgres. First, the Google Cloud SDK for JavaScript was huge, which was a nightmare when uploading zipped AWS Lambda function code. Second, it made me really nervous that I couldn't do any sort of LIKE or search query, which I need in my application. Third, the documentation for Datastore was good but not great, and there wasn't a lot of third-party material written about it. Given it was early in my project, I decided to retreat to the safety of Postgres to avoid buyer's regret.

I still feel that there is a real need for better serverless databases - Google Cloud Datastore has my vote as best so far, but it's not ready yet for the sorts of things I want to build.


Thanks for the feedback!

I'm happy to hear you gave it a shot; I've heard several times about it being used successfully from Lambda (hopefully people can look at GCP's Cloud Functions now, but I'm biased). Sorry to hear it didn't work out.

Search is something we're looking at, and we have a high quality bar for reliability and scalability as a service that handles tens of trillions of requests per month.

I'll also pass on your feedback about the JavaScript SDK to its team.


You should only need the @google-cloud/datastore module:

https://googlecloudplatform.github.io/google-cloud-node/#/do...

and like Dan said, you can come over to Cloud Functions now if you'd like ;).

Disclosure: I work on Google Cloud.


I tried with only the Google Cloud Datastore SDK and it took the zipped(!) file size down from 23 MB to 18 MB - not a substantial difference.

And AWS has me hooked on its cognito service so I'd never even be able to try Google Functions because as far as I know there is no Cognito equivalent in the Google ecosystem. Cognito is awesome.


I can't speak to the SDK parts, but take a look at Firebase's suite more broadly. :) Vis-à-vis Cognito, check out Authentication and the Realtime Database in particular.


> active-active multi-region replication

How does this work? Is the Datastore data available locally in every region? Docs don't make this clear.

We were looking for something like DocumentDB that has replication to any region you want to run in.


Datastore can be created in multi-region configurations. This is deployed over multiple regions in a single continent/nearby geographic areas, e.g. USA, Europe, and Asia. Google Cloud Storage has the same features.


We have a series of locations that determine replication topology, but you cannot do à la carte region selection at this time. Under the hood we run on at least 3 independent private super fast network fabrics between regions, so we have a pretty high bar infrastructure wise for deployments. We bet on having reliable high-grade deployments and Google's world wide network to get requests to users faster rather than stretching out arbitrary deployments. Seems to work so far for customers like Snap & Niantic Labs (Pokemon GO), but YMMV.


For Cloud Datastore, looks like you choose a region to run in:

https://cloud.google.com/datastore/docs/locations

(Not sure about the rest.)


Thank you.

I should note that part of our deployment is on Google Compute Engine, and we like it.


Awesome!


We also have transactions [1], compound indexes [2] and multi-region replication [3]. I'm not sure what defines a "full-blown" database, but that much is true.

[1] https://cloud.google.com/datastore/docs/concepts/transaction...

[2] https://cloud.google.com/datastore/docs/concepts/indexes

[3] https://cloud.google.com/docs/geography-and-regions#multi-re...


> It’s relational, but not SQL.

Why not? If you already support relational algebra, it seems like a no-brainer to just add SQL. Even if it's only SQL-92, you would be able to support some existing tools/ORMs almost for free.


This is explained a bit in a blog post[1]. Basically, this is because FaunaDB uses Calvin to do distributed transactions, which makes it hard to support OLAP-style SQL sessions. (If I understand the Calvin paper[2] correctly, this has to do with a combination of how it doesn't support "dependent" transactions, and how it does concurrency control.)

1: https://fauna.com/blog/distributed-acid-transaction-performa...

2: http://cs.yale.edu/homes/thomson/publications/calvin-sigmod1...


Yes, it has to do with the way Calvin handles transactions, which are required to declare their read and write sets before executing.

These kinds of transactions are also called static, and are normally of the type "I have these read/write operations against multiple keys, go and do them", vs. dynamic transactions that might depend on values read from the DB to figure out what to do next.
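Roughly, in illustrative Python (a sketch of the distinction, not any particular API):

  # Static transaction: read and write sets are declared up front, so a
  # Calvin-style scheduler can order it deterministically across replicas.
  static_txn = {
      "reads":  ["account:1", "account:2"],
      "writes": ["account:1", "account:2"],
      "op":     "move 10 from account:1 to account:2",
  }

  # Dynamic transaction: the write set depends on a value read during
  # execution, which the scheduler cannot know in advance.
  def dynamic_txn(read, write):
      target = read("routing_key")   # only known at execution time
      write(target, "new value")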


We allow you to make different writes depending on reads. The query language has loop and control flow structures, so you can push a lot of the logic to the database. You can read more about the query language here: https://fauna.com/documentation/queries


You can issue writes depending on reads, but are they performed in the same transaction?

If I do `if(read(A) = 0) then write(B, 1) else write(C, 2)`, would that be executed in 1 or 2 transactions?


It should be executed in a single transaction. But the query may be scheduled more than once.

According to the Calvin paper, there are two strategies here:

* Either, the write set of the query is considered to be {B,C}.

* Or the query is rewritten and scheduled several times, the `if` operation being replaced by an assert operation, until the assertion is indeed true, in which case the transaction proceeds using a smaller write set. If not, the query is scheduled again after having been rewritten to match the current values. In other words, if A equals 0 then the query is rewritten as `assert A == 0 then Write(B,1)`. Otherwise the query is rewritten as `assert A != 0 then Write(C,2)`. (Sketched below.)
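A minimal sketch of that second strategy in Python (illustrative names only, not the paper's notation):

  def plan(read_recon):
      # Reconnaissance read to pick a branch and declare a small, fixed write set.
      if read_recon("A") == 0:
          return {"expect_zero": True, "writes": {"B": 1}}
      return {"expect_zero": False, "writes": {"C": 2}}

  def execute(p, read, write, reschedule):
      a = read("A")
      if (a == 0) != p["expect_zero"]:
          return reschedule()        # A changed since reconnaissance; replan and retry
      for key, value in p["writes"].items():
          write(key, value)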

Months ago, I implemented a prototype after the Calvin paper and used the first strategy. It may imply more contention and can even lock the whole dataset with queries like `Write(Read(A), a)`. Sadly, such queries are not rare: updating a distributed index is the typical case.

What is FaunaDB's approach?


Ah, yeah, I suspected it could be fixed with your first approach, but the second one also sounds plausible.

But if you go with the first option, then you can end up locking the entire database, as you pointed out with `write(read(A), _)`. Wouldn't your second option cause multiple transaction aborts until you find the rewrite that satisfies the assert? And in that case, the scheduler would need to know if the abort was actually caused by a faulty rewrite.

It would be interesting to know what Fauna does in these cases.

Is your implementation open somewhere?


The prototype is unfortunately tied to a more global project, which I'm not ready to make open.

This is a POC implemented in OCaml using Kafka. I'll take a few hours to see what I can open.


Sounds great! Looking forward to it.



Thanks for opening it up! I'll give it a look as soon as I get the time.


4 years seems a very long time for development.

I'd be interested to hear why, and what you would do differently if you were to start from scratch.


I don't think 4 years is long at all for developing an advanced, distributed database solution.

It is a long time to go without real world use and user feedback, though.


The beta has been in production with cloud and on-premises customers for two years.


(Disclosure: I work on Google's Cloud Datastore.)

This looks super neat, and I can't wait to learn more about it, but just for the record: I'm pretty sure this isn't the first serverless cloud database. Both Firebase's Realtime Database and Cloud Datastore (which powers Snapchat and Pokemon Go) are serverless; you pay only for your ops and storage. They've been publicly available for several years.


Fair enough; I think it depends where you draw the line between key/value store and database.

Both of those depend on other distributed storage systems under the hood, as far as I am aware? Or is Datastore an end to end system? I know Firebase was backed by MongoDB.


Datastore runs on top of Megastore [1]. You can find out more about our data model here [2], but it's definitely not limited to key-value data.

Our end users don't have to think much about our storage system, though, if we're doing our jobs right. :)

[1] https://cloud.google.com/datastore/docs/articles/balancing-s... [2] https://cloud.google.com/datastore/docs/concepts/entities


An interesting observation that I can't seem to "un-observe" is that Megastore is actually a lot more like MongoDB than one would expect.

Both ostensibly work best when the application fits a hierarchical data model (entity groups vs. documents), and provide out-of-the-box strongly-consistent transactions for a single entity group. MongoDB feels like schemaless Megastore.


:)

Here's a paper with which you might already be familiar, but it's one of the citations for the Megastore paper: http://adrianmarriott.net/logosroot/papers/LifeBeyondTxns.pd.... You'll probably enjoy it (if you haven't already!).


If I remember correctly, Datastore is basically a thin layer on top of Megastore[1] (aka the precursor to Spanner).

1: https://static.googleusercontent.com/media/research.google.c...


It's a lot more than a thin layer, but like most systems at Google, it is a specialized layer built on top of more fundamental building blocks that are designed and proven to do a particular job really well.

In this case Megastore provides the underlying multi-region/datacenter K-V replication services.

All the database features like secondary & composite indexes, query language, multi-tenancy support, PAYG model, etc, etc, are built in the Cloud Datastore layer.


> secondary & composite indexes

Interesting, so you don't use Megastore's indexes?


As you noted earlier, Megastore has a schema and we don't, so we have our own index implementation, yes. :)


In technology evolution there are technologies that enable a new ecosystem, and then there are technologies that are built natively for that ecosystem. The previous generation of datastores enabled Lambda style applications, the next generation of databases assumes they are the new normal.

The reasons FaunaDB fits serverless like a glove can be boiled down to a few points: pay-as-you-go, database security awareness and object level access control, hierarchical multi tenancy with quality of service management. Running on multiple clouds makes the Serverless model more acceptable for risk averse enterprises, and complements multi-cloud serverless FaaS execution environments nicely.

There's more to say, check out this post on the blog: https://fauna.com/blog/serverless-cloud-database and https://fauna.com/blog/escape-the-cloud-database-trap-with-s...


I'm familiar with both; what disappoints me is the claim of novelty here with respect to autoscaling. That's just not true. To quote you:

"A serverless system must scale dynamically per request. Current popular cloud databases do not support this level of elasticity—you have to pay for capacity you don’t use. Additionally, they often lack support for joins, indexes, authentication, and other capabilities necessary to build a rich application."

That first criterion we absolutely meet, today. Cloud Datastore has been doing that for eight years now. We don't have joins, but we do have indexes, auth, multi-region replication and a whole lot more.


That's why it comes down to the details and the fit and finish. One neat feature is the ability to run Lambdas with a database access token corresponding to a particular user, which can then be passed through to sub-Lambdas (or it can even run with sub-permissions). Here is a blog post with quickstart instructions: https://serverless.com/blog/faunadb-serverless-authenticatio...

For instance, you could have a fire-and-forget self-service self-provisioning online shopping site builder, and bill database costs through to your customers (we give you that information in response headers).

You can also use FaunaDB to do consistent coordination between FaaS execution environments running in different clouds. So if you like a processing feature Azure makes available, but want to run your user facing servers in GCE, you can use FaunaDB to coordinate between the clouds.


Again, neat stuff, but that's not what this link claims:

  * The first serverless database
  * The first active-active multi-cloud database
  * The first strongly-consistent multi-region database available to the public
None of these are firsts. I don't know if our (GCP) services are themselves the first of their kind (it's an ambitious claim and, as an engineer, I try to be careful about those), but Datastore meets at least two of those three and predates FaunaDB by several years.


Maybe the first one.

GCP can't span multiple public cloud providers, or even different continents within GCP, apparently.

Indexes and cross-partition transactions aren't consistent, which doesn't meet, to us, the minimum bar for utility. Your docs say the consistent write throughput per entity group is 1 write per second?


Perhaps I've misunderstood you, but I'm pretty sure cross-partition (I assume you mean cross-entity-group in our terms) transactions are in fact consistent (not totally sure what you mean by transactions being consistent, per se; if you're talking about serializability, at least, we are). Explicitly (from [1]):

  Queries that participate in a transaction are always strongly consistent.
And the consistent write throughput to which you refer means sustainable write throughput per entity group. We can burst much higher.

It would be much easier to assess the relative consistency models of our products if FaunaDB had documentation with respect to its claims. We have a litany of pages about ours (e.g. [2] and [3]).

[1] https://cloud.google.com/datastore/docs/concepts/structuring...

[2] https://cloud.google.com/appengine/articles/transaction_isol...

[3] https://cloud.google.com/datastore/docs/concepts/transaction...


Those docs are coming. I still don't see how you can make any useful claim about consistency when indexes are never isolated or consistent, and sustained consistent write throughput can't exceed 1 wps.


I feel like I need to note the pricing. $0.01 per 1,000 queries. That doesn't sound like much, but it adds up. Let's say you make 1,000/sec. $0.01 * 60 seconds in a minute * 60 minutes in an hour * 24 hours in a day * 30 days in a month = $25,920.

Is that a lot? I think it is. Google Cloud Spanner costs $0.90/hour per node or around $650/mo. Each Cloud Spanner node can do around 10,000 queries per second[1]. So, $650 to Google gets you 10x the queries that $25,920 to Fauna gets you. I mean, for $25,920, you could get a Spanner cluster with 40 servers. Each of those servers would only have to handle 25 queries per second to get you 1,000 queries per second.
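For anyone who wants to check the arithmetic, here it is in plain Python using the list prices cited above:

  fauna_price_per_1000_queries = 0.01
  qps = 1000
  seconds_per_month = 60 * 60 * 24 * 30
  fauna_monthly = qps * seconds_per_month / 1000 * fauna_price_per_1000_queries
  print(fauna_monthly)               # 25920.0 USD/month

  spanner_node_per_hour = 0.90       # one node handles roughly 10,000 QPS
  spanner_monthly = spanner_node_per_hour * 24 * 30
  print(spanner_monthly)             # ~648 USD/month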

I'm sure that people are going to question whether FaunaDB can actually do what it claims. At this pricing, I can't imagine someone actually seeing if they can live up to their claims. They have a graph showing linear scaling to 2M reads per second. Based on their pricing, that would be $630M per year. For comparison, Snapchat committed to spending $400M per year on Google Cloud and another $100M on AWS (and people thought the spend was outrageous even for a company valued at tens of billions of dollars). This is more money for the database alone.

Heck, it looks like one can get 5-20k queries per second out of Google's Cloud SQL MySQL on a highmem-16 costing $1k/mo[2]. That would cost $130k-$500k on FaunaDB. It seems like the pricing of FaunaDB is off by a couple orders of magnitude.

Ultimately, Spanner is something built by people that published a notable research paper and used by Google. Reading the paper, you can understand how Spanner works and be saddened that you don't have TrueTime servers powered by GPS and atomic clocks. FaunaDB has some marketing speak about how I'll never have to worry about things ever again - without telling me how it will achieve that.

It's also implemented in Scala. This isn't a dig on Scala or the JVM, but I use three datastores on the JVM and the only one that isn't sad for it is Kafka. But Kafka does very little in the JVM - it basically just leans on sendfile to handle stuff, which means you don't get bad GC cycles or lots of allocations and copying.

FaunaDB is a datastore without much information other than "it's great for everything and scales perfectly". Well, at their pricing, they might be able to make it happen. I mean, most customers would simply move to something cheaper as they got beyond small amounts of traffic due to the pricing. 60,000 queries per second? That'll be $18M per year from FaunaDB or $50k per year from Google. It's not even in the same ballpark. If you really need to scale to 2M reads per second, $630M seems like a lot more than $1.6M for Spanner.

Maybe it's an easy way to get some money off people that "need a web scale database", but are actually going to be serving like 10 queries per second and are willing to spend $260/mo to serve that. If they hit it big, it shouldn't be insane to scale it to 10,000 queries per second and milk $260k out of them each month for a workload that can be handled by a single machine. That money also pays for decent ops people to run a big box and consult with the customer if they're going towards 100k queries per second with a $2.6M monthly payment.

EDIT: looking over Fauna's blog and some of their comments here, they seem to understand more than their marketing lets on. Daniel Abadi is one of those people whose name carries weight in the databases world (having been involved with C-Store/Vertica, H-Store/VoltDB, and others). While I haven't read the Calvin paper, it looks like a good read. I can see that they are using logical clocks and I can't find it right now, but I thought I saw that they're not allowing one to keep transaction sessions checked out - that all the operations must be specified. So, it seems like there's some decent stuff in there that's currently being obscured by marketing-speak. Still, the pricing seems really curious.

[1] https://cloud.google.com/spanner/docs/instance-configuration

[2] https://www.pythian.com/blog/benchmarking-google-cloud-sql-i...


To add extra color, for about $3M/month @ list prices of Cloud Datastore [1], you can, in a Multi-Region active-active synchronous replication configuration, run a workload with the following profile:

Reads: >1.1M entities/second
Writes: >380K entities/second
Deletes: >190K/second
Storage: 100 TB

And that's if you don't use any of the nearly free optimizations like Projection queries & keys-only queries, which any large scale customer does.

That's not pre-provisioned usage, it's actual pay-as-you-go usage - so if you have no traffic, you have no costs (except for what's already stored). It's been that way for 8 years too.

[1]: https://cloud.google.com/products/calculator/#id=e21b61d5-4a...

(PM for Cloud Datastore - if you're looking at 1M+ QPS workloads, feel free to message me)


Huge scale is what FaunaDB On-Premises is for; the pricing model is different. That's what NVIDIA uses for example. Nevertheless, we will have volume discounts and reserved capacity in Cloud too.

I see where you're coming from. People make the same argument against using cloud services at all when you can buy hardware yourself and operate it. The lack of flexibility is the hidden cost.

Our cloud pricing is competitive with other vendors, most of which require you to massively over-provision in order to get high availability, especially global availability, as well as predictable performance. In traditional cloud databases, you have to provision for peak load. Usually this is an order of magnitude difference from average load. An order of magnitude difference happens to match your Spanner example exactly; however, with Spanner you still have to manage your capacity "by hand".

Architecture docs are on the way.


You're right that it was a bit unfair to compare a flexible FaunaDB to Spanner, which you'd need to provision for peak traffic. But even if it's an order of magnitude more, $16M vs $630M is still quite a gap. It really doesn't match the Spanner example. And if you're able to handle incredibly spiky loads, information on how is kinda important. If I go from a steady state of 100 QPS to 15,000 QPS for a 20-minute period, will that just be pain?

You've said that Spanner makes you manage capacity by hand, but the marketing copy says, "FaunaDB is adaptive, because it lets you change your infrastructure footprint on the fly. Dynamically shift resources to critical applications, elastically add capacity during peak events, and replicate data around the world—all in a unified data fabric." So, if I'm expecting a burst of traffic, do I have to "change my infrastructure footprint" manually? How quickly can one "elastically add capacity"? I mean, I've seen plenty of systems that one can add capacity to that, well, get humbled when copying data to new nodes. Like, you had 10 nodes and now you want 15 because you're being hammered. And wonderful, it's trying to copy data to the new nodes while it was already having capacity issues and only making response times worse and errors go up. I'm not saying that will happen to you, but there's no information to make me think that problem is addressed.

Honestly, people involved in FaunaDB seem to know enough about databases that I'd just expect more real information on the website. When Kudu came out, they published a paper that basically read like, "well, we created a column store kinda like one would if you'd read the C-Store paper and these are the trade-offs and we seem to have done reasonably" and I came away from reading it thinking, "ok, these people know the score. It may or may not be executed well enough, but there's an understanding." They led with a paper that might not have been revolutionary, but really showed that they understood the space and explained how it was designed such that someone with databases knowledge could see that it was reasonable.

Introducing your database with so much, well, non-information doesn't help you (in my opinion). Without digging, it looks like another DB vendor that promises everything will be perfect and that it's great for any workload.

The whole "About FaunaDB" page doesn't tell me much. Like, there's a comment in here that tells me you're using logical clocks, I can see from Daniel's Twitter that you're using some of his research, etc. I mean, you actually have cool technical details to highlight - details that make your DB seem a lot more real. But the page makes it feel like you don't have cool technical details - that you're trying to hide information because it's not good. I mean, adding in some details about how things are achieved make a product seem a lot more real. I know what logical clocks are. Calvin is a research paper I can read. I mean, finding that makes FaunaDB seem way more real - there's something substantive. Like, I can read Calvin tomorrow and some of the ways you're achieving things will come to light and I might be impressed.

But right now, it's really hard to find the information that would impress technical readers.


I'm with you. That level of detail is coming soon.


If we're on the topic of comparing with Spanner, here's a 15-second live demo of resizing Spanner from 70 to 99 nodes at [0]. The act itself is quite unremarkable, but the complexity abstracted away is awesome.

Both Spanner and Datastore do quite well in the cloud for "huge scale" as fully managed services. And with any deployment on-premise, one certainly must manage their own capacity "by hand".

(work at Google Cloud)

[0] https://youtu.be/kwnWfHq2EfQ?t=11m48s


Hi guys. Can I ask a simple question?

I understand that we are talking about a globally distributed, serverless and yet consistent relational database.

My question is about latency. How long does it take for transactional atomicity to become a consistent read on a globally distributed database? (1) And what measures are taken between entry nodes to prevent clients from receiving inconsistent data? (2)

As I ponder this, I am struck not by the consistency problem, as that is solvable, but by the latency problem of assuring that all global queries are consistent for some (any) time quantum. What sort of latency should be expected?

both questions (1) and (2) are interesting, but (1) is critical while (2) is academic.

Thanks, and very interesting work guys.

EL


FaunaDB has a per-database replicated transaction log. Once a transaction has been globally committed to the log, it is applied to each local partition that covers part of the transaction. By this point, the transaction's order with respect to others in a database and results are determined. While writes require global coordination to commit, reads across partitions are coordinated via a snapshot time per query, which guarantees correctness.

In short, writes require a global round-trip through the transaction pipeline; reads are local and low latency.


This is a very good answer. So if I understand you correctly (please correct me if I do not), atomicity is handled on a per-connection basis (writes cannot be distributed), and there may be high latency in distributing a transaction, but read consistency is guaranteed by timestamp (equivalent to versioning).

Is this correct?

EL


One thing that makes this easier is that FaunaDB does not support session transactions, rather you must express your transaction logic as a single Fauna query, which is executed atomically. Transactions can still involve arbitrary keys, however.

And yes, for reads, by default the coordinating node chooses a timestamp and uses that to query all involved data partitions. Each partition will respond with the requested data as of at that timestamp, or will delay responding until it has caught up.

One nice thing about this approach is that any chosen timestamp is enough to provide a consistent snapshot view of the dataset at that time. This ends up being useful for bulk or incremental reads, where a longer running process needs a stable view of the dataset.
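A simplified sketch of that read path (toy Python with hypothetical helpers, not our actual implementation):

  import time

  class Partition:
      def __init__(self):
          self.history = {}       # key -> list of (commit_ts, value), in log order
          self.applied_ts = 0.0   # highest transaction timestamp applied locally

      def wait_until_applied(self, ts):
          while self.applied_ts < ts:    # in practice: block on log replay, not poll
              time.sleep(0.001)

      def read_as_of(self, key, ts):
          versions = [v for (t, v) in self.history.get(key, []) if t <= ts]
          return versions[-1] if versions else None

  def snapshot_read(partitions, keys, snapshot_ts):
      # One timestamp for the whole query keeps a multi-partition read consistent.
      out = {}
      for key in keys:
          p = partitions[hash(key) % len(partitions)]
          p.wait_until_applied(snapshot_ts)
          out[key] = p.read_as_of(key, snapshot_ts)
      return out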


Without session transactions, how does the application perform a transactional read-modify-write?


Okay, last question because I'm in Asia and it's 7:30 and I am tired. (But yeah, I'm a Midwestern computer scientist who just happens to be in Asia ;-)

Question #2 -- Two, or three, or more transactions occur simultaneously that want to change data D. Each of these transactions sends out transaction log entries that contradict each other. What happens?

EL


snapshot time as in wall clock time? What is the time source?


It's variable, and the client can provide it. By default, we use the greater of wall-clock time or the highest transaction timestamp the node serving the query has seen.


In that case, your database loses data and is definitely not strongly consistent. Why do you claim it is?


To be clear, read-write transactions are applied in a consistent order derived from the transaction log. Read-only transactions can rely on the fact that read-write transactions are totally ordered, and execute without global coordination while still providing a guarantee of serializability. Wall clock time is used as a suggestion only, in order for the database's logical timestamps to track as close to real time as possible. We will have more information about how Fauna meets its consistency guarantees in a future blog post.


you can't have a read transaction that's serializable that doesn't go through global coordination unless all of the data is guaranteed to only ever be local. Looking forward to the blog post.


looking forward to Jepsen...


Looks like FaunaDB uses Raft[1], so I'd expect that data is sharded into multiple consensus groups, like Spanner or Megastore. That would mean consistency on a single shard/consensus group is basically just dependent on reading from and writing to the Raft leader.

1: https://news.ycombinator.com/item?id=13645876


I think they missed Google's launch of Spanner, their distributed, strongly consistent DB.


Spanner is great, but it's not pay as you go, or multi-region (yet), or multi-cloud.


$0.01 per simple operation sounds very expensive to me. This would add up very quickly.

Edit: I misread it. Perhaps instead of inventing your own points system that you have to explain, and hoping silly people (like me) don't mix it up, you could take a lesson from Google Cloud and just lay out the pricing in a table. If you ever add another service, you'll have to integrate it into your made-up points system as well.


It's $0.01 per thousand, not each.


You're right. I realised that and came to correct it. Thanks for pointing it out.


That pricing model and serverless model is why I've always chosen CouchDB/Cloudant. If I'm doing the MB/hour to GB/month conversion correctly, Fauna cloud is significantly cheaper.

I see Fauna has temporal queries, but receiving events is strictly pull, there is no push or single feed?


Event push / feeds are on the roadmap. Currently we have everything implemented at the data model level to do live query feeds, you just have to do polling until we ship the feature.

I'm working on a follow up example to this CRUD one, that implements a multi-user TodoMVC, and will use event queries to keep the UI updated between tabs and users. You can see the basic Serverless CRUD starter example here: https://fauna.com/blog/serverless-cloud-database


There is a related technical blog post [1] and discussion [2]. Also I've got a companion blog post on the Serverless.com blog at [3]

[1] https://fauna.com/blog/escape-the-cloud-database-trap-with-s...

[2] https://news.ycombinator.com/item?id=13877223

[3] https://serverless.com/blog/faunadb-serverless-authenticatio...


What's the biggest deployment so far?


We have a bunch of customers listed in the press release. [0] Our managed cloud installation is about the size of our large customers. NVIDIA launched their latest world-scale user facing service on top of FaunaDB.

Personally I'm more interested in helping people writing fresh apps use FaunaDB, because while we can solve enterprise problems at scale, it's the greenfield apps that will be able to best use our advanced features.

[0] http://finance.yahoo.com/news/fauna-launches-faunadb-serverl...


Serverless Database, Global Latency 2.8 ms, Relational but no SQL (whatever sense that makes) BULLSHIT BINGO at its very best.


You could put this more nicely - they have spent a lot of time working on it. I'm sure no one intends "BULLSHIT".

One way to handle it when something strikes you as not correct is to politely ask for clarification.


Well, no. A global latency of 2.8 ms is just bullshit. As others pointed out, they did not beat the speed of light. I hate stuff like this. If you really have a worthy product, just point out what it really can do, not what worked once in a lab environment.


I agree with the buzzword bingo complaint, but they really do have a claim that is defensible here.

Their claim is that you can run an application distributed in a cloud around the world with data in their database, and read queries to their database will get results in an average of 2.8 ms.

This puts a lot of caveats on that 2.8 ms claim, and makes it something that is both good and believable. Which makes the claim very much not bullshit.


Snackai is right. Bullshit stays bullshit.


You should explain how it works. It's not like I'm going to steal your ideas and spend five years implementing them ... or maybe I will if it's good ;)


The node that receives the query figures out which nodes it needs to talk to for an answer, and then it adds the operations to the next batch for those nodes. The batch dispatches, the nodes do their work, the batch commits, and the client receives the response.

It's not really a full answer, but maybe hints at the architecture.
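In very rough Python-flavored pseudocode (a toy illustration of that flow, with hypothetical helpers like append_to_global_log and p.apply, not the actual implementation):

  def run_batch(pending_queries, partitions, append_to_global_log):
      # Order the whole batch once via the replicated transaction log...
      batch = list(pending_queries)
      log_position = append_to_global_log(batch)
      # ...then each involved partition applies its share deterministically,
      # and the coordinating node assembles responses for the clients.
      responses = {}
      for txn in batch:
          involved = [p for p in partitions if p.owns_any(txn.keys)]
          responses[txn.id] = [p.apply(txn, log_position) for p in involved]
      return responses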


I have been a fan of Evan going way back to the early Rails days. Congrats on the launch.


I'm curious about the relational-ness of FaunaDB. e.g. How do you efficiently maintain integrity of foreign key constraints across the entire system? How fast and consistent are secondary indexes?


So... where does the data go? Maybe a simpleton question but I couldn't easily find an answer in the about section. If it's all function-based, where does the data actually get persisted?


This is a database FOR serverless style applications. It runs on servers like most databases, it's not made out of lambdas. But it's built so you don't have to worry about the details. When a traffic spike hits your app we'll keep up. And when your app is quiet, you don't pay for unused capacity.


Ooooooooooh, that makes sense now. I thought it was a database implemented as a whole series of individual lambdas, and I was having a hard time figuring out how you guys pulled that one off.

Data stored "on the wire"? :D


Pretty sure this is how reality works: all matter is information, all information is functional, hence all perception is the lazy evaluation of a functional universe.

It merely remains to turn this into a startup.


Data stored "on the wire"? :D

pingfs - Stores your data in ICMP ping packets

https://github.com/yarrick/pingfs

https://news.ycombinator.com/item?id=9844725

There should be a Rule 34 for insane software ideas :)


Could be done with data stored in files in S3 buckets. Probably not the most efficient way to run a queryable database, but possibly not the worst either.


If my calculations are correct, that's about $87+ million USD to store 1 PB of data for one year?


Yes that's what I worked out as well. That's not doing any operations just storage.


That doesn't seem right. Will investigate.


10^9 [MB in a PB] * 24 [hours] * 365 [days] / 1,000 [points] * $0.01 = $87.6 million

So unless you have some quantity discounts, that would seem to be the price for storing 1 PB , without any querying.


Yeah it's our mistake. It's actually gigabyte-hour.


So in the OP's example, the cost is $87,000 per petabyte-year of storage?

EDIT: I see mentioned in another post this a per-replica cost. So it would be roughly $87,000 times the number of replicas, ignoring the initial queries that inserted the data in the first place?


I think so but we are going to run the cost regressions again and make sure it's in line with market.


How are you running this serverless? Is it a thin application in front of AWS or Google BigQuery?


No, this is not made OF serverless, it's made FOR serverless. We run on big nodes in the same cloud as your app.


Cool. My team has been looking for something like this.


You can jump directly into the developer dashboard with only an email address here, and start playing with queries: https://fauna.com/serverless-cloud-sign-up


My team wants to understand this in context of other databases. Are there any architecture docs available?


We have a white paper coming soon...if you email us at priority@fauna.com we can give you a preview.


This sounds a lot like google spanner. I'm no expert though. What's the difference?


The big difference is that we use a logical clock and batches of operations so that we aren't dependent on atomic clocks. We also have a different API style, and plan to run on all the major cloud providers. You'll be able to do consistent operations visible to application code running in different providers.
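For readers unfamiliar with logical clocks, the classic Lamport-clock idea fits in a few lines of Python (a generic textbook sketch, not our actual implementation):

  class LamportClock:
      def __init__(self):
          self.time = 0

      def tick(self):
          # Local event, e.g. sealing a batch of operations.
          self.time += 1
          return self.time

      def observe(self, remote_time):
          # Merge a timestamp received from another node.
          self.time = max(self.time, remote_time) + 1
          return self.time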


can't wait to learn more


The docs are now public...so you can.

We have a blog post by Daniel Abadi coming as well about the consistency model.


Have you talked to Aphyr yet about testing it and having it become an entry in https://aphyr.com/tags/jepsen?

I've learned to not believe that distributed software will work in practice in the way that its authors claim it will. The stronger the claims, the more important it is to have an independent test validating it before I even think of trusting it.


Totally understandable. We have been talking to Kyle. We have internal verification systems and will be publishing more later this year to that effect.

For what it's worth, we have built high-performance distributed systems before....so it's not just wishful thinking.


Do you support a hard limit on money spent? I would like to be able to say 30 bucks a month max or something


If your database is not Open Source then your marketing lingo needs to be more open, or else you'll make the same mistake as FoundationDB (which looked like vapor-ware).

As a proprietary service, you are now competing against Cloud Spanner, which (while people love the underdog) means you're toast, because they have Eric Brewer to hand-wave away their marketing lingo.

On the flip side, you are competing against Cockroach, but they are Open Source, so that puts you between a rock and a hard place. From previous comments of mine, you may know I don't think Cockroach has much of a future either, because globally consistent databases aren't going to cater to the necessary P2P future of the web (5B+ new people coming online, 100B+ IoT devices, the graph-enabled social web, machine learning, etc.), which is what we, http://gun.js.org/ , cater to. We just successfully ran load tests on low-end hardware doing 1.7K table inserts/sec across a federated system, and we plan on getting this up to 10K inserts/second on cheap [if not free] hardware.

Why are these systems going to fail to pick up the market? Because the best of the best, both in engineering and as an Open Source community, RethinkDB (which I praise highly) couldn't. At the end of the day, the few companies that need globally consistent transactions will trust (for better or for worse) Cloud Spanner, and the others who want to roll their own infrastructure will try Cockroach but ultimately switch to RethinkDB in the end.

So on that note, as others have noted, don't use your /fantastic/ marketing opportunities (top of HN) to make false claims about being "industry first"; it won't help you gather a developer community. Use this time to win developers over like Firebase did (which itself now has its community scared of when/if Google will shut it down; those developers are now flooding to RethinkDB and ours, despite Firebase being one of the best - high praise for them as well, like Rethink).


> Globally Consistent databases aren't going to cater to the necessary P2P future of the web

Well, that's an interesting assertion. Why do you think that?


Because even a "3ms latency" (which was a problem, with respect to "global", that other people have commented on) can absolutely kill the performance for IoT data that may be emitting thousands of updates a second.

Those systems are largely highly localized, and so /strong eventual consistency/ is more important than globally consistent blocking operations.

Also, again with 5B+ people coming online, Master-Slave systems (even distributed ones) still have a huge bottleneck already in the present day. P2P systems (master-master) will scale better in these settings.


I was more curious about the "necessary P2P future of the web" part.

I think there's an assumption here that most of the responsibility for storing the source of truth will move out to things like IoT devices (i.e. fog computing).

And sure, there will probably be a need for that. But regarding the assumption that most web services will go away, I don't think there's sufficient evidence to bet on it happening anytime reasonably soon. Data centers and public clouds will probably still be there in the next decade or two.


Twitter is spending over $15M/month on server costs alone to support 333M active monthly users.

Now compare to Pokemon Go's huge explosion of 20M daily users from a while ago.

This problem is only going to get worse with another 5B+ people coming online into the 2020s.

In order to scale, using (what you call) "fog computing" will be absolutely necessary. Cloud services will still be used, of course, but they will be built as P2P systems to take advantage of the "fog".

Cloud infrastructure will always be around, but how apps are built will be a fundamentally different architecture. But when S3 goes out, like it did the other week, we can't suffer worldwide downtime - that will be unacceptable.

Rethink's unfortunate failure to capitalize in this market is a signal that Master-Slave databases (even the best of the best) will have a very small role with respect to the total amount of data flowing through the internet.

My thoughts here: https://hackernoon.com/the-implications-of-rethinkdb-and-par...

As well as my the Changelog podcast interview: https://changelog.com/podcast/236


Why do you think people will leave CockroachDB for RethinkDB?

I ask this as a long-time user of RethinkDB.


While Cockroach has more emphasis on being globally consistent than Rethink (which has more emphasis on realtime), they are both distributed Master-Slave systems. So:

(1) RethinkDB got good reviews/patches on the Jepsen tests; the recent Cockroach review wasn't as successful (although I'm sure they'll get patches and performance up).

(2) The convenience of the realtime updates and developer community friendliness is going to win over (from a social perspective) the types of startups/teams that choose to roll their own not-locked-in all-open-source infrastructures that they deploy to clouds.

I'm pretty strongly opinionated on these things: I think Firebase and RethinkDB nailed it, and other contenders (in those spaces, whether a Master-Slave service or an open source one) face uphill battles.


1. CockroachDB is still in beta.

2. I've been using RethinkDB for years and I've never found a use for the realtime updates. I think the benefit of that is mostly limited to chat apps, realtime collaboration, etc.


It's all based on use case, I guess. I spent the early part of my programming career building ERP and Accounting type systems, where real time updates are not factored into any design.

However, of late, I have been working on collaborative-type apps, including IoT device programming, and real-time updating is not just a luxury, it is expected. Indeed we are seeing things like SSE (Server-Sent Events) being incorporated into the latest browser specs to support this.

Granted, unless you are using frameworks like Meteor etc., there is still a lot of work to be done to ease the integration between back end server push and browser real time display. Websockets are great, but require a lot of tedious management at scale.

But the thing is - once you start down the path of realtime updated apps, possibilities open up, and you begin to wonder how you used to program without it. For me, it all started when I knocked together this [0] real time update of Hacker News as a weekend project using RethinkDB for push updates, and Vue.js as the front end...

[0] - https://tophn.info
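For anyone curious, the push side of that is only a few lines with the RethinkDB Python driver (table and handler names here are just illustrative):

  import rethinkdb as r

  conn = r.connect(host="localhost", port=28015, db="tophn")

  # Every insert/update/delete on the table is pushed to this cursor as it happens.
  for change in r.table("stories").changes().run(conn):
      new_doc = change["new_val"]
      if new_doc is not None:
          broadcast_to_clients(new_doc)   # hypothetical fan-out over websockets/SSE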


Interesting, would you mind sharing more of what you are doing? Batch processing, or something? The category of "realtime collaboration" seems to be the broad catch-all that I'm thinking of (todo lists/trellos, chat apps/gitter, social networks/facebook, search apps/google, productivity suites/gDocs, recording apps/youtube, automation tools/IFFT), plus the hype around drones, IoT, ML, etc.

That would be excluding banking apps, reports, etc. could you expand on yours/other uses that don't benefit from live updates?


I use it mainly for storing user data, error data, login info, etc. I can't imagine how realtime could be useful for that.


Ahh, that makes sense. Logs/status and such. Thanks!


No probs. I suspect a lot of organisations use a database mainly for user data.



