Latency Comparison: DynamoDB vs. FaunaDB vs. Redis (news-app-two-omega.vercel.app)
57 points by noahfschr on April 13, 2021 | 55 comments

As others have pointed out, measuring latency from an AWS Lambda function to a co-located single node in-memory non-durable key-value database (Redis), or to a co-located single AZ eventually consistent key-value database (DynamoDB), doesn't have anything to do with measuring client-observed latency from the browser to a globally distributed ACID-compliant document database.

A similar process co-located with a Fauna region normally can also perform simple reads in small numbers of milliseconds.

Similarly, a browser client querying a lambda function multiple times from the other side of the world will also be quite slow, even if the lambda "thinks" its queries are fast because its database is right next to it.

It is not completely clear to me what else is going wrong, but the basic premise of the benchmark is invalid, and there are other errors in regard to Fauna. For example, index serializability only affects write throughput; it has nothing to do with read latency. The status page reports write latency, not read latency, etc.

For a more even-handed comparison of Fauna to DynamoDB, see this blog post: https://fauna.com/blog/comparing-fauna-and-dynamodb-pricing-...

I think we figured out at least one issue here.

Fauna is a temporal database, not a time series database. The code in the test that updates the score on every post after every read creates new historical events each time it runs. These events have to be read and skipped over during the read query, which continually increases latency in proportion to the number of updates that have occurred.

By default, Fauna retains this data and makes it queryable (with transactional guarantees) for 30 days, unlike DynamoDB or Redis. Reducing the retention period would help a bit, but event garbage collection is not immediate so there will still be differences for heavily churned documents. Normally, having a few updates to a document or an index has no noticeable impact but in this case it appears to be swamping the other factors in the latency profile.

It is possible to manually remove the previous events in the update query; doing that should reduce the latency. Nevertheless, Fauna is not a time series database so this is a bit of an anti-pattern.
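
To make the effect described above concrete, here is a toy model in Python. It is not Fauna's storage engine or API; the class `ToyTemporalStore` and its methods are hypothetical names invented for illustration. It assumes only what the comment says: each update appends a historical event, reads have to skip over old events, and removing old events shortens the scan.

```python
from collections import defaultdict


class ToyTemporalStore:
    """Toy model only: every update appends a version instead of overwriting."""

    def __init__(self):
        self.history = defaultdict(list)  # key -> list of (timestamp, value)
        self.clock = 0

    def update(self, key, value):
        # Each write creates a new historical event.
        self.clock += 1
        self.history[key].append((self.clock, value))

    def read_latest(self, key):
        """Return (value, versions_scanned).

        The scan count stands in for the cost of skipping old events.
        """
        scanned = 0
        latest = None
        for _ts, value in self.history[key]:
            scanned += 1
            latest = value
        return latest, scanned

    def prune(self, key):
        # Keep only the newest version, as a stand-in for removing old events.
        self.history[key] = self.history[key][-1:]


store = ToyTemporalStore()
for score in range(1000):
    store.update("post:1", score)  # hypothetical key

value_before, scanned_before = store.read_latest("post:1")
store.prune("post:1")
value_after, scanned_after = store.read_latest("post:1")
```

In this model the read returns the same value before and after pruning, but the number of versions scanned drops from one per update to one, which is the shape of the latency growth the comment describes.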

I have commented this section and redeployed the code.

See the Fauna endpoint: https://71q1jyiise.execute-api.us-west-1.amazonaws.com/dev/f...

See the code: https://github.com/upstash/latency-comparison/blob/master/ne...

Did you delete all the extra events that have been created already?

If you mean histogram, yes I reset the histogram for Fauna.

If you mean deleting FaunaDB internal events, I do not know how to do that. Can you guide me?

AWS DynamoDB is multi-AZ and is strongly consistent (at least for a single key update).

I believe you confused it with the Dynamo database described in a paper Amazon published long ago. AWS DynamoDB has nothing to do with the eventually consistent design the paper describes. It was purely a marketing gimmick to call the new AWS service that.

Good article, thanks.

I'm currently reading The DynamoDB Book, and even its author acknowledges that DynamoDB is serverless more as a byproduct and shouldn't be used for that reason alone.

But as a product developer all these highly integrated AWS services that can be provisioned with CloudFormation are pretty convincing.

What's Fauna's IaC story?

I will be happy if someone from the Fauna team helps me to improve my code. https://github.com/upstash/latency-comparison

Upstash is not non-durable. It is based on multitier storage (memory + EBS) implementing Redis API. In a few weeks, I will add upstash premium which replicates data to multiple zones, to the benchmark app.

In the blog post, I mentioned the qualities where Fauna is stronger than the others: https://blog.upstash.com/latency-comparison#why-is-faunadb-s...

Your own docs say that by default “writes are not guaranteed to be durable even if the client receives a success response”.

Upstash has two consistency modes. Eventual consistency and Strong consistency. Please see: https://docs.upstash.com/overall/consistency

In my code, the Upstash database was eventually consistent. Similarly, the index in FaunaDB was not serialized.

But both of those should not affect the latency numbers in the histogram because those numbers are all read latency.

That's an apples to oranges comparison though. Upstash couples durability with consistency/isolation. Regardless of configuration, FaunaDB and DynamoDB both always ensure durability of acknowledged writes with a fault tolerance of greater than 1 node failure. To compare them on equal footing, Upstash would need to be configured for strong consistency, at least according to the docs.

DynamoDB also guarantees your write is distributed to multiple data centers as well.

No it doesn't. The write is not durable in other datacenters when the client acknowledgement is returned. It is still possible to lose data.

It does. The three AZs they replicate across are just as good as anything else someone typically calls a "datacenter." Amazon itself operates retail out of one region and uses multi-AZ as a fault tolerance strategy.

It won’t lose data in the event of a data center failure. Each of the replicas is in a different AZ, and at least two of three have to durably write the data before the put succeeds.

An AZ is not a datacenter.

This is what I was referring to: An Availability Zone (AZ) is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. AZs give customers the ability to operate production applications and databases that are more highly available, fault tolerant, and scalable than would be possible from a single data center. All AZs in an AWS Region are interconnected with high-bandwidth, low-latency networking, over fully redundant, dedicated metro fiber providing high-throughput, low-latency networking between AZs. All traffic between AZs is encrypted. The network performance is sufficient to accomplish synchronous replication between AZs. AZs make partitioning applications for high availability easy. If an application is partitioned across AZs, companies are better isolated and protected from issues such as power outages, lightning strikes, tornadoes, earthquakes, and more. AZs are physically separated by a meaningful distance, many kilometers, from any other AZ, although all are within 100 km (60 miles) of each other.

While your points are valid, that you're CTO of Fauna is relevant information.

not really. these are all facts, not opinions.

> For a more even-handed comparison of Fauna to DynamoDB, see this blog post: https://fauna.com/blog/comparing-fauna-and-dynamodb-pricing-...

An even handed comparison from the authors of FaunaDB?

I found that comparison to be technically objective. Did you find some bias in it?

“Writes always go through a leader replica first; reads can come from any replica in eventually-consistent mode, or the leader replica in strongly consistent mode.”

This part isn’t correct. The two follower replicas can serve a consistent read even if the leader is behind / down. And there is no guarantee the primary even has the data persisted to disk when the subsequent read call is made.

If people really want to know how DynamoDB works, this is a good tech talk: https://www.youtube.com/watch?v=yvBR71D0nAQ

As a general rule in benchmarks, if product A does something in 1 ms while product B does it in 500 ms, something is wrong. Either you misconfigured it, or these products serve different purposes and so are not comparable. Either way, your benchmark is unfortunately flawed.

As a general courtesy, when you see a dramatic difference, it’s better to involve parties or ask for review before publishing your results.

FaunaDB is doing more than Upstash and DynamoDB in the author's examples, as the author describes in the related blog post:

- FaunaDB is providing strong consistency and isolation; Upstash and DynamoDB are providing eventual consistency.

- FaunaDB is replicating the data worldwide and offering similar access everywhere; Upstash and DynamoDB are deliberately configured in the same AWS region as the lambda function.

DynamoDB can be used with strong consistency; it would be interesting to see whether that doubles the latency.

Dynamo strongly consistent transactions are still limited to a single region, and are eventually consistent outside of that region. For example, in Dynamo, it is not possible to enforce uniqueness via multi-region strongly consistent transactions. Fauna can do this.

The Dynamo transactions do increase latency, but not to the same degree as Fauna. However, they are not achieving the same level of transactional correctness either, or really any correctness at all in a multi-region context.

I struggle to think of a scenario where cross-region transactions would be useful. Why not just pick a primary region where transactions will be carried out for a given piece of data if consistency guarantees are needed?

I'm a DBA, and I agree. Cross-region transactions rarely make sense because if one region is not reachable (down, network-partitioned for more than 1 minute), then you can't do writes to any region. Think about it. :)

I guess if your partition time interval was known to be very short, like a flaky ISDN link, it could make sense for some use cases using retries, but then you should just get a better link.

CockroachDB discusses a multi-city vehicle sharing use case where multi-region transactions could be worth consideration, but I'm skeptical:


(Developers and students get all excited about distributed systems, CAP, etc. but as a DBA, network partitions are largely not solvable from both technical and business standpoints. The solutions that do work include using vector clocks, or investing in a very reliable network, which is what Google is doing with dedicated fiber.)

You can indeed perform writes in other regions; this is the entire point of Calvin, Spanner, and other modern distributed transaction algorithms: maintaining consistency and maximizing availability in the face of partitions. Your perspective is about a decade out of date.

Yes, I know.

I was just interested if a ConsistentRead would change the latency.

It has to acquire and release locks, so yes.

Are you sure about that? DynamoDB is paxos based, so that seems unnecessary.

Given data is always replicated to at least 2 of 3 storage nodes before ACK’ing, you can always just read from 2 different replicas and be sure you have the latest data.
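
The quorum argument above can be sketched in a few lines of Python. This is generic quorum reasoning (write quorum plus read quorum greater than the replica count), not DynamoDB's actual implementation; the function names and the version-number scheme are assumptions made for illustration.

```python
from itertools import combinations

N_REPLICAS = 3
WRITE_QUORUM = 2
READ_QUORUM = 2  # WRITE_QUORUM + READ_QUORUM > N_REPLICAS, so quorums overlap


def quorum_write(replicas, reachable, key, value, version):
    """Store on every reachable replica; acknowledge only with a write quorum."""
    acks = 0
    for i in reachable:
        replicas[i][key] = (version, value)
        acks += 1
    return acks >= WRITE_QUORUM


def quorum_read(replicas, chosen, key):
    """Read from the chosen replicas and return the highest-versioned value."""
    if len(chosen) < READ_QUORUM:
        raise ValueError("need at least READ_QUORUM replicas")
    answers = [replicas[i].get(key, (0, None)) for i in chosen]
    return max(answers, key=lambda pair: pair[0])[1]


replicas = [{} for _ in range(N_REPLICAS)]

# First write reaches all three replicas.
first_acked = quorum_write(replicas, [0, 1, 2], "k", "old", version=1)
# Second write misses replica 2 but still reaches a quorum of two.
second_acked = quorum_write(replicas, [0, 1], "k", "new", version=2)

# Any pair of replicas must include one that saw the acknowledged write.
results = [
    quorum_read(replicas, list(pair), "k")
    for pair in combinations(range(N_REPLICAS), 2)
]
```

Every pair of replicas returns the newest acknowledged value here, even though replica 2 is stale, because any two replicas out of three must overlap with the two that accepted the write.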

Oh, I thought you were referring to the multi-key consistent transactions, not single key strong consistency mode. I think you are correct.

My suspicion is that this may not tell the full story. For example, availability-wise I bet there are differences between these databases. As just one example, I bet this person wasn't running with a multi-AZ setup for Upstash, since https://docs.upstash.com/overall/databasetypes says "Multi Zone Replication" is a premium feature. Whereas DDB doesn't even let you store your data in a single AZ, AFAIK (https://docs.aws.amazon.com/amazondynamodb/latest/developerg...).

(My understanding is shallow compared to real experts, but even so I know this is a deep topic and this is only one example of the type of thing you'd want to consider when figuring out whether to take this comparison at face value.)

From other comments, it appears OP is associated with upstash.

I would love to see more options in the serverless persistence space. Given the infinite combination of tradeoffs that exist with databases there's a lot of room for different solutions.

AWS Aurora Serverless v2 is close to what I'm looking for, but I think it's held back by trying to make non-serverless technology serverless.

Fauna is close but I don't really need global consistency and don't want to pay the 150ms latency price for it: https://status.fauna.com/

Region selection is coming up if that interests you. We are actively working on it :)

I need to do some benchmarking myself but it seems that even in a local region writes are in the 100s of ms. I'm aiming for my Lambda functions to be < 100ms so using Fauna seems difficult to work in.

Unless you meant I can limit "how global" my data is and that would improve write speeds?

The spanner approach could dynamically shift the leader location for parts of the key space, so writes that tend to be done from one location could avoid needing to communicate outside the local region.

I've been using Fauna for a year or so.

What's slowing Fauna here are the global writes and the non-idiomatic FQL queries.

Right now the code is doing a bunch of separate queries, but in idiomatic FQL this would be done in a single transaction.
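
As a rough illustration of why batching matters, here is a back-of-the-envelope model in Python. All numbers (round-trip time, per-query server time, query count) are hypothetical, and the formula is a simplification that assumes the separate queries run sequentially.

```python
def total_latency_ms(round_trips, rtt_ms, queries, server_ms_per_query):
    """Network cost scales with round trips; server work scales with queries."""
    return round_trips * rtt_ms + queries * server_ms_per_query


# Hypothetical figures: 5 queries, 40 ms round trip, 2 ms of server work each.
separate = total_latency_ms(
    round_trips=5, rtt_ms=40.0, queries=5, server_ms_per_query=2.0
)
single_transaction = total_latency_ms(
    round_trips=1, rtt_ms=40.0, queries=5, server_ms_per_query=2.0
)
```

Under these assumptions the server does the same amount of work either way, and the difference comes entirely from paying the network round trip once instead of five times.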


I'm going to do a PR to update the FQL code if the author accepts it.

Sure, I will.

Thanks for the merge. As you pointed out in the PR, you're only measuring the latency of the first query, so it will have no effect on the benchmark. I'm guessing the "current request" latency will improve though, no?

I have to say the latencies you're getting are much higher than I've experienced with Fauna. Obviously it's expected for a KV database to be faster, but I'd be surprised to see more than 100ms at the 50th percentile.

Current request latency is also the read latency (from lambda function -> db). I should have been more clear on that.

Still, almost a second of total latency (50th percentile) is super high.

I'd love to do the same test from Cloudflare Workers instead of AWS Lambda. Could you please DM on Twitter to discuss this?



I just saw this comment by Evan from Fauna which explains why the latency could be so high:


Network latency is not accounted for: they use AWS lambda calling DynamoDB / Redis in the same data center; Fauna endpoint is somewhere else.

Curious what Firebase latency would be in comparison, when called from the same GCP data center or AWS.

According to the blog post this site runs in AWS us-west-1. With Fauna you don't know the data center but according to their status page they have infrastructure in AWS us-west-2. Latency from us-west-1 to us-west-2 is 25ms so you can subtract that from the total time.
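
The subtraction suggested above can be written out as a short Python sketch. The sample latencies are made up for illustration, and the 25 ms figure is taken from the comment as a flat per-request network overhead; whether it is one-way or round-trip is not stated, so treat the result as approximate.

```python
from statistics import median

INTER_REGION_MS = 25.0  # us-west-1 to us-west-2, as quoted in the comment


def estimated_db_latency_ms(observed_samples_ms, network_ms):
    """Median observed latency minus the assumed network overhead."""
    return max(median(observed_samples_ms) - network_ms, 0.0)


# Hypothetical observed latencies from the Lambda function, in milliseconds.
samples_ms = [380.0, 400.0, 420.0, 390.0, 410.0]
estimate = estimated_db_latency_ms(samples_ms, INTER_REGION_MS)
```

With these made-up samples the network hop accounts for only a small share of the observed time, which is why it does not by itself explain latencies in the hundreds of milliseconds.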

Fauna claims to route your request to the nearest data center so I'm interested in validating this. Seeing 400ms latency where I'd expect <50ms is important to me, especially on Lambda where you are billed waiting for the response.

Which should be the case. Depending on what you do, you will probably experience 10-50ms read latencies. See Evan's answers for why the measured values here are higher.

Sorry, what is the overlap in usecase between DynamoDB vs. Redis vs. FaunaDB? There aren't many problems I would be evaluating either DynamoDB or Redis for - usually the problem domain is one or the other (an in-mem DB vs. a persistent store). I also have no idea what FaunaDB is, so I'm not sure how it compares to either.

Comparing Fauna with DynamoDB and Redis gives the wrong impression. Trade-offs are totally different.

However, the product in question (Upstash) looks very interesting: serverless storage with a pay-as-you-go model and a maximum price cap is an excellent idea.

I would love to have more options in this space, especially in GCP to use together with Cloud Run.

I would be interested in seeing how DynamoDB fronted by DynamoDB Accelerator (DAX) performs in a latency comparison such as this.

Is this the price Fauna pays for its consistency guarantees?

No, it's not. That price is far, far smaller and should not impact pure reads. This is probably an artifact of an anti-pattern where the same documents are constantly updated, which creates significant history. At this point, that can have an impact on the index. We are working on optimizing that, after which history will no longer have an impact on these latencies while retaining the possibility to go back in time or get changesets.
