Amazon DynamoDB Transactions (amazon.com)
173 points by leef 4 months ago | 75 comments

Any experience with using Aurora in place of DynamoDB?

A couple years ago there was an interesting tidbit at re:Invent about customers moving from DynamoDB to Aurora to save significant costs.[1] The Aurora team made the point that DynamoDB suffers from hotspots despite your best efforts to evenly distribute keys, so you end up overprovisioning. Whereas with Aurora you just pay for I/O. And the scalability is great. Plus you get other nice stuff with Aurora like, you know, traditional SQL multi-operation transactions.

It was kind of buried in a preso from the Aurora team, and the high-level messaging from Amazon was still that NoSQL is the most scalable thing. Aurora was and still is seemingly positioned against other solutions within the SQL realm. I sort of get it in theory, since NoSQL is still theoretically infinitely scalable whereas Aurora is bounded by 15 read replicas and one write master... but in practice these days those limits are huge. I think one write master can handle like 100K transactions a second or something.

So, I'm really curious where this has gone in the past couple years if anywhere. Is NoSQL still the best approach?

[1] https://youtu.be/60QumD2QsF0?t=1021

Oh cool. For those reading along this is titled "How Amazon DynamoDB adaptive capacity accommodates uneven data access patterns (or, why what you know about DynamoDB might be outdated)". Is this a new feature?

Yeah, I should have elaborated a bit. I believe adaptive capacity was announced at re:Invent in 2017 and may have been released shortly after, maybe in early 2018. The feature is getting a lot more press & push from AWS lately though, for sure.

I remember having a conversation with our AWS rep about 2 years ago during our quarterly feature request meeting. I remember asking for DynamoDB autoscaling and burst capacity; pretty happy they finally delivered.

Since then we've pretty much cut our DynamoDB bill in half and had a drastic reduction in throttled responses.

I personally recommend using a SQL database until you're absolutely positively sure you don't need one, for many reasons.

But, as far as the "you end up overprovisioning" because of hotspots thing, DynamoDB does offer autoscaling these days, which should alleviate a lot of provisioning-related headaches and save you money compared to the static provisioning you would otherwise have done, from what I understand.

We use a hybrid. We process a lot of incoming data and dump most of it into Dynamo (it's ephemeral, so the TTL feature is nice), and if we get capacity errors (Dynamo takes a while to scale up sometimes) we just dump our objects in the DB. The end result is we keep a huge amount of writes off our DB for processing incoming largish objects. The amount of data it stores would cost an arm and a leg to put into Redis.

Granted, I don't think I'd want to use Dynamo for anything other than temporary data. Lock-in makes me nervous, and the way it scales up/down really makes it difficult to use for hourly workloads... by the time it scales up we're close to done needing more capacity, then it doesn't scale down for like 40m after. We set up caps and the DB overflow mechanism keeps things from grinding to a halt.
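Roughly, the write path looks like this (all names here are illustrative, not our real code; `put_to_dynamo`/`put_to_sql` stand in for the actual clients):

```python
import time

def make_ephemeral_item(key, payload, ttl_hours=1):
    """Build a DynamoDB item with an epoch-seconds TTL attribute.

    DynamoDB's TTL feature expects a Number attribute holding the
    expiry time as epoch seconds; expired items get deleted for free.
    """
    return {
        "pk": {"S": key},
        "payload": {"S": payload},
        "expires_at": {"N": str(int(time.time()) + ttl_hours * 3600)},
    }

def write_with_fallback(item, put_to_dynamo, put_to_sql):
    """Try Dynamo first; overflow to the relational DB when throttled."""
    try:
        put_to_dynamo(item)
        return "dynamo"
    except Exception:  # e.g. a ProvisionedThroughputExceededException
        put_to_sql(item)
        return "sql"
```

In the real thing `put_to_dynamo` would wrap boto3's `put_item` and the except clause would catch only the capacity error, but the shape is the same: the TTL keeps Dynamo self-cleaning and the SQL DB only absorbs the overflow.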

Why don't you use Kinesis for this? Isn't that what it's made for?

> DynamoDB does offer autoscaling these days, which should alleviate a lot of provisioning-related headaches

The problem they noted isn't lack of autoscaling, it's that you have to provision the entire datastore to accommodate your hottest partition.

GP used the wrong term; I think they meant adaptive capacity, which is a newer feature where shards automatically lend capacity to each other in the case of hotspots.

Autoscaling doesn't always help with hot shards (which I think gp was referring to) because you can have a single shard go over its share of the throughput[0] while still having a low total throughput.

[0] total throughput/num shards

This has largely been resolved: a single shard can now consume more of the throughput than your equation would give it. AWS refers to it as adaptive capacity.


Yes. Relational databases are very fast, and using them as key/value stores is a great use case. Using a scale-out system like Aurora makes it even better. It's slower because of SQL parsing, and generally the SQL clients are not as fast, but you can get close to single-digit-millisecond latency these days.

We use Aurora or Postgres for key/value unless we need something specific, like multi-regional capacity or really high-end performance. For that we run ScyllaDB.

> It's slower because of SQL parsing and generally the SQL clients are not as fast

I'd be really surprised if the client library introduces a latency significant enough to be compared to the network latency between the app server and the database server.

Many libraries handle DB connections poorly, or have heavy-handed pooling systems, or aren't fully async, all of which limits total throughput. The key/value clients usually have much simpler APIs, like HTTP, which scale much better.

I don't understand. What makes you think it's easier for NoSQL clients (versus SQL clients) to correctly implement connection pooling and async networking? For example, MongoDB and Cassandra wire protocols are not based on HTTP. And even if they were based on HTTP, connection pooling and async networking still requires a specific effort. Which libraries are you thinking of (as examples of good and bad behavior)?

Relational databases tend to have bigger and more complicated protocols, with more complex session management, data types and parsing requirements, and connections that may only support a single in-flight query.

Libraries just have to do more work compared to simpler protocols, or to HTTP, which is incredibly easy to scale and pretty much handled automatically by the standard libraries at this point.

Example: psycopg2 (the Python PostgreSQL driver) doesn't have prepared statements (or handles them poorly) compared to the Cassandra driver.

Right, but that has nothing to do with connection pooling and async networking. And there is no structural reason that makes it easier to implement prepared statements for PostgreSQL than for Cassandra. It's anecdotal evidence.

I have the exact same experience with Npgsql. It exposes Postgres's "one session, one server process" model, which is very outdated.

Whether NoSQL is the best approach and whether DynamoDB is the best approach are two separate issues. I find DynamoDB too limiting with the way that it handles indexing, read and write capacity, etc. compared to traditional NoSQL databases like ElasticSearch and Mongo.

That being said, one advantage of DynamoDB is that it is API-based and you can make a true serverless web app where all of the logic is on the client: you use web identity federation for authentication to DynamoDB, and you host your JavaScript, HTML, and CSS files on S3.

Another advantage, until two days ago, was that with most of the data stores on AWS you kept your databases behind a VPC, and if you used Lambda, your Lambda also had to be in a VPC, which increased warm-up time for the Lambda.

Now, there is the Read Only Data API for serverless Aurora. You don’t have to worry about the traditional connection pooling or being in a VPC.
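For anyone curious, usage looks roughly like this with boto3's `rds-data` client (the ARNs are placeholders, and the little parameter-building helper is my own invention, not part of the SDK):

```python
# Sketch of calling the Aurora Serverless Data API: statements go over
# HTTPS, so there is no connection pool and no VPC requirement.

def to_data_api_params(**kwargs):
    """Convert plain Python values into the Data API's typed parameter list."""
    def typed(v):
        if isinstance(v, bool):   # must check bool before int
            return {"booleanValue": v}
        if isinstance(v, int):
            return {"longValue": v}
        if isinstance(v, float):
            return {"doubleValue": v}
        return {"stringValue": str(v)}
    return [{"name": k, "value": typed(v)} for k, v in kwargs.items()]

# Usage (requires boto3, credentials, and real ARNs):
# client = boto3.client("rds-data")
# client.execute_statement(
#     resourceArn="arn:aws:rds:...:cluster:my-cluster",    # placeholder
#     secretArn="arn:aws:secretsmanager:...:secret:mydb",  # placeholder
#     database="app",
#     sql="SELECT id FROM users WHERE id = :id",
#     parameters=to_data_api_params(id=42),
# )
```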

You can write too - not just read-only.

Aurora did not work well for us (it was using local ephemeral disk to do sorts, so our query results were truncated/limited by the largest local storage), so the best option for us was to run MySQL or Postgres on an i3 instance with local SSDs.

Ok but I'm not sure this is relevant. We're talking about using Aurora in place of DynamoDB, not how it compares to other SQL DBs. With DynamoDB the kind of internal sort you're talking about isn't even possible, right?

"Plus you get other nice stuff with Aurora like, you know, traditional SQL multi-operation transactions." THIS !!!!

NoSQL has such a niche usage!

My wishlist for DynamoDB is now down to:

* Fast one-time data import without permanently creating a lot of shards (important if you are restoring from a backup)

* Better visibility into what causes throttling (e.g. was it a hot shard? Was it a brief but large burst of traffic?)

* Lower p99.9 latency. It occasionally has huge latency spikes.

* Indexes of more than 2 columns

* A solution for streaming out updates that is better than dynamodb streams

Also, better insight into partition sizes / what's causing hot spotting. The DB abstracts a lot from the user, which isn't necessarily great, because it's still subject to the normal pitfalls of a NoSQL database.

Bigtable is a different beast, but its new "Key Visualizer" is impressive. It has helped us quickly find anomalies: https://cloud.google.com/bigtable/docs/keyvis-overview

Wish Dynamo had something similar

Not a particularly easy solution, but you can use Dynamo streams to achieve this: load fast into a temporary table and trickle-feed via a stream into another table. When it's caught up, stop writes on the import table, then swap over to the permanent table.

A way of doing this without expending all that effort is on my wish list too.

> * A solution for streaming out updates that is better than dynamodb streams

What bothers you about dynamodb streams specifically?

What kind of p99.9 latency are you looking for?

and would Dax help?

Congrats to the DynamoDB team for going beyond the traditional limits of NoSQL.

There is a new breed of databases that use consensus algorithms to enable global multi-region consistency. Google Spanner and FaunaDB where I work are part of this group. I didn’t catch anything about the implementation details of DynamoDB transactions in the article. If they are using a consensus approach, expect them to add multi-region consistency soon. If they are using a traditional active/active replication approach, they’ll be limited to regional replication.

They warn about other regions seeing incomplete transactions (if you opt into transactions on global tables), which fits with the current "copy each new item from the stream" async replication.

“DynamoDB is the only non-relational database that supports transactions across multiple partitions and tables.”

Uh... this is just not true.

Can you identify some others?

The Google Cloud Datastore (formerly the "App Engine Datastore") has had cross-entity-group transactions since 2011:


The cross group transactions are a little limited - https://cloud.google.com/appengine/docs/standard/java/datast...

I don't think it's fair to compare them.

However, the more recent Google storage offerings based on Cloud Spanner do seem to offer this. I don't see how Amazon can make this statement - that doesn't stop it being an excellent enhancement to DynamoDB though.

Cloud Firestore (the next generation of Cloud Datastore) removes those limitations. https://cloud.google.com/firestore/docs/manage-data/transact...

It also supports the Cloud Datastore API.

(I work on it!)

DynamoDB is limited to 10 items, whereas the Cloud Datastore limit is 25 different 'tables'. The new version via Cloud Firestore doesn't even have that restriction. AWS is several years behind, and several NoSQL systems behind, in this area. Still, a cool addition.

I don't know if the overall statement is true, but Spanner is relational and the statement was limited to non-relational databases.

The “and tables” clause is the differentiator, I think. DynamoDB tables are roughly equivalent to Datastore namespaces; I don’t believe Google Cloud Datastore supports cross-namespace transactions.

It does, and has for several years.

As far as I'm aware these offerings support transactions across the entire database.

Google Cloud Spanner: https://cloud.google.com/spanner/docs/transactions

Google Cloud Firestore: https://firebase.google.com/docs/firestore/manage-data/trans...

Plus if you use Cloud Firestore in Datastore Mode then Google Cloud Datastore would satisfy this requirement as well.

Spanner is not a non-relational database.

As for Firestore, it’s not clear whether it supports cross-collection transactions. Cloud Datastore does not support cross-namespace transactions AFAICT.

a) Cloud Firestore supports transactions across the entire database. You can learn more about them here: https://cloud.google.com/firestore/docs/manage-data/transact....

b) Given that the primary use case for namespaces was/is multitenancy, it's not clear to me why you'd want to transact across them. Nevertheless, you can. What's leading you to draw this conclusion?

The documentation is what led me to that conclusion, since it's not explicit as to what the transaction boundaries are, but I could be mistaken. Does this mean that the poster's claim is erroneous?

It does. It's not in the documentation because it doesn't have boundaries within the database.

Can you have more than one database per project? If so, there still might be a valid claim here.

FoundationDB https://www.foundationdb.org/ not only supports transactions, they are mandatory. They also go one step further and support atomic operations, which are especially killer.
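To make the atomic-ops point concrete, here's a pure-Python mirror of what FDB's ADD mutation does to a value server-side (with the real `fdb` bindings you'd call `tr.add(key, le_encode(1))` inside a transaction; this sketch only illustrates the semantics):

```python
import struct

def le_encode(n):
    """Encode an integer as the 8-byte little-endian value fdb expects."""
    return struct.pack("<Q", n % 2**64)

def atomic_add(existing, delta):
    """What the ADD mutation does: wrap-around little-endian addition.

    Because the database applies this itself at commit time, concurrent
    increments to the same counter never conflict with each other.
    """
    (a,) = struct.unpack("<Q", existing)
    (b,) = struct.unpack("<Q", delta)
    return struct.pack("<Q", (a + b) % 2**64)
```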

I don’t think FDB supports cross-database transactions, though.

What do you mean by this? There is only one “database” in FoundationDB terms. You can write transactions over the entire keyspace regardless of which machine the data is stored on.

Multiple clusters, then, or whatever you specify to FDB’s API to identify the instance when making a client connection.

I'm still not sure what you mean in terms of contrasting this with DynamoDB's new features. You could implement the entire DynamoDB API, with even stronger semantics than the new features listed in the article, on top of FoundationDB. Additionally, the latency would be theoretically lower as they describe needing to do a read, write, and another read per key to verify isolation, whereas FoundationDB uses an optimistic concurrency control scheme to verify at commit time that transactions do not conflict. In the common case (where transactions don't conflict) this is faster.

All I’m trying to do here is see whether the claim made in the blog post is true or not. Some commenters were claiming it was false, but I don’t think they considered all the components of the claim.

Agreed 100% and as someone who has had to use DDB before, nothing would make me happier than seeing this built.

There's not really a concept of "Database" in FDB. There is however a concept of key spaces, and "directories", which are basically the same, and these all support transactions.


  /database1/key1 = foo
  /database2/key2 = bar

HyperDex Warp (I'm not sure if it's still available) purports to provide serializability over multi-key transactions. In the HyperDex model that means over all defined spaces in the cluster. That's a stronger guarantee than DynamoDB provides, which is still susceptible to phantom reads. The DynamoDB team ought to be aware of it, because it's one of the first hits for "multi-key transactions" and the paper is an important one for designing transactions on a KVS.


MongoDB (https://www.mongodb.com/transactions)

“Multi-document transactions can be used across multiple operations, collections, databases, and documents.”

However: “Multi-document transactions are available for replica sets only. Transactions for sharded clusters are scheduled for MongoDB 4.2.” DynamoDB is sharded by design.

I see, it appears to come down to how each db interprets "partitions".

If we're referring specifically to shards then "DynamoDB is the only non-relational database that supports transactions across multiple partitions and tables." no longer sounds like hyperbole.

FaunaDB (mentioned in previous comment) -- it is multi-model NoSQL so you can do relational queries and it supports transactions across multiple partitions, documents, replicas. json docs, not tables.

CockroachDB. And you can get pretty close to being schemaless with the JSON column, and maybe with the inverted index.

RavenDB I think?

This is cool, it lifts the burden of having to bake "atomicity" into your app if you're using a key/value store like DynamoDB. I can see a nice balance of combining this with some built in error checking in the app itself.

I'd be interested to see comparisons/benchmarks against FoundationDB. DynamoDB transactions make Dynamo a serious alternative to FDB now. I can see the two main advantages for FDB being: 1) you can deploy it on premise (which is potentially important for some B2B companies), 2) it shuffles data around so that hot-spotting of a cluster is eliminated (which Dynamo appears to still suffer from).

FoundationDB is open source!

PostgreSQL got native JSON support (since 9.2, with the binary jsonb type following in 9.4) to store schemaless free-flowing documents. DynamoDB gets transaction guarantees.
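A quick sketch of the Postgres side (the table and column names are invented, and the commented-out calls assume a psycopg2 cursor):

```python
import json

def to_jsonb_param(doc):
    """psycopg2 passes jsonb values as JSON-encoded string parameters."""
    return json.dumps(doc)

# With a live connection (requires psycopg2 and a Postgres server):
# cur.execute("CREATE TABLE events (id serial PRIMARY KEY, body jsonb)")
# cur.execute("INSERT INTO events (body) VALUES (%s::jsonb)",
#             [to_jsonb_param({"type": "click", "x": 3})])
# cur.execute("SELECT body->>'type' FROM events")  # ->> extracts as text
```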

There is globalization and intermingling happening in technology too.

On a similar thought, a few years back, C# and Java got `Any` generic types, while Python/JS got static types (via python3 typings, typescript)

C# doesn't have an `Any` generic type (`Foo<?>` in Java parlance).

>If an item is modified outside of a transaction while the transaction is in progress, the transaction is canceled and an exception is thrown

You are still responsible for implementing a queue or a lock on the items you want to mutate.

That said, this is a huge milestone for DynamoDB; we can now safely mutate multiple items while remaining ACID.
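A hedged sketch of what that looks like with boto3's new `transact_write_items` call; the table and attribute names are made up, and only the request-building part runs without AWS:

```python
MAX_TRANSACT_ITEMS = 10  # DynamoDB's per-transaction item cap at launch

def build_transfer(table, from_id, to_id, amount):
    """Two conditional updates that must succeed or fail together."""
    debit = {
        "Update": {
            "TableName": table,
            "Key": {"pk": {"S": from_id}},
            "UpdateExpression": "SET balance = balance - :a",
            "ConditionExpression": "balance >= :a",  # cancel on overdraft
            "ExpressionAttributeValues": {":a": {"N": str(amount)}},
        }
    }
    credit = {
        "Update": {
            "TableName": table,
            "Key": {"pk": {"S": to_id}},
            "UpdateExpression": "SET balance = balance + :a",
            "ExpressionAttributeValues": {":a": {"N": str(amount)}},
        }
    }
    items = [debit, credit]
    assert len(items) <= MAX_TRANSACT_ITEMS
    return items

# With real tables (requires boto3 and credentials):
# boto3.client("dynamodb").transact_write_items(
#     TransactItems=build_transfer("accounts", "alice", "bob", 25))
```

If the condition fails, or another writer touches either item mid-flight, the whole transaction is canceled, which is exactly the no-partial-mutation property being described here.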

Max 10 items per transaction: that's quite a restriction! I guess you have to plan all the transactions you would perform and make sure they meet the bounds.

Is the heat map available to customers now or is it still a request you have to do?

Still have to request it AFAIAA
