> Each hash key resolves to a number of possible servers the data can be on. Data is replicated across several of these servers. For redundancy. The hash key determines which shard to use.
> On individual machines, each set of data is stored by a compound key of hash key and sort key (if there is a sort key). The data is probably stored on disk sequentially by sort key or close to it.
This is pretty much exactly correct. The hash key maps to a quorum group of 3 servers (the key is hashed, with each quorum group owning a range of the hash space). One of those 3 is the master and coordinates writes as well as answering strongly consistent queries; eventually consistent queries can be answered by any of the 3 replicas.
> They possibly use something like LevelDB for this.
Sigh...if only. I don't remember the exact timeline but LevelDB either didn't exist when we started development or wasn't stable enough to be on our radar.
DynamoDB is this very elegant system of highly-available replication, millisecond latencies, Paxos distributed state machines, etc. Then at the very bottom of it all there's a big pile of MySQL instances. Plus some ugly JNI/C++ code allowing the Java parts to come in through a side door of the MySQL interface, bypassing most of the query optimizer (since none of the queries are complex) and hitting InnoDB almost directly.
There was a push to implement WiredTiger as an alternative storage engine, and migrate data transparently over time as it proved to be more performant. However, 10gen bought WiredTiger and their incentive to work with us vanished, as MongoDB was and is one of Dynamo's biggest competitors.
Ah! I knew it! It's an orchestration layer on top of a mysql layer with floating masters.
I remember back when DDB was first launched we all sat around at Powerset and tried to figure out how Amazon did it given the two pieces of information we were given: "it is not based off the dynamo paper" and "it is based off both open source and proprietary code". We figured it had to be MySQL or Postgres that they were referring to.
I didn't know you were completely bypassing the query layer though.