
A Decade of Dynamo - werner
http://www.allthingsdistributed.com/2017/10/a-decade-of-dynamo.html
======
rm999
DynamoDB is amazing for the right applications if you very carefully
understand its limitations.

Last year I built out a recommendation engine for my company; it worked well,
but we wanted to make it real-time (a user would get recommendations from
actions they made seconds ago, instead of hours or days ago). I planned a 4-6
week project to implement this and put it into production. Long story short: I
learned about DynamoDB and built it out in a day of dev time (start to finish,
including learning the ins and outs of DynamoDB). The whole project was in
stable production within a week. There has been zero down time, the app has
seamlessly scaled up ~10x with consistently low latency, and, relatively
speaking, it all costs virtually nothing.

~~~
eropple
This is the good side of Dynamo, and it's awesome that you've had that
experience.

The flip side: Dynamo gets _expensive_, and it gets expensive _quick_; being a
custom API (and, indeed, a very different way to think about datastores) makes
migration difficult.

It's great to use, if you understand the tradeoffs. Just make sure you
understand them before you make the leap.

~~~
rm999
>Dynamo gets expensive and it gets expensive quick,

DynamoDB's pricing scales sublinearly with volume; if it starts getting
expensive, that's an initial misuse of DynamoDB that became obvious at scale.
There are a lot of factors that go into whether you should use DynamoDB and
how you implement it. I recommend anyone who is considering using it very
carefully understand this page first:
[http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/BestPractices.html](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/BestPractices.html)

~~~
candiodari
This is how enterprise developers use the database, sometimes:

[https://thedailywtf.com/articles/The-Query-of-Despair](https://thedailywtf.com/articles/The-Query-of-Despair)

Do this on your own server and the result is slowness and bad performance
(though the query may rarely, or never, get called). Do it on Dynamo and the
result may be a $10k bill.

~~~
moduspol
That's not quite true. With DynamoDB, you provision capacity, so by default,
it will simply get slow (throttled) if you use more than you expect.

The only way you could be "surprised" with a $10k bill is if you set up
autoscaling for it with your upper limits (which it requires you to choose)
high enough to reach $10k. And then you'd have to forget that you did that.
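To put numbers on that, here's a rough sketch of what a fixed provisioned-capacity bill looks like. The per-hour rates below are illustrative assumptions (check current AWS pricing, which varies by region); the point is that an autoscaling upper bound has to sit in the tens of thousands of write units before a $10k month is even possible.

```python
# Rough monthly cost of provisioned DynamoDB capacity, using
# illustrative per-hour rates (assumed -- verify against current
# AWS pricing for your region).
WCU_HOURLY = 0.00065   # assumed $ per write capacity unit per hour
RCU_HOURLY = 0.00013   # assumed $ per read capacity unit per hour
HOURS_PER_MONTH = 730

def monthly_cost(wcus, rcus):
    """Estimated monthly bill for a fixed provisioned capacity."""
    return (wcus * WCU_HOURLY + rcus * RCU_HOURLY) * HOURS_PER_MONTH

# A modest table: 100 WCUs + 500 RCUs
print(monthly_cost(100, 500))    # well under $100/month at these rates

# It takes an autoscaling ceiling of roughly 21,000 WCUs to
# approach a $10k month at these assumed rates:
print(monthly_cost(21000, 0))    # just under $10k/month
```

So a surprise $10k bill requires explicitly configuring a very high ceiling, which matches the point above: the default failure mode is throttling, not spend.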

------
simonebrunozzi
I was at AWS from 2008 to 2016. Werner Vogels, Amazon's CTO (yep, not just
AWS', but Amazon's, as he had to point out numerous times) has been one of the
most talented, humble, and generous senior execs I've ever met in my life.

I have lots of good memories of time spent with him, and losing that was one
of the sad aspects of leaving Amazon for me.

His blog writings are really interesting. If you haven't already, I suggest
you search the archives; there are several hidden gems there.

~~~
sneak
I have always tremendously admired that guy; what a job he has! He is
literally responsible for about half of the shit on the internet not going
down (and also running the biggest internet shopping mall and making sure a
bunch of crappy speakers can talk, but both of those are straightforward by
comparison IMO).

Can you even imagine? I want to know what his direct report structure looks
like.

~~~
iliveinseattle
He is an individual contributor.

~~~
toomuchtodo
Interesting contrast to YC's path post from the other day ("Founder,
Executive, Individual Contributor").

------
jchw
I'm more interested in solutions like Spanner and Cockroach. Different
tradeoffs for different applications, but they seem to be the most general
purpose of the highly scalable databases. DynamoDB is cool and I've tried to
adopt it for things, but it's surprisingly hard to imagine an application
where the model isn't somewhat limiting. The capacity provisioning is also
quite painful, which doesn't help matters any.

~~~
jchanimal
The databases you mentioned both have strong consistency, but neither has a
serverless pricing model. My employer's product, FaunaDB, has a similar
consistency model but a pay-as-you-go pricing model that requires no
provisioning or capacity planning.

You can read more about our ACID transactions here:
[https://fauna.com/blog/consistent-transactions-in-a-globally-distributed-database](https://fauna.com/blog/consistent-transactions-in-a-globally-distributed-database)

~~~
jchw
Pretty cool. Of course, the serverless version doesn't have too many regions
yet, so some of the advantages of strong global consistency may be less
useful. But I'll keep my eye on this regardless.

------
gelatocar
I'd be interested to hear how others are handling read/write capacity
configuration for dynamo. It seems like it would be very easy to hit the
account limit of 10,000 units once you are querying any significant amount of
data. I've also run into issues with auto scaling where you have to endure up
to 15 minutes of downtime before the scaling kicks in [0]. Even on a table
with ~2000 items I've found it becomes quite slow and costly to fetch data.
Also the 25 item limit on batch writes makes it pretty frustrating to
edit/delete lots of data.

\- [0] [https://hackernoon.com/the-problems-with-dynamodb-auto-scaling-and-how-it-might-be-improved-a92029c8c10b](https://hackernoon.com/the-problems-with-dynamodb-auto-scaling-and-how-it-might-be-improved-a92029c8c10b)
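On the 25-item limit: BatchWriteItem caps each call at 25 put/delete requests, so bulk edits have to be chunked client-side. boto3's `Table.batch_writer()` context manager handles this automatically; the sketch below just shows the mechanics with a hypothetical helper.

```python
# BatchWriteItem accepts at most 25 put/delete requests per call,
# so bulk writes/deletes must be split into batches of <= 25.
# (boto3's Table.batch_writer() does this for you; chunk_requests
# here is an illustrative helper, not an AWS API.)
BATCH_LIMIT = 25

def chunk_requests(items, limit=BATCH_LIMIT):
    """Split a list of write requests into API-sized batches."""
    return [items[i:i + limit] for i in range(0, len(items), limit)]

# 60 delete requests -> three BatchWriteItem calls: 25 + 25 + 10
deletes = [{"DeleteRequest": {"Key": {"id": n}}} for n in range(60)]
batches = chunk_requests(deletes)
print([len(b) for b in batches])  # [25, 25, 10]
```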

~~~
ryanworl
You can request that limit be increased through the limit increase form.

Also, if you need to scan a ton of items to assemble your desired result, you
should re-think using DynamoDB as a whole.

------
peterwwillis
When they mention companies using DynamoDB, at least one of those actually
uses their own implementation of Dynamo that they wrote to work around cost
and performance limitations.

The main problems faced are not the ability to scale or reach performance
benchmarks or keep data safe. They are operational, and primarily problems of
infrastructure complexity and management. Oh, and having developers architect
and manage the operations of a really freaking huge service is a bad idea. (No
offense intended - those developers don't want to be woken up in the middle of
the night either)

------
netvarun
Direct link to the Dynamo Whitepaper PDF:
[http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf](http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf)

------
fiokoden
I don't know why Amazon is so taken with DynamoDB. I find it incredibly
unintuitive and lacking real-world applicability; applications have to perform
gymnastics to work with it.

~~~
freedomben
I've found just the opposite actually. While it's far from perfect, it has
been amazing for rapidly standing up new apps (especially prototypes). We've
used quite a few different strategies and found it to be flexible and
performant.

The only downside is we do find ourselves sometimes implementing relational DB
functionality at the application level to compensate for Dynamo DB's
"flexibility." Postgres is still the go-to for data that is relational in
nature. But man, letting Amazon worry about hosting and scaling is also pretty
awesome...

~~~
fiokoden
>> we do find ourselves sometimes implementing relational DB functionality at
the application level to compensate for Dynamo DB's "flexibility."

Yep, this is A-grade crazy, and exactly my point. I would question if it's
"sometimes", or "actually almost all the time, now that we think about it,
there's not much that we CAN do with DynamoDB without writing application
level database functionality."

------
pritambarhate
Now that autoscaling is available for DynamoDB, my main complaint is the lack
of an out-of-the-box backup solution that works at scale.

A production DB without backups is unthinkable. It takes just one human
mistake to erase tons of data. Consistent, regular backups are a must-have for
any production system.

------
deepsun
> The Dynamo paper was well-received and served as a catalyst to create the
> category of distributed database technologies commonly known today as
> "NoSQL."

No, sorry, it was Memcached and the Bigtable paper that popularized the term
"NoSQL." Although there have been many NoSQL databases, tracing back to the
'60s [1], those were the ones that "served as a catalyst" for the term.

[1] [http://blog.knuthaugen.no/2010/03/a-brief-history-of-nosql.html](http://blog.knuthaugen.no/2010/03/a-brief-history-of-nosql.html)

~~~
pavlov
Dynamo was certainly one of the products that spiked interest in the “NoSQL”
datastore category.

The phrasing “served as a catalyst” seems right — it doesn’t imply the only
catalyst.

------
mankash666
Congrats to AWS on the impact DynamoDB has had on the ecosystem and industry.
The article does make it seem like DynamoDB was the first to publish a unique
NoSQL architecture. Is this true?

~~~
fiokoden
Lotus Notes was first. Distributed, replicated key value store.

Actually, I'm not right: [http://blog.knuthaugen.no/2010/03/a-brief-history-of-nosql.html](http://blog.knuthaugen.no/2010/03/a-brief-history-of-nosql.html)

------
jbergens
It would be interesting to read some comparisons between DynamoDB, CosmosDB,
and maybe Spanner.

------
jaxondu
We need an SDK for a Dynamo Sync feature to allow easy development of offline
mobile apps, similar in function to Cognito Sync. I also hope that AWS will
release a serverless SQL DB. And at a cheaper price.

~~~
SteveNuts
> Also hope that AWS will release a serverless SQL db.

You mean like RDS?

~~~
fiokoden
RDS is serverful, not serverless.

------
rdiddly
Oh _that_ Dynamo. Not this Dynamo:
[http://dynamobim.org/](http://dynamobim.org/)

------
gt_
What is the machine in the photo?

~~~
monkmartinez
A dynamo... ;)

------
ddou
amazing product

------
sheeshkebab
The thing doesn't even support usable cross-region replication. On top of
that, the whole read/write capacity model is a joke (a painful one at that).

Other than for a dirty JS config or a prototype store, this DB is useless.

~~~
eropple
I would be very, very careful about calling anything a company of such very
sharp people does a "joke."

One of my prior gigs was pushing a billion data points a day through DynamoDB
without it breaking a sweat. We were paying for it, too--but it was there and
it worked.

~~~
sheeshkebab
Anything that can go to dynamo, can go to s3, especially at that volume. And
you get proper multi region replication, read/write capacity based on actual
usage and instant scaling.

I stand by my comment that DynamoDB is a joke wrapped in a thick layer of
marketing crap.

~~~
ryanworl
You cannot replace DynamoDB with S3. S3 cannot perform atomic, strongly
consistent operations.

Edit: as other commenters have noted, you can perform a read after write on a
new key.
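The "atomic" point is about conditional writes: DynamoDB's `ConditionExpression` lets a write succeed only if the stored item still matches what you read, whereas an S3 PUT unconditionally overwrites whatever is there. A minimal in-memory sketch of that compare-and-set semantic (the function and store here are illustrative, not an AWS API):

```python
# Compare-and-set: the write lands only if the item's version is
# unchanged since we read it -- the guarantee DynamoDB's
# ConditionExpression provides and a plain S3 PUT does not.
def conditional_put(store, key, new_value, expected_version):
    """Update key iff its stored version still matches; return success."""
    current = store.get(key)
    if current is not None and current["version"] != expected_version:
        return False  # someone else wrote in between; caller must retry
    next_version = 0 if current is None else current["version"] + 1
    store[key] = {"value": new_value, "version": next_version}
    return True

store = {}
conditional_put(store, "cart", ["book"], expected_version=None)  # create, v0
conditional_put(store, "cart", ["book", "pen"], expected_version=0)  # ok, v1
# A second writer still holding version 0 is rejected instead of
# silently clobbering the update -- that's the atomicity S3 lacks.
print(conditional_put(store, "cart", ["socks"], expected_version=0))  # False
```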

~~~
monkmartinez
I am pretty sure S3 is atomic, in that you can't get a transitory state: your
object either updated/PUT or it didn't, a.k.a. read-after-write consistency.

