
Not the grandparent, but I have a lot of experience with DynamoDB as well. I migrated a self-managed, sharded redis store to DDB close to a year ago, and what it really depends on are your read and write patterns, and whether or not you can live with the opacity of what DDB is doing behind the scenes.

An example: if you provision X capacity units, you're actually provisioning X/N capacity units per partition. AWS is mostly transparent (via documentation) about when a table splits into a new partition (I say "mostly" because I was told by AWS that the numbers in the docs aren't quite right), but you'll have no idea how the keys are hashed between partitions, and you won't know if you have a hot partition that's getting slammed because your keys aren't well-distributed. Well, no, I take that back -- you will know, because you'll get throttled even though you're not consuming anywhere near the capacity you've provisioned. You just won't know how to fix it without a lot of trial and error (which isn't great in a production system). If your r/w usage isn't consistent, you'll either have periods of throttling, or you'll have to over-provision and waste money. There's no auto-scaling like you can do with EC2.
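To make the X/N point concrete, here is a rough sketch of the partition math as the public docs describe it (and, as noted above, AWS has said those numbers aren't quite exact, so treat this as an approximation only). The limits per partition (3000 RCU, 1000 WCU, 10 GB) are from the documentation; the function names are mine:

```python
import math

def estimated_partitions(read_units, write_units, table_size_gb):
    """Approximate partition count per the public DDB docs:
    a partition serves up to 3000 RCU or 1000 WCU and holds ~10 GB.
    The real split is opaque and may differ."""
    by_throughput = math.ceil(read_units / 3000 + write_units / 1000)
    by_size = math.ceil(table_size_gb / 10)
    return max(by_throughput, by_size, 1)

def per_partition_capacity(read_units, write_units, table_size_gb):
    """Provisioned capacity is divided evenly across partitions."""
    n = estimated_partitions(read_units, write_units, table_size_gb)
    return read_units / n, write_units / n

# e.g. provisioning 6000 RCU / 2000 WCU on a 5 GB table yields ~4
# partitions, so each partition only gets 1500 RCU / 500 WCU -- and a
# hot partition gets throttled at that lower number.
```

That last comment is the whole trap: a single hot key can hit the per-partition ceiling while the table as a whole looks massively under-utilized.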

Not trying to knock the product: still using DDB... but getting it to a point where I felt reasonably confident about it took way longer than managing my own data store... and then they had a 6-hour outage one early, early Sunday morning a couple months ago. Possible solution: dual-writes to two AWS regions at once and the ability to auto-pivot reads to the backup region. Which of course doubles provisioning costs.
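The dual-write/auto-pivot idea could be sketched like this. This is a hypothetical wrapper, not anything AWS ships: the clients are assumed to be boto3-style DynamoDB clients for two different regions (anything exposing `put_item`/`get_item`), and real code would need retry/queue logic for a failed write to either side:

```python
class DualWriter:
    """Hypothetical sketch: write every item to two regions, read from
    the primary, and pivot reads to the backup region on failure."""

    def __init__(self, primary_client, backup_client, table):
        self.primary = primary_client
        self.backup = backup_client
        self.table = table

    def put(self, item):
        # Write to both regions; a failure in either surfaces to the
        # caller, which must decide whether to retry or queue it.
        self.primary.put_item(TableName=self.table, Item=item)
        self.backup.put_item(TableName=self.table, Item=item)

    def get(self, key):
        try:
            return self.primary.get_item(TableName=self.table, Key=key)
        except Exception:
            # Primary region down or throttling hard: pivot to backup.
            return self.backup.get_item(TableName=self.table, Key=key)
```

Note the caveats this glosses over: the two writes aren't atomic, so the regions can drift after a partial failure, which is part of why it's a "possible" solution and not a free one.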

Ok, maybe I'm knocking it a little. It's a good product, but there are definitely tradeoffs.

You should put some effort into designing the key schema, especially the hash key. Don't use a timestamp or a sequence number. I had to use an "encrypted" key scheme for my sequential IDs so that consecutive records are spread across different partitions. It actually worked out well, as I can expose these encrypted keys externally too.
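One common way to get this effect (a simpler cousin of the encryption approach described above, sometimes called write sharding or key salting) is to prefix the sequential ID with a few characters of its own hash, so adjacent IDs hash to different partitions while the original ID stays trivially recoverable. Function names here are illustrative:

```python
import hashlib

def shard_key(seq_id: int) -> str:
    """Prefix a sequential ID with a few hex chars of its MD5 hash so
    adjacent IDs land on different partitions. The prefix is derived
    from the ID itself, so no lookup table is needed to reverse it."""
    prefix = hashlib.md5(str(seq_id).encode()).hexdigest()[:4]
    return f"{prefix}#{seq_id}"

def original_id(key: str) -> int:
    """Recover the sequential ID by stripping the hash prefix."""
    return int(key.split("#", 1)[1])
```

Unlike a keyed encryption scheme, these keys still reveal the underlying sequence number, so they are only safe to expose externally if the IDs themselves aren't sensitive.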

Agreed that DDB documentation could be better on the best practices. I found this deep dive youtube video very helpful: https://www.youtube.com/watch?v=VuKu23oZp9Q

Oh, absolutely. The problem is that your key schema also ties you to what kind of queries you can do efficiently. My primary keys actually are very well-distributed across the hash space, but in my initial implementation, one of the global secondary indexes had a key that caused throttling and absolutely destroyed performance. Dropping that index meant losing the ability to do an entire class of queries. In this case, we were (sorta) able to live with that, but I can imagine many cases where that's a problem.

That's actually another instance of the lack of transparency and trial-and-error: "oh hey, writes are getting throttled... no idea why... let's drop this index and see if it helps".

Was the ElastiCache service around when you migrated Redis to Dynamo? This is another alternative I'm considering.

The problem with ElastiCache -- and why I rejected it as an option -- is that they make you define a 30-minute "maintenance window" during which AWS can apply patches and reboot your cache instances. In practice I've heard this happens rarely, and when it does, the downtime is short, but in theory it can cause longer outages.

And in the case of both the redis and memcached backends, if the maintenance requires restarting redis/memcached or rebooting the instance, you lose all data in the cache (at least up until your last backup). For this particular project, that amount of downtime would easily cause a real outage for customers, and was unacceptable.
