
Using DynamoDB to Track Changes to DynamoDB - mooreds
https://www.transposit.com/blog/2019.09.24-dynamodb-audit-table/?c=hn
======
SubuSS
Sweet - I built the streams feature in DynamoDB (well, I was the tech lead).
Awesome to see it on HN!

~~~
reilly3000
Awesome work, I really think Streams is the killer feature of DynamoDB. Are
there any plans on the roadmap to allow for filtered/faceted streams or
Streams of a single GSI? How about consolidation between DDB streams and
Kinesis? It would be great if the "DynamoDB Streams Kinesis Adapter" program
didn't need to exist :)

My main pain point has been how many times a Lambda is invoked to consume a
stream on a busy table, but I really appreciate the record batching aspect of
them.

~~~
SubuSS
Thanks! Let me also preface this a bit: I built that 5 years ago :). I've been
out of AWS (at Snap) for the past 4 years, so I will have to wait on the
current owners to answer this. But last I checked:

- Faceted streams weren't planned. Streams pretty much expose a cleaned-up
version of the base table's replication logs. In theory they could do the same
for a GSI (by exposing its logs), but I don't know if there's enough demand.

- Kinesis and Dynamo used the same backing storage technology. By itself,
Kinesis didn't provide some of the guarantees that we wanted IIRC, so it was a
different story. Also, I don't think customers will want to pay for Kinesis
(Streams was free?). You may want to expand on why Kinesis streams are better
for your use case, though.

- Yes, batching was indeed put in place for this. In fact, I know of tables big
enough that the shard listing part got to be the bottleneck, but that should
be a corner case.

------
kevan
We do a variant of this for most of our data, except we use the version
attribute (also used for optimistic concurrency) instead of a timestamp to
identify changes in the history table. Our change velocity on most data is
really low so the audit trail is basically free.

Another pattern to achieve the same goal is v0/vN records, where you store all
versions in the main table but keep the most recent info in v0 for quick
querying. This SO answer [1] has a lot of context on the tradeoffs between the
approaches.

[1] https://stackoverflow.com/a/54600512/2811887
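
For readers unfamiliar with the v0/vN layout, here is a minimal sketch of the idea in Python. It is not kevan's code: a plain dict stands in for the table, and the names `put_versioned`, `history` and `VersionConflict` are made up for illustration. The version check plays the role of the optimistic-concurrency attribute mentioned above; against a real table it would presumably be a conditional write, with both puts done atomically.

```python
from typing import Any, Dict, List, Tuple

# In-memory stand-in for a table keyed by (partition key, sort key).
Table = Dict[Tuple[str, str], Dict[str, Any]]


class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


def put_versioned(table: Table, pk: str, data: Dict[str, Any],
                  expected_version: int) -> int:
    """Write a new version of `pk`, keeping v0 as a copy of the latest.

    `expected_version` is the version the caller last read (0 if the item
    does not exist yet); a mismatch means another writer got there first.
    """
    current = table.get((pk, "v0"))
    current_version = current["version"] if current else 0
    if current_version != expected_version:
        raise VersionConflict(
            f"expected v{expected_version}, found v{current_version}")

    new_version = current_version + 1
    item = {**data, "version": new_version}
    table[(pk, f"v{new_version}")] = dict(item)  # history record, never updated
    table[(pk, "v0")] = dict(item)               # latest copy for quick reads
    return new_version


def history(table: Table, pk: str) -> List[Dict[str, Any]]:
    """Return all stored versions of `pk`, oldest first (v0 excluded)."""
    rows = [item for (key, sk), item in table.items()
            if key == pk and sk != "v0"]
    return sorted(rows, key=lambda item: item["version"])
```

Reading the current state is a single lookup on `(pk, "v0")`, while the audit trail is everything else under the same partition key.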

------
reilly3000
I just made a plan the other day to make an architecture like this. With an
index on keys I'll be able to provide a record-level history in my UI =)
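
A rough sketch of what that could look like, in Python. The input field names (`Records`, `eventName`, `dynamodb.Keys`, `OldImage`, `NewImage`, `ApproximateCreationDateTime`) follow the DynamoDB stream event that Lambda receives; the output attribute names and both function names are hypothetical, and the list filtering in `record_history` only stands in for a query against an index on the key.

```python
import json
from typing import Any, Dict, List


def to_audit_rows(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn one batch of stream records into rows for a history table."""
    rows = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        rows.append({
            # Hypothetical partition key: the source item's key, serialized.
            "record_key": json.dumps(change["Keys"], sort_keys=True),
            # Hypothetical sort key: approximate time of the change.
            "changed_at": change.get("ApproximateCreationDateTime"),
            "event_name": record["eventName"],  # INSERT, MODIFY or REMOVE
            # Images are present only if the stream view type includes them.
            "old_image": change.get("OldImage"),
            "new_image": change.get("NewImage"),
        })
    return rows


def record_history(rows: List[Dict[str, Any]],
                   record_key: str) -> List[Dict[str, Any]]:
    """Return the changes for one record, oldest first."""
    matching = [row for row in rows if row["record_key"] == record_key]
    return sorted(matching, key=lambda row: row["changed_at"] or 0)
```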

~~~
mooreds
It was surprisingly easy. Granted, I cribbed an awful lot of plumbing from the
AWS post (which pushed changes to SNS), but I was pleasantly surprised.

~~~
reilly3000
And I shall crib from you :) Thanks for sharing your work!

------
koolba
I think this Lambda handler is broken. It should be waiting for completion of
the store operations prior to invoking the completion function. Otherwise the
Lambda could be unloaded or frozen prior to the operations completing. As the
Lambda stays active for a while after receiving a request, it may seem like
it's working fine until it isn't.
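
To make the shape of the fix concrete, here is a small sketch. It uses Python's asyncio purely as an illustration and is not the article's handler; `handle` and the injected `store` coroutine are hypothetical names. The only point is the ordering: the handler finishes after every write has finished, never before.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical async writer; in practice this would wrap the audit-table put.
Store = Callable[[Dict[str, Any]], Awaitable[None]]


async def handle(event: Dict[str, Any], store: Store) -> int:
    """Store every record and return only once all writes have finished."""
    writes = [store(record) for record in event.get("Records", [])]
    # Waiting here is the whole point: signalling completion before the
    # writes resolve would leave them in flight when the runtime is frozen.
    await asyncio.gather(*writes)
    return len(writes)
```

The broken variant would start the writes and report success immediately, which tends to look fine under light testing because the runtime usually stays warm long enough for the writes to land.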

