
DBLog: A Change-Data-Capture Framework - srijan4
https://netflixtechblog.com/dblog-a-generic-change-data-capture-framework-69351fb9099b
======
adenverd
I'd love more info on how Netflix propagates schema changes to downstream
stores. How do you apply migrations to heterogeneous databases? Applying binlog
messages only works if downstream stores are the same flavor of database as the
source. And common message formats like Avro don't have a guaranteed migration
strategy like protobuf.

I suspect it's more of a process solution than a technological solution. Are
non-backwards-compatible migrations scheduled in advance, and broadcast to
dependent teams? Are downstream consumers expected to have a replay/dead-
letter queue?

~~~
aandreakis
This is a great question :) We may share details about that in a future blog
post.

------
antpls
This was nice to read. In your view, what would be the minimal change to
push to PostgreSQL and MySQL in order to reduce complexity and better support
tools like DBLog?

------
rammy1234
This adds a Kafka layer to make the capture real-time. I guess a lot of people
are trying to do this, so what is the key point I am missing here? Anyone?

~~~
srijan4
Kafka or similar is already used by the existing solutions mentioned in the
post.

I think what this adds is a better way of dump processing without using
database-specific features.

------
theknarf
Is this similar to Debezium (https://debezium.io/)?

~~~
rajnathani
Debezium is mentioned in the post.

------
seanlaff
This depends on row-based binlog replication, correct? Has Netflix had to deal
with systems with statement-based replication?

~~~
aandreakis
Correct. This way we can capture create, update and delete events of
individual rows. binlog_format must be set to ROW in order to make this work
in MySQL. For Postgres we are using replication slots, which provide row-based
events.

We use MySQL RDS and it has "mixed" as the default binlog_format. Mixed uses
statement-based logging for some event types (see the MySQL documentation for
details). Hence statement-based replication is part of the mix unless one
explicitly switches to ROW-based replication (which is required for DBLog).
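
As a rough illustration of the ROW requirement above, here is a minimal Python
sketch of a pre-flight check. It is not part of DBLog; the function name
binlog_format_is_row is hypothetical, and it assumes a DB-API style cursor
connected to the MySQL server. The only MySQL feature it relies on is the
SHOW VARIABLES statement.

```python
def binlog_format_is_row(cursor):
    """Return True if the server reports binlog_format = ROW.

    `cursor` is assumed to be a DB-API style cursor (execute/fetchone)
    connected to a MySQL server. This helper is a hypothetical example.
    """
    cursor.execute("SHOW VARIABLES LIKE 'binlog_format'")
    row = cursor.fetchone()
    # The result row has the shape (Variable_name, Value),
    # for example ('binlog_format', 'MIXED').
    return row is not None and str(row[1]).upper() == "ROW"
```

A capture process could call this once at startup and refuse to run when it
returns False, since MIXED or STATEMENT logging would not yield a row event
for every change.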

------
grumpy8
Interesting article, thanks. Maybe add a link to the previous article in the
"Blog Series" section, like part 1 does.

~~~
aandreakis
This is a good point; we will fix that. Thanks for the feedback.

