
Thinker – Easily clone and sync RethinkDB databases - internalfx
https://github.com/internalfx/thinker
======
jlhawn
Is this meant to make backup and restore easier? If so, doesn't RethinkDB
already have `dump` and `restore` commands for that? The docs[1] already make
it look pretty easy. It would be a good idea to add some motivation to the
README, just a paragraph on what Thinker solves that `dump` and `restore` do
not.

Edit: I'd also like to read why/when `thinker sync` should be used over
utilizing RethinkDB's built-in table replication functionality.

[1]
https://www.rethinkdb.com/docs/backup/

~~~
internalfx
Done.

~~~
jlhawn
Thanks! The example use case you describe for `thinker sync` is certainly a
good one for smaller dev shops - unfortunately, many larger enterprises might
have issues with cloning production customer/user data for development
purposes. Of course, I don't expect this to apply in every situation.

One more thing you might want to clarify: you gave a metric for the time it
takes to sync to your local database over the internet. It would be a good
idea to indicate the connection bandwidth and how much data was actually saved
from being transferred (perhaps as a number or percentage of records), as well
as to post the full time of a regular clone as a baseline.

~~~
internalfx
> many larger enterprises might have issues with cloning production

Agreed - I work with a very small company. But it's still useful even for
just moving test data around. All you need is a source DB and a target.

> connection bandwidth

That gets more complicated... The rated bandwidth is 100 Mbps, but the most
I've ever pulled from the DB over the VPN was ~25 Mbps.

> amount of data which was actually saved from being transferred

That is mostly a function of how much data has changed since the last sync.
The hashes themselves are very small, even if you let RethinkDB generate all
your ids. It only takes about 13 MB of bandwidth to download all the hashes
for a table of ~210,000 records.
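
To expand on that a bit (this is not Thinker's actual code, just a rough
sketch of the hash-comparison idea; the table name, connection details, and
client-side hashing are my own assumptions): compare per-record hashes between
source and target, then transfer only the records whose hashes differ. At
~13 MB for ~210,000 records, that works out to roughly 60 bytes per (id, hash)
pair.

    import { createHash } from "crypto";
    import * as r from "rethinkdb";

    // Rough sketch only: sync one table by comparing per-record hashes.
    // In a real incremental sync the hashes would be computed near the data
    // so that only (id, hash) pairs cross the wire; here we hash client-side
    // purely to show the comparison step.
    async function syncTable(tableName: string): Promise<void> {
      const source = await r.connect({ host: "source.example.com", port: 28015 });
      const target = await r.connect({ host: "localhost", port: 28015 });

      // Build an id -> hash map for every record visible over a connection.
      // Note: JSON.stringify is not a canonical encoding; a real tool would
      // need a deterministic serialization before hashing.
      const hashesFor = async (conn: r.Connection) => {
        const cursor = await r.table(tableName).run(conn);
        const rows = await cursor.toArray();
        const map = new Map<string, string>();
        for (const row of rows) {
          map.set(row.id, createHash("sha1").update(JSON.stringify(row)).digest("hex"));
        }
        return map;
      };

      const sourceHashes = await hashesFor(source);
      const targetHashes = await hashesFor(target);

      // Copy only records that are new or whose hash changed since the last
      // sync. Deletions on the source are not handled in this sketch.
      for (const [id, hash] of sourceHashes) {
        if (targetHashes.get(id) !== hash) {
          const doc = await r.table(tableName).get(id).run(source);
          await r.table(tableName).insert(doc, { conflict: "replace" }).run(target);
        }
      }

      await source.close();
      await target.close();
    }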

------
marknadal
To clarify: this is for devops/sysadmin work, not for any of RethinkDB's
realtime functionality? When I saw the title and that it was written in JS, I
immediately jumped to web-dev conclusions, not devops/sysadmin ones. Great
sysadmin addition!

