

Show HN: Gut-sync, real-time bidirectional file sync via modified git - dantillberg
https://github.com/tillberg/gut

======
dantillberg
My quick intro to gut-sync:

I built gut-sync (first in Node, then in Python, then in Python3, and now in
Go) to synchronize all of my source code between my laptop, desktop, and dev
VM on AWS.

I wanted something that enabled me to seamlessly move between the laptop and
desktop, propagating edits from any one machine to all others. And it had to
be real-time, so that I can edit files on the desktop/laptop and have the
changes be live within a second or two on AWS. gut-sync accomplishes those goals
for me.

I just learned about the really awesome/fantastic Syncthing project on Friday
(just as I was wrapping up gut-sync!), and many/most people will prefer that
for various reasons. Here are the core differences that might make gut-sync
preferable for some users, though:

- Uses git under the hood (just with the U's changed to I's, for technical
reasons); gut-sync itself does no heavy-lifting, leaving all the hard work to
git. If you're not as familiar with git, or if you need to handle large or
frequently-changing files, this is probably a minus.

- Uses SSH to deploy itself and communicate, which works well iff you use SSH
for everything already.

Comments, criticism, and editorial remarks appreciated. Thanks! :)

~~~
anarcat
it's an interesting project! you may want to take a look at git-annex as well,
which also uses git and ssh (and rsync) to synchronize things but also
supports large files.

you would need to run a daemon on both ends to get inotify-like behavior
however.

why did you switch from node to python to go, btw?

~~~
dantillberg
Thanks!

The switch from Node to Python was mostly incidental -- I had written a peer-
to-peer style program in Node, and I wanted to simplify the logic by rewriting
it with a primary/secondaries model.

The switch from Python to Go was more intentional. While running gut-sync,
there are times when sub-commands can take a long time to complete, e.g.
git-fetch or implicit calls to git-gc, and git provides feedback in the form
of progress meters. I wanted to be able to have gut-sync do simultaneous
fetches to N servers, and to have statuses interleaved in the console in real-
time (I'm doing a bit of TTY hacking here). I tried rewriting logic first
using Python 3.4's asyncio, but I found it too complicated and too slow to do
_non-blocking I/O_ in Python. There are readline() methods and read(N)
methods, but I had a ton of trouble getting them to work in non-blocking mode.

Basically, Python's subprocess stuff works great if you're fine using
proc.communicate() or proc.stdout.readline(), but if you want to do something
real-time, the lower-level stuff gets very complicated and slow. asyncio is a
step in the right direction, but it was still pretty awkward to have to
specify `@asyncio.coroutine` and `yield from` everywhere, and it could be
fairly tricky to track down whenever you screw one of them up.

Over in Go-land, though, you can just read streams one byte at a time if you
want, and do whatever sort of fancy processing you need to, and it's all
blazingly fast by comparison. And goroutines are very simple and flow
naturally with the language (compared to feeling bolted on like an after-
market mod in Python).

Edit: PS I still love Python. It's wonderful for many programming tasks. It's
just not as great for low-level real-time systems programming as Go.

