Best way I've found to do joint versioning of code with large datasets (whether ...

Best way I've found to do joint versioning of code with large datasets (whether binary or tab-delimited text):

1. Check in symbolic links to git. You can include the SHA-1 or MD5 in the file name.

2. Have those symbolic links point to your large out-of-tree directory of binary files.

3. rsync the out-of-tree directory when you need to do work off the server

4. Have a git hook check to see whether those files are present on your machine when you pull, and to update the SHA-1s in the symbolic link filenames when you push

By using symbolic links, at least you have the dependencies encoded within git, even if the big files themselves aren't there.