
Storing large binary files in Git repositories (2015) - pmoriarty
http://blog.deveo.com/storing-large-binary-files-in-git-repositories/
======
joeyh
This is a good article, but its git-annex information is getting slightly out
of date. git-annex has a new repository format available which makes its
interface work more like git-lfs.
[http://joeyh.name/blog/entry/git-annex_v6](http://joeyh.name/blog/entry/git-annex_v6)

Also, it's worth noting that git-lfs stores 2 copies of each large file on
local disk, while git-annex stores 1 copy. That's a pretty big plus the
article left out.
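
To put rough numbers on that difference, here is a minimal Python sketch. It
takes the copy counts above at face value (2 for git-lfs, 1 for git-annex) and
ignores compression, deduplication, and repository metadata. The function name
and the file sizes are made up for illustration.

```python
def local_footprint_bytes(file_sizes, copies_per_file):
    """Approximate local disk use when each large file is kept
    `copies_per_file` times (hypothetical helper, not part of either tool)."""
    if copies_per_file < 1:
        raise ValueError("copies_per_file must be at least 1")
    return copies_per_file * sum(file_sizes)


# Illustrative sizes only: three assets of 700 MB, 1.5 GB and 4 GB.
MB = 1000 ** 2
sizes = [700 * MB, 1500 * MB, 4000 * MB]

lfs_bytes = local_footprint_bytes(sizes, copies_per_file=2)    # count from the comment
annex_bytes = local_footprint_bytes(sizes, copies_per_file=1)  # count from the comment

print(f"git-lfs:   {lfs_bytes / MB:.0f} MB")
print(f"git-annex: {annex_bytes / MB:.0f} MB")
```

Under those assumptions the extra cost of the second copy is simply the total
size of the large files.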

~~~
drdaeman
Whoa, this looks great!

Sorry if this is off-topic, but does anyone know a way to convert/upgrade an
older git-annex repo (created a year ago) to a newer v6 one?

~~~
sk3w
[https://git-annex.branchable.com/upgrades/#index1h2](https://git-annex.branchable.com/upgrades/#index1h2)

I haven't tried this yet myself.

------
CoolGuySteve
It may be sacrilege, but you could store your binaries in hg and, within hg,
use a git submodule for the git repos that were used to build those binaries.

This is what I'm doing after a disastrous excursion into git-fat.

But really, if git's maintainers don't want to store binaries, why shoehorn it
in? Use the right tool for the job.

~~~
Mathiasdm
For some added context about handling binaries in Mercurial, see the (in
development) Mercurial book:
[http://hgbook.org/read/scaling.html#handle-large-binaries-wi...](http://hgbook.org/read/scaling.html#handle-large-binaries-with-the-largefiles-extension)

------
m0th87
We made git-fit, which was inspired by git-media:
[http://github.com/dailymuse/git-fit](http://github.com/dailymuse/git-fit)

On the plus side, it's stupidly simple. On the down side, it's stupidly
simple. The readme explains how it works.

------
bokchoi
It would be nice if git supported large files natively rather than relying on
these extensions.

------
mschuster91
I have a couple of projects with _huge_ binary assets under source control
(movie files, 4K images, essentially the whole website). Deployment is via
Jenkins and rsync, and it's really fast after the initial checkout.

How? Well, the framework stores all assets inside /assets, which normally is a
git submodule. Jenkins works fine with this setup, and for developers there
are a couple of self-written PHP shell scripts that do a shallow clone outside
the repository and then a bit of rsync magic to sync it all.
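
Roughly, the developer-side sync could look like the sketch below. The real
scripts are PHP; this is Python, and the repository URL, directory names, and
function name are all hypothetical. It only builds the `git` and `rsync`
command lines, so you can inspect them before running anything.

```python
import shlex


def build_sync_commands(assets_url, scratch_dir, assets_dir):
    """Return the argv lists for a shallow clone followed by an rsync.

    All three arguments are placeholders chosen for this sketch.
    """
    # --depth 1 fetches only the latest commit, so old asset versions
    # are never downloaded.
    clone = ["git", "clone", "--depth", "1", assets_url, scratch_dir]
    # Trailing slashes make rsync copy the directory contents rather than
    # the directory itself; .git is skipped so only the assets are synced.
    # --delete removes files in the target that are gone from the source.
    sync = [
        "rsync", "-a", "--delete", "--exclude=.git",
        scratch_dir.rstrip("/") + "/",
        assets_dir.rstrip("/") + "/",
    ]
    return [clone, sync]


commands = build_sync_commands(
    "git@example.com:site/assets.git",  # hypothetical remote
    "/tmp/assets-shallow",              # scratch clone outside the main repo
    "/var/www/site/assets",             # hypothetical checkout location
)
for argv in commands:
    print(shlex.join(argv))
```

To actually run them, each list can be passed to `subprocess.run(argv,
check=True)`.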

------
gcb0
isn't it idiotic? if you can't get a textual diff, or if a textual diff is
meaningless, what benefit do you even get having the file on git?

this sounds like a problem for teams blindly following orders from clueless
managers.

~~~
AndrewDucker
Version control isn't about generating a textual diff. It's about storing all
of the necessary assets to be able to generate an application. Some of those
assets are code, some are binary. And you want to keep them in sync, in some
way.

Using multiple different systems to get myself "up to date" is a massive pain
- I should be able to run one command and then have a wholly buildable set of
files.

~~~
gcb0
So, you keep your compiler source and binary in your project source tree?

yeah, you don't. :D

