
GitLab Annex: Large Binaries with Git - jeremiep
https://about.gitlab.com/2015/02/17/gitlab-annex-solves-the-problem-of-versioning-large-binaries-with-git/
======
flyingyeti
What is the difference between GitLab Annex and git-annex?

[https://git-annex.branchable.com/](https://git-annex.branchable.com/)

Edit: Just found the relevant quote from the article

> _In GitLab 7.8 Enterprise Edition this problem is solved by integrating the
> awesome git-annex._

I guess the repeated branding of "GitLab Annex" just seems a bit strange to
me.

~~~
jobvandervoort
GitLab Annex is git-annex integrated in GitLab. Meaning you have the
authentication and protection that GitLab gives you with git-annex.

------
alimoeeny
I am trying to understand why putting large binaries under source control is a
good idea in the first place. Unless you have a way to make sense of the diffs,
this will not help, right? And to make sense of the diffs, you need to
understand the structure/format of the binary at least after you have the
diff, or maybe even to make a useful diff, right?

~~~
barkingcat
Diffing is just one small part of version control. Keeping binary files in
version control is the number one step away from messy file-name-based
versioning.

Many design studios are forced to use project1.ver1.psd, project1.ver2.psd,
project1.ver3.psd, and so on in order to version their files. Single psd
files can be on the order of high hundreds of megabytes to low gigabytes for
high resolution ready-for-press files.

Not being able to diff the files is not a problem from an organisational point
of view. Of course, in an ideal world there would be diffing of large binaries
in a way that makes sense, but thinking there's no use in versioning binaries
is very short sighted.

~~~
jeremiep
This exactly. We keep our game's assets in SVN at the moment only for the
convenience of versioning. Artists are still locking files to ensure no
conflicts are happening (and therefore no merges are needed).

It's much easier to naively do a repository checkout/update than manually
detecting changes (or rolling your own solution using rsync or similar).

Especially when considering the game has over 100 GB of raw assets. SVN, Git or
Perforce might not be the best tools for such a task but it works great.

------
michaelmior
Duplicate of
[https://news.ycombinator.com/item?id=9063705](https://news.ycombinator.com/item?id=9063705)

------
jobvandervoort
GitLab engineer here, let me know if you have any questions.

We're quite excited about GitLab Annex and curious to hear what people think
about it.

------
icebraining
Joey Hess mentioned this in the git-annex devlog:
[http://git-annex.branchable.com/devblog/day_256__sqlite_conc...](http://git-annex.branchable.com/devblog/day_256__sqlite_concurrency_argh/)

------
bitwize
Most grownup software shops use Perforce precisely because it can handle large
repositories with large amounts of non-source non-textual content. Maybe this
will bring Git into the same league as the big boys.

~~~
jobvandervoort
We believe source control should be everywhere, and git isn't yet great at
working with large binaries. Git(Lab) Annex makes things much better for people
working with these kinds of repositories.

We'd love to get feedback on it.

~~~
Jare
As I mentioned in another thread, efficient storage of these files is
necessary, but not sufficient. Exclusive locking is a requirement for working
with editable (but not mergeable) binary files in a team environment. It's the
reason most games studios use Perforce or SVN instead of git. Tracking locks
using a separate system does not scale; it needs to be integrated and enforced
by the VCS.

PlasticSCM has a hybrid approach that is worth studying.

~~~
lucaspiller
Would you mind explaining a bit more why locking is a requirement? It sounds
more like an organisational issue rather than something _required_ in SCM.

Taking the game studio example, why would two developers/artists/whatever need
to work on the same asset at the same time? Locking stops them from stepping
on each other's toes, but it also means one has to wait for the other to
finish their task before they can do anything.

~~~
Jare
Stepping on each other's toes is part of the game artist workflow. :)

Seriously, with the tens of thousands of assets in a medium to large game, the
way they can be reused across the game, and the peculiarities of art and in-
house tools, it's not uncommon for two artists to try to modify the same
asset. Sometimes it's accidental, sometimes it's necessary, sometimes it is an
oversight, sometimes it is an organisational issue as you say... but locking
detects the conflict and prevents it from turning into wasted hours or days.

Relying on a separate tool to manage this issue is a notable increase in
friction. Since the SCM handles changes and conflicts for source code (and
thankfully allows merging), why wouldn't it do so for art?

