

Alternatives To Git Submodule: Git Subtree - durdn
http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/

======
beagle3
One thing that submodules do better than subtrees is scale; in a subtree,
everything is part of your repository (and thus of its index, with the
associated problems of having millions of files).

Submodules, on the other hand, have independent index files tied to
independent repositories, so it's possible to not have a huge index anywhere
regardless of total repository size.

~~~
pixelcort
One issue with submodules is if they rebase and GC out the commit you're
pointing to, you're going to have a bad time.

This may not be a problem when pointing to the commit of a stable tag, but
when pointing to a development commit that might be rebased out later, you may
find the submodule not to work.

Not to mention that if the server hosting the submodule is down, all of a
sudden you can't redeploy.

One workaround is to fork the submodule and point to a commit in your fork.
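
A minimal sketch of that workaround, assuming the pinned commit is reachable from a branch in the fork. The function name and all four arguments (`name`, `path`, `fork_url`, `rev`) are hypothetical placeholders, not anything from the thread. It only builds the git command lines so they can be reviewed before running:

```python
import shlex


def repoint_submodule_commands(name, path, fork_url, rev):
    """Build git commands that repoint one submodule at a fork.

    name, path, fork_url and rev are hypothetical placeholders
    supplied by the caller.
    """
    return [
        # Record the fork's URL in the checked-in .gitmodules file.
        ["git", "config", "-f", ".gitmodules",
         f"submodule.{name}.url", fork_url],
        # Copy the new URL into the local submodule configuration.
        ["git", "submodule", "sync", "--", path],
        # Fetch from the fork and check out the commit to pin.
        ["git", "-C", path, "fetch", "origin"],
        ["git", "-C", path, "checkout", rev],
        # Stage the URL change and the new submodule pointer in the parent.
        ["git", "add", ".gitmodules", path],
    ]


if __name__ == "__main__":
    for cmd in repoint_submodule_commands(
        "libfoo", "vendor/libfoo",
        "https://example.invalid/me/libfoo.git", "abc1234",
    ):
        print(shlex.join(cmd))
```

Run from the top of the parent repository, the printed commands would still need a final `git commit` to record the change.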

~~~
beagle3
But that's basically not different than rebasing a public repo regardless of
submodules or subtrees - all hell breaks loose

------
jedbrown
Subtree makes the most sense when you have a component that is completely
dominated by its parent, but which you want to also release stand-alone. For
example, we use it for a build system that gets 90% of its use and testing
(and essentially all its development) from one parent project, but is used in
a few other much smaller projects.

Submodules provide weaker coupling and make the most sense when the submodule
has its own healthy upstream and you want to track those versions. It's
awkward if all submodule development is happening from within the parent.

My rule of thumb: If the sub-project has a test suite that is complete enough
you can trust it, then use submodules or a different installation method. If
the sub-project always needs the parent (preferably only one parent) to verify
correctness, such that you prefer to make commits from the parent repository,
use subtree.

------
mindjiver
I introduced git at $DAY_JOB roughly 2 years ago and based it on submodules.
We use the submodule subscription feature of Gerrit to lessen the pain [1]. So
submodules are not _that_ bad but I really believe they are trying to solve a
problem better solved by a dependency management system (maven and similar)
instead.

So even though subtree might be "easier", you should really try to solve your
dependency problem using a resolver instead of your version control system.

[1] <https://gerrit-review.googlesource.com/Documentation/user-submodules.html>

~~~
durdn
Oh I totally agree! If you can, it makes sense to manage dependencies with
tools that have been refined for years for the task, like maven, gems, pip,
npm, etc. I had this exact discussion with a colleague at Atlassian while
writing about git submodule. Using git submodule or subtree might make sense
if your environment is not homogeneous, for example if you need to mix
different languages and keep everything consistent. Even then you can use
broader-focus package managers like dpkg, rpm, apt, etc.

In any case I like the feel of git subtree as I have written in the post.

~~~
mindjiver
And I agree with this as well. :)

Our setup is quite special and in that case Gerrit/submodule subscription sort
of works. But in hindsight we should have also tried to migrate to some
generic packaging system. The reason for going with submodules was that we
have ~1000 devs and they spend most of their time in their specific part of
the code base. So we created submodules for these discrete parts so they
wouldn't be "bothered" by changes to parts of the code base they don't
normally access. Also we _thought_ there weren't so many cross-dependencies
between the repos, but this turned out to be false.

We have now started to look at subtree for another product but I hope we can
go directly to some packaging/dependency tool instead.

------
acjohnson55
I've migrated all my project submodules to subtree after a number of
frustrations, culminating in finding out that Heroku doesn't support
submodules. But it's still a huge pain to actually push changes upstream.

Not counting the issues of submodule support by services that interact
directly with Git repos, I think the big question is how often do you need to
hit the upstream repo for pulls/pushes. If the answer is "not super often",
subtree is probably lower maintenance. I think submodules are a little bit
more natural in cases when you are making a lot of changes to the subrepo,
particularly when it's more standalone. But even then, subtree is probably
worth the extra bit of effort.

Actually, in cases when the coupling is very loose, I forgo both options and
just check out the subrepo directly into the superrepo, and rely on .gitignore
in the superrepo to keep them separated.

At the end of the day, I think there really needs to be another strategy, and
it would be great if there were a way it could be implemented in the Git
core. The concept of a modular project just doesn't seem to completely jibe
with the Git architecture. Anyone have any ideas how this could work?

~~~
mattberg
what are the issues/implications of "just check out the subrepo directly into
the superrepo, and rely on .gitignore in the superrepo to keep them
separated"?

~~~
tasuki
When making a new checkout, you have to check out and keep the subrepo in sync
manually. That is, unless you're using a dependency manager, which you
probably should, as it will give you almost all of the advantages of
submodules without their disadvantages.

------
astrobe_
I've used this method:

<http://jasonkarns.com/blog/merge-two-git-repositories-into-one/>

... in order to manage a communication protocol library specific to two
programs. Originally there were three repos (Program A, Program B and the
library), which I merged into a single repository. I can still merge changes
from the older repositories, which are still active.

It looks like this method is a close ancestor of git-subtree.

------
onedognight
> There are several reasons why you might find subtree better to use:

> subtree does not add new metadata files like submodules do (i.e.
> .gitmodules).

So instead you have to memorize the exact information that is in the
.gitmodules file, type it in every time you run "git subtree", and then
communicate this information to your colleagues out of band.
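
One way around the memorization problem is to check the subtree details into the repository yourself. The sketch below assumes a hypothetical INI-style file (here called `.gitsubtrees`; the name, layout, and example URL are invented for illustration and are not a git feature) with one section per subtree prefix, and builds the matching `git subtree pull` command lines from it:

```python
import configparser

# Hypothetical file contents: one section per subtree prefix.
EXAMPLE = """\
[vendor/protocol]
url = https://example.invalid/team/protocol.git
ref = master
"""


def subtree_pull_commands(text, squash=True):
    """Build one `git subtree pull` command per section in `text`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    commands = []
    for prefix in parser.sections():
        cmd = ["git", "subtree", "pull", f"--prefix={prefix}"]
        if squash:
            # Collapse the upstream history into a single commit.
            cmd.append("--squash")
        cmd += [parser[prefix]["url"], parser[prefix]["ref"]]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in subtree_pull_commands(EXAMPLE):
        print(" ".join(cmd))
```

This amounts to reinventing a small part of what the .gitmodules file already records, which is arguably the point being made above.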

------
jb55
My only problem with subtree merging is I have to look up how to do it
every time.
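
For reference, a sketch of the usual subtree-merge sequence, written as a small helper that only builds the command lines. The function names and the `remote`, `url`, `branch`, and `prefix` arguments are hypothetical placeholders; the sequence assumes Git 2.9 or later, where `--allow-unrelated-histories` is needed for the first merge:

```python
def subtree_merge_commands(remote, url, branch, prefix):
    """Build the commands for a first-time subtree merge under `prefix`."""
    ref = f"{remote}/{branch}"
    return [
        # Add the other project as a remote and fetch it.
        ["git", "remote", "add", "-f", remote, url],
        # Record a merge without touching the working tree yet.
        ["git", "merge", "-s", "ours", "--no-commit",
         "--allow-unrelated-histories", ref],
        # Read the other project's tree into the chosen subdirectory.
        ["git", "read-tree", f"--prefix={prefix}/", "-u", ref],
        # Conclude the merge.
        ["git", "commit", "-m", f"Merge {ref} as subtree under {prefix}/"],
    ]


def subtree_update_command(remote, branch):
    """Build the command that pulls later upstream changes into the subtree."""
    return ["git", "pull", "-s", "subtree", remote, branch]
```

Keeping something like this in a checked-in script is one way to avoid the repeated lookup.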

------
pnathan
I would like to point out/plug an alternative to hg subrepos:

<http://mercurial.selenic.com/wiki/GuestrepoExtension>

I was part of the team that came up with it and I am inordinately proud of it.
:-)

~~~
shuzchen
Does guestrepo support non-hg guestrepos (as subrepos does)? My beef with git
submodules is that they only allow git.

~~~
pnathan
Sorry about the delay in answering.

No. Guestrepos does not handle non-hg repositories. I'm not offhand sure how
complicated that would be. At some level, it should be very simple:
`git clone && git checkout $rev`.

Pretty sure that a simplistic foreign implementation would work quite well.
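
A minimal sketch of what such a simplistic foreign implementation might boil down to for a git guest, under the assumption that a guest is described by a URL, a destination directory, and a revision. This is not Guestrepo code; the function name and arguments are hypothetical:

```python
def guest_checkout_commands(url, dest, rev):
    """Build the clone-then-checkout commands for one pinned git guest."""
    return [
        # Clone the foreign repository into the guest directory.
        ["git", "clone", url, dest],
        # Pin the working tree to the recorded revision. If rev is a
        # commit hash, this leaves the clone in detached-HEAD state.
        ["git", "-C", dest, "checkout", rev],
    ]
```

For example, `guest_checkout_commands("https://example.invalid/lib.git", "guests/lib", "abc1234")` returns the two command lines in order; passing each to `subprocess.run(cmd, check=True)` would execute them.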

------
gfunk911
Not the same thing, but I feel like Bundler does a similar thing to git
submodules (for Ruby only) but does it much better

