Because GitHub has an explicit, top-down, a-fork-is-only-a-fork-by-clicking-the-fork-button kind of graph between projects, using subtree won't work easily with its online tools for doing pull requests and viewing the network graph of forks.
Basically my repo is going to have the histories of 7 different projects combined in one repo, and from that single repo I will be pushing changes back to all 7 (and other versions of those 7). GitHub's pull request feature is going to have trouble with that concept because it makes the invalid assumption of a single upstream.
There are workarounds for sure, like pushing your changes to a staging repo before finally doing a pull request upstream, but that is cumbersome. I'll probably write a shell script to automate it.
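For what it's worth, a rough sketch of what such a script could boil down to, assuming the imported project lives under a prefix directory and a remote for the staging repo has already been added. The function name, the `GIT` override and the example names (`vendor/libfoo`, `staging-libfoo`, `my-fix`) are made up for illustration; `git subtree push` is the only real piece here.

```sh
# push_subtree_to_staging PREFIX STAGING_REMOTE BRANCH
# Splits the history of PREFIX out of this repo and pushes it to BRANCH on
# STAGING_REMOTE, from where a pull request can be opened against upstream.
# GIT defaults to "git"; set GIT="echo git" for a dry run (illustrative knob).
push_subtree_to_staging() {
    if [ "$#" -ne 3 ]; then
        echo "usage: push_subtree_to_staging PREFIX STAGING_REMOTE BRANCH" >&2
        return 2
    fi
    ${GIT:-git} subtree push --prefix="$1" "$2" "$3"
}
```

Usage would be something like `push_subtree_to_staging vendor/libfoo staging-libfoo my-fix`, repeated once per imported project.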
If that is the worst problem -- one that is easy to solve and easy to repeat the few times you need it -- then all the more love for submodules. I'll take that over any other available solution any day (I don't count subtrees as 'available' yet).
Ever had an upstream git submodule move or shut down? I can't go back 6 months in my SCM and rebuild exactly what I had because the repo is gone now. This is bad and against the concept of having an SCM in the first place.
There is a horrible habit of people forking projects on GitHub just so their submodules stay stable. It's broken.
> There is a horrible habit of people forking
> projects on GitHub just so their submodules
> stay stable. It's broken.
People are forking projects on GitHub in that way because it's a quick, easy, cheap way to create a mirror/hosted copy of the repository they want to use.
I don't see how this is an example of git-submodule being broken. If you want to use someone else's code, under someone else's control, in your repository without creating your own backup of said code, then you're the one taking the risk by not creating said backup.
If the git-submodule is mission-critical to you then you should either:
1. Always keep a separate mirror of the 3rd party repo.
2. Mirror the 3rd party repo to a hosted location, so that your submodule can point to the mirror instead of the source (basically creating a caching layer under your control).
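A minimal sketch of option 2, assuming the mirror repository already exists somewhere you control. The function names, the `GIT` dry-run override and any URLs you pass in are placeholders, not anything from a real setup.

```sh
# GIT defaults to "git"; set GIT="echo git" to print the commands instead.

# mirror_repo UPSTREAM_URL MIRROR_URL WORKDIR
# Makes a bare mirror clone of UPSTREAM_URL in WORKDIR and pushes every ref
# to MIRROR_URL (a repository you control, which must already exist).
mirror_repo() {
    ${GIT:-git} clone --mirror "$1" "$3" &&
    ${GIT:-git} -C "$3" push --mirror "$2"
}

# repoint_submodule NAME MIRROR_URL
# Records the mirror URL for submodule NAME in .gitmodules, then lets
# `git submodule sync` propagate it to the local configuration.
# Run from the top level of the superproject.
repoint_submodule() {
    ${GIT:-git} config -f .gitmodules "submodule.$1.url" "$2" &&
    ${GIT:-git} submodule sync
}
```

Re-running the fetch and push on a schedule keeps the mirror current; the change to .gitmodules still has to be committed so that everyone else's clones pick up the mirror too.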
This is no different from the guy who keeps all of his email in Gmail and then, when Google shuts down his account, complains that 'email is broken' because it's possible for this to happen.
subtree also includes the full history of the other project in your project's history unless you squash all of the commits (and if you squash them, then you haven't stored the full history). There might also be licensing reasons for wanting to keep the repositories separate, but not.
Which is why you should only use submodules with repositories under your control. At my work we happily use git submodules. Much better than svn externals and much better than any alternative that existed so far.
So this looks pretty awesome. I can see myself using it a lot. There is one workflow that I do with submodules that I don't see how to do in subtrees:
Clone a repo to the target dir. Add it as a submodule. Decide to play with a branch of the submodule, so switch to that branch in the cloned repo. Then, if I decide to work with that, update my submodules; otherwise switch back.
Is that sort of workflow available in subtrees? How do I do it?
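To make the submodule side of that workflow concrete, here is a rough sketch. It assumes the submodule is already added, that its remote is called `origin`, and that you run this from the top of the superproject; the function names and the `GIT` dry-run override are invented for the example. It does not answer the subtree half of the question.

```sh
# GIT defaults to "git"; set GIT="echo git" to print the commands instead.

# try_submodule_branch PATH BRANCH
# Switches the submodule checkout at PATH to BRANCH for experimenting.
# Nothing is recorded in the superproject yet.
try_submodule_branch() {
    ${GIT:-git} -C "$1" fetch origin &&
    ${GIT:-git} -C "$1" checkout "$2"
}

# keep_submodule_branch PATH MESSAGE
# Stages the commit PATH currently has checked out and commits the
# superproject, so the new submodule commit becomes the recorded one.
keep_submodule_branch() {
    ${GIT:-git} add -- "$1" &&
    ${GIT:-git} commit -m "$2"
}

# drop_submodule_branch PATH
# Puts PATH back on the commit the superproject has recorded.
drop_submodule_branch() {
    ${GIT:-git} submodule update -- "$1"
}
```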
Dependency control is a solved problem. Which is why SVN has externals, Hg has subrepos, and Git has submodules and now subtrees. This isn't a git-in-the-large problem, it's an almost-any-nontrivial-project problem. It only seems unnecessary until you've had it and tried to live without it.
Bringing other libraries into my project is beneficial because then my build system is able to wrangle those just as easily as my code. (And it's also simply useful in the case where I've written both modules but maintain a separation for whatever reason--one is an open-source project and the other is not, whatever--to be able to make changes to one from within the other, run the tests for the submodule, and push it up to staging or upstream, without having to leave my current project.)
Once you have divided your project into independent modules, you have agreed that changes in these modules are going to have minimal impact on each other. So what exactly is the point of merging the history of all those changes? In my use case that would actually create a bigger mess.
Now I think this could perhaps be useful if there are modules that I have forked from elsewhere and the fork is going to be used in my project only. Although even then I don't see any downside to using submodules.
The downside of submodules is the lack of good commands. For example, a command to check out a different branch of each of my submodules: the branch which is used in this project. This could be done using git submodule foreach, but it's not trivial.
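One way the foreach approach could look, as a sketch only. It assumes each submodule's branch is recorded under `submodule.<name>.branch` in .gitmodules (a key newer Git versions support), with a fallback when it is unset. The function name, the `master` fallback and the `GIT` dry-run override are my own choices; `$name` and `$toplevel` are variables that `git submodule foreach` provides to the command it runs.

```sh
# checkout_submodule_branches [FALLBACK_BRANCH]
# For every submodule, checks out the branch named by submodule.<name>.branch
# in .gitmodules, or FALLBACK_BRANCH (default: master) when the key is unset.
# GIT defaults to "git"; set GIT="echo git" to print the command instead.
checkout_submodule_branches() {
    fallback=${1:-master}
    # Single quotes: $toplevel and $name must be expanded by foreach, not here.
    cmd='git checkout "$(git config -f "$toplevel/.gitmodules" "submodule.$name.branch" || echo '"$fallback"')"'
    ${GIT:-git} submodule foreach "$cmd"
}
```

It works, but the quoting alone shows why "not trivial" is a fair description.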
EDIT: In this thread zbowling makes a great argument against submodules:
> There is a horrible habit of people forking projects on GitHub just so their submodules stay stable
The vim-colors-solarized repo is a subtree of the solarized repo. This is mostly a convenience for vim users that use something like pathogen + git-submodule to keep their plugins up-to-date. This way you can create a submodule @ ~/.vim/bundle/vim-colors-solarized and it would be the root of the bundle tree. If the vim colorscheme was only part of the larger repo, then users would be forced to create their own repos, or else do something like:
It makes it a little easier to maintain one or more forks of some library. Without submodules, you must have a separate repo for that, and there's no clear connection to the repo of your particular project.
This gets even more fun when you have a second project that needs an incompatible fork of the same library.
It's important to not only reference an external project (e.g. library) but also to reference a particular version of that library. (e.g. newer versions could remove deprecated methods which you are using, i.e. which weren't deprecated when you wrote your code.)
Different versions of your code could rely on different versions of the library (e.g. you update your code to a newer version of the library.) So which version of the library you rely on also needs to be version-controlled.
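With submodules, that pinning is part of each commit of the outer project. A small sketch, assuming the library is a submodule and you run this from the top of the outer project; the function names and the `GIT` dry-run override are invented for the example.

```sh
# GIT defaults to "git"; set GIT="echo git" to print the commands instead.

# pin_submodule PATH REV
# Checks out REV (a tag or commit) inside the submodule at PATH and stages
# the new pointer, so the next commit of the superproject records it.
pin_submodule() {
    ${GIT:-git} -C "$1" checkout --detach "$2" &&
    ${GIT:-git} add -- "$1"
}

# pinned_commit PATH [REV]
# Prints the tree entry for PATH as of superproject revision REV (default
# HEAD); its third field is the library commit that revision depends on.
pinned_commit() {
    ${GIT:-git} ls-tree "${2:-HEAD}" "$1"
}
```

Checking out an old revision of your code and running `git submodule update` then brings the library back to the version that revision was written against.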
For instance because you have componentized your product into different repositories, use different combinations of the components in different projects, but are still actively developing many of them simultaneously on each of those projects. Having to build, deploy and fetch gems (or whatever other method of distribution your language uses) is cumbersome if development is still really active.
I don't think you should ever want to bring unreliable external components, like a random github project that you aren't actively contributing to, into your tree like that.
"For instance because you have componentized your product into different repositories, use different combinations of the components in different projects, but are still actively developing many of them simultaneously on each of those projects."
Funny, I thought this was the beginning of an argument agreeing with me. :)
git-subtree is not the same as a subtree merge. The author himself likes to point that out. A subtree merge is a one-time operation. git-subtree, on the other hand, enables you to continue to merge in upstream changes. You can also split the subtree (including its history) from your repo and make it a standalone repository again.
Thank you for the clarification. Quoting the author:
“[Subtrees] are also not to be confused with using the subtree merge strategy. The main difference is that, besides merging the other project as a subdirectory, you can also extract the entire history of a subdirectory from your project and make it into a standalone project. Unlike the subtree merge strategy you can alternate back and forth between these two operations. If the standalone library gets updated, you can automatically merge the changes into your project; if you update the library inside your project, you can ‘split’ the changes back out again and merge them back into the library project.”
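The back-and-forth described in that quote maps onto three git-subtree subcommands. A sketch, with invented wrapper names and a `GIT` dry-run override that is purely illustrative; the prefix, repository and ref are whatever applies to your project.

```sh
# GIT defaults to "git"; set GIT="echo git" to print the commands instead.

# subtree_import PREFIX REPO REF
# Brings REF of REPO into this project under the directory PREFIX (done once).
subtree_import() {
    ${GIT:-git} subtree add --prefix="$1" "$2" "$3"
}

# subtree_update PREFIX REPO REF
# Merges later changes from the standalone library into PREFIX.
subtree_update() {
    ${GIT:-git} subtree pull --prefix="$1" "$2" "$3"
}

# subtree_extract PREFIX BRANCH
# Rewrites the history of PREFIX as a standalone history on a new local
# BRANCH, which can then be pushed or merged back into the library project.
subtree_extract() {
    ${GIT:-git} subtree split --prefix="$1" -b "$2"
}
```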
I've been trying out git subtree and I understand its advantages. But I've wondered if there's a way to use it without adding the merge commits to the timeline when I update the subtree repository. Something similar to what git rebase does. (Maybe this is a dumb question and I'm not using git subtree right) :)