
Mastering Git submodules - tddian
https://medium.com/@porteneuve/mastering-git-submodules-34c65e940407
======
jessaustin
Nice to see this caveat right at the beginning:

 _On the other hand, if the technological context allows for packaging and
formal dependency management, you should absolutely go this route instead: it
lets you better split your codebase, avoid a number of side effects and
pitfalls that litter the submodule space, and let you benefit from versioning
schemes such as semantic versioning (semver) for your dependencies._

~~~
nawitus
The drawback is that things like feature branches are more difficult to use.
For example, with npm you need to resort to various scripts or handle them by
yourself manually.

~~~
jessaustin
You're right, but in general I don't think any npm module ought to depend on a
feature branch of another npm module. If a feature is worth keeping around,
it's worth publishing (even if only to your private npm repo). If 19 different
projects want 19 different behaviors from the _module.foobar()_ function, then
split the function up or pass it flag parameters or whatever, but _don't_
keep 19 different versions of the code floating around. Pain is inevitable in
that scenario. Of course while you're developing a new feature you might want
to use it from another module on your dev machine, but "npm link" is what you
want to use for that.

~~~
nawitus
>Of course while you're developing a new feature you might want to use it from
another module on your dev machine, but "npm link" is what you want to use for
that.

Yeah, but that's the manual way. It works okay when you're only developing a
small number of modules (say 1 to maybe 3), but at some point doing it
manually becomes pretty annoying. And you need scripts and commits to
package.json files if you have a build server which makes builds of feature
branches.

------
jawngee
I swear I must be the only person in the world that has no problems using
submodules and I use them _constantly_. It puzzles me that people have such
problems with it, which I guess illustrates how unreliable anecdotal evidence
really is.

I've had detached head issues, but nothing that didn't take a minute or two to
solve.

It might be that I'm using SourceTree exclusively, so maybe SourceTree is
hiding away the painful bits or something.

~~~
e40
_which I guess illustrates how unreliable anecdotal evidence really is._

Or, maybe your use case skirts around the problems?

~~~
edvinbesic
I think you're both saying the same thing.

------
chocolateboy
For a less painful [1] solution, see git-subrepo. [2]

[1] https://github.com/ingydotnet/git-subrepo/blob/master/Intro.pod#git-submodules

[2] https://github.com/ingydotnet/git-subrepo

~~~
perlgeek
I use git-subrepo too, but it has its own set of warts. It generates pretty
verbose (and ugly) commit messages made of json. It doesn't let you do
anything (like a pull or clone) when your working directory is dirty.

But on the whole, it is often less painful than submodules. I haven't tried
subtree yet, will do that next :-)

~~~
chocolateboy
> It generates pretty verbose (and ugly) commit messages made of json.

This was fixed recently. [1]

> I haven't tried subtree yet, will do that next :-)

The git-subrepo intro has a pretty good overview of the issues with git-
subtree. [2]

[1] https://github.com/ingydotnet/git-subrepo/issues/40

[2] https://github.com/ingydotnet/git-subrepo/blob/master/Intro.pod#git-subtrees

~~~
perlgeek
OK, the commit messages are less ugly now, but they still give the wrong kind
of information.

Four lines of the commit message are about git-subrepo, which I don't care
about. I'm not a git-subrepo developer, and don't care what version of the
tool was used to generate the commit.

What I do care a lot about is _why_ a commit was made, and those auto-
generated commit messages don't tell me that at all. They can't, because the
tools don't read minds. That's why git usually prompts me for a commit
message. I like that. It's one of the reasons I'm using version control. It
would be nice if git-subrepo played along.

~~~
chocolateboy
The metadata in the commit message isn't used in any way, so you can always:

    git commit --amend

I personally find it a useful default, though, i.e. there's not usually much
more I want to say than "added a subrepo: [metadata...]", and, for something
like this, I'd rather err on the side of too much information than too little.
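
To illustrate the amend step, here is a minimal sketch in a throwaway repository. The "auto-generated" message, the file name, and the replacement message are all invented for the demo; a plain commit stands in for whatever git-subrepo would actually produce.

```shell
set -eu
# Identity for the throwaway commits (assumed values, demo only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Stand-in for a tool-generated commit; the message is made up for this demo.
echo vendored > lib.txt
git add lib.txt
git commit -q -m "auto-generated message with tool metadata"

# Rewrite the message of the most recent commit so it explains the "why".
git commit -q --amend -m "Vendor lib to fix date parsing in reports"

# Show the subject line of the (single) resulting commit.
git log -1 --format=%s
```

Amending replaces the last commit with a new one, so it is safest to do before that commit has been pushed.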

------
TheHippo
I find git subtrees[1] way better than submodules.

[1]: http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/

P.S.: Since when is GoDoc a dependency manager for Go?

~~~
ern
Agreed.

It should be noted that there are two confusingly similar concepts in git:
the _subtree merge strategy_ and the _git subtree_ command (which is, in
fact, a git command like _git submodule_, a point which seems ambiguous in the
OP).
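
A rough sketch of the first of the two, the subtree merge done with core git commands only, in throwaway repositories. The repository names, file names, and the `vendor/lib/` prefix are assumptions for the demo. The separate _git subtree_ command wraps a similar idea behind something like `git subtree add --prefix=vendor/lib <repository> <ref>`, but it ships as a contrib script and may not be installed everywhere, so it is not run here.

```shell
set -eu
# Identity for the throwaway commits (assumed values, demo only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A tiny "library" repository to embed.
git init -q lib
( cd lib && echo "hello from lib" > lib.txt && git add lib.txt &&
  git commit -q -m "lib: initial commit" )

# The project that will absorb the library's history.
git init -q app
cd app
echo "app" > app.txt
git add app.txt
git commit -q -m "app: initial commit"

# Fetch the library's history, start a merge that keeps our tree as-is,
# then read the library's tree into the index under vendor/lib/.
git fetch -q ../lib HEAD
git merge -q -s ours --no-commit --allow-unrelated-histories FETCH_HEAD
git read-tree --prefix=vendor/lib/ -u FETCH_HEAD
git commit -q -m "Merge lib as vendor/lib"

# The library's files are now ordinary tracked files of the project.
git ls-files
```

The resulting commit has two parents, so the library's history stays reachable from the project without any `.gitmodules` bookkeeping.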

------
Navarr
> For instance, themes and plugins for Wordpress, Magento, etc. are often de
> facto installed by their mere presence at conventional locations inside the
> project tree, and this is the only way to “install” them.

> In such a situation, going with submodules (or subtrees) probably is the
> right solution

As a Magento Developer, I'm afraid submodules wouldn't work for Magento
plugins, as they unfortunately have to be installed to multiple top-level
project directories. It is a nightmare (and is better served by the composer
efforts for dependency management).

~~~
tokenizerrr
The last time I worked on Magento, and it has been a while, we would install
the dependency in a single directory, and then you'd run a tool over it to
"install" it into the proper locations. Sadly I cannot remember the name of
this tool, but from memory it seems you could combine that with submodules?

We used this so we could track an entire extension's source code in Git, and
move them around easily.

edit: I was thinking of modman

------
kolev
I use them daily and they're painful! Thankfully, there's Peru [0].
Unfortunately, it only works on Python 3.3+.

[0]
[https://github.com/buildinspace/peru](https://github.com/buildinspace/peru)

~~~
rspeer
What's unfortunate there? It's not like Python is hard to install.

~~~
kolev
For stuff you use daily - not an issue, but having to install Python 3.3+, and
then Peru, and then being able to do something with a third-party project
would be a bit too much for some.

------
gitaarik
I wrote a very brief how-to on Git Submodules [1] for my colleagues a while
ago, because a lot of people seemed to have trouble with it. It explains the
real basics and how to use them painlessly in most situations. So it's much
shorter than the OP's article and that might help if you don't want to force
everyone in your team to read a big article like this.

[1]:
[https://gist.github.com/gitaarik/8735255](https://gist.github.com/gitaarik/8735255)

~~~
bronson
This guide seems woefully incomplete and optimistic. No coverage of submodule
merge conflicts? No describing how to check out a revision before the submodule
was added or after it was removed, or how to bisect a project with submodules?
No warnings about accidentally abandoning commits on submodules? (maybe: don't
edit files in a submodule and for pity's sake never commit to one without
immediately pushing?) Should also probably cover what your whole team must do
when a submodule moves or gets renamed upstream.

Also, you might want to mention how git status is affected by submodules (&
diff, log, etc).

Finally, I'm not sure you should make --init --recursive the default. If you
don't realize a project contains submodules, you're going to make mistakes.
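
To make the `--init --recursive` point concrete, here is a sketch of what a plain clone looks like before and after the explicit init step, using throwaway local repositories. All names and paths are invented for the demo, and `protocol.file.allow=always` is only set because recent Git versions refuse local-path submodule clones by default.

```shell
set -eu
# Identity for the throwaway commits (assumed values, demo only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A tiny repository that will serve as the submodule.
git init -q lib
( cd lib && echo "hello" > lib.txt && git add lib.txt &&
  git commit -q -m "lib: initial commit" )

# A superproject that pins lib as a submodule under vendor/lib.
git init -q app
( cd app &&
  git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib &&
  git commit -q -m "Add lib as a submodule" )

# A plain clone records the submodule but leaves its directory empty.
git clone -q app clone
cd clone
before=$(git submodule status | cut -c1)   # "-" marks an uninitialized submodule
echo "status prefix before init: $before"

# The explicit step every collaborator has to run (or that --recursive hides).
git -c protocol.file.allow=always submodule --quiet update --init --recursive
ls vendor/lib
```

Seeing the `-` prefix in `git submodule status` is one cheap way to notice that a project has submodules you have not initialized yet.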

------
donatj
We used them for a few months and it was always painful. Always. We ended up
writing our own composer installer plugin that let us put things in the
various places we need them and it's been soo much better. Someone needs to
make a dead simple, non-language-specific git-tag package manager where I just
specify github.com/blah:1.0.* -> folder/blah and it keeps it up to date. It's
what we're using composer for, but it's a stretch.

------
tinco
At work we have a git repository called the 'projectname/super-project'.
Developers clone this repo, and all it contains is a pair of shell scripts and
a bunch of git modules. A module for every project, it's a SaaS cluster, we
have projects like 'http-frontend', 'http-backend', 'gateway', 'indexer',
'marketing-website' and some forks of open source projects we customized for
our use case.

After a fresh developer has cloned the super project, they just run the
`./setup` shell script and it will inform them of any dependencies their
system doesn't meet and then start the docker provisioning process. All
subprojects have docker containers associated with them, and we have shell
scripts that launch all docker containers and link them together.

A new developer can cold provision the entire cluster on their laptop in a
matter of minutes, pretty neat :)
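
For readers wondering what such a `./setup` script might look like, here is a purely hypothetical sketch, not the script described above. The function names and the list of required tools are assumptions; the docker provisioning step is left out.

```shell
# Print each required tool that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical setup flow: report unmet dependencies, then fetch every
# subproject pinned by the super-project.
setup() {
  missing=$(missing_tools git docker)
  if [ -n "$missing" ]; then
    echo "Please install: $missing" >&2
    return 1
  fi
  git submodule update --init --recursive
}
```

Calling `setup` from the root of the super-project would then either list what is missing or check out all subprojects at their recorded commits.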

~~~
ericclemmons
Nice to see this post! We're dismantling a monolith and chose this exact same
solution. I'm glad it's working out in the wild.

------
Mathiasdm
Interesting overview!

Some of these caveats are kinda surprising for me, coming from a Mercurial
background:

* Every time you add a submodule, change its remote’s URL, or change the referenced commit for it, you demand a manual update by every collaborator. Forgetting this explicit update can result in silent regressions of the submodule’s referenced commit. -- This is something handled automatically in Mercurial. Is there any reason why the same behaviour is not used in Git?

* Commands such as status and diff display precious little info about submodules by default. -- This should be possible to implement, no? It's also something that's available in Mercurial (using the --subrepos flag), and it's a huge boost to usability.
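
On the second point, Git does have opt-in settings that make `status` and `diff` say more about submodules. A small sketch, run in a throwaway repository so it does not touch any real configuration (the same keys can be set with `--global` if you like the result):

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Have `git status` include a summary of commits that changed in submodules.
git config status.submoduleSummary true

# Have `git diff` list the submodule commits in the changed range
# instead of only printing the old and new commit hashes.
git config diff.submodule log

# Read the values back to confirm what is now in effect for this repository.
git config --get status.submoduleSummary
git config --get diff.submodule
```

The diff behaviour can also be requested for a single invocation with `git diff --submodule=log`.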

~~~
perlgeek
> Commands such as status and diff display precious little info about
> submodules by default. -- This should be possible to implement, no?

Not just them. git archive and git grep for example totally ignore submodules.
The whole thing is bolted on, with minimal integration into the core commands.
So, usually something to avoid.

------
tddian
By the way, the subtrees article I alluded to in the original article was
written since, and I _do_ favor subtrees over submodules, every time:
[https://news.ycombinator.com/item?id=9080096](https://news.ycombinator.com/item?id=9080096)

------
crdoconnor
I was thinking of using this on a project where a part of the code base needs
to be kept secret from some members of the team (stupid proprietary
requirement).

It sounds painful, though.

