

CVS's problems resurface in Git - suraj
http://lackingrhoticity.blogspot.com/2010/08/cvss-problems-resurface-in-git.html

======
mansr
Keeping unrelated projects in a single repo is every bit as bad as keeping all
the source code for one project in a single 500k-line C file. CVS modules are
a nightmare to work with, and I consider the lack of such abominations a
feature of git et al, not a shortcoming.

The author's straw man of a git repo per file is also quite unconvincing.

Git may have some shortcomings, but this is not one of them.

~~~
sandGorgon
Not really. The problem is the way that projects depend on other projects.
This is especially clear if you use a package manager like apt-get:
installing something pulls in tons of other libraries/dependencies.

When you want to build something, you want to build against its
dependencies, which you would want to tag in some manner.

For example, v1.0 of ProjA needs v0.8 of ProjB and v3.2 of ProjC. A nice
example is NodeJS, which needs Google's V8 engine. Take a look at the number
of changesets that simply manage revisions of V8 inside the NodeJS tree
(<http://github.com/ry/node/commits/master/deps/v8>).

In an ideal world, you could have tagged a version of NodeJS source code with
a particular version of V8 (which lives in a different repository), all under
the same directory structure.

~~~
pilif
If v8 were available as a git repository, node could have gone the submodules
route, which works perfectly and solves the exact problem the parent article
is talking about.

To stay with node/v8, you would add v8 as a submodule and check out a specific
revision or tag. The parent repository (node) would keep track of that
revision and whenever you clone the parent, you'd be able to fetch the
submodule and check out the correct revision.

If you want to update v8, you cd into the v8 dep directory and check out the
newer tag. The parent repository will see this and let you record the new
submodule revision as a commit in the main repository.

We're using this facility internally with some projects, sometimes adding
internally developed modules as submodules, sometimes adding public
libraries.

Using public libraries as submodules is really cool, as you can use git's
awesome merging features to keep the modifications you made to those libraries
even when you update to a newer upstream version.

I'm having a bit of difficulty explaining the concept, but once you do it,
it's simple and beautiful.
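
The workflow described above could be sketched roughly as follows. This is only an illustration: the URL and the tag names are placeholders (the thread does not name a real v8 git repository or specific tags), the helper names are made up, and the functions only build the git command lines rather than run them.

```python
import shlex

# Hypothetical values for illustration only; the thread names neither a
# real v8 git URL nor a specific tag.
V8_URL = "https://example.com/v8.git"
V8_PATH = "deps/v8"


def add_and_pin(url, path, rev):
    """Commands to add a submodule at `path` and pin it to `rev`."""
    return [
        ["git", "submodule", "add", url, path],
        # Check out the wanted revision or tag inside the submodule.
        ["git", "-C", path, "checkout", rev],
        # Stage the submodule so the parent records the checked-out commit.
        ["git", "add", path],
        ["git", "commit", "-m", f"Add {path} pinned to {rev}"],
    ]


def update_pin(path, new_rev):
    """Commands to move an existing submodule to a newer revision."""
    return [
        ["git", "-C", path, "fetch", "--tags"],
        ["git", "-C", path, "checkout", new_rev],
        ["git", "add", path],
        ["git", "commit", "-m", f"Update {path} to {new_rev}"],
    ]


def clone_with_submodules(parent_url, dest):
    """Commands to clone the parent and check out the pinned submodules."""
    return [
        ["git", "clone", parent_url, dest],
        ["git", "-C", dest, "submodule", "update", "--init"],
    ]


if __name__ == "__main__":
    steps = add_and_pin(V8_URL, V8_PATH, "old-tag") + update_pin(V8_PATH, "new-tag")
    for argv in steps:
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` from inside the parent repository to actually run it.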

------
rue
Did the upvoters actually read and understand this post and its errors and
decide to up it just because it was an "interesting" boneheaded post?

Or maybe it is all the anti-TDD folks upvoting first and then maybe checking
whether the whole thing actually makes sense? :)

_Submodules_, if you find you must depend on external libraries at the
repository level (for patching, perhaps).

~~~
amackera
What does git have to do with TDD?

------
js2
> Checking out a Git repository involves downloading not only the entire
> current revision, but the entire history. So this creates pressure against
> putting two partially-related projects together in the same repository,
> especially if one of the projects is huge.

See git clone's "--depth" option. There are limitations to what you can do
with a shallow clone, but you do not, in fact, need to pull down an entire
history.
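
As a small sketch of the `--depth` option mentioned above: the helper name and the example URL are hypothetical, and the function only builds the command line.

```python
def shallow_clone_command(url, depth=1, dest=None):
    """Build a `git clone --depth N` command that truncates history."""
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    argv = ["git", "clone", "--depth", str(depth), url]
    if dest is not None:
        argv.append(dest)  # optional target directory
    return argv


if __name__ == "__main__":
    # Placeholder URL, not a real repository.
    print(" ".join(shallow_clone_command("https://example.com/big-project.git")))
```

With `depth=1` the clone contains only the most recent commit of the default branch, subject to the shallow-clone limitations noted above.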

Regarding the rest of TFA, the author should look at git submodules, as
already mentioned here in the comments. However, submodules should not be used
without understanding their implications and limitations (it is worth
searching the Git mailing list).

There are at least three alternatives to submodules I'm aware of:

1\. git-subtree - <http://github.com/apenwarr/git-subtree>

2\. gclient - <http://www.chromium.org/developers/how-tos/install-gclient>

This is part of so-called "depot_tools" used by Chromium and ChromiumOS, and
can tie together subversion and git repos.

3\. repo - <http://source.android.com/source/git-repo.html>

This is used by the Android project.

The latter two are not really designed to be used outside of the projects for
which they were created, but both are Python code so they can probably be
adapted pretty easily.

------
jerf
The problem is not CVS or Git or SVN. The problem is fundamental and actually
very widespread. Within a given semantic domain, one program, one database
manipulated by one program, one OS process, we can enforce all sorts of useful
guarantees, from consistency of the database to guaranteeing interesting and
useful properties about the data to strong typing (in the program case) and so
on. Once you leave the island, you are back in the untyped swamp of arbitrary
code. Git makes progress by enforcing more interesting properties on a larger
chunk of stuff than CVS, but nobody can enforce those interesting properties
on the entire universe.

When you learn to see this problem, you start to see it everywhere; it's so
fundamental and pervasive you can hardly even see it.

------
dochtman
Mercurial's subrepo is a pretty good stab at fixing this, IMO. It's
unpolished, right now, in that it doesn't deal with some of the more complex
scenarios well, but the idea is really getting there (and it works with SVN
subrepos, too -- we have git in the works).

------
pieter
* _The DAG-based systems don't represent changesets that cross repositories. They don't have a type of object for representing a snapshot across repositories._

* _Creating a tag across repositories would involve visiting every repository to add a tag to it._

These two, at least, are technically accounted for by git submodules, though
they are a bit of a pain to use.

~~~
pilif
I don't know. Personally, I love submodules. Once you "get it", it's really
straightforward.

They really shine if you use them for third party libraries that you are
keeping patches for. In the old days, updating said libraries was nearly
impossible once you began patching them. With submodules, you can use git's
really cool merging capabilities to help you.

Even better: the workflow for updating a library you have patched to a newer
upstream version is the same as doing a traditional merge in your own
codebase. So there's no need to learn even more commands or new workflows -
it just works.

I'm a huge, huge fan of how git submodules work.

~~~
binaryfinery
Until you have a merge conflict. If someone could explain to me which file
contains the conflict markers so I can fix them, I'd be really happy. It's not
the .gitmodules file.

~~~
pilif
If it's a merge conflict happening while updating the submodule, then the
conflict is inside the working copy of that submodule. Go there and resolve it
using the usual tools, commit, and then commit the updated submodule revision
in the parent repository.

~~~
binaryfinery
Except that doesn't work, because that just changes the hash of the submodule
(as stored by the supermodule) again. The problem is that the supermodule just
stores the _hash_ of the submodule. The only way I found of fixing it was to:

1\. Create a new repo.

2\. Get the branched submodule.

3\. Check that in as HEAD. Note: HEAD will now NOT compile - I've committed
broken code.

4\. Go back to the original repo.

5\. Get HEAD: this pulls the hash I need.

6\. Do the merge. The submodule doesn't merge, because it has the right hash.

Surely there must be some way to just modify the hash. But I don't know where
to change it. I found it in several locations, but I don't know how it works,
and hacking my VCS isn't my idea of good sense.
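
On "just modify the hash": the parent repository records a submodule as a "gitlink" entry with mode 160000 in its index and trees, so the pinned hash can in principle be set from the command line. The sketch below shows two possible ways to build those commands; the helper names are hypothetical, it assumes 40-character SHA-1 hashes, and whether either approach fits the exact conflict described above is untested here.

```python
import re

GITLINK_MODE = "160000"  # index mode git uses for submodule entries


def set_submodule_hash_command(path, sha):
    """Command that writes `sha` as the pinned commit for `path` in the index."""
    # Assumption: the repository uses full 40-character SHA-1 hashes.
    if not re.fullmatch(r"[0-9a-f]{40}", sha):
        raise ValueError("expected a full 40-character hex commit hash")
    return ["git", "update-index", "--cacheinfo", f"{GITLINK_MODE},{sha},{path}"]


def pin_by_checkout_commands(path, rev):
    """Higher-level alternative: check out `rev` in the submodule, then stage it."""
    return [
        ["git", "-C", path, "checkout", rev],
        ["git", "add", path],
    ]


if __name__ == "__main__":
    example_sha = "0123456789abcdef0123456789abcdef01234567"  # placeholder hash
    print(" ".join(set_submodule_hash_command("deps/v8", example_sha)))
```

Both variants are run from the top of the parent repository and only stage the new hash; a `git commit` in the parent is still needed afterwards.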

------
nuclear_eclipse
> _In the DAG-based systems, branching is done at the level of a repository.
> You cannot branch and merge subdirectories of a repository independently:
> you cannot create a commit that only partially merges two parent commits._

Please inform me, how can you ever possibly create a commit that is a "partial
merge" from two parent commits? At some point, you had to explicitly choose
what you wanted to merge or not merge, and you can do exactly that in Git, and
I would certainly assume that you could do that in Mercurial and Bazaar too.
Or am I missing something?

------
zokier
I don't see what the problem is here. Git should manage project dependencies?
WTF

------
Seth_Kriticos
I don't get it. What prevents him from writing a simple shell/python/whatever
script that uses a simple hash-table with tag associations for different
repositories for synced checkout via git hooks?
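
A minimal sketch of such a script, assuming each repository is already cloned into a directory named after the project. The manifest contents reuse the ProjA/ProjB/ProjC example from earlier in the thread; the tag name `release-1.0` and the function name are made up for illustration, and hook integration is left out.

```python
# Hypothetical manifest: cross-repository tag -> {repository directory: revision}.
MANIFEST = {
    "release-1.0": {"ProjA": "v1.0", "ProjB": "v0.8", "ProjC": "v3.2"},
}


def checkout_commands(manifest, tag):
    """Return one `git checkout` command per repository pinned by `tag`."""
    if tag not in manifest:
        raise KeyError(f"unknown cross-repository tag: {tag}")
    pins = manifest[tag]
    # Sort by directory name so the output order is deterministic.
    return [
        ["git", "-C", repo_dir, "checkout", revision]
        for repo_dir, revision in sorted(pins.items())
    ]


if __name__ == "__main__":
    for argv in checkout_commands(MANIFEST, "release-1.0"):
        print(" ".join(argv))
```

Keeping the manifest itself under version control in one of the repositories would give the "tag across repositories" a history of its own, which is roughly what submodules, gclient, and repo each do in their own way.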

