

Why I Don't Hate Git: Hidden Consistency - kozlovsky
http://lucumr.pocoo.org/2015/2/17/ui-and-hidden-consistency/

======
jordigh
(copied over from lobsters, with links to more material there:
[https://lobste.rs/s/odj5y1/why_i_don_t_hate_git_hidden_consi...](https://lobste.rs/s/odj5y1/why_i_don_t_hate_git_hidden_consistency/comments/dtlwwf#c_dtlwwf)
)

Haha, I spurred this debate with mitsuhiko over IRC yesterday. I was arguing
with him over git vs hg yesterday, and this blog post is obviously his retort.
Here is my defence of Mercurial:

Mercurial’s design aims to welcome people who come from other VCSes. It
started out being welcoming to CVS and SVN users, and its CLI mimicked those,
as well as a few ideas from BitKeeper. Git’s initial design was very
BitKeeper-like too, such as branching-by-cloning being the only way to branch.
Nowadays, Mercurial also makes some concessions to git users.

Despite its various sources of inspiration, Mercurial works hard to keep all
of these ideas consistent. Commands rarely grow new options. Many deviations
from core functionality are first tested in optional extensions for a long
time before going into the core. Lots of time is spent bikeshedding what a
command’s name should be, and what language the documentation should use.
Consistency is very important. Backwards compatibility is paramount. The CLI
is the UI and the API.

A thing git is often lauded for is the simplicity of its internals, which are
frequently deemed to be so simple as to not be internal at all. Despite being
a binary format, Mercurial’s revlogs are also approximately simple, which is
why people sometimes write parsers in other languages.

But Mercurial is a lot more than just git-with-a-nicer-UI. There are many
exciting features in Mercurial, features that I don’t think will ever make it
into git because they are just too different from the way git works. Mercurial
Evolve really changes the way we collaboratively edit commits. Templates and
revsets can be combined to program interesting extensions. New extensions can
scale Mercurial into gigantic repos.

And because I think these ideas are so great and must be explored and
improved, I will keep using Mercurial, teaching Mercurial, and improving
Mercurial.

~~~
alblue
No, Mercurial revlogs are not approximately simple. They were designed to be an
append-only log for individual files, in the same way that CVS and RCS
performed file-level change repositories. This choice causes problems when
doing file renames and history merging because you need to either duplicate
data or provide complex pointers between filenames which may not even exist in
the tip.

I wrote a post comparing the file layouts a few years ago:

[http://alblue.bandlem.com/2011/03/mercurial-and-git-technica...](http://alblue.bandlem.com/2011/03/mercurial-and-git-technical-comparison.html)

The TLDR is that Mercurial revlogs were designed from lofty architectural goals
(append-only formats, designed to be parsed forwards) whereas Git is just an
object soup with pointers to other objects in the same soup. As a result,
different file storage mechanisms have been created (direct file, push to
remote HTTP/S3, BigTable etc.) and new features (bitmaps, packed archives,
delta compression between the same and different filenames) have been grafted
on over time. The format is also versioned, with feature versions being added
at a later stage to the transport protocols that ship packs of these deltas
between versions.
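
A minimal sketch of the "object soup" idea, in Python. Only the blob hashing rule (SHA-1 over `blob <size>\0<content>`) mirrors real Git; the `ObjectSoup` class and the plain-text tree and commit payloads are made up for illustration and are not Git's actual formats.

```python
import hashlib


class ObjectSoup:
    """Toy content-addressed store: every object is keyed by the hash of its bytes."""

    def __init__(self):
        self.objects = {}  # hex digest -> raw bytes (header + payload)

    def put(self, kind: str, payload: bytes) -> str:
        # Git hashes "<type> <size>\0<payload>" with SHA-1.
        raw = f"{kind} {len(payload)}".encode() + b"\0" + payload
        oid = hashlib.sha1(raw).hexdigest()
        self.objects[oid] = raw
        return oid

    def get(self, oid: str):
        # Split the stored bytes back into the type name and the payload.
        header, _, payload = self.objects[oid].partition(b"\0")
        kind, _size = header.decode().split(" ")
        return kind, payload


soup = ObjectSoup()
blob_id = soup.put("blob", b"hello world\n")
# Hypothetical, simplified payloads: plain text that points at other object ids.
tree_id = soup.put("tree", f"README {blob_id}\n".encode())
commit_id = soup.put("commit", f"tree {tree_id}\nmessage initial\n".encode())
```

Because the store is just a mapping from id to bytes, swapping the dictionary for files on disk or a remote bucket does not change what the pointers mean, which is the flexibility described above.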

Frankly the only valid criticism of Git appears to be 'The command line flags
are a little funky' and given the extensive Git tooling that has been built
(it's provided by default in Visual Studio, Eclipse, IntelliJ, Xcode and
others) the fact that command line tooling takes a bit of getting used to is
really such a non-issue that I'm not going to waste further time talking about
it here.

Mercurial is dead, but its fans just haven't noticed yet.

~~~
rbehrends
There are problems with Git's "object soup" approach, too. For example:

(1) The performance issues with "git blame" appear to be essentially
unfixable. That tradeoff is by design [1]. Note that by "performance issues" I
mean that git blame can take several minutes on some repositories (e.g.
src/xdisp.c in Emacs).

(2) Git is pretty much tied to a 1:1 repository:directory model and cannot
safely support a 1:1 branch:directory model with multiple branch checkouts
sharing a repository (git-new-workdir as the closest approximation is not
safe).

In general, a lot of operations on Git have an O(branch history size) or even
O(repo size) complexity; that is not a problem if you do not need them, but it
puts limits on what you can do efficiently with Git (at least without adequate
caching and the necessary porcelain to use it).

That, of course, on top of the other criticism typically targeted at Git
(poorly designed command line interface, difficult to understand internal
model [2], possibility of data loss [3], etc.).

> Mercurial is dead, but its fans just haven't noticed yet.

Not everybody here is concerned with the silly Git vs. Mercurial war (the
modern version of Emacs vs. Vi). My personal concern is that the result of
widespread Git adoption is that VCS development has stalled and has settled
for a "good enough" that I don't really consider good enough; Mercurial is
interesting not because they're doing things better (though they do some
things better, and other things worse), but because they appear to be the one
team still actually experimenting with new things (e.g., changeset evolution).
I really wish there were more going on with Bazaar (which is basically in
maintenance mode now, and Canonical doesn't even seem to put a whole lot of
effort into that, with lots of bugs still outstanding) or Fossil (which is
mostly trying to be conservative rather than innovative [4]).

In short: Competition is good, monoculture is bad. The desire to have one VCS
to "rule them all" worries me.

[1]
[http://marc.info/?l=git&m=116991865311836](http://marc.info/?l=git&m=116991865311836)

[2]
[http://people.csail.mit.edu/sperezde/onward13.pdf](http://people.csail.mit.edu/sperezde/onward13.pdf)

[3] [http://jenkins-ci.org/content/summary-report-git-repository-...](http://jenkins-ci.org/content/summary-report-git-repository-disruption-incident-nov-10th)

[4] Which is a worthy goal in itself, but it also means that they aren't
really moving VCS development forward.

~~~
saidajigumi
Your data loss example [3] is precisely _not_ data loss, like virtually every
other example I've seen regarding git. The commits and history were
definitively not lost, and although the article doesn't say, the repositories
affected by this tool should have even still contained the history record of
the old head(s), making rollback to the latest commits a fairly
straightforward affair.

This is honestly no different than someone misconfiguring a tool for any other
VCS and overwriting the repo's head with old junk. In the (other VCS) case,
the fix operation is to go into VCS history and resurrect the correct head
commits (e.g. via a 'revert' style operation). In git's case, the fix
operation is ... go into VCS history and resurrect the correct head(s),
traditional revert not being relevant to how git functions in such exceptional
cases. It's either ignorance or disingenuousness to call git's behavior here,
though different from SVN, hg, etc., "data loss".

~~~
rbehrends
First, how do you know that no data was lost? There is no way to even verify
that all data was recovered. They are pretty confident, but there's really no
guarantee, is there?

More generally, yes, the case to worry about is typically not the one where
there are lots of repository clones in circulation (though the KDE case [1]
shows that data loss is quite possible even then and that replication is no
real alternative for proper backups [2]).

And, of course, the reflog will keep commits alive for a while and garbage
collection will not occur until the grace period is over.

The situation where this doesn't work so well is personal/small group
repositories or branches that only experience intermittent commits and that
aren't being mirrored by a large number of users. In that case, user errors
can easily translate into data loss when garbage collection finally catches up
with you.

[1]
[http://jefferai.org/2013/03/29/distillation/](http://jefferai.org/2013/03/29/distillation/)

[2] A practical example would be where a repository is so large that many
contributors prefer to use shallow clones.

------
krupan
Basically this says, "git's UI was so bad that it forced me to learn the
internals, and once I grokked the internals git made a whole lot of sense."
That summary might sound like I'm trolling but I don't think that's a bad
thing (else, why is git so popular?). I've been the mercurial "expert" on my
team at work for the past 5 years and I can't count the number of little DAG
diagrams I've drawn trying to get people to understand what was _really_ going
on with their repository. I'm pretty sure there are a few people who still
don't really understand how mercurial works under the hood. Maybe that's a
positive aspect of mercurial...but maybe it's not :-)

Now, I will say that the internals of mercurial that you need to understand to
gain enlightenment seem simpler to me than what you have to know for git.
Mercurial has commits and commits have parent reference(s) that link commits
together into a DAG. Commits might have a branch label. You move commits from
one repo to another with push and pull. You create a commit with two parents
by doing a merge. And that's it!
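
As a rough sketch of that model in Python (the `Commit` class, the `ancestors` helper, and the commit names are all hypothetical, not Mercurial's actual data structures):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    name: str
    parents: tuple = ()       # zero, one, or two parent commits
    branch: str = "default"   # optional branch label stored on the commit


def ancestors(commit):
    """Names of every commit reachable through parent links, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c.name not in seen:
            seen.add(c.name)
            stack.extend(c.parents)
    return seen


root = Commit("a")
left = Commit("b", (root,))
right = Commit("c", (root,), branch="feature")
merge = Commit("d", (left, right))  # a merge is just a commit with two parents
```

Walking `ancestors(merge)` visits all four commits, which is the whole DAG in this toy history.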

Git has those same basic concepts, but you also have to know about the index,
and branches (which are really pointers to commits that may or may not be a
branch in the DAG), and remote branches, and merges that aren't really merges
because they just move the branch (which is a pointer, remember, that's why a
"branch" can move) to the head, and all kinds of other interesting (and
useful, no doubt) concepts.
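
The "merge that just moves the branch" case can be sketched like this in Python. This is a toy model under stated assumptions: the `parents` and `branches` dictionaries, the helper names, and the three-commit history are invented for illustration and do not reflect how git stores refs.

```python
# parents maps each commit id to the ids of its parent commits (hypothetical history).
parents = {"a": [], "b": ["a"], "c": ["b"]}

# A branch is only a name pointing at one commit.
branches = {"master": "a", "topic": "c"}


def is_ancestor(old, new):
    """True if `old` can be reached from `new` by following parent links."""
    stack, seen = [new], set()
    while stack:
        c = stack.pop()
        if c == old:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return False


def fast_forward(target, source):
    """Move `target` to where `source` points, if no real merge is needed."""
    if not is_ancestor(branches[target], branches[source]):
        raise ValueError("histories diverged; a real merge commit is needed")
    branches[target] = branches[source]  # no new commit is created


fast_forward("master", "topic")
```

After the call, `master` and `topic` name the same commit and the commit graph itself is unchanged; only the pointer moved.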

------
fsk
Three words against git: Detached Head State

I view git as a sort of shibboleth.

You can't really understand how git works unless you understand trees as a
data structure. That excludes all but the hardcore types.

Some designers and CSS experts need to use source control, but Git is too
complicated for them.

Once you get a detached head state or corrupted repo, then you need a git
expert to clean things up. I once committed while in a detached head state,
and so git ate my changes and I had to reflog to recover them. That is just
insulting.
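
For anyone wondering what actually happens there, here is a toy model in Python. The `Repo` class and the commit ids are hypothetical and the reflog is reduced to a plain list, so treat it as a sketch of the idea rather than git's real behavior in every detail.

```python
class Repo:
    def __init__(self):
        self.parents = {"c1": None}        # commit id -> parent id
        self.branches = {"master": "c1"}   # branch name -> commit id
        self.head = ("branch", "master")   # HEAD names a branch, or a commit directly
        self.reflog = ["c1"]               # every commit HEAD has pointed at

    def head_commit(self):
        kind, value = self.head
        return self.branches[value] if kind == "branch" else value

    def checkout_commit(self, commit_id):
        self.head = ("detached", commit_id)
        self.reflog.append(commit_id)

    def checkout_branch(self, name):
        self.head = ("branch", name)
        self.reflog.append(self.branches[name])

    def commit(self, commit_id):
        self.parents[commit_id] = self.head_commit()
        kind, value = self.head
        if kind == "branch":
            self.branches[value] = commit_id     # the branch pointer advances
        else:
            self.head = ("detached", commit_id)  # only HEAD moves; no branch follows
        self.reflog.append(commit_id)


repo = Repo()
repo.checkout_commit("c1")      # detached HEAD
repo.commit("c2")               # work done while detached
repo.checkout_branch("master")  # "c2" is now on no branch

lost = "c2" not in repo.branches.values()
recovered = repo.reflog[-2]     # the reflog still remembers it
repo.branches["rescue"] = recovered
```

The commit is never deleted at this point; it simply has no branch name, which is why pointing a new branch at the reflog entry brings it back.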

At my job, I work with some designers now, and they always leave the test
server in a detached head state.

But when I switched from the GitHub client (yuck) to the SourceTree client,
most of my concerns went away.

~~~
allemagne
>You can't really understand how git works unless you understand trees as a
data structure. That excludes all but the hardcore types.

DAGs and BSTs are taught in second year undergraduate classes where I'm at.
What's hardcore about them? Serious question.

~~~
to3m
Non-programmers tend not to have taken undergraduate computer science classes!

~~~
fsk
Yes, that was my point. There are people without a CS degree who should use
source control. For example, designers should use source control for their
work. Explaining git to them is not feasible.

------
scrollaway
Good article. That old tutorial is scary :)

> I screwed up really badly before, merging wrong things together,
> accidentally deleting data and much more, yet I never lost any data or felt
> left abandoned by my tool.

I can relate for the most part. I can only think of one instance (in over half
a decade) where I felt git's shortcomings: there is no way to get a deleted
non-gc'd object from a remote to your local repository, even if you try to
reference it by its sha1.

This happened to me when some bad changes were force-pushed to a repository on
GitHub and I did not have access to a machine which had the latest changes. My
repository on GitHub still knew about the old commits, but they were
unreachable by git itself.

~~~
phs2501
> I can relate for the most part. I can only think of one instance (in over
> half a decade) where I felt git's shortcomings: there is no way to get a
> deleted non-gc'd object from a remote to your local repository, even if you
> try to reference it by its sha1.

For better or worse (and I've wanted it to work too) it's an intentional
security feature that you can only pull objects from a git remote that are
reachable by its refs; that way deleted branches (e.g. containing data that
wasn't intended for release) are instantly unavailable rather than needing to
wait for GC.
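
A minimal sketch of that rule in Python, assuming a server that answers a fetch only for commits reachable from its refs. The function names, the commit ids, and the single `refs/heads/master` ref are hypothetical, and real servers have configuration options that this ignores.

```python
def reachable(refs, parents):
    """Collect every commit reachable from any ref by following parent links."""
    seen, stack = set(), list(refs.values())
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def can_fetch(commit_id, refs, parents):
    # The object may still exist on disk; it is served only if a ref leads to it.
    return commit_id in reachable(refs, parents)


# Hypothetical server state: "secret" was on master before a force-push.
parents = {"base": [], "secret": ["base"], "clean": ["base"]}
refs = {"refs/heads/master": "clean"}  # after the force-push

clean_ok = can_fetch("clean", refs, parents)
secret_ok = can_fetch("secret", refs, parents)
```

Here `secret` is still present in `parents`, standing in for an object that has not been garbage collected yet, but nothing in `refs` leads to it, so the fetch check refuses it.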

~~~
scrollaway
Seems like a shaky justification. I understand not offering things that are up
for deletion but there wasn't even a way to do
git pull --i-really-want-everything or some such.

If you push passwords and keys to your git server, then force-push those
things out, you most definitely want to run a gc. Git is a flimsy security
layer around this.

~~~
avar
Of course you can't run "git pull --i-really-want-everything", you're the
remote attacker this feature is meant to protect against!

The use-case for this is that you're pushing to some shared hosting like
GitHub where you can overwrite and delete refs, but you can't force a gc.

You don't want someone to scour your Git commit announcements and see "oops,
deleted password!" and go and fetch the deleted SHA1.

------
akkartik
Do others have examples of software "where the way it works is a crucial part
of the user experience"? The one that comes to my mind is lisp macros; if you
mess with them for a month or two you can't help but have a pretty good
understanding of how they work. Clean internals can colonize your brain in a
way that merely clean interfaces can never compete with.

~~~
pepve
I like Redis as an example in this category.

I also think this ties in with the law of leaky abstractions, in that a good
understanding of that law will make ui designers choose lesser/thinner
abstractions over bigger/leaky ones.

------
melloclello
This article raises the question - what might a sane/consistent UI on top of git
look like?

~~~
tootie
I'm surprised a facade hasn't emerged yet. Personally I think it's nigh
impossible to sanely do merges with a GUI. I know people do, but I think
they're crazy. Using git via any JetBrains IDE is the best I've seen.

~~~
recursive
I'm confused. Isn't the JetBrains IDE a gui? How can it be the best you've
ever seen and impossible to sanely do merges?

FWIW, I've only ever done git merges in a gui, and never had any difficulty
with it.

~~~
gknoy
He might mean that he uses Jetbrains' IDEs with git for nearly everything, but
feels that using it to merge things is madness.

It seems strange, but I really like having a console-based workflow with my
VCS. I can leverage grep, shell aliases, and so on to customize my interface a
fair deal, and the forced interaction helps keep me mindful of what I am
doing.

------
zallarak
The conceptual model matches the implementation - this is what Don Norman's
"Design of Everyday Things" says is good design. When this is the case, a user
can interact with the object's interface using logic and intuition and get
predictable results.

I agree, git really embodies this.

~~~
bch
> The conceptual model matches the implementation

It depends where on the conceptual stack you think you lie. I use fossil[1]
because with it I don't have detached heads, don't need to understand the
backing store to reason about "where am I?", and don't have "porcelain" that
seems to leave every single operational aspect laid out for me instead of
abstracting it away. When I work w/ version control, I work with files, and
putting them away. Detached heads shouldn't be my concern, nor the conflicting
ways of describing how to sync[2], etc., etc., etc.

[1] [http://fossil-scm.org/index.html/doc/trunk/www/index.wiki](http://fossil-scm.org/index.html/doc/trunk/www/index.wiki)

[2] [http://stackoverflow.com/questions/15316601/in-what-cases-co...](http://stackoverflow.com/questions/15316601/in-what-cases-could-git-pull-be-harmful)

EDIT: spell 'porcelain' correctly.

------
wcummings
Having grown up w/ git (svn was on the way out when I started programming) and
then having had to use svn at a previous employer (I migrated it to git,
eventually), I can't fathom a reason to prefer svn or to "hate" git other than
ignorance.

You can spot an ex-svn user from a mile away by their commit history. We need
rehabilitation clinics.

~~~
marssaxman
There are more VCSes in the world than just those two. Many people seem to
credit git with all the virtues of distributed version control in general,
without noticing that its interface is terrible and that the same ideas have
been implemented in a much more friendly fashion by other tools. Git is
popular because it is the official vcs of the Linux kernel, and the network
effect did the rest.

~~~
therealdrag0
> in a much more friendly fashion by other tools.

Examples?

~~~
ufo
Mercurial, Darcs, and so on...

------
serve_yay
I am getting a message when I push stuff to Github (or any remote) because
there is a git config value that I need to set. I just did a search to try to
understand the effect of each option for this value, so I could pick the one
that best fits how I want git to act. I am approximately 90% sure I understand
what each one does, however all of the explanations of them have some element
of confusion or lack of clarity. This is the problem of using git.

~~~
eridius
Could your "problem of using git" be a result of "doing a search" instead of
reading the documentation that Git provides? `git help config` gives
documentation on all of its config values.

------
zem
tangentially, that's a really lovely header font ("jim nightshade"). one of
the very few times i've seen a "fancy" font work well in a post like this.

