(copied over from lobsters, with links to more material there: https://lobste.rs...

alblue · on Feb 18, 2015

No, Mercurial revlogs are not approximately simple. They were designed to be an append-only log for individual files, in the same way that CVS and RCS kept file-level change histories. This choice causes problems when doing file renames and history merging because you need to either duplicate data or provide complex pointers between filenames which may not even exist in the tip.
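To illustrate the rename problem, here is a toy Python sketch of a per-file append-only log. It is not Mercurial's actual revlog format; `FileLog`, `copied_from` and `full_history` are names made up for this example. The point is only that once history is stored per filename, following a rename means chasing a pointer into the log of a path that is no longer in the tip.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class FileLog:
    """Toy append-only history for ONE path (not the real revlog format)."""
    path: str
    # Hypothetical rename pointer: (old path, last revision in the old log).
    copied_from: Optional[Tuple[str, int]] = None
    revisions: list = field(default_factory=list)

    def append(self, text: str) -> int:
        """Append a revision and return its number within this log."""
        self.revisions.append(text)
        return len(self.revisions) - 1


def full_history(logs, path):
    """Return (path, revision) pairs, newest first, following rename pointers."""
    out = []
    log = logs[path]
    upto = len(log.revisions) - 1
    while True:
        out.extend((log.path, rev) for rev in range(upto, -1, -1))
        if log.copied_from is None:
            return out
        old_path, upto = log.copied_from
        log = logs[old_path]


logs = {"old.c": FileLog("old.c")}
logs["old.c"].append("v1")
last = logs["old.c"].append("v2")
# After the rename, old.c is gone from the tip, but its log must be kept.
logs["new.c"] = FileLog("new.c", copied_from=("old.c", last))
logs["new.c"].append("v2, renamed")

history = full_history(logs, "new.c")
```

Here `history` walks `new.c` first and then the two revisions in the log of `old.c`, even though `old.c` no longer exists in the working copy.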

I wrote a post comparing the file layouts a few years ago:

http://alblue.bandlem.com/2011/03/mercurial-and-git-technica...

The TLDR is that Mercurial revlogs were designed from lofty architectural goals (append-only formats, designed to be parsed forwards) whereas Git is just an object soup with pointers to other objects in the same soup. As a result, different storage mechanisms for Git have been created (direct file, push to remote HTTP/S3, BigTable etc.) and new features (bitmaps, packed archives, delta compression between the same and different filenames) have been grafted on over time. The format is also versioned, with feature versions being added at a later stage to the transport protocols that ship packs of these deltas between versions.
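For anyone who hasn't looked inside it, the "object soup" can be sketched in a few lines of Python. The blob id below follows Git's SHA-1 object format (SHA-1 over `blob <size>`, a NUL byte, then the content); the `ObjectSoup` class is a made-up in-memory stand-in for illustration, not how Git stores anything on disk.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Id of a blob in Git's SHA-1 object format: sha1("blob <size>\\0" + content)."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


class ObjectSoup:
    """Toy content-addressed store: a dict from object id to bytes."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        oid = git_blob_id(content)
        self.objects[oid] = content  # storing the same content twice is a no-op
        return oid

    def get(self, oid: str) -> bytes:
        return self.objects[oid]


soup = ObjectSoup()
first = soup.put(b"hello\n")   # content of, say, a.txt
second = soup.put(b"hello\n")  # identical content under another filename
```

Because the id depends only on the content, the filename never enters into it, which is why the backing store can be swapped out (files, HTTP, a database) without touching the model.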

Frankly the only valid criticism of Git appears to be 'The command line flags are a little funky' and given the extensive Git tooling that has been built (it's provided by default in Visual Studio, Eclipse, IntelliJ, Xcode and others) the fact that command line tooling takes a bit of getting used to is really such a non-issue that I'm not going to waste further time talking about it here.

Mercurial is dead, but its fans just haven't noticed yet.

rbehrends · on Feb 18, 2015

There are problems with Git's "object soup" approach, too. For example:

(1) The performance issues with "git blame" appear to be essentially unfixable. That tradeoff is by design [1]. Note that by "performance issues" I mean that git blame can take several minutes on some repositories (e.g. src/xdisp.c in Emacs).

(2) Git is pretty much tied to a 1:1 repository:directory model and cannot safely support a 1:1 branch:directory model with multiple branch checkouts sharing a repository (git-new-workdir as the closest approximation is not safe).

In general, a lot of operations on Git have an O(branch history size) or even O(repo size) complexity; that is not a problem if you do not need them, but it puts limits on what you can do efficiently with Git (at least without adequate caching and the necessary porcelain to use it).
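A toy Python sketch of where the O(history) cost comes from, assuming a purely linear history and a made-up `Commit` type (real Git history is a DAG, and real blame does far more work per commit). With snapshot-style commits, answering "which commit last changed this path?" means walking back from the tip and comparing each commit against its parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commit:
    """Hypothetical commit: a name, one parent, and a snapshot of path -> content id."""
    name: str
    parent: Optional["Commit"]
    tree: dict


def last_change(tip, path):
    """Return (name of the commit that last changed path, commits visited)."""
    visited = 0
    commit = tip
    while commit is not None:
        visited += 1
        parent = commit.parent
        before = parent.tree.get(path) if parent else None
        if commit.tree.get(path) != before:
            return commit.name, visited
        commit = parent
    return None, visited


# a.c is written once in the root commit; 1000 later commits touch only b.c.
tip = Commit("c0", None, {"a.c": "a1", "b.c": "b0"})
for i in range(1, 1001):
    tip = Commit(f"c{i}", tip, {"a.c": "a1", "b.c": f"b{i}"})

old_file = last_change(tip, "a.c")
hot_file = last_change(tip, "b.c")
```

The lookup for `a.c` has to visit every commit on the branch, even though the file changed only once. A per-file log answers the same question by looking at the end of that file's own history.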

That, of course, on top of the other criticism typically targeted at Git (poorly designed command line interface, difficult to understand internal model [2], possibility of data loss [3], etc.).

> Mercurial is dead, but its fans just haven't noticed yet.

Not everybody here is concerned with the silly Git vs. Mercurial war (the modern version of Emacs vs. Vi). My personal concern is that the result of widespread Git adoption is that VCS development has stalled and has settled for a "good enough" that I don't really consider good enough; Mercurial is interesting not because they're doing things better (though they do some things better, and other things worse), but because they appear to be the one team still actually experimenting with new things (e.g., changeset evolution). I really wish there were more going on with Bazaar (which is basically in maintenance mode now, and Canonical doesn't even seem to put a whole lot of effort into that, with lots of bugs still outstanding) or Fossil (which is mostly trying to be conservative rather than innovative [4]).

In short: Competition is good, monoculture is bad. The desire to have one VCS to "rule them all" worries me.

[1] http://marc.info/?l=git&m=116991865311836

[2] http://people.csail.mit.edu/sperezde/onward13.pdf

[3] http://jenkins-ci.org/content/summary-report-git-repository-...

[4] Which is a worthy goal in itself, but it also means that they aren't really moving VCS development forward.

saidajigumi · on Feb 18, 2015

Your data loss example [3] is precisely not data loss, like virtually every other example I've seen regarding git. The commits and history were definitively not lost, and although the article doesn't say, the repositories affected by this tool should have even still contained the history record of the old head(s), making rollback to the latest commits a fairly straightforward affair.

This is honestly no different than someone misconfiguring a tool for any other VCS and overwriting the repo's head with old junk. In the (other VCS) case, the fix operation is to go into VCS history and resurrect the correct head commits (e.g. via a 'revert' style operation). In git's case, the fix operation is ... go into VCS history and resurrect the correct head(s), traditional revert not being relevant to how git functions in such exceptional cases. It's either ignorance or disingenuousness to call git's behavior here, though different than SVN, hg, etc. as "data loss".

rbehrends · on Feb 18, 2015

First, how do you know that no data was lost? There is no way to even verify that all data was recovered. They are pretty confident, but there's really no guarantee, is there?

More generally, yes, the case to worry about is typically not the one where there are lots of repository clones in circulation (though the KDE case [1] shows that data loss is quite possible even then and that replication is no real substitute for proper backups [2]).

And, of course, the reflog will keep commits alive for a while and garbage collection will not occur until the grace period is over.

The situation where this doesn't work so well is personal/small group repositories or branches that only experience intermittent commits and that aren't being mirrored by a large number of users. In that case, user errors can easily translate into data loss when garbage collection finally catches up with you.
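A toy Python model of that failure mode, with made-up names and a simplified single grace period (Git's real expiry settings are more fine-grained than this): objects survive only while they are reachable from a ref or from a reflog entry that is still within the grace period.

```python
def collect_garbage(objects, refs, reflog, now, grace):
    """Return the objects that survive a toy garbage collection.

    objects: dict mapping object id -> list of ids it points to
    refs:    ids that branches/tags currently point at
    reflog:  list of (timestamp, id) entries recording old ref positions
    """
    roots = set(refs)
    # Reflog entries only protect objects while they are within the grace period.
    roots.update(oid for ts, oid in reflog if now - ts <= grace)

    live, stack = set(), list(roots)
    while stack:
        oid = stack.pop()
        if oid in live or oid not in objects:
            continue
        live.add(oid)
        stack.extend(objects[oid])
    return {oid: kids for oid, kids in objects.items() if oid in live}


# c3 -> c2 -> c1; the branch was accidentally reset back to c1 on day 100.
objects = {"c1": [], "c2": ["c1"], "c3": ["c2"]}
refs = ["c1"]
reflog = [(100, "c3")]

soon_after = collect_garbage(objects, refs, reflog, now=110, grace=30)
much_later = collect_garbage(objects, refs, reflog, now=200, grace=30)
```

Shortly after the mistake everything is still there; once the reflog entry has aged out, `c2` and `c3` are gone unless some other clone happens to have them.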

[1] http://jefferai.org/2013/03/29/distillation/

[2] A practical example would be where a repository is so large that many contributors prefer to use shallow clones.

laurencerowe · on Feb 18, 2015

Git won, but Mercurial seems to have found a niche in organisations with huge repositories. https://code.facebook.com/posts/218678814984400/scaling-merc...

jgraham · on Feb 17, 2015

FWIW, I think that the tendency to push things into extensions is a huge net negative for hg consistency and learnability.

VCSes are complex tools and it takes a little while for people to understand the data model and how it maps to the command line UI. This means that people new to a tool — or just new to a particular project's workflow — will often have "how do I do X?" type questions. With git helping people is pretty easy; you just sit down with them and go through the set of commands they need to achieve what they want, explaining the operations on the tree along the way (if necessary). It's even the sort of thing that you can do over irc without too much difficulty.

When using hg there is a whole extra level of complexity because the default setup isn't actually suitable for use in real projects; to get something useful you first have to enable a bunch of extensions. So given a random user with unknown configuration it's hard to know what commands are even available to them without diving into a configuration file, possibly downloading some random python scripts from the internet, and so on. For example the Mozilla source tree has a whole mercurial setup script that configures about a dozen extensions, a few of which are providing useful Mozilla-specific functionality (e.g. bugzilla integration), but most of which are plugging holes in the hg featureset.
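As a rough sketch of what "diving into a configuration file" means here: extensions are switched on in the `[extensions]` section of an hgrc, and a value starting with `!` disables one. The Python below reads a made-up sample with the standard `configparser` module, which only approximates hgrc syntax (it will not handle `%include` lines, for instance); `list_extensions` and the sample contents are assumptions for illustration.

```python
import configparser

# Made-up sample configuration, not taken from any real project.
SAMPLE_HGRC = """\
[ui]
username = Example User <user@example.com>

[extensions]
rebase =
histedit =
mq = !
"""


def list_extensions(hgrc_text):
    """Return (enabled, disabled) extension names from hgrc-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(hgrc_text)
    enabled, disabled = [], []
    if parser.has_section("extensions"):
        for name, value in parser.items("extensions"):
            # A leading "!" marks the extension as explicitly disabled.
            (disabled if value.startswith("!") else enabled).append(name)
    return sorted(enabled), sorted(disabled)


enabled, disabled = list_extensions(SAMPLE_HGRC)
```

Even this only covers one file; a user's effective setup is merged from several configuration files, which is exactly why it is hard to guess what commands they have.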

I also find it baffling that, given git exists, when hg developers integrate ideas from git wholesale (something that is on the whole positive, e.g. bookmarks as an implementation of local branching), they conspire to do so in such a way that the experience is jarring for people moving from git. To me it seems obvious that if you are coming second in a space with strong network effects, you don't go out of your way to make it painful for people to switch.

I get the impression that at least some mercurial fans expect it to win in the enterprise, citing organizations like Facebook optimising it to work with their giant repositories. But it isn't clear to me that there's much trickle-down effect from there; most smaller projects simply don't have that much source to control, or the ability to enforce the kind of homogeneous environment that forgives many of the shortcomings of mercurial's UX.