It's astonishing to me that Git has won out given how much easier it's been for me to explain Hg to other people than to explain Git. To this day, in our SVN workflow at my company, nontechnical people who have merely seen a Hg diagram on a whiteboard by my desk immediately grasped the idea and the lingo, and ask me questions like "hey, can you branch the code to commit those changes and push them to the testing server? This thing's really cool and we don't mind playing with the alpha version, but we might scrap it all later."
Maybe I'm a too long time user of git, but I really fail to see why git as of the last 5 years is any harder to explain than hg. Personally I think the branching in hg is pretty much broken; alone the fact that it's pretty much impossible to get rid of branches is horrible.
Because the diagrams for hg are very simple, there is a really simple way to do branching that obviously works and commits a relatively forgivable sin: just `cp -a` the folder.
Now, I know that that's in essence an admission of defeat! I'm not pretending that it's anything less than that. However, this is also the easiest explanation of, and model for, branching that anyone has ever created. The explanation of branching which the nontechnical user immediately understood was, in fact, just having a couple of these repository-folders sitting side by side with different names, `current_version` and `new_feature`. It is a model of branching that is so innocent and pure and unsullied by the world that a nontechnical person got it with only a couple of questions.
Like I said, I'm actually employed at an SVN shop, where branches are other folders in the root folder and the workflow is less "push this to the testing server" and more "commit to the repository, SSH into the testing server, and then update the testing server." But that Hg model resonated with someone who doesn't know computers. To me, that was a moment of amazement.
I'm not even saying which one is better really; I like Git branches too! It's just that I'm astonished that the more confusing DVCS is winning. Most peoples' approach to Git is "I am just going to learn a couple of fundamentals and ask an expert to set up something useful for me." I would guess that most Git users don't branch much; they never learned that aspect to it. I'm really surprised that software developers aren't more the sort to really say "why am I doing this?" and to prefer systems which make it easier to answer those questions with pretty pictures.
> I'm really surprised that software developers aren't more the sort to really say "why am I doing this?" and to prefer systems which make it easier to answer those questions with pretty pictures.
The network effect should explain it.
That said, using pictures to answer questions is fairly sadistic when those asking them are blind. I know a blind developer and I never use pictures when talking to him in IRC.
They do answer the questions with pictures. But the pictures look like http://xkcd.com/1597/ and the answers are perhaps not what you are expecting. (-:
I think that the hate that gets lumped on mercurial's branches, though understandable, is a bit unfair
Disclaimer, it's been years since I used hg as my primary DVCS. So some of my thoughts here might be out of date, or have a misrecollection.
> branching in hg is pretty much broken*
It really isn't. It's absolutely not suited for the task that many people want to use it for, but it's totally fitting with the intended use case and the "history is immutable" philosophy of mercurial.
Using mercurial branches for anything resembling feature branching is a bad idea. But mercurial branches are perfect for things like ongoing lines of development. So, for a project like PostgreSQL, you'd have a "master" (default) branch for the head of development and then once a release goes into maintenance mode you create a new branch for "postgres-9.4" and any fixes that need to be applied to that release will be made to the maintenance branch.
Following hg's "immutable history" policy the fact that the commit was performed on a maintenance branch is tracked forever. And it should be because the purpose of your source control is to track those kinds of things: "This is the branch we used for maintenance releases of version X.Y.Z, it is now closed since we no longer support that version"
The issues with mercurial's branches are:
- For a long time they were the only concept in hg that had a simple name and looked like "multiple lines of development". Even though hg supported multiple heads and multiple lightweight clones, neither of those had commands or features with a clear and simple name, so they people turned to "branches" expecting them to do what they wanted even when they were a bad fit.
- "branch" is very general name that is often used (quite rightly) to refer to a bunch of slightly different ways of working with multiple concurrent versions. In general use it might refer simply having 2 developers who both produce independent changes from the same parent. Or to intentionally having multiple short lived lines of development based around feature. Or splitting of development right before a release so that the "release branch" is stable. Etc. Yet the feature in hg that is called "branch" is useful for only some of those things. It would have been better to call it a "development line" or something like that.
- It took far too long for hg to get a builtin way to refer to named heads (bookmarks). The model assumed that each repository (clone) would only ever want to have 1 head on each branch (development line) and that producing multiple heads was a problem that ought to be resolved as soon as possible. There's a lot of history behind that approach (almost every CVS and SVN team I ever worked with did that), but DVCS tools made it easier to move away from that, yet official hg support lagged.
So even today, the "branch" concept in hg is only useful for a small number of cases, and the "bookmarks" concept is what most people want, but they're separate things with names that don't align with expectations.
This distinction between branches and bookmarks looks like one of the things missing the most from git. Grab a random branch from some place and try to guess whether the committer intends to rewrite it in the future or not: good luck.
> So even today, the "branch" concept in hg is only useful for a small number of cases, and the "bookmarks" concept is what most people want, but they're separate things with names that don't align with expectations.
And it doesn't help that the primary hosted repository system is Bitbucket and Bitbucket didn't support pull requests from bookmarks last time I checked.
Your original comment linked broken branches with the fact that they can't be deleted. That's only true if you intended to talk about Mercurial named branches which aren't broken, they just aren't what you want them to be.
Bookmarks (today, and for several years) work just fine. So "branching" in the general sense isn't broken even though the combined feature set is a bit haphazard.
That bitbucket doesn't work well with bookmarks is a sign of how little Atlassian cares about hg, rather than an hg issue.
If you're arguing that the hosting options for hg are limited and fall far below the git options, then I'm not going to disagree.
In other words, in Git, a branch is just a named pointer to a particular revision. In Hg, these are called 'bookmarks' and they work exactly how you're imagining; and there is an immutable sort of bookmark that is called a 'tag' (bookmarks can be repointed; tags cannot). By creating a new head (i.e. branching) and naming that new head with a bookmark, you do exactly what `git branch` does.
Mercurial also supports an autonaming of revisions which automatically applies to all child revisions, too: these are meant to be independent lines of development with their its own head revision, and are called 'named branches'; that is what `hg branch` does. The problem that you're identifying (and that I agree is counterintuitive!) is that these names become part of the commits themselves and therefore public knowledge. Mercurial warns you when you `hg branch` that this is happening and says "did you want a bookmark?" but does not tell you, e.g., "to undo what you just did, type `hg branch default`."
"hey, can you branch the code to commit those changes and push them to the testing server? This thing's really cool and we don't mind playing with the alpha version, but we might scrap it all later."
Are they talking about hg or git here. Because that flow in git is:
git branch
git checkout
git commit
git push
The only thing that git adds to that workflow is that creating a new branch doesn't immediately move you onto it (also that most would use checkout -b to do both). And it's not immediately obvious that a non-technical user would need to know about that in order to get the above point across.
No. That fails with the following semi-helpful error message:
remote: error: refusing to update checked out branch: refs/heads/master
remote: error: By default, updating the current branch in a non-bare repository
remote: error: is denied, because it will make the index and work tree inconsistent
remote: error: with what you pushed, and will require 'git reset --hard' to match
remote: error: the work tree to HEAD.
remote: error:
remote: error: You can set 'receive.denyCurrentBranch' configuration variable to
remote: error: 'ignore' or 'warn' in the remote repository to allow pushing into
remote: error: its current branch; however, this is not recommended unless you
remote: error: arranged to update its work tree to match what you pushed in some
remote: error: other way.
remote: error:
remote: error: To squelch this message and still keep the default behaviour, set
remote: error: 'receive.denyCurrentBranch' configuration variable to 'refuse'.
Actually all of these diagrams for Git need to look substantially more complicated because you first off need to introduce repositories which have a cylinder with a cloud over them (the cloud of course is the staging area) with a sort of recycle-reduce-reuse pattern of arrows `add`, `commit`, `checkout` between these three entities, with the caveat that `checkout` is only kinda-sorta what you're looking for with this. In fact there is a cylinder-to-cylinder `pull`-type operation called `fetch`, but `git fetch; git checkout` will not actually update any files, revealing the gaping hole in this simple picture, and you'll have to type `git status` to find out that you're directed to do a `git pull` anyway, which has to be diagrammed as an arrow pointing from the remote cylinder, bouncing off the local cylinder, and then pointing at the local folder.
To get to talk about `push` you then need to introduce the SVN-style "bare repository" in the diagram, a folder-box with the cylinder now drawn large inside it, and explain that this folder exists only to contain the .git subfolder and act as an SVN-style repository. You can then draw `pull` arrows down from it and `push` arrows up to it.
Now that almost works, except the `git branch; git checkout` flow is not the proper way to push changes in the working directory to the new branch. (The context of the conversation was stuff that was already being developed, presumably on the master branch.) That fails on the checkout with an error message like:
error: Your local changes to the following files would be overwritten by checkout:
foo
bar
Please, commit your changes or stash them before you can switch branches.
Aborting
But, I mean, close enough. It's `git stash branch <newbranch>` and it generates an ugly error message but it does exactly what you want it to do, so you can ignore that error message and hack away.
Now, you're missing the point if you think "God, drostie is really pedantically getting on my case for missing the remote-repository-update and the git stash here! Anyone will learn that workflow eventually!"
The point was not any such thing, the point was clean diagrams when explaining the idea to a fellow developer -- in fact a diagram so clean that a nontechnical user asked about it and accidentally learned enough to get some new vocabulary about how a developer's life works, so that they could more effectively communicate what they want to the developer.
It is my contention that the git diagram, as opposed to the git workflow, is sufficiently messy that a nontechnical eye will lose curiosity and most certainly will not get the idea of "make a branch, push the branch to the shared repository, then update the testing repository, then switch to that branch, then discard that branch if things don't work out." That strikes me as too in-depth for nontechnical casual users to express.
I think people have a way too high tolerance for this kind of crap. We're also kidding ourselves if we think we're smart enough to work with this kind of complexity at no cost.
Our job is often to think up new things. It's really hard to come up with new abstractions when your thinking is muddled by all kinds of incidental complexity.
This is buried but in case anyone reads it, the real reason to open source BK is to show the world that SCM doesn't have to be as error prone or as complicated as Git. You need to understand how Git works to use it properly; BK is more like a car, you just get in and drive.
That metaphor... needs work. Cars need a considerable amount of training to learn to use safely, let alone correctly. I can hack C enough that I dream in it routinely, and due to the resulting brain damage found git intuitive from the start, but there is no way I'll ever learn to drive: it's just too hard.
> I think people have a way too high tolerance for this kind of crap.
I agree.
The core problem with Git is that it was designed to serve the needs of the Linux kernel developers. Very, very, very few projects have SCM problems of similar complexity, so why do so many people try to use a tool that solves problems they don't have? Much of that internal complexity extends up into the Git interface, so you're paying for complexity you don't need.
Others in this thread have praised hg and bzr for their relative simplicity for a DVCS. I'd also like to point out Fossil.
In the normal course of daily use, Fossil as simple to use as svn.
About the only time where Fossil is more complex is the clone step before checking out a version from a remote repository.
Other than that, the daily use of Fossil is very nearly command-for-command the same as with svn. Sometimes the subcommands are different (e.g. fossil finfo instead of svn status for per-file info in the current checkout) but muscle memory works that out fairly quickly.
Most of that simplicity comes down to Fossil's autosync feature, which means that local checkout changes are automatically propagated back to the server you cloned from, so Fossil doesn't normally have separate checkin and push steps, as with Git. But if you want a 2-step commit, Fossil will let you turn off autosync.
(But you shouldn't. Local-only checkins with rare pushes is a manifestation of "the guy in the room" problem which we were warned against back in 1971 by Gerald Weinberg. Thus, Fossil fosters teamwork with better defaults than Git.)
Branching is a lot saner in Fossil than svn:
1. Fossil branches automatically include all files in a particular revision, whereas svn's branches are built on top of the per-file copy operation, so you could have a branch containing only one file. This is one of those kinds of flexibility that ends up causing problems, because you can end up with branches that don't parallel one another, making patches and merges difficult. Fossil strongly encourages you to keep related branches parallel. Automatic merges tend to succeed more often in Fossil than svn as a result.
2. Fossil has a built-in web UI with a graphical timeline, so you can see the structure of your branches. You have to install a separate GUI tool to get that with most other VCSes. The fact that you can always get a graphical view of things means that if you ever get confused about the state of a Fossil checkout tree, you'll likely spend less time confused, because you're likely also using its fully-integrated web UI.
3. Whereas svn makes you branch before you start work on a change, Fossil lets you put that off until you're ready to commit. It's at that point that you're ready to decide, "Does this change make sense on the current branch, or do I need a new one?"
Fossil's handling of branches is also a lot simpler than Git's, primarily because the local Fossil repository clone is separate from the checkout tree. Thus, it is easy to have multiple Fossil checkouts from a given local repo clone, whereas the standard Git workflow is to switch among branches in a single tree, making branch switches inexpensive.
(And yes, I'm aware that there is a way to have one local Git checkout refer to another so you can have multiple branches checked out locally without two complete repo clones. The point is that Git has yet again added unnecessary complexity to something that should be simple.)
Why is it that some people get Git naturally and some experience a world of frustration trying to use it? I think the kind of problems you describe usually come up if you approach Git with a mindset formed by another SCM. They are typical for people who are proficient with, say, SVN and who try to use Git thinking that Git must work something like SVN. (I'm using SVN as just an example here; it could be any other SCM, but I most frequently see people coming from SVN to really struggle with Git.) Well, Git is nothing like SVN and you'll always be missing something if you try to understand Git through SVN concepts. It's best to forget what you used before and learn Git from a clean slate.
Maybe I was just really lucky to never having to learn SVN (or CSV or ClearCase), so Git concepts and workflows were clear and almost effortless to understand and use. Or maybe it's like the concept of pointers: some people get it right away and others never get it.
The only people I've ever seen "get Git naturally" were developers starting from the implementation details and working their way up (#0)[0].
Everybody else either worked very hard at it(#1)[1] or just rote-learned a list of commands(#2) that pretty much do what they want from which they don't deviate lest the wrath of the Git Gods fall upon them and they have to call upon the resident (#1) or heavens forbid the resident (#0) who'll usually start by berating them for failing to understand the git storage model.
> Well, Git is nothing like SVN and you'll always be missing something if you try to understand Git through SVN concepts.
Mercurial is also nothing like SVN, the problem is not the underlying concepts and storage model, it's that Git's "high-level UI" is a giant abstraction leak so you can't make sense of Git without understanding the underlying concepts and storage model, while you can easily do so for SVN or Mercurial.
[0] because the porcelain sort of makes sense in the context of the plumbing aka the storage model and implementation details
[1] because the porcelain in isolation is an incoherent mess with garbage man pages
> Now that almost works, except the `git branch; git checkout` flow is not the proper way to push changes in the working directory to the new branch. (The context of the conversation was stuff that was already being developed, presumably on the master branch.)
> But, I mean, close enough. It's `git stash branch <newbranch>` and it generates an ugly error message but it does exactly what you want it to do, so you can ignore that error message and hack away.
Isn't
git checkout -b new_branch
git commit -a
git push
what you are looking for?
And as for the push problem:
1. You aren't going to encounter it in git when pushing a newly created branch, but yes, you then have to ssh in and check it out.
2. I wonder how Mercurial handles pushing to repository with uncommitted changes, does it just nuke them?
Regarding 2 - Do you mean the destination repo has uncommitted changes? There is no need to nuke them, as pushing to this repo will have no influence on the working set: it will just add new changesets in the history!
Regarding your first question: Yes, you can also `git checkout -b newbranch`, rather than stashing the changes and then stashing them into a new branch; I just tend to stash my changes whenever I see that there's updates on the parent repository. Call it a reflex.
Of course, you can also commit your changes and then `git branch`, which sounds insane (that commit is also now on the master branch!) until you remember that branches in git are just Mercurial's pointers-to-heads. This means that you can, on the master branch, just `git reset --hard HEAD~4` or so (if you want the last 4 local commits to be on the new branch and you haven't pushed any of them to the central repo), and your repository is in the state you want it in, as well. (And you'll need that last step even if you `git checkout -b`, I think.)
Regarding your second one, Mercurial's simplified model is actually really smart. You have to understand that Git complects two different things into `pull`: updating the repository in .git/ and updating the working copy from the repository. In Mercurial these are two separate operations: you update/commit between the working copy and the repository; you push/pull between two repositories. The working copies are not part of a push/pull at all. So if you push to a repository with uncommitted changes in its working copy, that's fine. The working copy isn't affected by a push/pull no matter what.
With that said, if that foreign repository has committed those changes, Hg will object to your push on the grounds that it 'creates a new head', and it will ask you to pull those commits into your copy and merge before you can push to the foreign repository. (The manpages also say that you can -f force it, but warn you that this creates Confusion in a team environment. Just to clarify: a 'head' is any revision that has no child revisions. In the directed acyclic graph that is the repository history, heads are any of the pokey bits at the end. You can always ask for a list of these with `hg heads`.)
"OK," you say, "but let's throw some updates into the mix, what happens? Does it nuke my changes?" And the answer is "no, but notice who has the agency now." Let's call our repositories' owners Alice and Bob. Alice pushes some change to Bob's repository. Nothing has changed in Bob's working folder.
Now if Alice tells Bob about the new revision, Bob can run an update, if he wants. Bob has the agency here. So when the update says, "hey, those updates conflict, I'm triggering merge resolution" (if they do indeed conflict), he's present to deal with the crisis. Git's problem was precisely "oh, we can't push to that repository because we might have to mess with the working copy without Bob's knowledge," and it's a totally unnecessary problem.
Bob can also keep committing, blithely unaware of Alice's branch, if Alice doesn't tell him about it. The repository will tell him that there are 'multiple heads' when he creates a new one by committing, so in theory he'll find out about her commits -- though if you're in a rush of course you might not notice.
Bob can keep working on his head with no problem, but can no longer push to Alice (if he was ever allowed to in the first place), because his pushes are not allowed to create new heads either. In fact he'll get a warning if he tries to push anywhere with multiple heads, because by default it will try to push all of the heads. However he can certainly push his active head to anyone who has not received Alice's branch, just by asking Hg to only push the latest commit via `hg push -r tip` -- this only sends the commits needed to understand the last commit, and as long as that doesn't create new heads Bob is good to push.
PTSD? :) Use local topic branches for everything to avoid unpleasant surprise merges. Once you are ready to merge, pull the shared branch, merge/rebase onto that and push/submit/whatever.
I sometimes keep separate branch for each thing that I intend to become a master commit. This way I can use as many small and ugly commits and swearwords as I please and later squash them for publication after all bugs are ironed out.
This helps with remembering why particular commits look the way they do, especially in high latency code review environments where it can take days or weeks and several revisions to get something accepted.
> Git's problem was precisely "oh, we can't push to that repository because we might have to mess with the working copy without Bob's knowledge," and it's a totally unnecessary problem.
Actually Bob's working copy isn't modified, it's just that if his branch was allowed to suddenly stop matching his working copy, he would probably have some fun committing (not sure what exactly, never tried).
And mpm wrote hg, never forget:
http://lkml.iu.edu/hypermail/linux/kernel/0504.2/0670.html
http://lwn.net/Articles/151624/