I’ve been out of version control for a bit, and GitHub itself for quite a bit longer, but this part was interesting to me:
> Because: It isn't really about Git anymore. GitHub used to be a Git repository hosting company. But now, they are about providing other software development and management infrastructure that augments the version control. They might as well change their name to Hub at this point.
Just for added color: starting around 2010 or 2011 (around when we added Subversion to GitHub), we had a pretty solid idea that version control wasn’t “the thing”. Mostly because Mercurial was a strong alternative, and it always felt like there might be something new in the wings that could dominate in the next few years. Version control felt really dynamic, and deeply susceptible to change. And if something was better, we’d totally ditch Git for the new thing. The name was a bit of a misnomer, from that perspective.
I think that changed over time — I had a hell of a time trying to get movement on basic version control improvements back in 2014 — and now they’re clearly much more about the ecosystem rather than version control itself. It’s where the money is, of course, but it’s also where the bigger innovation is, at least for the time being.
I think the author is right to say that Microsoft is targeting a different area than version control specifically, though you could argue whether the outcome of that is good or bad. It’s certainly different, though: they’re especially growing headcount right now, and the company makeup is wildly different than what many customers tend to think it is today, imo.
I agree fully, but the one "feature" missing from drh's analysis is GitHub's social aspect. Getting code out in front of people and allowing them to interact with it (directly, via commits and PRs, or indirectly, via issue comments) as a feature of a social network is their differentiator.
GitHub's choice of git wasn't germane to building a social network: GitHub could have been successful using svn, excepting that git was ascendant at the time.
Fossil's feature of an integrated web server is an anti-feature to GitHub; it defeats the purpose of having a centralised "Hub."
I've been thinking about adding CI/CD support and other tooling, but the CPU resources required would be cost prohibitive.
Unfortunately, this seems pretty hard to do, so I'm afraid git is too powerful. I absolutely detest its porcelain, but what can we do?
 http://fossil-scm.org/home/doc/trunk/www/mirrortogithub.md "Notes" section
This makes sense, because GitHub doesn't add value in "improving" the git part. Anyone can run a git server, and git itself doesn't really have much else to it.
GitHub adds value in all the places around it. Collaboration, process, teams, communication, code review, security. These are things that most people using git will need, which are better when integrated or close to their version control system, and things that git doesn't address.
As a GitHub user, I'm glad that this was what they focused on.
Improving git itself goes directly against their interests.
This is one thing (but not the only thing) that I intensely dislike about git vs bzr. Loglevels are not particularly new and can be applied so directly to the VCS log that bzr log has a level option.
To me, this kind of result is the most compelling argument that the world of high tech isn't nearly so much a meritocracy as it is made out to be.
git is not complex though. It just has an absolutely godawful "high-level" user interface.
Porcelain commands (the high-level ones) are really shortcuts for sets of low-level operations usually performed together, which is why e.g. `checkout` is used simultaneously for reverting working copy changes and switching the entire thing to arbitrary commits or branches.
That also makes git extremely hard to learn "top down" beyond rote learning of a few commands: from that POV the CLI is completely incoherent so you can't really build an intuition for what command could do what operation. The terrible naming doesn't help either.
If you have the time and desire to start from the on-disk storage (ignoring packfiles) you can build your own in a few hours.
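To give a sense of how small that on-disk core is, here is a minimal sketch of loose-object storage in Python. It assumes the classic SHA-1 object format and ignores packfiles, the index, and refs; the function names (`hash_object`, `write_object`, `read_object`) are made up for illustration and are not git's own API.

```python
import hashlib
import os
import tempfile
import zlib


def hash_object(data: bytes, obj_type: str = "blob") -> tuple[str, bytes]:
    """Return (object id, raw bytes to store) for a loose object."""
    # A loose object is "<type> <size>\0<content>"; its id is the SHA-1 of that.
    header = f"{obj_type} {len(data)}".encode() + b"\x00"
    store = header + data
    return hashlib.sha1(store).hexdigest(), store


def write_object(git_dir: str, data: bytes, obj_type: str = "blob") -> str:
    """Write a zlib-compressed loose object under <git_dir>/objects/xx/yyyy..."""
    oid, store = hash_object(data, obj_type)
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(store))
    return oid


def read_object(git_dir: str, oid: str) -> tuple[str, bytes]:
    """Read a loose object back and return (type, content)."""
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    with open(path, "rb") as f:
        store = zlib.decompress(f.read())
    header, _, body = store.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match content length")
    return obj_type, body


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()  # stand-in for a .git directory
    oid = write_object(scratch, b"hello\n")
    print(oid, read_object(scratch, oid))
```

Trees and commits are stored the same way with a different type string in the header, which is most of why the low-level model feels so much smaller than the CLI.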
Edit: gonna be fun to see how guides like https://www.atlassian.com/git/tutorials/using-branches/git-c... are becoming outdated in a way where switching/checking out doesn't mean the same thing anymore, but the article uses them interchangeably
git checkout [-f|--ours|--theirs|-m|--conflict=<style>] [<tree-ish>] [--] <paths>…
git checkout [<tree-ish>] [--] <pathspec>…
git checkout (-p|--patch) [<tree-ish>] [--] [<paths>… ]
* is used to revert working tree changes, possibly interactively
* is used to set working tree changes to either side of a merge (in case of conflict)
* is used to revert specific local paths to a historical version thereof, possibly interactively
However, there have been lots of "porcelain replacement" efforts in the past, but most of them were either abandoned, because by the time they'd built something complete enough the author had come to understand the underlying model so deeply that they could use the official porcelain just fine, or remained niche, because they were extremely opinionated (and limited) with respect to the workflows they'd support.
We had a proposal around commit standards at work recently: we came to the conclusion that rebasing your private branches and squashing out irrelevant commits is the recommended flow, to make reviews easier.
That's because you joined the "it's a story" camp without noticing. If you instead view git history as a sort of gentlemen's audit log, then "refactoring" it is indeed both lying and dishonorable. And in no other profession would it be OK to mess with something used for review / audit purposes.
Personally, I'm in favor of the history/audit log view, because you can't predict today what information you may need in the future, and refactoring git history throws away a lot of historical context.
In fact, I'm sure that you're lying too: every time you Ctrl+Z in between commits, you're removing parts of your audit log. Choosing when to commit is telling a story.
(Unless Fossil/whatever system you're using stores every character you ever type - I'm not familiar with it.)
Edit: To use an example from the article:
> Yet, sometimes we come upon a piece of code that we simply cannot understand. If you have never asked yourself, “What was this code’s developer thinking?” you haven’t been developing software for very long.
With my commit, I'm telling other developers/my future self what I was thinking, rather than having them try to figure that out by themselves from my code. The assumption there is that I'm better at explaining my thinking than my code is.
Sure, that is the intention. But humans are very fallible in communication, and communication is hard. Being able to see later what you did along with what you explained is of obvious utility.
And I disagree here. The way I view it, my editor is my sandbox, I keep playing in it until I have something that I want to enter into record. When I commit work, I enter it into record, with a commit message explaining what the piece of work is.
But to be honest, my repo clone is sort of my own sandbox too, I don't consider Git a fully append-only log, so I'd sometimes do commit editing on my local repo. But once published, I consider it immutable.
(Or rather I'd prefer to; the team in the main project I'm working on right now has a rebase-heavy workflow.)
On a practical note, I'm fine with history cleanup done on the spot. E.g. I've committed three things in the past hour, I squash them together. Or I rearrange stuff I made over the course of the day. But I don't like attempts at messing with history that's many days old (or more), because at that point the person doing the cleanup doesn't have the context of the work in their minds anymore, so history edits throw away valuable information.
The way I view it, my development branch is my sandbox. I keep playing in it until I have something that I want to enter into the record. When I merge work, I enter it into record with a merge message explaining what the piece of work is.
I don't see the point of immutably recording typos a reviewer noticed. I view a pull request (or whatever you call them) as a patch series. Something to be tinkered with, rewritten and resubmitted until it's considered good enough. If I want to record every stumble, issue and hare-brained idea, that can go into the patch messages.
What's the value of having this in your audit log? Is it more valuable than being able to revert the commits without having to first revert the typo fix, or do a git bisect without running into broken commits that exist only to preserve an audit log?
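The bisect point can be made concrete with a small sketch. This is not git's actual bisect algorithm, just a plain binary search over a linear history in which a commit can also be untestable (say, a typo commit that does not build); `first_bad` and the commit names are hypothetical.

```python
from typing import Callable, List, Optional, Sequence


def first_bad(commits: Sequence[str],
              test: Callable[[str], Optional[bool]]) -> List[str]:
    """Narrow down the first bad commit in a linear history.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. test(c) returns True (good), False (bad),
    or None (untestable). Returns every commit that could still be the
    first bad one.
    """
    lo, hi = 0, len(commits) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Try commits strictly between lo and hi, nearest the midpoint first.
        candidates = sorted(range(lo + 1, hi), key=lambda i: abs(i - mid))
        for i in candidates:
            verdict = test(commits[i])
            if verdict is None:
                continue  # untestable, try a neighbour instead
            if verdict:
                lo = i
            else:
                hi = i
            break
        else:
            break  # nothing left between lo and hi can be tested
    return list(commits[lo + 1:hi + 1])


if __name__ == "__main__":
    history = [f"c{i}" for i in range(8)]  # bug introduced in c5

    def clean(c: str) -> Optional[bool]:
        return int(c[1:]) < 5

    def with_broken_c4(c: str) -> Optional[bool]:
        return None if c == "c4" else clean(c)

    print(first_bad(history, clean))           # one culprit
    print(first_bad(history, with_broken_c4))  # culprit is ambiguous
```

With every commit testable the search ends on a single commit; with one unbuildable commit next to the culprit it can only narrow things down to a range, which is the cost being weighed against the audit log.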
I guess I just don't see how that's not "telling a story". The commit is not a recording of the process you went through to get the code in a certain state, but a piece of work that you decided should be entered into the record, with a message you wrote that explains it.
This position makes no sense to me. How can your editor be your sandbox but your branch before pushing is not? That’s an arbitrary distinction without merit.
No one rebases on a public branch, so both scenarios are about what you do before publishing your work.
I think the whole conversation about history is missing the point. The problem we’re trying to solve is complexity. Having a bunch of out of date commits floating about does nothing to reduce complexity.
But then you would have reimplemented mercurial's obsolescence markers.
I don't believe most software development work requires such a strict log. I believe that most teams should be free to use git in the most productive way for them. I believe the "PR-as-story" camp is more productive, and already have war stories where not having that hurt. Always happy to hear war stories from other camps.
Git can easily handle getting rid of or modifying commits, so why not make use of that ability in a controlled manner?
I.e. I'd typically not rebuild the history of master or another major branch just to fix a bug introduced a while back at the source (although I guess backporting fixes might be similar). But for work in progress, one or a few commits about to be merged, smaller fixes are better made at the source. Not to mention entirely irrelevant commits like "it's the end of the day, let's commit and push to a branch so it's not just on my machine".
I'm not talking about that. But do you edit your diary to "refactor" the things you wrote a month or year earlier?
Related: people here seem to implicitly assume "never rebase published branches". I don't know how widely accepted that rule is, but in the teams I worked with, the workflow was "every now and then, rebase develop / feature branch on top of master, and then push --force".
It's very easy to accidentally break your own rules, so this goes wrong regularly and the fallout becomes apparent only at some point down the pipeline.
It's an added cognitive burden to be vigilant about which git mode you are working in.
But if you did it, hopefully the tag would continue to refer to the pre-rebase revision. Anyone know the answer? This is the kind of uncertainty that rebase introduces to the semantics!
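On the tag question: in git a tag is a ref that points at a specific commit id, and rebase writes new commits and moves only the branch being rebased, so the tag stays on the pre-rebase commit. A toy model of that behaviour, under heavy assumptions: ids here hash only the parent id and the message (real commits also include the tree, author, and timestamps), and `ToyRepo` is a made-up class, not any real library.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


def commit_id(parent: Optional[str], message: str) -> str:
    # Toy content address: the id depends on the parent, so replaying a
    # commit onto a different parent necessarily yields a different id.
    return hashlib.sha1(f"parent {parent}\n\n{message}".encode()).hexdigest()


class ToyRepo:
    def __init__(self) -> None:
        self.commits: Dict[str, Tuple[Optional[str], str]] = {}
        self.refs: Dict[str, str] = {}  # branches and tags: name -> commit id

    def commit(self, branch: str, message: str) -> str:
        parent = self.refs.get(branch)
        oid = commit_id(parent, message)
        self.commits[oid] = (parent, message)
        self.refs[branch] = oid
        return oid

    def ancestors(self, oid: Optional[str]) -> List[str]:
        chain = []
        while oid is not None:
            chain.append(oid)
            oid = self.commits[oid][0]
        return chain  # newest first

    def rebase(self, branch: str, onto: str) -> None:
        """Replay commits unique to `branch` on top of the tip of `onto`."""
        base = set(self.ancestors(self.refs[onto]))
        unique = [c for c in self.ancestors(self.refs[branch]) if c not in base]
        tip = self.refs[onto]
        for old in reversed(unique):  # oldest first
            message = self.commits[old][1]
            new = commit_id(tip, message)
            self.commits[new] = (tip, message)
            tip = new
        self.refs[branch] = tip  # only the rebased branch ref moves


if __name__ == "__main__":
    repo = ToyRepo()
    repo.commit("main", "A")
    repo.refs["feature"] = repo.refs["main"]  # branch off main
    tagged = repo.commit("feature", "F1")
    repo.refs["my-tag"] = tagged              # tag the feature commit
    repo.commit("main", "B")
    repo.rebase("feature", "main")
    print(repo.refs["my-tag"] == tagged)                     # tag did not move
    print(tagged in repo.ancestors(repo.refs["feature"]))    # old commit left behind
```

So the tag keeps working, but it now points at a commit that is no longer part of the branch's history, which is arguably its own kind of surprise.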
(Disclosure: I currently work at GitHub, but these opinions are my own)
> 1. DevOps
> 2. Security
> 3. Collaboration
> 4. Insights
Interesting to see "Security" as number two.
Rebasing is good on private branches: it lets you write clean linear history. Conversely, intra-project-branch history is of little or no interest whatsoever once pushed upstream. Rebasing published branches is not a good thing, of course, but fans of rebasing don't propose that.