
The Biggest and Weirdest Commits in Linux Kernel Git History (2017) - swsieber
https://www.destroyallsoftware.com/blog/2017/the-biggest-and-weirdest-commits-in-linux-kernel-git-history
======
rraval
My startup's monorepo has 2 root commits as well. When the company was first
starting out, my co-founder and I created independent git repos. I was writing
OCR type research-y code and he was doing more traditional CRUD REST webserver
things.

When it came time to pull things together, we thought it'd be fun to try and
maintain the histories. So I added his repo as a remote and simply merged his
unrelated history into mine.

Fast forward and now we offer hoodies as swag to anyone who contributes to the
repo. We personalize the hoodie with your git username, the truncated commit
hash of your first commit, and the number of parent commits to your first
commit.

Having 2 root commits means that both my cofounder and I have hoodies with a
large 0 as number of parent commits. Just a nice way to commemorate this
accident of history :)
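
A minimal sketch of how those numbers could be reproduced, assuming git 2.9 or later (where merging unrelated histories needs an explicit flag). The repository names `ocr` and `web`, the branch name `main`, and the `demo` identity are hypothetical stand-ins, not the actual setup:

```shell
set -eu

# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Two independent repositories, each with its own root commit.
for name in ocr web; do
  git init -q "$work/$name"
  git -C "$work/$name" symbolic-ref HEAD refs/heads/main
  echo "$name" > "$work/$name/$name.txt"
  git -C "$work/$name" add .
  git -C "$work/$name" commit -q -m "Initial $name commit"
done

# Add the second repo as a remote of the first and merge its unrelated history.
cd "$work/ocr"
git remote add web "$work/web"
git fetch -q web
git merge -q --no-edit --allow-unrelated-histories web/main

# Root commits are the commits that have no parents.
root_count=$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
echo "root commits: $root_count"

# Parent count of one commit: rev-list prints "commit parent1 parent2 ...",
# so count the words on that line and subtract one.
first_root=$(git rev-list --max-parents=0 HEAD | head -n 1)
root_parents=$(( $(git rev-list --parents -n 1 "$first_root" | wc -w) - 1 ))
merge_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents of a root commit: $root_parents"
echo "parents of the merge commit: $merge_parents"
```

The same `--parents` trick works for any commit hash, so it could be pointed at a contributor's first commit.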

~~~
kirubakaran
That's a fantastic idea. Do you have any photos of the hoodies?

~~~
rraval
Yep, here's one:
[https://i.imgur.com/LEI4BAA.jpg](https://i.imgur.com/LEI4BAA.jpg)

It makes for a great conversation piece, even around non-tech crowds.

------
fredley
I had never heard of Octopus merges, and wondered if they're the reason why
the GitHub mascot is Octocat. Turns out they are!

[http://cameronmcefee.com/work/the-octocat/](http://cameronmcefee.com/work/the-octocat/)

~~~
blattimwind
Yet GitHub never creates octopus merges.

~~~
hinkley
They have one octopus and don’t want any more?

------
ericfrederich
Visualization of octopus merge with 66 parents:

[https://imgur.com/gallery/oiWeZmm](https://imgur.com/gallery/oiWeZmm)

Run this:

      git log --graph --abbrev-commit --decorate --date=relative \
        --format=format:'%C(bold blue)%h%C(reset) - %C(bold green)(%ar)%C(reset) %C(white)%s%C(reset) %C(dim white)- %an%C(reset)%C(bold yellow)%d%C(reset)' \
        2cde51fbd0f3

------
zwkrt
The article was very interesting, but as a nit the best fit curves in the
graphs were questionable.

"everything is linear if plotted log-log with a fat magic marker"

~~~
IshKebab
Check the legend.

------
hartator
Despite having used Git on a daily basis for more than 5 years, I still have
trouble using it, and have to Google things almost every time I need something
a bit out of the ordinary.

~~~
Maakuth
I recommend having a frontend like SourceTree for git - it uses bare git
underneath so the end result and even the state in your working copy will be
just the same, but you'll have a better picture of what is happening. As the
working copy changes in an identical way to the use of the command line git, it
neatly works as a scaffolding for your learning. You can use the command line
client for the stuff you are comfortable with and then resort to the GUI when
you get into trouble. This way you can shortcut your learning, provided that
you want to learn to use the command line git. Nothing wrong with staying with
the GUI forever though.

~~~
Zigurd
Toolchains are plagued by pseudo-GUIs - graphical overlays that don't actually
abstract a CLI. Git is perhaps the most resistant thing ever to GUI-izing. I
don't believe it can be done in a way that doesn't disappoint.

~~~
Maakuth
On the other hand, if the abstractions on the CLI tool are well chosen, it is
good that the same ideas are employed in the GUI as well to facilitate
learning how the tool actually works.

~~~
Zigurd
Trouble is, a complicated CLI isn't an API to a data model. You can help the
user visualize, but you have command line tools for that, too. You can further
simplify some simple operations. But you can't build a real MVC GUI
application.

------
chx
> (Update: it was an accident, which Linus responded to in his usual fashion.)

This jab at Linus is unfounded; he replied calmly and professionally.

~~~
ajuc
Calling a calm and professional response "his usual fashion" isn't a jab :)

~~~
chx
All too often people call Linus' rare passionate rants "his usual fashion".

~~~
hermitdev
I had the photo of Linus flipping the bird to NVIDIA as my desktop wallpaper
for quite a while at a previous job. Used it as a reminder not to commit
something stupid (not to Linux - the internal project I was working on at the
time).

------
focal-point
I'd love to read a book on git wizardry that goes beyond the git book.

Manishearth, a Mozillian and Rust contributor, wrote this post on splitting
apart a repo. He's good at explaining all sorts of things.

[https://manishearth.github.io/blog/2017/03/05/understanding-...](https://manishearth.github.io/blog/2017/03/05/understanding-git-filter-branch/)

------
selljamhere
>> It's pulled, and it's fine, but there's clearly a balance between "octopus
merges are fine" and "Christ, that's not an octopus, that's a Cthulhu merge".

This made me chuckle.

------
chrismorgan
Just recently I started doing octopus merges regularly, though they won’t ever
make it onto master. The situation is where you deploy from a particular ref
in git, and have a beta environment; I’ll often want to have unrelated things
that are sitting on beta for a while, either because they’re long-running work
or because I’ve fixed something but the change hasn’t been reviewed and merged
yet. My beta branch is then just an octopus merge of all the feature branches
I’m working on at present and want included in the beta environment. That way
I am not limited to just one thing on beta at a time, or worrying about
maintaining a beta branch as well as the feature branch. It has been very
liberating.

I’ve built a fairly simple tool that tracks which branches to merge into my
beta branch and automates its regeneration, and I’m polishing it up so it is
distributable and can just be a regular Git subcommand, git-managed-branch.
What I have already is useful to me, and I think it’d be useful for many
others as well.

The essence of the work of the script I have at present boils down to this:

    
    
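      git stash  # (if necessary)
      # -B takes the branch name first, then the start point
      git checkout --no-track -B staging origin/master
      git merge --no-edit feature1 feature2 fix3 fix4
      # --no-track sets no upstream, so name the remote and branch explicitly
      git push --force origin staging
      git checkout -
      git stash pop  # (if necessary)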
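
For anyone who wants to see the resulting shape without touching a real project, here is a self-contained sketch that builds the same kind of merge in a throwaway repository. The branch name `main`, the file names, and the `demo` identity are assumptions made for the demo; only the branch names `feature1`, `feature2`, `fix3`, and `staging` come from the script above:

```shell
set -eu

# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo base > base.txt
git add base.txt
git commit -q -m "Base commit"

# Three independent branches, each touching its own file so nothing conflicts.
for b in feature1 feature2 fix3; do
  git checkout -q -b "$b" main
  echo "$b" > "$b.txt"
  git add "$b.txt"
  git commit -q -m "Add $b"
done

# Recreate the integration branch from main and merge every branch in one go.
git checkout -q -B staging main
git merge -q --no-edit feature1 feature2 fix3

# rev-list prints "commit parent1 parent2 ...", so words minus one = parents.
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "staging tip has $parents parents"
```

A merge commit with more than two parents is what git calls an octopus merge. Because `-B` resets `staging` every time, regenerating the branch after adding or dropping a feature branch is just a rerun.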

------
bipson
Now I wonder if the warning suggested by Linus in the linked mail was ever
implemented. I also did not know git would not even complain.

Does anyone know? (Am on phone, slightly difficult to check).

~~~
ushi
Yes, it's implemented:

    
    
      --allow-unrelated-histories
    
        By default, git merge command refuses to merge histories
        that do not share a common ancestor. This option can be
        used to override this safety when merging histories of
        two projects that started their lives independently. As
        that is a very rare occasion, no configuration variable
        to enable this by default exists and will not be added.
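
A small sketch of that behaviour in two throwaway repositories, assuming a git new enough to have the flag (2.9 or later). The repository names `project` and `readme`, the branch name `main`, and the `demo` identity are made up for the demo:

```shell
set -eu

# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Two repositories that started their lives independently.
for name in project readme; do
  git init -q "$work/$name"
  git -C "$work/$name" symbolic-ref HEAD refs/heads/main
  echo "$name" > "$work/$name/$name.txt"
  git -C "$work/$name" add .
  git -C "$work/$name" commit -q -m "Initial $name commit"
done

cd "$work/project"
# FETCH_HEAD now points at the tip of the unrelated history.
git fetch -q "$work/readme" main

# A plain merge is expected to be refused because there is no common ancestor.
if git merge -q --no-edit FETCH_HEAD 2>/dev/null; then
  default_merge=allowed
else
  default_merge=refused
fi
echo "plain merge: $default_merge"

# The override makes the same merge go through.
git merge -q --no-edit --allow-unrelated-histories FETCH_HEAD
roots=$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
echo "root commits after override: $roots"
```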

~~~
Programmatic
See also the relevant lines in the function he proposed a patch for:
[https://github.com/git/git/blob/master/builtin/merge.c#L1401](https://github.com/git/git/blob/master/builtin/merge.c#L1401)

------
teagoat
From previous reading of Linus rants, it sounds like they're pretty particular
about keeping source code and git history clean. Why didn't they go back and
reverse the new root created by the README.md repo?

~~~
cesarb
His rants are also pretty particular about not rewriting published history. As
he said on the linked message, "[...] I didn't notice the history screw-up
until too late, [...]", so he probably already had pushed to the public
"master" branch on the git.kernel.org servers. Once it's there, other kernel
developers might already have pulled from it, so trying to rewrite the git
history to remove the commit would only lead to an unholy mess (and the
offending commit coming back) the next time he merges from them.

~~~
bostik
I have spent the last 2-3 years educating auditors and after quite some
effort, they have learned to appreciate git. To the point where they are now
starting to ask some of their other clients why _they_ are not doing something
similar.

Auditors _LOVE_ immutability. To be fair, git doesn't provide that, but it
provides the next-best alternative: tamper detection. If anyone rewrites
history, git will show that. The gitrefs between two points in time will not
match if anyone has modified data or commits in the meantime. The auditors
also have no problem looking at previous years' documents where they have
recorded the relevant gitrefs at the time.
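
A rough sketch of what that check can look like, using a throwaway repository. The file name `policy.md`, the branch name `main`, and the `demo` identity are hypothetical, and a real audit would compare against IDs written down in an earlier report rather than a shell variable:

```shell
set -eu

# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo "Policy v1" > policy.md
git add policy.md
git commit -q -m "Add policy"
echo "Policy v2" > policy.md
git commit -q -am "Revise policy"

# The commit ID an auditor might record at review time.
recorded=$(git rev-parse HEAD)

# Later, someone quietly rewrites the latest commit.
echo "Policy v2 (edited after the fact)" > policy.md
git commit -q -a --amend --no-edit

# The branch tip no longer matches the recorded ID.
current=$(git rev-parse HEAD)
if [ "$recorded" = "$current" ]; then
  verdict=unchanged
else
  verdict=rewritten
fi
echo "history since the recorded commit: $verdict"

# The recorded commit is also no longer an ancestor of the branch tip.
if git merge-base --is-ancestor "$recorded" HEAD; then
  reachable=yes
else
  reachable=no
fi
echo "recorded commit still in history: $reachable"
```

The commit ID covers the content and the parent IDs, so changing anything earlier in history changes every ID after it.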

This has gone so far that this year's policy review was a breeze. Our
compliance documentation is maintained in a git repo, with all documents as
markdown files. The final documents are simply compiled PDF and HTML
artifacts.

In 2016, the auditors asked if we can provide snapshots of previous policy
versions. In 2017, they already understood that we have everything in git, and
knew to ask for clarifications as to when a particular change was done and who
had signed it off. This year our auditors _literally_ asked for the latest
compliance documentation bundle from CI, all the individual commits, and the
overall diff over the year.

Wall time spent for policy review: ~20 minutes.

~~~
exikyut
How does someone go about looking for auditors who will be similarly receptive
to this kind of enlightenment??

(Understanding that some effort will need to be put in)

~~~
bostik
As long as the audits are for purposes where there is actual competition[0],
this is possible.

The important thing is to never treat compliance audits as box-ticking
exercises. That's a never-ending, vicious cycle. In fact, many of the findings
are simply different aspects of the same thing. You can pre-emptively work on
this: identify what parts of requirements are essentially duplicates, and make
improvements that satisfy _all_ of them at once.

Then proudly flaunt them. When you can show to the auditors, in person, that
you have considered the wider business implications and worked to _understand_
the compliance requirements, you are on much better ground. That buys trust.

Then, educate the auditors when necessary. Show them in practice how something
simple can provide a better trail and an improved experience. When possible,
provide evidence in the format they initially ask for, but also in the format
which is more suitable and more convenient. Auditors are humans. They just
often are not aware of the leading edge, of what is possible. Show off
solutions that are more convenient to both of you.

They will learn. They will be impressed by some things you do. Anything that
makes their job easier, while satisfying the _intent and spirit_ of the audit,
will be an easy sell. They also believe in repeat business. Show repeatedly
that you know what you are doing, and why you believe that your approach makes
more sense (while delivering _better_ audit trails).

Convenience is a strong currency.

0: There are some domains where a single company has essentially a "Royal
Charter". These are much, much harder to deal with, because the monopolist has
little need to employ personnel with proper technological understanding. They
can also strong-arm and bully their customers at will, because there are no
alternatives. Audits like these can _very_ easily degrade into box-ticking
bonanzas. My advice for these cases is: pick your fights. Double down on what
you truly believe and give in on smaller, less disruptive items. Rinse and
repeat on subsequent years.

------
djsumdog
I find rebasing can get so insane that most of the time, when I'm ready to
submit a pull/merge request to master, I create a whole new branch from
master, cherry-pick all my commits in order, rebase -i and squash them if I
need to, and then push the branch/create a PR/MR.

I find this looks a lot cleaner in tools like Gitlab/Bitbucket as well.
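
A sketch of that workflow in a throwaway repository. The branch names `wip` and `feature-clean`, the branch name `main` standing in for master, and the `demo` identity are all made up for the demo. Since `rebase -i` needs an editor, the squash step here swaps in `git reset --soft` followed by a single commit, which collapses the commits in a similar way but is not the same command:

```shell
set -eu

# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo base > base.txt
git add base.txt
git commit -q -m "Base"

# A messy working branch with two commits.
git checkout -q -b wip
echo one > one.txt
git add one.txt
git commit -q -m "WIP: part one"
echo two > two.txt
git add two.txt
git commit -q -m "WIP: part two"

# Meanwhile the main branch moves ahead.
git checkout -q main
echo other > other.txt
git add other.txt
git commit -q -m "Unrelated change on main"

# Fresh branch from the current main, then replay the work commits in order.
git checkout -q -b feature-clean main
git cherry-pick main..wip > /dev/null

# Non-interactive squash: keep the resulting tree, collapse to one commit.
git reset -q --soft main
git commit -q -m "Add feature"

ahead=$(git rev-list --count main..HEAD)
echo "commits ahead of main: $ahead"
```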

~~~
spockz
Is there a way so you can rebase and push without breaking everyone who
already pulled your repo? I have this issue continuously when using rebase on
branches that have gitlab or github merge/pull requests. I need to create a
different branch all of the time.

~~~
cesarb
You aren't supposed to rewrite history (which is what rebase does) on which
other people are basing work. If people have merge/pull requests on your
branch, then people are basing work on your branch, and you shouldn't rewrite
it.

The best approach (which is what git.git itself uses) is to do the work on a
separate branch, which is rebased often; once it's moved to the master branch,
it's "frozen" and won't be rebased anymore. All pull requests are based on the
master branch, so aren't affected by the constant rebases on the development
branch. (Actually, all the work is done on topic branches, what's often
rebased/rewritten is a sequence of merges of these topic branches into the
master branch.)

~~~
spockz
That is what I meant. Master is stable, I have my feature or bug fix branch
which I already want to show. Nobody is forking from this branch, but if they
pulled the branch before, it will break after my rebase.

------
thefifthsetpin
I had a modular laravel app where each laravel module was a git submodule.
Eventually we put all of those submodules into the same git repository, so we
then had n+1 root commits for our n module laravel project.

It worked well until someone ran an accidental `git merge` of one of the
submodules into the main project which sewed a bunch of confusion. After that
point the practice was banned where I worked. Nice to know that git added a
flag to prevent that, though at this point git has also added a subtree
command which I think removes the need for our hack.

~~~
carapace
*sow

~~~
thefifthsetpin
Thanks.

------
ghostly_s
(2017)

~~~
sctb
Thanks! Updated.

------
YeGoblynQueenne
If I may- I think the first graph would look better as a histogram, with a
smooth polynomial on top. It's hard to see the long-tailed shape in a scatter
plot.

------
ezoe
I probably will never do octopus merge in my entire career.

~~~
y4mi
The gitflow concept requires one to verify if a branch can be merged into both
targets.

hotfix/${version} needs to go into develop and master, so you need that tiny
octopus merge to verify before starting the process

but yeah, incredibly rare.

