
Git Internals, Techniques, and Rewriting History - mxschumacher
https://blog.isquaredsoftware.com/presentations/2019-03-git-internals-rewrite/#/
======
aequitas
My colleague told me I'm the guy with his phone number in the git.txt file,
don't know if that's a compliment?

But whenever I try to explain Git to someone I try to step away from a
computer and just work out the problem on a whiteboard, using Post-its for
branches/HEAD/tags and a marker to write down the commits and the commit tree.
Preferably a permanent marker, to reflect the permanent nature of commits with
regard to branches (e.g. rebasing keeps the original commits around). Also, I
ditch files in favor of a picture of a cat, where changes are attributes like
added body parts or toys, since most people like cats better than files.

Taking it away from a computer really helps to reduce the complexity involved
and thinking about what manipulations you make to the Git 'database' visually
really helps understanding the concepts imho. Just going through it step by
step as if you were the Git binary making the changes, and after a while the
Git commands turn from hard-to-remember trivia to tools in your toolbox.

~~~
pjc50
Conversely, this is a massive indictment of the git user interface.

People used to talk about "CASE tools": Computer Aided Software Engineering,
by analogy with the CAD tools that replaced physical drawing tables once the
technology got good enough. I joke that certain tools are "computer impaired
software engineering", and git can certainly belong in that category at times.

What we have built is a tool with a good internal model but a user interface
so awkward that it's easier for you to _not_ use a computer, to step away to
the whiteboard, in order to work out what to do.

We don't tolerate this for, say, word processors; people don't go to a piece
of paper to lay out their headings and only then work out the incantations
necessary to achieve that result.

(Has anyone attempted to make a "transparent porcelain" GUI which represents
the "intuitive" internals in an actually explanatory way? The success of which
could be measured by the reduced number of mistakes and amount of apprehension
experienced by non-expert users?)

~~~
intarga
I don't think it's an indictment since it's a clear tradeoff. Yes git isn't
intuitive or self explanatory for newcomers, but that's only because it's
optimised to be maximally efficient for experienced users.

To me, this is a sensible decision. The target users of git will use it day in
and day out for decades; in this case it makes sense to prioritise experienced
usage. For onboarding new users there are plenty of good tutorials, and I
think in the long run it makes sense to put the legwork in with those, rather
than switch to a more intuitive but less efficient porcelain.

~~~
jcranmer
> Yes git isn't intuitive or self explanatory for newcomers, but that's only
> because it's optimised to be maximally efficient for experienced users.

But that implies that a trade-off _has_ to be there, which it does not. If you
design your features well, it can both be quite intuitive for newcomers while
still affording efficiency for experts.

A great example is the staging area in git. It is a horrible feature that
causes lots of interactions with other commands that are inconsistent and
unclear, especially for newcomers. How to do it better? Just make the staging
area a full-fledged commit. Considering that it's already pretty easy to edit
a commit anyway, promoting the workflow of just polishing a commit before
publication is more intuitive for users, and it means a question like "how do I
see what the diff of the staging area by itself is?" is really just "how do I
see what the diff of a commit by itself is?" One of those questions I already
know the answer to, the other I don't.

~~~
WorldMaker
It is encouraging to see, in the work going into splitting `git checkout` into
`git switch` and `git restore`, that there is growing pressure among git
developers to make things more consistent for users and smooth over some of the
roadblocks, including some of the confusion between the staging index and the
working directory.

------
acemarke
Oh hey, that's my post! Glad someone found it interesting enough to submit.

Don't expect this to hit critical mass by now, but happy to answer questions
on the rewriting thing if anyone has any.

~~~
funkattack
Slide #39 seems to use a graph from
https://marklodato.github.io/visual-git-guide/index-en.html#reset , which is
published under "Attribution-NonCommercial-ShareAlike 3.0 United States
(CC BY-NC-SA 3.0 US)". This should be credited appropriately and your content
should be shared alike.

~~~
acemarke
Yeah, I grabbed a bunch of images from various sources for the slides.
Unfortunately, I was in a hurry and didn't record all the places I got them
originally.

I'll try to find time to go through and attribute things. Thanks for the
heads-up!

~~~
funkattack
In my book "credited appropriately" reads like, put the site down till you
have done your homework!

~~~
acemarke
Look, I'm at work. I agree that it needs to be updated, and will do it as soon
as I have time.

------
pjc50
I love how the first real slide goes straight into the fake manpage generator:
https://git-man-page-generator.lokaltog.net/

"App Repo Size Issues:" yes, this is the Achilles heel. By being a full
distributed system where every client has to carry a full copy of the history
of everything, some common practices become unsustainable. I've had two
employers that checked build artefacts into SVN, for example: do that with git
and the repo becomes unusably large very quickly. Vendoring dependencies, a
useful practice if your project is slow-moving, will also bloat the repo. They
should be using "shallow clone" for Jenkins, but even that can be surprisingly
large.

I've also been through the "apply BFG to repo" phase (very time-consuming, and
blocks all commits while you're doing it!)

> New idea: run the formatter against every commit in the history, so that it
> looked as if the code was "always formatted the right way".

This is actually brilliant, for the reason they give - keeping "credit"
assigned correctly to original commits.

> Determined it was okay if older commits were potentially "broken", as long
> as the latest commit runs and has all of our changes as of late 2018

I'm less OK with this, as the chances of an automatically introduced horrible
bug which you can't trace seem rather high _and_ you've wrecked any chance of
using git-bisect! But if the "tip" is the only supported released version, I
suppose it's less critical.

It's also interesting that most of the speed issues are addressed by re-
architecting to avoid syncing to disk. If there was an easy Windows RAMdisk
this might have made almost as much of a difference.

~~~
jacobush
But there is an easy RAMdisk for Windows!
http://www.ltr-data.se/opencode.html/#ImDisk

------
mederel
The slides don't display well on mobile devices.

~~~
acemarke
Sorry, mobile formatting was not something I was concerned about when I was
putting these together. I presented them from my laptop, so that was all that
mattered at the time.

~~~
Phrenzy
Do I sense an upcoming fork? ;)

------
darekkay
I always found "rewriting history" misleading. Git is immutable by default, so
you cannot "rewrite" the history. In fact, you are creating an "alternate
history" (which the article mentions). While the difference appears subtle, it
takes away the fear of users who never use any "altering" commands because
they think they might lose their changes or mess up something.
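
To make the "alternate history" point concrete, here is a minimal sketch of how
a commit id is derived, assuming the default SHA-1 object format. The `store`
dict, the `write_commit` helper, and the author identity are made up for
illustration; real commits carry real identities and timestamps.

```python
import hashlib
from typing import Optional

def object_id(kind: str, body: bytes) -> str:
    """Git-style object id: SHA-1 over '<kind> <size>' + NUL, then the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree: str, parent: Optional[str], message: str) -> bytes:
    """Simplified commit text; the identity and timestamp are placeholders."""
    who = "A U Thor <author@example.com> 0 +0000"
    lines = [f"tree {tree}"]
    if parent:
        lines.append(f"parent {parent}")
    lines += [f"author {who}", f"committer {who}", "", message]
    return ("\n".join(lines) + "\n").encode()

store = {}  # object id -> body, a stand-in for .git/objects

def write_commit(tree: str, parent: Optional[str], message: str) -> str:
    """Store a commit and return its id (hypothetical helper)."""
    body = commit_body(tree, parent, message)
    oid = object_id("commit", body)
    store[oid] = body
    return oid

empty_tree = object_id("tree", b"")
original = write_commit(empty_tree, None, "Add cat")
amended = write_commit(empty_tree, None, "Add cat with hat")  # an "amend"

print(original != amended)  # the amended commit is a different object
print(original in store)    # the original object has not been touched
```

Because the id is a hash of the content, "changing" a commit can only ever
produce a new object next to the old one.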

~~~
chrismorgan
The most common scenario is that the original history is immediately
discarded, by virtue of no longer having any refs pointing to it—though it
lingers in the reflog for a while before being garbage collected. In that
situation, I think it’s quite reasonable to call it rewriting history, because
the original history _no longer exists_ ; for the term “alternate history” to
have much meaning, the original must still exist, to be compared to.

I’m looking at the long-term picture here, rather than how you interact with
it when in the process of rewriting history.

Clarifying that you’re replacing history rather than modifying it is useful,
but I certainly have no beef with the expression “rewriting history”.
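
A rough sketch of the lifecycle I mean, as a toy reachability walk. The commit
names, the `parents` map, and the `reachable` helper are hypothetical; this
only models the idea that reflog entries keep old commits alive until they
expire, not how git's garbage collection is implemented.

```python
def reachable(parents, starts):
    """Collect every commit reachable from `starts` by following parent links."""
    seen, todo = set(), list(starts)
    while todo:
        commit = todo.pop()
        if commit not in seen:
            seen.add(commit)
            todo.extend(parents[commit])
    return seen

# Parent links after a rewrite: B and C were replaced by B2 and C2.
parents = {"A": [], "B": ["A"], "C": ["B"], "B2": ["A"], "C2": ["B2"]}
refs = {"main": "C2"}
reflog = ["C2", "C"]  # the old tip is still remembered here

kept_now = reachable(parents, list(refs.values()) + reflog)
kept_after_expiry = reachable(parents, refs.values())  # reflog entries expired

print(sorted(kept_now))                             # old and new history
print(sorted(set(parents) - kept_after_expiry))     # what becomes collectable
```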

------
ridiculous_fish
What are the best practices for extending git? Every example seems to be a
shell script that calls out to git; is there a better approach?

~~~
saagarjha
What kind of functionality are you trying to add? For something simple,
shelling out to git is probably OK; if you're writing something more
complicated, many projects rely on libgit2. For _really_ weird stuff you can
probably just directly go and edit objects inside of .git, as long as you're
careful.

~~~
ridiculous_fish
I would like to extend git such that you can check out a commit, amend it,
rebase all descendants onto the amended commit, and update any branches
accordingly. Effectively like `rebase -i` but not modal.

This requires keeping some state around. libgit2 is a nice pointer, I'll check
that out.

(hg users will recognize this as "restacking.")
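
Roughly what I have in mind, as a toy sketch on an in-memory graph. Everything
here (`Commit`, `restack`, the primed ids) is hypothetical and uses neither git
nor libgit2; a real version would have to reapply each descendant's diff and
handle conflicts, which this skips by copying the content unchanged.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Commit:
    id: str
    parent: Optional[str]
    content: str

def restack(commits, branches, old_id, new_content):
    """Amend `old_id`, recreate its descendants on top, and move branches.

    `commits` maps id -> Commit and must list parents before children.
    Returns new (commits, branches) dicts; the inputs are left untouched.
    """
    commits = dict(commits)
    renamed = {}  # old id -> id of its rewritten copy
    for c in list(commits.values()):
        if c.id == old_id:
            new = Commit(c.id + "'", c.parent, new_content)
        elif c.parent in renamed:
            new = Commit(c.id + "'", renamed[c.parent], c.content)
        else:
            continue  # not the amended commit and not one of its descendants
        commits[new.id] = new
        renamed[c.id] = new.id
    branches = {name: renamed.get(tip, tip) for name, tip in branches.items()}
    return commits, branches

history = {c.id: c for c in [
    Commit("A", None, "cat"),
    Commit("B", "A", "cat + hat"),
    Commit("C", "B", "cat + hat + toy"),
    Commit("D", "B", "cat + hat + tail"),
]}
branches = {"main": "C", "feature": "D", "base": "A"}

new_history, new_branches = restack(history, branches, "B", "cat + top hat")
print(new_branches)
```

The old commits stay in the returned graph, so the state to keep around is
mostly the old-to-new id mapping.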

------
mettamage
I find the diagram on page 15 very good for getting an overall feel/flow for
basic git usage. If I were still teaching at a coding school, I'd hand it out
to my students.

~~~
rinchik
Isn't it a bit off? It shows "git diff" as a diff between staging and workspace
when it should be between workspace and local repo.

~~~
pjc50
Not quite: if you stage something, it no longer appears in "git diff" and you
have to use "git diff --cached" to show it. One of those things which I trip
over occasionally.

"git diff --cached" = staging vs. HEAD

"git diff" = workspace vs. staging (but when nothing is staged, staging ==
HEAD)

(I think?)
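
A toy model of those two comparisons, as a sketch: the three dicts standing in
for HEAD, staging, and the workspace are an assumption for illustration (git
does not store things this way), and `changed` is a made-up helper.

```python
def changed(old, new):
    """Return the sorted paths whose content differs between two trees."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))

# Each "tree" is just a dict of path -> file content.
head = {"cat.txt": "cat"}
staging = dict(head)    # nothing staged yet, so staging == HEAD
workspace = dict(head)

workspace["cat.txt"] = "cat with hat"          # edit the file, do not stage it
unstaged_before = changed(staging, workspace)  # what "git diff" compares
staged_before = changed(head, staging)         # what "git diff --cached" compares

staging["cat.txt"] = workspace["cat.txt"]      # the effect of "git add cat.txt"
unstaged_after = changed(staging, workspace)
staged_after = changed(head, staging)

print(unstaged_before, staged_before)  # ['cat.txt'] []
print(unstaged_after, staged_after)    # [] ['cat.txt']
```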

~~~
rinchik
Yep, you are right, "but when nothing is staged, staging == HEAD" is exactly
what I had in mind.

------
emilfihlman
Note: the presentation has UI flaws and does _not_ fit on a 3:2 screen, so
you'll have to zoom to about 80% for it to be visible. This is apparent on
slide 8.

------
dreamcompiler
Impossible to resize and read on my phone. It amazes me when a developer
writes a presentation about a developer tool using a goopy JS-dependent
infrastructure that apparently has never been tested on mobile.

~~~
acemarke
As I said down-thread, my only immediate concern when I put the presentation
together was presenting it from my own laptop. I appreciate that folks are
wanting to read it on mobile, but I've got a lot of stuff on my plate
(primarily work around maintaining Redux), and modifying my slides to work
better on mobile has not been a priority.

~~~
dreamcompiler
Fair enough. Sorry about not reading before commenting.

~~~
acemarke
No worries.

The complaints about it not being readable on mobile are legit. If I could
wave a magic wand and make it all magically responsive, I would.

But, my original goal was simply to actually make the slides for the
presentation, and show them during my talk. I do specifically use the
Spectacle React/JS toolkit, as I _want_ to publish the slides online later for
viewing on the web (one of several reasons why I don't make them in
PowerPoint). But, part of the reason I can get away with that is that
publishing them on my blog is just a matter of uploading the built assets and
moving on, vs having to convert them from PowerPoint by hand or something.

I've got a ton of other priorities and tasks to deal with, and figuring out
what's needed to make the slides well-formatted on mobile realistically isn't
anywhere on that list. Honestly, this thread is the first time anyone's
actually complained about that.

I'm not sure how much of the formatting issue is due to Spectacle's own
styles, vs the typical slide layout that I have in there (which is mostly
flexboxes with two items side-by-side). If anyone has some specific
suggestions on how to alter the styles to make them work better, I could try
to apply those and rebuild it.

------
rsp1984
Safari user here. For some reason the navigation arrows are missing and I
can't get past the first page...

~~~
acemarke
Sorry! I built it using the Spectacle web slides toolkit, and have only looked
at it in Chrome and Firefox because that's what I use myself.

------
pikzel
Would have been nice if we could scroll through the pages and not have to
use the arrow keys.

------
samwestdev
Any video archive of the presentation?

~~~
acemarke
No, I only gave this as an internal brown bag talk at work.

------
dang
We changed the URL from
https://blog.isquaredsoftware.com/2019/10/presentation-hooks-hocs-tradeoffs/
to the slides which have the content.

~~~
acemarke
Note that the original post links to an earlier article I wrote that goes into
a lot further detail on the actual rewrite process:

https://blog.isquaredsoftware.com/2018/11/git-js-history-rewriting/

