
Git from the inside out - luu
https://codewords.recurse.com/issues/two/git-from-the-inside-out
======
leni536
> from inside out

As a physicist, it always surprises me how the thinking of physicists and
programmers is, most of the time, kind of reversed. Most git tutorials seem
like this to me (mechanics analogy):

    
    
       1. Slopes
       2. Springs and gears
       3. Horrendous contraptions
       4. Ropes and pulleys
       ...
       9. Newton I., II. and III.
    

This kind of "inside out" tutorial is natural for me, and recently I taught
git basics to my SO in a similar way. It worked out well (she is a physicist
too). I don't want to generalize, though; it may be rooted in the common ways
of teaching programming and physics.

~~~
JoshTriplett
Imagine if everyone learning physics came in with an attitude of "I need to
learn and use the rocket equation as quickly as possible" (or substitute some
other high-level problem for "the rocket equation"). You'd end up with strange
backwards tutorials for physics that start out with that specific model, then
a handful of related models, then how to abuse that model to handle things it
doesn't really apply to, and much later the underlying physics and mathematics
to solve arbitrary generalized problems.

Many people start out trying to figure out either "how do I use git exactly
like (svn, cvs, vss, ...)" or "how do I commit and push my changes", so
tutorials start there. Most people don't approach git by learning its
underlying data model. Arguably people should, because it's a rather simple
data model, and then all the commands become simple applications of that
model.
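
To make the "rather simple data model" point concrete, here is a minimal Python sketch of how git names a file's content. It assumes the default SHA-1 object format; the helper names `git_object_id` and `blob_id` are made up for illustration and are not part of git.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object the way git does: '<kind> <size>\\0' followed by the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """Id of a blob, the object type git uses for file contents."""
    return git_object_id("blob", content)


# The id depends only on the bytes, not on the file name or location,
# so two files with identical content share a single stored object.
readme = blob_id(b"same bytes\n")
copy_of_readme = blob_id(b"same bytes\n")
print(readme == copy_of_readme)
```

Trees and commits are named the same way, just with a different `kind` and body, which is why most commands reduce to creating objects and moving references between them.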

~~~
ams6110
Most people don't start learning Word by understanding the data structures
used to store the text and modifications either.

For 95% of developers, git is a tool that is incidental to their primary task
(developing software). Having to have a deep understanding of the underlying
data structures in order to use it effectively is the antithesis of how most
"utility" software is designed.

When I am in the flow of coding some part of my project, my head is full of
the data structures, object models, databases, algorithms, requirements, etc.
that are immediately relevant to that task. If I have to do a context switch
and pull the git data model front and center into my thinking to know what to
do to get my work into the repository, that is a serious break in flow and has
always been a problem for me whenever I've had to use git.

~~~
agumonkey
> Most people don't start learning Word by understanding the data structures
> used to store the text and modifications either.

And, there should be a warning when your document gets above 5 pages long that
you should learn the structural sides of the program at hand. Unless you like
spending your weekend crafting a TOC line by line.

~~~
simula67
The world would be a much sadder place if that was the case. I do not
understand how the infrastructure that powers world trade, banking, public
construction etc works. I am grateful that they are packaged into easily
understandable interfaces so I can still benefit from them.

The power of technology is compounded when it can empower even non-
technologists to use it. The more sophisticated tasks they can accomplish with
it, the better.

~~~
agumonkey
It's a double-edged sword. By crafting interfaces so simple anyone can use them,
we forbid them to understand what's really happening. That's how people end up
thinking the blue 'e' icon on the screen is Internet.

~~~
JadeNB
But is 'forbid' really the right word here? I certainly agree that such
interfaces _encourage_ people to "accept without questioning", but it seems
that there is no necessary obstacle to making something that's easy to use but
also permits you to dive under the hood and see the details. (I think
particularly of Mac OS before it started becoming all iOS'd. Even now, when
Apple's 'just work'ing settings don't just work for you, you can often fix it
by diving into the command line.)

~~~
agumonkey
You're right, 'forbid' isn't appropriate.

IMO, no current mainstream OS gives you the ability to understand anything,
even the UNIX-based ones. And I don't believe command lines are a way to
understand either, or at best a very low-efficiency one (you have to read a
lot, understand complex context, and try commands that turn out to be mistakes).

You need virtual, mockable, undo-able environments to understand. You need
ways to decode the data and metaphors used by computers.

------
chx
Regular pitch for
[http://www.sbf5.com/~cduan/technical/git/](http://www.sbf5.com/~cduan/technical/git/)

> you can only really use Git if you understand how Git works. Merely
> memorizing which commands you should run at what times will work in the
> short run, but it’s only a matter of time before you get stuck or, worse,
> break something.

It's concise (the linked article is gigantic) and allows for an understanding
of this overhyped user hostile DVCS. I know that git won but don't expect me
to be happy about it.

~~~
barbs
What alternative do you prefer?

~~~
chx
bzr. The CLI is much nicer and more consistent. It has bound branches, which
are really useful when you are doing something small (see bsder comment
above). You are not forced to use a staging area. There is no need to squash
commits just to have a clean history; instead you commit what you have and
limit the depth of the log on view. Also, bzr is really easy to extend in
Python. And so on and so on.

The git command line interface is the exact opposite of user friendly: some
commands have options that change the operation so fundamentally that they
should be different commands. git reset removes changes from the index,
leaving your files intact, and if you do not have anything staged it does
nothing. Now, git reset --hard simply throws away your changes. It does not
end there: git reset 'HEAD^' will remove a commit. How mad is that? git reset
operates on the staging area but git reset something operates on commits??

git checkout filename is almost completely unpredictable as it might get the
file from the index if it exists there or from HEAD if not. There is no
indication at all what happened. And so forth.
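
One way to read the `git reset` variants described above is as a single operation with a default argument: move HEAD to a target commit (HEAD itself if none is given), then overwrite the index, then optionally the working tree. The toy model below is only a sketch of that reading. `ToyRepo` and its fields are invented for illustration, commits are plain dicts, and it ignores branches and untracked files.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    commits: list                      # commits[i] maps path -> content
    head: int                          # position of HEAD in `commits`
    index: dict = field(default_factory=dict)
    worktree: dict = field(default_factory=dict)

    def reset(self, target=None, mode="mixed"):
        """Toy version of `git reset [--soft|--mixed|--hard] [<commit>]`."""
        if target is None:
            target = self.head         # plain `git reset` targets HEAD
        self.head = target             # step 1: move HEAD (a no-op for HEAD)
        if mode in ("mixed", "hard"):
            self.index = dict(self.commits[target])     # step 2: index
        if mode == "hard":
            self.worktree = dict(self.commits[target])  # step 3: files


repo = ToyRepo(
    commits=[{"a.txt": "v1"}, {"a.txt": "v2"}],
    head=1,
    index={"a.txt": "v3"},     # v3 is staged
    worktree={"a.txt": "v3"},  # and present on disk
)
repo.reset()                   # unstages v3, file on disk untouched
repo.reset(target=0)           # "removes" the last commit, file still untouched
repo.reset(mode="hard")        # now the file on disk is overwritten too
```

Seen this way, "operates on the staging area" and "operates on commits" are the same three steps with a different target.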

~~~
barbs
Interesting, I've not ever seen a recommendation for bazaar over git or
mercurial. I have heard that mercurial's CLI is a bit more sane than git, and
I've also read that it's easier to extend in Python. How would you compare
bazaar to mercurial?

I've actually only ever used git for professional and personal development,
but I think I'll want to play around with both mercurial and bazaar now. Git's
CLI is definitely a bit of a PITA.

I find this video summarises git's frustrations pretty well :)
[https://vimeo.com/60788996](https://vimeo.com/60788996).

~~~
ngoldbaum
Bazaar is basically unmaintained at this point. If you look on Launchpad, the
trunk branch hasn't seen a commit since December 2014 [0]. There hasn't been a
release since August 2013 [1].

Mercurial is very actively developed, with dozens of patches a month [2]. They
also keep to a planned rolling release cycle [3] along with deep commitments
about backward and forward compatibility [4].

New features like the experimental evolve extension [5] are also really cool.
I use it for day-to-day work and find it incredibly useful for editing work-
in-progress commits into nice self-contained patches.

[0]
[https://code.launchpad.net/bzr/trunk](https://code.launchpad.net/bzr/trunk)

[1] [https://launchpad.net/bzr/](https://launchpad.net/bzr/)

[2] [http://selenic.com/pipermail/mercurial-devel/2015-March/thre...](http://selenic.com/pipermail/mercurial-devel/2015-March/thread.html)

[3]
[http://mercurial.selenic.com/wiki/WhatsNew](http://mercurial.selenic.com/wiki/WhatsNew)

[4]
[http://mercurial.selenic.com/wiki/CompatibilityRules](http://mercurial.selenic.com/wiki/CompatibilityRules)

[5]
[http://mercurial.selenic.com/wiki/ChangesetEvolution](http://mercurial.selenic.com/wiki/ChangesetEvolution)

~~~
chx
Yes. git won, bzr is dead. As I said: I am aware and I switched to git because
everyone else did. Just don't expect me to be happy about it.

~~~
ngoldbaum
I'm trying to say if you want a git alternative that's not dying and isn't
infuriating, check out mercurial.

~~~
chx
I can't, the world (open source projects and work both) is using git.

------
AceJohnny2
See also "Git from the Bottom Up":
[https://jwiegley.github.io/git-from-the-bottom-up/](https://jwiegley.github.io/git-from-the-bottom-up/)

(originally a PDF in 2008)

~~~
dwyer
Much better article IMO. Introducing the low level commands that the higher
level ones wrap around is a much more fun and interactive way to understand
the .git schema to me.

------
kazinator
Wish `git` didn't have annoying special cases in it. For instance

    
    
       git rebase -i HEAD~2
    

won't work if there are only two commits, because HEAD~2 refers to a
nonexistent commit after the first two.

There should be some friggin' NIL terminator there which takes the HEAD~2
reference.

Imagine having a function to, say, delete characters from a string which takes
an open-ended range `[from, to)`. Then imagine that the index `to` has to exist
in the string; it must not point one element past the end! Oops, you cannot
delete from a position to the end of the string.

The garbage-collected object graph is nice and "Lisp-like" in some ways, but
silly in others.

Oh, and in case you're thinking "just make an empty initial commit, and it
will effectively be your NIL terminator". No can do; git doesn't allow empty
commits. Of course, you can make a file called ".nil" and add it and commit.
Use "()" as the commit comment. :)
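
The half-open range complaint can be sketched in a few lines of Python. This is a toy, not how git resolves revisions: `ROOT_PARENT`, `nth_ancestor` and `commits_after` are hypothetical names, and history is just a list with the newest commit first. For what it's worth, git does have `git rebase -i --root` for rebasing all the way down to the first commit, and `git commit --allow-empty` for recording a commit with no changes.

```python
ROOT_PARENT = None  # hypothetical NIL sentinel: "the parent of the root commit"


def nth_ancestor(history, n):
    """Toy HEAD~n. `history` lists commits newest first.

    One step past the root yields the sentinel instead of an error.
    """
    if n < len(history):
        return history[n]
    if n == len(history):
        return ROOT_PARENT
    raise ValueError(f"history has only {len(history)} commits")


def commits_after(history, base):
    """Commits in the half-open range (base, HEAD], oldest first."""
    end = len(history) if base is ROOT_PARENT else history.index(base)
    return list(reversed(history[:end]))


history = ["second", "first"]                 # a repository with two commits
base = nth_ancestor(history, 2)               # toy HEAD~2: the sentinel
print(commits_after(history, base))           # both commits are selected
print(commits_after(history, nth_ancestor(history, 1)))  # only the newest
```

With the sentinel, "everything down to and including the root" needs no special case in the range logic.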

~~~
nshepperd
I feel like the lack of an initial empty commit is really a failure to match
the intuitive graph model. Clearly there should be an arrow corresponding to
"adding the first files". And that arrow needs somewhere to go _from_ and
_to_. Hence, `git init` should always start by creating an initial commit
object referring to an empty tree.

A side bonus of this would be that since the initial commit is empty it has a
fixed id. Suddenly, all git repositories have a common ancestor, and you can
merge any two random projects together without losing history!
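
The "fixed id" idea can be checked with a short Python sketch, assuming git's default SHA-1 object format. The empty tree does have one well-known id. A commit object, though, also records author, committer and timestamps, so a shared empty root commit would only get a fixed id if that metadata were fixed too. The helper names and the placeholder identity below are made up for illustration.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object the way git does: '<kind> <size>\\0' followed by the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# A tree with no entries has an empty body, so its id never changes:
# 4b825dc642cb6eb9a060e54bf8d69288fbee4904
EMPTY_TREE = git_object_id("tree", b"")


def root_commit_id(tree_id: str, who: str, timestamp: int, message: str) -> str:
    """Id of a parentless commit; `who` is 'Name <email>' (placeholder values)."""
    ident = f"{who} {timestamp} +0000"
    body = (
        f"tree {tree_id}\n"
        f"author {ident}\n"
        f"committer {ident}\n"
        f"\n{message}\n"
    ).encode()
    return git_object_id("commit", body)


who = "Nobody <nobody@example.com>"  # hypothetical fixed identity
same_a = root_commit_id(EMPTY_TREE, who, 0, "()")
same_b = root_commit_id(EMPTY_TREE, who, 0, "()")
later = root_commit_id(EMPTY_TREE, who, 1, "()")
print(same_a == same_b, same_a == later)
```

So the proposal works only if `git init` also pinned the author, date and message of that first commit.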

~~~
kazinator
I echoed this exact idea in another comment in the thread, right down to that
commit having a fixed ID. (I proposed the all-zero SHA (which is not really a
SHA)).

------
dnc
For grokking git, an indispensable resource is the early git dev mailing list
and the corresponding code base (the first couple of months after the project
started). Linus explained it in a very clear and precise way in the mailing
list and related code. The initial code base is surprisingly small (around
1200 LOC of clear and precise C code). The data structures used are simple and
self-explanatory. Although most of the original code is not in the git code
base anymore, the data structures and main design ideas have stayed intact so
far.

~~~
voltagex_
I'm going to have to go through the archives later (anyone got an mbox?) but
it's tricky to follow the early development. The archives seem to start at
[http://marc.info/?l=git&r=20&b=200504&w=2](http://marc.info/?l=git&r=20&b=200504&w=2)
and I don't see many design messages from Linus.

------
mendelk
Interesting article.

Also wasn't aware that Hacker School changed their name.

[https://www.recurse.com/blog/77-hacker-school-is-now-the-rec...](https://www.recurse.com/blog/77-hacker-school-is-now-the-recurse-center)

~~~
chromedude
Don't worry you aren't very far behind the times. It was announced yesterday.

------
jordigh
Huh, another git explanation.

Either the thing is so easy to understand that everyone can do it and is then
compelled to write about it, or it's so difficult to understand that everyone
feels the compulsion to explain it to everyone else.

~~~
RickHull
Maybe a quip, but that dichotomy is so false it's not even funny. Git is tough
to wrap one's head around at first. It's not intuitive unless one already has
a deep background in this space. Hence, there are lots of attempts to explain
it in order to bring more into the fold of intuition. It seems perfectly
natural and good, and your contemptuous tone puzzles me.

~~~
Dylan16807
Right. It's _simple_ but it's _unfamiliar_.

Confusing command parameters aside.

------
logicallee
>"This essay explains how Git works. It assumes you understand Git well enough
to use it to version control your projects."

so...the opposite of Bjarne Stroustrup's maligned "The C++ Programming
Language", which fails to explain how C++ works, after assuming you don't know
it. :)

seriously though no need for the second sentence. this article is a great
intro!

------
RansomTime
Footnote 3: git prune deletes all objects that cannot be reached from a ref.
If the user runs this command, they may lose content.

In what cases would a user lose content? When something is added but not
committed only?

~~~
m0tive
When you've committed something, but then rebased or reset the branch position
so the commit is no longer in the history of any branch or tag. This usually
isn't a problem, because when you rebase work you are making a copy of the
commit so references to the data should be the same.

I also think it's worth noting, `git gc`, which is triggered automatically
occasionally, actually runs `git prune`.
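
The "cannot be reached from a ref" rule can be sketched as a graph walk. This is a toy model with invented names (`reachable`, `prunable`) in which commits are strings and `parents` maps each commit to its parent list. Real git also keeps objects alive through other roots such as the reflog, which the sketch ignores.

```python
def reachable(parents, refs):
    """All commits reachable from any ref by following parent links."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def prunable(parents, refs):
    """Commits no ref can reach: candidates for deletion in this toy model."""
    return set(parents) - reachable(parents, refs)


# History A <- B <- C with master pointing at C: nothing can be pruned.
parents = {"A": [], "B": ["A"], "C": ["B"]}
print(prunable(parents, {"master": "C"}))

# After the toy equivalent of `git reset --hard HEAD^`, master points at B
# and C is no longer reachable from any ref.
print(prunable(parents, {"master": "B"}))
```

Content that was only ever added to the working tree never became an object in this graph, so it is outside what the walk can protect or delete.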

------
ThinkBeat
What tool did the poster use to create the diagrams?

~~~
maryrosecook
OmniGraffle. I really enjoy using it.

------
Fannon
A great title would have been: The Guts of Git.

~~~
a3_nm
Already taken:
[https://lwn.net/Articles/131657/](https://lwn.net/Articles/131657/)

------
msie
I regularly read articles about Git's inner workings and I always seem to
forget them. :-(

------
ams6110
Yet another attempt to explain the incomprehensible.

Why does such a popular version control system find itself in need of so many
explanations?

Any startup attempting to market something that required a user to understand
concepts such as this...

[https://codewords.recurse.com/images/two/git-from-the-inside...](https://codewords.recurse.com/images/two/git-from-the-inside-out/24-13.png)

...would be laughed out of the room in any other context.

~~~
rudolf0
Git isn't exactly a startup (or a company, or even a product), nor did it ever
intend to be.

Lots of great software happens to be difficult for a lot of people to
intuitively understand.

~~~
Crito
> _" Lots of great software happens to be difficult for a lot of people to
> intuitively understand."_

For real. Maybe it is just me, but I find programs like Photoshop and nearly
every CAD program I've ever encountered to be bewilderingly complicated. I
don't use any of those sorts of software professionally, but have found myself
needing them numerous times for hobby reasons. Every time I try to learn them
I become frustrated with just how steep _and tall_ the learning curves are.

Git though? I felt pretty confident with how it worked and basic command line
operation after just a weekend.

Maybe git's command line is more inconsistent than hg's or Subversion's, but
in the grand scheme of software difficulty? I just don't get the complaints.
_" Incomprehensible"_? Give me a break. It does not hold a candle to most
_commercial_ professional software.

