
Write yourself a Git (2018) - adamnemecek
https://wyag.thb.lt/
======
sransara
Thanks for sharing. I too agree on this: "... Git is complex is, in my
opinion, a misconception... But maybe what makes Git the most confusing is the
extreme simplicity and power of its core model. The combination of core
simplicity and powerful applications often makes thing really hard to
grasp..."

If I may do a self plug, I recently wrote a note on "Build yourself a
DVCS (just like Git)"[0]. The note tries to explain the reasoning behind the
design decisions of Git's internals, while conceptually building a Git step
by step.

[0] [https://s.ransara.xyz/notes/2019/build-yourself-a-
distribute...](https://s.ransara.xyz/notes/2019/build-yourself-a-distributed-
version-control-system-just-like-git/)

~~~
jordigh
While this is nice, I think it should be emphasised that the blob-tree-commit-
ref data structure of git is not essential to a DVCS. One of the disadvantages
of everything being git is that everyone can only think in terms of git. This
makes things like Pijul's patch system, Mercurial's revlogs, or Fossil's
sqlite-based data structures more obscure than they should be. People not
knowing about them and considering their relative merits has resulted in a bit
of a stagnation in the VCS domain.

~~~
thanatropism
Worse is better.

------
seleniumBubbles
This is great, thanks for sharing.

People in this thread might also appreciate this essay:
[https://maryrosecook.com/blog/post/git-in-six-hundred-
words](https://maryrosecook.com/blog/post/git-in-six-hundred-words)

And the more expanded version: [https://maryrosecook.com/blog/post/git-from-
the-inside-out](https://maryrosecook.com/blog/post/git-from-the-inside-out)

It really helped me comprehend Git enough to start understanding the more
complex workflows.

~~~
JustSomeNobody
I really enjoy the way MRC explains topics. I, too, would recommend her essay.

------
dakra
I recommend "Git from the bottom up"[1] for a nice and relatively simple
tutorial that shows you the inner design of a git repository.

[1] [https://jwiegley.github.io/git-from-the-bottom-
up/](https://jwiegley.github.io/git-from-the-bottom-up/)

------
asdkhadsj
On the note of Git being difficult, I'm really curious to see if Pijul[1] ends
up being easier to understand than Git.

[1]: [https://pijul.org](https://pijul.org)

~~~
nemetroid
From my limited knowledge (mostly based on jneem's article series[1]), I think
Pijul is more powerful, but for the same reason also considerably more
difficult to understand than Git.

In particular, Pijul supports (and depends on) working with repository states
that are, in Git terms, not fully resolved. In addition, those states are
potentially very difficult to even represent as flat files (see e.g. [2]). Git
is simpler in that it mandates that each commit represents a fully valid
filesystem state.

That said, I still think Pijul might have a place, if it turns out that it
supports superior workflows that aren't possible in Git. But the "VCS elitism"
would probably become worse than it is today.

[1]: [https://jneem.github.io/merging/](https://jneem.github.io/merging/) [2]:
[https://jneem.github.io/cycles/](https://jneem.github.io/cycles/)

------
chrislo
Folks interested in this may also be interested in a new book from James
Coglan "Building Git": [https://shop.jcoglan.com/building-
git/](https://shop.jcoglan.com/building-git/)

It also takes a ground-up "build it yourself" approach and has tons of
interesting detail.

------
driusan
Having written my own git client, I can tell you that "the most complicated
part will be the command-line arguments parsing logic" doesn't go away. I
wouldn't be surprised to wake up one day and find someone published a proof
that NP != P, and the proof involved trying to parse the git command line.

~~~
Jupe
Case in point:

[https://git-man-page-generator.lokaltog.net/](https://git-man-page-
generator.lokaltog.net/)

:)

~~~
driusan
Doesn't seem that realistic. I reloaded a bunch of times, and didn't get a
single one where the same command can mean 3 different context-sensitive
things that each take different arguments but are all named the same command.

------
andrewshadura
A bit of a plug, I know, but here’s a tool I wrote to make selective
committing a bit easier and more interactive:

[https://github.com/andrewshadura/git-
crecord/](https://github.com/andrewshadura/git-crecord/)

Those familiar with Mercurial will surely notice it is, in fact, a port of
Mercurial’s interactive commit functionality, previously a separate extension
called crecord.

------
MordodeMaru
It is indeed a great (and time-consuming) approach. Other VCSs have gone a
little further than Git though, simplifying their workflows or the GUIs and
CLI commands to make their use more intuitive like
[https://www.plasticscm.com/](https://www.plasticscm.com/) and others.

------
Milank
I always thought the best way to learn something is by doing it.

The only problem now is the time.

------
speter
This free ebook is also a good resource if you're just starting with Git:
[https://www.git-tower.com/learn/git/ebook/en/command-
line/in...](https://www.git-tower.com/learn/git/ebook/en/command-
line/introduction)

There is a video course as well: [https://www.git-
tower.com/learn/git/videos#episodes](https://www.git-
tower.com/learn/git/videos#episodes)

------
mcbain
Just be aware that it is a good start but isn’t as complete as the command
list would indicate - runs up to about the point of index files, so not quite
at ‘git add’.

------
Chia
Good job, thanks for sharing.

I just wrote a simple git dumper tool
([https://github.com/owenchia/githack](https://github.com/owenchia/githack)) a
few days ago. Learning by doing is a very good way to learn and I really enjoy it.

------
bibyte
This is such a great tutorial to learn Git from the bottom up. I always
thought the "back end" part of Git was pretty complex but this tutorial
makes it look so easy.

------
zadwang
This is simply excellent. I already know some basics of the internals of git
at a conceptual level but this tutorial makes the knowledge so much more
concrete. Wonderful.

------
blastbeat
Seems to be a great opportunity to learn some Git and Python at the same time.
I like these kinds of approaches very much. Thanks!

------
dprophecyguy
Does anybody know of any more "write your own" type tutorials for other
projects?

Please point them out here

~~~
inetsee
Doing a search on HN for "write your own" returns a lot of answers, including
'Ask HN: “Write your own” or “Build your own” software projects'
[https://news.ycombinator.com/item?id=16591918](https://news.ycombinator.com/item?id=16591918)
from a year ago.

------
vkaku
Great! Thank you for writing an iterative one.

------
ansible
It might be fun to try this in Rust as well.

~~~
Vogtinator
Or in Haskell. Or in C++.

------
cabalamat
I have an admission to make: I don't understand git. By this I mean I have a
few simple commands I use (status/add/commit/push/pull) and if I try to do
anything more complicated it always ends up with lots of complex error
messages that I don't understand and me nuking the repository and starting
again.

So I think: there must be a better way.

I have often thought about implementing a VCS. The idea behind one doesn't
seem particularly complex to me (certainly it's simpler than programming
languages). If I did I would quite probably use WYAG as a starting point. My
first step would be to define the _user's mental model_ -- i.e. what
concepts they need to understand such that they can predict what the system
will do. Then I would build a web-based UI that presents the status of the
system to the user in terms of that model.

~~~
AnIdiotOnTheNet
Yeah, I too don't really understand git. It seems that it was developed
without any concern for affording a good mental model of its operation to its
users, and thus it is just a complex black box you chant arcane rituals at and
hope it doesn't decide to burn your world down. I know I _could_ build a
mental model of it if I put enough time into it, but who wants to do that when
there's actually useful things to do? So instead when I have to use it to
contribute to open source projects I have a sheet of notes with incantations
to cover the specific things I've had to do with it in the past.

~~~
chousuke
I have the exact opposite problem. I have a mental model of git, but not a
very good one for most of its competitors; I don't really get mercurial for
example, but git is just:

1) uncommitted stuff in workdir. Potentially can be lost, so commit often.

2) blobs in repo representing snapshots at commit time. Can never be lost.

3) symbolic references to the blobs. Can always recover from reflog.

4) tools to sync the above two things between repositories. (fetch and push)

5) tools to merge, diff and otherwise manipulate the changes between snapshots
and files.

I'm confident that git will never lose my data, so long as I commit it. This
makes experimentation stress-free.

Technically, you can lose data by explicitly deleting your refs, expiring the
reflog, and running gc, but if you go that far you might as well rm -r .git
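
To make point 2 a bit more concrete, here is a minimal sketch of how Git names a blob: the ID is the SHA-1 of a short header (`blob <size>` plus a NUL byte) followed by the content. The `ToyObjectStore` class is a made-up in-memory stand-in for `.git/objects`, not anything Git itself ships, and it ignores trees, commits and packfiles entirely.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes "blob <size>\0" followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


class ToyObjectStore:
    """Hypothetical in-memory stand-in for .git/objects (illustration only)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        # Storing the same content twice yields the same ID, so nothing
        # already in the store is ever overwritten with different data.
        oid = git_blob_id(content)
        self._objects[oid] = content
        return oid

    def get(self, oid: str) -> bytes:
        return self._objects[oid]


store = ToyObjectStore()
oid = store.put(b"hello world\n")
print(oid, store.get(oid))
```

Because the name is derived from the content, an object can only be added, never modified in place, which is roughly why committed data is so hard to lose by accident.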

~~~
krupan
Mercurial is the same except:

0) you forgot to explain git's index. Mercurial doesn't have the index, it
works how you described git.

2) data (blobs, revsets, whatever) in the repo actually can never be lost,
there is no automatic gc

3) no need for some different tool/viewer to view commits that don't have refs

Technically there are several ways to remove data from the repo, but it never
ever happens automatically behind your back.

~~~
chousuke
The mental model helps me with git because it maps pretty closely to what data
actually exists. The index is just a useful thing to help me put stuff into a
repository in a controlled manner. I've never attained a similar transparent
understanding of mercurial.

I think I've asked this before, but what exactly are mercurial's branches?

In git, they are a "physical" feature of the repository as it represents a set
of lineages, not an actual repository object.

As such, any reference to a commit uniquely identifies a branch, so the
concept of a "named line of development" is simply implemented as a reference
that gets updated as you make more commits. When you "delete" a branch in git,
it goes nowhere. Only its name is removed.
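
A toy sketch of that idea, assuming nothing about Git's real storage: commits live in one table, branch names in another, and a branch is only a name pointing at a commit. `ToyRepo` and its counter-based commit IDs (`c1`, `c2`, ...) are invented for illustration; real Git uses content hashes and files under `.git/refs`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    message: str
    parent: Optional[str]  # ID of the parent commit, or None for a root commit


class ToyRepo:
    """Hypothetical model: commits in one table, branch names in another."""

    def __init__(self) -> None:
        self.commits: Dict[str, Commit] = {}
        self.refs: Dict[str, str] = {}  # branch name -> commit ID
        self._counter = 0

    def commit(self, branch: str, message: str) -> str:
        # A new commit points at the branch's current tip, then the
        # branch name is moved forward to the new commit.
        self._counter += 1
        cid = f"c{self._counter}"
        self.commits[cid] = Commit(message, self.refs.get(branch))
        self.refs[branch] = cid
        return cid

    def create_branch(self, name: str, start: str) -> None:
        self.refs[name] = start

    def delete_branch(self, name: str) -> None:
        # Only the name goes away; the commits table is untouched.
        del self.refs[name]

    def history(self, cid: Optional[str]) -> List[str]:
        out = []
        while cid is not None:
            out.append(cid)
            cid = self.commits[cid].parent
        return out


repo = ToyRepo()
first = repo.commit("main", "first")
repo.create_branch("topic", first)
tip = repo.commit("topic", "experiment")
repo.delete_branch("topic")
print(repo.history(tip))  # the lineage is still reachable by commit ID
```

In real Git the now-unnamed commits stay around until garbage collection removes them, which this sketch does not model.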

What sort of structure does mercurial use to represent its branches? I know
they are not just an emergent thing like in git.

~~~
krupan
Mercurial uses a DAG just like git. It has facilities for embedding a branch
name in each commit, or for not doing that and having anonymous branches. It
also has a feature similar to git's "branches"

Two articles that might help if you really want to dig into it:

[http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-
me...](http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/)

[https://bryan-murdock.blogspot.com/2013/06/git-branches-
are-...](https://bryan-murdock.blogspot.com/2013/06/git-branches-are-not-
branches.html)

~~~
chousuke
Hmm, so the branch names are part of the commit's metadata. I'm not sure I
agree with that, but I guess it's a valid choice.

The first article also states that unnamed branches are useful for small,
temporary diversions, and notes that git has to name branches, but I think
that's somewhat misrepresenting git since you can throw away names as soon as
they are no longer useful. To me it seems kind of silly to have unnamed
branches, given that names are free and much easier to remember than commit
hashes.

------
ssivark
I have nothing to directly comment on the tutorial. Just a tangential mention
regarding the tedious argument parsing boilerplate in Python, I have found
Python Fire to be much more convenient: [https://github.com/google/python-
fire](https://github.com/google/python-fire)

It would have shaved off another 15-20 lines from the 503 line example ;-)
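
For context, the kind of boilerplate in question looks roughly like this with the standard library's `argparse`. This is only a sketch: the program name `toyvcs` and the exact options are illustrative and not copied from the tutorial.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with Git-style subcommands (illustrative names only)."""
    parser = argparse.ArgumentParser(prog="toyvcs", description="Toy VCS CLI")
    sub = parser.add_subparsers(dest="command", required=True)

    # toyvcs init [path]
    init = sub.add_parser("init", help="Create an empty repository.")
    init.add_argument("path", nargs="?", default=".", help="Where to create it.")

    # toyvcs hash-object [-w] [-t TYPE] path
    hash_object = sub.add_parser("hash-object", help="Compute an object ID.")
    hash_object.add_argument("-t", dest="type", default="blob",
                             choices=["blob", "commit", "tag", "tree"],
                             help="Object type.")
    hash_object.add_argument("-w", dest="write", action="store_true",
                             help="Also write the object to the store.")
    hash_object.add_argument("path", help="File to read.")
    return parser


args = build_parser().parse_args(["hash-object", "-w", "README"])
print(args.command, args.type, args.write, args.path)
```

Every subcommand needs its own `add_parser` and `add_argument` calls, which is where the line count comes from.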

~~~
bibyte
Python Fire looks much more concise. But do you know of any other languages
that handle argument parsing better than Python?

~~~
agigao
Clojure.

~~~
bibyte
Can you please post a hello world example ?

------
nukeop
If you find these kinds of "build your own X" articles interesting, there's a
repo on Github that aggregates them, sorted in various categories:

[https://github.com/danistefanovic/build-your-
own-x](https://github.com/danistefanovic/build-your-own-x)

~~~
jyriand
I think I found my learning projects for the next 10 years.

------
raju
If it helps anyone, I wrote a two-parter on Git internals a while back

[https://looselytyped.com/blog/2014/08/31/gits-guts-
part-i/](https://looselytyped.com/blog/2014/08/31/gits-guts-part-i/)

[https://looselytyped.com/blog/2014/10/31/gits-guts-part-
ii/](https://looselytyped.com/blog/2014/10/31/gits-guts-part-ii/)

Disclaimer — This is my blog

Update - Fixed formatting / Clarified post

