
The History of Git - wickwavy
https://www.welcometothejungle.com/en/articles/btc-history-git
======
mxschumacher
Google seems like a poor example of a company using git. Yes, many of their
open source projects are under git, but the huge monorepo that holds all the
proprietary code works with Piper [0]

[0] [https://cacm.acm.org/magazines/2016/7/204032-why-google-stor...](https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext)

~~~
dijit
It should be noted for those who have not worked at google (since piper is
proprietary) that it works quite similarly to perforce.

~~~
malkia
(Early in my gamedev career I started with CVS/VSS and quickly switched to
p4, then spent 2 years at Google using CiTC+g4, then went back to gamedev with
p4.)

Even the tool is called "g4" after "p4", and has a similar style of commands. The
best part is that I can work from any directory ("client" in p4 terms -
[https://www.perforce.com/manuals/v17.1/cmdref/Content/CmdRef...](https://www.perforce.com/manuals/v17.1/cmdref/Content/CmdRef/p4_client.html))
and in "g4" they all have a virtual (unrealized) view + merges from my changes.
What's even more appealing is that I can do that from home, from a browser -
no VPN, just a YubiKey +
[https://cloud.google.com/beyondcorp/](https://cloud.google.com/beyondcorp/)
- then I can even start a build (not on my laptop, it's just a dumb terminal
at this point). Later I can even (try to) debug with an interface like
[https://source.chromium.org/](https://source.chromium.org/) but for my jobs on
borg - not a real debugger, but I can add print statements, and it'll sync with the
CL (changelist number) of my pushed binary.

So - this is all cool. But then I never really knew how many people, and how
many man-hours, were behind this coolness.

One of the things I miss about Google is exactly that dedication to the
engineer and to keeping their work unimpeded.

------
pbalau
I'm sorry, I'm getting way too bored in the meeting I'm in atm, but I believe
this is false:

> With distributed VCS, a copy of the most current version of the code resides
> on each developer’s device

There is nothing that can guarantee the copy you have is the latest. Can be
very far back, can be forward, can be diverged to hell.

~~~
gumby
In fact that description is even more wrong since every RCS has a copy of the
checked out tree -- that's the whole point of an RCS!

With git (why does that article insist on spelling it "Git"?) you have a copy
of the whole tree, so every clone can also be an upstream tree for somebody
else if desired.

~~~
erikstrottmann
AFAICT, Git documentation capitalizes the first letter:

> The advantages of Git compared to other source control systems. [0]

> You can learn more about individual Git commands with "git help command".
> [1]

[0] [https://git-scm.com](https://git-scm.com)
[1] [http://man7.org/linux/man-pages/man1/git.1.html](http://man7.org/linux/man-pages/man1/git.1.html)

~~~
gumby
Interesting, old programs like 'rm' only use lower case but new programs like
bash and git (or should I say "Bash" and "Git") capitalize the program name
sometimes.

Given that Unix (not "unix") inherited case sensitivity from Multics I would
expect more consistency in the spelling.

(Clearly I'm focusing on the most important topics in computing!)

------
divbzero
This article is great for attributing Git’s success to not only Linus
Torvalds’s core creation but also subsequent contributions from Hamano, King,
Schindelin, Pearce, and many others [1] in the community. Git’s core
architecture was critical but would not have succeeded without the ecosystem
that grew around it.

[1]:
[https://github.com/git/git/graphs/contributors](https://github.com/git/git/graphs/contributors)

One technical error in the article was understating the distributed nature of
Git:

> With distributed VCS, a copy of the _most current version_ of the code
> resides on each developer’s device, making it easier for developers to work
> independently on changes to the code.

Applying s/_most current version_/_all versions_/ gives us the
technically correct statement:

> With distributed VCS, a copy of _all versions_ of the code resides on each
> developer’s device, making it easier for developers to work independently on
> changes to the code.

Having _all versions_ of code is what allows local branch, rebase, merge, and
conflict resolution before pushing those changes to the shared repo.

------
asix66
This reminds me of a talk I watched by Richard Hipp, "Git: Just Say No" [0],
where he discusses a "top 10 git enhancements" list, as of the time (4 years
ago). It's a good talk, and caused me to discover Fossil SCM. Sadly I've yet
to try/use Fossil, and my teams still use git.

[0] https://www.youtube.com/watch?v=ghtpJnrdgbo

~~~
chmaynard
I'm not convinced by Hipp's claim that storing a repo in a single file (SQLite
database) is somehow superior to storing a repo in a filesystem directory that
contains many files. What am I missing?

~~~
SQLite
I don't actually remember making that argument. But I will try to reconstruct
my thinking...

(1) Keeping an entire repo in a single file is a better abstraction. There is
just one file to move around or rename. There is a single icon on your desktop
to drag around or double-click on. There is a single file to attach to an
email. There is a single file to measure the size of when judging the size of
a repository. And so forth.

Lots of programs bundle multiple entities into a single file for convenience
like this. For example, a DOCX file is really a ZIP archive containing lots of
individual pieces. Would you rather your document be a single DOCX file, or a
directory full of the individual pieces? Which would be more convenient to
use, do you suppose? How is a VCS repository different from a DOCX file in
this respect?

Another way to look at this: Breaking up a repository into a directory full of
separate files exposes internal implementation details to the user.

(2) Perhaps I was making the argument that a relational database is better
than a key/value database for holding a repository. (A directory full of files
is just a kind of key/value database after all.) There are countless reasons
why relational databases work better than key/value databases. One example:
With Git, given an individual check-in, it is difficult to discover the
descendants of that check-in. It is so difficult, in fact, that none of the
common Git tools provide that capability, and Git workflows are engineered
(perhaps subconsciously) to avoid the need to ever figure out what comes after
a specific check-in. But if Git used a relational database to store content,
finding the descendants of a check-in would be a fast and simple query.

(3) I/O to a single relational database is faster than I/O to individual files
on disk. See
[https://www.sqlite.org/fasterthanfs.html](https://www.sqlite.org/fasterthanfs.html)
for details.
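
To make point (2) concrete, here is a minimal sketch of the kind of query meant, using Python's built-in sqlite3 module. The `link` table, its `parent`/`child` columns, and the helper names are made up for illustration; this is not the real schema of Fossil or of any Git tool.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical parent/child link table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE link (parent TEXT NOT NULL, child TEXT NOT NULL)")
    # Toy history: a -> b, b -> c, b -> d, and e is a merge of c and d.
    db.executemany(
        "INSERT INTO link (parent, child) VALUES (?, ?)",
        [("a", "b"), ("b", "c"), ("b", "d"), ("c", "e"), ("d", "e")],
    )
    return db


def descendants(db, checkin):
    """Return every check-in reachable by following parent -> child links."""
    rows = db.execute(
        """
        WITH RECURSIVE d(id) AS (
            SELECT child FROM link WHERE parent = ?
            UNION  -- UNION (not UNION ALL) drops duplicates such as the merge 'e'
            SELECT link.child FROM link JOIN d ON link.parent = d.id
        )
        SELECT id FROM d ORDER BY id
        """,
        (checkin,),
    ).fetchall()
    return [row[0] for row in rows]


db = build_demo_db()
print(descendants(db, "b"))  # ['c', 'd', 'e']
```

The recursive common table expression walks the links forward from the starting check-in, which is the direction a store that records only parent pointers inside each commit object does not index directly.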

~~~
chmaynard
Regarding (1), I just ran across an old blog post by Scott Chacon about using
git bundles.

[http://scottchacon.com/2010/03/10/bundles.html](http://scottchacon.com/2010/03/10/bundles.html)

git-bundle documentation: [https://git-scm.com/docs/git-bundle](https://git-scm.com/docs/git-bundle)

------
cletus
So can anyone explain why no one considered that you might ultimately want to
use a different hashing algorithm than sha1?

All hashing algorithms have a shelf life before it becomes feasible to
compromise them such that you need to migrate to something more secure.

Yet the Git format doesn't seem built for this at all. No hash versioning, no
allowance for multiple hashes and no way to define the hash size, such that
now it seems like moving on from sha1 is going to be a giant pain.

TCP/IP had this many years earlier (a version field, at least). This seems like
such a glaring oversight. Or am I missing something?
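
To illustrate what "hash versioning" could look like, here is a rough sketch of a self-describing digest that carries its algorithm name as a prefix. The `algo:hex` format, the `ALGORITHMS` registry, and both function names are assumptions made up for this example; this is not Git's object format.

```python
import hashlib

# Hypothetical registry of algorithm tags; not part of any real Git format.
ALGORITHMS = {"sha1": hashlib.sha1, "sha256": hashlib.sha256}


def tagged_digest(data: bytes, algo: str = "sha1") -> str:
    """Return a digest string that records which algorithm produced it."""
    return f"{algo}:{ALGORITHMS[algo](data).hexdigest()}"


def verify(data: bytes, tagged: str) -> bool:
    """Check data against a tagged digest, dispatching on the algorithm prefix."""
    algo, _, hexdigest = tagged.partition(":")
    if algo not in ALGORITHMS:
        raise ValueError(f"unknown hash algorithm: {algo!r}")
    return ALGORITHMS[algo](data).hexdigest() == hexdigest


old_id = tagged_digest(b"some content", "sha1")
new_id = tagged_digest(b"some content", "sha256")
print(verify(b"some content", old_id), verify(b"some content", new_id))  # True True
```

Because the identifier names its own algorithm, old and new digests can coexist during a migration, and the digest length is implied by the tag.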

~~~
bebop
It could be that they knew the hash algorithm was good for uniqueness and
were unconcerned about security. The SHA1 hash provides no security in
git; it only indicates whether a file has changed.

While collisions have been found in SHA1, it is still a decent hashing
algorithm where collisions are extremely unlikely.

~~~
dependenttypes
> collisions are extremely unlikely

Random collisions are quite unlikely, intentional collisions not so much.

> The SHA1 hash provides no security in git

False. OpenPGP signatures in git depend on the SHA1 hash. Same for someone
doing a checkout at a specific hash because they trust it.

------
unpythonic
Let's begin at the beginning, Chapter
da39a3ee5e6b4b0d3255bfef95601890afd80709.

Of course, that's tongue-in-cheek. It all began with
e83c5163316f89bfbde7d9ab23ca2e25604af290.

~~~
DaGardner
sha1("") == da39a3ee5e6b4b0d3255bfef95601890afd80709

and 'e83c5163316f89bfbde7d9ab23ca2e25604af290' is the initial commit of git
itself:
[https://github.com/git/git/tree/e83c5163316f89bfbde7d9ab23ca...](https://github.com/git/git/tree/e83c5163316f89bfbde7d9ab23ca2e25604af290)
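
The first identity is easy to check locally. Here is a minimal sketch with Python's hashlib; `git_blob_id` is a hypothetical helper name. It assumes the usual blob framing, where Git hashes a `blob <size>\0` header followed by the content, which is why an empty file in a repository does not get the da39... id.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 of content framed the way Git frames a blob: 'blob <size>\\0' + content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Plain SHA-1 of the empty string.
print(hashlib.sha1(b"").hexdigest())  # da39a3ee5e6b4b0d3255bfef95601890afd80709

# Object id of an empty blob, which differs because of the header.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```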

~~~
2zcon
>random three-letter combination that is pronounceable, and not actually used
by any common UNIX command.

>actually

Many native French speakers use 'actually' when they mean 'currently' because
of the 'actuellement' false-cognate. This looks like the same mistake but
neither Swedish nor Finnish have a word that looks like 'actually' when I
machine-translate 'currently'.

Any ideas?

~~~
mannykannot
I know nothing of Finnish, but, in poking around on Google translate, I found
'nykyinen', commonly translated as 'currently', but sometimes as 'existing'.
To rephrase the sentence to say "there is no existing use..." would be a
little awkward in English, but would convey the same message.

I felt that in this particular sentence, neither 'actually' nor 'currently'
are necessary, but to be sure I wanted to check the context, only to find that
this sentence is not currently to be found in the article.

~~~
chousuke
Finn here. I don't think the use of "actually" comes from any Finnish
expression specifically but it might be some sort of literary habit that stems
from the desire to emphasize how things turned out to be. It's somewhat common
in Finnish to say how things turned out, rather than that someone (or you)
made it so.

Thinking about it, I might've used the word in a similarly redundant fashion
myself occasionally.

------
namanaggarwal
What is that font?

~~~
apple4ever
I don't know but I don't like it, at least for a long form article.

------
xt00
The history of Git, told with a font whose only feature is making the letter
G unreadable.

~~~
hjek
[https://support.mozilla.org/en-US/kb/firefox-reader-view-clu...](https://support.mozilla.org/en-US/kb/firefox-reader-view-clutter-free-web-pages)

