

Git Source Code Review - laurent123456
http://fabiensanglard.net/git_code_review/index.php

======
zvrba
This is not a "source code" review. There are no comments on the overall
practices that developers must deal with when writing C: resource management,
string handling, general error handling and its coverage (e.g., is EVERY
system call checked for errors?), possible cross-platform issues, unspoken
assumptions about implementation-defined behavior, etc.

~~~
lambda
This is a series of articles, and it is not yet complete. It starts from a
high-level description of the organization of the project and goes into more
detail as it proceeds; notice that at the end of the
third article there is a "Next: To be published: Git internal algorithms (Tree
and Diffcore API)". I presume there will be more after that, though the author
doesn't provide an overview of what he expects the full series to consist of.

Also, I don't believe this is intended to be a review looking at coding
practices that may cause problems, but rather a review in the sense of a
description of a real-world codebase that describes how it works. Think of it
in the sense of a review of the literature, a summary of what a body of work
is about, not in the sense of a code review that is looking for potential
problems that may need to be fixed.

As far as the issues you bring up go, Git is pretty thoroughly cross-platform
if you assume a reasonable Unixy/POSIXy platform; while it can be made to work
on Windows, it's a bit more cumbersome there; you need to use something that
provides at least something of a Unix-like environment on Windows, such as MSYS
or Cygwin.

As far as resource management and so on go, one of the big problems with the
Git code base is that it's designed pretty heavily around a one-shot command
model, rather than something which is amenable to being used as a library or
in a long-running process. This is the reason why the libgit2 project exists, to
provide an implementation of Git that can be used as a library and integrated
into other applications. Even Linus himself, who wrote the initial Git
implementation, uses libgit2 when he wants something that can be used as a
library[1].

[1]:
[https://plus.google.com/+LinusTorvalds/posts/X2XVf9Q7MfV](https://plus.google.com/+LinusTorvalds/posts/X2XVf9Q7MfV)

------
doesnt_know
Nice write up. I'm surprised there is so much use of shell scripts. I usually
do my best to avoid writing them and use them for little more than "stringing
commands together". I guess not everyone shares my dislike for shell scripting
and its syntax.

Going back to the introduction, this comment:

"Linux kernel 3.10 release had 15,803,499 lines of code !"

This is probably a stupid question, but what's going to happen to the large
low-level projects in the future when these devs have retired? Is the industry
creating enough young talent to take over the low-level stuff?

I personally went through Uni without having to write a single line of C. Even
with those that go through a Computer Science (instead of Development focused)
degree, how many actually leave and choose C for their personal projects or
use it in the industry?

~~~
brudgers
The shell can be a disproportionately powerful programming environment.

    
    
        $> foo bar &
        $> foo baz &
    

parallelizes `foo` with a minimum of technical debt compared to a lower level
approach. Perhaps the most famous example in all hackerdom illustrating the
power of the shell was the Knuth-McIlroy affair.

[http://www.leancrew.com/all-this/2011/12/more-shell-less-
egg...](http://www.leancrew.com/all-this/2011/12/more-shell-less-egg/)
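
As a rough sketch of the `&` idiom above: the two background jobs only become useful once the script waits for them and collects their results. Here `slow_square` is a hypothetical stand-in for `foo`, and the scratch-file layout is an assumption made for illustration, not anything from the thread.

```shell
#!/bin/sh
# Sketch: run the same task on two inputs at once, then collect results.
# `slow_square` is a hypothetical stand-in for `foo` in the comment above.

slow_square() {
    # Pretend to do expensive work, then print the square of $1.
    sleep 1
    echo $(( $1 * $1 ))
}

run_in_parallel() {
    # Scratch directory holding one result file per job.
    tmpdir=$(mktemp -d) || return 1

    # Start both jobs in the background; each writes to its own file.
    slow_square "$1" > "$tmpdir/a" &
    pid_a=$!
    slow_square "$2" > "$tmpdir/b" &
    pid_b=$!

    # `wait PID` blocks until that job exits and returns its exit status.
    wait "$pid_a" || { rm -rf "$tmpdir"; return 1; }
    wait "$pid_b" || { rm -rf "$tmpdir"; return 1; }

    # Print the results in argument order, then clean up.
    cat "$tmpdir/a" "$tmpdir/b"
    rm -rf "$tmpdir"
}

run_in_parallel 3 4
```

Both jobs sleep at the same time, so the call takes about one second rather than two. Writing each job's output to its own file keeps the results from interleaving.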

~~~
yp_maplist
"The shell can be a disproportionately powerful programming environment."

Then how do you explain why other languages are so much more popular?

Not only that, but how do you explain why the market for software developers
compensates those with experience in these other languages more than those few
developers who are highly competent in writing portable shell scripts?

Relative to what I have seen written by other developers, I consider myself a
competent shell user.

Practice helps. On average I write or revise more than one new script per day.

And I've been doing this for years. The number of shell scripts I have written
numbers in the thousands.

Is there a place where shell scripting is valued on par with the trendy
languages like Python, Ruby, etc.?

~~~
pjc50
Power is ill-defined and not related to popularity, which is in turn only
lightly linked to market value.

(Arguably the most powerful language by weight is APL, which has a tiny but
dedicated community)

Shell is very good at a narrow range of tasks involving text and file
manipulation, but has a couple of crippling limitations: whitespace in filenames
(especially, god help you, newlines) destroys many casually written scripts,
and the only structure really supported by the shell utils is newline-
delimited.
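
A small sketch of the filename problem described above, and one common way around it: let the shell's glob produce the names and quote every expansion, rather than parsing newline-delimited output from `ls`. The helper name `count_files` and the demo file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: count the regular files in a directory even when their names
# contain spaces or newlines. `count_files` is a hypothetical helper.

count_files() {
    dir=$1
    n=0
    # The glob expands to exactly one word per file, however odd the name.
    # (Dot files are not matched by `*`.)
    for f in "$dir"/*; do
        # Quoting "$f" stops the shell from word-splitting the name.
        [ -f "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Demo with deliberately awkward names.
demo=$(mktemp -d)
touch "$demo/plain" "$demo/with space" "$demo/$(printf 'with\nnewline')"
count_files "$demo"
# A line-based count such as `ls "$demo" | wc -l` may report more entries
# than there are files, because one of the names spans two lines.
rm -rf "$demo"
```

The same idea extends to `find`, where `-exec ... {} +` hands names to a command as arguments instead of as lines of text.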

~~~
yp_maplist
I love APL!

I am a student of k and now q.

The biggest problem of the shell is the rules around quoting. That includes
whitespace but also many other snafus.

As for all the utilities being geared toward line-by-line (newline delimited),
this is true.

But quoting can be learnt with practice; I rarely have problems because it is
second nature.

And, there's a shell utility called lex. It lets you design your own utilities
(filters).

And you can create filters that read multiple lines, easy as pie. (Easier than
mastering awk.)

You can even create your own programming language by combining lex with
another standard utility: yacc. This is how C was created.

Do they still force CS students to learn about these utilities?
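
To make the quoting point at the top of this comment concrete, here is a minimal sketch of one rule that trips people up: `"$@"` passes arguments through unchanged, while an unquoted `$*` splits them again on whitespace. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch of one quoting rule. `count_args` and the two wrappers are
# hypothetical helpers, not part of any real tool.

count_args() {
    # Print how many arguments were received.
    echo "$#"
}

forward_quoted() {
    count_args "$@"     # each original argument stays one word
}

forward_unquoted() {
    count_args $*       # arguments are split again on whitespace
}

forward_quoted "two words" single      # prints 2
forward_unquoted "two words" single    # prints 3
```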

------
wirrbel
Sadly the source code review stays far away from the source code for most
lines.

I would have liked to see examples on how algorithms are implemented in git
and general notions on git's coding style, testing?? and other interesting
stuff that one might find in one of the landmark open source projects.

------
canadev
Nice intro, I thought it was a bit short though. I was thinking, "OK, now
we're going to get into it..." and then it ended.

~~~
michaelfeathers
I'm still not sure why it took an immediate digression into editors.

------
dugmartin
I built this tool a while ago to help me browse source code but haven't done
anything with it yet:

[http://sherlockcode.com/](http://sherlockcode.com/)

Here is the demo - a now older version of the jQuery source code:

[http://sherlockcode.com/demos/jquery/#!/src/core.js](http://sherlockcode.com/demos/jquery/#!/src/core.js)

Hover over a variable to see all instances of that variable highlighted and
click on a variable to see all uses of that variable across all the files. You
can also bookmark lines by clicking on the line number.

~~~
mateuszf
This is very nice. I wish GitHub had something like this built-in.

~~~
dugmartin
This comment has gathered more interest than the Show HN post I did about it
months ago. I'm thinking of running the top 100 projects on GitHub through it
this week and seeing if it has "legs".

------
zachinglis
I'm confused. He talks about how he reviews the code. I expect some thoughts
about it. Then he talks about how he's switching to vim and it's weird? Then
he links to Documentation?

~~~
sjtrny
You're supposed to click the link to take you to the next page.

~~~
laurent123456
Yes, I missed that link too the first time; he really should make it more
obvious that there's a part 2 and part 3.

------
kranner
Somewhat related: I've been using an iOS app called NapCat for reading code
from GitHub repos for pleasure and edification:

[https://itunes.apple.com/app/napcat-github-client-for-
open/i...](https://itunes.apple.com/app/napcat-github-client-for-
open/id606238223?mt=8)

The 'trending' and keyword search features make it stand out from other GitHub
clients on iOS. No affiliation.

~~~
peachepe
I want this for Android :D

------
Myrmornis
To address something a commenter said I'd just like to point out that whatever
Linus may think, a "git" in British slang is an unpleasant person, not a
stupid person.

~~~
RyanZAG
From Wikipedia

> Torvalds has quipped about the name git, which is British English slang
> roughly equivalent to "unpleasant person". Torvalds said: "I'm an
> egotistical bastard, and I name all my projects after myself. First 'Linux',
> now 'git'." The man page describes git as "the stupid content tracker".

~~~
Myrmornis
Oops ok, sorry Linus! I didn't mean to suggest that you had applied "git"
inaccurately to yourself. I guess it's Merriam-Webster that is one source of
the incorrect definition.

[http://www.merriam-webster.com/dictionary/git](http://www.merriam-webster.com/dictionary/git)

~~~
Myrmornis
And the downvotes are because...the weak attempt at humor I suppose? HN
moderators -- a useful feature would be being able to annotate a downvote with
a reason, so that commenters can learn where they erred.

The Merriam-Webster reference is pertinent in a subthread which is about the
meaning of the word "git" (yes, not germane to the article, but to criticize
that you would downvote the comment at the root of the subthread.) So all
these downvotes for one good-natured sentence?

------
mercurial
I was surprised to see no mention of Perl, but looks like it's only used for
tools like git-svn.

~~~
boklm
It's used for more than git-svn: git-add--interactive.perl,
git-cvsexportcommit.perl, git-cvsserver.perl, git-relink.perl, git-svn.perl,
git-archimport.perl, git-cvsimport.perl, git-difftool.perl, git-send-email.perl

There are actually more lines of Perl than shell if we exclude tests:

    
    
      -------------------------------------------------------------------------------
      Language                     files          blank        comment           code
      -------------------------------------------------------------------------------
      C                              340          20596          18649         135900
      Perl                            43           4698           4310          27503
      Bourne Shell                    77           2523           1843          18766
      C/C++ Header                   140           2636           4635          11132
      -------------------------------------------------------------------------------
      SUM:                           600          30453          29437         193301
      -------------------------------------------------------------------------------
    

If we don't exclude tests:

    
    
      -------------------------------------------------------------------------------
      Language                     files          blank        comment           code
      -------------------------------------------------------------------------------
      C                              341          20602          18656         135910
      Bourne Shell                   761          21736           6908         124222
      Perl                            47           4728           4325          27739
      C/C++ Header                   140           2636           4635          11132
      -------------------------------------------------------------------------------
      SUM:                          1289          49702          34524         299003
      -------------------------------------------------------------------------------
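
For anyone who wants a quick approximation of the with-tests and without-tests comparison, here is a rough sketch using only `find` and `wc`. It counts raw lines and does not separate blank, comment, and code lines the way the tables above do, so the numbers will not match. `count_lines` is a hypothetical helper, and the demo tree is made up; in Git's own tree the tests live under `t/`.

```shell
#!/bin/sh
# Rough sketch: count all lines in files matching a name pattern under a
# tree, optionally skipping one subdirectory (for example the tests).
# `count_lines` is a hypothetical helper.

count_lines() {
    root=$1 pattern=$2 skip=${3:-}
    if [ -n "$skip" ]; then
        # Prune the skipped directory, then concatenate every match.
        find "$root" -path "$root/$skip" -prune -o \
            -type f -name "$pattern" -exec cat {} +
    else
        find "$root" -type f -name "$pattern" -exec cat {} +
    fi | wc -l | tr -d ' '   # strip the padding some `wc` versions add
}

# Demo on a tiny made-up tree with one script and one test.
demo=$(mktemp -d)
mkdir "$demo/t"
printf 'a\nb\n' > "$demo/git-foo.sh"
printf 'c\n' > "$demo/t/t0000-basic.sh"
count_lines "$demo" '*.sh'      # prints 3
count_lines "$demo" '*.sh' t    # prints 2
rm -rf "$demo"
```

Run from the top of a checkout, `count_lines . '*.perl' t` and `count_lines . '*.sh' t` would give comparable raw totals for the two languages.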

~~~
js2
You missed a big one - gitweb.

------
officialjunk
If the author is reading this, there's a typo in the article: wbout

