Git tips from the trenches (ochronus.com)
160 points by ochronus on Feb 2, 2014 | 44 comments

My personal favorites:

> git-rerere

will prevent you from getting stuck resolving the same merge conflicts repeatedly, by remembering how you resolved them the last time.
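For anyone who wants to see rerere in action, here is a minimal sketch in a throwaway repository. The file name, branch name and commit messages are made up for illustration, and the sketch sets `rerere.enabled` explicitly rather than relying on whatever your configuration already has.

```shell
#!/bin/sh
# Sketch: resolve a conflict once, then let rerere replay the resolution.
set -eu
repo=$(mktemp -d)                  # throwaway repository (demo data only)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config rerere.enabled true     # set explicitly for this demo

echo "base" > f.txt
git add f.txt
git commit -qm "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo "from topic" > f.txt
git commit -qam "topic change"

git checkout -q "$main"
echo "from main" > f.txt
git commit -qam "main change"

# First merge conflicts: resolve by hand and commit, which records the resolution.
git merge topic >/dev/null 2>&1 || true
echo "resolved by hand" > f.txt
git add f.txt
git commit -qm "merge topic"

# Undo the merge and redo it: the same conflict recurs and rerere reuses the fix.
git reset -q --hard HEAD~1
git merge topic >/dev/null 2>&1 || true
replayed=$(cat f.txt)
echo "$replayed"
```

The second merge still stops so you can review the result; the file content is already the recorded resolution, and you only need to `git add` and commit it.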

Also, instead of passing any arguments to "git-log", I usually just use "tig", an ncurses display of the commits: https://blogs.atlassian.com/2013/05/git-tig/

Finally, when I was first taught Git, I was told that I should never need to comment out code again. I never understood how this was possible until I learned how to use 'git add -p' and 'git rebase -i' in tandem.

> Also, instead of passing any arguments to "git-log", I usually just use "tig", an ncurses display of the commits: https://blogs.atlassian.com/2013/05/git-tig/

+1 for tig. Definitely my favourite git interface. Its status view is also very handy for viewing diffs, and adding/removing files to the index, as well as reverting changes.

Also, it now has mouse support. I know this, cos I added it :)

> git-rerere

> will prevent you from getting stuck resolving the same merge conflicts repeatedly, by remembering how you resolved them the last time.

That's on by default these days; if you re-do a merge and git finds the conflict in rerere, it'll automatically use the saved resolution rather than inserting conflict markers.

> Also, instead of passing any arguments to "git-log", I usually just use "tig", an ncurses display of the commits: https://blogs.atlassian.com/2013/05/git-tig/

Agree, tig is awesome and in my opinion the best git GUI (or at least representation). Use the normal commandline tools for committing/cutting/rebasing/... and tig to see where everything is at!

My all-time favorite git tip is adding the '--graph' flag to git log, which will show the log with a branch graph. But while you're at it, might as well go all the way:

git log --graph --abbrev-commit --decorate --date=relative --format=format:'%C(bold blue)%h%C(reset) - %C(bold green)(%ar)%C(reset) %C(white)%s%C(reset) %C(dim white)- %an%C(reset)%C(bold yellow)%d%C(reset)' --all

(Kudos to the unremembered internet person who posted this in the first place)
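Nobody wants to type that every time, so one option is to store it as a git alias. A minimal sketch, assuming the alias name `lg` (my choice, not from the thread) and a scratch repository so it can be tried without touching your real config:

```shell
#!/bin/sh
# Sketch: save the long log invocation as an alias and try it in a scratch repo.
set -eu
repo=$(mktemp -d)                  # throwaway repository (demo data only)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# "lg" is a hypothetical alias name; stored in this repo's config only.
git config alias.lg "log --graph --abbrev-commit --decorate --date=relative --format=format:'%C(bold blue)%h%C(reset) - %C(bold green)(%ar)%C(reset) %C(white)%s%C(reset) %C(dim white)- %an%C(reset)%C(bold yellow)%d%C(reset)' --all"

echo one > file.txt
git add file.txt
git commit -qm "first commit"
git checkout -q -b side
echo two >> file.txt
git commit -qam "side commit"

graph=$(git --no-pager lg)
printf '%s\n' "$graph"
```

For everyday use you would probably pass `--global` to `git config` so the alias is available in every repository.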

I think we have benefitted from the same stackoverflow thread =) http://stackoverflow.com/questions/1057564/pretty-git-branch...

I'd like to mention a feature that's been making my life oh so much easier:

  git bisect
If you're ever looking for "the commit that broke feature XYZ", git bisect is your trusty minion.

Git bisect is wonderful. TortoiseHg also has a bisect command in the Workbench. (But command line hg doesn't seem to have it.)

Here's a tip for anyone using bisect with a Rails app (or any system that uses database migrations):

I do a rake db:migrate after a pull to make sure the database schema is up to date. So the first time I used git bisect on a Rails app, I figured I would use db:migrate after each bisect step.

On this project, as in many Rails projects, schema.rb is checked into git along with the migration files.

Somewhat mysteriously, after some of the bisect steps and db:migrate operations I was seeing an uncommitted schema.rb that was different from the one in git. That shouldn't have happened, I thought.

I finally figured out what perhaps should have been obvious: db:migrate was working fine when bisect moved forward in the history, but it was messing up when bisect moved backward across a migration.

Rails experts will know what went wrong here: After bisect moved backward, the db:migrate had no way of knowing which database changes to reverse, because the migrations that were now in the "future" relative to the latest checkout were removed before I ran the migrate.

I fixed it by using db:reset instead of db:migrate, at least after bisect moved backward in time.

Another bisect tip: Sometimes it can be advantageous to run the equivalent of a bisect manually instead of automatically. For example, you may have some pretty good hunches about which commits may or may not be related to the problem. Or you may want to do a db:migrate before moving back in time to avoid the problem I described above, and by doing the checkouts manually you will know which migrations to reverse before doing the checkout.

In my case, I was looking at a somewhat intermittent bug. As I narrowed down the possible bad commits I wanted to keep track of the specific ones I'd looked at, and git bisect doesn't do this for you. So each time I checked out a different commit I created a local branch first on the commit I'd determined to be (likely) good or bad, giving them names like Good1, Good2, Bad1, Bad2, etc.

This way I could just look at the history as I went along and see at a glance which commits I'd investigated so far.

I guess this trick would work with git bisect as well - just create a local branch before each bisect step.

There may be some better way to do this, but it worked pretty well for me.

BTW, I use SmartGit/Hg and really like it. I know everyone likes their command lines, but I greatly prefer a visual way of working with a source code repo.

If you need to do more fiddling to get things working in each bisected commit, you can launch the process manually with 'git bisect start', then pass in refs to 'git bisect bad' and 'git bisect good' to indicate a known-bad and a known-good commit.

At that point, git will check out the "middle" revision and you can do whatever's needed to decide if you should mark it as either good or bad with 'git bisect [good|bad]'. This will check out the new "middle" - lather, rinse, repeat until you get down to one commit.

When you're done, 'git bisect reset' will take you back to the present.

While this isn't quite as quick as the automated version, it's a lifesaver when tracking down a regression in a library whose dependencies have changed radically (finding what b0rked your tests between Rails 3.0 and 4.0, for instance).

> TortoiseHg also has a bisect command in the Workbench. (But command line hg doesn't seem to have it.)

It does, since v1.0 (so about 6 years ago): http://www.selenic.com/mercurial/hg.1.html#bisect

As is usually the case, it's not active by default.

> Rails experts will know what went wrong here: After bisect moved backward, the db:migrate had no way of knowing which database changes to reverse, because the migrations that were now in the "future" relative to the latest checkout were removed before I ran the migrate.

And you can't even do that automatically, because git doesn't have a pre-checkout hook (assuming bisect uses checkout internally, which I'm not certain of)

> $ git branch --merged | xargs git branch -d

  does this actually work? `git branch --merged` for me also 
  returns '* master', which when xarg'd would include the '*'
  and also 'master'. I don't want to delete master, and I
  certainly don't want to delete * if it globs.
EDIT: made this into a code block as I have no idea how to make asterisks display.

Yeah, didn't think about that, could be tricky if you're on another branch. Nice catch!

I created an alias that prevents the branch you are currently on, or master, from being deleted.

alias gclean='git remote prune origin; git branch --merged | grep -v -E "(\*|master)" | xargs -n 1 git branch -d'

I'm not sure, but I can tell you how I'd find out. I'd clone the repo to a different directory and try it out in that test environment. Then delete it when I'm done. #gitFTW

It'll be fine, git will refuse to delete the branch you're on and the loop will continue:

error: Cannot delete the branch 'master' which you are currently on.
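In the spirit of trying it in a test environment first, here is a minimal sketch of the filtered cleanup in a scratch repository. The branch names are made up, and the filter is my variation on the alias above: it drops the line marked with `*` and a line that is exactly `master`.

```shell
#!/bin/sh
# Sketch: delete merged branches while skipping the current branch and master.
set -eu
repo=$(mktemp -d)                  # throwaway repository (demo data only)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo a > a.txt
git add a.txt
git commit -qm "initial"
current=$(git symbolic-ref --short HEAD)

git branch done-feature            # same commit as HEAD, so it counts as merged
git checkout -q -b wip-feature     # gets an extra commit, so it is not merged
echo b > b.txt
git add b.txt
git commit -qm "work in progress"
git checkout -q "$current"

# Drop the "* current" line and a bare "master" line before deleting.
git branch --merged | grep -v -E '^\*|^ +master$' | xargs -n 1 git branch -d

remaining=$(git for-each-ref --format='%(refname:short)' refs/heads | sort | xargs)
echo "remaining branches: $remaining"
```

Because `git branch -d` (lowercase) refuses to delete unmerged branches anyway, the filter is mostly about avoiding the stray `*` argument and the error noise.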

Worth it just for this:

    git rev-list --all | xargs git grep '<YOUR REGEX>'

"git log -S" and "git log -G" should also be helpful here.
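A minimal sketch comparing the two approaches on a toy history, where the string `needle` is added in one commit and removed in the next. File name, string and commit messages are invented for the demo.

```shell
#!/bin/sh
# Sketch: pickaxe search (git log -S) versus grepping every commit's tree.
set -eu
repo=$(mktemp -d)                  # throwaway repository (demo data only)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "alpha" > notes.txt
git add notes.txt
git commit -qm "add alpha"
echo "needle" >> notes.txt
git commit -qam "add needle"
echo "alpha" > notes.txt
git commit -qam "remove needle"

# -S lists commits that changed the number of occurrences of the string.
pickaxe=$(git log -S'needle' --format=%s)
printf '%s\n' "$pickaxe"

# The rev-list pipeline greps the tree of every commit instead.
hits=$(git rev-list --all | xargs git grep -l 'needle' | wc -l | tr -d ' ')
echo "commits whose tree contains the string: $hits"
```

The pickaxe answers "which commits introduced or removed this", while the rev-list pipeline answers "in which snapshots is it present", so they tend to complement each other.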

`git grep` is such a weird command: it defaults to being pointlessly redundant with `grep` (with a few default ignores). Contrast with hg grep, which defaults to searching the whole history (although it stops after finding the historical first match, `--all` will display all matches).

"git grep" is a lot faster than grep in a large codebase. One obvious reason is that "git grep" ignores non-checked-in files in the project directory. But I also notice a speed difference even when the project directory is clean.

That's because git-grep doesn't need to bother to go through the filesystem for the on-disk files. It greps directly from the object storage instead.

And AFAIK it can run the grep in parallel. Practically any non-archaic machine has multiple cores now, so git-grep can easily be faster than regular grep from command line.

By default grep is going through all of the .git directory, which is the part `git grep` filters out. `ag` also filters them out (and is only slightly slower than git grep by default, with all the colors and stuff), or you can tell grep to only check relevant files with e.g. `grep $PATTERN $(git ls-tree --full-tree --name-only -r HEAD)`.

On my machine, using postgres's repository, I get the following:

    > time git grep foo > /dev/null                                                           
    0.22s user 0.25s system 151% cpu 0.312 total
    > time ag foo > /dev/null                                                                     
    0.85s user 0.19s system 174% cpu 0.596 total
    > time grep foo $(git ls-tree --full-tree --name-only -r HEAD) > /dev/null                    
    0.13s user 0.10s system 93% cpu 0.255 total
grep's faster than git grep. In fact, grep is already as fast as git grep just ignoring .git:

    > time grep foo -r . --exclude-dir=.git > /dev/null
    0.15s user 0.16s system 91% cpu 0.338 total

You need to take the effect of the page cache into account. Since you are not clearing the page cache after each test, the test after it benefits from the contents. So running 'git grep' first disadvantages it, compared to everything else.

I ran a test on a large repository and here are my results. The repository was Hadoop, and is available from git://github.com/apache/hadoop-common.git.

  cmccabe@keter:~/hadoop4> du -cksh .
  375M    .
  375M    total
  sudo -- sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
  cmccabe@keter:~/hadoop4> /usr/bin/time git grep 'class TestDefaultNameNodePort'                                                                              
  hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDefaultNameNodePort.java:public class TestDefaultNameNodePort {
  0.11user 0.34system 0:00.74elapsed 61%CPU (0avgtext+0avgdata 60512maxresident)k
  260256inputs+0outputs (19major+9718minor)pagefaults 0swaps

  sudo -- sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
  cmccabe@keter:~/hadoop4> /usr/bin/time grep -rI --exclude .git 'class TestDefaultNameNodePort' *
  hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDefaultNameNodePort.java:public class TestDefaultNameNodePort {
  0.13user 0.56system 0:02.40elapsed 29%CPU (0avgtext+0avgdata 5584maxresident)k
  252792inputs+0outputs (2major+414minor)pagefaults 0swaps
So you can see that git grep is faster, even when grep excludes the .git directory.

Running grep a second time without clearing the cache gives a bogus result:

  cmccabe@keter:~/hadoop4> /usr/bin/time grep -rI --exclude .git 'class TestDefaultNameNodePort' *
  hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDefaultNameNodePort.java:public class TestDefaultNameNodePort {
  0.03user 0.04system 0:00.08elapsed 97%CPU (0avgtext+0avgdata 5584maxresident)k
  0inputs+0outputs (0major+416minor)pagefaults 0swaps
This is because the data is all in the page cache at that point, so we're not actually accessing the disk.

I was curious about the true source of the speedup, and so I checked the output of the 'perf' tool. git had 1,922 CPU migrations, whereas grep had 52. Following up on this, you can see that git is spawning a bunch of threads, whereas grep only has one thread.

  cmccabe@keter:~/hadoop4> strace -f -e trace=clone git grep 'class TestDefaultNameNodePort' 2>&1 | grep -c '] +++ exited with '
  cmccabe@keter:~/hadoop4> strace -f -e trace=clone grep -rI --exclude=.git 'class TestDefaultNameNodePort' *  2>&1 | grep -c '] +++ exited with '
I also think git might be cheating and using a simpler regex engine than grep, but at this point I got bored. Case closed.

One that I use the whole time is:

git status --untracked=no

This shows only files that are tracked. I tend to do a lot of work which leaves files that I don't want to check in lying around in my git repo; this hides them and lets me see exactly what I have been working on. The slight caveat is that a new file you have never checked in will not show up either.

This goes hand in hand with:

git add -u

Which only stages changes to files that are already tracked. With aliased commands this usually results in a flow like this:

> git stu

> git adu
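Here is a minimal sketch of that flow in a scratch repository. The definitions of `stu` and `adu` are my guess at what those aliases expand to (the comment does not show them), and `--untracked-files=no` is the spelled-out form of the flag used above.

```shell
#!/bin/sh
# Sketch: status and add limited to tracked files, via two hypothetical aliases.
set -eu
repo=$(mktemp -d)                  # throwaway repository (demo data only)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Assumed definitions for the "stu" and "adu" aliases mentioned above.
git config alias.stu "status --short --untracked-files=no"
git config alias.adu "add -u"

echo "version one" > tracked.txt
git add tracked.txt
git commit -qm "initial"

echo "version two, with more text" > tracked.txt   # change to a tracked file
echo "scratch" > scratch.txt                       # new file, never added

shown=$(git stu)                   # lists tracked.txt only
git adu                            # stages tracked.txt, leaves scratch.txt alone
staged=$(git diff --cached --name-only)
untracked=$(git ls-files --others --exclude-standard)
echo "shown: $shown"
echo "staged: $staged"
echo "still untracked: $untracked"
```

This also shows the caveat mentioned above: `scratch.txt` never appears in the short status and never gets staged until it is added explicitly.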

Is there a reason for not adding such files to your .gitignore?

I'm a neat-freak when it comes to file placement, and have a ~/tmp directory specifically for dropping one-off files without cluttering up repos and such.

Are the things your repo isn't tracking mostly temporary/build files? or things that are going to be checked in eventually?

My git shortcuts (in my bash_profile):

alias g='git'

alias pull='git pull origin master'

alias push='git push origin master'

alias gc='git commit -m'

alias ga='git add -A'

# Adds all, commits, and pushes (gg 'Commit message')

function gg () {

  git add -A;

  git commit -m "$1";

  git push origin master;

}

alias gs='git status'

alias gd='git diff --color'

alias gdc='git diff --cached'

alias gstat='git diff --stat'

I don't like the idea of your gg function, but I add files to the index using git add -p exclusively, since it gives me a chance to review my changes as I'm adding them and lets me split different logical changes into separate commits. Likewise, I think git commit -m encourages lazy git commit messages (see: http://tbaggery.com/2008/04/19/a-note-about-git-commit-messa...)

Mostly because this helps me keep a clean history, which is something I value highly.

It's a git extension, sure, but git-wtf[0] is indispensable for getting a high-level overview of your feature branches.

[0]: http://git-wt-commit.rubyforge.org/

I liked this one:

    git blame -w
Ignore the whitespace if a block was [un]indented, attribute to original author instead of person re-indenting due to control structure change around block.
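A minimal sketch of the difference, using a one-line file that one author writes and another re-indents. The author names and file are invented for the demo.

```shell
#!/bin/sh
# Sketch: plain blame credits the re-indenter, blame -w credits the original author.
set -eu
repo=$(mktemp -d)                  # throwaway repository (demo data only)
cd "$repo"
git init -q
git config user.name "Demo Committer"
git config user.email "committer@example.com"

printf 'do_work()\n' > job.txt
git add job.txt
GIT_AUTHOR_NAME="Alice" GIT_AUTHOR_EMAIL="alice@example.com" \
  git commit -qm "add job"

printf '    do_work()\n' > job.txt          # whitespace-only change
GIT_AUTHOR_NAME="Bob" GIT_AUTHOR_EMAIL="bob@example.com" \
  git commit -qam "re-indent"

plain=$(git blame --line-porcelain job.txt | sed -n 's/^author //p')
ignoring_ws=$(git blame -w --line-porcelain job.txt | sed -n 's/^author //p')
echo "plain blame: $plain"
echo "blame -w:    $ignoring_ws"
```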

The truth is that command-line blame is a pain in the ass. Blame is great for finding out where a piece of code comes from, but (especially on big projects with many contributors) there's a lot of small churn with code being moved around, slightly altered, etc… CLI blame requires an extensive dance of "blame to find out last change before $(rev:tip)", "log with patch to find out if it's the change I'm looking for", "get previous revision", rinse and repeat until step 2 yields a match.

And for all the GUIs available, most of them either don't even touch blame, or just return the output of the CLI blame command. The only GUI I've seen so far which makes it easy to traverse history through blame is for bazaar (qblame/qannotate).

SmartGit/Hg has a great interactive blame display. You can hover over commits to see the commit message, click on them to go back, etc. If you put the focus on the "View commit" select, then you can use the up and down arrows to go through the commits. You can also select different previous commits for the comparison window. Very easy to use and flexible.

IntelliJ IDEA also has a pretty nice blame panel. It's not as full-featured as SmartGit, but it's right there in the editor for any file with a quick Alt-G-N.

I would definitely recommend trying out SmartGit.

`git gui blame` is actually pretty great. You can click around to parent commits and everything.

tig's blame is great, cursor through the file to any line and hit B to jump the whole file back in time to the last previous commit that changed that line; repeat as needed.

My fav: "gitbr" alias, which goes:

for k in `git branch -a|perl -pe s/^..//`;do echo -e `git show --pretty=format:"%Cgreen%ci %Cblue%cr%Creset" $k|head -n 1`\\t$k;done|sort -r
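A similar listing (branches sorted by most recent commit) can probably be had without the loop by using `git for-each-ref`. This is a sketch of an alternative, not a drop-in copy of the alias above: there are no colors, and the branch names and dates are invented for the demo.

```shell
#!/bin/sh
# Sketch: list branches newest-first by committer date with git for-each-ref.
set -eu
repo=$(mktemp -d)                  # throwaway repository (demo data only)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo a > a.txt
git add a.txt
GIT_COMMITTER_DATE="2014-01-01 12:00:00 +0000" git commit -qm "old work"

git checkout -q -b newer
echo b >> a.txt
GIT_COMMITTER_DATE="2014-02-01 12:00:00 +0000" git commit -qam "new work"

# Local and remote-tracking branches, most recently committed first.
listing=$(git for-each-ref --sort=-committerdate \
  --format='%(committerdate:iso8601) %(committerdate:relative) %(refname:short)' \
  refs/heads refs/remotes)
printf '%s\n' "$listing"

newest=$(printf '%s\n' "$listing" | head -n 1 | awk '{print $NF}')
echo "most recently updated branch: $newest"
```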

I recommend you check out http://seasonedgit.com/

Thanks for sharing. Git has some osom hidden commands!


Absolutely wonderful article, but am I the only one who was confused when there was no relation to the gittip service (http://www.gittip.com)?

"After a few years with git everyone has his own bag o' tricks..."

So "everyone" (every developer, ever) is a "he"? I'm not here to bash the author, but remarks like this (whether or not they're intentional) really don't help make the field of software engineering, which is already quite the boys' club, any more accessible to non-male people.

Actually you're right. Nice heads up, of course I meant no harm, I fixed it. Just for the record I'm organizing a programming course for girls :)

English is not his first language. No need to write a novel about it.

So two thoughtful sentences now count as a novel? I should really look into getting published.

