
Unpacking Git packfiles - chimeracoder
https://codewords.recurse.com/issues/three/unpacking-git-packfiles/
======
chimeracoder
Author here. I discovered this while working on a clean-room implementation of
Git in pure Go. While there are a lot of references to packfiles online,
surprisingly, the actual format of packfiles was rather underdocumented. Most
resources just mention that they exist and describe how to use
`git verify-pack` to inspect a packfile, without explaining how to parse
packfiles and apply deltas.

I decided to write this up to save others the trouble of having to reverse-
engineer it from scratch!
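
For readers who want a feel for what "parsing a packfile" starts with, here is a minimal sketch in Python (not the Go code mentioned above). It assumes the commonly documented layout: a 12-byte header holding the `PACK` signature, a big-endian version, and a big-endian object count, followed by entries whose headers pack a 3-bit type and a variable-length size. The function names `parse_pack_header` and `parse_object_header` are made up for illustration.

```python
import struct

# Object type codes as used in pack entry headers (5 is reserved).
OBJ_TYPES = {1: "commit", 2: "tree", 3: "blob", 4: "tag",
             6: "ofs_delta", 7: "ref_delta"}


def parse_pack_header(data: bytes):
    """Hypothetical helper: return (version, object_count) from a packfile."""
    if len(data) < 12:
        raise ValueError("a packfile header is 12 bytes long")
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("missing PACK signature")
    if version not in (2, 3):
        raise ValueError(f"unsupported pack version {version}")
    return version, count


def parse_object_header(data: bytes, pos: int):
    """Hypothetical helper: return (type_code, size, next_pos) for one entry.

    The first byte carries a continuation bit, a 3-bit type, and the low
    4 bits of the inflated size; each following byte adds 7 more size bits.
    Delta entries (types 6 and 7) carry extra base information after this
    header, which this sketch does not read.
    """
    byte = data[pos]
    pos += 1
    type_code = (byte >> 4) & 0x07
    size = byte & 0x0F
    shift = 4
    while byte & 0x80:
        byte = data[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return type_code, size, pos
```

Calling `parse_pack_header` on the first 12 bytes of a `.pack` file and then `parse_object_header` at offset 12 gives the type and inflated size of the first entry; the zlib-compressed payload begins at the returned position.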

~~~
wereHamster
Uhm,
[https://github.com/git/git/blob/master/Documentation/technic...](https://github.com/git/git/blob/master/Documentation/technical/pack-format.txt) ?

~~~
chimeracoder
Yes, I link to a different version of that same file in the article (my link
points to the version hosted on kernel.org, rather than Github). It provides a
bit of high-level context, but by itself it doesn't provide enough detail to
actually reimplement the corresponding Git functions.

Aside from being more terse and (IMHO) more difficult to read than prose with
examples and non-ASCII diagrams, that file doesn't explain the context and
motivation for packfiles, and it doesn't cover the parsing and application of
deltas at all.
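
To make "application of deltas" concrete, here is a rough Python sketch of how a delta can be applied to a base object, as I understand the format. It assumes the delta starts with two little-endian base-128 sizes (base, then result), followed by instructions: a byte with the high bit set copies a range from the base (the low bits say which offset and size bytes follow), and a byte with the high bit clear inserts that many literal bytes. `read_varint` and `apply_delta` are illustrative names, not Git's.

```python
def read_varint(buf: bytes, pos: int):
    """Read a little-endian base-128 integer; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Hypothetical helper: rebuild the target object from base + delta."""
    base_size, pos = read_varint(delta, 0)
    result_size, pos = read_varint(delta, pos)
    if base_size != len(base):
        raise ValueError("delta was made against a different base")

    out = bytearray()
    while pos < len(delta):
        op = delta[pos]
        pos += 1
        if op & 0x80:
            # Copy instruction: bits 0-3 select offset bytes, bits 4-6 size bytes.
            offset = 0
            size = 0
            for i in range(4):
                if op & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):
                if op & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            if size == 0:
                size = 0x10000  # a zero size means 64 KiB
            if offset + size > len(base):
                raise ValueError("copy runs past the end of the base")
            out += base[offset:offset + size]
        elif op:
            # Insert instruction: the next `op` bytes are literal data.
            out += delta[pos:pos + op]
            pos += op
        else:
            raise ValueError("opcode 0 is reserved")

    if len(out) != result_size:
        raise ValueError("result size does not match the delta header")
    return bytes(out)
```

The size checks at both ends are cheap and catch most cases of applying a delta to the wrong base.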

~~~
ethomson
If you found that piece of documentation deficient while implementing a
packfile parser, then it would be nice to update it to include those details
that were lacking to help the next person to reimplement git.

~~~
chimeracoder
Thanks for the suggestion! I'll take a look at the contribution process for
the Git project.

------
lucb1e
This ("Unpacking Git packfiles") was a CTF challenge a few weeks ago (at the
Haxpo CTF in Amsterdam), except we weren't given the original repository, we
only got a pcap dump of the traffic. Using `git extract-objects` I was able to
unpack them into object files (stored in .git/objects/xx/*) but even these
were not readable. Eventually found some zpipe command that did the trick.
What a pain to do this with common tooling if you don't have the time to dive
into the format and write a real unpacker.

~~~
phaemon
Yes, the objects are zlib-compressed. You can view them with something like:

    cat .git/objects/c0/fb67ab3fda7909000da003f4b2ce50a53f43e7 \
        | zlib-flate -uncompress; echo
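
If `zlib-flate` isn't installed, the same thing can be sketched in a few lines of Python. This assumes the usual loose-object layout: after inflating, the data is a header of the form `<type> <size>` followed by a NUL byte and then the content, and the object ID is the SHA-1 of that whole inflated buffer (for SHA-1 repositories). `read_loose_object` is a made-up name for this example.

```python
import hashlib
import zlib


def read_loose_object(raw: bytes):
    """Hypothetical helper: return (type, content, object_id) for a loose object.

    `raw` is the compressed file content exactly as stored under .git/objects/.
    """
    data = zlib.decompress(raw)
    header, _, content = data.partition(b"\x00")
    kind, _, size = header.partition(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    # The object ID covers the header and content, not the compressed bytes.
    object_id = hashlib.sha1(data).hexdigest()
    return kind.decode("ascii"), content, object_id
```

To use it on a real repository, read the file in binary mode (for example `open(path, "rb").read()`) and pass the bytes in; the returned ID should match the directory name plus the file name.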

------
ah-
I wonder how the packing process works. How does it find pairs of objects that
compress well with delta encoding?

~~~
bicolao
[https://github.com/git/git/blob/master/Documentation/technic...](https://github.com/git/git/blob/master/Documentation/technical/pack-heuristics.txt)

More detail than that would require reading the source code.

