
BFG Repo-Cleaner: an alternative to git-filter-branch - davidgerard
https://rtyley.github.io/bfg-repo-cleaner/
======
tlb
I used this yesterday to move 400 MB of binary files from a repo on github to
git-lfs. Works great.

The workflow was roughly:

    
    
       - cp *.mov (and a few other large blob types) ~/tmp
       - git rm *.mov
       - git commit
       - git lfs track '*.mov'
       - git add .gitattributes
       - git commit; git push
    

In a fresh directory:

    
    
      - git clone --mirror $remote; cd repo.git
      - bfg --delete-files '*.mov'
      - git reflog expire --expire=now --all && git gc --prune=now --aggressive
      - git push
    

Back to my src directory:

    
    
      - mv repo repo.bloated
      - git clone $remote; cd repo
      - cp ~/tmp/*.mov . 
      - git add *.mov  (it now puts them in lfs)
      - git commit; git push
    

Kind of a chore to figure out, but now my repo is small and zippy.

Tip: do this on a cloud machine instead of your laptop, since most of the time
is pulling/pushing data to github.
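
To confirm the cleanup actually paid off, one option is to compare the on-disk object size before and after. A minimal sketch, assuming only plain git is installed; `repo_size_kib` is a hypothetical helper name, not something git or BFG provides:

```shell
# Print the total size of a repo's object store in KiB
# (loose objects plus packfiles), as reported by `git count-objects -v`.
# The repo path defaults to the current directory; bare repos work too.
repo_size_kib() {
    git -C "${1:-.}" count-objects -v |
        awk '$1 == "size:" || $1 == "size-pack:" { total += $2 }
             END { print total + 0 }'
}

# Usage (directory names follow the workflow above):
#   repo_size_kib repo.bloated
#   repo_size_kib repo
```

Running it on `repo.bloated` and on the fresh clone should show the difference, provided the old clone has not been garbage-collected in the meantime.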

------
jleader
I used BFG to migrate several moderately large repos (~10K commits in the
largest) from GitHub Enterprise (which didn't have a file size limit) to a
private GitHub.com account. GitHub.com quite reasonably won't allow any commit
containing a file larger than 100MB, and warns about any file larger than
50MB. Someone had foolishly committed things like demo videos and very large
PDF documents early in the projects' histories. I couldn't get git-filter-
branch to run without errors; BFG worked perfectly (and quickly) with no
errors and lots of reporting about what exactly it had done.
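
Before stripping anything, it can help to see which blobs in history are over the limit. A rough sketch using git plumbing; `list_large_blobs` is a hypothetical helper, and the 100 MiB default is an assumption about how the limit is counted:

```shell
# List every blob reachable from any ref whose size is at least
# the given number of bytes (default: 100 MiB), largest first.
# Output format: <size in bytes><TAB><path>
list_large_blobs() {
    local min_bytes="${1:-104857600}"
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$min_bytes" '
            $1 == "blob" && $2 + 0 >= min + 0 {
                # Everything after "<type> <size> " is the path.
                print $2 "\t" substr($0, length($1) + length($2) + 3)
            }' |
        sort -rn
}

# Usage, inside a clone:
#   list_large_blobs              # blobs >= 100 MiB
#   list_large_blobs 52428800     # blobs >= 50 MiB
# BFG can then drop them by size instead of by name, e.g.:
#   bfg --strip-blobs-bigger-than 100M repo.git
```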

The only difficulty was that we were a Perl shop, running on fairly old
CentOS, and as a Scala app, BFG required installing a fairly new JDK, which
wasn't a muscle-memory level task for us.

------
Xorlev
BFG is amazing for helping you switch a repo from private to public,
especially when you might have been lazy early on and kept credentials in your
repo, even if you shouldn't have. You can replace your known credentials with
REDACTED throughout history without compromising the rich history your git
commits (should) record about the evolution of a piece of code over time.

IMO, your commits are just as important as the code itself for understanding
how and why something came into being.
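
For reference, the credential scrubbing is done with BFG's `--replace-text` option, which reads one expression per line from a file. A minimal sketch; the secret values, the file name and the `make_replacements` helper are all made up for illustration:

```shell
# Write a replacements file for `bfg --replace-text`.
# Each line is a literal string to find; "==>" gives the replacement.
# A line without "==>" is replaced with BFG's default, ***REMOVED***.
make_replacements() {
    cat > "$1" <<'EOF'
hunter2==>REDACTED
s3cr3t-api-key==>REDACTED
EOF
}

# Usage, against a fresh mirror clone (repo name is hypothetical):
#   make_replacements replacements.txt
#   bfg --replace-text replacements.txt repo.git
#   cd repo.git
#   git reflog expire --expire=now --all && git gc --prune=now --aggressive
```

Note that BFG leaves the contents of the latest commit alone by default, so the credentials need to be removed from the current files in an ordinary commit first. Anything that was ever pushed should also be treated as leaked and rotated.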

------
baldfat
Please tell me that BFG is a direct reference to Doom's ultimate weapon, the
in/famous BFG.

~~~
jleader
It's also a permutation of the initials of the similar built-in command, "git
filter-branch".

------
mikegioia
Does anyone know if it's possible to change email addresses on past commits
with BFG? I have a lot of commits pushed by user "Mike Gioia <none@none>" that
I would love to correct the email address for.

~~~
taylorbuley
That's actually a pretty straightforward operation with `git filter-branch`:
http://stackoverflow.com/a/4494037/317937

In my experience, BFG is mostly useful for removing large files, where the
bash command you need to translate data per commit can be very complex.
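
A sketch of the usual `--env-filter` approach, in case the link goes stale. `fix_author_email` is a hypothetical wrapper and the addresses in the usage lines are placeholders; try it on a throwaway clone first:

```shell
# Rewrite author and committer email on every commit where it
# matches the old address. Usage: fix_author_email OLD NEW
fix_author_email() {
    local old="$1" new="$2"
    # OLD_EMAIL/NEW_EMAIL are passed through the environment so the
    # single-quoted filter script can read them.
    OLD_EMAIL="$old" NEW_EMAIL="$new" git filter-branch -f --env-filter '
        if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]; then
            GIT_AUTHOR_EMAIL="$NEW_EMAIL"
            export GIT_AUTHOR_EMAIL
        fi
        if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]; then
            GIT_COMMITTER_EMAIL="$NEW_EMAIL"
            export GIT_COMMITTER_EMAIL
        fi
    ' --tag-name-filter cat -- --branches --tags
}

# Usage, inside a clone:
#   fix_author_email "none@none" "you@example.com"
#   git push --force --tags origin 'refs/heads/*'
```

Every rewritten commit gets a new hash, so the result has to be force-pushed. Pulling or merging from the old remote first brings the original commits back alongside the rewritten ones.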

~~~
mikegioia
Thanks, Taylor. I'll try this on a cloned repo. However, I remember trying
something vaguely similar and it copied every single commit. When I checked it
on github, I had duplicates of everything :(

Hopefully this doesn't do that.

