Do not do this. This goes in the bucket of ideas-that-sound-good-but-aren't. If screen diffing is how you want to test your program, you should use or write an independent program to do so.
That said, there are some serious problems with screen diffing. First, it requires manual testing. You will need a pair of human eyes to determine whether a change is breaking or not. Second, there are a lot of false positives that make testing slow. When a few pixels move on the page, is that okay? Third, there's a _LOT_ of data you will need to look at before you can be confident about the diffs. You will need to take a screenshot of each page at each resolution for each git commit.
The signal-to-noise ratio is very low for screen diffing. You would be MUCH better served by using (in the case of web apps) Selenium WebDriver tests to make sure your page works. As a bonus, with automated tests you can use git bisect to quickly find the offending commit!
Yeah, we wrote Huxley to mitigate many of the GP's issues with UI testing by making it automatable and easy to bring up to date. http://github.com/facebook/huxley.
I think this is a great idea; it's what we do at Instagram and we use Huxley[0] to generate these screenshots. Keeping a set of "golden" screen shots and being able to view their history over time is awesome.
Wow, Huxley looks awesome for folks using Selenium! Looks like it can work with a list like in my article, but it can also record every click in a Selenium test. I also like the idea of including RMS distance in the diff -- makes milestone changes apparent without looking at each image. Thanks
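For anyone wondering what an RMS distance between two screenshots looks like in practice, here is a minimal sketch. It is not taken from Huxley; the function name `rms_distance` is hypothetical, and it assumes both images have already been decoded into flat, equal-length sequences of channel values (0-255) by whatever image library you use.

```python
import math


def rms_distance(pixels_a, pixels_b):
    """Root-mean-square difference between two flat sequences of channel values.

    Hypothetical helper: assumes both screenshots were decoded elsewhere
    into equal-length sequences of numbers in the range 0-255.
    """
    if len(pixels_a) != len(pixels_b):
        raise ValueError("screenshots must have the same number of values")
    if not pixels_a:
        return 0.0
    # Sum of squared per-value differences, averaged, then square-rooted.
    total = sum((a - b) ** 2 for a, b in zip(pixels_a, pixels_b))
    return math.sqrt(total / len(pixels_a))
```

A value of 0 means the two shots are identical; a large jump between consecutive commits would flag a milestone change without anyone opening the images.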
This is a good point. Screenshots are a nice visual way of being able to determine where something broke.
It's worth noting that this is also something Selenium does, so you could just add the screenshots from that layer of integration testing to version control and require these tests be done at each commit or merge. To justinjlynn's point, you should have a separate repo for this due to some of Git's struggles with extremely large blobs.
Just a helpful hint, in terms of authorship:
"Cut out all those exclamation points. An exclamation point is like laughing at your own jokes." —F. Scott Fitzgerald
I wonder if you could set up a parallel git repo that is kept in sync with the master repo via scripting. This way you don't mix the work product that changed (code) with the results of those changes (screenshots).
Alternatively, it would be cool if you could keep a repo with copies of every working version of the end software product, such as a binary file or live website, e.g. chrome checkout <commit-hash>
Right, this could grow quickly over a few months, at which point you would need to script filter-branch to rewrite older commits with spec/screenshots in a submodule.
In general, filter-branch should be avoided rather than part of a long-term plan. It can cause all kinds of havoc as soon as you move beyond a handful of collaborators.
I wonder if there would be some way to host the images externally (imgur, for example) and have a git hook to add the image links to a file that is version controlled. Depending on the markup, GitHub could pull the images in.
This is a good idea for a gem or SaaS. I generate screenshots on the CI server for one of my projects and put them into a Dropbox subfolder per commit hash too.
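A rough sketch of the subfolder-per-commit layout, in case it helps. The function name, the root folder, and the `.png` extension are all assumptions for illustration, not the actual setup described above.

```python
from pathlib import PurePosixPath


def screenshot_path(root, commit_hash, page_name):
    """Build the destination path for one screenshot.

    Hypothetical layout: <root>/<commit hash>/<page name>.png
    """
    return PurePosixPath(root) / commit_hash / f"{page_name}.png"
```

Keying the folder on the commit hash means any two commits can be compared by pairing files with the same name from two folders.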
All submodule updates will be reflected in the parent log. The screenshots are tied to commits, so you get per-shot history, diffs and know which commits changed which files. Git would not be more efficient at storing most compressed image formats than the filesystem.
Only slightly serious: You could generate a looping GIF of the two images with ImageMagick and let your eyes do the "hard" work of detecting any movement.
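Something like the following would build that command. This is a sketch only: the function name and file names are made up, and it assumes the classic ImageMagick `convert` binary is on the PATH (ImageMagick 7 installs usually call it `magick` instead).

```python
def flicker_gif_command(before, after, output, delay_cs=50):
    """Return an ImageMagick argv that alternates two images in a looping GIF.

    Assumption: the `convert` binary is available. -delay is given in
    hundredths of a second, and -loop 0 repeats the animation forever.
    """
    return ["convert", "-delay", str(delay_cs), "-loop", "0", before, after, output]
```

Pass the returned list to `subprocess.run(..., check=True)` to produce the GIF; anything that shifts between the two frames will appear to flicker.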