Chris Farmer at General Catalyst is pretty open about his data-driven investment approach. He is doing a big data play, mining source code repositories, LinkedIn profiles, CrunchBase, etc. He wrote up a little slice of his research here: http://techcrunch.com/2011/05/25/top-10-vc-firms-investorran...
I know that SVAngel keeps pretty good stats on their investments, and relies on a good bit of data when making them.
A number of top-tier VC firms I talked to this past summer are actively building systems to do more data-driven investing.
Brendan Baker's Anatomy of Seed study is one of the best data-driven studies of early-stage investment I know of. He is working with a grad student to replicate the study to see if his conclusions are reproducible. ref ( http://www.quora.com/Brendan-Baker/Anatomy-of-Seed-An-Inside... )
BlackBox.vc did the Startup Genome Project, and is actively pursuing early-stage investments based on that data. http://blackbox.vc/