This is going to be invaluable for information retrieval researchers.
Google, MSR, and Yahoo! have a research edge over universities because of the large amount of data they collect from their users; all the other institutions are left with either small benchmark datasets or synthetic data, which are usually not representative of actual usage scenarios. I myself had to synthesize a query log from the Wikipedia request logs to test some of my data structures on large-scale data.
I expect to see a huge number of papers using these data in their experiments in the near future. Thanks, Blekko!
This is our first donation. We have a lot more we plan on giving, but for user queries, for example, the privacy issues are a lot more difficult to work through. We have no interest in being the next privacy scandal.