Looking at Table 2 http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183....
I used to work on a program that used IBM blades with 4x DDR InfiniBand as an MPI processing cluster. The cluster was significantly smaller than what Google is discussing, though.
Having this background gives you a much better idea of why the interconnect is going to be such a big problem going forward. When I was working with QDR IB 5 years ago, it didn't matter whether faster cards existed - the PCIe bus at the time couldn't carry anything faster. So you were literally riding the edge of the I/O capability all the way through the stack.
(At least, that was the case a few years ago. I don't know how much it has changed, but I would be surprised if they had totally overhauled things.)
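The back-of-the-envelope arithmetic bears this out. A rough sketch, using the standard published line rates for QDR InfiniBand and PCIe 2.0 (these figures are my assumptions, not from the paper) - both use 8b/10b encoding, so only 80% of the signaling rate is usable data:

```python
# Why a QDR IB adapter saturated its host slot: the 4x link and a
# typical PCIe 2.0 x8 slot land at the same usable data rate.
# Assumed figures (standard published rates, not from the parent paper):
#   QDR InfiniBand 4x: 40 Gbit/s signaling, 8b/10b encoding (80% usable)
#   PCIe 2.0:          5 GT/s per lane,     8b/10b encoding (80% usable)

def qdr_ib_4x_data_rate_gbps() -> float:
    """Usable data rate of a 4x QDR InfiniBand link, in Gbit/s."""
    return 40 * 8 / 10

def pcie2_data_rate_gbps(lanes: int) -> float:
    """Usable data rate of a PCIe 2.0 slot with the given lane count."""
    return lanes * 5 * 8 / 10

print(f"QDR IB 4x:   {qdr_ib_4x_data_rate_gbps():.0f} Gbit/s")
print(f"PCIe 2.0 x8: {pcie2_data_rate_gbps(8):.0f} Gbit/s")
```

Both work out to 32 Gbit/s before protocol overhead, so dropping in a faster card wouldn't have helped - it would just have moved the bottleneck squarely onto the bus.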