I think this is a reasonable concern about DHE, but this blog post was specifically intended to rebut it: they optimized DHE, switched to EC (faster, smaller keys), and managed to get it deployed across all of Google. The subtext is that one reason DHE isn't widely deployed is poor implementation, and they've taken a step towards correcting that.
The post made it sound like only the bug fixes are in 1.0.0e, and that the performance improvements may not have been accepted yet. I wonder if that is the case?
This is good to see, and I really appreciate the work Google has put into making TLS/SSL more pervasive.
I do wonder about the performance differences between their faster DHE implementation and what they were using before (RSA-RC4-SHA). I wish they had provided a bit of data on that.
DH over elliptic curves is definitely fast enough for general-purpose use, but on low-power devices it could be problematic. I think a better approach is the one I used in spiped: each connection can pick whether to do a DH computation and get forward secrecy, or to instead use a DH exponent of zero and still get regular secrecy.
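
To make the exponent-of-zero idea concrete, here is a rough sketch. It is not spiped's actual code or wire format: the toy group (`P`, `G`), the helper names, and the HMAC-based key derivation from a pre-shared key are all assumptions made for illustration. The point is only that an exponent of zero gives a public value of 1 and a shared value of 1, so no modular exponentiation work is needed and the session key then rests on the pre-shared key alone.

```python
import hashlib
import hmac
import secrets

# Toy group parameters, assumed for illustration only (far too small for real use).
P = 2**127 - 1  # a Mersenne prime
G = 3


def pick_exponent(forward_secrecy: bool) -> int:
    """Return a random private exponent, or 0 to skip the DH computation."""
    if not forward_secrecy:
        return 0
    return secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]


def public_value(x: int) -> int:
    """Value sent to the peer; equals 1 when x == 0."""
    return pow(G, x, P)


def session_key(psk: bytes, peer_public: int, x: int) -> bytes:
    """Mix the DH shared value into a key derived from the pre-shared key."""
    shared = pow(peer_public, x, P)  # 1 if either side chose exponent 0
    return hmac.new(psk, shared.to_bytes(16, "big"), hashlib.sha256).digest()


if __name__ == "__main__":
    psk = b"hypothetical pre-shared key"
    a = pick_exponent(forward_secrecy=False)  # low-power side opts out
    b = pick_exponent(forward_secrecy=True)
    key_a = session_key(psk, public_value(b), a)
    key_b = session_key(psk, public_value(a), b)
    print(key_a == key_b)
```

In this sketch, if either side opts out, both ends still agree on a key, but anyone who later learns the pre-shared key can recompute it, which is exactly the forward secrecy being given up. A real protocol would also mix in per-connection nonces and authenticate the exchanged values.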