"We also test our generators on a monthly cycle to test for failures."
Curious - when "testing" them, how long do you run them for and at what load?
I could see the beancounters being _very_ unhappy with the ops people saying "we want to run both gen sets at full datacenter load for more than 10 minutes at a time, every month", which is what Amazon would have to have done to detect the faulty cooling fan problem. I'm guessing there are _some_ organisations who do that, but I suspect most datacenters don't.
I work for the Fed. We are a remote site and have several power outages each year due to trees on the power lines or snow-related issues. It's pretty much required, and we've had zero issues justifying it.
The baseline generator maintenance cost exceeds the fuel cost every month. A bigger issue is getting permits from your local/moronic government to run your generators for testing, but for a big datacenter, you're probably in an industrial neighborhood (to get correct power from multiple substations or higher) and this isn't an issue -- it's more an issue with office-datacenters or other backup systems in normal residential/commercial neighborhoods.