A simple fix would be to just crawl the links without the request parameters, so that we don't have to suffer.
The advantage of the query-string method is that you can just find one suitable (i.e. huge) file and force Google to pull it down many times.
Maybe Google should consider putting a bandwidth limiter of some sort on that (or even better: use hashes to avoid duplicates), but I think screaming "security! vulnerability!" is not a good action to take here...
Rate limit per website (e.g. don't download more than 10 images per domain per second; a rough sketch of this is below).
Limit the total number of images it downloads per document, so a single user cannot cause too much traffic.
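For the per-domain limit, something as small as a sliding-window counter would do. A rough sketch in Python (the 10/sec figure is just the example number from above, and the class and names are made up for illustration, not anything Google actually runs):

    import time
    from collections import defaultdict

    class DomainRateLimiter:
        """Allow at most max_per_second fetches per domain in any one-second window."""

        def __init__(self, max_per_second=10):
            self.max_per_second = max_per_second
            self.recent = defaultdict(list)  # domain -> timestamps of recent fetches

        def allow(self, domain):
            now = time.monotonic()
            # Keep only the fetches from the last second.
            window = [t for t in self.recent[domain] if now - t < 1.0]
            if len(window) >= self.max_per_second:
                self.recent[domain] = window
                return False  # over budget: skip or re-queue this fetch
            window.append(now)
            self.recent[domain] = window
            return True

The fetcher would call allow(domain) before each image request and drop or delay anything over the budget; the same idea keyed per spreadsheet covers the second suggestion.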
(I know that servers can be configured not to send ETags, or to break caching by sending random ones every time, but this could reduce the data usage considerably, since most of the responses would only include the headers.)
If we ignore that ETags are tied to URLs rather than 'files', the ETag approach suggested by userbinator might work for some cases. But if the large file is dynamically generated, it's unlikely to have an ETag at all, and many servers default to deriving the ETag from the file's inode rather than from the file's contents, so multiple servers behind a load balancer are likely to return different ETags for the same file.
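To make the bandwidth argument concrete, a conditional re-fetch looks roughly like this (Python with the requests library; the URL is a placeholder). When the validator still matches, the server answers 304 with headers only, no body:

    import requests

    url = "http://example.com/big-image.jpg"  # placeholder

    # First fetch: download the body and remember the validator, if any.
    first = requests.get(url)
    etag = first.headers.get("ETag")

    # Later fetches: ask the server to skip the body if nothing changed.
    if etag:
        later = requests.get(url, headers={"If-None-Match": etag})
        if later.status_code == 304:
            print("not modified: headers only, no body transferred")
        else:
            print("changed, or ETag not honoured: full body downloaded again")

Which is exactly where the inode-default and load-balancer problem bites: each backend hands out a different ETag for the same file, so the 304 path never triggers.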
"It's taking a while to calculate formulas. More results may appear shortly."
I set the spreadsheet document to load images like so:
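Roughly like this (the host and file name here are placeholders; the only thing that matters is the query string changing from cell to cell so every URL is unique):

    =image("http://example.com/large-file.jpg?r=1")
    =image("http://example.com/large-file.jpg?r=2")
    =image("http://example.com/large-file.jpg?r=3")

and so on down the column, one formula per cell.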
After 30 or so images, Google starts to slow down its fetch rate.
It wouldn't be too hard to block it by User-Agent ("Mozilla/5.0 (compatible) Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)") if you notice the traffic.
Feedfetcher does not fetch robots.txt though, so you'd have to do something in your server config instead.
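For example, assuming nginx, a couple of lines in the relevant server block would do it (Apache has equivalent SetEnvIf/Require directives):

    # Refuse anything identifying itself as Feedfetcher.
    if ($http_user_agent ~* "Feedfetcher-Google") {
        return 403;
    }

Just be aware you'd also be blocking legitimate feed fetches if you serve RSS/Atom from the same host.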
[edit: fixed a typo, and agree with the update]
I would hope that Google is able to detect abuse of their infrastructure for (D)DoS attacks.
I don't think removing the parameters would be ideal, though, since some sites might legitimately serve up different images based on different parameters.
Just limiting the amount of traffic to a single server, or outbound from a single spreadsheet, seems like a good solution, though.
Doesn't that somewhat defeat the purpose of a dynamic image?
Moreover, the issue is not about the feature itself; it's whether Google should limit the number of requests made per image. From other comments it seems like Google is hitting each image hundreds or even thousands of times. I suspect this is for caching? If that's the case, Google should look at a better way to handle it: a single fetch, propagated to the closest zone, should be enough. But this is not a reason to limit the feature (i.e. by stripping the query parameters).
You see a surprisingly large number of Excel-based bill templates, and you may want to hotlink the company logo or a signature.