A site can be crawled from any number of Googlebot IP addresses, so blocking all but one doesn't help throttle crawling.
If you verify the site in Webmaster Tools, we have a tool you can use to set a slower crawl rate for Googlebot, regardless of which specific IP address ends up crawling the site.
Let me know if you need more help.
Edit: Detailed instructions to set a custom crawl rate:
1. Verify the site in Webmaster Tools.
2. On the site's dashboard, the left-hand menu has an entry
called Site Settings. Expand that and choose the Settings submenu.
3. The page there has a crawl rate setting (last one). It defaults to
"Let Google determine my crawl rate (recommended)". Select
"Set custom crawl rate" instead.
4. That opens a form where you can choose your desired crawl rate in crawls per second.
If there is a specific problem with Googlebot, you can reach the team as follows:
1. To the right-hand side of the Crawl Rate setting is a link called
"Learn More". Click that to open a yellow box.
2. In the box is a link called "Report a problem with Googlebot" which
will take you to a form you can fill out with full details.
Crawl-Delay is (in my opinion) not the best measure. We tend to talk about "hostload," which is the inverse: the number of simultaneous connections that are allowed.
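For reference, the robots.txt knob most crawlers expose for this is Crawl-delay; as far as I know Googlebot ignores it (the Webmaster Tools setting above is the supported way), but other crawlers such as Bing and Yandex honor something like:

    User-agent: *
    # ask the crawler to wait about 10 seconds between requests
    Crawl-delay: 10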
A few years ago, I did pretty much the same thing myself. Thankfully the late summer was our slow season and the site recovered pretty quickly from my bone-headed move, but the split second after I realized what I'd done was bone-chilling.
I think just about everyone has thought at some point that they understood how something worked, only to have had things go pear-shaped on them.
The lesson: people are not fully knowledgeable about everything, even the smart and talented ones.
"You would not believe the sort of weird, random, ill-formed stuff that some people put up on the web: everything from tables nested to infinity and beyond, to web documents with a filetype of exe, to executables returned as text documents. In a 1996 paper titled "An Investigation of Documents from the World Wide Web," Inktomi Eric Brewer and colleagues discovered that over 40% of web pages had at least one syntax error".
We can often figure out the intent of the site owner, but mistakes do happen.
If you're writing HTML, you should be validating it: http://validator.w3.org/
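If you want to check pages from a script rather than the web form, the validator can (last I checked) be queried directly by URL; the example.com address here is just a placeholder:

    curl "http://validator.w3.org/check?uri=http%3A%2F%2Fexample.com%2F"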
Is there any real downside to having syntax errors?
Obviously, that's not a problem if you already know exactly how different browsers will treat your code, or you're relying on parsing errors so elementary that every browser has to patch them up identically for the page to work. For example, on the Google homepage, they don't escape ampersands that appear in URLs (like href="http://example.com/?foo=bar&baz=qux"; the & should be written as &amp;). That's a syntax error, but one that maybe 80% of the web commits, so any browser that couldn't handle it wouldn't be very useful.
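To make that concrete, here are the two forms side by side (same example URL as above):

    <!-- what the Google homepage (and much of the web) actually serves: -->
    <a href="http://example.com/?foo=bar&baz=qux">link</a>

    <!-- what a validator wants to see: -->
    <a href="http://example.com/?foo=bar&amp;baz=qux">link</a>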
Anyhow, one downside to having syntax errors might be that parsers which aren't as clever as those in web browsers, and which haven't caught up with the HTML5 parser standard, might choke on your page. This means that crawlers and other software that might try to extract semantic information (like microformat/microdata parsers) might not be able to parse your page. Google probably doesn't need to worry about this too much; there's no real benefit they get from having anyone crawl or extract information from their home page, and there is significant benefit from reducing the number of bytes as much as possible while still remaining compatible with all common web browsers.
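As a rough illustration (not any particular crawler's code), here's how a strict XML-style parser trips over the same unescaped ampersand a browser silently repairs; this is just Python's standard library parser:

    import xml.etree.ElementTree as ET

    # The unescaped & is fine in every browser but is not well-formed XML.
    page = '<p><a href="http://example.com/?foo=bar&baz=qux">link</a></p>'
    try:
        ET.fromstring(page)
    except ET.ParseError as err:
        print("strict parser gave up:", err)  # e.g. "not well-formed (invalid token)"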
I really wish that HTML5 would stop calling many of these problems "errors." They are really more like warnings in any other compiler: there is well-defined, sensible behavior for them specified in the standard. In most cases there is no real guesswork on the part of the parser, no point where the user's intentions are unclear and the parser just has to make an arbitrary choice and keep going (the unclosed center tag is an exception, since an unclosed tag for anything but the few tags where that's valid can indicate an authoring mistake). Many of the "errors" are stylistic warnings, saying that you should use CSS instead of the older presentational attributes, but all of the presentational attributes are still defined and will be indefinitely, as no one can remove support for them without breaking the web.
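For example, the presentational markup the validator warns about and the CSS it prefers both still render fine in every browser (a rough equivalence, not an exact one):

    <!-- old presentational attributes/elements: flagged, but still supported -->
    <center><font color="red">Hello</font></center>

    <!-- the CSS version the warnings push you toward -->
    <div style="text-align: center; color: red;">Hello</div>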
There is no reason to allow most of these errors other than coding sloppiness.
The web would have been stillborn; it would never have grown to where it is now.
"Be generous in what you accept" (part of Postel's Law) is a cornerstone of what made the internet great.
XHTML had a "die upon failure" mode, and it has died. Why do you think XHTML was abandoned and so many people are using HTML5 now?
The irony of that statement on Hacker News is pretty amazing. Have you looked at how the threads are rendered on this page? It's tables all the way down.
Maybe instead that hostload could be specified in robots.txt? It sure seems like the better mechanism to tweak for load issues (while traffic/bandwidth issues would still be unresolved).
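Something like this, purely hypothetical since no crawler defines such a directive today (the name Host-load is made up here):

    User-agent: Googlebot
    # hypothetical: allow at most 2 simultaneous connections from this crawler
    Host-load: 2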
Another thing that might help Google is to announce and support some meta tag that would allow site owners (or web app devs) to declare how likely a page is to change in the future. Google could store that with the page metadata, and when crawling a site for updates, particularly when rate-limited via Webmaster Tools, it could first crawl the pages most likely to have changed. Forum/discussion sites could add the meta tag to older threads (particularly once they're no longer open for comments), announcing to Google that those thread pages are unlikely to change in the future. For sites with lots of old threads (or lots of pages generated from data stored in a DB, not all of which can be cached), that sort of feature would help the site during Google crawls and would help Google keep more recent pages up to date without crawling entire sites.
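As a sketch of what that could look like (the tag name and values are made up; nothing like this is currently recognized by Google):

    <!-- hypothetical: tell crawlers this archived thread is effectively frozen -->
    <meta name="change-likelihood" content="never">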
I believe you can already do that using a sitemap.xml.
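For what it's worth, the sitemaps protocol already defines <lastmod> and <changefreq> for roughly this purpose (the URL and date below are just examples):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://example.com/forum/thread-12345</loc>
        <lastmod>2009-08-01</lastmod>
        <!-- thread is closed, so it's unlikely to change again -->
        <changefreq>never</changefreq>
      </url>
    </urlset>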