I was reading the Slashdot article on Google's new "deep search" (http://tech.slashdot.org/tech/08/04/16/2052206.shtml), where the crawler submits forms and sees what results come back. One user posted a quite insightful and interesting anecdote:
http://tech.slashdot.org/comments.pl?sid=525058&cid=23096424
When I interned at Google, someone told me a funny anecdote about a guy who emailed their tech support insisting that the Google crawler had deleted his web site. At first, I think he was told "Just because we download a copy of your site doesn't mean your local copy is gone." (à la the obligatory bash.org quote.) But the guy insisted, and finally they double checked, and his site was in fact gone. Turns out it was a home-brewed wiki-style site, and each page had a "delete" button. The only problem was that the "delete" button sent its query via GET, not POST, and so the Google spider happily followed those links one by one and deleted the poor guy's entire site. The Google guys were feeling charitable and so they sent him a backup of his site, but told him he wouldn't be so lucky the next time, and that he should change any forms that make changes to POSTs -- GETs are only for queries.
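To make the GET-versus-POST rule concrete, here is a minimal sketch of both patterns. It uses Flask and an in-memory dict purely for illustration; the route names and storage are my own invention, not how the guy's wiki actually worked.

# Minimal sketch of the pattern in the anecdote above (Flask, hypothetical
# routes, in-memory storage standing in for the wiki's real backend).
from flask import Flask, abort, request

app = Flask(__name__)
pages = {"home": "Welcome", "todo": "Buy milk"}  # stand-in for the wiki pages

# The broken pattern: a plain link like <a href="/delete?page=todo">delete</a>
# is fetched with GET, so any spider that follows links will trigger it.
@app.route("/delete")
def delete_via_get():
    pages.pop(request.args.get("page", ""), None)
    return "deleted"

# The fix the Google folks suggested: anything that changes state goes behind
# POST. Well-behaved crawlers only follow links (GET requests), so a POST-only
# delete can't be triggered just by crawling the site.
@app.route("/pages/<name>/delete", methods=["POST"])
def delete_via_post(name):
    if name not in pages:
        abort(404)
    del pages[name]
    return "deleted"

if __name__ == "__main__":
    app.run()

On the HTML side, the safe version means a small form with method="post" and a submit button instead of a bare delete hyperlink; links are always fetched with GET, which is exactly why the spider could wipe the site just by crawling it.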
So, long story short, I wonder how Google will avoid more of this kind of problem if they're really going off the deep end and submitting random data to random forms on the web. Like the guy above, many people won't have designed their sites with such a spider in mind, and whatever their lack of foresight, Google could burn a lot of goodwill if it does this carelessly.
For example, they told the guy he was lucky they were willing to give him a backup, but it seems to me that Google is the one that should be taking responsibility for its actions.
It's a short jump from "you have to use POST or we'll delete your stuff" to "you have to follow Google standard X or we won't index your site."
Welcome to the first web empire.