
But why retroactively remove the data? The original owner was fine with it being archived, so why should the snapshot be deleted just because a completely different person wants their completely different website not to be crawled?



It's hard for a bot to understand concepts of 'owner' and 'completely different person' based on the data it has available. Companies can use this robots.txt feature to un-index old marketing content after a re-branding, for example. Or after an acquisition.


Sure, but surely the bot has timestamps saying "robots.txt allowed me to keep these documents the last time I spidered them". Why do they have to be retroactively removed? robots.txt only disallows spidering; it doesn't mandate deleting data you've already spidered.
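To make the distinction concrete, here's a minimal sketch using Python's standard urllib.robotparser of what robots.txt actually governs at crawl time; the site URL and the "ExampleArchiveBot" user agent are placeholders, not anything a real archiver uses:

    from urllib.robotparser import RobotFileParser

    # robots.txt is consulted at fetch time: it tells the crawler whether it
    # may request a URL *now*. It says nothing about data fetched earlier.
    rp = RobotFileParser()
    rp.set_url("https://example.com/robots.txt")  # placeholder site
    rp.read()

    url = "https://example.com/old-page.html"
    if rp.can_fetch("ExampleArchiveBot", url):    # hypothetical user agent
        print("Allowed to crawl", url)
    else:
        print("Skip crawling", url, "- previously saved copies are untouched")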


Because most of the problems come from people who want to hide old material that they didn't realize was being indexed. The automatic behavior is simple and easy to implement, and doesn't require any human intervention.
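A rough sketch of that automatic behavior, under the assumption of a hypothetical snapshot store: on each pass the archiver rechecks the site's current robots.txt and hides previously captured snapshots for paths that are now disallowed, with no human in the loop.

    from urllib.robotparser import RobotFileParser

    def apply_current_robots(snapshots, robots_url, user_agent="ExampleArchiveBot"):
        """Hide archived snapshots whose URLs the *current* robots.txt disallows.

        `snapshots` is a hypothetical list of dicts like
        {"url": ..., "captured_at": ..., "hidden": False}; a real archive's
        data model is certainly different - this only illustrates the policy.
        """
        rp = RobotFileParser()
        rp.set_url(robots_url)
        rp.read()

        for snap in snapshots:
            # Retroactive rule: current permissions override the permissions
            # that were in effect when the snapshot was captured.
            snap["hidden"] = not rp.can_fetch(user_agent, snap["url"])
        return snapshots

The trade-off discussed above is visible in the sketch: the check has no notion of who owned the domain when each snapshot was captured, so a new owner's robots.txt ends up hiding the old owner's pages.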



