About a year ago I spent some time trying to get a web spider/scraper going. Downloading the data was simple enough, but actually putting it into a database that was domain-, content-, and time-aware turned out to be impressively complex, so I put it on the back burner.
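
To give a sense of what "domain-, content-, and time-aware" storage might involve, here is a minimal sketch, assuming SQLite and Python's standard library. The `snapshots` table, its column names, and the `open_store` and `record_fetch` helpers are all hypothetical names invented for illustration, not the setup described above. The idea is to tag each fetch with its host, a hash of the body, and a timestamp, and to write a new row only when the content has changed.

```python
import hashlib
import sqlite3
from urllib.parse import urlsplit

# Hypothetical schema: one row per distinct version of a page.
SCHEMA = """
CREATE TABLE IF NOT EXISTS snapshots (
    id           INTEGER PRIMARY KEY,
    domain       TEXT NOT NULL,   -- host part of the URL
    url          TEXT NOT NULL,
    content_hash TEXT NOT NULL,   -- SHA-256 of the body
    fetched_at   TEXT NOT NULL,   -- ISO 8601 UTC timestamp
    body         BLOB NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_snapshots_url_time
    ON snapshots (url, fetched_at);
"""


def open_store(path=":memory:"):
    """Open (or create) the snapshot database and ensure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_fetch(conn, url, body, fetched_at):
    """Store a snapshot unless the body matches the latest one for this URL.

    Returns True if a new row was written, False if the content was unchanged.
    `fetched_at` is assumed to be an ISO 8601 string in a single consistent
    format, so that string ordering matches time ordering.
    """
    domain = urlsplit(url).hostname or ""
    digest = hashlib.sha256(body).hexdigest()

    # Look up the most recent snapshot of this URL.
    row = conn.execute(
        "SELECT content_hash FROM snapshots "
        "WHERE url = ? ORDER BY fetched_at DESC, id DESC LIMIT 1",
        (url,),
    ).fetchone()
    if row is not None and row[0] == digest:
        return False  # same content as last time, nothing to store

    conn.execute(
        "INSERT INTO snapshots (domain, url, content_hash, fetched_at, body) "
        "VALUES (?, ?, ?, ?, ?)",
        (domain, url, digest, fetched_at, body),
    )
    conn.commit()
    return True
```

Calling `record_fetch(conn, url, body, timestamp)` after each download would then keep one row per distinct version of a page. Even this toy version ignores things a real crawler would have to handle, such as URL normalization, redirects, and near-duplicate content, which hints at where the complexity comes from.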