
Crawly – Never write another web scraper - rezist808
http://crawly.diffbot.com/
======
chejazi
0\. Arrived at site

1\. Entered HN URL

2\. Entered email reluctantly

3\. Checked email, nothing received

4\. Started writing comment on HN, frustrated

5\. Checked email again, still nothing

6\. Posted comment, disappointed

~~~
brianwawok
I think you just got harvested!

------
ki85squared
Is there a link to an overview or documentation that doesn't require forking
over an email address? Silly.

~~~
dwynings
Sure: http://www.diffbot.com/

------
DrScump
I got a reply within 3 minutes.

The email said, "We set our crawler loose on <site>, and WOW did we find some
interesting results."

The resulting CSV file? 0 bytes. I guess that's "interesting" in its own way.

~~~
dwynings
Sorry about that! If you let me know the result id, I can take a look at what
happened.

------
pink_dinner
This won't work when I need to scrape 100,000 pages in an hour.

~~~
cat-dev-null
AWS FTW

------
edoceo
CloudScrape works

------
cat-dev-null
www.ncbi.nlm.nih.gov };)

