Show HN: Craigslist web crawler example in python3 and docker-compose (github.com/estin)
8 points by etatarkin on March 2, 2016 | 3 comments


I wrote a basic Craigslist scraper earlier this month to show some statistics about the rental markets in different regions/neighborhoods, so I'm eager to check this out.[0]

Just out of curiosity, what is the purpose of this? Perhaps for setting up a secondary market with items from Craigslist?

[0]: https://github.com/brbsix/craigslist-rental-market


Be careful when using this: it violates the CL TOS[0], and they've been known to sue people who scrape their site[1].

Relevant excerpt from TOS:

“Any copying, aggregation, display, distribution, performance or derivative use of Craigslist or any content posted on Craigslist whether done directly or through intermediaries (including but not limited to by means of spiders, robots, crawlers, scrapers, framing, iframes or RSS feeds) is prohibited.”

[0] http://www.craigslist.org/about/terms.of.use

[1] http://arstechnica.com/tech-policy/2012/07/craigslist-sues-p...


You are right.

But this example is only for research purposes. The gathered data will be stored only for a short time and will not be published, and no personal information about users is gathered.

Also, the CL team allows any user agent to crawl the site, according to http://www.craigslist.org/robots.txt
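For anyone who wants to check a robots.txt rule set programmatically, here is a minimal sketch using Python's standard `urllib.robotparser`. The sample rules, the `example-bot` user agent, the `example.org` URLs, and the `is_allowed` helper are all assumptions made up for illustration; they are not the actual Craigslist file and not part of the linked project.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
# This is NOT the actual Craigslist robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /reply
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/search/apa", "/reply", "/private/x"):
        url = "http://example.org" + path
        print(path, is_allowed(SAMPLE_ROBOTS_TXT, "example-bot", url))
```

To check a live file instead of a pasted copy, `RobotFileParser` also provides `set_url()` followed by `read()`, which downloads and parses the file before `can_fetch()` is called.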

If somebody wants to use this scraper, they must ask the CL team for permission first.

Sorry for my poor English.



