
Ask HN: What’s the legality of web scraping? Part 2 - backend-dev-33
Part 1: https://news.ycombinator.com/item?id=20256681 (closed to new comments)

What (legal?) trick allows search engines to crawl (well, we know that "crawl" is a synonym of "scrape") and index content protected by terms of use? Is it "fair use" or something else?

One example: Craigslist!

In their terms of service:

> USE. Unless licensed by us in a written agreement, you agree not to use or provide software (except general purpose web browsers and email clients) or services that interact or interoperate with CL, e.g. for downloading, uploading, creating/accessing/using an account, posting, flagging, emailing, searching, or mobile use. You agree not to copy/collect CL content via robots, spiders, scripts, scrapers, crawlers, or any automated or manual equivalent (e.g., by hand).

On the other hand:
https://www.google.com/search?q=site%3Asfbay.craigslist.org+couch&oq=site%3Asfbay.craigslist.org+couch

Google is able to index CL, and you can query the Google index restricted to a single CL city and see the ads. We also know Google makes money from it (advertising, for example).

I cannot imagine Google obtaining a "written agreement" from CL ))
======
gitgud
Crawling is not entirely synonymous with scraping. Crawling web pages is
usually done for the specific purpose of search and categorisation, whereas
_scraping_ is much more general: data mining, mirroring, getting around
official API limits, etc.

Google may have a written agreement with Craigslist; they're both enormous
companies...

Finally, as others have said, it's a legal grey area. It's not completely
clear, and it basically depends on which websites you're scraping, how you use
the data, and why...

Maybe it's best to just ask the website?

------
backend-dev-33
And here is Part 1 as a link:
[https://news.ycombinator.com/item?id=20256681](https://news.ycombinator.com/item?id=20256681)

