
In your first case, if you are running on Windows NT 6.1 using WebKit on a new browser for humans called 'Snakemaster Pro', then you aren't doing anything wrong.

If by client you mean a robot, then you are pretending to be a browser and you are accessing the service without permission.

Let me ask you a question: say your client was hitting my service with that user agent, 100 times a second, crawling through URLs sequentially. Let's say I added it to my robots.txt deny list and started blocking that user agent. Would you change the user agent and continue?
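Honoring that deny list is trivial on the client side, by the way. A minimal sketch with Python's stdlib robotparser (the URL and the UA string are just placeholders):

    import urllib.robotparser

    # Placeholder site; any robots.txt works the same way.
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()

    # A well-behaved client checks before every crawl run.
    allowed = rp.can_fetch("SnakemasterPro/1.0", "https://example.com/some/page")
    print(allowed)  # False once the UA is on the deny list

If can_fetch comes back False and your response is to swap UA strings, you already know you're accessing without permission.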

If someone creates a site that says, 'Access to this site is for 640x480 browsers only; any other use is forbidden,' then I think it's pretty clear that it's a stupid site, but also that faking your screen resolution is accessing the site without consent. There is no slippery slope here: someone (LinkedIn) putting explicit terms on their website is pretty clear.



Have you ever heard of "headless browsers" (like headless Chrome: https://github.com/dhamaniasad/HeadlessBrowsers/issues/37)? What are some defining characteristics of browsers that are absent in scraping clients? If I open a browser window while doing the scraping, is that acceptable?
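To make the point concrete, here's roughly what driving one looks like (a sketch assuming Selenium and a local Chrome install; the URL is a placeholder):

    from selenium import webdriver

    # Same Chrome, same rendering engine, same DOM -- just no window.
    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # plain "--headless" on older Chrome

    driver = webdriver.Chrome(options=options)
    driver.get("https://example.com")
    print(driver.title)  # exactly what a "real" browser would see
    driver.quit()

Delete the --headless argument and a window opens; nothing else about the session changes. So where exactly is the line?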


I very rarely use robots, and think I've only been "abusive" (not really abusive, in my book) once.

What if I send a null UA? Or use it as an opportunity to share my favorite quote?
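Concretely, with Python's requests library (example.com is a placeholder):

    import requests

    # An empty string sends a blank User-Agent header; None drops the
    # header entirely (requests strips None-valued headers on merge).
    for ua in ("", None):
        resp = requests.get("https://example.com", headers={"User-Agent": ua})
        print(repr(ua), resp.status_code)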

What if my software doesn't hammer your site like a robot, keeps the request volume reasonable (use whatever threshold you think is reasonable here), but also doesn't do what you'd expect a human clicking around to do?
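Something like this, say (the bot name, contact address, and URLs are all made up):

    import time
    import requests

    # Honest, contactable identity -- no pretending to be a browser.
    HEADERS = {"User-Agent": "MyResearchBot/0.1 (+mailto:contact@example.com)"}
    URLS = [f"https://example.com/page/{i}" for i in range(1, 6)]

    for url in URLS:
        resp = requests.get(url, headers=HEADERS, timeout=10)
        # ... process resp.text ...
        time.sleep(2)  # throttle to whatever delay you consider "reasonable"

No human clicks through pages 1 to 5 at exactly two-second intervals, but nothing here is abusive either.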



