
...which may be the reason why their demo screencast shows Zillow.

Yeah, the problem is that Zillow imposes IP bans on you when you've been found to be scraping their site.

Which is why any serious effort involves rotating pools of proxies.

Not just rotating pools of proxies but sometimes shady gray market residential proxies, so that you can appear to be coming from hundreds or thousands of unique, geographically distributed last-mile customer netblocks (DOCSIS3/ADSL2+/VDSL2/GPON/whatever).

If you want to go down a rabbit hole of shady proxies run on compromised/trojaned end user SOHO routers or PCs, google "residential proxies for sale".


Once worked for a place using this to scrape search engines.

It's amazing how easy and comparatively cheap it is to get access to thousands of residential IPs. Is it via spyware running on people's machines? Shady people working at ISPs doing nefarious things for cash? We never knew...

The key thing to know is that if you want your traffic to come from an IP "in" some other country (according to geolocation databases, anyway), it's really only a few bucks a month to get a proxy. Most of them have poor IP reputation, so they suck to use on Google, but they work very well for everything else out there...

> Is it via spyware running on people's machines? Shady people working at ISPs doing nefarious things for cash?

Might be as simple as https://hola.org/ & https://luminati.io/ - "unblock a website, download our VPN client", meaning you "unblock" by using somebody else's line. And they also sell access at Luminati. Most users aren't aware of the implications.

It's a combination of three general things:

a) the type of "services" luckylion mentions, where people have opted in to a shady gray market thing that resells proxies through their connection

b) compromised home routers/gateway devices/internet of shit devices

c) compromised home PCs (mostly Windows 7/10 trojans/botnets)

not that shady... luminati.io makes residential and mobile proxies a snap.

And IP tunneling...

Hello ALL social network folks who don’t know how spam was the Origin of social networks. (Fb, Friendster, hi5, blah blah blah)

Who the hell is documenting the history of the internet?

IP bans are simple to bypass.

Step 1) Invest money in non-Zillow real estate app

Step 2) Hammer Zillow with all known IP addresses

Step 3) Profit

Step 4) Friendly chats with FBI & SEC?

Most IP bans are only temporary.

I wonder how Spider Pro does with Facebook, LinkedIn, Whitepages and others that try their best to block scraping but still have an introductory free-to-view webpage...

Since this is designed for non-technical users and only scrapes content that's already been displayed to the user, I can't imagine many folks would use it in such a way that the sites could tell, unless the sites explicitly included a script to detect this scraper.
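
To make "only scrapes content that's already been displayed" concrete, here's a rough sketch of pulling text out of HTML the browser has already rendered, so no extra requests go to the site. This is not Spider Pro's actual implementation; the function name `extract_items`, the class name `listing`, and the sample markup are all made up for illustration, and it only uses Python's standard `html.parser`.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _ClassTextExtractor(HTMLParser):
    """Collect the text of every element carrying a given class name."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # > 0 while inside a matching element
        self.current = []   # text fragments of the element being read
        self.items = []     # finished, whitespace-normalised strings

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif self.target_class in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Collapse runs of whitespace left over from the markup.
            self.items.append(" ".join("".join(self.current).split()))
            self.current = []

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def extract_items(html, target_class):
    """Return the text of each element with `target_class` (hypothetical helper)."""
    parser = _ClassTextExtractor(target_class)
    parser.feed(html)
    return parser.items


# Hypothetical markup standing in for a page the user already has open.
rendered = """
<ul>
  <li class="listing"><b>12 Example St</b> <span>$500,000</span></li>
  <li class="listing">34 Sample Ave<br><span> $425,000</span></li>
  <li class="ad">Sponsored</li>
</ul>
"""

items = extract_items(rendered, "listing")
```

Because it works on markup that's already in hand, the server sees nothing beyond the normal page view, which is the point the comment above is making.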
