
Ask HN: How do you earn a living from scraping in 2018? - hoerzu
I have been scraping housing markets, and LinkedIn paired with hunter.io to create mailing lists of companies. Recently I started using MITM and Burp to intercept apps and get direct API access. I created a Docker container that uses different Tor endpoints for different IPs. I even used SSH proxying to create my own IP network.

To bypass reCAPTCHA I built a container with Chrome + Puppeteer, with anticaptcha preinstalled and pre-logged-in.

Yet I wonder if it is still worth it. Kinda looking for a mentor or inspiration.

I've been a long-time HN reader and I wonder what you think. Is scraping still worth it in 2018?
======
mrtnmcc
Don't be too discouraged by the negative comments. Half of ML and optimization
algorithms is pulling in valuable data. Companies with data monopolies could
help innovation by sharing it with others more creative than them.

Stick diverse data feeds in a neural network and try to predict earnings for
companies, energy demand or something else valuable to know in advance? Build
off some of the ideas that Google trends/insights offer?

Always thought someone should put together a 'bittorrent client for scraping'
where you can make large distributed queries as long as you service others.

------
dangerface
I used to do a lot of scraping too, but now most things have a decent API, so I
rarely need to. I would think that machine learning's need for lots of data
just creates a larger market for data. Look at releasing an API for all the
data you scrape and selling access to that with a focus on ML.

Things I think are worth $$$

Scrape anything difficult to scrape.

Scrape things that change and keep historical data.

Make a generic scraper and generic api.
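
A minimal sketch of the "keep historical data" idea, assuming pages are already fetched as text: store a new snapshot only when the content hash changes. The table layout and the function names (`open_store`, `record_snapshot`, `history`) are made up for illustration.

```python
import hashlib
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) a SQLite store for page snapshots."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        " url TEXT, fetched_at REAL, digest TEXT, body TEXT)"
    )
    return conn


def record_snapshot(conn, url, body, fetched_at=None):
    """Store body only if it differs from the latest snapshot for url.

    Returns True when a new row was written, False when nothing changed.
    """
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT digest FROM snapshots WHERE url = ? ORDER BY rowid DESC LIMIT 1",
        (url,),
    ).fetchone()
    if row is not None and row[0] == digest:
        return False
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?, ?)",
        (url, time.time() if fetched_at is None else fetched_at, digest, body),
    )
    conn.commit()
    return True


def history(conn, url):
    """Return (fetched_at, body) pairs for url, oldest first."""
    return conn.execute(
        "SELECT fetched_at, body FROM snapshots WHERE url = ? ORDER BY rowid",
        (url,),
    ).fetchall()
```

Calling `record_snapshot` on every crawl keeps the table small, since unchanged pages add no rows, while `history` gives the change log that is hard to get anywhere else.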

------
fabiandesimone
I can think of a few ways (not sure all within TOS), not sure if helpful but
still worth a shot (maybe):

Build a site for people wanting to buy a car and know what's the avg asking
price of models with the same characteristics.

For example: say I want to buy a Freelander for max 10K. I'll go and check all
freelanders and quickly notice that most in that range are from the year 2010
and 2012. Kilometers differ, guarantees, etc.

For me at least, I go into each and mentally account for what's important and
compare to decide which ones I'll contact.

I could use a service that would weigh all the variables and basically say to
me: these 3 offer the best value for the money and within your price range.

======

If you are able to build email lists, I'll just leave this here: you can
upload Lookalike Audiences to Facebook to advertise to people similar to
your email list.

I'm sure you can see the possibilities here. Sorry I don't go into details as
I'm mobile.

~~~
perl4ever
I used to make charts off of a used car buying site, to show price vs. mileage
for every car of a particular type and year I was interested in.

The advantage of this, if you pick something common with hundreds of listings,
is you could easily derive a relationship between the two variables and see if
a given ad was above or below average.
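
A rough sketch of that price-vs-mileage relationship, assuming a plain least-squares line is good enough for one model and year. The listings below are made-up numbers, and `fit_line` and `price_gap` are names invented for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def price_gap(listing, slope, intercept):
    """Asking price minus the fitted price; negative means below average."""
    mileage, price = listing
    return price - (slope * mileage + intercept)


# Hypothetical (mileage, asking price) pairs for one model and year.
listings = [
    (40_000, 12_000),
    (60_000, 11_000),
    (80_000, 10_000),
    (100_000, 9_000),
    (120_000, 8_000),
]
slope, intercept = fit_line(
    [mileage for mileage, _ in listings],
    [price for _, price in listings],
)

# A new ad to compare against the fitted line.
gap = price_gap((90_000, 9_000), slope, intercept)
```

With hundreds of real listings the points will scatter around the line, so the gap is only a first filter for which ads deserve a closer look.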

I never had any idea where to begin marketing it to other people though. I
suppose if I knew someone who provided a car buying service...

------
wodenokoto
How do you scrape legally? What do you say when asked for sources?

What are the ramifications for ignoring copyright and TOS?

------
_8usx
> Recently I started MITM and Burp to intercept Apps to get direct api access.

Kind of a party foul in my book.

~~~
hoerzu
I understand what you mean. Still causing less traffic ;) and I'm intercepting
my own devices.

------
jholman
I have a related question. How do you earn a living from scraping, in any
year? Is that a thing? Was that ever a thing?

Like others have said: earn money by doing things that people want enough to
pay you for.

It's like asking "how do you earn a living giving good names to variables in
2018?"

~~~
hoerzu
Well, I think the possibilities go a long way. Flight search engines and
many shops are basically curated scraped data. Good names for variables in
2018 -> maybe good names for domains.

------
dchuk
Are you just tinkering with all of those ideas above? Or where are you finding
these ideas?

Like others have said, start with the problem, rather than coming up with
solutions first.

Do you have any of the things you listed on github or anything? Curious to
checkout a few of them.

~~~
hoerzu
Well, tinkering is the right word. I built a GitHub repo recommender, which
checks what users who starred a repo also starred. And I did a data science
Christmas calendar for 24 days.
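
The "users who starred this also starred" idea can be sketched as simple co-occurrence counting. This is an illustration only, not the actual project: `also_starred` and the shape of `stars_by_user` (a dict of user name to set of starred repos) are assumptions.

```python
from collections import Counter


def also_starred(stars_by_user, repo, top_n=3):
    """Rank other repos by how many stargazers of `repo` also starred them."""
    counts = Counter()
    for starred in stars_by_user.values():
        if repo in starred:
            # Count every other repo this stargazer has starred.
            counts.update(r for r in starred if r != repo)
    return counts.most_common(top_n)


# Hypothetical data; in practice this would come from scraped star lists.
stars_by_user = {
    "alice": {"a/x", "b/y", "c/z"},
    "bob": {"a/x", "b/y"},
    "carol": {"b/y", "c/z"},
    "dave": {"a/x"},
}
recommendations = also_starred(stars_by_user, "a/x")
```

Raw counts favor repos that everyone stars, so a real version would probably normalize by each repo's overall popularity.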

My main goal was to use browser automation for arbitrage.

You can find some old projects here, in German.
[https://franz.media/](https://franz.media/)

------
mabynogy
It's technically interesting but I wouldn't spend time on that because of the
legal issues (especially in real estate). It's too risky.

Since you know the field well, you could do the opposite: an anti-scraping product.

------
verismal
Microsoft rewards. Live off Amazon gift cards

~~~
srednalfden
For real, or jesting?

~~~
DoreenMichele
You can make about $10 per month via Microsoft rewards with a limit of one
account per person, five accounts per household. So unless you can survive on
$50/month, this would violate the terms of service.

[https://answers.microsoft.com/en-us/bing/forum/bing_other-
bi...](https://answers.microsoft.com/en-us/bing/forum/bing_other-
bing_users/in-regards-to-bing-rewards-can-i-have-two-
accounts/065d57a3-590e-481e-9820-e1a790684710)

------
gesman
Build something for others worth scraping

------
shurelock
I think his question really is: where can I make money from scraping in 2018?

~~~
hoerzu
Upwork looks very cheap, IMHO.

------
paulcole
You don't earn a living from scraping, you earn a living by solving a problem
and providing value.

What problems are your projects solving? Are people paying (in either
effort or money) to solve these problems currently?

~~~
hoerzu
I give a price prediction for the rent of a house. So the buyer can
immediately calculate ROI.

~~~
duskwuff
_You_ do? Or the sites you're scraping do?

~~~
hoerzu
Pretty simple: I scrape rental listings and for-sale listings, then use xgboost
to predict the rent for a listing. Then I apply the prediction to the
for-sale listings. Voilà. It’s cool because you can see what features drive
prices.
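
A toy version of that pipeline, with a deliberately trivial model (average rent per square meter) swapped in for xgboost so it runs without dependencies. The numbers are made up, `rent_per_sqm` and `gross_yield` are hypothetical names, and "ROI" is assumed here to mean gross yield, ignoring costs and taxes.

```python
def rent_per_sqm(rentals):
    """Average monthly rent per square meter over (area_sqm, monthly_rent) pairs."""
    total_rent = sum(rent for _, rent in rentals)
    total_area = sum(area for area, _ in rentals)
    return total_rent / total_area


def gross_yield(area_sqm, asking_price, rate):
    """Predicted annual rent divided by the asking price."""
    predicted_monthly_rent = area_sqm * rate
    return 12 * predicted_monthly_rent / asking_price


# Step 1: "train" on scraped rental listings (hypothetical data).
rentals = [(50, 500), (80, 800), (70, 700)]
rate = rent_per_sqm(rentals)

# Step 2: apply the rent prediction to a for-sale listing.
yield_estimate = gross_yield(area_sqm=60, asking_price=180_000, rate=rate)
```

Unlike a gradient-boosted model, this stand-in uses a single feature, so it cannot show which features drive prices; it only illustrates the train-on-rentals, predict-on-sales flow.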

~~~
krageon
A pretty cut-and-dried case of adding value, I'd say. This is a very cool thing,
thank you for talking about it.

------
smallhands
@hoerzu please, can we talk? Email tejioford@yahoo.com

------
is_true
Google is a scrapper with ads.

~~~
quickthrower2
Scrapper as in 'fighter'? I guess they are in a way.

------
arosier
Send me a message ar@pm.me

