

Ask HN: Want to make $500 for scraping some data? (challenge) - gbachik

Open a new tab and go to google.com.
Type in a band name like "All Time Low".

You'll notice a box on the right-hand side with more info about the band.

If you click the down arrow, you'll see a "People also search for" section.

If you can scrape that data in under 4 seconds (sorry, no PhantomJS), I'll pay you $500 for your solution.
======
dkyc
Node.js; requires the npm package _fetch_.

It runs in under four seconds. Please note that this violates Google's ToS, and
you will likely be blocked if you do this at any significant scale.

    
    
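      var fetchUrl = require("fetch").fetchUrl;
    
      var bandname = "All Time Low";
      // Encode the whole query; replace(" ", "+") would only replace the first space.
      var url = "https://www.google.de/search?q=" + encodeURIComponent(bandname);
    
      fetchUrl(url, {
        headers: {
            // Send a desktop browser User-Agent with the request.
            'User-Agent' : 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2049.0 Safari/537.36'
        }
      }, function (error, meta, body) {
        if (error) { console.error(error); return; }
        // The names appear in JS-escaped alt attributes: alt\x3d\x22NAME\x22.
        // The first chunk precedes any alt attribute, so skip it.
        var alts = body.toString().split('alt\\x3d\\x22').slice(1);
        var ret = [];
        alts.forEach(function (alt) {
          var cur = alt.split("\\x22")[0];
          // Drop entries that contain a parenthesis.
          if (cur.indexOf("(") === -1) ret.push(cur);
        });
        console.log(ret);
      });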

~~~
gbachik
Worked like a charm! What's your email? And I know about the ToS. This is for
an in-house project; I don't plan on spamming Google.

~~~
BadCode
Just curious: what was the hard part of this solution that warranted a $500
bounty?

------
motyar
This isn't related searches, but "keyword suggestion" may help you.

Here is Google's XML API:
[http://suggestqueries.google.com/complete/search?output=tool...](http://suggestqueries.google.com/complete/search?output=toolbar&hl=en&q=hackerNews)
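A minimal sketch of how that endpoint could be used from Node.js. The function names `buildSuggestUrl` and `parseSuggestions` are hypothetical, and the response shape (XML with `<suggestion data="..."/>` elements) is an assumption about the toolbar output, so check a real response before relying on it.

```javascript
// Build the suggest URL for a query, using the parameters from the link above.
function buildSuggestUrl(query) {
  return "http://suggestqueries.google.com/complete/search?output=toolbar&hl=en&q=" +
    encodeURIComponent(query);
}

// Decode the handful of XML entities that can appear in an attribute value.
function decodeXml(s) {
  return s.replace(/&quot;/g, '"')
          .replace(/&#39;/g, "'")
          .replace(/&lt;/g, "<")
          .replace(/&gt;/g, ">")
          .replace(/&amp;/g, "&");
}

// Assumption: each suggestion arrives as <suggestion data="..."/>.
function parseSuggestions(xml) {
  var out = [];
  var re = /<suggestion\s+data="([^"]*)"/g;
  var m;
  while ((m = re.exec(xml)) !== null) {
    out.push(decodeXml(m[1]));
  }
  return out;
}
```

The response body from any HTTP client (for example the `fetchUrl` callback used earlier in the thread) can be passed to `parseSuggestions` as a string.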

------
Jake232
It's probably possible, but why not just use Last.fm's API?

[http://www.last.fm/api/show/artist.getSimilar](http://www.last.fm/api/show/artist.getSimilar)
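As a rough sketch, a call to that method could look like the following. The helper names `buildSimilarUrl` and `similarNames` are hypothetical, and the endpoint host, the query parameters, and the `similarartists.artist[].name` JSON shape are assumptions based on Last.fm's public documentation; an API key is required.

```javascript
// Build an artist.getSimilar request URL.
// Assumption: the API root is ws.audioscrobbler.com/2.0/ and format=json is supported.
function buildSimilarUrl(artist, apiKey, limit) {
  return "http://ws.audioscrobbler.com/2.0/?method=artist.getsimilar" +
    "&artist=" + encodeURIComponent(artist) +
    "&api_key=" + encodeURIComponent(apiKey) +
    "&limit=" + (limit || 10) +
    "&format=json";
}

// Pull the artist names out of a JSON response body.
// Assumption: names live under similarartists.artist[].name.
function similarNames(json) {
  var data = JSON.parse(json);
  var artists = (data.similarartists && data.similarartists.artist) || [];
  return artists.map(function (a) { return a.name; });
}
```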

~~~
gbachik
I am also using Last.fm's API.

I'm looking at a number of APIs and running algorithms on the results.

Thanks, though.

------
opless
I'm sure they get their data from Wikipedia. Here's a start:
[http://en.wikipedia.org/wiki/DBpedia](http://en.wikipedia.org/wiki/DBpedia)

~~~
gbachik
It actually comes from their internal knowledge graph: Freebase.

~~~
opless
Why not just download the RDF triples, then?

[https://developers.google.com/freebase/data](https://developers.google.com/freebase/data)

~~~
gbachik
Even with that, I would not be able to recreate Google's related-search
algorithm myself, so that wouldn't get me anywhere.

------
exbone
Why did you delete your Stack Overflow question?

[http://webcache.googleusercontent.com/search?q=cache:LEbxw95...](http://webcache.googleusercontent.com/search?q=cache:LEbxw95yNqIJ:stackoverflow.com/questions/25336400/how-to-scrape-googles-similar-searches-challenge+&cd=1&hl=en&ct=clnk&gl=us)

~~~
gbachik
It got 4 downvotes and no answers. I figured that wasn't going to help anyone,
so I decided to ask here instead.

------
stangeek
I have a working prototype. How do you want to proceed concretely?

~~~
gbachik
Sorry, somebody beat you to it!

------
brothe2000
Did you try Import.io?

