Just a note: it's bizarre that I absolutely cannot find a way to determine a) how much it would cost to run or b) how I would pay for it if I wanted to run it.

I changed it to query from [bigquery-public-data:github_repos.contents] instead, and before I execute the query it says "Valid: This query will process 1.68 TB when run."

Queries are $5/TB [0].

So about $8.40, a bit less than 10 bucks. :)

Edit: brb, that's totally worth it.

[0]: https://cloud.google.com/bigquery/pricing
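For anyone who wants to redo the arithmetic with their own numbers, here is a minimal sketch. It assumes the flat $5/TB on-demand rate quoted above and takes the 1.68 TB figure from the query validator; the variable names are just for illustration, and it ignores any free tier or rounding rules the pricing page may apply.

```python
# Rough cost estimate for an on-demand query, assuming a flat per-TB rate.
TB_PROCESSED = 1.68        # from the validator message quoted above
PRICE_PER_TB_USD = 5.00    # rate quoted in this thread; check the pricing page

cost = TB_PROCESSED * PRICE_PER_TB_USD
print(f"Estimated cost: ${cost:.2f}")
```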

OK, so why is the most common document something to do with the Turkish 2012 elections? (If the rough Google Translate is to be believed...)


> 3896 http://www.pdf


Yeah, I didn't care to make the regexp perfect. The most common site is www.pdfsharp.com, then www.pdfparser.org, then www.pdflib.com, etc.
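To illustrate how a loose pattern can lump those sites into a single "http://www.pdf" bucket, here is a small sketch. Both regexes and the sample lines are hypothetical; the actual pattern used in the query isn't shown in this thread, so treat this as one plausible way the truncation could happen, not a reconstruction.

```python
import re
from collections import Counter

# Hypothetical sample lines, not data from the real query.
lines = [
    "see http://www.pdfsharp.com/docs",
    "see http://www.pdfsharp.com/",
    "ported from http://www.pdfparser.org",
    "uses http://www.pdflib.com/download",
]

# Loose pattern: stops three characters into the host name.
loose = re.compile(r"https?://www\.\w{3}")
# Stricter pattern: captures the whole host name.
strict = re.compile(r"https?://([\w.-]+)")

loose_counts = Counter(m.group(0) for line in lines for m in loose.finditer(line))
host_counts = Counter(m.group(1) for line in lines for m in strict.finditer(line))

print(loose_counts)  # every URL collapses into the same truncated prefix
print(host_counts)   # distinct hosts are counted separately
```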

Weird! Mine just says "Quota exceeded..." without ever saying how big the query will be. Where do I find that info?

(http://i.imgur.com/3EkPYIY.png is what I see)

