

Common Crawl announces Open Source Big Data code contest winners - Aloisius
http://commoncrawl.org/announcing-the-winners-of-the-code-contest/

======
bollacker
I like how this contest shows that anyone can ask questions that previously
only Google, Microsoft (Bing) and a handful of second-tier search engines could
ask. Now if we can just get a simple query language for it all. I could then
pump in $100 in quarters, ask my question, and wait a couple hours. Isn't this
what the information age was supposed to get us?

------
5c4r3d
Linking Entities to Wikipedia is awesome. I love the idea of Online Sentiment
Towards Congressional Bills, but it's too bad they didn't show their results.

~~~
awavering
Don't worry, we're working on it - we should have our results up by this
weekend. We're planning on doing something like
<http://www.albertwavering.com/projects/commoncrawl/bill.html> to show our
results, but I would love to hear suggestions.

~~~
5c4r3d
Oh cool. How many bills are you going to show? Will there be histograms or
some kind of visualization or just the lists?

~~~
awavering
We are looking at about 50 bills this time around. We really wanted to do a
histogram, but we didn't have time to solve the problem of distinguishing
between when things are crawled and when things are published.

------
rjurney
The YC crowd needs to pay attention to this. There will be a new wave of
startups based on Common Crawl as it develops.

------
frederi
Is the "Facebook infection" code open source? There is no link to the code.

