
Tell HN: The Hacker News frontpage effect on a project's GitHub stars - fhoffa
How much attention does a Hacker News frontpage post drive to a GitHub project?
<p>For this visualization I combined two datasets: GitHub Archive (http://www.githubarchive.org/) and Hacker News (https://news.ycombinator.com/item?id=10440502), both living in BigQuery (https://cloud.google.com/bigquery/what-is-bigquery, https://reddit.com/r/bigquery).
<p>The visualizations were built with Google Cloud Datalab (https://cloud.google.com/datalab/, Jupyter/IPython notebooks on the cloud).
<p>With one SQL query you can extract the daily number of stars a project gets, and with another the GitHub URLs that were submitted to Hacker News, or you can combine both queries into one:
<p><pre><code>    SELECT repo_name, created_at date, COUNT(*) c,
      GROUP_CONCAT_UNQUOTED(UNIQUE(hndate+':'+STRING(hnscore))) hndates,
      SUM(UNIQUE(hnscore)) hnscore,
      SUM(c) OVER(PARTITION BY repo_name) monthstars
    FROM (
      SELECT repo_name,  actor_login, DATE(MAX(created_at)) created_at, date hndate, score hnscore
      FROM [githubarchive:month.201509] a
      JOIN (
        SELECT REGEXP_EXTRACT(url, r'github.com/([a-zA-Z0-9\-\.]+.[a-zA-Z0-9\-\.]*)') mention, DATE(time_ts) date, score
        FROM [fh-bigquery:hackernews.stories]
        WHERE REGEXP_MATCH(url, r'github.com/[a-zA-Z0-9\-\.]+')
        AND score>10
        AND YEAR(time_ts)=2015 AND MONTH(time_ts)=9
        HAVING NOT (mention CONTAINS '.com/search?' OR mention CONTAINS '.com/blog/')
      ) b
      ON a.repo_name=b.mention
      WHERE type="WatchEvent"
      GROUP BY 1,2, hndate, hnscore
    )
    GROUP BY 1,2
    HAVING hnscore>300
    ORDER BY 1,2,4
    LIMIT 1000
</code></pre>
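For reference, the two separate queries mentioned earlier could look roughly like the sketches below. They reuse the table and column names from the combined query and stay in BigQuery legacy SQL. The `stars` alias is only illustrative, and the simple `COUNT(*)` is a simplification: it counts every WatchEvent, whereas the combined query keeps one row per user and repository.

```sql
-- Sketch: daily number of stars per repository for September 2015.
-- A WatchEvent is recorded when a user stars a repository.
-- Assumption: no per-user deduplication, unlike the combined query.
SELECT repo_name, DATE(created_at) date, COUNT(*) stars
FROM [githubarchive:month.201509]
WHERE type="WatchEvent"
GROUP BY 1,2
ORDER BY 1,2
```

The second sketch is the inner subquery of the combined query, run on its own (the `HAVING` filter is left out for brevity):

```sql
-- Sketch: GitHub URLs submitted to Hacker News in September 2015
-- with a score above 10, reduced to the "owner/repo" part of the URL.
SELECT REGEXP_EXTRACT(url, r'github.com/([a-zA-Z0-9\-\.]+.[a-zA-Z0-9\-\.]*)') mention, DATE(time_ts) date, score
FROM [fh-bigquery:hackernews.stories]
WHERE REGEXP_MATCH(url, r'github.com/[a-zA-Z0-9\-\.]+')
AND score>10
AND YEAR(time_ts)=2015 AND MONTH(time_ts)=9
```

Joining the first result to the second on `repo_name = mention` gives the combined view.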
The visualization: https://i.imgur.com/B5awmAL.png<p>(correlation is not causation, but there is indeed a correlation between the two)<p>--@felipehoffa
======
fhoffa
Links:

\- Visualization:
[https://i.imgur.com/B5awmAL.png](https://i.imgur.com/B5awmAL.png)

\- GitHub Archive:
[http://www.githubarchive.org/](http://www.githubarchive.org/)

\- Hacker News on BigQuery dataset:
[https://news.ycombinator.com/item?id=10440502](https://news.ycombinator.com/item?id=10440502)

\- Other BigQuery projects:
[https://reddit.com/r/bigquery](https://reddit.com/r/bigquery)

------
veddox
An effect like that is to be expected, but it is interesting to see some real
data.

