Direct URL to project details: https://devpost.com/software/tagger-news A few co...

var_explained · on May 14, 2017

One of the devs here.

1. That's the way we were thinking about it :)

2. Oh, excellent! We hadn't found that or we'd have used it, and we'll start working with it.

3. Tomorrow I'm going to blog about how we approached the machine learning. Short version: we manually came up with regular expressions to classify a training set based on titles. When we experimented with manually annotating titles, the vast majority of the time we were looking for only a few keywords. There's no question that this adds biases and will not be entirely accurate, but manual inspection convinced us it was a good enough approach for our hackathon, and most of the articles we identified with the resulting algorithm would not have been found by the title regex alone.

You can see the table of regular expressions [here](https://github.com/dodger487/analyze_hn/blob/master/topics.c...) and a bunch of (pretty unstructured) analysis code [here](https://github.com/dodger487/analyze_hn/blob/master/hn-analy...).
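To make the approach concrete, here is a minimal sketch of regex-based labeling of titles. The topic names, patterns, and the `title`/`text` story fields are hypothetical and chosen only for illustration; the actual table of regular expressions is the one in the linked repository.

```python
import re

# Hypothetical topic patterns, for illustration only.
# The real table of regular expressions lives in the linked repository.
TOPIC_PATTERNS = {
    "Apple": re.compile(r"\bapple\b|\biphone\b|\bmacos\b", re.IGNORECASE),
    "Python": re.compile(r"\bpython\b|\bdjango\b", re.IGNORECASE),
    "Security": re.compile(r"\bsecurity\b|\bvulnerabilit(?:y|ies)\b", re.IGNORECASE),
}


def label_title(title):
    """Return the sorted list of topics whose pattern matches the title."""
    return sorted(
        topic for topic, pattern in TOPIC_PATTERNS.items() if pattern.search(title)
    )


def build_training_set(stories):
    """Pair each story's article text with its title-derived labels.

    `stories` is assumed to be a list of dicts with "title" and "text" keys.
    Stories whose titles match no pattern are skipped, since they carry no label.
    """
    training_set = []
    for story in stories:
        labels = label_title(story["title"])
        if labels:
            training_set.append((story["text"], labels))
    return training_set
```

A classifier trained on the article text of such a set can then tag articles whose titles never mention the keyword, which is presumably how articles the title regex alone would miss get picked up.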

searchhn · on May 14, 2017

This is awesome! Congrats.

https://github.com/HackerNews/API

The firebase API is excellent. I have been using that to keep http://searchhn.com up to date in real time.

Also, BigQuery is updated every day with all comments and posts: https://bigquery.cloud.google.com/dataset/bigquery-public-da...

This is what I started with to update the Searchera (https://searchera.io) index, which powers Searchhn.
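For anyone who wants to try the Firebase API mentioned above, a minimal sketch of pulling recent stories with only the standard library might look like this. The helper names (`item_url`, `fetch_json`, `fetch_new_stories`) are made up for illustration; the `/v0/newstories.json` and `/v0/item/<id>.json` endpoints are the ones documented in the HackerNews/API repository.

```python
import json
from urllib.request import urlopen

# Base URL documented in the HackerNews/API repository.
BASE_URL = "https://hacker-news.firebaseio.com/v0"


def item_url(item_id):
    """Build the URL for a single item (story, comment, etc.)."""
    return f"{BASE_URL}/item/{item_id}.json"


def fetch_json(url, timeout=10):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def fetch_new_stories(limit=10):
    """Fetch up to `limit` of the newest stories, one request per item.

    Items that come back as null (for example, removed ones) are dropped.
    """
    story_ids = fetch_json(f"{BASE_URL}/newstories.json")[:limit]
    items = (fetch_json(item_url(story_id)) for story_id in story_ids)
    return [item for item in items if item is not None]
```

Because this makes one request per item, it suits keeping an index fresh rather than bulk downloads; for a full history, the daily BigQuery dump is likely the better fit.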

var_explained · on May 14, 2017

Oh, that was silly of us not to use BigQuery! I was just able to use that to download a full million stories (though we still would have had the rate-limiting step of downloading the articles).

During a hackathon it can be hard to tell when to keep searching for an easy solution like that, as opposed to going with something slow that you know will work; sometimes the search turns out to be a dead end.

Thanks for the recommendations!

var_explained · on May 15, 2017

I've now blogged in more detail about building Tagger News. Check it out here: https://news.ycombinator.com/item?id=14343854

jibbolo · on May 14, 2017

Hey mate, you should follow this guide step by step when you deploy a Django app: https://docs.djangoproject.com/en/1.11/howto/deployment/chec...

BTW, congrats on the project, well done!

nthcolumn · on May 16, 2017

The Awful Reign of the Red Delicious (2014) (theatlantic.com) is tagged 'Microsoft' 'Apple'

Might wanna tweak that...