

Current Article Popularity Trends on Hacker News - Cieplak
http://hn.metamx.com/

======
bravura
I believe the note at the bottom, about machine learning, is not germane
to the graph displayed. The graph is a straight visualization of Hacker
News post rank.

The machine learning note discusses this earlier blog post:
<http://metamarkets.com/2011/hacking-hacker-news-headlines/>

I don't understand why the authors combined boosting with stochastic
gradient descent (SGD).

Boosting means you do greedy, forward selection of features. You add one
feature at a time: the one that has the largest gradient with respect to
the loss, i.e. the feature that looks like it will fit the data best,
given what the model currently knows. You can then choose the optimal
weight for that feature using a line search, as in the sketch below.
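
Concretely, the loop looks something like this (a minimal NumPy sketch
assuming a squared-error loss; the function name and the loss are my
assumptions, not the authors' actual code):

    import numpy as np

    def boost(X, y, n_rounds=10):
        # Greedy forward selection: each round adds (or re-weights) the
        # single feature whose gradient w.r.t. the loss is largest.
        n, d = X.shape
        w = np.zeros(d)
        resid = y.astype(float)  # residual = neg. gradient of 0.5*||y - Xw||^2
        for _ in range(n_rounds):
            grads = X.T @ resid           # per-feature gradients
            j = np.argmax(np.abs(grads))  # feature that fits the residual best
            # Line search: for squared loss the optimal step for a single
            # feature has a closed form (its least-squares coefficient).
            step = grads[j] / (X[:, j] @ X[:, j])
            w[j] += step
            resid -= step * X[:, j]
        return w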

Why is SGD necessary in this scenario? After you picked your features, did
you go back and re-adjust the weights? I'm not sure why that would be
needed. If you are going to do SGD over many features with strong
regularization, why not run it over the _entire_ feature vector and see
which features disappear, as in the sketch below?
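
That is, something like this sketch of SGD with an L1 penalty (again
illustrative: the L1/soft-thresholding choice is my assumption, since
that is the usual way regularization makes features "disappear"):

    import numpy as np

    def sgd_l1(X, y, lam=0.1, lr=0.01, epochs=20):
        # Proximal SGD over the *entire* feature vector: the L1 step
        # drives weights of unhelpful features exactly to zero.
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(epochs):
            for i in np.random.permutation(n):
                g = (X[i] @ w - y[i]) * X[i]  # squared-loss gradient, one example
                w -= lr * g
                # soft-thresholding = proximal operator of the L1 penalty
                w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
        return w  # the nonzero entries are the features that survived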

------
allardschip
Beautiful graph. The notes with the graph are TC;DNG for me (too complicated;
did not get). They sounded impressive, though.

