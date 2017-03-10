
Reverse Engineering the Hacker News Ranking Algorithm (sangaline.com)
139 points by foob 1 hour ago | 19 comments





Or you could just search for Paul Graham's posts[0] :)

> (= gravity* 1.8 timebase* 120 front-threshold* 1 nourl-factor* .4 lightweight-factor* .17 gag-factor* .1)

    (def frontpage-rank (s (o scorefn realscore) (o gravity gravity*))
      (* (/ (let base (- (scorefn s) 1)
              (if (> base 0) (expt base .8) base))
            (expt (/ (+ (item-age s) timebase*) 60) gravity))
         (if (no (in s!type 'story 'poll))  .8
             (blank s!url)                  nourl-factor*
             (mem 'bury s!keys)             .001
                                            (* (contro-factor s)
                                               (if (mem 'gag s!keys)
                                                    gag-factor*
                                                   (lightweight s)
                                                    lightweight-factor*
                                                   1)))))
[0]: https://news.ycombinator.com/item?id=1781417
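
For anyone who doesn't read Arc, here is a rough Python sketch of what the quoted 2010 `frontpage-rank` appears to compute. It is a transliteration under stated assumptions, not HN's current code: the `Item` class and its field names are hypothetical, `item-age` is assumed to be in minutes, and `contro_factor` and `lightweight` are plain inputs because the quoted snippet doesn't show how `contro-factor` or `lightweight` are defined.

```python
from dataclasses import dataclass

# Constants copied from the quoted (= gravity* 1.8 timebase* 120 ...) line.
GRAVITY = 1.8
TIMEBASE = 120             # assumed to be minutes, like the item age
NOURL_FACTOR = 0.4
LIGHTWEIGHT_FACTOR = 0.17
GAG_FACTOR = 0.1


@dataclass
class Item:
    """Hypothetical stand-in for an HN item; field names are illustrative."""
    score: int
    age_minutes: float
    type: str = "story"
    url: str = ""
    keys: frozenset = frozenset()
    contro_factor: float = 1.0   # stand-in for (contro-factor s), not shown in the quote
    lightweight: bool = False    # stand-in for (lightweight s), not shown in the quote


def frontpage_rank(item, gravity=GRAVITY):
    # Score term: the submitter's own vote is discounted, then damped by ^0.8.
    base = item.score - 1
    numerator = base ** 0.8 if base > 0 else base

    # Age term: (age + timebase) converted to hours, raised to the gravity power.
    denominator = ((item.age_minutes + TIMEBASE) / 60) ** gravity

    # Penalty term: mirrors the branches of the multi-way (if ...) in order.
    if item.type not in ("story", "poll"):
        penalty = 0.8
    elif not item.url.strip():
        penalty = NOURL_FACTOR
    elif "bury" in item.keys:
        penalty = 0.001
    else:
        if "gag" in item.keys:
            inner = GAG_FACTOR
        elif item.lightweight:
            inner = LIGHTWEIGHT_FACTOR
        else:
            inner = 1.0
        penalty = item.contro_factor * inner

    return numerator / denominator * penalty


# Example: a linked story with 139 points, one hour old.
print(frontpage_rank(Item(score=139, age_minutes=60, url="http://example.com")))
```

In this version a text-only post (blank URL) ranks at 0.4 times an otherwise identical linked story, and nothing in it accounts for flags.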

That's almost seven years out of date. The algorithm has changed significantly since then.

Example: Flags affect a story's position, but that algorithm doesn't mention flags at all.

Yeah, I'm not sure why this is even a thing. I thought everyone knew that algorithm at this point? Hell, I'm using the same thing on a couple of projects.

Anyone interested in the topic of HN's ranking algorithm should look through HN's submission archives:

https://hn.algolia.com/?query=How%20Hacker%20News%20ranking%...

This is the article that was discussed in yesterday's "The stories that Hacker News removes from the front page" [1]. After speaking with @dang, it sounds like what happened with the original submission was that a moderator accidentally put "(2010)" in the title and users flagged it because they incorrectly thought it was old. He invited me to resubmit the article today to allow for real discussion and to demonstrate that what happened to the first submission was accidental.

I know that this analysis will get less attention than the one from yesterday, but I personally find it far more interesting and hope that it can stand on its own merits. I'll be around to answer any questions that might come up.

[1] - https://news.ycombinator.com/item?id=13857086

Being old is not a reason to flag a submission. At most, appending the year to the title would be all that is required.

Not by itself, but if the age suggests that the content is stale, it could be. How the HN algorithm worked 7 years ago (assuming it has changed) isn't of that much interest, even though the analysis might be.

[innocent yet incisive comment removed due to excessive down-voting]

That's... not how the internet works. But it doesn't matter; we invited foob to repost it as a way of owning our own mistake.

> it sounds like what happened with the original submission was that a moderator accidentally put "(2010)" in the title and users flagged it because they incorrectly thought it was old

mhm, I'm sure that's what really happened

Which bit? The 2010 thing is precisely what happened. It was a case of sleep deprivation, which is one lesson in how trying too hard to make this place good can mess with a person.

The other bit was just my attempt to explain why users might have flagged the post. User flags were what demoted its rank, and it isn't obvious why people flagged it. There's also the issue that meta posts aren't great for HN in the first place, but those rarely lack for upvotes.

Someone asked this yesterday, but the question got lost in the noise: does the voting ring detection extend to flagging rings (whether on posts or comments)?

It would be naive to assume that it doesn't happen...


It would be nice if there were a non-filtered view of HN available for users with a rep above 500, much like the "show dead" option for user comments that have been hidden. Basically, it would place submissions in the ranking as if they had not been penalized by user flags.

I like this idea, but it'd be nice to have it not based on karma (says a person that doesn't comment very often and has low karma).

Understood, though I could easily see spammers using the data. Getting to 500 really should not take more than a month if you post a handful of comments a day; for example: 25 days, 5 comments a day, average of 4 upvotes per comment.
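
To check the arithmetic in that example, here is a minimal sketch. It assumes every upvote adds exactly one point of karma and ignores downvotes; `days_to_reach` is a hypothetical helper, not anything HN provides.

```python
import math


def days_to_reach(target_karma, comments_per_day, avg_upvotes_per_comment):
    """Days of commenting needed to reach target_karma, rounded up.

    Assumes one karma point per upvote and no downvotes.
    """
    karma_per_day = comments_per_day * avg_upvotes_per_comment
    return math.ceil(target_karma / karma_per_day)


print(days_to_reach(500, 5, 4))  # 25
```

At 5 comments a day and 4 upvotes each, that is 20 karma a day, so 500 takes 25 days.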

Worth noting: with a rep of 500+, HN users are able to downvote comments.

But this is not just some random meta post; it's about ranking algorithms. I think those are doomed to be popular, whether on HN or Reddit.

Hanlon's razor: "Never attribute to malice that which is adequately explained by stupidity"

Nice. Summarizes my longstanding argument with conspiracy theorists.

On the one hand, I'm also skeptical. On the other hand, I could see someone glancing at the date and accidentally reading "Mar 10, 2017" as Mar '10.

