

Ask HN: How is HN's ranking algorithm implemented? - jgannonjr

There is an older article, from a post about a year and a half ago, titled "How Hacker News Ranking Algorithm Works" (http://news.ycombinator.com/item?id=1781013). I understand how the algorithm works mathematically, but how is it implemented on the server? More specifically, how does the server keep its rankings updated if each story's rank decays as a function of the time since submission? The ranks must be recalculated at some point, but I'm not sure when that would be. It doesn't make sense to calculate them on every request, and there are far too many stories to process all of them all the time. My thought is that the rank is recalculated after each vote, and that there is also some rank threshold above which a story's rank is recalculated continuously in the background. That way recent stories are constantly being processed, while older ones past the threshold become stale and are ignored. This could also be why there is no real pagination (instead the next URL is /x/fnid=xxxxxxxxxx), possibly to discourage people from going too deep. I could also be completely wrong. Can anyone shed some light on this?
======
cd34
<http://arclanguage.com/install>

contains the source code less some of the secret bits and bytes.

~~~
jgannonjr
Thanks for the link. I have no experience reading Lisp, and there is a lot of
code here to sort through (especially for someone not versed in Lisp), but
from what I can make out at a quick glance, it seems to confirm my initial
guess. It looks like rank is calculated when an item is
voted on. Also, it looks like a random front page story is reranked regularly
to ensure no items get "stuck" at the top if voting stops (rerank-random).
There's also a function (gen-topstories) to generate the rankings for the top
1000 stories, although I can't tell if this is being run regularly (it looks
like it is just used to initialize the rankings). However, I could definitely
be wrong.

I'm not able to determine how often the rerank-random function is called
(looks like every 30ms maybe?) or how often the gen-topstories function is
called (if it is even being called regularly or not).

Can you verify what I'm saying here is correct, or correct me if I'm wrong?
Also, any insight would be much appreciated.
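
To make my reading concrete, here is a rough Python sketch of the scheme as I
understand it. This is not the Arc source. The class and method names are made
up for illustration, and the score formula and the 1.8 exponent are
assumptions taken from the linked article. Each story has a cached score that
is refreshed only when the story is voted on or when it happens to be picked
by a periodic random rerank, so the time decay is applied lazily.

```python
import random

GRAVITY = 1.8  # assumed decay exponent, taken from the linked article


def score(votes, age_hours):
    # Assumed formula from the linked article: (p - 1) / (t + 2)^gravity
    return (votes - 1) / (age_hours + 2) ** GRAVITY


class Ranking:
    """Hypothetical lazy ranking: cached scores, refreshed on demand."""

    def __init__(self):
        self.stories = {}  # story id -> {"votes": int, "submitted": seconds}
        self.cached = {}   # story id -> last computed score

    def submit(self, story_id, now):
        self.stories[story_id] = {"votes": 1, "submitted": now}
        self.rerank(story_id, now)

    def rerank(self, story_id, now):
        # Recompute one story's score using its current age.
        story = self.stories[story_id]
        age_hours = (now - story["submitted"]) / 3600.0
        self.cached[story_id] = score(story["votes"], age_hours)

    def vote(self, story_id, now):
        # A vote refreshes only the story that was voted on.
        self.stories[story_id]["votes"] += 1
        self.rerank(story_id, now)

    def rerank_random(self, now, rng=random):
        # Called periodically so an unvoted story cannot keep a stale score.
        if self.cached:
            self.rerank(rng.choice(list(self.cached)), now)

    def front_page(self, n=30):
        # Order by cached score; nothing is recomputed at request time.
        return sorted(self.cached, key=self.cached.get, reverse=True)[:n]
```

In this sketch a story that stops receiving votes keeps its old score until
rerank_random happens to pick it, which is what I take "stuck" to mean.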

~~~
cd34
It has been a while since I looked - the one thing I wanted to fix was the
expired link issue. It is rare that I can get past page 2 of the /newest
listings before getting the Expired link notification, so, I probably miss
quite a few submissions.

Basically, the link code that you see is a closure that contains the results
for your particular view of the articles. In that snapshot in time, your list
remains 'constant' until you refresh it with a visit to /newest or /, allowing
pagination to work consistently as you move through the list. When you get the
expired link notification, it means that your closure has been removed due to
GC or a server restart.

Each time a story is voted on, the global pool is updated, but, until your
closure is recreated, you would not see the effect of the vote. Therefore, the
shuffling you would see when you or others upvote a story that might change
pagination won't affect you.
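
A minimal Python sketch of that idea, under my own assumptions: the real site
stores closures, while this stand-in stores a frozen copy of the list under a
random id, and a small size cap plays the role of GC. SnapshotPager, Expired
and max_snapshots are hypothetical names, not anything from the Arc source.

```python
import secrets


class Expired(Exception):
    """Raised when a 'More' link points at a snapshot that is gone."""


class SnapshotPager:
    def __init__(self, max_snapshots=1000):
        self.max_snapshots = max_snapshots
        self.snapshots = {}  # fnid -> (frozen items, next offset)

    def _store(self, items, offset):
        fnid = secrets.token_urlsafe(8)
        self.snapshots[fnid] = (items, offset)
        # Evict the oldest entries; dicts keep insertion order.
        while len(self.snapshots) > self.max_snapshots:
            del self.snapshots[next(iter(self.snapshots))]
        return fnid

    def first_page(self, ranked_items, per_page=30):
        items = list(ranked_items)  # freeze this visitor's view
        more = self._store(items, per_page) if len(items) > per_page else None
        return items[:per_page], more

    def next_page(self, fnid, per_page=30):
        if fnid not in self.snapshots:
            raise Expired("expired link")
        items, offset = self.snapshots[fnid]
        end = offset + per_page
        more = self._store(items, end) if end < len(items) else None
        return items[offset:end], more
```

Later votes reorder the live list but not the frozen copy, so page 2 picks up
where page 1 stopped. Once the entry is evicted, the link can only fail.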

I believe the top 1000 is separate from the normal operation of the site and
is run 'at some interval'. I don't recall, but I don't think it is
recalculated frequently.

Ranking is a function of vote velocity over time. A story that gets a rapid
burst of upvotes will be pushed higher than one that has more upvotes but was
submitted hours ago. For example, a story with 100 upvotes in the last 20
minutes will rank above one with 140 upvotes in the last 60 minutes.
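
As a quick check of that example, here is a small Python sketch. It assumes
the formula given in the linked article, score = (p - 1) / (t + 2)^1.8 with t
in hours, and it assumes the two stories were submitted 20 and 60 minutes ago.
The live site may apply further adjustments that are not modelled here.

```python
GRAVITY = 1.8  # assumed exponent from the linked article


def score(points, age_hours, gravity=GRAVITY):
    # Assumed formula: (p - 1) / (t + 2)^gravity, with t in hours.
    return (points - 1) / (age_hours + 2) ** gravity


fast = score(100, 20 / 60)  # 100 upvotes, submitted 20 minutes ago
slow = score(140, 60 / 60)  # 140 upvotes, submitted 60 minutes ago
print(fast > slow)  # prints True
```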

I'm in the middle of some code, but, if I get a chance this weekend, I'll take
a look. My lisp is rusty - haven't used it since the late 80s.

------
mdg
Just by observing, I think it goes like this:

    
    
        if linked_article.is_relevant_to_YC_company():
            goto front_page;
        elsif linked_article.is_negative_to_YC_company():
            linked_article.dead();

