

Karma-rank (Pagerank for social sites) - ntoshev

There is a direct application of Pagerank to social sites like news.YC that seems to have a number of advantages over the typical "one user, one vote" scheme.

I won't explain how Pagerank works; there is a good explanation here:

http://www.ams.org/featurecolumn/archive/pagerank.html

Let's drop the "every vote counts equally" scheme. Say you want a global measure of how important each person's votes are to the site. This means assigning a number to every person, analogous to the pagerank number assigned to a web page. The votes a person has received on all his comments and submissions correspond to the inbound links of a web page. The votes a person gives correspond to outbound links from a web page. You want the authority each user gives out to equal the authority that user receives (both via upvotes; let's ignore downvotes for now). Each user's importance is divided equally among the submissions and comments he has upmodded. These are exactly the rules Pagerank operates on, except that one user may have upmodded another several times and you need to account for downmods too.

What are the advantages this would bring to sites like news.YC? Such an algorithm would be pretty resistant to voting circles in the same way pagerank is resistant to spam (one user with lots of karma will weigh more than a lot of sockpuppets with no karma). It gives a global estimate of people's importance to the site, and it should allow a site to preserve its culture better as it scales.

Drawbacks? Some people may object that it is unfair not to count votes equally. It is harder to implement, especially as you need to modify the karma-rank on the fly (although if this turns out to be hard, you can use an approximation and recalculate karma-rank daily, for example). Last but not least, it puts serious performance requirements on the site implementing it that Arc is probably not ready to handle.
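To make the analogy concrete, here is a rough sketch of the scheme in Python, under some assumptions of mine: votes arrive as (voter, author) pairs, repeated upvotes between the same two users simply add weight, downvotes are left out, and users who have never voted spread their rank evenly over everyone. The name `karma_rank` and the damping value of 0.85 are illustrative choices, not something news.YC uses.

```python
from collections import defaultdict


def karma_rank(votes, damping=0.85, iterations=100):
    """Pagerank-style karma over (voter, author) upvote pairs.

    Repeated upvotes between the same pair count as extra weight.
    Self-votes are ignored. Downvotes are not modelled here.
    """
    given = defaultdict(lambda: defaultdict(int))
    users = set()
    for voter, author in votes:
        if voter == author:
            continue
        given[voter][author] += 1
        users.update((voter, author))
    if not users:
        return {}

    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Users who never voted hand their rank back to everyone equally,
        # so the total rank in the system stays at 1.
        dangling = sum(rank[u] for u in users if u not in given)
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in users}
        for voter, targets in given.items():
            total = sum(targets.values())
            for author, count in targets.items():
                # A voter's rank is split in proportion to the votes given.
                new_rank[author] += damping * rank[voter] * count / total
        rank = new_rank
    return rank


if __name__ == "__main__":
    sample = [("pg", "alice"), ("alice", "pg"), ("bob", "pg"), ("bob", "alice")]
    for user, score in sorted(karma_rank(sample).items(), key=lambda kv: -kv[1]):
        print(f"{user}: {score:.3f}")
```

Note that the uniform base term gives every account a small amount of rank to pass on, so this plain version does not fully neutralize sockpuppets; how much that matters depends on the damping value and on the size of the site.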
======
wehriam
This would create a feedback loop. Users that exhibited a certain type of
behavior would quickly amass a large amount of karma.

Unfortunately Pagerank is not inherently resistant to spam. To extend your
metaphor, this would be good if we had a lot of "Ms. Wikipedias" but bad if
we had more than a few "Mr. Linkspams."

It would discourage new users, as they would likely never catch up to early
adopters. The A-List blogger phenomenon is a good example of this. Big,
popular blogs stay big and popular. While possible, it is extremely difficult
to break into the upper echelons of blogging.

It would also reward those who post and vote on topics that are mainstream, at
the expense of diversity in the community.

The normalizing effect might be particularly well suited for some specific
communities - financial advisers for example, or those dealing specifically
with popular culture.

~~~
ntoshev
The social sites are much more dynamic than the web. This means, for example,
that people high in the ranking like pg would upvote newcomers and they would
instantly get a karma boost.

Pagerank is usually log-scaled because of these feedback loop effects.

------
pg
I've thought of doing something like this. The computational load wouldn't be
a problem, because weights could be calculated asynchronously and wouldn't
have to be especially up to date. Maybe I'll try calculating but not using it,
and see for myself if it produces better frontpage rankings than we'd have
otherwise...

~~~
evgen
What you would need to commit to is a frontpage that is personalized, which
might not be exactly what you are really looking for. There is some benefit to
having a "shared" sense of community. The most recent design struggle I have
wrestled with related to this is preventing an echo chamber of sorts from
developing. A few quick observations related to how HN works now are that you
would probably need to make the downvote more readily accessible to every user
and might gain some benefit from "seeding" the network a la Advogato's initial
trust sources.

~~~
ntoshev
Pagerank sets global weights; you won't get a personalized homepage if you
implement just that.

~~~
evgen
Strict pagerank does set global weights (necessitating the "seed" values that
I mentioned in another comment), but this is not a requirement of flow-based
weighting systems. If you implement a canonical pagerank algorithm then you
will get a global view based upon what certain people feel is
important/worthy, and as you expand the set of seed users you will end up with
complete personalization. There is a continuum here and deciding where to
reside within this range has some pretty large effects on the nature of the
site.

Google determines global pagerank because they lack the computational
horsepower to do otherwise and because web pages and users are not similar
entities, but on a site like this where the links between users and articles
are effectively a single measurement this sort of individual pagerank is
possible.
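A rough sketch of the seeded variant, for illustration only: the random restart goes back to a chosen set of seed users instead of to everyone. Making every user a seed gives the usual global ranking, while a single seed (say, the viewer) gives a ranking as seen from that one user. The function name `seeded_rank`, the (voter, author) vote format, and the choice to send rank from non-voters back to the seeds are all assumptions of this sketch.

```python
from collections import defaultdict


def seeded_rank(votes, seeds, damping=0.85, iterations=100):
    """Pagerank over (voter, author) upvotes, restarting only at `seeds`."""
    seeds = set(seeds)
    if not seeds:
        raise ValueError("at least one seed user is required")

    given = defaultdict(lambda: defaultdict(int))
    users = set(seeds)
    for voter, author in votes:
        if voter != author:
            given[voter][author] += 1
            users.update((voter, author))

    # Restart distribution: uniform over the seeds, zero elsewhere.
    teleport = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in users}
    rank = dict(teleport)
    for _ in range(iterations):
        # Rank held by users who never voted is returned to the seeds.
        dangling = sum(rank[u] for u in users if u not in given)
        restart = (1 - damping) + damping * dangling
        new_rank = {u: restart * teleport[u] for u in users}
        for voter, targets in given.items():
            total = sum(targets.values())
            for author, count in targets.items():
                new_rank[author] += damping * rank[voter] * count / total
        rank = new_rank
    return rank


if __name__ == "__main__":
    votes = [("pg", "alice"), ("sock1", "spam"), ("sock2", "spam")]
    print(seeded_rank(votes, seeds={"pg"}))
    print(seeded_rank(votes, seeds={"pg", "alice", "sock1", "sock2", "spam"}))
```

In this version, accounts that no seed can reach through a chain of upvotes end up with zero rank, which is the property the seeding idea relies on.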

------
gaika
You can go one step further and build a recommendation system that tracks not
only trust but individual preferences. Performance requirements are pretty
steep for a real-time recommender, but it is totally worth it.

Oh, by the way, we have it implemented already on jaanix. For an opensource
implementation you can take a look at <http://www.advogato.org/>

~~~
ntoshev
Have you implemented the described system, a recommendation system, or
both? If you have implemented a similar global estimation of how much every vote
counts, I'd be interested in hearing how it actually affected your site.

~~~
gaika
It is both. Trust and recommendations are mixed together in an SVD-like
algorithm as described in <http://sifter.org/~simon/journal/20061211.html> .
On top of that jaanix lets you adjust all your personal preferences in
real-time.
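For readers unfamiliar with that style of algorithm, the following is a minimal sketch of matrix factorization trained by stochastic gradient descent, in the spirit of the linked write-up. It is not jaanix's code and does not show how trust is mixed in. It also updates all factors together rather than one at a time, which is a simplification. The function names, the +1/-1 vote encoding, and the hyperparameters are assumptions made for illustration.

```python
import random


def train_factors(ratings, n_factors=2, epochs=500, lr=0.05, reg=0.02, seed=0):
    """Learn user and item factor vectors from (user, item, value) triples."""
    rng = random.Random(seed)
    users, items = {}, {}
    for user, item, _ in ratings:
        for table, key in ((users, user), (items, item)):
            if key not in table:
                table[key] = [rng.uniform(-0.1, 0.1) for _ in range(n_factors)]

    for _ in range(epochs):
        for user, item, value in ratings:
            p, q = users[user], items[item]
            err = value - sum(a * b for a, b in zip(p, q))
            for k in range(n_factors):
                pk, qk = p[k], q[k]
                # Gradient step on squared error with L2 regularization.
                p[k] += lr * (err * qk - reg * pk)
                q[k] += lr * (err * pk - reg * qk)
    return users, items


def predict(users, items, user, item):
    """Predicted preference: dot product of the two factor vectors."""
    return sum(a * b for a, b in zip(users[user], items[item]))


def rmse(users, items, ratings):
    """Root mean squared error of the predictions on the given triples."""
    total = sum((v - predict(users, items, u, i)) ** 2 for u, i, v in ratings)
    return (total / len(ratings)) ** 0.5


if __name__ == "__main__":
    # +1 for an upvote, -1 for a downvote (an assumed encoding).
    votes = [
        ("ann", "a", 1), ("ann", "b", -1), ("ann", "c", 1),
        ("bob", "a", 1), ("bob", "b", -1),
        ("cat", "a", -1), ("cat", "b", 1), ("cat", "c", -1),
    ]
    users, items = train_factors(votes)
    print("training RMSE:", rmse(users, items, votes))
    print("bob on c:", predict(users, items, "bob", "c"))
```

The prediction for a pair that was never voted on (here bob and item c) is what a recommender would use to rank unseen items for that user.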

The biggest effect is that it is discouraging trolls and spammers, as there's
no front page they can spam or troll. The downside is that the critical mass
required to make the site truly social is a lot higher, and unfortunately we
have not figured out how to gain enough users for it to really shine.

~~~
ntoshev
I agree recommendations are a way to take trust into account, but it is a
different way with its own drawbacks. You wouldn't need a higher critical mass
if you had implemented just a pagerank equivalent as opposed to
recommendations.

You can make a page without recommendations the default, at least until you
acquire critical mass. You can just have a "personalized" slider defaulting to
no personalization.

------
rglullis
I kid you not: "a website where users could point out which users they trust, on a
variety of different websites (like reddit, news.YC, OpenID handles, etc.),
with the intent of creating a TrustRank for people" was my proposal for YC
this summer.

The idea is on the shelf. If someone wants to work on that, I'd be glad to
join and exchange some ideas.

~~~
ntoshev
You would need voting data to make anything with the nice properties of
pagerank.

------
evgen
As gaika mentioned, the mechanics of this are pretty easy if you want to do a
simple pagerank equivalent. There are a few caveats though: pagerank is a
positive weighting only so you would need to google up "yaprank" (which I
believe is the system jaanix uses) to see an example of how to deal with
downvotes, you need to deal with rank sinks and other ephemera of the pagerank
system, and you need a network that is large enough to let the ranking system
bring real benefits.

As someone who is working on a system like this that can also do realtime rank
adjustments I can assure you that the mechanics are simple, the hidden details
are subtle, and until you have a community that is active enough for this to
make a difference it is hard to say that the effort is worthwhile.

~~~
gaika
The yaprank page is long gone; if somebody still needs this info, please contact
me.

------
njharman
Pagerank has had to change many times to combat black hat SEO and other
undesirable outcomes. It is especially vulnerable to gaming the system. It has other
problems people have mentioned here.

Do you want to wage a constant arms race with the exploiters? Is that really
the best use of your time at a startup? Are you going to expend the amount of
resources Google expends on it, when even Google still doesn't get it perfect?

------
sanj
Wow, I had a completely different idea based on the title:

Incorporating rankings from news-voting sites (digg, yc, reddit) directly into
pagerank for search.

