In my opinion, we want people to keep quiet until they have something meaningful to contribute to the conversation. The idea that making any post is a "return on investment" is nonsense.
I rarely open my mouth. There was a short period where I looked to snipe with some witty comment, but I quickly outgrew that.
I've made one post and I am rarely finding things pertinent to the tribe that haven't already been posted, so instead I look for new interesting content here and only open my mouth when I think I have something valuable to add to an existing discussion.
I think something like the proposed solution would leave a user like me out in the cold. I am by no means saying this objection should kill the idea, you gotta break a few eggs, and if I happen to be one of them, so be it. It should be given some thought though.
(If I were changing HN, I'd love an option to replace the karma score with a reply inbox, so I don't have to poll my threads page. I'm trying to break myself of my post-and-run habits, of my general fear of defending my ideas, and of my overvaluation of karma.)
I think the submission makes an incorrect assumption that valuable users contribute regularly. Certainly regularity isn't baked into any karma measures on HN right now, and I'd expect that's by design rather than a flaw to be corrected.
Those OP measures don't account for that, I think.
Since Reddit is open source, I could create a copy right now that would have no content on it. Let's say I populated it with all of the old, highly rated posts. This would be valuable as a source of high quality content, but has less value to me because it lacks current events and interaction with other users.
I said 'somewhat' because there clearly is a value in having high quality content, so I see it as a trade off. I think there's a market for both high-quality/low-volume sites as well as lower-quality/higher-volume sites.
dx is the number of days the user is registered. If the user made one post and then stopped, his karma score would shrink over time as dx approaches infinity while the numerator remains constant.
The numerator technically has an extra term for each successive day, but it looks like that additional term is 0 if no comments were made on that day.
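To make the decay concrete, here is a rough sketch of one reading of that formula: total karma earned so far divided by days registered. This is my simplification, not the article's exact metric, and the name per_day_score and the sample numbers are made up for illustration.

```python
from typing import Sequence


def per_day_score(daily_karma: Sequence[float]) -> float:
    """Karma earned so far divided by the number of days registered.

    daily_karma holds one entry per registered day; a day with no
    comments contributes 0 to the numerator.
    """
    days = len(daily_karma)
    if days == 0:
        return 0.0
    return sum(daily_karma) / days


# One post worth 10 points on the first day, then silence.
for days in (1, 10, 100):
    history = [10] + [0] * (days - 1)
    print(days, per_day_score(history))
```

The numerator stays at 10 while the day count keeps growing, so the score drifts toward 0.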
Before we optimize how we weight karma, we should first ensure that points are awarded for valuable behavior. Right now, I think it's measuring conformity. Is that true? And if true, is that desirable?
Since downvoting erases, it should definitely not be used for simple disagreement. It should be used on poor arguments, bad faith, pointless posts, etc.
It's harder to suggest that upvoting should not be used for agreement.
Regardless, in both cases, I prefer that voting be used to evaluate quality.
By summing up and down votes, you are destroying data. There is no difference between a controversial point and an idea nobody cares about.
I use a browser extension to show up and down votes on Reddit and I really miss that info here on HN. Not that we get to see comment scores anymore but I still miss it even on my own comments.
mostly upvotes: keep it
mostly downvotes: drop it, it's junk; the community has spoken
many mixed up/down votes: controversial topic, maybe delay the appearance of reply links?
few votes: boring, maybe sort it lower
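A minimal sketch of those four buckets, assuming the site keeps up and down counts separately. The function name and both thresholds (min_votes, majority) are placeholders I picked, not anything HN or Reddit actually uses.

```python
def classify(ups: int, downs: int, min_votes: int = 5, majority: float = 0.7) -> str:
    """Label a comment from its separate up and down vote counts."""
    total = ups + downs
    if total < min_votes:
        return "boring"          # few votes: maybe sort it lower
    share_up = ups / total
    if share_up >= majority:
        return "keep"            # mostly upvotes
    if share_up <= 1 - majority:
        return "drop"            # mostly downvotes
    return "controversial"       # many mixed votes


# Both comments have a net score of 0, but the separate counts tell them apart.
print(classify(5, 5))   # controversial
print(classify(0, 0))   # boring
```

A summed score would report both of those as 0, which is the lost data I was complaining about.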
I find myself wishing there were two values: agree/disagree and valuable/useless. Not because I think we particularly care about the agree/disagree ratio (we can form our own opinions, and comments are always more interesting and valuable in this regard anyway), but because then it would absorb opinion votes, removing them from the valuable/useless score that helps determine where a comment is located and how it is colored.
Do url submissions deserve points: http://news.ycombinator.com/submitlink?u=http%3A%2F%2Fnews.y...?
I think that this may drive certain undesirable behavior (e.g. duplicate story submissions, down-vote capability for new-ish accounts). On the other hand, points derived from the comments you make are probably a better indicator of the quality of your contribution to HN.
Karma in websites is not necessarily about accurately reflecting some ground truth upvote probability, mean chance of liking a comment, expected future vote ratio of the commenter, etc. It's an incentive-design mechanism, in that a karma system is good if and only if it leads to the desired behavior when people use the website. When you ask yourself "how should I compare 5 upvotes and 5 downvotes versus 1 upvote versus 1 downvote versus no action at all", the answer is not weighting one of these situations higher/lower because it will better approximate one of the criteria above in expectation or something like this, but instead weighting these high/low depending on, for example, if you want to encourage activity, agreement, controversy, etc.
Ideally, you should have some other behavioral metric in mind (say, mean comment quality, top comment quality, bottom comment quality, engagement, etc) and try to tune the voting system to maximize this quality over time. (this tuning can either be done intuitively, as pg tries to do, or algorithmically, using something like the technique behind Gmail's priority inbox or bandit algorithms)
Do not "define away" a social problem with mathematics; use the mathematics to help you solve the actual social problem.
That's precisely what my metric is intended to measure. The community derives quality of individual comments. The article explains why measuring mean, top, or bottom comment karma is a flawed approach. By measuring what we might call an "enhanced Sharpe", we encourage consistent, high-quality engagement.
I suppose I'm not really understanding what your issue is with the formula. It's certainly not trying to "define away" some social problem-- how users vote, which articles make it to the front page, etc., is an exercise left to the reader. The only thing this metric is intended to do is replace the total score shown in the top right with one that better reflects your contribution to the community.
 Or just the Tansey Ratio, if it's not too presumptuous.
For example, it is discouraging to log in to HN and see that your karma has fallen since your last visit. Using your formula, this would be a very regular occurrence for all but the most active users. Discouraged users are less likely to continue attempting to engage and many would eventually give up their attempts to maintain a decent karma. Then, while your measure may be "more accurate" in some sense, it would easily be less effective for goals that have more to do with engagement & participation than with notional accuracy.
Not trying to poo-poo the spirit of your post though, because as someone without much of a math background, this type of discussion is very enlightening. Just trying to clarify that inaccuracy may very well be a feature, not a bug.
I can see this punishing infrequent, but quality posters and decreasing the signal-to-noise ratio.
Sadly, some things that work out just fine in math don't mesh that neatly with human behavior.
I'm not sure that's true. Lots of people play games where their rankings shift down if they don't constantly play. In a lot of cases, this actually increases engagement. I think we would need to see some evidence here, but the null hypothesis should be that user engagement does not change.
If one wanted to assume that it does negatively affect engagement, however, then maybe an extended approach is to show the ranking of the user's ratio score? This is less of a judgement of their score and more of a pleasant reminder that they are not contributing as much as others. Alternatively, you could also set the risk-free-rate to 0 for both the comment and day, then only update the scores periodically.
I suppose one could argue that it's not a competition and you shouldn't be vying for a higher score, but then why show us our karma at all? Similarly, if one believes that consistent contribution is not important to the community, then it's a philosophical difference and we'll have to agree to disagree there.
To counter this, all we would need to do is define what rate of karma inflation is acceptable, then adjust all displayed karma ratings to compensate.
Cumulative karma scores probably correlate to long term contribution, but they don't meaningfully reflect daily contribution because some days the best contribution I can make is to shut up and listen.
A possible solution is to scale by the number of people who look at the comment, although this might be difficult to do well. You could probably get better results by estimating a regression containing the following variables: age of post, score of post, number of comments, average score of comments in the thread, and the depth of the tree in which the comment was posted to get a good determination of how many points an average comment in that situation would get.
The process is actually relatively painless and still seems to work.
"Total" and "Average" are -really- easy to explain to someone, and encourage them to make good quality posts. Volatility adjusted Sharpe ratio doesn't readily explain anything.
For instance, I'm posting this late in this thread's life. If 10 people read it, and 7 upvote it, that's a very high percentage of upvotes. If I had posted this 10 hours ago when the thread was created, I would have received a lot more upvotes even though the comment is the same. Sure, maybe the comment isn't as valuable now because fewer people will read it. But I think the goal should be to judge a comment's value regardless of whether the user was lucky enough to find the thread when it was first created.
The simplest way to calculate this is to use page views of the thread after the comment was posted, and maybe factor in how high on the page the comment is to estimate how many people have read it.
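Roughly what I have in mind, as a sketch only. The function upvote_rate, the position_factor parameter, and the "early" numbers are all made up for illustration; only the 7-out-of-10 case comes from my example above.

```python
def upvote_rate(upvotes: int, views_after_post: int, position_factor: float = 1.0) -> float:
    """Upvotes per estimated reader of a comment.

    views_after_post: thread page views recorded after the comment was posted.
    position_factor: assumed fraction of those viewers who scroll far enough
        to actually see the comment (1.0 means everyone).
    """
    readers = views_after_post * position_factor
    if readers <= 0:
        return 0.0
    return upvotes / readers


late = upvote_rate(7, 10)        # posted late: few readers, most of them upvote
early = upvote_rate(70, 1000)    # posted early: ten times the votes, far more readers
print(late > early)              # True
```

On raw points the early comment wins easily; per reader, the late one does.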
On top of that, it's semantically difficult for me to grok what the downvote button is supposed to be used for. Should I downvote posts simply because I don't agree with them? Should I downvote garbage instead of flagging it?
If pg got rid of that button, the meaning of karma would be cleared up, both for posts as well as for users. Upvote comments you think are high quality (or because you agree with them), flag things that are not supposed to be here, and simply ignore all the rest.
It already works like that for stories, let's just go one step further and treat posts the same way.
In math, superscripts usually mean exponentiation. The formulae are really simple. Just take out the x completely and consider everything to be in the context of a single user.
And that's not even considering the many drawbacks of these calculations. The main one being that it encourages a lot of commenting when we don't necessarily need everybody commenting all the time, but rather when they have something useful to add to the discussion.
It's probably going to have problems. Wouldn't it punish someone who had mostly good comments and posts, and occasionally gets one with a huge amount of karma, versus someone who never got one with big values of karma?
Well, I didn't, Sharpe did. :)
However, the important distinction is the notion of a risk-free rate of return. In this case, it's (loosely) the 1 upvote you automatically get for every comment; in finance, it's usually the return you get on US Treasuries (around 1%, though right now it's effectively 0%).
>Wouldn't it punish someone who had mostly good comments and posts, and occasionally gets one with a huge amount of karma, versus someone who never got one with big values of karma.
Assuming all else is equal (meaning it's N comments of karma K vs. N+1 comments where the first N are of karma K and the N+1th comment is something huge relative to K)? No. This makes sense if you think about how the standard deviation is derived. Also note that I am capping the minimum standard deviation at 1, so consistently hitting the same karma does not give you an infinite score.
However, if both individuals have the same mean karma but one got it from consistently scoring around that mean, and another got it from having one huge upvoted comment and several smaller ones, then yes. But isn't that what we want?
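Here is a stripped-down, per-comment sketch of the same-mean case. It is not the full formula from the article: it assumes a risk-free rate of 1 (the automatic upvote), uses the population standard deviation, and floors it at 1. The function name and the sample karma lists are made up.

```python
from statistics import mean, pstdev


def sharpe_like(karma, risk_free=1.0, min_std=1.0):
    """(mean karma - risk-free rate) / standard deviation, with a floor on the deviation."""
    if not karma:
        return 0.0
    spread = max(pstdev(karma), min_std)   # the floor avoids dividing by ~0
    return (mean(karma) - risk_free) / spread


consistent = [5, 5, 5, 5, 5]     # mean 5, no spread
spiky = [1, 1, 1, 1, 21]         # mean 5, one big hit
print(sharpe_like(consistent))   # 4.0
print(sharpe_like(spiky))        # 0.5
```

Both users average 5 points per comment, but the consistent one scores higher.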
Karma goes astray as both an incentive and a measurement primarily because karma inflation distorts incentives; more users and more voting dilute the impact of downvoting, leading people to judge comment scores relative to other comments, which makes genuine troublemakers harder to spot in the trends.
Hacker News appears to be the same as Reddit with regard to how people treat karma, which seems to have been encouraged by the adoption of private karma: thermostat voting has gone down (but I remember offhand a comment by pg that this has not influenced scores, if this is correct the only significance is that down votes are now a much better data point should pg ever want a more sophisticated ranking system), but it has exacerbated comment relativity, doubly so because rather than using comment scores to only order comments, HN also obscures posts with a score lower than one.
Disclosure: Community moderation is something I like to tinker with, and my own experiments have increasingly led me away from karma. But there is no doubt that it is one of the most effective forms of soft moderation we have today, though I see little reason to believe that it can effectively scale past tens of thousands of users.
"Hacker News, since about two years or so when there was a large influx from Reddit and Digg people, appears to be the same as Reddit with regard to how people treat karma"
Also, could be just me, but I found your notation a little confusing, where you've used x as a superscript.
The most important aspect of HN is the culture which we want to preserve. So users should get new points/votes/karma when they reinforce the culture and lose points/votes/karma when they break the culture apart.
Voting is the only means to change the karmic dynamic of users. So voting should be reserved to the active old timers in greater proportion and to the newcomers in smaller proportion.
For example, newbies shouldn't be able to vote at all below a karma threshold and when they are, their votes should have a fractional effect compared to the vote of an old timer. Maybe something like if one 4000-karma guy downvotes a comment it would need to take 100 of 40-karma guys to upvote it back to zero but only 10 of 400-karma guys. If all HN users voted on an item, the weight of each vote would be a single user's karma / total sum of karma of all HN users. Of course, only a subset of users ever vote on a single item so the total sum should be limited to a subset of users, such as those expressing interest in voting for that item or those having voted for that submission or any of its other comments. As long as newcomers can't come and upvote each other's comments to gain karma without the approval of the high-ranking old timers. Therefore, those newcomers who can already vote would contribute fractional karma points with their votes.
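A quick sketch of that weighting, to check the arithmetic. It assumes the "subset" is simply the users who actually voted on the item, and the function name and min_karma parameter are placeholders of mine.

```python
def weighted_score(votes, min_karma=0):
    """Net score of one item, each vote weighted by the voter's share of karma.

    votes: list of (karma, direction) pairs, direction +1 (up) or -1 (down).
    min_karma: assumed threshold below which a vote is ignored entirely.
    """
    counted = [(k, d) for k, d in votes if k >= min_karma]
    total_karma = sum(k for k, _ in counted)
    if total_karma == 0:
        return 0.0
    return sum(k * d for k, d in counted) / total_karma


old_timer_down = [(4000, -1)]
print(weighted_score(old_timer_down + [(40, +1)] * 100))   # 0.0
print(weighted_score(old_timer_down + [(400, +1)] * 10))   # 0.0
print(weighted_score(old_timer_down + [(40, +1)] * 10) < 0)  # True: ten newcomers are not enough
```

With min_karma set above 40, the newcomers' votes drop out and the old timer's downvote stands alone.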
Those old timers who define the culture can perhaps be identified by their karma, as recursive as it sounds. At some point the karma reaches some natural limitation as it's much more difficult to obtain tens of thousands of karma points than thousands. So eventually the most persistent ones would gradually join the higher ranks because it would be really hard to escape.
But the computation of karma probably doesn't matter much. Just adding up votes will sufficiently track the relative ranking of each user.
Not that this matters much. I think people just stop reading various websites when they judge the site, overall, to be boring or useless. For example, if people find stackoverflow to be more informative than (insert name here) then stackoverflow "wins". Whatever winning means.
The other issue that has come up in multiple comments already posted is the role of downvoting. Downvoting is a pet issue on HN--I have seen more than a dozen full threads about downvoting and scores of comments about downvoting in other threads in my 1010 days of registered participation on HN. Before I came on board, 1284 days ago, pg (the site founder) wrote, "I think it's ok to use the up and down arrows to express agreement. Obviously the uparrows aren't only for applauding politeness, so it seems reasonable that the downarrows aren't only for booing rudeness."
Although I would agree with putting in a two-dimensional voting/flagging system (with one dimension being agreement with the statement(s) in the post, and the other dimension being a judgment of how much the post contributes to the community), while such a bivariate system is not yet implemented, it makes sense to downvote comments without further follow-up comment if they add nothing to the posted discussion as it is already posted, in light of the submitted article or question opening the thread. No one should be obligated to comment on a useless post before downvoting it. It is the responsibility of each commenter (as several commenters here implicitly agree) to make the case for his or her own comment being visible by what is said that is new and helpful in the comment.
When pg opened a thread 142 days ago with the question "Ask HN: How to stave off decline of HN?"
he wrote, "The problem has several components: comments that are (a) mean and/or (b) dumb that (c) get massively upvoted."
That's still the key issue. It doesn't do any reader of HN any good if a comment that is dumb gets net upvotes. Nor does it do any good if a mean comment is upvoted--that causes active harm to the community. If participant behavior brings about higher scores for good comments, and lower scores for mean, dumb, or other bad comments, that is helpful to all readers of HN.
Some users who are worried about downvotes are worried also about HN hivemind or groupthink. It may be that there are unexamined opinions without factual warrant that are held by the majority of HN participants--that is to be expected on the basis of psychological research.
The thing to do about groupthink is to dare to comment, karma be damned, and to respond with thoughtful, informative comments that challenge majority opinions. I have also thought that it might be useful for veteran participants here on HN who have a Web presence to post a Web page or blog post discussing what they see as the main hivemind or groupthink issues on HN, with citations to good sources of information on those issues, and then to put links to such online discussions in their user profiles. That way, if a user is a contrarian on an issue that a lot of HN participants care about, the user can invite all other HN participants to look up facts on the issue. That might help raise the level of discourse here.
After being here 1010 days and seeing a few rule changes and MANY discussions of upvoting, downvoting, and karma rules, I think the main thing to do here to improve the quality of discussion is to UPVOTE more. Upvote a person who asks a follow-up question like, "Do you have any sources to back up that statement?" (I often see such comments grayed out, indicating that they have been downvoted, but comments that ask for more verifiable information are nearly always helpful.) Upvote a person who says "Thank you" out loud, and silently upvote a comment that you think deserves thanks for politeness or thoughtfulness. Upvote a comment that provides a link to an online resource you didn't know about before. Upvote a comment that apologizes for a gaffe or that admits a factual mistake. Upvote that which is good, and there will be fewer problems with inaccurate signaling here.
Feel free to review the site guidelines
and the site welcome message
for guidance on what is desired here and thus guidance on how to vote.
Garbage in, garbage out. If you don't upvote good stuff, other people will upvote bad stuff.
P.S. Sorry, I never tl;dr but it's important to highlight this key point.