
Hacker News metrics (first rough approach) - edwintorok
http://mjg59.dreamwidth.org/33551.html
======
dang
_The Hacker News ranking algorithm tends to penalize stories that address
social issues._

The Hacker News ranking algorithm does not penalize stories that address
social issues. It doesn't consider social issues at all, apart from a couple
of special cases that we turn off once a major wave subsides [1]. The effects
the OP is observing are mostly caused by user flags and the voting ring
detector.

1\. The intention there is to prevent copycat and follow-up stories from
clogging up the front page. For example, NSA stories used to be weakly
penalized, but no longer are, and I think there's still a similar penalty on
ebola stories.

~~~
slg
I seem to remember reading that in an attempt to avoid "controversial"
subjects, posts are penalized if they generate comments too quickly. Social
issues are often hotly debated and occasionally some of the most actively
commented on posts on the site until they drop off the front page. So while HN
might not actively be penalizing these posts, they might be penalized in
indirect but still very real ways.

~~~
krapp
I believe a thread with more comments than votes is penalized as well,
suggesting that commenting on a thread without upvoting it first is implicitly
flagging it.

And of course there may be other arcane metrics involving thread depth,
average comment rate, comment size, thread shape, etc. Who knows?
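
To make the guess above concrete, here is a purely hypothetical sketch of what a comments-versus-votes penalty could look like. Nothing in it is confirmed by HN: the function names, the 40-comment threshold, the exponents and the gravity constant are all assumptions chosen for illustration.

```python
# Hypothetical illustration only; not HN's actual ranking code.

def base_score(points: int, age_hours: float, gravity: float = 1.8) -> float:
    """Time-decayed score: more points rank higher, older stories sink."""
    return max(points - 1, 0) ** 0.8 / (age_hours + 2) ** gravity


def controversy_factor(points: int, comments: int,
                       min_comments: int = 40, exponent: float = 2.0) -> float:
    """Return a multiplier in (0, 1]; 1.0 means no penalty.

    Assumed rule: only threads that are both busy (more than
    `min_comments` comments) and have more comments than points
    are scaled down.
    """
    if comments <= min_comments or comments <= points:
        return 1.0
    return (points / comments) ** exponent


def rank_score(points: int, comments: int, age_hours: float) -> float:
    """Combine the decayed score with the assumed controversy penalty."""
    return base_score(points, age_hours) * controversy_factor(points, comments)


if __name__ == "__main__":
    # Same points and age, but the second thread has twice as many
    # comments as points, so it is scaled down under this assumed rule.
    print(rank_score(points=50, comments=10, age_hours=3))
    print(rank_score(points=50, comments=100, age_hours=3))
```

Under a rule like this, a heavily debated story would sink without anyone flagging it, which is the kind of indirect effect described above.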

~~~
Chinjut
Whoa, really?

...I've been flagging a lot of articles.

~~~
AndrewDucker
Only a factor if it's got more comments than upvotes.

------
V-2

> This clearly isn't an especially rigorous analysis

Not especially, no.

Data samples consisting of two or five articles per keyword??

And 2 out of all 5 penalized "female" entries have nothing to do with social
issues etc. (one is about female mice, another about mixed-sex animals).

Comparing stories on social justice against ARM and Intel is also
cherrypicking, because we don't know whether this is supposed to be a bias
against social issues, or just non-tech stories in general, or something else.

And what's your methodology of choosing keywords?

"Female" \- which made it onto your list - gets penalized 5 to 1, all right.

But then in case of "girls" (omitted from your list) it's actually 0 - 3,
telling us a different story.

And unlike with the "female" bunch, none of these articles is about mice.
They're about how girls get better grades, how to get them into engineering
and why it's easy to teach them code.

"Women" (plural) lose 13 - 4 and so the keyword is featured on your list, but
"woman" (singular) wins 7 - 2, and surprise surprise, it's absent.

"equal", as a word stem (counting "inequality", "unequal" etc.) - penalized 2
times, not penalized 3 times. Holy smokes Batman, it rates better than "x86"!

Statistics, huh ;)

If this submission gets penalized, do we count it as bias against social
justice, or junk science?
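
One rough way to check how much such small tallies can support is an exact two-sided binomial test. The sketch below assumes each pair of numbers above means (penalized, not penalized) and that, absent any bias, a story is as likely to be penalized as not. Both are assumptions, and the real baseline rate is unknown.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k "penalized" out of n stories.

    Sums the probability of every outcome that is no more likely than
    the observed one under the null hypothesis (penalty rate p).
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties are not lost to float rounding.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


# (penalized, not penalized) per keyword, as read from the counts above.
tallies = {
    "female": (5, 1),
    "girls": (0, 3),
    "women": (13, 4),
    "woman": (2, 7),
    "equal": (2, 3),
}

if __name__ == "__main__":
    for keyword, (penalized, not_penalized) in tallies.items():
        n = penalized + not_penalized
        p_value = binom_two_sided_p(penalized, n)
        print(f"{keyword:>6}: {penalized}/{n} penalized, p = {p_value:.3f}")
```

With samples this small, most of these tallies are hard to tell apart from coin flips, and testing many keywords at once makes any single low p-value less convincing.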

------
onewaystreet
This article is humorous considering that it was submitted on the same day as
Tim Cook's op-ed which has been #1 on HN all day and now has over 2000
upvotes.

------
notlisted
Interesting. I see that four stories critical of AirBnB were penalized
despite many comments (113, 70, 63, 54) and upvotes. The ones touting its
value weren't.

I am not surprised. At least 5 times now, I've seen negative stories on AirBnB
disappear from the home page fast, whilst silly links remained. Is someone
protecting their investment?

------
styger
Thanks for doing the work of trying to apply data to investigate this. It's
too often lacking in discussions like these.

------
aw3c2
Not sure about penalising algorithms but I make sure to flag pointless social
war mongering posts.

~~~
pron
Right, because it's _the posts_ that incite social wars rather than the real
events they discuss. This is how it happens:

1\. Something bad happens due to what we can call "SV culture", at least bad
for some people.

2\. A brave soul writes about the injustice, and posts the story (or someone
else does) to HN.

3\. The story is promptly flagged because the discussion of the injustice is
"pointless social war mongering".

4\. Order is restored.

5\. Another brave soul writes about how HN reflects the SV social hegemony by
ignoring important issues.

6\. That story is promptly flagged.

~~~
imaginenore
So much butthurt.

This is Hacker News, not Tumblr. Social justice idiots try to inject their
nonsense into every culture, and I would prefer this community to stay on the
technology.

~~~
pron
What's technology for? Technology isn't neutral (like science can be). It is
either to advance social justice or to obtain or maintain power. I think
you're not particularly powerful, so I can only assume you're so laser-focused
on technology, so religiously fanatical about it, because, how did you put it?
Well, because you're a "social justice idiot".

~~~
imaginenore
You seem to have a mental disorder when everything seems to be revolving about
the issues you like, and so you are unhappy when they aren't covered by site X
you visit.

Most of us like technology because it's interesting, not because it gives us
some magical power. It's infinitely more interesting than whining about the
abuse of virtual characters in games and other such idiocy.

~~~
dang
> You seem to have a mental disorder

Personal attacks are not allowed on Hacker News.

~~~
imaginenore
Aren't you a mod here? You should ban me, or whatever the proper punishment
is.

~~~
YuriNiyazov
Rest assured that you will be banned eventually if you continue like that, but
the proper first step is to tell someone what they are doing wrong, rather
than ban them outright.

------
walterbell
See also this thread about manual elevation,
[https://news.ycombinator.com/item?id=8313505](https://news.ycombinator.com/item?id=8313505)

------
pron
> But for now the evidence appears consistent with my innate prejudice - the
> Hacker News ranking algorithm tends to penalise stories that address social
> issues.

My innate prejudice says something worse: that many stories that have to do
with social justice but are oblivious to their potentially adverse effect
(intentionally or not) are _not_ penalized and are rather handsomely rewarded.
These are stories that seem to be about "innovation", but are really about a
power struggle (e.g. stories about companies providing marketplace services in
the "sharing economy"), with the sole exception of privacy (which fits with
the techno-libertarian bias).

It is often said that discussion here shouldn't be about politics, but the
worst kind of politics is that which you don't even notice (or choose not to
notice).

And once something of substance does come up, it's so frustrating to see it
quickly drop in rank because it's automatically tagged as controversial. So a
story about a company offering cheap food delivery can stay at the top for a
while, but a discussion about a workers' strike against Uber quickly
disappears from the front page.

~~~
maxerickson
You may be including it in "automatically tagged as controversial", but I have
the feeling that there are lots of people that are flagging those topics
(which fits under "automatic" if you include reflexive flagging of social
issues topics, but doesn't if you limit "automatic" to the software running
the site).

I pretty much don't flag anything, but I can see how that flagging could be
driven more out of a feeling that those discussions don't accomplish anything
than out of a desire to penalize those stories.

~~~
pron
Right, flagging is another issue. But as to "those discussions don't
accomplish anything", what does any discussion on HN accomplish? On the more
technical stories, the discussions often educate. Thing is, HN isn't
/r/programming, and there are many stories about company culture, startups
raising money, SV superstars etc. What do the discussions about those kinds of
stories accomplish? It's just people schmoozing for entertainment. What's
wrong with that?

But I have a practical suggestion: Instead of having unpleasant discussions
about important stories related to technology, innovation and startups (like
women in startups, which is a far more important topic than, say, Uber raising
another trillion dollars), simply block comments on these stories, and at
least let people read them without having them drop due to too many comments.
Sure, people will still flag the stories, but AFAIK, the flagging privilege
can be revoked if a user abuses it.

~~~
maxerickson
I don't really have an answer to the first paragraph, like I said, I don't
really flag things (exceptions are things like dupes of recent, decent
conversations or if I happen to look at /new, the really dumb, offtopic stuff
(like celebrity news or cat pictures, the stuff that is waay out there)).

I think we also probably see the importance of visibility on hacker news
differently; you're arguing that it should be a certain way, apparently
because you think it will do some good (the soft phrasing there is because I'm
guessing at what you are thinking, not sarcasm), I don't think it is going to
have much impact (for example, many of the people that make threads awful see
it as an opportunity to defend, with guile and shenanigans, their pet point of
view; that's pretty clear evidence that the story isn't reaching them). I also
think the media spends a lot more time pandering than it does manipulating
(just throwing that in there as a similar situation where I apply a similar
world view).

------
grimtrigger
I think there's plenty of evidence that humans by nature are uninterested in
"politics", except when it might affect them. Then it becomes an object of
great importance.

As for the role of the algorithm, I think it's just an extension of "garbage
in, garbage out".

Stories like this are a nice reminder that every community is also a bubble.

------
goodJobWalrus
Can someone explain how flags are applied? For example, today a post where a
person is asking for a job got killed, but I have seen similar posts (people
looking for a job) before that were left on the front page. Why was this one
killed?

------
hippowithgas
wow - the money quote "I scraped Hacker Slide to get just over two months of
data in the form of hourly snapshots of stories, their age, their score and
their position" \-- how did he do this? I can barely get to the homepage say
14 times out of 20, without hitting the error page from their CDN/DDOS
Protection pages (Cloudflare?). And that's not to scrape content, just to read
the headlines from time to time.

------
untog
An interesting read. Of course, it criticises HN for actively penalising talk
on social issues, which might well mean that this story itself will be
penalised. And so the great cycle continues.

~~~
zorpner
In fact, it was:
[https://twitter.com/mjg59/status/527890332206510080](https://twitter.com/mjg59/status/527890332206510080)

~~~
dang
According to HN's software, this story has been heavily ring-voted. That's
what has affected its rank.

Call that "penalized" if you want, but it has nothing to do with the content,
so that tweet is misleading—and ironic, too, given what provoked the
"penalizing".

(Also, in case anyone's wondering, moderators haven't touched the post.)

~~~
zorpner
_it has precisely nothing to do with the content_

Do certain types of content set off the ring-voting detection more frequently?
The type of content likely influences how it's shared and how people respond
to it, which would affect how it's voted (and/or flagged). Algorithms are not
unbiased, particularly when no one's bothered to study their confounding
factors.

~~~
dang
> Do certain types of content set off the ring-voting detection more
> frequently?

Sure—people posting their startups, for example, set off the ring detector
more frequently than the typical news article. Friends voting for friends'
blog posts, too. But is that a bias of the algorithm? That strikes me as a
stretch.

~~~
AndrewDucker
Interesting. How do you guard against coincidental rings?

(i.e. a subgroup of people who all love, say, Lisp, and therefore always vote
up Lisp stories. They're not an organised group - but they do tend to vote in
concert.)

~~~
dang
I used to worry about that too, but it turns out not to be a big problem in
practice. (I realize that's not a satisfying explanation, but this is one area
where we can't give out details without enabling people to game the site.
Having put a ton of effort into HN's anti-voting-ring measures, I dread the
thought of having to climb all the way down the mountain and push that boulder
up again.)

