
Popularity Dynamics and Intrinsic Quality in Reddit and Hacker News (2015) [pdf] - usgroup
https://pdfs.semanticscholar.org/ccf6/0d08bdd989ea3595bbbda132dedd71c47acf.pdf
======
dang
_Reddit received approximately 450 million page views in December 2014, while
Hacker News received approximately 3.25 million._

I just looked, and HN had well over 60M page views that month. The Reddit
number is likely way too small as well.

~~~
qubax
Yeah. Both numbers seem very low. Are they confusing monthly visitors with
page views?

Pretty sure that in Dec 2014, monthly page views for reddit would be near a
billion and HN's would be in the tens of millions.

------
lellotope
This was an extremely interesting paper to me, about a topic that I see as
economically and sociologically fundamental.

I was actually impressed by the methods they used. I found myself thinking
"this is what I'd really like to see," and then they'd report it. Validating
their method on the MusicLab data seemed critical to me, as did examining
reddit resubmissions versus YouTube views.

Although I thought methodologically it was almost as well done as it could
have been outside of an experiment, I disagreed with the authors' conclusions.
They acknowledge some of the problems, such as the huge number of forgotten
posts they didn't model at all, but not others.

For example, it seems the question of most interest is, given an observed post
score, what's the actual "quality"? If you look at, say, Figure 3, it's
apparent that there's huge variability in quality conditional on score, as
observed score increases.

I think the correlational-style relationship they focus on obscures things
like this that are critical to interpreting the findings. Yes, there's a
strong estimated relationship between quality and score, if you ignore all the
missing data that constitutes the bulk of submissions, and the fact that the
relationship is being driven very strongly by a large quantity of very
low-"quality" posts versus everything else, and the variability everywhere
else. It's an odd, heteroscedastic, nonlinear relationship that isn't well
captured by a correlation, even a nonparametric one.
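
A minimal sketch of that point, using synthetic numbers that are entirely made up (they are not taken from the paper or from Figure 3): a large cluster of low-score, low-"quality" posts plus a small set of high-score posts whose quality varies widely. The helper names `ranks`, `pearson`, `spearman` and `quality_range` are hypothetical, written here only for illustration.

```python
# Synthetic (score, quality) pairs; every number here is invented for illustration.
# 30 low-score posts whose quality tracks score closely...
low = [(1 + 0.1 * i, 0.01 * i) for i in range(30)]
# ...and 10 high-score posts whose quality swings between about 1 and about 6.
high = [(100 + 10 * j, 1.0 + (5.0 if j % 2 == 0 else 0.0) + 0.01 * j)
        for j in range(10)]
posts = low + high


def ranks(values):
    """Rank values from 1..n (assumes no ties, which holds for this data)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, k in enumerate(order, start=1):
        result[k] = rank
    return result


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(pairs):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    scores, qualities = zip(*pairs)
    return pearson(ranks(scores), ranks(qualities))


def quality_range(pairs):
    """Spread of quality (max minus min) within a group of posts."""
    qualities = [q for _, q in pairs]
    return max(qualities) - min(qualities)


rho_all = spearman(posts)
rho_high = spearman(high)
print(f"Spearman, all posts:             {rho_all:.3f}")
print(f"Spearman, high-score posts only: {rho_high:.3f}")
print(f"quality range, low-score posts:  {quality_range(low):.2f}")
print(f"quality range, high-score posts: {quality_range(high):.2f}")
```

With this toy data the rank correlation over all posts comes out at about 0.99, while among the high-score posts alone it is about 0.33 and the quality range is more than ten times wider. The single summary number is dominated by the low cluster versus everything else.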

I also would have liked to see examination of variability in links across
sites. How much variability is there in rank of an initial link, to the same
material, across reddit, HN, Twitter, etc.? Maybe tellingly, the authors
report the relationship between YouTube views and number of reddit
_submissions_, but not the relationship (if I'm reading correctly) between
YouTube views and _rank of initial_ reddit submissions, which is kind of the
key relationship.

So, I liked the paper, but if anything it just reconfirms the conclusions of
earlier studies to me: that social network dynamics have a big influence on
apparent popularity.

------
arethuza
From the abstract:

_"We define quality as the number of votes an article would have received if
each article was shown, in a bias-free way, to an equal number of users."_

I haven't read the whole paper yet - but isn't that ignoring other major
factors like how "newsworthy" a particular link is? A low quality link might
get a lot of upvotes simply because it was the first link submitted that
describes an inherently interesting event.

~~~
iainmerrick
“In a bias-free way” needs careful definition. That ought to include the order
in which stories are shown (so it would eliminate any advantage of being the
first link posted relating to a specific event).
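
As a rough sketch of what "shown to an equal number of users" could mean once ordering is taken into account, one option is to scale observed votes by attention-weighted exposure. This is only an assumption-laden illustration, not the paper's estimation method: the function `estimated_votes_at_equal_exposure`, the attention weights per list position, and all the counts below are hypothetical.

```python
def estimated_votes_at_equal_exposure(votes, impressions_by_position,
                                      attention_by_position, common_views=1000):
    """Scale observed votes to a common number of attentive views.

    impressions_by_position: {list position: times the link was displayed there}
    attention_by_position:   {list position: assumed chance a viewer looks at it}
    Both inputs are hypothetical; real attention curves would have to be measured.
    """
    effective_views = sum(
        count * attention_by_position[position]
        for position, count in impressions_by_position.items()
    )
    return votes / effective_views * common_views


# Assumed attention curve: the top slot is always seen, slot 30 rarely.
attention = {1: 1.0, 2: 0.5, 30: 0.1}

# The first link about an event sits at the top and collects many raw votes.
first_link = estimated_votes_at_equal_exposure(120, {1: 1000, 2: 500}, attention)
# A later link stays buried and collects few raw votes.
buried_link = estimated_votes_at_equal_exposure(30, {30: 2000}, attention)

print(f"first link:  {first_link:.1f} votes per 1000 attentive views")
print(f"buried link: {buried_link:.1f} votes per 1000 attentive views")
```

Under these made-up numbers the buried link has a quarter of the raw votes but the higher exposure-adjusted estimate, which is the sense in which a first-mover advantage would be removed.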

------
pjc50
"Intrinsic quality" is a terrible name - it should be something like
"decontextualised quality" or "neutrally presented quality", because it's
still an aggregate subjective view on the quality of the article.

~~~
platz
Why use a word like "quality" at all here?

It seems unnecessary. They should've just used "estimated votes", since that
is what they are, or something derived from votes.

"Quality" is almost content-free and, worst case, is chosen in bad faith or
hubris to make the result seem more important.

------
claydavisss
aka tragedy of the commons

~~~
wufufufu
I don't think this is similar to tragedy of the commons at all.

https://en.wikipedia.org/wiki/Tragedy_of_the_commons

~~~
gwern
I would call this more of a stag hunt:
https://en.wikipedia.org/wiki/Stag_hunt
There's a tension between spending your time helping vote on /newest to get
stuff onto the main page, where it is then accurately ranked, and slightly
tweaking the ranking on the main page while enjoying the overall fruits.
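
A toy two-player version of that tension, as a sketch only: the payoff numbers are invented to satisfy the usual stag hunt ordering (both cooperate > defect against a cooperator >= both defect > cooperate alone), and the action names and the helper `pure_nash_equilibria` are hypothetical.

```python
from itertools import product

ACTIONS = ("vote_on_newest", "tweak_front_page")

# Hypothetical payoffs as (row player, column player).
# Voting on /newest only pays off if the other player does it too.
PAYOFFS = {
    ("vote_on_newest", "vote_on_newest"): (4, 4),
    ("vote_on_newest", "tweak_front_page"): (0, 3),
    ("tweak_front_page", "vote_on_newest"): (3, 0),
    ("tweak_front_page", "tweak_front_page"): (2, 2),
}


def pure_nash_equilibria(payoffs, actions):
    """Return action pairs where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        row_payoff, col_payoff = payoffs[(a, b)]
        row_ok = all(payoffs[(alt, b)][0] <= row_payoff for alt in actions)
        col_ok = all(payoffs[(a, alt)][1] <= col_payoff for alt in actions)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria


equilibria = pure_nash_equilibria(PAYOFFS, ACTIONS)
print(equilibria)
```

Both "everyone votes on /newest" and "everyone tweaks the front page" are stable under these payoffs. The second is worse for both players but safer for each one individually, which is what separates a stag hunt from a tragedy of the commons.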

