The Reddit Front Page Is Not a Meritocracy (toddwschneider.com)
485 points by lil_tee on Nov 6, 2014 | 147 comments



This post is interesting, but he should have just read the code:

https://github.com/reddit/reddit/blob/master/r2/r2/lib/norma...

This is an effect of the normalization algorithm, which is biased towards having a post from as many different subreddits as possible on the front page.

Edit: At least we're consistent! Ketralnis said the same thing about an hour ago. :) https://news.ycombinator.com/item?id=8568021


I don't think reading the code replaces the post. A good example is that even when you implement an algorithm you monitor it, measure the results and tweak it.

Code is the theory (bottom-up), and his research (which is great) is a top-down analysis.


I wasn't discounting his research (apologies if it sounded that way). I was simply pointing out that at the end he made it sound like this behavior was unexplained, when in fact that code explains it quite well.


Out of curiosity, what does the 'r2' stand for? Does the two indicate that it's a rewrite of the original Lisp code, as in r(eddit)2?


It's actually the second Python version. :)


That's an awful lot of funny-looking graphs to blame on a "normalization algo". e.g.

>r/funny posts simply never appear on the bottom half of page one or most of page two

Note the "never" part....not just rarely...never. Well within this 1.2 million point dataset anyway.


It's likely that, at any given time, the top 2-3 links of /r/funny are highly-ranked enough to make the front page -- except the algorithm won't let more than one from that subreddit through at a time.

So the alpha-funny link makes it, and the beta-funny link has to sit on page 3, suppressed, unable to take its rightful spot... until suddenly alpha-funny is too old, and is kicked off the front page, at which point beta-funny immediately ascends to take its place.
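
The succession described above can be sketched in a few lines. This is a toy model, not reddit's code: the `Link` fields, the 24-hour age cutoff, the three-slot page and the one-link-per-subreddit cap are all assumptions made for illustration, and real scores would also decay over time.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Link:
    title: str
    subreddit: str
    score: float      # stand-in for whatever ranking score the site uses
    age_hours: float


def front_page(links, page_size=3, max_age_hours=24.0, per_subreddit_cap=1):
    """Pick the best-scoring fresh links, allowing only a few per subreddit."""
    shown = []
    taken = {}  # subreddit -> number of slots already used
    for link in sorted(links, key=lambda l: l.score, reverse=True):
        if link.age_hours > max_age_hours:
            continue  # too old: no longer eligible for the front page
        if taken.get(link.subreddit, 0) >= per_subreddit_cap:
            continue  # this subreddit already has its slot(s)
        shown.append(link)
        taken[link.subreddit] = taken.get(link.subreddit, 0) + 1
        if len(shown) == page_size:
            break
    return shown


def age_all(links, hours):
    """Return copies of the links, each `hours` older."""
    return [replace(l, age_hours=l.age_hours + hours) for l in links]


links = [
    Link("alpha-funny", "funny", 100, 23),
    Link("beta-funny", "funny", 90, 5),
    Link("top-pics", "pics", 50, 3),
    Link("top-news", "news", 40, 2),
]

print([l.title for l in front_page(links)])              # alpha-funny holds the /r/funny slot
print([l.title for l in front_page(age_all(links, 2))])  # alpha-funny has aged out
```

In this toy run beta-funny outscores every non-funny link the whole time, yet it only appears once alpha-funny passes the age cutoff, which is the sudden jump the comment describes.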


I see. That does make sense. Bit of a curious view on "normalization" but I can see how that would result in the strange graphs.


It's kinda like the British royal family: Prince Charles is just one heartbeat away from the throne, but as long as the queen is healthy, she's the one who gets to be on all the money.


It appears the normalized_hot is only used for the first two pages. The rankings after that are subreddit-agnostic.


This is a gorgeous post and I shiver at the thought of how much work went into this! Thoughtful, detailed and chock full of great visualizations of the data. I'd be interested in a similar analysis of HN, as I'm interested in the editorial intent thus revealed.

On a tangential note, though, I was a bit surprised by the hypothesis. 'Meritocracy' to begin with is a dubious fiction, but especially so when mapped on to the vote distribution of a given post in a semi-decentralized human-moderated environment that is constantly being gamed. "Merit" just seems like a big value judgement over such a noisy channel.


A note about the graphics: all the charts were made with R/ggplot2. Not only ggplot2, but basic ggplot2: the charts in the blog post can be made with about 5 lines of code maximum each.

You can customize ggplot2 for even more beautiful charts, such as the ones used in my analysis of HN comments: http://minimaxir.com/2014/10/hn-comments-about-comments/

Note that reproducing this analysis for the HN front page is harder since there is a lot that happens behind the scenes that cannot be generalized. Here's an attempt at performing such an analysis (note dang's comments): https://news.ycombinator.com/item?id=8533757


I really enjoy using ggplot2, but the basic "theme" is terrible.

- They tried too hard to do a Tufte* - but it's actually way too flashy.

- Basically the issue is that it's not generic enough. Whenever you see a ggplot2 plot it's always screaming at you "I WAS MADE IN ggplot2!!" Just KISS...

- The grey background is completely unusable if you want to print out your charts!

- The default pastel colors (ie. PowerPoint 2012) are always a disaster for readability. In meetings people constantly can't tell them apart and they are definitely not colorblind friendly.

It's a shame b/c each script I write ends up having to have an obnoxiously long theme declaration to make it look "normal".

* I think most people miss Tufte's point. He presents a way of thinking about data presentation. Instead people just look at it and think "oooo! That looks pretty. Let me copy what he did". The guy has his own personal style - it's definitely nice, but the point isn't to copy it.

EDIT: For those thinking about learning ggplot2.. maybe hold off for a bit. It seems like it'll be deprecated soon and replaced with ggvis.


NB: Here is a paper on the derivation of the ggplot styles: http://vita.had.co.nz/papers/layered-grammar.pdf

I would recommend using theme_bw(), which helps solve some of the problem with the gray background.

EDIT: This is ggvis: http://blog.rstudio.org/2014/06/23/introducing-ggvis/

This is news to me, so I'll look into it to see how it differs from ggplot2.

EDIT 2: It seems the difference is that ggvis is geared more towards interactive charts, but it then requires a dependency on Shiny, which is not optimal for blog posts.


Thanks for linking the paper! I'll definitely read over that. I've heard it mentioned before ...

I'm definitely not an R guru or "in" on the latest news, but it seems that Hadley Wickham (who probably single-handedly is the reason R is still relevant) now works for the RStudio guys and he's reworking his tools. plyr is now dplyr and ggplot2 is now ggvis. And there's also another tool called tidyr. My understanding is that they're still in development, but they'll ultimately provide an "integrated" ecosystem for processing data.frames (with hooks into the RStudio IDE).

He talks about it at the beginning of this video https://www.youtube.com/watch?v=wki0BqlztCo


By the way, there is an awesome course track on Data Science from Johns Hopkins on Coursera right now, which includes an introduction to R.

https://www.coursera.org/specialization/jhudatascience/1?utm...


My favorite example of ggplot2's ease and customizability comes from xkcd-style graphs:

http://stackoverflow.com/questions/12675147/how-can-we-make-...


Not all the charts are ggplot2. It looks like the big interactive one is JavaScript, specifically Highcharts.


Thanks! I love those chunky graphs, definitely going to play around with ggplot2 a bit.


A deep analysis was published in http://www.righto.com/2013/11/how-hacker-news-ranking-really... (HN discussion: https://news.ycombinator.com/item?id=6799854 (920 points, 345 days ago, 190 comments)).

AFAIK there are no filters that cap the number of posts from a single domain on the front page, but some domains have automatic penalties.


that (referenced) analysis is out of date


> I'd be interested in a similar analysis of HN, as I'm interested in the editorial intent thus revealed.

HN no longer functions entirely off of the "reddit algorithm." I know that it's now curated to some extent.

> "Merit" just seems like a big value judgement over such a noisy channel.

I think the takeaway from such investigations -- and all of human history -- is that the system will be gamed. To survive, systems must be robust in the face of corruption and manipulation.


> it's now curated to some extent

It was before, too. We just explain it more now.


I like to describe Reddit and HN as voter nominated, not voter selected. Votes nominate stories, but then the algorithm decides which ones to pick from that pile.


And for HN, the "algorithm" is the subjective judgment of the moderation staff, as witnessed by countless articles on the misogyny in tech (as an example) getting axed from the front page.


Or maybe those subjective articles on misogyny in tech are flagged by users like me.


The biggest factor there by far is user flags.


This is a nice way to punt, but you (in the plural sense, not you specifically) have still admitted to removing them from the frontpage.


It appears he just meant to say "democratic".


One thing I'm interested in knowing is whether Reddit manipulates the votes on its own submissions. See, for example, r/blog: http://www.reddit.com/r/blog/new/

Every submission made there went to the frontpage. What are the chances? I mean, it's hard to believe that mainstream Redditors at this point are so interested in Reddit news that they keep sending every Reddit announcement to the top. What's more likely, to me at least, is that Reddit is either leveraging its knowledge of how things work to get to the top in a surefire way, or it's plain messing with the upvote numbers. I would love to see data on these submissions, if possible. Of course, there's good reason for them to actually do this -- they need to advertise the Reddit marketplace stuff and so on, it's how they make money.


I've been out of the loop for a few years, but no the votes aren't manipulated.

But the normalisation algorithm https://github.com/reddit/reddit/blob/master/r2/r2/lib/norma... does prefer to get at least one from each subreddit. This means that if /r/blog has a post it will probably be placed on the front page immediately. That action gets it a lot of votes just because it's very visible.


> I've been out of the loop for a few years, but no the votes aren't manipulated.

AHA! Let's just cancel out the double negative ("no" and "n't") and look what we uncover, in this former reddit admin's very own words: "the votes are manipulated"


For those who don't know, raldi is a former reddit admin as well.


That's not how english works. That's not a double negative.


Now let's cancel out the double negative ("not", and "not") in what you said:

> That's how english works. That's a double negative.

Exactly!


Or...

> That's a double positive.

Indeed!


Umm, that isn't a double negative.


Can you explain why /r/blog is treated differently in multireddits?

Back when there were only 25 default subreddits and multireddits had just been launched, I tried to create a default multireddit. If I just included the top 25, the rankings would match the logged out view perfectly. But if I added /r/blog and /r/announcements they would appear in different places. I think they usually ended up at a slightly lower rank, but they also stuck around in the top 50 much longer, with week old posts often being inserted into the middle of the rankings.


I don't think it is. AFAIK that's all of the normalisation code right there. IIRC, the front page inherits from MultiReddit internally.

But if your front page is 25 links long (the default) and has 25 subreddits, I imagine adding 2 more means that you need to show 27 subreddits in 25 links, so you hit some edge-case there where things need to be laid out again.

The way this probably works is that we pick 25 subreddits at random (and cache that random choice for a while), some of which may have no qualifying submissions. So you may end up with 2 /r/funny posts, even though you had 27 subreddits for 25 links.

Or at least that's how it used to work. I know this bit has changed but I don't know what the new way is
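
A minimal sketch of the "pick at random, then cache the pick for a while" idea described above. It is not reddit's implementation: the class name, the one-hour lifetime and the injectable clock are all hypothetical, chosen only to make the behaviour easy to see.

```python
import random
import time


class CachedSubredditPicker:
    """Pick k subreddits at random and reuse that pick until it expires."""

    def __init__(self, k=25, ttl_seconds=3600, clock=time.time, rng=None):
        self.k = k
        self.ttl_seconds = ttl_seconds   # assumed lifetime, not a known reddit value
        self.clock = clock               # injectable so the example can be tested
        self.rng = rng or random.Random()
        self._picked_at = None
        self._pick = None

    def pick(self, subscribed):
        now = self.clock()
        expired = self._picked_at is None or now - self._picked_at >= self.ttl_seconds
        if expired:
            pool = sorted(subscribed)
            # sample without replacement; never ask for more than the pool holds
            self._pick = self.rng.sample(pool, min(self.k, len(pool)))
            self._picked_at = now
        return list(self._pick)
```

With 27 subscriptions and k=25, two subreddits are left out until the cached pick expires, and any picked subreddit with no qualifying submissions leaves a slot that another subreddit's second link can fill.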


>I've been out of the loop for a few years, but no the votes aren't manipulated.

Could you then explain the, pretty much on the dot, massive score changes every 2 hours? That was even more noticeable when the downvotes were shown - although they were 'fake' the final score, which is according to the FAQ correct, was changed.

I noticed this when I created a simple graphing script. You can see an example graph of what I'm talking about here: https://github.com/Nikola-K/reddit-thread-graph#example

You can see that on any popular thread until the score is "normalized" around 3k points.


If accurate, I'd guess that it's the cache lifetime for which of the default subreddits to show on the front page


> it's hard to believe that mainstream Redditors at this point are so interested in Reddit news that they keep sending every Reddit announcement to the top.

"Mainstream Redditors" don't decide what is on the homepage because they don't vote. Only about 20% of redditors vote, and only about 20% of voters comment [1].

If you still find it hard to believe, just read the 900+ comments written in less than a day on their most recent post, which announced nothing and was just about how it's nice to be nice to people.

[1] http://www.quora.com/reddit-website/What-percent-of-Redditor...


The following is CWACkery: Complete, Wild-Ass Conjecture... kery.

Are you familiar with the 90/9/1 rule of thumb? That for any site with user-generated content (message boards, social networks, etc.), only 10% of people contribute to the discussion and only 10% of those people start new conversations?

At first, I read your comment about "20% of 20%" and thought "wow, that would be 80/16/4, that's incredible".

But this adds voting into the middle of the equation. So perhaps the 80/20 rule continues, and we can estimate that 20% of voters comment, and 20% of commenters post original content. That would make it more like 80 (reading only)/16 (voting)/3.2 (commenting)/0.8 (posting original content). Which might be considered (with rounding) 80/19/1. The original content contribution is roughly the same, but the voting system seems to be increasing the middle ground of engagement.

Double sign-ups by adding a voting feature?
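
The arithmetic in this conjecture is easy to reproduce. The sketch below only restates the comment's own assumption (a 20% step at every level); the function name and tier labels are made up for illustration.

```python
def participation_funnel(active_share=0.20, step_share=0.20):
    """Split 100 users into tiers, assuming each tier is a fixed share of the one above."""
    voters = 100 * active_share          # everyone who at least votes
    commenters = voters * step_share     # voters who also comment
    posters = commenters * step_share    # commenters who also post original content
    return {
        "read only": 100 - voters,
        "vote only": voters - commenters,
        "comment, no posts": commenters - posters,
        "post original content": posters,
    }


for tier, pct in participation_funnel().items():
    print(f"{tier}: {pct:.1f}%")
```

Setting both shares to 0.10 gives 90/9/0.9/0.1 instead, which roughly matches the classic 90/9/1 split with the small posting tier carved out of the last 1%.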


http://www.nngroup.com/articles/participation-inequality/ if you're looking for an article on it.


> Only about 20% of redditors vote, and only about 20% of voters comment [1].

That doesn't really ring true. If you make a post with a question in the title, you'll often get 5+ answers before a single person votes. Even downvotes.

Maybe it's more accurate to say "Of those people who don't hang out in the new submissions queue, only about 20% of the voters comment."

But, really, Reddit could do anything it wants internally, and we'd have no idea. Only they have access to the stats. They already compress the votes so that it only looks like there are ~10,000 votes maximum, when in fact hundreds of thousands of people have probably voted on certain posts.


> They already compress the votes so that it only looks like there are ~10,000 votes maximum, when in fact hundreds of thousands of people have probably voted on certain posts

Again I'm out of the loop by several years here. But to my knowledge, no that's not true. There are just fewer voters than you think there are.


Just to nitpick, accounts != people. Reddit in particular seems to attract throw away accounts and a large number of posts seem to start with "Throwaway". But otherwise, yes!


Also you don't need an e-mail address to register, and some people don't care about karma, so some people like me just create a new account whenever my saved login info is lost for any reason. Not necessarily using throwaway accounts in the traditional sense, but not overly concerned about maintaining a single account either. I've probably made 6-7 Reddit accounts over a few years [and 2 HN accounts over a longer time frame].


It's possible, but it seems unlikely.

https://www.reddit.com/comments/z1c9z Obama's AMA. According to the sidebar, (upvotes - downvotes) = 14,700, with upvotes/(upvotes+downvotes) = 94%. Reddit recently changed their algorithms so that the 94% figure is very accurate. That means 14,700/0.94 ~= 15,600 people voted, according to the sidebar.
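
Taking the two sidebar definitions at face value, the total can be solved for directly: with T total votes, upvotes are 0.94 T and downvotes are 0.06 T, so the net score is 0.88 T. A minimal sketch (the function name is made up for illustration) gives roughly 16,700 votes rather than 15,600; dividing by 0.94 would be right only if 14,700 were the upvote count. The order of magnitude is unchanged either way.

```python
def total_votes(net_score, upvote_ratio):
    """Solve up - down = net_score and up / (up + down) = upvote_ratio for up + down."""
    if not 0.5 < upvote_ratio <= 1.0:
        raise ValueError("a positive net score needs an upvote ratio above 50%")
    # up = r*T and down = (1 - r)*T, so net_score = (2r - 1)*T
    return net_score / (2 * upvote_ratio - 1)


print(round(total_votes(14_700, 0.94)))
```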

Here are Reddit's traffic stats for the Obama AMA: https://www.reddit.com/r/IAmA/comments/z3msa/traffic_stats_f...

The AMA brought in an extra two million uniques over two days. Also, it was one of the most legendary and historic events thus far on the internet, because it was the first time a sitting US president directly engaged with the public on a social media website.

Reddit has millions of subscribers. They can't really obfuscate the subscriber counts, because the subscriber count for each subreddit is visible every day. If it suddenly slowed down, people would notice. And the info needs to be available to moderators in order to manage their subreddit. Therefore, when /r/funny says it has 7 million subscribers, we can be reasonably certain there are at least 7 million Reddit accounts, many of which are active.

So, assuming there are only about a million active Redditors (there are probably more), and that a large number of those Redditors visit http://www.reddit.com/r/all on a regular basis, and considering that the Obama AMA was one of the most significant events in Reddit history (and indeed all internet history), and considering that over three million people viewed the AMA, it seems hard to believe that only 15,000 people voted on it.

It's possible. Statistics are one of the most counter-intuitive fields. Behaviors emerge at scale which aren't seen in the initial stages. Maybe it's possible people saw that 5,000 people had already upvoted the AMA, and so were less likely to upvote it themselves. But given the nature of politics and the significant historical status of the event, could it be true that the fate of the AMA was influenced by just a few thousand people?

Another observation: Reddit has had steady upward growth, but I remember that as of a few years ago the upvote counts were regularly reaching 3k. That number hasn't gone up too much in the meantime: https://www.reddit.com/r/all/

... but Reddit's traffic seems to have doubled since the start of 2013: http://www.google.com/trends/explore#q=reddit

If the number of voters could be calculated as a simple percentage of total active Reddit accounts, and Reddit's traffic has doubled, then why haven't the vote counts in /r/all also doubled?

That said, I'm not entirely convinced of my position. Reality is weird, and I'm often wrong. All I'm saying is that if it's true that only 15,000 people voted on Obama's AMA, and that a submission on /r/all regularly receives only ~5k votes out of a million active redditors who see it, then I'm surprised.

EDIT: The more I think about it, the more I think I'm mistaken. If it were true that Reddit compresses the vote counters for submissions, then they'd also have to do it for comments, because otherwise the top comment would regularly display a count that's way higher than the submission. They'd also have to compress the added comment karma, etc. This is where it moves from "plausible" to "the simplest explanation is that very few people vote."

Huh. Interesting.


> If the number of voters could be calculated as a simple percentage of total active Reddit accounts, and Reddit's traffic has doubled, then why haven't the vote counts in /r/all also doubled?

The people who are joining reddit now are more lurkers than participants, so as the site grows, the percent of people who participate (vote and comment) gets smaller.


I think we can mostly blame (or thank, depending on your view of the average contributor) mobile devices for that. They make it much easier to consume reddit, but much harder to contribute.

Personally, it makes me sad.


I feel like people vote only partly on whether they think something is good or bad; they also (or maybe mostly) vote based on whether they think something has a higher or lower score than they think it deserves. So as a post or comment's score rises, people will stop upvoting when they feel it's got enough.


I like the way you ask yourself critical questions and adjust!


1500 people hitting up arrow because the poster is an admin is believable to me given Reddit's size.


This seems likely to me as the same thing seems to occur when a moderator on a small subreddit makes a post (even without making it stickied). Users seem to take it upon themselves to upvote out of reverence.


The score does not correlate directly w/ number of upvotes


It's not a r = 1 correlation, but it's definitely strongly correlated.


It is close enough, especially for content that is not likely to be down voted.


I've always assumed there was no doubt Reddit moved their own posts up to the top in some unmeritocratic, undemocratic way... it serves the same function of sticky threads on phpbb (and other) forums. The massive amount of votes on these posts would happen after being moved to the top as a form of self fulfilling prophecy... people are interested in what Reddit (the org) is doing to improve Reddit (the site).


My guess here is that r/blog (and r/announcements?) is special in that every redditor is subscribed to it. I think I remember some small chaos around us all being auto-subscribed to a subreddit we (potentially) didn't care about. So it caused a small storm, but ~100% of redditors subscribe to it, which is untrue of probably every other subreddit.


Every new reddit user is subscribed to all of the default subreddits and has to manually unsubscribe to get any of them off their front page.


They're just like other default subreddits, which is to say that you can unsubscribe from them. I only resubscribed recently because I started working at reddit and figured I should know what's going on in my own company.


Some discussion at https://www.reddit.com/r/BestOfReports/comments/2g62rg/dont_... (which, btw, has a pretty funny screenshot of off-topic "reports" that a Reddit blog post got) says that it is possible to unsubscribe from /r/blog and /r/announcements, but it is not the default.


Reddit does intend to have the /r/blog posts on the front page, but they don't necessarily need to rig the votes.

As mentioned in the article, the algorithm for selecting which posts appear on the homepage seems to allocate slots for each subreddit. If one subreddit has more people voting than another subreddit, they can both appear alongside each other, rather than the larger one always filling the homepage.

Since posts to /r/blog are infrequent, each one will always get the subreddit's slot and will likely always appear on the front page or second page even with few votes. Then, because they are visible to many users, they can attract more votes.


>Much to my surprise, I found out that reddit's front pages are not a pure "meritocracy" based on votes, but that rankings depend heavily on subreddits.

Is this really a surprise? If it just went by upvotes alone, a sub with 1M subscribers would always dominate a sub with 500k. You need to factor in that context.


That's what I thought at first too, but the data makes it clear that it isn't just based on votes weighted by subscriber count either.


It's not votes weighted by subscriber count at all. It's "hotness" weighted by the hotness of the highest-hotness link per subreddit

Hotness: https://github.com/reddit/reddit/blob/master/r2/r2/lib/db/_s...

Front-page weighting: https://github.com/reddit/reddit/blob/master/r2/r2/lib/norma...
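
A rough sketch of what "hotness weighted by the hottest link per subreddit" could look like. The `hot` function follows a commonly cited form of reddit's open-source formula; whether this exact version was live at the time of the thread is an assumption. The merge step is a simplification written for illustration, not a copy of the linked normalization code, and the `(title, ups, downs, created_utc)` tuples are a made-up input format.

```python
from math import log10

EPOCH_OFFSET = 1134028003  # constant used in the published hot formula


def hot(ups, downs, created_utc):
    """Hot score: log-scaled net votes plus a bonus for recency."""
    s = ups - downs
    order = log10(max(abs(s), 1))
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = created_utc - EPOCH_OFFSET
    return round(sign * order + seconds / 45000, 7)


def normalized_front_page(links_by_subreddit, limit=25):
    """Rank links by hot score relative to the hottest link in their own subreddit."""
    merged = []
    for subreddit, links in links_by_subreddit.items():
        scored = sorted(((hot(u, d, t), title) for title, u, d, t in links), reverse=True)
        if not scored or scored[0][0] <= 0:
            continue  # sketch assumes recent posts, whose hot scores are positive
        top_hot = scored[0][0]
        for h, title in scored:
            # each subreddit's best link normalizes to exactly 1.0
            merged.append((h / top_hot, h, subreddit, title))
    merged.sort(reverse=True)  # ties on the ratio fall back to raw hot score
    return merged[:limit]


t = 1415232000  # an arbitrary timestamp in November 2014
page = normalized_front_page({
    "funny": [("f1", 5000, 500, t), ("f2", 4000, 400, t)],
    "pics": [("p1", 50, 5, t)],
})
print([title for _, _, _, title in page])
```

Here the second /r/funny link has a higher raw hot score than the top /r/pics link, but it ranks below it after normalization, because every subreddit's best link is scaled to 1.0.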


IANAQ* , but could not the effects shown in this article occur from content-neutral rules, combined with some clustering in the popularity of various subreddits?

For example, assume there is a rule that a given subreddit can have no more than N posts in the top 50 at a given time. It seems like this alone would explain the clustering shown in the article. Super-popular subreddits like /r/funny would rarely have posts on page 2, simply because they usually already have N posts on page 1. Thus they drop off sharply in likelihood to appear in the 40s, then shoot back up after #50 when the limiting stops.

Meanwhile clusters 2 and 3 appear to be the subreddits which rarely and often (respectively) reach the top 50, but only due to the limiting rule. Cluster 2 is the least popular in the unlimited spots past #50, so it makes sense that it usually reaches the lowest of the limited spots, while cluster 3 (apparently medium in overall popularity) takes the middle region.

Naturally I'm just squinting at it, but it looks like the article's findings could easily occur without Reddit treating some subreddits differently from others (as I take the author to imply it might, given the title). Am I missing something?

* I am not a quant :P


For those curious,

quant |kwänt| noun informal a quantitative analyst.

ORIGIN 1970s: abbreviation.


I'd like to see this kind of analysis done on /r/all, since it seems to more closely operate like the author anticipated. The default front page is meant to weight the subreddit like they discovered, but IIRC /r/all is strictly based on score and time, as if everything was submitted under the same subreddit.


Dumb question but what is the reddit front page? Isn't it customized for everyone depending on what you're subscribed to? Or are there a ton of users that never log in?

It seems like you wouldn't get much value out of reddit if you just view the front page without logging in?


Reddit has a default frontpage for people who don't have accounts. I lurked for years and was just fine without actually creating an account. :)


Isn't the default page mostly cat pictures and jokes? I tried viewing it logged out one time and I couldn't figure out who would like that?


That is what the OP is about: they have since altered their algorithms to avoid pure cat pictures and jokes by inflating the value of certain subreddits.


For many people, reddit is merely a source of internet junk food. They go there for cheap, quick hits of mindless amusement.

See also: buzzfeed, 9gag, imgur, funnyjunk, etc.


I wish so hard I could filter out the junk posts. Just give me something to read, a good article, a debate.


Log in, unsubscribe from all the default front-page subreddits, and put together your own mix based on stuff you're interested in. The Reddit I read by doing this is a pretty genteel and sensible place.


r/all right now has 4/25 threads from the funny subreddit

no cat pictures, 2 gifs of dogs being cute

I check r/all because it's the breadth focused zeitgeist of the internet. Whether the comments are right or wrong doesn't matter. Once you can fix a point on popular sentiment you can interpret how it will ripple through more "prestigious" forums.


They changed the default subreddits a while back, and also have done work to make sure no particular theme (such as funny pictures) outweighs the others.


reddit has been around since at least '07. I remember visiting the default front page back then and it was mostly developer-specific stuff.

A lot has changed.


The default front page is also the default set of reddits you're assigned when you create a new account. There is tremendous inertia there so the "front page" is still a useful concept.

Also, there's /r/all, which tends to closely mirror the front-page since non-default subreddits rarely climb in /r/all


The author defines the term "default front page" with an annotation:

"In other words, assuming you’re not logged in. I don’t have any supporting info, but I’d imagine that a large chunk of reddit’s traffic comes from logged out users."


Todd was using the default front page, which consists of these subreddits: http://www.reddit.com/r/defaults/comments/24zz8z/list_of_def...


yeah, not a dumb question - there are a ton of not logged in 'casuals' who just come by to consume consume consume whatever frontpage throws up


related recent reddit blog post about the 'defaults', changing them etc...which was not without controversy of course but ...

http://www.redditblog.com/2014/05/whats-that-lassie-old-defa...


This is an interesting study into how Reddit works, but I have to say that I'm fine with the fact that Reddit is neither a meritocracy nor a democracy when it comes to how posts make it to the front page. I'd rather not see any more funny or aww posts on the front page than are already there (I can go to their subreddits directly when I want to).

In fact, I'd rather see a variety of posts from subreddits I don't usually go to or usually follow. I want to see things I don't already follow. Most posts I end up liking are ones I find in the subreddits I visit. I'd like my front page to give me posts from other unsubscribed subreddits so that I may end up expanding which subreddits I'm subscribed to.


> In fact, I'd rather see a variety of posts from subreddits I don't usually go to or usually follow.

There is a little line of text on the top of the reddit front page called "trending subreddits"...I've found a bunch of cool stuff that way http://imgur.com/okDu5AN

But of course, the best way to find new subreddits is to read the first 5 or so comments on a weird gif or picture. Someone will link to either the subreddit that it came from, or a subreddit where it could have been posted.


I'm pretty sure that those are human-selected rather than algorithmically selected.


Relatedly, the Hacker News front page is not a meritocracy either. That's why counterbalance tools like flagging exist. Additionally, dang is implementing algorithms to identify articles that slip through: https://news.ycombinator.com/item?id=8157698

EDIT: I misread the article's argument; there's a lot of luck on HN, but I cannot confirm there's a systemic bias toward people/topic.


One thing that immediately jumped to my head when I saw the falloff towards 50 and then the jump right after 50 was simply visual prominence. I would guess that people naturally pay more attention to posts that appear at the top of the page, whether it's page 1, 2, or 3. There's also a blip of attention at the very bottom, since that post is also visually and conceptually distinct - everyone looks directly at it because it's the last one, so it gets more eyeballs and potentially quite a few people giving it a charity vote to "save" it from dropping to the next page. Meanwhile people scan more quickly over posts in the middle, perhaps (as I do) merely skimming over a couple words and the score, to see whether it merits moving my eyes over the whole line (because my time is that valuable).


Many years ago I had an article on my personal blog about web design climb its way to the front page of Digg (I'm dating myself here) and Reddit. Despite my server buckling rather quickly, I still saw at least ~250K visits over a couple days (Digg then was slightly more popular) and to this day (5+ years later) that one first page post continues to deliver tens of thousands of visits a year thanks to a rather healthy Google rank and a huge network of incoming links from related tech blogs.

Reddit exposes you to a huge audience, who then in turn comment, link to, and debate your post. Reddit's memory is short however, and within hours your post will be gone. However the reverberating effects benefit you in many ways and almost guarantee traffic for years if the topic is "evergreen".


For both Reddit and HN, articles/posts live or die based on popularity, not meritocracy nor karma.


High karma is a cause of popularity, however. (On HN, an article at #1 will receive much more activity than an article at #30.)


Very cool. Although, maybe another possible explanation is that people get "tired" of reading a whole page and skip to the next page, resulting in a top-of-the-page bias.

Also, love the use of R. R is beautiful.


R is awful. It's full of decisions made by people who seem to not have much programming experience, in that they seem good at the time but cause major issues later on. See, for instance, http://www.talyarkoni.org/blog/2012/06/08/r-the-master-troll... , http://blog.revolutionanalytics.com/2008/12/use-equals-or-ar... , http://shape-of-code.coding-guidelines.com/2012/02/29/parsin... , and the necessity of http://tim-smith.us/arrgh/ (I wrote up some stuff about the *apply functions, but not yet in a form suitable for the guide: https://github.com/tdsmith/aRrgh/issues/18 ) .


Popularity isn't exactly a meritocracy either.

That is, while popularity is correlated with quality, the two are rarely considered identical, and one is often a poor heuristic for the other.


Beautiful post, very interesting data.

The balance trying to be achieved can most simply be described as known good content vs. discovery. I wouldn't call it uneven; it's more like "this is interesting" vs. "we think you might find this interesting, but we're taking a gamble because it has low visibility". I'm betting subreddits can move from cluster to cluster over time as well, fairly frequently. Maybe an interesting thing to try to track over the next month or two?


Just a thought: It looks like things toward the bottom of a page drop off in popularity. Perhaps users are more engaged when looking at the top of the page, clicking all the links, and perhaps less engaged towards the bottom, skipping remaining links and just going to the next page. I imagine this would only be seen with users who use the actual page rather than RES.


Reddit is internet power. I can't recall a day in the last 5 years when I haven't checked reddit. When people ask me what websites I read, I can't recall any other than reddit. So funny. And I grew up with Slashdot. People don't even know what Slashdot is anymore. Reddit is the Internet Explorer button for me.


A while back I would have agreed with you (also having spent a lot of time on Slashdot), but the quality of posts these days on the front page keeps me from returning on a regular basis. Yes, I know that with Reddit you need to unsubscribe from the defaults and things like that, and there are quality subreddits out there. But from my perspective, the front page used to contain enough good content that one could casually scan through and find something interesting without the hassle of signing up, maintaining subscriptions, searching for relevant content and things like that.

At the end of the day, I'm just there to browse and not jump through a bunch of hoops just to filter out junk that only appeals to the under 18 crowd.


There's no such thing as a meritocracy, unless you include ability to game the system in your measure of merit. Either that, or every system is a meritocracy, and you're just not trying hard enough to win.


They talked about this eons ago. Maybe 3, 4 years? It's an attempt to highlight smaller subreddits and hopefully boost subreddit discovery.


Nice job. I guess the practical takeaway is, if you don't want to look at funny pictures and animal GIFs, start on page 2.


Or just unsubscribe from /r/funny and /r/gifs. The default front page is total nonsense, you have to find reddit for yourself.


Or you can just filter them to create your own front page. Reading reddit becomes a much different experience once you start to customize it.


Reddit doesn't have just one frontpage anymore, the defaults vary depending on country.


Sounds like there are 50 subs that get spots 1-50, sorted mainly by vote count.


Not quite, as you can see multiple posts from the same subs in the top 50 and it changes. They aren't fixed "slots" but just appear to be weighted like that.
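
To make the "weighted, not fixed slots" idea concrete, here is a minimal sketch of per-subreddit normalization. It assumes each post's score is divided by the top score in its own subreddit before everything is merged into one listing; the function name `merge_normalized` and the sample numbers are hypothetical, and this is not reddit's actual code.

```python
from typing import Dict, List, Tuple

def merge_normalized(
    hot_by_sub: Dict[str, List[Tuple[str, float]]]
) -> List[Tuple[str, str, float]]:
    """Merge posts from several subreddits after scaling each post's score
    by the top score in its own subreddit (assumes positive scores)."""
    merged = []
    for sub, posts in hot_by_sub.items():
        if not posts:
            continue
        top = max(score for _, score in posts)
        for title, score in posts:
            # Every subreddit's best post ends up with a normalized score of 1.0.
            merged.append((sub, title, score / top))
    # Highest normalized score first; ties keep their insertion order.
    return sorted(merged, key=lambda row: row[2], reverse=True)

# Hypothetical data: a huge subreddit with two strong posts, a tiny one with one.
listing = merge_normalized({
    "funny": [("alpha", 9000.0), ("beta", 8000.0)],
    "tiny": [("gem", 40.0)],
})
for sub, title, score in listing:
    print(sub, title, round(score, 3))
```

Under this assumption the small subreddit's best post ranks ahead of the big subreddit's second-best post even though its raw score is far lower, while a big subreddit can still place several posts when they are all close to its own top score.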


"I found out that reddit's front pages are not a pure meritocracy based on votes, but that rankings depend heavily on subreddits"

(meritocracy was between commas). Clearly votes are not what people think when you speak about merits, but let it be.


Is anything actually a meritocracy?


On the internet? No.


Good time for me to do my annual bitching about how much I hate HN comments. So who made the most agreeable comment on this thread? I don't know...


And again, this is why you should use http://www.sagebump.com/?info&view=technocrat to manage your social aggregator news.


Lovely piece of work.


"meritocracy" is a pompous and politically charged word to use in this context, but ok.


On HN, how is it that some posts with 7 points make it to the front page but others do not?


Posts with 7 points that get those points quickly (within a half hour of being posted) will hit the front page due to the time portion of the algorithm.

Whether it stays there is another story.
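
As a rough illustration of the time portion, here is a minimal sketch that assumes the commonly cited approximation score = (points - 1) / (age_hours + 2)^gravity with gravity = 1.8. The function name `rank_score` and the example numbers are hypothetical, and the real ranking includes other adjustments that are not modeled here.

```python
def rank_score(points: int, age_hours: float, gravity: float = 1.8) -> float:
    """Commonly cited approximation of HN ranking, not the actual implementation.

    The submitter's own vote is subtracted, and the score decays as the
    post ages because the denominator grows with age_hours.
    """
    return (points - 1) / (age_hours + 2) ** gravity

# A 7-point post that is half an hour old vs. a 50-point post that is 12 hours old.
fresh = rank_score(points=7, age_hours=0.5)
stale = rank_score(points=50, age_hours=12)
print(fresh > stale)
```

With these assumed numbers the young 7-point post outranks the older 50-point post, and its score falls quickly unless more votes keep arriving.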


I feel the same about reddit. I only go to r/linux and r/android; I don't go to r/news or r/technology anymore, let alone the front page. It's all liberal bias, and I like impartial news.


When you log in you get to pick the subreddits that show up on the front page. Getting rid of /r/politics was why I created a login, and now I have a front page with few "easy laughs" due to self-selection (no /r/funny etc.)


No shit Sherlock. It's no secret that the front page is heavily weighted. It's also subject to personalization, so basically no two redditors have the same front page.

The default front page is just the landing page for newcomers to get a first impression and a starting point for personalization.

An analysis of and comparison with /r/all would have been way more interesting.


I left reddit years ago (and found HN) because, although I agree with mostly liberal views, it became too much of an echo chamber for liberal views; the front page is the extreme example of that.

There are many great subreddits of course, but it's not a place for political discussions (in fact I am still looking for a good place to have political discussions).

Edit: Why does my personal experience get downvoted?


I don't really buy that... the only liberal views I can think of that are popular in the aggregate on reddit are about the war on drugs and sometimes socialized medicine. In general, reddit is pretty hostile towards women, skeptical of feminism, overtly racist, and pro-military.

I guess I would characterize it more as libertarianism strongly centered on the perspectives of middle class, white men.

Edit: I guess "I don't really buy that" was a poor choice of words. I don't think you're lying to me. But your experience was very different from my experience. I stopped using reddit because I felt like all the big subreddits were too socially conservative!


I'm not sure how being skeptical of feminism is an inherently non-liberal view. I'd say it depends on whether the feminism in question is oriented around civil rights, or around some form of critical gender theory. "Feminism" alone is not descriptive enough to make any judgment over.

Pro-military? How? I haven't visited it in a while, but a common sentiment on AskReddit was a total disillusionment with the chauvinistic "support the troops" mentality and a belief that military service does not make one righteous in and of itself.


Yes, the critical theory is what I'm talking about, and I think that knowledge is definitely associated with the political "left". The issue I had is that the skepticism is almost always ignorant and dismissive. It's just impossibly frustrating to try to talk to someone who refuses to do any background reading but wants to tell you you're wrong about concepts that live in an academic and historical context.

As for the military, I guess that one is more debatable. I agree that the Bush-era "support the troops" jingoism isn't around much, but I used to see a lot of posts where it's implicit. Things like hugely popular photos of special ops soldiers followed by adoring comments about how "badass" they are.


>I haven't visited it in a while, but a common sentiment on AskReddit was a total disillusionment with the chauvinistic "support the troops" mentality and a belief that military service does not make one righteous in of itself.

Enough kids who grew up playing Call of Duty have shifted reddit into being more pro-military and pro-weapons. There is also a steady stream of "coming home" photos and videos with kids and pets.


Not sure why being pro-weapons is bad, but alright.


You don't buy what? My experience?


I disagree. Reddit sides with the left on pretty much any news article or situation. It's even difficult to have an open and honest discussion here on HN, because anything that goes against the mainstream view (which tends to be liberal) is downvoted and silenced.

It's getting to the point where the only way to defend against such tactics is to play just as dirty.


> I am still looking for a good place to have political discussions

I sometimes wonder if such a place can exist. Outside of friendships, I kind of don't think it can.


Theoretically, I think it can exist. However, it would take a fair bit of luck for the core group that grows up around it to be diverse in their opinions but also open-minded.

Unfortunately most people these days are more interested in their initial opinion being right than in having real discourse that everyone can learn and grow from.


Political conversations that don't devolve into idiocy and hatred are the exception, and not the rule, in any forum or group I've ever encountered.

I think if your theory involves mass-exclusion/banning you could possibly end up with a place like this. But the conversation would actually lose an important voice at that point. The angry people are (often) angry for a reason, they're just usually pretty bad at making their point civilly.


Some of the most civil (and then again, some of the least civil) political discussions I've ever seen on the internet happen on places like Facebook, where people are friends, and where almost everyone uses his/her real name. I think those two factors are very important: 1) real identities, 2) some form of reciprocal relationship among all parties to the conversation.

If you look at a political debate from a game-theoretical standpoint, a debate among friends is a multi-stage game. You don't want to go nuclear right off the bat, because when the dust clears, you'll still want to be friends with your opponent. There's a continuity to the relationship. There will be a second, third, fourth,...,500th "round" to the "game."

I'm not sure if this situation is any more likely to yield productive discussions. But on average, it yields more civil, less abrasive discussions. On the downside, it can often result in echo-chamber conversations that never really get interesting.

Conversely, the worst and least civil debates I've seen have occurred on long-standing message boards or communities, wherein the members are anonymous, but they're known for their handles. These members have "known" each other for years in some cases, but they don't really know each other, and they have few qualms about unleashing the flames. Especially when they feel their reputations or credibility within the community are at stake. These users have created identities for themselves, and paradoxically, they'll often defend those proxy identities more ferociously than they'll defend their real identities.

Tl;dr: when people assume group or tribal identities, you get worse flame wars and less substance. When people assume individual identities, you get fewer flame wars, and possibly more substance.


I think, at least in the US, our extreme politicization makes that difficult: http://www.vox.com/2014/11/1/7136343/gamergate-and-the-polit...

My favorite part of this article is the graph showing how "Do you think Twelve Years a Slave should win an Oscar" breaks down by party lines.


I really want a place for a _structured_ political or philosophical discussion to take place. The same way that Stack Overflow placed a lot of structure and moderation around a Q & A panel.


That's a good question. I do have some good debates on Facebook but wish I could keep them isolated from all those friends of mine who aren't interested in politics.


You might want to look at http://www.reddit.com/r/changemyview


They eventually bit the bullet and removed politics and a few similar subreddits from the default page.


Interesting. Thanks for that info. I will have a look again.


It's odd that Ron Paul was popular on Reddit for a while though. How do you explain that?


Libertarians align with the left on personal freedom (Weed, abortion, relatively blind to race or sex, less foreign intervention...) and they align with the right about economic freedom.

I used to hang around Reddit back then, but I don't clearly remember. My guess is that reddit found all the personal freedom stuff appealing, while mostly ignoring the parts about closing the Fed, eliminating minimum wages, reducing intervention... But remember that during that time (2009-2011) most of the econ talk was surrounding the bank bailouts. Ron Paul's speech was actually aligned with that of Occupy WS, since most libertarians opposed the bailouts and oppose big corporations, given that they usually lead to crony capitalism. I also have the feeling that there was less nonsense on reddit back then, but this might just be my perception.


Reddit liberals and Reddit libertarians are in massive agreement on not starting wars or locking people in a cage because they have a plant; views on economics take a backseat.


Reddit was a pretty libertarian place in the very beginning.


Ron Paul was "popular" on almost every website on the internet because his fanatical supporters flooded the web with an endless barrage of Ron Paul support. Reddit, newspaper comments, other discussion boards, polls, you name it, they were there to tell the world about Ron Paul. They were a very loud minority.


That sounds like a borderline conspiracy theory. If Ron Paul's popularity on Reddit around 2008 was a facade, it was very convincing.


Because there were no liberals running for the Republican nomination, so many/most liberals would rather support a libertarian than a conservative.


I don't. I am not saying everyone there is a liberal, but most are, at least in my experience.


Reddit loves 4chan memes.



