PG: What's your current take on reddit's comment system? (reddit.com)
21 points by brett 3545 days ago | 64 comments

On the topic of community makeup, here are some interesting findings that come from Yahoo's social scientists (using Yahoo lists as the dataset):

1) Small groups have higher numbers of core users

2) Private and semi-public groups have higher numbers of core users than public groups

3) People in private groups remain in the core for longer

4) A user who is in the core of one group is more likely to become a member of the core when joining another group

5) New members who eventually become part of the core are more likely to receive messages from other users who have been long-time core users

6) New members who eventually become light users are more likely to receive messages from other light users, if they receive replies at all.

This suggests some strategies for building a strong group of core users to transmit social norms. For example, having an idea of who the core users are in other similar communities and making a special effort to reach out to them if they visit your site.

Unhappily I lost the original paper and all I have are my notes, so I don't know exactly how they defined the core users offhand.

Alex, this is very interesting. Do you think there might be a way to get one's hands on the original? If there is a person/persons at Yahoo to contact I would do it, as this is something I have thought about quite a bit and would love to see more about the definitions of core, light, new, long-time, large and small.

You can email me if you like.

So I heard this talk at the Cornell Microsoft International Symposium on Self-Organizing Online Communities. The presentation was called "Implications for Future Research; Theory and Methods in Large Scale Analysis of Online Communities." And the three Yahoo presenters were Ravi Kumar, Andrew Tomkins, and Cameron Marlow. Not sure which one of those three specifically, but I'm sure you could ask any of them.

I've developed this hobby of sneaking into academic conferences. It works best if, when everyone else is wearing a suit and tie, you show up in shorts and a t-shirt. That way everyone just assumes that you're friends with one of the speakers, or else that you're Sergey Brin. :-)

At least one related paper is available online:


I probably won't have time to read it until this evening, but will post then.

Yeah, this is mostly about speed of growth in social networks and only looks at Yahoo 360 and Flickr. I will keep looking for the original paper and post back if I find it.

I suspect these findings will apply to a certain type of "community." It does not, for example, apply easily to the wikipedia community.

Just as interesting as these findings are the premises behind them: what service, what content, what medium, what purpose, and, extremely importantly, how it was presented.

I have the same problem with reddit and digg: I like to read exceptionally downmodded comments. When comments get modded down so much that they are not displayed, and they also generate a lot of responses, it's the kind of thing I can't help but look at. Like a train wreck I suppose.

It actually sucks me into the troll MORE when they get such special highlighting. I should say I do like how top level troll posts get put at the bottom for reddit.

It bugs me that downmodded comments trash karma - political disagreements often get downmodded not because they aren't valid, but because someone disagrees fundamentally. That doesn't seem right.

Reddit comments DO NOT AFFECT KARMA.

Only submitted articles do.

I have no idea how you got any different idea into your head, but it was not through observation of reality.

You can set the threshold or turn off the hiding of downmodded comments on reddit.

Incidentally I came across this while trying to find information about how reddit stores threaded comments. I've read they use postgres and while I've also read about many workable techniques to force hierarchical (and frequently reordered) comments into a relational db these solutions all sort of make me cringe. I'd welcome anyone's ideas here as well.

The approach I took on Diffle (http://www.diffle.com/ - unfortunately it hasn't been exercised as we haven't really got traction yet) was to store the comment ID as a varbinary(255). A child comment takes the ID of the parent and then appends a 2-byte sequential number. So, the first comment is 0x0001, the second is 0x0002, a reply to the second is 0x00020001, a second reply is 0x00020002, the third is 0x0003, a reply to the third is 0x00030001, etc. To find the children of a comment, you look for all comments where the parent's ID is a prefix. MySQL can use leftmost-prefixes as indexes, so this computes very quickly. Also, with the standard varbinary collating order, 0x01 sorts before 0x0101, 0x0102, 0x02, etc, so all I have to do is order by the ID and everything will come out in proper threaded order. And nesting depth is calculated easily by len(id) / 2.

Reordering can be done just by looking at all records whose IDs contain the parent as a prefix and have a length 2 greater than the parent, and then renumbering them. This should be computationally feasible in most cases.

This does limit users to 65,536 replies to a single comment, and a maximum comment nesting depth of 127. Based on my experience with some very active LiveJournal threads, I considered these to be acceptable limitations (LJ limits comments to 5000/post anyway). A nesting depth of 127 would be nearly 2400 pixels over, so it's not like it'd all fit on one screen anyway.
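For anyone who wants to play with the ID scheme described above, here is a minimal sketch in Python rather than SQL. The helper names (child_id, depth, is_descendant) are made up for illustration; only the 2-bytes-per-level layout and the sort-by-ID trick come from the description.

```python
import struct

def child_id(parent_id: bytes, seq: int) -> bytes:
    """Append a 2-byte big-endian sequence number to the parent's ID."""
    if not 0 <= seq <= 0xFFFF:
        raise ValueError("sequence number must fit in 2 bytes")
    return parent_id + struct.pack(">H", seq)

def depth(comment_id: bytes) -> int:
    """Nesting depth is the ID length divided by 2 (top level = 1)."""
    return len(comment_id) // 2

def is_descendant(comment_id: bytes, ancestor_id: bytes) -> bool:
    """A comment sits below an ancestor when the ancestor's ID is a proper prefix."""
    return len(comment_id) > len(ancestor_id) and comment_id.startswith(ancestor_id)

ROOT = b""                      # the news item itself, which has no comment ID
first = child_id(ROOT, 1)       # 0x0001
second = child_id(ROOT, 2)      # 0x0002
reply = child_id(second, 1)     # 0x00020001
reply2 = child_id(second, 2)    # 0x00020002
third = child_id(ROOT, 3)       # 0x0003

# Python compares bytes lexicographically, like a binary collation,
# so a plain sort yields threaded display order.
thread = sorted([third, reply2, first, reply, second])
```

Sorting puts each reply directly under its parent, which is the same effect as ORDER BY on the ID column.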

Why not just store the parentid (can be another comment or the news topic), username (instead of userid so you don't have to access the users table), and votecount for each comment?

Also, each comment should have a newsid field that points to the original newstopic. The left-most posts would have the parentid and newsid that were the same (or parentid can be zero, depending on how you do it). But any replies would have the parentid to the parent comment, but the newsid would still point to the newstopic. Then you just run an SQL query to find all comments with the newsid for that newstopic. You won't have to join any tables. After that, with everything in memory, you can just sort the way it has to be. Finally, you can cache the page in memory or a file in case the page doesn't change for the next set of visitors.

How do you get the full thread for a given parent? Selecting based on the parentid alone gives you only immediate descendants. Recursive queries are generally a no-no, as they drag your performance down fairly quickly. Selecting based on the newsid gives you every single comment on the item, and then you'll have to do a lot of work (basically building up the full tree structure in memory) to find out which are children. For a site with traffic like news.yc, it's probably acceptable, but it's still more complicated to code than a simple select.

You would find the full thread by searching for comments with the appropriate newsid.

To simplify the recursive function (using data in memory, not in the database), you could also have another field, called indent, that stores the level in the comments.

You could find all the comments with one SQL statement (where newsid = news.id), then go through each level of each comment, and sort it that way.
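A rough sketch of that in-memory pass, assuming the rows come back from the single newsid query as dicts. The field names follow the ones suggested above (parentid, votecount), with parentid 0 marking a top-level comment; the sample rows and the threaded() helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows, as if returned by: SELECT ... WHERE newsid = 1
rows = [
    {"id": 10, "parentid": 0, "votecount": 3},
    {"id": 11, "parentid": 10, "votecount": 1},
    {"id": 12, "parentid": 0, "votecount": 7},
    {"id": 13, "parentid": 10, "votecount": 5},
]

def threaded(rows):
    """Return (indent, row) pairs in display order, best-voted siblings first."""
    children = defaultdict(list)
    for row in rows:
        children[row["parentid"]].append(row)

    ordered = []

    def walk(parent, indent):
        # Visit siblings by descending votes, then recurse into each one's replies.
        for row in sorted(children[parent], key=lambda r: -r["votecount"]):
            ordered.append((indent, row))
            walk(row["id"], indent + 1)

    walk(0, 0)
    return ordered

display = [(indent, row["id"]) for indent, row in threaded(rows)]
```

The indent level falls out of the recursion here, so it does not need its own column unless you want to skip the tree walk entirely.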

You could even make the parentid field a double, so you would store 123.4 where 123 would be the parent and 4 would be the indentation level.

I guess the difference is, do you want to store everything in one field or have different fields for everything?

I think this gives the original poster many options to think about. To me, a recursive function would be easier to write than coming up with ways to combine multiple fields into one using MySQL.

I think the order of optimization is: 1) Optimize so you can use a cached result if it exists 2) If the cache is out of date, try to make just one SQL call 3) Don't use table joins 4) Reorder the data as needed in memory, then save it to a cache
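That ordering could look something like the sketch below. Everything here is an assumption for illustration: the cache is a plain dict standing in for memcached or a file, fetch_rows stands in for the single join-free SQL call, and the "page" is just joined text.

```python
def render_thread(newsid, fetch_rows, cache):
    """Serve from cache when possible; otherwise query once, sort in memory, cache."""
    if newsid in cache:                                   # 1) cached result
        return cache[newsid]
    rows = fetch_rows(newsid)                             # 2) and 3) one call, no joins
    rows = sorted(rows, key=lambda r: -r["votecount"])    # 4) reorder in memory
    page = "\n".join(f'{r["username"]}: {r["text"]}' for r in rows)
    cache[newsid] = page                                  # ...then save to the cache
    return page

def invalidate(newsid, cache):
    """Call after a new comment or vote so the next view rebuilds the page."""
    cache.pop(newsid, None)

# Usage with a fake data source that records how often it is hit.
calls = []

def fake_fetch(newsid):
    calls.append(newsid)
    return [
        {"username": "alice", "text": "use parentid", "votecount": 2},
        {"username": "bob", "text": "use prefixes", "votecount": 5},
    ]

cache = {}
first_view = render_thread(42, fake_fetch, cache)
second_view = render_thread(42, fake_fetch, cache)   # served from the cache
invalidate(42, cache)
third_view = render_thread(42, fake_fetch, cache)    # rebuilt after invalidation
```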

As far as the single SQL call goes, I don't think it matters whether you have 5 fields or 30 in the actual table as long as you only request the fields you need, but I could be wrong.

By the way, do you have any other interesting examples?

I'm not sure why nobody has said this already... but why would you need an extra field for keeping the indentation number when you can use <pre> in HTML for that?

Interesting examples of what?

I did a quick and dirty forum a couple years back that now serves up hundreds of new threads a day (having tens, sometimes hundreds of messages each).

My solution was to store a comment thread in display order in the database, with a hint as to how deep in the hierarchy each comment is. That optimizes for output, although with some cost every time something gets posted. Luckily, for most forums, views outnumber posts by hundreds of times.

The one issue that system has shows up if you're doing a lot of reordering, let's say from quality scores. In that case you'll basically have to remake the thread after every post and after every vote that makes a difference. Not recommended...
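To make the display-order idea concrete, here is a small sketch that keeps a thread as a list of (depth, text) pairs already in output order. The list stands in for rows with a position column, and insert_reply is a made-up helper; a real table would also have to shift the positions of the rows that follow.

```python
def insert_reply(thread, parent_index, text):
    """Insert a reply after the parent's existing subtree; return its position."""
    parent_depth = thread[parent_index][0]
    pos = parent_index + 1
    # Skip over everything nested under the parent (anything deeper than it).
    while pos < len(thread) and thread[pos][0] > parent_depth:
        pos += 1
    thread.insert(pos, (parent_depth + 1, text))
    return pos

thread = [(0, "a"), (1, "a1"), (2, "a1x"), (0, "b")]
where = insert_reply(thread, 0, "a2")    # new reply to "a", after a1 and a1x
insert_reply(thread, 4, "b1")            # "b" has moved to index 4 by now

# Rendering is a single pass with no tree building at all.
rendered = "\n".join("  " * depth + text for depth, text in thread)
```

Posting costs a scan and an insert, while viewing is a straight read, which matches the views-outnumber-posts trade-off.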

i am in the process of making a news.yc clone... threaded comments are not as easy as it looks on news.yc.

[edit] especially when the comments get reordered on votes.

If you're trying to mimic this forum with a database solution, it's going to be a pain. PG's keeping this stuff in memory, almost certainly as the tree that it is.

edit on the edit: scratch that.

I mentioned a possible solution here: http://news.ycombinator.com/item?id=33902, but I'm really commenting because there's a really cool connection between PG's approach of storing it as an in-memory tree and my approach of a 2048-bit index:

Y'know how in a typical CS datastructures course you'll have to build a heap, and then reimplement it with an array? A heap is conceptually a balanced binary tree where every level except possibly the last is filled, while there's nothing tree-like about an array. However, your CS profs were trying to get across that any binary tree can be stored as an array, with the root in position 0, its children in positions 1 and 2, their children in positions 3, 4, 5, 6, etc. If a node doesn't have a child, that position in the array is empty.

Normally this is very inefficient - a totally-unbalanced (linked list) binary tree of length 20 would require a 1MB array to hold it. However, heaps are always perfectly balanced, so an array really is the most efficient implementation for them.
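The index arithmetic behind that is short enough to show. This is a sketch of the standard array layout for binary trees; slots_for_right_spine is a made-up helper that counts how long the array must be for a worst-case chain where every node has only a right child.

```python
def children(i):
    """Array positions of the left and right child of the node at position i."""
    return 2 * i + 1, 2 * i + 2

def parent(i):
    """Array position of the parent of the node at position i (i > 0)."""
    return (i - 1) // 2

def slots_for_right_spine(n_nodes):
    """Array length needed for a chain of n nodes that always branches right."""
    index = 0
    for _ in range(n_nodes - 1):
        index = 2 * index + 2      # step to the right child each time
    return index + 1

wasted = slots_for_right_spine(20) - 20   # empty slots for a 20-node chain
```

A 20-node chain needs 2**20 - 1 slots, roughly a million, which is where the 1MB figure comes from if each slot is a byte.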

By restricting users to a certain number of replies to each comment (65,536), I'm effectively constraining an arbitrary tree structure to a 65,536-ary tree. Which can then be represented by an array. If held in memory, it'd be a very sparse and inefficient array.

But databases are excellent at storing sparse arrays. If I don't store a potential key, it doesn't take up any space in the DB. And with a good indexing system, I can reference any element in constant time (well, O(logN), but the log base for B-trees is so huge that it might as well be constants).
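Here is a runnable sketch of the sparse-key idea, using SQLite from the Python standard library in place of MySQL (a swap made purely so the example is self-contained). The table layout and the descendants() helper are assumptions. Only the keys that exist are stored, and a range scan over the primary-key index pulls out a whole subtree; SQLite compares BLOBs bytewise, so the ordering matches the binary collation described earlier in the thread.

```python
import sqlite3

MAX_ID_BYTES = 255  # mirrors the varbinary(255) column mentioned earlier

def descendants(conn, parent_id: bytes):
    """All comments whose ID starts with parent_id, in threaded order."""
    # Every ID extending parent_id sorts after parent_id itself and no later
    # than parent_id padded with 0xFF bytes up to the maximum length.
    upper = parent_id + b"\xff" * (MAX_ID_BYTES - len(parent_id))
    return conn.execute(
        "SELECT id, body FROM comments WHERE id > ? AND id <= ? ORDER BY id",
        (parent_id, upper),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id BLOB PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?)",
    [
        (bytes.fromhex("0001"), "first"),
        (bytes.fromhex("0002"), "second"),
        (bytes.fromhex("00020001"), "reply to second"),
        (bytes.fromhex("00020002"), "another reply to second"),
        (bytes.fromhex("0003"), "third"),
    ],
)

subtree = [body for _, body in descendants(conn, bytes.fromhex("0002"))]
everything = [body for _, body in conn.execute("SELECT id, body FROM comments ORDER BY id")]
```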

So, I'm really just applying some of the meta-principles of CS coursework to basic data structures. (Actually, I got the idea off a Slashdot posting and had to figure very little of this out on my own, but it's still really cool.)

Incidentally, there's apparently a deep connection between number systems and tree structures. You can read more about it in Chris Okasaki's Purely Functional Data Structures. Apparently, every number system corresponds to a data structure. For example, Peano arithmetic = linked lists, binary numbers = binary trees, fibonacci numbers = binomial heaps, etc. There's a short presentation on it here: http://www.informatik.uni-bonn.de/~ralf/talks/BCTCS.pdf, and I'd strongly recommend the book.

You probably don't even need to be using the word "bit" unless you're writing something a lot more performance intensive than a forum.

Optimized ob. quote:

"Premature → evil."

Just thinking off the cuff here, but couldn't one store a lisp-style list as a string in a relational database. If I was using Ruby on Rails, I could simply build a simple lisp interpreter in Ruby to parse the string, then capture it as an in-memory tree. This way I would only have active items in memory and could use some of the advantages of the standard web development platforms for whatever else I was doing.

Edit: See Ruby/Lisp for one such already built interpreter.
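A toy version of that idea, written in Python instead of Ruby to keep one language across the examples in this thread. It is a sketch only: parse_sexpr is a made-up helper, it handles just parentheses and bare tokens, and the convention that the first element of each list is a comment ID followed by its replies is an assumption.

```python
def parse_sexpr(text):
    """Parse a string like '(1 (2 (4) 5) 3)' into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1               # skip the closing ")"
            return items
        if token == ")":
            raise ValueError("unexpected )")
        return token

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the first expression")
    return tree

# The string is what would sit in a single database column.
stored = "(1 (2 (4) 5) 3)"
tree = parse_sexpr(stored)
```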

Sure you could. You could also select all comments on the item and rebuild the tree in-memory, as Vlad suggests elsewhere on this thread. The code isn't even as hard as I'd thought it'd be:

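  # Fetch every comment on the item in one query, then link children to parents.
  comments = dbh.query('SELECT * FROM comments WHERE news_id = $newsid')
  response_tree = {}
  for comment in comments:
      comment.children = []
      response_tree[comment.id] = comment
  for comment in comments:
      # parent_id is an assumed column name; top-level comments have no parent here
      if comment.parent_id in response_tree:
          response_tree[comment.parent_id].children.append(comment)
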
The nifty thing about storing it in the database is that you need no application-level post-processing, you can take advantage of the DB for ad-hoc queries (what if you want to show subtrees of all posts by a certain user, like the "threads" page here?), and it happens to map to a set of simple and fast operations provided by most databases.

After some reflection, I suppose my only hesitation with this method is that it locks one into dealing with the comments in a particular way, whereas the lisp model I outlined is easily adaptable to a variety of schemas.

However, I'm not sure how else one would want to model things. Must think more...

This closely resembles the way I did it... with recursive includes (views). Performs flawlessly.

Why are you using a database, in this case? Aren't you better off serializing your tree using the native methods of your language? (e.g., with pickle.dump() in Python.)

Database for searching by the item (news story, game, journal entry), pickle or sexprs for the individual comment threads. The database helps you manage the hopefully large number of news stories you'll get. The tree takes care of individual comments for a single news story.

Many filesystems have trouble with large numbers of files in a single directory, as you'd get with eg. pickle tree to a file per item.

Is this what you are actually doing? How do you store the pickle output?

I'm not - I'm using the approach here: http://news.ycombinator.com/item?id=33888 . That gives me much more flexibility in picking out individual comments from a thread, returning all comments by a specific user, getting all comments posted today, etc. I'm just recognizing this as another possible way to solve the problem, if you don't care about the extra functionality..

As for storing the pickled output:

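  # comment_tree and news_id are assumed names for the tree and the article key
  dbh.execute('UPDATE articles SET comments = $comments WHERE news_id = $id', {'comments': pickle.dumps(comment_tree), 'id': news_id})
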
The comments field can be a normal TEXT field, since the default format for pickle is ASCII-based.
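A quick round trip shows what ends up in that field. This is a sketch with a made-up tree; note that current Python versions default to a binary pickle protocol, so the ASCII-based format has to be requested explicitly with protocol=0, and even then the output is only guaranteed to be plain ASCII for data like this that contains no non-ASCII strings.

```python
import pickle

# A hypothetical comment tree for one article.
tree = {
    "id": 1,
    "text": "root",
    "children": [{"id": 2, "text": "reply", "children": []}],
}

# Protocol 0 is the text-based format; the default protocol would need a BLOB column.
as_text = pickle.dumps(tree, protocol=0).decode("ascii")

# Reading it back: encode to bytes again and unpickle.
restored = pickle.loads(as_text.encode("ascii"))
```

One caveat worth keeping in mind: pickle.loads can execute arbitrary code, so this only makes sense for data your own application wrote.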

Okay, makes sense. I hadn't read your earlier posting that carefully before, but it sounds better than what I had in mind.

The question has to do with storage. I suppose you could serialize it and store in a flat file, but will this beat the DB access time, even if there is extra time needed for interpretation?

What's hard about threaded comments, exactly?

Everything from how you store them, to how they are fetched, to how they are displayed and rearranged according to popularity (I think that's how news.yc works) on every vote.

If anyone is interested in a blog post about the problems with comment and forum systems in general, here is something I wrote on the subject: http://www.digitalkarate.net/?p=20

While reddit's comment system is obviously an improvement over Slashdot's, I'm increasingly convinced that community makeup is far more important than the tool it uses.

I agree, basically. Except I think customs are as important as or more important than makeup. That's why I try to discourage ad hominems.

To be fair to the reddits, it wasn't their fault that the discussion on reddit sank down toward that of digg and slashdot. They have really strong beliefs against censorship. They were never going to jump into a discussion and tell people to stop being jerks. But empirically it looks as if you may have to.

The optimistic way of phrasing this is: if you have good customs and existing users enforce them, you can probably survive the influx of 14 year olds when it comes.

That's really interesting; your emphasis on community custom is not something I'd really thought about. You could even go so far as to classify reddit's innovations as an attempt to programmatically encourage or enforce such customs. I wonder if you could take it further programmatically. Off the top of my head you could go the email route and switch to a three button system for comments: "up", "down", and "troll". I'm guessing you can tell where I'm going from there: A Plan For Trolls?.

Clay Shirky has a couple good essays about this:

http://www.shirky.com/writings/group_enemy.html http://shirky.com/writings/broadcast_and_community.html

I agree with most of his points. Better technology can not substitute for better people, although I observe a strong correlation between mental acuity and humility and, hence, 133t hacker skills.

One of the unfortunate things about communities (reddit included) is that they tend to gravitate towards a certain conformity of opinion which then rewards people who parrot the standard opinions and stifles innovation.

I was reading a blog post recently on how someone creeped to the top of Y-Combinator karma by observing karma-award patterns and posting articles which he thought people would award karma for.

This, obviously, is an effective way of winning prestige, but prestige has rarely been a very good correlate for anything. In fact, as Pg stated in his talk at Rails Conf, even at the individual level an increase in prestige frequently has a negative impact on innovation.

Redditors have been observing the same pattern in their own community. As more people have become aware of the community (the prestige has increased), the quality of the postings and karma system has declined to the point where inane 'Bush resembles Chimpanzee'-type postings dominate the main page.

There is ultimately only one solution to this problem - the creation of walls. The last online community I participated in went through a similar phase of evolution. The content was progressively diluted until a number of core members went and created their own community. This new forum/community now has an application for membership which requires that applicants demonstrate writing proficiency prior to entry.

Tie-ins to current political machinations could be articulated, but seem unnecessary at time.now

I wonder if there may be a way to build the wall you describe in the general-public site. Pretend you could rate commenters, not publicly but privately--for your own private use. If you spot a particularly dumb comment on reddit, you could blacklist the commenter--privately, so that your action would affect only you. Similarly, when you notice a brilliant comment you will want to permanently highlight its author so that for you--and only you--the author has an advantage on your attention in the future. Commenters you perceive as intolerable would never have another shot at your attention while commenters you view as geniuses would forever get artificially modded up--but only to your eyes.

The same algorithm could work for postings. If I want more Ron Paul, peace be upon Him, I can privately champion those authors who celebrate his news items. You may have had enough of the distinguished doctor-representative from Texas, so you could exercise your prerogative to ignore (or artificially down-mod, just for you) submitters whom you view as overly sympathetic to Him.
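A minimal sketch of what such private ratings might do to one reader's view. The data shapes and the personal_view() helper are invented for illustration: a negative private rating hides the author entirely, and a positive one is simply added to the public score.

```python
def personal_view(comments, my_ratings):
    """Filter and reorder comments using only this reader's private author ratings."""
    visible = []
    for comment in comments:
        rating = my_ratings.get(comment["author"], 0)
        if rating < 0:
            continue                      # privately blacklisted: never shown
        visible.append((comment["score"] + rating, comment))
    # Highest adjusted score first; the sort is stable, so ties keep site order.
    visible.sort(key=lambda pair: -pair[0])
    return [comment for _, comment in visible]

comments = [
    {"author": "a", "score": 5, "text": "solid point"},
    {"author": "b", "score": 1, "text": "overlooked gem"},
    {"author": "c", "score": 9, "text": "popular noise"},
]
my_ratings = {"b": 10, "c": -1}           # stored per reader, visible to nobody else

mine = [c["author"] for c in personal_view(comments, my_ratings)]
default = [c["author"] for c in personal_view(comments, {})]
```

Two readers with different ratings see different orderings of the same thread, and neither affects the public scores.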

Further thoughts on reddit's comment system...

Problem: Logical fallacies largely go unchecked. Just today, someone replying to my comment pulled a tautology so I reflexively linked to the appropriate Wikipedia page. As mentioned, ad hominem is a more popular sin and the list goes on. What if, when I clicked on the "Report" link, I could select from several logical fallacies? In this way, if enough others agreed with me, we could get an official-looking stamp on the offending comment--corresponding to what the community felt was the logical fallacy (e.g., straw man, false analogy, non sequitur).

Problem: Today, if I submit a comment and I down-mod all other comments I increase my comment's visibility. I want my comment to have visibility (that is why I wrote it) and I suspect other redditors may employ the cheat. So I face an ethical dilemma. We can talk about real-life karma, but why tease commenters to begin with? Solution: If you comment you can't vote in the thread.

Problem: I imagine a lot of redditors, like me, sort by top->flat and I suspect most of the remaining redditors sort by top->nested. So if I comment, I significantly increase my visibility odds if I reply to the front-runner comment. To the extent I care that people will read what I write, I will tend to favor the popular comments--even if they are not as worthy of a response as those nearer the bottom.

Your algorithm suggestion is interesting, but probably not sufficient. Although reddit trolls are one step up from internet spammers, the same principle applies. With sufficient quantity of trolls/spammers blocking each one individually becomes an incredibly time intensive task - enough to drive away all those who don't have tons of time on their hands.

Pg speaks above about customs, but the problem with customs is that users have to both (1) be educated about site customs and (2) want to be educated about site customs. I think with any influx of 14 year olds of sufficient size you will find both 1 & 2 problematic.

Now, personally, I think there is a significant difference between a news aggregation site and a community site. In a community site persons actually develop relationships and relate to each other on this basis. I have not seen that happen much on reddit or here. One reason for this is the speed of discussion - most of the best online discussions I have been involved with have taken days or weeks to fully develop. In any news-blog-digg driven site the time frame is closer to 24 hrs.

What I would favor is a multi-tiered system. For something like YC News there could be something like three tiers: (1) Old school hackers with experience in startups (2) Less experienced coders (3) Whoever wants to participate. Each conversation could be assigned a tier, and people of that tier and above tiers could participate. For example, a tier 3 discussion would accommodate people of tier 1,2 &3, a tier 2 discussion could accommodate tiers 1&2.
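The access rule in that proposal boils down to one comparison. A tiny sketch, with can_participate as a made-up name and tier 1 as the most experienced group:

```python
def can_participate(user_tier: int, discussion_tier: int) -> bool:
    """A discussion admits its own tier and every tier above it (lower numbers)."""
    return 1 <= user_tier <= discussion_tier

# Who may post in a discussion of each tier, out of tiers 1, 2 and 3.
allowed = {
    tier: [user for user in (1, 2, 3) if can_participate(user, tier)]
    for tier in (1, 2, 3)
}
```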

In many ways this model is what we already do. We have low-tier sites like digg which we may check, we come to places like this for a free form discussion mostly among people of type 2, and type 1 discussions are mainly engaged in via email. So in some ways the only innovation I propose is that some of these type 1 discussions be viewable to the public, so we can see what cool lisp hackers talk about in private (and solicit them for advice).

Response to other reddit comment comments:

Fallacy - I don't think the fallacy thing will work simply because (as with the site customs described above) one has to be educated as to what any given fallacy means.

Ethical dilemma - funny I never thought of doing that, although I think your solution would work.

Popular comments - yes this is a big problem; I have noticed it.

I tend to think online discussion systems will evolve to approximate real discussion. The online aspect is simply to allow easier information management, synchronization of more people (which would be physically impossible), and decrease of distance. Temporal issues aside, it makes pretty much sense, in that problems in real discussion easily predict problems in online discussion, and fixes to online discussion systems address problems in live discussion. It seems to me, however, that developers of discussion systems stop relating the online system to the real interaction and try to impose methods that make more programming sense, but less natural sense. Of course I'm probably not qualified to say this because I don't know how valid my theories are; we will see.

Nevertheless, based on this perspective, I suspect the tiered system would not work, simply because we don't think in tiers in real life. Everything is graded in subjective degrees. One day your Lisp email list would evolve into a nonsensical banter (somebody starts posting after getting drunk). One day a couple geniuses post on digg and write a bestseller novel on the comment system within 24 hours. Perhaps unlikely, but not impossible. There are zero barriers in real life that let this happen; when it needs to happen, it will. But online, the system becomes the barrier.

So instead of a multi-tier system, I believe the best system is bound to be an arbitrarily-tiered system. How might this work? Consider yomama on reddit, who posts yo mama jokes on every thread. Troll yes, but also a clown. If there is a clown thread, yomama gets the crown (sorry). Thing is, there are "tiers" in the sense that you only have upmod and downmod. And really, what do those mean? A tier of good, a tier of bad? Predefined categories will never be enough.

If there are disagreeing voices, I'd love to hear.

I disagree that we don't think in tiers in real life. We do - except they are called roles or titles. For instance, the typical university has undergraduate students, graduate students, professors and administrators. One could simply say that there are four tiers. The reason that we don't think of it this way is most of these institutions have grown up organically over time, or at the very least we aren't aware of the original design decisions.

Now I'm not arguing that we should necessarily replicate real-world institutions in online form, but one should at least recognize that real-world institutions (like universities) provide various measures which enforce content quality.

As for approximating real discussion, I'm not sure exactly what you mean. There are many different types of real discussion, from idle conversation about sports to academic journals about the poetry of W.B. Yeats. Although certain similar principles apply, the way in which you would model either of these would have to be different.

I'm afraid I also don't understand what you mean by an arbitrarily-tiered system. Are you arguing in favor of the reddit model?

Good points. But again, these real life tiers are invented systems and not intuitive. They're just created for efficiency in operation in a particular environment. For universities, some roles work. For companies, others. For internet citizens, I think since there is a whole new level of information transparency (or lack of), we can jump out of the traditional way of thinking about roles.

The examples you give about discussion are all "real discussion." What I'm saying is that I believe the ideal discussion system would accommodate all these discussions. Poetry, sports, academic journals, etc. If not now, eventually it must.

While sometimes it makes sense to nicheify, like separating a fashion-specific website from a tech-specific one, the material isn't always mutually exclusive, and so lots of information is lost. There should be channels for different discussions to connect to each other. Wikipedia is one example of how it can be done.

I guess arbitrarily-tiered system is a bad phrase. I just mean no explicit tiers, like, you have a bit set for mod status, a bit set for troll status, etc. Everybody will interpret authority, or hierarchy, differently, so just leave it as-is, at least with subjectively assessed content, like puns, politically incorrect jokes... Of course objective measurements are useful for winnowing things like necessarily true statements and necessarily informative statements, but not everything.

I think reddit has a pretty good model. So is Slashdot. They're both obvious improvements from BBS style discussions. But there's a lot more to explore, and this more should be where there is more flexibility. One example where this flexibility is lacking is in funny Slashdot comments. Some of the +5 Funny are great stuff, but I can't just, say, have a Slashdot Jokes section and only look at those (and add jokes to the jokes section). So there's seenonslash.com. But that's a reposting of slashdot comments; in effect, lots of information that could've been more effectively harnessed if the Slashdot system itself was more flexible.

Just my thoughts, but in general I don't like institutions and hierarchies and I'm biased against them. Of course it's insane to abolish all demarcations, but why not allow responsibilities to be expanded or diminished in small increments according to performance? It might take x percent of y to impeach someone in real life, but a virtual system would allow you to, say, have a bit less power if the populace thinks you should have less.

It'd be an interesting experiment if anything.

Lot of good thoughts there whacked. I also tend to be biased against institutions and hierarchies, although I have a similar bias against youtube comments and trollishness (which tend to be the same things). I think one of the main reasons that none of these basic problems has been solved is that the general mindset when approaching them approximates your own. That is, "we can jump out of the traditional way of thinking about roles."

I wish that were true, but the traditional way is the traditional way for a reason. It worked, providing necessary efficiency in a specific context. Now, I would argue in an 'ideal discussion system' there would be a variety of different contexts that would be modeled in different ways. This means that you have to have flexibility in the individual implementation, which is part of the reason I was exploring the possibility of lisp list strings in transactional databases below on this thread.

The problem with the arbitrarily-tiered system as I see it is that it relies on tagging. This may work in a smallish community but then you still have a problem when what you or I might consider trolls put youtube-style comments into this discussion on social modeling.

The main way that this has been dealt with is to have evolving communities that focus on very technological subject material. By keeping things at a sufficient level of abstraction, one is able to create a new tech-specific language that must be mastered before one can participate. In other words, one has created a de facto tiered system.

The dark side of this is that the need to force discussion into tech-specific language prevents people from having substantial discussion about fundamentals. And fundamentals (like how people discuss things, or would like to discuss things) are often more important than the latest trends in functional programming.

So I would say that Slashdot and Reddit, though good models, encourage a certain faddishness with respect to the abstract language which they enforce.

As for responsibilities being incremental instead of predefined, I have had similar thoughts. I think the problem (as articulated in political philosophical tomes as well as other places) is that the populace may shift unless it has an investment in something in particular. Usually this investment is in property. With the online world one would need something like core users which are in some way invested in the success or failure of a particular online venue. Once one had core users (property owners) then there are a number of fluid mathematical systems one could employ for the evaluation and promotion both of good content and good users.

I was going to post more philosophical musings related to what I see as the Turing Project, but I think Pg deleted my last comment on the topic and this post is quite long already.

I'm interested in these mathematical models used for content filtering, and I made up a little thought experiment as an exercise, but somehow it was too convoluted and I couldn't get an answer. Basically it ended up being something like, how many cookies would you give to a well-behaving kid? Yeah, it's about as general as that, although I had all these different rules for the game. It seems rather futile to predict it beforehand, so I decided that I would just run the system and see what happens. After I fix up the horrible UI...

Digg and reddit do it in the most straightforward manner. Slashdot is good because it actually matches up with some material in conversation theory. The metrics they chose are pretty good. The rating system though, not good enough.

I agree that the tech-specific language acts as a filter itself, but from a different perspective, I realized that Slashdot's modding system's UI and rules are probably a filter too. i.e., if you are willing to learn all the modding rules, and use the modding system, you've passed the "ritual." To the average user, these complications may well be intolerable.

I need to figure out what's tolerable, soon :)

Here's one model I'd be interested in testing. Three roles: basic, member, and administrator. Each account has karma (a natural number), but the exact amount of karma is invisible to all except the administrator. There is a karma meter which displays karma as a color gradient with different levels assigned different colors.

Content rating is done on a color gradient, from white to blue. Blue is good, white is bad, light blue would be okay. Only members can rate content. Ratings would also be weighted according to the karma of the person doing the rating. In other words, members with more karma would give (or take away) more karma per rating.

Each post would receive a default rating based on the karma of the poster, which would display as a color on the white to blue spectrum already described. The color would change as people rate the content.

Not only comments, but also threads could have ratings on a similar basis. This would allow people to set up certain filters so they would view threads with a certain content rating. Probably a person could automatically become a member when they pass some arbitrary karma threshold.
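To make the model above concrete, here is a minimal Python sketch of it. Everything numeric is an assumption made up for illustration: the member threshold (`MEMBER_THRESHOLD`), the karma level that maps to full blue (`KARMA_CAP`), the linear rating weight, and the rule that ratings above the midpoint add karma while ratings below it remove karma. Scores run from 0.0 (white) to 1.0 (blue).

```python
from dataclasses import dataclass, field

MEMBER_THRESHOLD = 50  # assumption: the "arbitrary karma threshold" for membership
KARMA_CAP = 200        # assumption: karma at which a default rating is fully blue


@dataclass
class Account:
    name: str
    karma: int = 0  # natural number; only an administrator would see the exact value

    @property
    def is_member(self) -> bool:
        return self.karma >= MEMBER_THRESHOLD

    @property
    def weight(self) -> float:
        # Assumption: rating weight grows linearly with the rater's karma.
        return 1.0 + self.karma / 100.0


@dataclass
class Post:
    author: Account
    total: float = field(init=False)       # weighted sum of ratings so far
    weight_sum: float = field(init=False)  # sum of the weights behind `total`

    def __post_init__(self) -> None:
        # The default rating comes from the poster's karma and counts as one
        # vote of weight 1, so later ratings can move the color.
        self.total = min(1.0, self.author.karma / KARMA_CAP)
        self.weight_sum = 1.0

    @property
    def score(self) -> float:
        return self.total / self.weight_sum


def rate(post: Post, rater: Account, value: float) -> None:
    """Apply one rating in [0.0, 1.0] (white to blue) from a member."""
    if not rater.is_member:
        raise PermissionError("only members can rate content")
    if not 0.0 <= value <= 1.0:
        raise ValueError("rating must be between 0.0 (white) and 1.0 (blue)")
    w = rater.weight
    post.total += w * value
    post.weight_sum += w
    # Ratings above 0.5 give karma, ratings below take it away; raters with
    # more karma move more of it. Karma never drops below zero.
    delta = round((value - 0.5) * 2 * w)
    post.author.karma = max(0, post.author.karma + delta)


def to_rgb(score: float) -> tuple:
    """Map a score to a color on the white-to-blue gradient."""
    level = round(255 * (1.0 - score))
    return (level, level, 255)
```

With these made-up numbers, a post from a zero-karma account starts out white, and a single fully blue rating from a member with 100 karma (weight 2.0) pulls the score up to two thirds. Whether a weighted average is the right way to combine ratings is exactly the kind of thing that would need real user data.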

What do you think about this? Are you already working on something 'tolerable'?

Similar, but a non-karma karma system. :)

One thing though, I am fairly uncertain about the color gradient. In the beginning I used red-green. But some cultures have the order reversed, and total red and total green excite the eyes differently. Then I opted for using colorless-color (white-"gold", since gold is supposed to be culture-neutral), but when I showed some people they were confused and told me to use red-green :/

I also considered the "full functionality after x threshold" but it's opaque to me what would be a reasonable x. I hope I'll be able to make a judgement later, based on real user data.

Yeah, I'm not big on the word 'karma' myself, but it happens to be what is out there.

I'm not sure about 'x threshold' either. If 'karma' assignment is weighted it wouldn't matter much anyways.

As for colors, what about a bar that increases vertically and has multiple segments? Higher is better and darker? This would give people two things to lock onto, just in case color is not enough.

just my two beans...

"They were never going to jump into a discussion and tell people to stop being jerks."

Really? Isn't that more or less what they did with Pica when they requested that he stop using certain epithets?

Did they? I didn't know about that, but I'm not surprised. Still, it shows how tolerant they were if it took him to make them say something.

I remember reading something from spez (I think) to the effect of "We asked him to tone it down, and he did." And I think pica said he'd been told to say "n-word" instead of, well, the n-word.

I have no idea why this was satisfactory to the reddits, since all it did was add a hilarious dash of ironic political correctness.

I really doubt that they did...

I believe richardkulisz was also dinged. His new username is "redditcensoredme". So it does seem to only be the most extreme cases.

There are some very popular forum sites that have existed for years and maintained civility. Any Reddit-type site could learn a lot from them. They almost all have a group of core users who act as moderators and enforce a set of published rules. It's critical that they not make up the rules on the fly, otherwise users won't respect the decisions.

I think richardkulisz just misunderstood the reporting algorithm. He was a little less polite than pica, so people resented and reported him rather than downmodding and ignoring.

The reddit founders are letting the users use the report button for censorship instead of spam control -- so that's what it is. That's not good if you ask me.

Lowtax (he runs somethingawful.com) gave a talk about web communities at UIUC a few years ago and he basically said that without good moderation large sites will just degenerate into crap. The talk (http://www.acm.uiuc.edu/conference/2005/video/UIUC-ACM-RP05-...) takes 15 minutes or so to get into anything serious, but I think there are some good lessons in there.

Did anyone point out to the reddits that telling someone to stop being a jerk is not censorship?

The one issue I have with the comments is that they are not in chronological order and reorganize as new posts are added. It is annoying to figure out which messages you have already read and which you have not since they are all mixed together. A simple chronological thread would be nice.

On the right hand side of the screen, you can sort the comments chronologically.

Thank you. Never saw that before. That should be incorporated on YC News.

I would like a function to hide and expand the comments. For example, one click on a comment would hide it and all its sub-comments. It is a good way to organize the discussions, and I have always believed a website is at its best when scrolling is minimized.
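A rough Python sketch of that hide/expand idea, assuming comments are stored as a simple tree. The `Comment` class, the `collapsed` flag, and the `[+] N hidden` placeholder are all invented for illustration and are not how any of the sites discussed here actually work.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    text: str
    replies: List["Comment"] = field(default_factory=list)
    collapsed: bool = False


def toggle(comment: Comment) -> None:
    """One click: hide a comment and its sub-comments, or show them again."""
    comment.collapsed = not comment.collapsed


def count(comment: Comment) -> int:
    """Number of comments in this subtree, including the comment itself."""
    return 1 + sum(count(reply) for reply in comment.replies)


def visible_lines(comment: Comment, depth: int = 0) -> List[str]:
    """Render the thread, replacing each collapsed subtree with one line."""
    indent = "  " * depth
    if comment.collapsed:
        return [f"{indent}[+] {count(comment)} hidden"]
    lines = [f"{indent}{comment.text}"]
    for reply in comment.replies:
        lines.extend(visible_lines(reply, depth + 1))
    return lines
```

Collapsing a comment leaves the `collapsed` state of its replies untouched, so expanding it again restores the subtree exactly as the reader left it.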

I'd like to see 2 moderation options.

1. The standard up/down arrow -- does this contribute to the debate, is it a troll, etc.

2. The agree/disagree poll. This could be a thumbs up/thumbs down.
