

PG: What's your current take on reddit's comment system? - brett
http://reddit.com/info/17844/comments/c50
While reddit's comment system is obviously an improvement over Slashdot's, I'm increasingly convinced that the community is more important than the tool. 

======
Alex3917
On the topic of community makeup, here are some interesting findings that come
from Yahoo's social scientists (using Yahoo lists as the dataset):

1) Small groups have higher numbers of core users

2) Private and semi-public groups have higher numbers of core users than
public groups

3) People in private groups remain in the core for longer

4) A user who is in the core of one group is more likely to become a member of
the core when joining another group

5) New members who eventually become part of the core are more likely to
receive messages from other users who have been long-time core users

6) New members who eventually become light users are more likely to receive
messages from other light users, if they receive replies at all.

This suggests some strategies for building a strong group of core users to
transmit social norms. For example, having an idea of who the core users are
in other similar communities and making a special effort to reach out to them
if they visit your site.

Unhappily I lost the original paper and all I have are my notes, so I don't
know exactly how they defined the core users offhand.

~~~
Jd
Alex, this is very interesting. Do you think there might be a way to get one's
hands on the original? If there is a person/persons at Yahoo to contact I
would do it, as this is something I have thought about quite a bit and would
love to see more about the definitions of core, light, new, long-time, large
and small.

You can email me if you like.

~~~
Alex3917
So I heard this talk at the Cornell Microsoft International Symposium on
Self-Organizing Online Communities. The presentation was called "Implications
for Future Research; Theory and Methods in Large Scale Analysis of Online
Communities." And the three Yahoo presenters were Ravi Kumar, Andrew Tomkins,
and Cameron Marlow. Not sure which one of those three specifically, but I'm
sure you could ask any of them.

I've developed this hobby of sneaking into academic conferences. It works best
if, when everyone else is wearing a suit and tie, you show up in shorts and a
t-shirt. That way everyone just assumes that you're friends with one of the
speakers, or else that you're Sergey Brin. :-)

~~~
Jd
At least one related paper is available online:

[http://research.yahoo.com/publication/structure_and_evolutio...](http://research.yahoo.com/publication/structure_and_evolution_of_online_social_networks)

I probably won't have time to read it until this evening, but will post then.

~~~
Jd
Yeah, this is mostly about speed of growth in social networks and only looks
at Yahoo 360 and Flickr. I will keep looking for the original paper and post
back if I find it.

------
acgourley
I have the same problem with reddit and digg: I like to read exceptionally
downmodded comments. When comments get modded down so much that they are not
displayed, and also they generate a lot of responses, it's the kind of thing I
can't help but look at. Like a train wreck, I suppose.

It actually sucks me into the troll MORE when they get such special
highlighting. I should say I do like how top level troll posts get put at the
bottom for reddit.

~~~
deramisan
It bugs me that downmodded comments trash karma - political disagreements
often get downmodded not because they aren't valid, but because someone
disagrees fundamentally. That doesn't seem right.

~~~
icky
Reddit comments DO NOT AFFECT KARMA.

Only submitted articles do.

I have no idea how you got any different idea into your head, but it was not
through observation of reality.

------
brett
Incidentally, I came across this while trying to find information about how
reddit stores threaded comments. I've read they use Postgres, and while I've
also read about many workable techniques to force hierarchical (and frequently
reordered) comments into a relational db, these solutions all sort of make me
cringe. I'd welcome anyone's ideas here as well.

~~~
twism
I am in the process of making a news.yc clone... threaded comments are not as
easy as they look on news.yc.

[edit] especially when the comments get reordered on votes.

~~~
aston
If you're trying to mimic this forum with a database solution, it's going to
be a pain. PG's keeping this stuff in memory, almost certainly as the tree
that it is.

edit on the edit: scratch that.

~~~
nostrademons
I mentioned a possible solution here:
<http://news.ycombinator.com/item?id=33902>, but I'm really commenting because
there's a really cool connection between PG's approach of storing it as an
in-memory tree and my approach of a 2048-bit index:

Y'know how in a typical CS data structures course you'll have to build a heap,
and then reimplement it with an array? A heap is conceptually a complete binary
tree, with every level filled except possibly the last, while there's nothing
tree-like about an array. However, your CS profs were trying to get across that
_any binary tree can be stored as an array_, with the root in position 0, its
children in positions 1 and 2, their children in positions 3, 4, 5, 6, etc. If
a node doesn't have a child, that position in the array is empty.

Normally this is very inefficient - a totally-unbalanced (linked list) binary
tree of length 20 could require an array of about a million slots to hold it.
However, heaps are always perfectly balanced, so an array really is the most
efficient implementation for them.
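
A minimal sketch of that array layout in Python, to make the arithmetic concrete. The helper names (`children`, `parent`, `slots_needed`) are made up for illustration; only the index rule (root at 0, children of slot `i` at `2*i + 1` and `2*i + 2`) comes from the description above.

```python
def children(i):
    """Array slots of the left and right child of the node in slot i."""
    return 2 * i + 1, 2 * i + 2


def parent(i):
    """Array slot of the parent of the node in slot i (i > 0)."""
    return (i - 1) // 2


def slots_needed(occupied):
    """Length of the smallest array that can hold every occupied slot."""
    return max(occupied) + 1


# A heap with 20 nodes fills slots 0..19 with no gaps.
heap_slots = list(range(20))

# A chain of 20 nodes that always descends to the right child.
chain_slots = [0]
for _ in range(19):
    chain_slots.append(children(chain_slots[-1])[1])

print(slots_needed(heap_slots))   # 20
print(slots_needed(chain_slots))  # 1048575
```

Same node count, but the chain's last node lands in slot 2^20 - 2, so nearly every slot in its array is empty.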

By restricting users to a certain number of replies to each comment (65,536),
I'm effectively constraining an arbitrary tree structure to a 65,536-ary tree.
Which can then be represented by an array. If held in memory, it'd be a very
sparse and inefficient array.

But databases are excellent at storing sparse arrays. If I don't store a
potential key, it doesn't take up any space in the DB. And with a good
indexing system, I can reference any element in constant time (well, O(logN),
but the log base for B-trees is so huge that it might as well be constants).
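
Here is a rough sketch of how such a key could be built, in Python. It assumes 16 bits per nesting level (enough for 65,536 replies), which would make a 2048-bit index 128 levels deep; that split is an inference from the two numbers above, not something spelled out in the linked post. The function names are hypothetical, keys are variable-length byte strings rather than padded to the full width, and a dict stands in for the database table.

```python
BITS_PER_LEVEL = 16                       # 65,536 possible replies per comment
KEY_BITS = 2048                           # index width mentioned above
BYTES_PER_LEVEL = BITS_PER_LEVEL // 8
MAX_DEPTH = KEY_BITS // BITS_PER_LEVEL    # assumed nesting limit: 128 levels


def child_key(parent_key: bytes, ordinal: int) -> bytes:
    """Key of the ordinal-th reply to the comment stored under parent_key."""
    if not 0 <= ordinal < 2 ** BITS_PER_LEVEL:
        raise ValueError("too many replies to one comment")
    if len(parent_key) // BYTES_PER_LEVEL >= MAX_DEPTH:
        raise ValueError("thread is nested too deeply")
    return parent_key + ordinal.to_bytes(BYTES_PER_LEVEL, "big")


def thread_order(store: dict) -> list:
    """Depth-first order: bytewise key order puts a parent before its replies."""
    return [store[key] for key in sorted(store)]


def subtree(store: dict, key: bytes) -> list:
    """A comment plus everything under it: the keys that share its prefix."""
    return [store[k] for k in sorted(store) if k.startswith(key)]


# Only keys that exist take up space; b"" is the (unstored) story itself.
a = child_key(b"", 0)
b = child_key(b"", 1)
a0 = child_key(a, 0)
a1 = child_key(a, 1)
a00 = child_key(a0, 0)
store = {b: "B", a00: "A.0.0", a1: "A.1", a: "A", a0: "A.0"}

print(thread_order(store))   # ['A', 'A.0', 'A.0.0', 'A.1', 'B']
print(subtree(store, a0))    # ['A.0', 'A.0.0']
```

Sorting by key alone yields the thread in display order, which is the property an indexed database column would provide for free.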

So, I'm really just applying some of the meta-principles of CS coursework to
basic data structures. (Actually, I got the idea off a Slashdot posting and
had to figure very little of this out on my own, but it's still really cool.)

Incidentally, there's apparently a deep connection between number systems and
tree structures. You can read more about it in Chris Okasaki's _Purely
Functional Data Structures_. Apparently, every number system corresponds to a
data structure. For example, unary (Peano) numbers = linked lists, binary
numbers = binomial heaps, skew binary numbers = skew binomial heaps, etc.
There's a short
presentation on it here: <http://www.informatik.uni-bonn.de/~ralf/talks/BCTCS.pdf>,
and I'd strongly recommend the book.

~~~
Jd
Just thinking off the cuff here, but couldn't one store a lisp-style list as a
string in a relational database? If I were using Ruby on Rails, I could simply
build a simple lisp interpreter in Ruby to parse the string, then capture it
as an in-memory tree. This way I would only have active items in memory and
could use some of the advantages of the standard web development platforms for
whatever else I was doing.

Edit: See Ruby/Lisp for one such already built interpreter.
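
To make the idea concrete, here is a small sketch of it in Python rather than Ruby. The string format is an assumption: each list is `(comment_id reply reply ...)`, holding only integer ids, and `tokenize`, `parse`, and `dump` are illustrative names, not part of any existing interpreter.

```python
def tokenize(text):
    """Split an s-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression, consuming tokens from the front of the list."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return node
    return int(token)  # atoms are comment ids in this sketch


def dump(node):
    """Serialize a nested list back into the stored string form."""
    if isinstance(node, list):
        return "(" + " ".join(dump(child) for child in node) + ")"
    return str(node)


stored = "(1 (2 (4)) (3))"        # comment 1 has replies 2 and 3; 2 has reply 4
tree = parse(tokenize(stored))
print(tree)                        # [1, [2, [4]], [3]]
print(dump(tree) == stored)        # True
```

One cost of this layout is that adding a single comment means rewriting the whole stored string for the thread.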

~~~
nostrademons
Sure you could. You could also select all comments on the item and rebuild the
tree in-memory, as Vlad suggests elsewhere on this thread. The code isn't even
as hard as I'd thought it'd be:

    
    
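      # dbh is a hypothetical DB handle; newsid is passed as a bound parameter.
      comments = dbh.query('SELECT * FROM comments WHERE news_id = ?', newsid)
      response_tree = {}
      roots = []
      for comment in comments:
          comment.children = []
          response_tree[comment.id] = comment
      for comment in comments:
          parent = response_tree.get(comment.parent_id)
          if parent is None:
              roots.append(comment)  # top-level comment: no parent in this result set
          else:
              parent.children.append(comment)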
    

The nifty thing about storing it in the database is that you need _no_
application-level post-processing, you can take advantage of the DB for ad-hoc
queries (what if you want to show subtrees of all posts by a certain user,
like the "threads" page here?), and it happens to map to a set of simple and
fast operations provided by most databases.
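
For anyone who wants to poke at this, below is a sketch using Python's built-in `sqlite3` module. The schema, the sample rows, and the encoding are all assumptions made for the example: each nesting level is written as 4 hex characters (16 bits) in a text `path` column, and the function names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (path TEXT PRIMARY KEY, author TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [
        ("0001", "bob", "top-level B"),
        ("00000001", "carol", "second reply to A"),
        ("0000", "alice", "top-level A"),
        ("000000000000", "alice", "reply to bob"),
        ("00000000", "bob", "reply to A"),
    ],
)


def whole_thread():
    """Display order and nesting depth come straight from the key."""
    sql = "SELECT length(path) / 4, body FROM comments ORDER BY path"
    return conn.execute(sql).fetchall()


def subtree(path):
    """One comment and all of its descendants: a prefix match on the key."""
    sql = "SELECT body FROM comments WHERE path LIKE ? || '%' ORDER BY path"
    return [body for (body,) in conn.execute(sql, (path,))]


def threads_by(author):
    """Every comment by author, plus the subtree under each of them."""
    sql = """
        SELECT DISTINCT c.path, c.body
        FROM comments AS c
        JOIN comments AS mine ON c.path LIKE mine.path || '%'
        WHERE mine.author = ?
        ORDER BY c.path
    """
    return [body for (_, body) in conn.execute(sql, (author,))]


for depth, body in whole_thread():
    print("  " * (depth - 1) + body)

print(subtree("00000000"))   # ['reply to A', 'reply to bob']
print(threads_by("carol"))   # ['second reply to A']
```

None of the three queries walks the tree in application code; the ordering and the prefix match do the work.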

~~~
Jd
After some reflection, I suppose my only hesitation with this method is that
it locks one into dealing with the comments in a particular way, whereas the
lisp model I outlined is easily adaptable to a variety of schemas.

However, I'm not sure how else one would want to model things. Must think
more...

------
acgourley
If anyone is interested in a blog post about the problems with comment and
forum systems in general, here is something I wrote on the subject:
<http://www.digitalkarate.net/?p=20>

------
brett
While reddit's comment system is obviously an improvement over Slashdot's, I'm
increasingly convinced that community makeup is far more important than the
tool it uses.

~~~
pg
I agree, basically. Except I think customs are as important as or more
important than makeup. That's why I try to discourage ad hominems.

To be fair to the reddits, it wasn't their fault that the discussion on reddit
sank down toward that of digg and slashdot. They have really strong beliefs
against censorship. They were never going to jump into a discussion and tell
people to stop being jerks. But empirically it looks as if you may have to.

The optimistic way of phrasing this is: if you have good customs and existing
users enforce them, you can probably survive the influx of 14 year olds when
it comes.

~~~
byrneseyeview
"They were never going to jump into a discussion and tell people to stop being
jerks."

Really? Isn't that more or less what they did with Pica when they requested
that he stop using certain epithets?

~~~
pg
Did they? I didn't know about that, but I'm not surprised. Still, it shows how
tolerant they were if it took him to make them say something.

~~~
byrneseyeview
I remember reading something from spez (I think) to the effect of "We asked
him to tone it down, and he did." And I think pica said he'd been told to say
"n-word" instead of, well, the n-word.

I have no idea why this was satisfactory to the reddits, since all it did was
add a hilarious dash of ironic political correctness.

------
transburgh
The one issue I have with the comments is that they are not in chronological
order and reorganize as new posts are added. It is annoying to figure out which
messages you have already read and which you have not, since they are all mixed
together. A simple chronological thread would be nice.

~~~
bls
On the right hand side of the screen, you can sort the comments
chronologically.

~~~
transburgh
Thank you. Never saw that before. That should be incorporated on YC News.

------
naivehs
I would like a function to hide and expand the comments. For example, one
click on a comment would hide it and all its sub-comments. It is a good way to
organize the discussions, and I always believe a website is at its best when
scrolling is minimized.

------
dantheman
I'd like to see two moderation options.

1. The standard up/down arrow -- does this contribute to the debate, is it a
troll, etc.

2. The agree/disagree poll. This could be a thumbs up/thumbs down.

