

Ask PG: Statistics - tel

There have been a few posts (one of them mine) voicing fears about "fluff" posts taking over. Proposed solutions range from allowing downmods and adjusting weighting algorithms to simply blacklisting Reddit posts.

One thing I'd like to see before considering adjustments like this would be more detailed statistics about how people vote, how leaders vote, how karma is distributed, etc.

I feel like interesting data sets would be:

   Across all users
     Karma
     Number of votes
     Number of submissions
     Times downmodded
     ...

   Across all posts
     Karma
     Comments
     Flag for whether it's considered "fluffy" (mod's discretion)
     Flag for Reddit submission
     ...

Really, it could be a pretty big problem. It does seem like a potential playground of hacks, and at least some of those statistics are certainly not difficult to mine (probably Arc one-liners).

So, if it seems interesting and there's no issue with privacy (strip out usernames and it'd be hard to correlate anything beyond the leaderboard), is there any chance of seeing an hn-stats tarball?
======
bigtoga
Anyone who doesn't work for HN and is this concerned with stats is just looking for
more places to waste their time instead of working on their hack/startup. This
isn't your community; it's a website. You may think of it as your community
but where will you be two years from now? Still wasting time on this site
asking for stuff like this? No. Google "This too shall pass" for it is the
lesson of life. Really - the post today, "Do It Fucking Now", applies here as
well lol. Figure out what's important and do that; having stats on this site
is not important in your or my life.

~~~
bayareaguy
Some of us aren't interested in the numbers but we are interested in the
patterns behind them, especially if they could be used to identify destructive
trends.

~~~
fiaz
Organizations that think like this:

DHS

FBI

CIA

DOD

KGB

Thought Police

Please don't add YC.News to this list. It's one of the few places I enjoy on
the net at the moment (and of those few, this is my favorite). I don't like
the notion of having some sort of policing activity to identify "destructive
trends" as you call it.

Sorry for putting it so bluntly.

~~~
rrival
Having observed Digg and Reddit devolve, solving this problem has nothing to
do with conspiracy theories. This is more about addressing Eternal September
concerns (brilliant reference by someone yesterday). History will repeat
itself until a clever, llama-free community _ahem_ addresses these problems
intelligently. I can't think of a better place for it.

~~~
fiaz
I'm sorry that you think I was subscribing to some sort of conspiracy theory
in listing those organizations. The point I was trying to make is that what
has been termed "fluff" is now "destructive trends", and worse, it was
suggested that exposing others' behaviors (or rather, revealing stats as
illustrated in the above description) would somehow lead to a (final?)
solution.

The degree to which the original "problem" is becoming viewed as cancerous is
alarming because I feel that News.YC is a place for diverse thinking and
sharing interesting content. Digg/Reddit have devolved because they are more
homogeneous than they are diverse (hence the mob-like mentality).

The very diversity that makes an online community interesting should be
embraced instead of eradicated, but that's just my opinion. I know that
eventually as News.YC becomes more popular this community will homogenize and
follow the same fate as Digg/Reddit.

It is inevitable...

~~~
bayareaguy
_I know that eventually as News.YC becomes more popular this community will
homogenize_

Suppose you wanted to convince someone who doesn't "know" that. Don't you
think the right data could prove your point?

~~~
fiaz
I don't feel the need to convince anybody of that which they do not know.
Also, I don't feel that having data of users is more advantageous than proving
my point; there are also other pieces of info I can dig up (if you're
interested) that may or may not strengthen my argument. I am VERY open to the
idea that I might be wrong, which indeed might be the case. However, I will
always maintain that what most concerns me about the nature of this discussion
is NOT the efforts put forward to solve a perceived problem but the way in
which a threat is interpreted and what we are willing to give up in exchange
for dealing with the perceived threat.

Btw, I really don't want to come off as being antagonistic here. I'm sticking
to my view until I see a benefit to what everybody has been advocating. In the
meantime, feel free to downmod me if you disagree. I have no problem paying a
penalty for disagreeing, if that is the cost associated with expressing my
views, regardless of how well they are accepted.

(thanks for reading this long-ass post if you got this far!!)

------
andreyf
In general, Paul, is there a reason not to open up a machine-readable HN
platform for people to tinker on top of (other than the time it'll take to
code)?

~~~
ed
Exposing voting history strikes me as an obviously bad thing to do given that
votes have always been anonymous. Even if you tried to scrub personal
information from the data, it'd be fairly easy to match anonymous IDs to HN
nicks.

~~~
tel
Full voting history, probably, yes. Number of votes (some relation to
activity, perhaps) not so much.

Anything that's obviously going to make it easy to reverse engineer anonymous
things isn't a good idea, but that doesn't mean you can't still find
interesting gems.

~~~
yters
I agree. Aggregate stats such as whether a small group does most of the
voting, whether the same people seem to vote for the highest ranked items,
etc. would be pretty interesting while not giving away too much information.

I am also curious whether a completely open system would work. Has that ever
been tried? On a system the size of Reddit or Digg it would be intriguing to
see how groups cluster.
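
As a rough illustration of the first aggregate stat mentioned above (whether a small group does most of the voting), here is a minimal Python sketch. It assumes the only data available is an anonymized list of per-user vote counts; the function name `top_voter_share` and the sample numbers are made up for illustration and do not come from any real HN data.

```python
def top_voter_share(votes_per_user, top_fraction=0.1):
    """Fraction of all votes cast by the most active `top_fraction` of users.

    `votes_per_user` is a hypothetical list of vote counts, one per
    anonymous user; no usernames or per-vote history are needed.
    """
    counts = sorted(votes_per_user, reverse=True)
    total = sum(counts)
    if total == 0:
        # No votes at all (or no users): nothing to attribute.
        return 0.0
    # Always include at least one user in the "top" group.
    top_n = max(1, int(len(counts) * top_fraction))
    return sum(counts[:top_n]) / total


# Made-up vote counts for ten anonymous users.
sample = [120, 80, 10, 5, 5, 4, 3, 2, 1, 0]
share = top_voter_share(sample, top_fraction=0.2)
print(f"Top 20% of users cast {share:.0%} of the votes")
```

Because it works only on totals per user, a stat like this gives away nothing about who voted for what.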

------
iamdave
I'm most in favor of the flag option. I understand how users with a high
degree of karma can prove themselves worthy by virtue of what karma even
means, but combining flags and the number of times downmodded creates a more
level playing field for the process.

------
ghiotion
The original question aside (for the record, I'd find it interesting as well),
does anyone else think this comment thread is kinda nasty? Ugh.

------
fiaz
I have the following question for those that are advocating a
tech/statistical/mathematical solution: How many of you are individuals who
are in the process of creating a startup in the hopes of getting YC funding?

The reason why I'm asking this question is because I had the impression that
the articles here are more reflective of the VC/hacker roots of YCombinator
(silly me!). I also feel that somehow a lot of the up-votes for this
perspective are reflective of the notion that if you complain to PG enough,
he'll swoop in and fix things so that all is well. Hint: if you continue to
have this attitude, then you will see your chances of getting YC funding (or
even have them bat an eyelash at your application) decrease rapidly because
your objection to "fluff" is so strong; do you really think anybody in their
right mind would fund somebody who can't handle the "fluff"?!?!?!?!

I recognize that there are many who have been suggesting solutions that PG
implement (and to his great credit he has pleased many of you by expanding the
leaders list to 100 now!); but I also want to know if your energies would be
better spent actually finding articles that you like and submitting those
(in other words, being competitive), and perhaps even EARNING karma points
(which translates into recognition as well).

Not everything is solvable in code and not everything that you do not like is
a "problem". I would argue that the diversity (which can be read as the mix of
hard-core geekstuff along with fluff) is reflective of the VC/hacker roots of
Hacker News. Or in other words: break out of the box and have a multifaceted
view of the world.

~~~
iamelgringo
I'm probably going to regret responding to this, but here goes...

Many of us are also very invested in this site, and are interested in keeping
it close to its roots. That's why there's been a lot of alarm regarding the
content on the site. I'm sure that some of it will blow over. Some of it will
cause changes.

I don't really think that people are trying to get PG to automagically fix the
problem for us. I think that we're trying to have an open discussion about
solving the Eternal September problem with social sites. This is a recurring
problem that occurs on pretty much any site that has a user-contributed
component and gets popular.

And, regarding earning karma points: If you take a look at the people
frequently involved in the discussion on fluff, trolls and quality slides, I
think that you'll find that many of the people that are concerned about this
are actually quite active on the leader board. I think it's because we're on
the leader board, that we're so invested in keeping Hacker News on the
straight and narrow. We've invested a lot of time and effort into the site,
and are willing to continue that investment.

~~~
fiaz
First of all: great response!!

"...a lot of alarm regarding the content on the site."

The fact that people are finding "fluff" to be "alarming" is what is
concerning me more because this is the attitude that leads to mob mentality
very quickly. Everybody is free to think and act as they please. So if this
means combatting "fluff", then by all means do that.

On the other hand, I will always be speaking out for diversity. I'm sure we
can all agree that great ideas don't grow out of homogeneity and that it
requires a mix of information from different sources.

Fear mentalities are much more dangerous than "fluff" because they can seep
into other aspects of your life...

Btw, can anybody explain to me why I'm being downmodded so much? I really
don't care about the karma itself (except for real-life Karma of course!), but
I would like for somebody to explain to me what is wrong with my arguments
regarding this emerging "War On Fluff" (sorry, I couldn't resist!!). I will
also add that I hope when I ask for feedback on my startup/concept in the
coming weeks people are just as sincere about giving me some feedback!

;)

~~~
Spyckie
It's because you imply (intentionally or no) that everyone who believes that
some form of content control should be implemented has intentions similar to
the DOD, CIA, KGB, etc.

In retrospect, ultimate control of the site belongs to pg and if fluff does
start flooding the site, he could just reincarnate the site elsewhere with a
different policy backend. However, there's no harm in trying to do it right
the first time. :)

~~~
fiaz
Interesting....

The reference to that list of covert organizations was to imply the fact that
there is a lot of snooping going around into people's activities in real life
and that I like the fact that I can hang out with some people online without
the need to worry about what other people think my intentions are. Or more
directly: I don't like to have to worry about my participation on News.YC
being accepted/rejected on criteria other than being voted up or down.

You disagree with me on something? Fine, down vote me or respond, but don't
start trying to make predictions on what I might do next. There is nothing
friendly about trying to anticipate what my next move might be so that you can
identify something "alarming" about what I'm doing. I think the aptly termed
"fluff" is harmless, and I've been arguing all along that it contributes to
diversity, which is key for idea generation/exchange. Homogenize the
community, and it begins to lose value.

I MIGHT submit a series of articles that others think are "fluff", but does
this mean that now I go on some sort of "fluff watch list" and need to be
monitored, or perhaps "dealt with" because my articles/views/ideas are not in
line with the majority of members?

Apologies to everybody if I am still coming across as implying some sort of
conspiratorial tone (in which case you may down mod me or explain to me where
I err in my argument), but I still stand by my conviction that monitoring
others to "weed out" certain "alarming trends" is far worse than the pain
suffered by "fluff".

I will add that fluff != spam. What others judge to be "fluff" is just that -
a judgement call. I'm simply speaking out against judgmental attitudes en
masse (or if you prefer the shortened terminology - mobs).

I appreciate the response!!

