
If there's one thing I wish I could do to improve HN, it would be to detect this sort of middlebrow dismissal algorithmically.

Unsophisticated people read an article like this and think: Gosh, I better eat honey for breakfast! People a little more sophisticated think: Hey, this is anecdotal evidence! Yeah, we know that. But is that the most interesting thing one can say about this article? Is it not at least a source of ideas for things to investigate further?

The problem with the middlebrow dismissal is that it's a magnet for upvotes. The "U R a fag"s get downvoted and end up at the bottom of the page where they cause little trouble. But this sort of comment rises to the top. Things have now gotten to the stage where I flinch slightly as I click on the "comments" link, bracing myself for the dismissive comment I know will be waiting for me at the top of the page.

I'm not commenting on this article specifically (not least because I haven't read it), but there are plenty of articles that deserve little more than "middlebrow dismissal".

Very likely one could find plenty of more interesting things to say about those articles. One can find interesting things to say about anything. But if an article is about an attention-grabbing subject, is superficially plausible, but is just plain unsound in the sort of way the grandparent of this comment is alleging, the "middlebrow dismissal" -- predictable as it is for the cognoscenti -- may still be the most useful thing there is to say about it, and a comment thread that didn't have "please note, this is probably wrong in the usual way" near the top of it would be a bad and misleading one.

If HN is worse off for being full of middlebrow dismissals, the real problem may not be the middlebrow dismissals but the articles that provoke them.

I'd been thinking about this. It's true that sometimes an article is so obviously mistaken that the most valuable comment would be one that pointed out how. But such articles are fairly rare here. Ideally comment threads in which the top comment was a dismissive one would be proportionately rare. Whereas now they're almost like background radiation.

I agree, but I'm not sure that there is any other way to establish positive discourse -- for most posts there are the enthusiasts and the middlebrow detractors, and virtue may often only appear after both of these positions are raised and discarded in favor of a hopefully Aristotelian mean. This is also to say that appropriate discourse depends largely on the context (i.e. what has already been said on the topic), and not simply on a single right answer (as in the sciences).

It would be interesting to do some sentiment analysis on comments and look at the statistics of dismissive/approving top comments, and of all comments in general, in HN threads.

I think the way to solve this is by better education and reinforcement. Wouldn't hurt to place a prominent link in the top nav like "better commenting" that went to a page with explanations and examples of good and bad. Perhaps have it up there one week per month.

There are many articles that should be dismissed, but when every time I click on the comments, I know somebody will be saying something negative about the article (and usually something fairly obvious and uninteresting), then I can't help but agree with PG that there is intellectual snobbery rather than real, interesting discussion happening.

You don't get many points for interesting. You get points for cynicism and negativity.

What I find interesting is that I often learn/gain or notice something from these middlebrow dismissal comments. I am not saying they are all good - but many here make legitimate, logical and scientific points. With the rise of blogging and other open/free channels of self publishing, these comments are often necessary rebuttals to the overhyped tones present in many articles.

Not every dismissive comment is a "middlebrow" comment. There are some people here with some really great knowledge. I love the "highbrow" dismissals because that's when I learn something new.

What the article itself deserves is a minor issue.

What the HN discussion deserves, though, is different. We as a community can do much better than that. We do, sometimes. We can more often.

Perhaps a challenge for anyone who feels that an article really must have a middlebrow dismissal---

Limit your dismissal to a few sentences, then follow on with a paragraph or two of thinking that goes beyond the dismissal into areas of thought sparked by the subject of the article. What bigger picture might it factor into? What underlying phenomenon may be behind it? What additional understanding can it add for us as a community?

I certainly won't try to speak for pg, but one thing to keep in mind is that this is a good way to approach topics if you're trying to come up with new ideas that have big potential. Merely pointing out the existing problems is the first step. Take the next few as well.

It's been a while since I've coined a new term on HN, but I think now is as appropriate a time as ever:

cargo cult skepticism.

edit: ~10 previous results on Google. :-/

Or you could just reference William Golding.

http://www.smartercarter.com/Essays/Thinking%20as%20a%20Hobb... It's "Grade-two thinking".

Skepticism is also the wrong name for this ideology. Genuine skeptics would direct their doubt against the leading dogma of the age, whereas this ideology seeks to comfortably reinforce it. Historically, its true name – as I believe you pointed out here once – is positivism.

No, it's a fine name:

cargo cult skepticism is to skepticism like cargo cult science is to science. (Which is to say: it isn't)

I've noticed this too. It's a vicious hit-and-run attack on a thread's discussion that presumes only the empirical viewpoint is worth considering. It often trainwrecks the thread into intellectual pissing contests, rather than discussing subjects with fellow human beings.

Removing karma entirely might help. Too many people treat it as a vanity metric, and game it.

And the way to game it is by posting "middlebrow dismissals".

As a poster I am often surprised by what gets upvoted. Everyone surely can see this: negativity is rewarded. It only reinforces in my mind that "karma" means little anymore.

You can get karma just by being dismissive. Look at the tone of some of the posters who consistently jump into the top spot, thanks to their "karma". I get tired of seeing those same monikers over and over^1; the comments are often rubbish. No matter how articulate and cogent their comments may have been in the past, no one is 100% consistent; we should not have to read _everything_ they say. But it doesn't matter if they are on the mark from day to day because they get a top spot no matter what they contribute, based on accumulated karma. You are forced to read what they've said, no matter how silly it is.

1. Unless you need to contact someone, I find usernames and profiles to be about as useful as karma (=not very), but I doubt many others would share my view. My interest is in quality comments that offer useful information, not "reputations". People with great "reputations" often make some very dumb comments. Judging the quality of a comment by the author's username instead of its content is a fool's game. It's also a basis for the HN algorithm.

> only the empirical viewpoint is worth considering

So what other viewpoints are there?

I would be far more interested in a discussion about how we could identify the factors that lead to a community that doesn't appear to be following the aging and health norms of other communities rather than just saying "the article is stupid and isn't worth reading".

"how we could identify the factors that lead to a community that doesn't appear to be following the aging and health norms of other communities"

How do you propose to do this while avoiding empirical evidence?

> I would be far more interested in a discussion about how we could identify the factors that lead to a community that doesn't appear to be following the aging and health norms of other communities rather than just saying "the article is stupid and isn't worth reading".

OK, I don't think we're using 'empirical' in the same way, then. 'Empirical', the way I've always seen it used, just means 'evidence-based' or, more verbosely, 'based on observed facts and not purely theory or philosophy'.

In particle physics, the fine structure constant is an empirical constant: We don't know how to derive it from any theory that doesn't include it already; if we want to have the correct value for the fine structure constant in a theory, we have to explicitly put in the value we know from experiment, that is, the value we derive empirically. Compare this to the value of the acceleration due to gravity between two objects of known mass: We can compute this value, derive it from a theory, called the theory of universal gravitation. We don't have to physically construct an apparatus and perform an experiment every time.

Frankly, it seems that you're tired of people being dismissive based on an imperfect knowledge of a set of specific formal and informal fallacies they came across once.

No, we aren't using the word empirically differently. I'm just saying that the interesting conversations revolve around where to look, not whether to look or not.

"Empirically" there is something interesting on that island. I'd love to hear ideas of what it could be, along with ways to test those ideas. The former without the latter is how snake-oil gets sold, but shutting down all conversation because snake-oil could be sold doesn't move us forward.

Really, it's intellectual pedantry at its finest.

It's amusing that people are so worried about being correct (and others being correct) on the Internet. Insisting on only talking about empirically measurable things is not a fail-safe way to raise the S/N ratio of a site; it just dulls the topics to those we already know well.

Meanwhile, it potentially rules out threads on things that we're still trying to discover the inner workings of: nutrition, aging, sleep, and many others. Those topics are extremely interesting because they can veer into uncharted intellectual territory. And we may only have anecdotes to go on. Quelle horreur!

We use empirical evidence to talk about things we don't fully understand all the time. That's what theoretical physics is, in fact.

> I'm just saying that the interesting conversations revolve around where to look, not whether to look or not.

And we find out where to look based on empirical evidence most of the time.

> shutting down all conversation because snake-oil could be sold

This has nothing to do with empirical evidence.

And we find out where to look based on empirical evidence most of the time

-- This is most often not true.

Conscious thought is terribly inefficient. Most 'looking' is instinctive, or intuitive. That's not to say it has not been educated or modified by empirical data at some stage. But this illusion of such hyper-rationality is worth avoiding.

What is interesting (sometimes) is to hear other people's intuition and prioritization as they evaluate what to look for. This is typically what separates out class in real-world performance. This can be considered "framing" done loosely: when, why and where people create a box (in which to think, in the manner you are suggesting above).

To the parent's point, it's often times boring to disregard an interesting framing (out of hand) because of a technical flaw. Similarly, there is endless boredom to be had reading articles with reasoned logic in flawed or boring frames (ie, those which exclude or impugn the interesting bits).

We see this a lot in the media now, because it's part of the PR spin game. The formula is to put bounds around the problem that suit your desired result. Journalists also often do this out of ignorance of a technical subject matter. We also see this as part of the fairness doctrine -- every story needs 'two sides', so an (often false) dichotomy is inserted, cookie-cutter and textbook style, into every 'analysis', etc.

In all honesty, if this is what bugs you the most about HN comments, then that mostly means that HN is doing remarkably well, given its population increase.

2 years ago, many HNers were predicting that HN's growth would inevitably turn it into yet another Reddit or 4chan. Instead, the consistent top comment is what you call the "middlebrow dismissal". Really, that's not so bad, compared to the top comment on $ANY_OTHER_SITE,_REALLY. Just scroll down to the next!

(that said, I found your complaint insightful: in fact, I realise now that I've been guilty of the odd middlebrow dismissal myself)

> in fact, I realise now that I've been guilty of the odd middlebrow dismissal myself

This isn't as bad as you think it is. This kind of dismissal is a bit like adolescence; you engage in it as a part of mental growth. Every strong thinker has gone through it at some point, and probably regresses to it with some frequency. You hopefully grow past it, but you do go through it.

> "If there is one thing I wish I could do to improve HN it would be to detect this sort of middlebrow dismissal algorithmically."

Do you have some data that we could use to play with? That sounds like a nice problem to solve.

You already have the data I'd use: the text of the comments.

If anyone wants to try to train a filter to detect this sort of comment, I'd be very interested to see the result.

I'll consider that a challenge...

To any that have experience getting comments data from HN -- what's the fastest, most polite way to do this? And am I correct in remembering that there's some aggressive rate-limiting for crawling the site?

Don't crawl the site, please. The place to get data is the HNSearch API.

There's a database (quite old) of HN posts and comments here:


Any plans to try to get a dataset for supervised ML? Perhaps collect the top 4 comments from all front page threads and post a survey on HN asking HNers to rate those comments for skepticism/dismissiveness?

I might bet that upvotes to middlebrow dismissal would be highly correlated with downvotes to the article itself, if we had downvotes to articles. In fact, I believe that these comments rise to the top because people are downvoting the article vicariously, by upvoting a rebuttal, however vacuous. In order to confirm this hypothesis, though, you'd have to collect data by implementing a downvote button on articles -- though it would not necessarily have to do anything in article-ranking terms ;)

(That might introduce a confounding factor, though--namely that by alleviating people's urge to downvote the article by giving them a [nonfunctional] button to do just that, people might stop upvoting the dismissive comments. Hmm....)

>You already have the data I'd use: the text of the comments.

Wouldn't that require real AI though? I thought for a minute that NLP (Natural Language Processing, not the other meaning(s) of the acronym) might help, but then thought that it may not work for cases where the comment is quoting another comment. Note: I'm not at all an expert in any of those fields, just interested.

Sounds like a job for Sentiment Analysis [1]. Modern systems are pretty good at discerning negative from positive comments.

You could probably find a way to mark negative and positive comments. Whether the resulting algorithm would be fine-grained enough to semi-reliably mark 'middlebrow dismissal,' I really don't know. Actually, as somebody who has worked on that stuff in the past, I don't think it would be very easy.

[1] http://en.wikipedia.org/wiki/Sentiment_analysis

Bing Liu is one of the researchers working on this. I've discussed Amazon review fraud detection with him. http://www.cs.uic.edu/~liub/

Thanks for the link.

Agreed, I don't think it would be an easy task, but I wonder how a "bag of words" approach would perform.

The harder part, as I see it, would be labeling the comments as middlebrow dismissal or not. It seems like we would be spending more time preparing the data than on the algorithm itself.
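
For anyone who wants something concrete to poke at, here is a minimal sketch of what a "bag of words" filter could look like: a tiny naive Bayes classifier in plain Python. Everything in it is an assumption made for illustration: the function names, the two labels, and above all the six hand-written training comments, which stand in for the human-labeled data nobody has collected yet. It says nothing about how well the approach would work on real HN comments.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (the 'bag of words')."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_comments):
    """Count words per label. labeled_comments is a list of (text, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training comments
    for text, label in labeled_comments:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability under naive Bayes."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hand-made toy labels, purely illustrative; real labels would come from human raters.
TRAINING = [
    ("this is just anecdotal evidence and correlation is not causation", "dismissal"),
    ("nonsense, the article is stupid and not worth reading", "dismissal"),
    ("anecdotal nonsense with no evidence", "dismissal"),
    ("what factors could explain this and how could we test them", "other"),
    ("interesting, here is a related study and some ideas to investigate", "other"),
    ("i wonder which underlying phenomenon might be behind this", "other"),
]

model = train(TRAINING)
print(classify("this is anecdotal nonsense", model))
print(classify("how could we test these ideas", model))
```

The code is the easy part, which is rather the point made above: the quality of the result would depend almost entirely on how many comments get labeled and how consistently.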

Welcome to statistics/data science/machine learning.

The hardest part is always getting the data into usable form. It's not as much fun as fitting models, but it's definitely the majority of any role where people pay you to do this kind of stuff.

There's a lot of good research on forums (PM me if you want a bibliography I collected for a previous role), and short texts have become a bigger deal post-Twitter. I completely agree with pg on the somewhat annoying nature of comments such as the GP.

Hey... Thanks for the offer! How can I contact you? I didn't know we can pm users over here.

I have worked with ML before, but mostly with images, which are (IMHO) way easier to put into a usable format, depending on the problem.

My email is obfuscated in my profile. Should be pretty clear, conditional on your humanity.

Hadn't heard of it; thanks for the link.

It might be easier to investigate submissions first and filter at this level if a trend is discovered. That is, a certain type of submission might attract a certain attitude of comment.

Could anyone with access to HN comments voting data execute something similar to the SQL query below and evaluate the results?

The idea is to use past voting correlation with other users to sort the comments.

  declare @userId int;
  select @userId = UserId
  from users
  where Username = 'pg';

  /* Let's calculate expert table first. We will use it to rate comments later. */
  select v2.UserId,
	sum(v1.Score * v2.Score) as VotingCorrelation
  into #expert
  from CommentVotingLog v1
  inner join CommentVotingLog v2
	on v2.CommentId = v1.CommentId
  where v1.UserId = @userId
  group by v2.UserId;

  /* Now we can rate comments against #expert table: */
  select c.CommentId,
	(select sum(v.Score * e.VotingCorrelation)
	from CommentVotingLog v
	inner join #expert e
		on e.UserId = v.UserId
	where v.CommentId = c.CommentId
	) as Rating
  from Comment c
  where c.ArticleId = 4692598 -- or another article that's discussed
  order by Rating desc;

(I assume the CommentVotingLog table has CommentId, the UserId of the voter, and the Score that voter gave to the comment: +1 for an upvote or -1 for a downvote.)

I also assume that the CommentVotingLog table has at least one record for every comment -- the author of that comment gives Score = +1 to that comment.

These queries don't have a "freshness" adjustment (older comments had a higher chance of getting upvoted, so their rating should be somewhat downgraded).
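
Since few people have access to the real voting data, here is a rough sketch of the same two steps in Python on a made-up vote log, just to show what the numbers look like. The vote tuples, the usernames other than pg, and the function names are all hypothetical, and the sketch assumes at most one vote per user per comment. Like the queries, it has no freshness adjustment.

```python
from collections import defaultdict

# Hypothetical vote log: (comment_id, voter, score) with score +1 or -1.
VOTES = [
    (1, "pg", 1), (1, "alice", 1), (1, "bob", -1),
    (2, "pg", -1), (2, "alice", -1), (2, "bob", 1),
    (3, "alice", 1), (3, "bob", -1),
    (4, "alice", -1), (4, "bob", 1),
]


def voting_correlation(votes, reference_user):
    """For each voter, sum score products over comments the reference user also voted on."""
    reference_votes = {c: s for c, u, s in votes if u == reference_user}
    correlation = defaultdict(int)
    for comment_id, voter, score in votes:
        if comment_id in reference_votes:
            correlation[voter] += reference_votes[comment_id] * score
    return dict(correlation)


def rate_comments(votes, correlation, comment_ids):
    """Rate each comment by its votes, weighted by each voter's correlation."""
    rating = {comment_id: 0 for comment_id in comment_ids}
    for comment_id, voter, score in votes:
        if comment_id in rating:
            # Voters with no overlap with the reference user contribute nothing.
            rating[comment_id] += score * correlation.get(voter, 0)
    return sorted(rating.items(), key=lambda item: item[1], reverse=True)


expert = voting_correlation(VOTES, "pg")
print(expert)                                # {'pg': 2, 'alice': 2, 'bob': -2}
print(rate_comments(VOTES, expert, [3, 4]))  # [(3, 4), (4, -4)]
```

In this toy log alice always votes with pg and bob always votes against, so comment 3 (upvoted by alice, downvoted by bob) comes out on top even though pg never voted on it.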

I'm always conflicted about interesting comments which link to something I read and learn from, but which are needlessly insulting. The grandparent linked to FightingAging.com and I read the article and clicked on some of the links and found it informative, though not convincing. But I cringe when he says, "Nonsense about antioxidants in the diet is exactly that: nonsense."

So do I upvote because I learned something from his comment, or downvote because his tone degrades the quality of discourse on Hacker News? Clearly pg says "downvote", and I often do, but I'm always on the fence about it.

Solution: Get rid of voting. Order the comments randomly.

I've never understood the point behind anonymous voting.^1 What value does it add?

1. But some people, think of them what you will, have tried to use voting as a way to be more convincing when pandering to advertisers, e.g., Facebook "Likes".

Assuming we were the intelligent, highbrow readers you would hope would be reading your forum, then wouldn't we be smart enough to see that voting adds nothing, except a source of amusement (as the silliest comments or those from random members with "high karma" rise to the top)? Intelligent people do not need a "karma system". They can take in all available information and separate the wheat from the chaff on their own. (No need for someone else, or someone else's algorithm, to manipulate the order of comments.)

How about a "middlebrow dismissal" link to complement the flag link?

Of course, it doesn't need to be called "middlebrow dismissal" but there is nothing better for determining if something is or isn't middlebrow dismissal than users.

TBH, the flag link could be more powerful/useful by asking people to explain why they are flagging something. Later on you can release the categorized flagged post dataset for others to train an algorithm against. If someone is willing to spend the time to click on flag, they are demonstrating that they care about post/comment quality and that is itself a good indicator that they'd be willing to spend the time to tell you exactly why they are flagging something.

"If there's one thing I wish I could do to improve HN, it would be to detect this sort of middlebrow dismissal algorithmically."

3 years ago, HN was great. Amazing in fact. What's changed in 3 years? Certainly not the system. The user base has changed, grown, degenerated into stereotypes and punch lines. There is an old saying that I believe succinctly explains what has happened: Garbage in, garbage out.

I keep seeing people writing about wanting to "improve HN." Every time I see this I think, are these people mad? It's dead, Jim. He's been dead. We can all sit here and prod his body and make recommendations for how best to make his arm into a grappling hook or some such nonsense, but at the end of the day, the patient is STILL dead.

If there is one thing I would do to improve HN, it would be to write the death certificate and move on to finding or creating the next HN.

Also, I thought the article was excellent, albeit long.
