
Show HN: I ran sentiment analysis on Show HN comments and got the meanest ones - walz
https://hn.walzr.com
======
Nasrudith
It is funny how polite mentions of crashes are regarded so negatively by it
compared to messages. Now I wonder what can be said to get an overwhelmingly
positive sentiment while being not just rude but downright horrifying
messages. Say "I hope you enjoy eating your family."

Reminds me of the one account of an essay question grading program's unique
flaw where it scored higher with every mention of orangutans with no regard to
relevance or possibly even grammar.

~~~
tahw
Do you have a source on the orangutan grader? That sounds like a hilarious
read.

~~~
Nasrudith
Sadly it was a while ago - the closest I could find was an MIT one mentioning
essay graders judging by length and connective words allowing for properly
structured nonsense to be judged as good writing.

[https://www.theguardian.com/education/mortarboard/2013/may/2...](https://www.theguardian.com/education/mortarboard/2013/may/24/automated-marking-bad-for-essays)

I guess orangutan is a sufficiently long word that replacing most nouns with
it would improve your score.

------
brudgers
_Looks like a cool project, but I can't scroll very far down the site before
my browser crashes. I've reproduced this several times, here's the terminal
output if it helps: $ conkeror
[https://alpha.trycarbide.com](https://alpha.trycarbide.com) ...... JavaScript
strict warning:
[https://alpha.trycarbide.com/](https://alpha.trycarbide.com/), line 603:
SyntaxError: test for equality (==) mistyped as assignment (=)? fault...._

To me, this is a good ShowHN comment...on the other hand, it might say
something about software error messages.

~~~
NathanKP
It's not a good comment if you consider that the browser crash they were
experiencing wasn't even the website's fault, it was the browser being buggy.
The JavaScript error that they copied from the console was an error from
inside the Conkeror browser itself (that browser is written mainly in JS). So
the OP was complaining to the Show HN creator about the failings of a buggy,
non-maintained (in the last 6 years) non-mainstream browser that they chose
to use to view the website.

Unfortunately I have seen that type of comment quite a bit on some of the Show
HN threads: someone complains that the site doesn't work without JavaScript,
or doesn't work with their bizarre non standard web browsing setup.

~~~
brudgers
Sorry for not being clear. From a sentiment standpoint, it's good, and that's
the relevant context for my comment. [0] From a technical standpoint, it's
gray. People land on both ends of the supported browser spectrum [1]...and all
over the middle. Likewise with graceful degradation. Anyway, while Conkeror
support is certainly near one end of the technical goodwill spectrum, I think
discovery via "ShowHN" is probably better than via an upset customer.

[1]: "Best viewed in IE6/Netscape" was standard at one point. Never mind the
recent trend of jackassware "Upgrade to a modern browser" popups for anyone
not using Chrome.

------
CodeWriter23
Looks like your algorithm is classifying some neutral-toned problem solving
type feedback as “mean”. Personally, that’s exactly why I would do a Show HN.
Examples

“Looks like a cool project, but I can't scroll very far down the site before
my browser crashes. I've reproduced this several times, here's the terminal
output if it helps: $ conkeror
[https://alpha.trycarbide.com](https://alpha.trycarbide.com) ...... JavaScript
strict warning:
[https://alpha.trycarbide.com/](https://alpha.trycarbide.com/), line 603:
SyntaxError: test for equality (==) mistyped as assignment (=)? fault....” -16
sentiment, chriswarbo 2 years ago in reply to "Show HN: Carbide – A New
Programming Environment"

“just filed an issue - but the error message is pretty obnoxious for a catch
all- bound to the $(window) error event is a catch all error that blames me
for not having enough data (56 public repos not enough?) This means that
anyone who knows this url and decides to look me up will see a message
accusing me of being a non-producer if anything goes wrong with the resume” -14
sentiment, beezee 6 years ago in reply to "Show HN: My Github résumé"

...those were within the first 10. If this similarly neutral-toned problem
solving type report makes me mean in your algorithm’s view, that is a label I
shall wear with pride.

------
jancsika
> The most negative ones are shown below.

The message that is the third most negative by this metric is the following:

> Looks like a cool project, but I can't scroll very far down the site before
> my browser crashes. I've reproduced this several times, here's the terminal
> output if it helps: $ conkeror
> [https://alpha.trycarbide.com](https://alpha.trycarbide.com) ......
> JavaScript strict warning:
> [https://alpha.trycarbide.com/](https://alpha.trycarbide.com/), line 603:
> SyntaxError: test for equality (==) mistyped as assignment (=)? fault....

That is clearly a false positive.

~~~
craftyguy
My favorite is #5 from the top, where a user DMCA'd themselves to get yahoo to
delete some website they created previously:

> I used to have a Geocities containing weird bad poetry I wrote when I was a
> teenager. I forgot about it, until years later I stumbled upon it again. I
> was embarrassed. I asked Yahoo to delete it. But I'd forgotten the password,
> and I'd used fake personal details (wrong date of birth) to create the
> account, and I couldn't remember what the fake info was, so they refused to
> delete it because I couldn't verify that I was who I said I was. What do I
> do? I hit on a solution. I decided to DMCA myself. I sent Yahoo a DMCA
> takedown request for my old Geocities, and straight away it disappeared.
> Mission accomplished.

Again, not negative at all, IMHO.

~~~
russh
Yeah, been there done that. I once created a fake neo-nazi like web site with
the same name of our Dark Age of Camelot guild. Mythic had a no guild name
change policy at the time and we wanted to change the name. I sent off a
trouble ticket to customer support with the link to the "newly discovered"
site and we had a new name in less than an hour.

------
kunimi
Quick question. How does your sentiment analysis treat the following two
sentences?

I fucking hate this thing.

VS

I fucking love this thing.

~~~
anoncoward111
It would have to somehow know that "fucking" in this case is being used as an
adverb similar to "really", e.g. "I really love dogs".

Another tough one would be "I don't fucking hate dogs", which actually means
you like them. The sentence needs to be parsed together, not word for word :)
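
A rough sketch of the difference between scoring word for word and parsing the sentence together. The lexicon, its weights, and the negator/intensifier lists below are illustrative assumptions, not the word list the site actually uses.

```python
# Toy lexicon; words and weights are assumptions made up for this sketch.
NAIVE_LEXICON = {"hate": -3, "love": 3, "fucking": -4}
CONTEXT_LEXICON = {"hate": -3, "love": 3}
NEGATORS = {"don't", "not", "never"}
INTENSIFIERS = {"fucking", "really"}


def tokens(text):
    """Lowercase, split on whitespace, and trim surrounding punctuation."""
    return [tok.strip(".,!?\"") for tok in text.lower().split()]


def word_by_word(text):
    """Score every token independently and add the scores up."""
    return sum(NAIVE_LEXICON.get(tok, 0) for tok in tokens(text))


def negation_aware(text):
    """Let negators flip, and intensifiers double, the next scored word."""
    score, negate, boost = 0, False, 1
    for tok in tokens(text):
        if tok in NEGATORS:
            negate = True
        elif tok in INTENSIFIERS:
            boost = 2
        elif tok in CONTEXT_LEXICON:
            value = CONTEXT_LEXICON[tok] * boost
            score += -value if negate else value
            negate, boost = False, 1  # modifiers apply to one word only
    return score


for sentence in ("I fucking hate this thing.",
                 "I fucking love this thing.",
                 "I don't fucking hate dogs"):
    print(sentence, word_by_word(sentence), negation_aware(sentence))
```

With these made-up weights the word-by-word scorer rates "I fucking love this thing." as negative, while the context-aware one rates it positive. Simply flipping the sign on negation is still crude, since it says nothing about tone.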

~~~
yeleti
"I don't fucking hate dogs"

A double negative is a positive.

~~~
rectangletangle
This does kind of start to adopt a counter-accusative tone, which could be
interpreted as negative. Though it would entirely depend on context. The
expletive just makes it sound angry overall.

~~~
blattimwind
"I don't fucking hate dogs!!" sounds like something yelled with a raised fist-
finger in a dispute about dog shit on the lawn between neighbours in some
lowly apartment complex.

------
osrec
When I do my next Show HN, and the negativity gets too much to take, this will
be a good resource to turn to, in order to feel a little better about myself
(unless, of course, I end up at the top of your list)!

~~~
stcredzero
The site is down now. I was going to see if I was on yet another HN
leaderboard!

------
noobermin
Even despite the false positives as pointed out in other comments, I'm
delightfully surprised that the worst it gets is around <15% negative
comments. Sometimes, HN seems too cynical to me, at least the comments that
float up to the top do. What would be interesting is negative comments
weighted by the place in the comment section (since you can't see upvoted
scores).

~~~
jbob2000
The site is blocked for me at work, but if he didn't include shadow banned
comments, then he's missing the biggest pool of potentially negative comments.

------
anonytrary
> how are you going to avoid head hunters' spam, either as fake candidates to
> discover new clients or with fake offers for CV mining?

Was given a sentiment of -9, but I'd say the sentiment is closer to 0. Anyway,
it's clear there are a ton of false positives, but overall, this was a really
neat idea and it would definitely be interesting to further index the posts.

------
misterbowfinger
I'm confused how this comment got rated "-10":

I use Backblaze now and once I get my NAS, I’ll probably end up using a B2
based backup. But let’s make an honest comparison. Backblaze does not
replicate your data across data centers. The standard S3 storage class does
(0.23/gb). The comparable storage class for S3 is one zone infrequent access
(.01/gb). B2 still comes out ahead, but I wouldn’t use either one for primary
storage. For their suggested “3-2-1” backup strategy, sure. Then again, just
for backup, I could use S3 glacier for $.004/gb. That’s cheaper than B2 and I
get multiple AZ storage. The data charges would be higher - but it’s backup. If
catastrophe struck and I lost my primary and my local backups, getting my data
fast is the last thing I would worry about.

[https://news.ycombinator.com/item?id=17407275](https://news.ycombinator.com/item?id=17407275)

~~~
soared
> does not > I wouldn't use > then again > catastrophe > struck > worry

I can see it. If you bag-of-words'd it there are a lot of negative words used
and effectively no positive words.
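
A minimal bag-of-words sketch of that effect, run on the last sentence of the quoted comment. The word weights are hypothetical and chosen only for illustration, so the total is not meant to reproduce the site's score.

```python
import re
from collections import Counter

# Hypothetical word weights; not taken from any real sentiment lexicon.
NEGATIVE = {"not": -1, "catastrophe": -3, "struck": -1, "lost": -3, "worry": -2}
POSITIVE = {"honest": 2, "ahead": 1}


def bag_of_words_score(text):
    """Count words, ignore order and context, and sum the matching weights."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    weights = {**NEGATIVE, **POSITIVE}
    hits = {word: counts[word] * weights[word]
            for word in weights if counts[word]}
    return sum(hits.values()), hits


sentence = ("If catastrophe struck and I lost my primary and my local backups, "
            "getting my data fast is the last thing I would worry about.")
total, hits = bag_of_words_score(sentence)
print(total, hits)
```

The sentence is a calm cost/benefit argument, but every word that matches the list is a negative one, so the sum can only go down.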

------
CM30
Huh, so apparently the day with the highest percentage of mean comments is
Sunday.

Anyone want to have a guess as to why that may be the case? Personally, I'd
expect people to be least happy on Monday morning or something, not the second
day they usually get a rest that week.

Similarly confused as to why everyone is supposedly so positive on
Wednesday...

~~~
abjorn
I'm wondering how this compares to the frequency of _any_ comments by day. It
could be that more comments are posted on Sunday in general, and the least on
Wednesday.

~~~
walz
The negativeness is a percent (number of negative comments divided by number
of total comments). I don't think the data is too significant though, because
they are all + or - 5 percent of each other
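
For reference, the per-day percentage described here could be computed roughly as follows. The input shape (Unix timestamp plus a numeric score) and the rule that any score below zero counts as negative are assumptions for this sketch.

```python
from collections import defaultdict
from datetime import datetime, timezone

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def negative_share_by_weekday(comments):
    """comments: iterable of (unix_timestamp, sentiment_score) pairs.

    Returns {weekday name: percent of that day's comments scored below zero}.
    """
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for timestamp, score in comments:
        # weekday() returns 0 for Monday through 6 for Sunday.
        day = DAYS[datetime.fromtimestamp(timestamp, tz=timezone.utc).weekday()]
        totals[day] += 1
        if score < 0:  # assumed threshold for "negative"
            negatives[day] += 1
    return {day: 100.0 * negatives[day] / totals[day] for day in totals}
```

Because each day's count of negative comments is divided by that day's own total, a day with more comments overall does not automatically look more negative.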

------
scarface74
I’m no expert in the field - I’ve only watched a few videos - but the example
I’ve seen is where they use a movie’s rating by a person (1-5) and their
comments to train a model and then use the model to determine sentiment.
Unfortunately, since AFAIK there isn’t a way to determine how many
points a post earned except for your own, he couldn’t do that.
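
A minimal sketch of that rating-supervised approach, using a small naive Bayes classifier as a stand-in for whatever model those videos used. The reviews are invented examples, and the mapping of 4-5 stars to positive and 1-2 stars to negative is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(reviews):
    """reviews: list of (text, stars) with stars in 1..5.

    Assumed labelling: 4-5 stars is positive, 1-2 negative, 3 is skipped.
    """
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, stars in reviews:
        if stars == 3:
            continue
        label = "pos" if stars >= 4 else "neg"
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, doc_counts


def classify(model, text):
    """Naive Bayes with add-one smoothing; needs both classes in training."""
    word_counts, doc_counts = model
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    total_docs = sum(doc_counts.values())
    best_label, best_logp = None, -math.inf
    for label in ("pos", "neg"):
        total_words = sum(word_counts[label].values())
        logp = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                logp += math.log((word_counts[label][tok] + 1)
                                 / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Invented training data, for illustration only.
reviews = [
    ("loved it, wonderful acting", 5),
    ("great fun, loved the ending", 4),
    ("boring and awful plot", 1),
    ("awful, hated the pacing", 2),
    ("it was fine", 3),
]
model = train(reviews)
print(classify(model, "wonderful and great"))
print(classify(model, "awful boring plot"))
```

The point of the star rating is that it supplies the labels for free; without a per-comment score there is nothing comparable to train on.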

------
blattimwind
For some reason there is a .nobreak class that's actually enabling word
breaks. Weird! And it even goes one step further and enables "word-break:
break-all", so that the renderer will break all your nice words apart
anywhere. That's not nice.

(Yes, this was a half-arsed attempt to match the "negative sentiment tone"
without actually being mean ;)

~~~
walz
Haha, thanks for sounding nice :)

I added that because some comments had really long URLs, so I had to enable
breaks so the page wouldn't be really wide, more so on phones. Didn't realize
that it added hyphens to words, thanks for pointing that out, it's fixed now

------
shove
I consider it a personal failure of character not to have at least gotten an
honorable mention ;)

------
crsv
This is pretty cool - I'd love to see similar analysis for truly "contentious"
comments, wherein there were an almost equal but large amount of upvoting and
downvoting, controlled for accounts that have the ability to do either.

------
bumholio
I hope you realize there is a built in sentiment analyzer on HN, based on
highly advanced, natural intelligence algorithms.

Your filter actually seems to dig strongly opinionated posts. They are not
automatically bad, and they can be quite good.

------
sam0x17
I think it's really funny how most of the comments in this list are genuine
critiques and concerns, that get downvoted by toxic users. I went through and
upvoted about half of them.

------
tmaly
I would love to see the top nasty comments for the day similar to
[http://hckrnews.com/](http://hckrnews.com/) and how it ranks

------
libeclipse
The irony here is amazing:
[https://news.ycombinator.com/item?id=11082337](https://news.ycombinator.com/item?id=11082337)

------
estsauver
A comment on my own show HN got flagged, which is kind of funny.

------
michaelmior
It would also be great to see the most positive comments!

------
glitcher
Great job, love the "Y tho?" logo :)

~~~
walz
Haha thanks :)

------
dangle
I'm very happy to report that most sentiment analysis is awesomely,
incredibly, even _beautifully_ inane.

------
lylecubed
I'd love to see how HN's 14% negative comments compare to youtube or reddit.

------
lukev
This kind of automated or rule-based analysis is not sufficiently smart to use
as any kind of moderation tool. And it won't be until it can interpret
semantic content as well as recognize patterns.

Consider the following exchange:

 _Person 1: Hello! I just wanted to chime in and make you aware of the fact
that according to some very cool research <link>, people with sub-equatorial
ancestry exhibit markedly lower test scores, in fact very similar to many of
the great apes! Your mileage may vary, but in my humble opinion I would never
hire someone or work with someone from that demographic._

 _Person 2: Shut up and go away, you racist prick._

This pattern plays out on twitter fairly regularly, and it's usually Person 2
who gets moderated, despite the fact that the content of their message is
actually more appropriate and a net positive for the community (given the
context.) Meanwhile, as long as it's polite, actual hate speech can make it
through most of these filters.

I've heard this referred to as the "polite Nazi" problem, and it's quite real.

~~~
repolfx
It sounds like what you want is not sentiment analysis but rather some sort of
political-correctness analysis, which rather than identifying aggressive
language identifies "wrongthink".

I suspect you're the sort of person who would have torn down the signs
4channers put up last year saying "It's OK to be white", because despite the
obviously anodyne and correct content of the message, you would have
interpreted it as "polite Nazi".

~~~
happytoexplain
I thought his point was interesting and practical, and that he used a non-
controversial example. The problem he's describing is probably impossible to
solve because real examples aren't so obvious, but it seems like you jumped to
the conclusion that he would be willing to accept the downside of such a
system in practice (punishing people who are in fact acting in good faith, and
not just hiding hatred behind apparently rational wording), and you seem to be
personally attacking him. However, I can only say that by assuming from the
way you wrote it that you believe the 4channers you referenced were by-and-
large acting in good faith. You should be more clear if that's what you're
saying, since that would be a surprising conclusion to me. From everything I
saw and heard, that movement seemed to be dominated by the usual hatred, and
"It's OK to be white", while ostensibly an addressal of very real problems,
was really just a slogan that's unassailable outside of any other context and
therefore easy to hide behind.

