Here's how science works: If you don't like the conclusions, you can question the methodology. You can conduct the experiment yourself to see if you can reproduce the results. You can run your own studies and see if you come up with different conclusions. But you don't get to say "Hmm, the data don't conform to my preconceived notions about how the world should work, so I reject their implications."
Not everyone agrees with the idea that every thought from every person should be heard at all times. Some people are just jerks. "Trolling", by definition, is about making provocative statements to incite reactions, as opposed to contributing to conversations. There are places where that kind of thing is accepted (4chan) and places where it is not (HN). Tools that help communities encourage their preferred forms of communication are good things, and will be increasingly important in the future.
The assumption that "trolls" are trying to "incite reactions, as opposed to contributing to conversations" is often repeated, but in practice, "troll" is a term often used against people who hold a minority viewpoint but are arguing it legitimately.
Those with the majority viewpoint often find it convenient to dismiss (and attempt to discredit) those they disagree with by simply calling them trolls. And it's not uncommon for forums to be operated in such a way as to actually ban people with minority viewpoints.
Of course, but the "troll" wants to have that conversation, and the rest of the community does not. That doesn't mean the conversation or opinions are unworthy or wrong, just that they are unwelcome in this context.
Good or bad doesn't matter: the "troll" is a disruptive influence that turns a comfortable place into an uncomfortable one.
Most people don't come to the Internet to have their beliefs challenged, no matter how wrong those beliefs might be.
These measures they came up with are heuristic proxy measures at the very best, and noise at the worst.
The troll-hunting algorithm has to face the false positive problem, which the paper does not address.
My very legitimate content has been censored in various places (notably Facebook) because it tripped 'anti-trolling and scam algorithms', but the things I was trying to post were the Snowden and Manning leaks and the TPP leaks.
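To put rough numbers on the false positive problem, here is a minimal sketch of the base-rate arithmetic. Every figure in it (the share of troll posts, the filter's hit rate, its false positive rate) is a made-up assumption for illustration, not something taken from the paper or from any real filter.

```python
def flagged_breakdown(n_posts, troll_rate, true_positive_rate, false_positive_rate):
    """Return (troll posts caught, legitimate posts flagged, precision) for a filter."""
    trolls = n_posts * troll_rate
    legit = n_posts - trolls
    caught = trolls * true_positive_rate
    wrongly_flagged = legit * false_positive_rate
    # Precision: of everything the filter flags, how much is actually trolling.
    precision = caught / (caught + wrongly_flagged)
    return caught, wrongly_flagged, precision


# Hypothetical numbers, chosen only for illustration.
caught, wrongly_flagged, precision = flagged_breakdown(
    n_posts=100_000,
    troll_rate=0.02,
    true_positive_rate=0.90,
    false_positive_rate=0.05,
)
print(f"troll posts caught: {caught:.0f}")
print(f"legitimate posts flagged: {wrongly_flagged:.0f}")
print(f"share of flags that are correct: {precision:.1%}")
```

With these assumed rates the filter catches about 1,800 troll posts but also flags about 4,900 legitimate ones, so only around 27% of what it removes is actually trolling. When trolls are rare, even a small false positive rate ends up dominating the flags.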
Disproportionately faster engagement over a given short time period and low variance of word choices along with repeated use of n-grams >= 4 words are all much more indicative than profanity.
Trolls are argumentative and tend to resort to trite sloganish language. They are no more rude than the average commentator (who is fairly impersonal and insincere).
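A rough sketch of how those three signals could be computed for one user, assuming you have their post texts and post timestamps. The function names, the tokenizer, and the one-hour window are all my own placeholders, and nothing here sets a threshold for what counts as "too much".

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_variety(posts):
    """Distinct words divided by total words; lower means more repetitive wording."""
    words = [w for p in posts for w in tokenize(p)]
    return len(set(words)) / len(words) if words else 0.0


def repeated_ngrams(posts, n=4):
    """Map each n-gram that shows up in more than one post to its post count."""
    counts = Counter()
    for p in posts:
        words = tokenize(p)
        grams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
        counts.update(grams)  # a set, so each n-gram counts once per post
    return {g: c for g, c in counts.items() if c > 1}


def max_posts_in_window(timestamps, window_seconds=3600):
    """Largest number of posts that fall inside any window of the given length."""
    ts = sorted(timestamps)
    best = start = 0
    for end in range(len(ts)):
        while ts[end] - ts[start] >= window_seconds:
            start += 1
        best = max(best, end - start + 1)
    return best


# Invented sample data for one hypothetical user.
posts = [
    "Wake up people, this is what they want you to think.",
    "You all fell for it. This is what they want you to think!",
    "Nice weather today.",
]
timestamps = [0, 10, 20, 4000, 4010]  # seconds

print(word_variety(posts))
print(sorted(repeated_ngrams(posts)))
print(max_posts_in_window(timestamps))
```

Each number only means something relative to the rest of the community, so in practice you would compare a user's values against the forum-wide distribution rather than against a fixed cutoff.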
It's about a general discordance with a generic community and what that looks like. There are two kinds: the salvageable and the hopeless. The hopeless is unwavering and defensive - irritable and divisive.
You can see where and when users post - then there are extreme outliers - those are usually bot spammers; the next group in, the first humans, they are the trolls.
These are not hard things to compute - and profanity has nothing to do with it.
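As one possible illustration of the outlier tiers, here is a sketch that ranks users by post count using a robust z-score (median and median absolute deviation). Using raw post counts as the activity measure, the two cutoffs, and the tier names are all assumptions I picked for the example; real data would need tuning.

```python
from statistics import median


def activity_tiers(post_counts, extreme_cutoff=10.0, heavy_cutoff=3.5):
    """Label each user 'extreme', 'heavy', or 'typical' by robust z-score of post count."""
    counts = list(post_counts.values())
    med = median(counts)
    mad = median([abs(c - med) for c in counts])
    tiers = {}
    for user, c in post_counts.items():
        # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
        # If the MAD is zero (most users post identically), nobody is flagged.
        score = 0.6745 * (c - med) / mad if mad else 0.0
        if score >= extreme_cutoff:
            tiers[user] = "extreme"   # candidate bot spammers
        elif score >= heavy_cutoff:
            tiers[user] = "heavy"     # the next group in
        else:
            tiers[user] = "typical"
    return tiers


# Invented post counts for seven hypothetical users.
post_counts = {"a": 3, "b": 5, "c": 4, "d": 6, "e": 5, "f": 12, "g": 900}
tiers = activity_tiers(post_counts)
print(tiers)
```

The median-based score is used instead of mean and standard deviation because a single account with hundreds of posts would otherwise inflate the spread and hide the second tier.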
Alas, yet another thing I should have written up in LaTeX and sent off to a fancy journal...
Because once you implement the classifier you affect the world. And you have to take into account what those effects are. Two properties I think are desirable: 1. It should encourage good behaviour. By this I mean that if you adapt to get a better evaluation, that means you also become a better member of the forum. This relates to not being gameable. 2. It should give everyone a chance. For example, I could see how being poor could correlate with low quality posts. But shutting out all poor people means you can lose valuable perspectives, so it's not an ideal solution. As long as your posts are good you should be welcome, even if you're a pleb.
In light of these, consider what filtering for poor spelling actually does. What do we know that correlates with poor spelling? a) being a foreigner. b) being poor / uneducated. c) being underage. Those are the people you filter out. This goes against the second principle I mentioned of giving everyone a chance. It's a tradeoff, of course, and I could see how it's worth it sometimes.
I believe the core of paulhauggis' point was closer to the idea that trolling and behaviour that leads to being banned are not the same thing. Scope in studies is as important as anything else.