
For me, malice relates to intent. Intent isn't observable. When person X makes a claim about Y's intent, they're almost always filling in invisible gaps using their imagination. You can't moderate on that basis. We have to go by effects, not intent (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...).

It took me a long time to learn that if I tell a user "you were being $FOO" where $FOO relates to their intent, they can simply say "no I wasn't" and no one can prove otherwise, making the moderation position a weak one. Mostly they will deny it sincerely because they never had such an intent, at least not consciously. If you do that as a mod, you've just given them a reason to feel entirely in the right, and if you proceed to moderate them anyway, they will feel treated unjustly. This is a way to generate bad blood, make enemies, and lose the high ground.

The reverse strategy is better: describe the effects of someone's posts and explain why they are bad. When inevitably they respond with "but my intent was $BAR", the answer is "I believe you [what else can you say about something only that person could know?], but nonetheless the effects were $BAZ and we have to moderate based on effects. Intent doesn't communicate itself—the burden is on the commenter to disambiguate it." (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...)

Often when people get moderated in this way, they respond by writing the comment they originally had in their head, as a defense of what they actually meant. It's often astonishing what a gap there is between the two. Then you can respond that if they had posted that in the first place, it would have been fine, and that while they know what they have in their head while posting, the rest of us have no access to that—it needs to be spelled out explicitly.

Being able to tell someone "if you had posted that in the first place, it would have been fine" is a strong moderation position, because it takes off the table the idea "you're only moderating me because you dislike my opinions", which is otherwise ubiquitous.



This exchange ought to be a post in its own right. It seems to me that malice, hate, Warp contamination, whatever you want to call it, is very much a large part of the modern problem; and also it's a true and deep statement that you should moderate based on effects and not tell anyone what their inner intentions were, because you aren't sure of those and most users won't know them either.


Kate Manne, in Down Girl, writes about the problems with using intent as the basis for measuring misogyny. Intent is almost always internal; if we focus on something internal, we can rarely positively identify it. The only real way to identify it is capturing external expressions of intent: manifestos, public statements, postings, and sometimes things that were said to others.

If you instead focus on external effects, you can start to enforce policies. It doesn't matter about a person's intent if their words and actions disproportionately impact women. The same goes for many -isms and prejudice-based issues.

A moderator who understands this will almost certainly be more effective than one who gets mired in back-and-forths about intent.

https://bookshop.org/p/books/down-girl-the-logic-of-misogyny...


Hi, dang! I wonder if it makes sense to add a summarized version of your critical point on "effects, not intent" to the HN guidelines. Though I fear there might be undesirable effects of spelling it out that way.

Thanks (an understatement!) for this enlightening explanation.


There are so many heuristics like that, and I fear making the guidelines so long that no one will read them.

I want to compound a bunch of those explanations into a sort of concordance or whatever the right bibliographic word is for explaining and adding to what's written elsewhere (so, not concordance!)


Fair enough. Yeah, your plans of "compounding" and a "bibliographic concordance" (thanks for the new word) sound good.

I was going to suggest this (but scratch it, your above idea is better): a small section called "a note on moderation" (or whatever) with hyperlinks to "some examples that give a concrete sense of how moderation happens here". There are many excellent explanations buried deep in the search links that you post here. Many of them are valuable riffing on [internet] human nature.

As a quick example, I love your lively analogy[1] of a "boxer showing up at a dance/concert/lecture" for resisting flammable language here. It's funny and a cutting example that is impossible to misunderstand. It (and your other comment[2] from the same thread) makes so many valuable reminders (it's easy to forget!). An incomplete list for others reading:

- how to avoid the "scorched earth" fate here;

- how "raw self-interest is fine" (if it gets you to curiosity);

- why you can't "flamebait others into curiosity";

- why the "medium" [of the "optionally anonymous internet forum"] matters;

- why it's not practical to replicate the psychology of "small, cohesive groups" here;

- how the "burden is on the commenter";

- "expected value of a comment" on HN; and much more

It's a real shame that these useful heuristics are buried so deep in the comment history. Sure, you do link to them via searches whenever you can; that's how I discovered 'em. But they're hard to stumble upon otherwise. Making a sampling of these easily accessible could be valuable.

[1] 3rd paragraph here: https://news.ycombinator.com/item?id=27166919

[2] https://news.ycombinator.com/item?id=27162386


Thanks for reading and absorbing those old comments—I always leave them in the hope that someday somebody will :)


Someday I hope you will write a whole book on the subject!


Commentary or gloss on the text, I believe, is sometimes used.


You're correct, but there are other more precise words that come from hermeneutics and whatnot. Grad school was a long time ago.


But the difference between the original post and the revised post often is malice (or so I suspect). The ideas are the same, though they may be developed a bit more in the second post. The difference is the anger/hostility/bitterness coloring the first post, that got filtered out to make the second post.

I think that maybe the observable "bad effects" and the unobservable "malice" may be almost exactly the same thing.


I agree but I don't think malice is the best word for that.


OK, I think I can see that. "Malice" is too strong; it implies too much intent and/or too much long-term. Hostility? Bile? Venom?


I would attribute to malice things like active attempts to destroy the very forum - spamming is a form of "malice of the commons".

You will know when you encounter malice because nothing will de-malice the poster.

But if it is not malice, you can even take what they said and rewrite it for them in a way that would pass muster. In debate this is called steelmanning, and it's a very powerful persuasion method.


Spamming is an attempt to promote something. Destroying the forum is a side effect.

It's fair to describe indifference to the negative effects of one's behavior as malicious, and it is indeed almost never possible to transform a spammer into a productive member of a community.


Yeah, most people take promotional spamming as the main kind, but you can also refer to some forms of shitposting as spamming (join any Twitch chat and watch whatever the current spam emoji is flood by) - though the second is almost more a form of cheering, perhaps.

If you wanted to divide it further I guess you could discuss "in-group spamming" and "out-group spamming" where almost all of the promotional stuff falls in the second but there are still some in the first group.


I guess I'd describe repeatedly posting the same emoji to a chat as flooding rather than spamming. Even then, your mention of cheering further divides it into two categories of behavior:

1. Cheering. That's as good a description as any. This is intended to express excitement or approval and rally the in-group. It temporarily makes the chat useless for anything else, but that isn't its purpose.

2. Flooding. This is an intentional denial of service attack intended to make the chat useless for as long as possible, or until some demand made by the attacker is met.


Yeah - one thing I've noticed with some forums is that the addition of "like/dislike" buttons (some have even more reactions available) greatly INCREASES the signal-to-noise ratio, because the "me too" posts and the "fuck off" posts are reduced; you can just hit the button instead.

Some streaming platforms have a button you can hit that makes party emoji or heart emoji or whatever appear in a stream from the lower right - a similar mechanism that gives cheering an outlet, so you can then combat flooding.


I've observed the same with Matrix and Discord. It reduces noise to the point that while "fuck off" would call for moderation in a lot of contexts, reacting with the middle finger emoji usually doesn't even though it has the same meaning.


On the point about people rewriting what they meant after being challenged, and it coming out different: I'd assert that when malice (or another intent) is present, they double down or use other tactics toward a specific end other than improving the forum or relationship they're contributing to. When there is none, you get that different or broader answer, which is often well worth it. However, yes, it is intent, as you identify.

I have heard the view that intent is not observable, and I agree with the link examples that the effect of a comment is the best available heuristic. It is also consistent with a lot of other necessary and altruistic principles to say it's not knowable. As for detecting malice from data, though: the security business is predicated on inferring intent from network data, so while it's imperfect, there is precedent, at least for (more-)structured data.

I might refine it to say that intent is not passively observable in a reliable way, since if you interrogate the source, you get revealed intent. On intent taking place in the imagination of the observer, that's a deep question.

I think I have reasonably been called out on some of my views being the artifacts of the logic of underlying ideas that may not have been apparent to me. I've also challenged authors with the same criticism, where I think there are ideas that are sincere, and ones that are artifacts of exogenous intent and the logic of other ideas, and that there is a way of telling the difference by interrogating the idea (via the person.)

I even agree with the principle of not assuming malice, but professionally, my job has been to assess it from indirect structured data (a hawkish "is this an attack?") - whereas I interpret the moderator role as assessing intent by its effects, but from unstructured data (is this comment/person causing harm?).

Malice is the example I used because I think it has persisted in roughly its same meaning since the earliest writing, and if that subset of effectively 'evil' intent only existed in the imaginations of its observers, there's a continuity of imagination and false consciousness about their relationship to the world that would be pretty radical. I think it's right to not assume malice, but fatal to deny it.

Perhaps there is a more concrete path to take than my conflating it with the problem of evil, even if on these discussions of global platform rules, it seems like a useful source of prior art?



