Around a year ago our [blogging platform][0] got hit badly by people/groups submitting fake customer support descriptions for big companies such as Microsoft, Facebook, Comcast, etc.
We rolled out a machine learning model and trained it on the database. 99% of them vanished.
The next day, the model stopped working and the success rate was around 5%.
We found out they had learned the trick and were now using symbols from different languages to make their text look like English.
Trained again, and the success rate went back up.
An hour later, the success rate had fallen again.
This time, they mixed their content with valid content from our own blogging platform. They would take content from our own blog or other people's posts and mix it in to fool the model.
Trained it again, and it worked.
Once in a while such content appears and the model fails to catch it.
It only takes a couple of minutes to mark the bad posts, retrain and redeploy the model, and then boom, the bad content is gone.
The text extraction, separating good content from bad, telling lookalike symbols from the sane alphabet, and many other things were challenging at first, but overall we were pretty excited to make it happen.
Throughout this we didn't use any third-party platform to do the job; the whole thing was built by ourselves with a little bit of TensorFlow, Keras, scikit-learn and some other spices.
Worth noting: it was all text, no images or videos. Once we get hit with that, we'll deal with it.
edit: Here's the training code that did the initial work https://gist.github.com/Alir3z4/6b26353928633f7db59f40f71c8f... it's pretty basic stuff. Later it was changed to cover more edge cases and it got even simpler and easier. Contrary to belief, the better it got, the simpler it became :shrug
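For illustration, here is a minimal sketch of the kind of "pretty basic" text classifier described above, using scikit-learn (one of the libraries mentioned). It is not the actual code from the gist; the data, labels, and model choice are assumptions. Character n-grams help a little with look-alike obfuscation, although the Unicode confusables discussion further down covers that more directly.

```python
# Hedged sketch of a basic spam/ham text classifier, not the author's actual code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical training data: (post_text, label); the real dataset would come
# from the platform's own database dump.
posts = [
    ("Call Microsoft support now at +1-800-000-0000", "spam"),
    ("Notes from my weekend hiking trip", "ham"),
]
texts, labels = zip(*posts)

model = Pipeline([
    # Character n-grams are a bit more robust to "l00k-al1ke" tricks than word tokens.
    ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(texts, labels)

print(model.predict(["URGENT: contact Facebook customer support 1-800-..."]))
```

Retraining on freshly labeled posts and redeploying, as described above, would just mean re-running `fit` on the updated dataset.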
For adversarial problems like this, a shadowban approach can sometimes be necessary. Perhaps people can still see their blogs but GoogleBot gets blocked from indexing them, or they only appear to someone with the spammer's cookies. That way it takes them longer to catch on and evade the model.
Of course, that means you'll need to at least spot check your bans because you can't rely on legit users escalating to you.
The thing is, the owners of those machines weren't the ones posting the content. It appeared their computers had been infected by some kind of malicious file and made part of a bigger network (a botnet?).
From the thousands of different IPs in different countries around the globe, I could see these were likely compromised personal computers.
Very few of them were machines at hosting companies; the rest were ordinary people's computers.
I'm sure these machines were doing the posting while someone else tested the results.
When we did the shadow banning, it didn't make a dent in their efforts.
The way they changed emails, changed usernames and tried to look unique seemed prepared specifically for our platform (I would guess so).
Whenever we countered their attack, they would go silent for a while and then attack again. They would adjust.
Shadow banning is effective when the attackers themselves aren't aware of it; in our case it was tricky to know who the observer was.
I wasn't familiar with the term, so I just searched the phrase "residential proxies", and the space seems even more sketchy than the usual VPN peddlers who promise the world and more. Are they using malware-infected PCs, or what?
Some of them source endpoints through their own apps, which means a lot of people unwittingly become part of the spamming just by joining the network. Of course it's mentioned in the middle of their T&Cs, but who reads those...
I’ve always been curious how the residential proxies work. Is someone going around paying people to run proxies in their home? Are these compromised devices being exploited? Are there a bunch of storage units someplace with cable service? The mind runs wild.
Did they map to ISP ASNs? Country geolocation doesn't say much anymore since there's so many VPN providers whose business is to buy a CIDR in every country and resell access.
Very few of them were from AWS, OVH and other hosting providers, very very few.
We ran each IP through IP blacklists and paid IP reputation checkers. The majority of the IPs were clean.
Back then we had an IP reputation check, but it was a headache to maintain, so we disabled it later; even at the time, very few of them got stopped by the IP reputation checks.
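For context, a typical blacklist lookup of the kind mentioned is a DNS-based blocklist (DNSBL) query: reverse the IP's octets and resolve them under the list's zone. A minimal sketch (using Spamhaus ZEN as the example list; note that Spamhaus may refuse queries coming through large public resolvers, and, as noted above, residential botnet IPs mostly come back clean anyway):

```python
# Hedged sketch of a DNSBL lookup; "zen.spamhaus.org" is one well-known list.
import socket

def is_listed(ip: str, dnsbl: str = "zen.spamhaus.org") -> bool:
    reversed_ip = ".".join(reversed(ip.split(".")))   # 203.0.113.7 -> 7.113.0.203
    query = f"{reversed_ip}.{dnsbl}"
    try:
        socket.gethostbyname(query)   # any 127.0.0.x answer means the IP is listed
        return True
    except socket.gaierror:
        return False                  # NXDOMAIN: not listed

print(is_listed("127.0.0.2"))  # standard test entry that most DNSBLs report as listed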
As someone that's used IP proxying services that provide millions of IPs for scraping purposes: that is a very mature industry. They advertise (and I believe them) "millions" of IPs, even for what you might consider hard-to-supply ones, like mobile IPs, and they let you slice and dice them however you want. Datacenter IPs? Residential IPs? Mobile IPs?[1] What state or city would you like them in? Would you like the site you're hitting to not have been accessed by this IP (through proxying at least), and if so for how many days? Do you want some mix of that? Make your own configurations and set them up as proxy endpoints, etc.
Fighting against abuse at the level of IP address attributes seems like a losing game to me. Honestly, the best I saw at this (3-5 years ago at least) for traffic was Distil Networks, which puts a proxy device in front, examines your traffic, and captchas or blocks based on that.
Since you have content being submitted, there's a lot more you can use to classify, such as how you used ML, so that's good. Part of me worries that this is all sort of reminiscent of infections and antibiotics, though. The continual back-and-forth of you finding a block, them finding a workaround, feels kind of like you were training the spammers (even if you were training yourself at the same time). At some point maybe we'll find that most of the forum spam is ML-generated low-information content that also happens to be astroturfing that is hard to distinguish from real people's opinions.
1: Fun fact, to my knowledge anonymous mobile IPs are provided by a bunch of apps opting into an SDK (like an advertising/metrics SDK) which while their app is open (at least I hope that's a requirement) registers itself to the proxying service so it can be handed out for use by paying proxy customers. Think about that next time you play your free "ad-supported" mobile game.
I remember an old mailing list discussion on sourcehut, because sourcehut provides a build service that you can use for automation.
The decision from sircmpwn, in the end, was to charge money for the service. Charging money and know-your-customer requirements will kill most exploits dead.
In this sense, this is turning the frustration level to 11. You can use the service to a certain extent, without frustration, but if you want to get serious then you're going to have to jump through some hoops.
Dedicated people will still find a way through, but you've cut off 95% of the flow and killed the low-effort attempts. Now, you can focus on the serious shit.
It sounds like a micropayment system where the buy-in amount to join is much more significant than the actual (tiny) cost of a micropayment to use a service would make it far less worthwhile to spam online services.
The risk of having their payment identifier/address banned from services before they get significant use out of it makes it very risky for spammers to use such a thing, even if the tiny micropayments themselves would be worth it to them.
It could certainly have other problems, such as people getting banned from the system for things other than spam and other use detrimental to the service provider. There is also the issue of how the initial buy-in fee is distributed.
But a high buy-in for a system that many online service providers use would very strongly discourage use detrimental to those providers (I think; this is only for discussion, as it's posted by someone with little knowledge of the area. Micropayment systems have been talked about a lot, but I don't remember a high buy-in being mentioned).
Edit: Forgot to mention that the idea is that service providers can offer their services at lower cost, because the risk to them from an account/address with a high buy-in is lower than from an account/address with no buy-in.
> The decision from sircmpwn was, at the end, to charge money for the service.
Frankly, I'm shocked the other major free providers (GitHub/Lab) haven't done this by default. GitHub's current default is free for public branches, and a small fee for private.
I could see a flipped setup working: by default a very small fee (1-5 cents per x number of batches) that most companies wouldn't notice, and a path for FOSS projects to apply for credits.
I can't find the direct link, but I remember someone on HN pointing out that because CI tools are Turing complete, GitHub Actions is the cheapest serverless cloud product in the world right now -- you just need to figure out how to game the system.
I'm sure they've built very sophisticated filtering tools, but imagine someone slips through the cracks and gets a cryptominer working. Get that action registered in enough projects (by, say embedding it in an Actions library or generator tool) and that could be significant.
Yeah, in the end you wind up with a small set of persistent adversaries who have been tweaking their abuse alongside your fixes, and a hopefully much higher wall for new abusers to scale.
If possible, it can help to hold back new systems and release a bunch of orthogonal anti abuse systems at once. Then the attackers need to find multiple tweaks instead of just evading one new system.
If resources are free then you could even actually deploy their app and either whitelist it for their own IP or only allow very few requests before taking it down.
This would be even more frustrating and could ruin whatever they plan to do with their abusive app in the first place. Let's say they deploy their malware/phishing page, test it a couple of times (possibly from a different IP) and it works. They then start spamming the malicious link and waste decent amounts of time/money/processing power, not realizing that the link was dead after the first 10 hits.
We're primarily trying to prevent fraudulent payments combined with expensive VMs. Throttling CPU to almost nothing on high risk accounts sounds delightfully irritating.
We also get the less resource intensive, but still harmful abusive apps that port scan the internet. Those are relatively easy to detect. We generally don't want to be a source of port scans so we shut them off pretty quickly.
I wonder if you could return bogus but plausible results to port scans? You could whitelist a set of safe ports such as HTTP(S) so that if they try to "curl google.com" to confirm everything is OK they get a good response, but silently drop everything else, causing their scan to return negative on all other ports.
You would think, but I've run into plenty of click bots that continue to run for years despite receiving nothing but 4xx or 5xx results. You would think someone someplace would monitor that and rotate the IP, but no reaction at all.
Then there are others where it's only an hour or so before rates get adjusted to our threshold and/or new IPs start emitting the same requests.
If your eyes can "normalize" unusual symbols to common ones to make an English word, then so can a lookup table. I feel like this isn't a case where you'd reach first for a neural net.
In fact, the Unicode Consortium provides a report and an extensive list of "confusable" symbols, which you could use alongside Unicode normalization tables to map adversarial text back into more ASCII-equivalent text before running it through anti-spam mechanisms that are interested in the content of the message.
I only learned about it myself after spending too long building my own half-baked version. It's written in pretty opaque language that makes it hard to find even if you know what you want.
Maybe somebody else on here will see it and learn about it before they need it, and at least you still have a new tool to reach for in the future.
Obviously you're saying it doesn't cover everything, but a big thing it's not going to catch beyond leetspeak-type situations is the kinds of thing you (used to) see in internationalized domain spoofing: legitimate non-Latin-script letters that just look the same or nearly the same.
NFKC/NFKD will handle "this is another form of the Latin letter A" type stuff but not "Cyrillic A looks like Latin A."
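To make that distinction concrete, here is a small sketch: NFKC folds "compatibility" forms of Latin letters (mathematical bold, fullwidth, etc.) back to plain ASCII, but leaves cross-script lookalikes like Cyrillic А untouched, so a confusables map such as the one generated from the TR#39 data mentioned in this thread is still needed. The tiny hand-made map below is only an illustration.

```python
# NFKC vs. cross-script homoglyphs: a hedged illustration of the point above.
import unicodedata

samples = ["𝐀pple", "Ａpple", "Аpple"]   # math bold A, fullwidth A, Cyrillic A

for s in samples:
    folded = unicodedata.normalize("NFKC", s)
    print(repr(s), "->", repr(folded), "| pure ASCII:", folded.isascii())
# The first two fold to "Apple"; the Cyrillic one does not.

# Tiny hand-made map for leftover cases; a real mapping would be generated
# from confusables.txt in the Unicode TR#39 data files.
CONFUSABLES = {"А": "A", "е": "e", "о": "o"}   # Cyrillic -> Latin

def fold_confusables(text: str) -> str:
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(CONFUSABLES.get(ch, ch) for ch in normalized)
```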
One term used to describe this is homoglyph, as in homoglyph attacks for phishing.
Back in 2015 I did some work just using simple bitmap rendering plus OCR to find text that looked like a small selection of known words. It was actually reasonably effective.
Yeah, but then someone has to create or find that whole table and make it work.
The initial problem wasn't those symbols but the content itself; the symbols and special characters came into the picture later.
Later on, as mentioned in my original comment, they would use positive content from other blog posts that had been published/passed moderation to mix with their bad content.
We probably could have used a different method, but at the time we needed something quick, and it worked and still works with very little tweaking.
We don't have a massive amount of threats or abusers anymore to know the exact effect, but again, so far it works.
At the time they were coming in at several thousand per minute; IP blocking, range blocking, user-agent filtering, captchas and anything of the sort didn't work on them.
The good news is that the Unicode consortium has a report on this issue, and the tables already exist for normalization and mapping of confusables to their ASCII lookalikes: https://www.unicode.org/reports/tr39/
I built a Python library for finding strings obfuscated this way. It was critical when moderating our Telegram channel before an ICO.
https://github.com/wanderingstan/Confusables
E.g. "𝓗℮𝐥1೦" would match "Hello"
If you can identify text written with mixed glyphs, just ban it outright. Normal users don't write text like this; the mere binary presence of such "homoglyph" text is probably a better signal for spam than whatever your neural net outputs when running it after normalization.
I think that depends on the users. People copying and pasting bits of text that was in English or another common language— think documentation, code, news articles, tweets, etc.— with a different character set could be problematic.
Also, 𝒮ℴ𝓂ℯ 𝒜𝓅𝓅𝓈 marketed as "𝔽𝕠𝕟𝕥𝕤 𝕗𝕠𝕣 𝕤𝕠𝕔𝕒𝕝 𝕞𝕖𝕕𝕚𝕒" would be ℭ𝔞𝔲𝔤𝔥𝔱 𝔲𝔭 𝔦𝔫 𝔱𝔥𝔦𝔰. (math symbols) A user base with young people getting bounced or shadow banned for trying to express themselves or distinguish themselves from their peers would be like ಠ_ಠ (Kannada letter ttha)
I think targeting the language they're using is a better bet.
> pasting bits of text that was in English or another common language
If they use many (maybe three? four? or more) character sets in the same post, or different character sets in any single word, then that'd be highly suspicious?
Whilst still letting people copy paste from another language
A special case is needed for the shoulder shrug with a Hiragana letter tsu, I mean katakana tsu.
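A rough sketch of that per-word script-mixing heuristic, assuming a crude proxy for script (the prefix of each letter's Unicode name); a real version would use proper Script properties or the TR#39 data. Non-letter characters like the shrug's punctuation are ignored, so ¯\_(ツ)_/¯ and ಠ_ಠ don't trip it, and a pasted all-Japanese or all-Cyrillic quote stays fine because each word uses a single script.

```python
# Hypothetical heuristic: flag a post when any single word mixes letters from
# multiple scripts, while leaving single-script foreign words and emoticons alone.
import unicodedata

def scripts_in_word(word: str) -> set:
    scripts = set()
    for ch in word:
        if ch.isalpha():
            # "LATIN SMALL LETTER A" -> "LATIN", "CYRILLIC CAPITAL LETTER A" -> "CYRILLIC"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts

def looks_mixed(text: str, max_scripts_per_word: int = 1) -> bool:
    return any(len(scripts_in_word(w)) > max_scripts_per_word for w in text.split())

print(looks_mixed("Саll Місrоsоft suppоrt nоw"))        # Cyrillic mixed into Latin words -> True
print(looks_mixed("Copied 日本語 quote next to English"))  # separate single-script words -> False
```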
I've noticed much more usage of alternative Unicode ranges for numbers/letters in email subjects lately to make marketing messages stand out, too (in addition to emoji of course), though I wouldn't necessarily mind banning that...
Huh. For any specific purpose? Does it seem like they're avoiding paying for recruiter accounts or something by evading algorithms designed to detect their activity, or is it just for the heck of it?
I know, right? There are so many times when I've wanted to use something like box-drawing Unicode characters (cp437) to explain a complicated concept on Hacker News, but alas I couldn't, due to widespread computer fraud and abuse. How are we going to build a more inclusive internet that serves the interests of ALL people around the world, regardless of native language, if the bad guys are forcing administrators to ban Unicode? (╯°□°)╯︵ ̲┻̲━̲┻
They kinda do. Check out the shrug "emoji", table flip, and so forth. Then there's the meme of adding text above and below by abusing Unicode's "super" and "sub" modifications.
You could block it to only ever represent ASCII, but then you've knocked out the ability to expand internationally.
You can hardcode a rule for this specific bypass. Or you just retrain the neural net, and it learns that the presence of these symbols = bad very quickly, and you spend less time writing and testing a custom solution.
> Found out, they have learned the trick and now using symbols from different languages to make it look like English.
I wonder if you can train an ML model using text as images. For example, taken as strings, "porn" and "p0rn" are not very similar, but visually they are.
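A hedged sketch of that idea, assuming Pillow is available: render both strings to small bitmaps and compare them pixel-wise. Visually confusable strings score much closer than their raw character difference suggests; a real system might feed such bitmaps to a small CNN rather than comparing pixels directly.

```python
# Render strings to 1-bit bitmaps and compare pixel overlap (illustration only).
from PIL import Image, ImageDraw, ImageFont

def render(text: str, size=(80, 20)) -> Image.Image:
    img = Image.new("1", size, color=1)   # 1-bit image, white background
    ImageDraw.Draw(img).text((2, 2), text, font=ImageFont.load_default(), fill=0)
    return img

def pixel_similarity(a: str, b: str) -> float:
    pa, pb = render(a).getdata(), render(b).getdata()
    same = sum(1 for x, y in zip(pa, pb) if x == y)
    return same / len(pa)

print(pixel_similarity("porn", "p0rn"))   # close to 1.0
print(pixel_similarity("porn", "blog"))   # noticeably lower
```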
I think they meant (and I am interested in hearing about) appealing a "block" decision that was made by your automation.
If I'm a real human and trying to post a "good" post, but the model classifies it as bad and automatically blocks it, how do I appeal that decision? Can I? Or is my post totally blocked with no recourse?
When a post gets published, it is sent to the machine learning image (a separate service) via REST.
If it's classified as bad, the post is kept as a draft.
A new record gets created in another database table to keep track of them; the accuracy rate is recorded as well.
This was done to make sure no irreversible action was taken on good content.
Blogs with more than 1 year of history would not go through moderation; no action was taken on them, we just recorded the accuracy for future reference.
Later, someone from our team (usually me) would check them by eye and pull the trigger on them; they would then go into making the training better.
If something passed moderation but was indeed spam, it would go into another iteration.
We had to do this for over a month; over that time the success rate was around 99%, and no blogs were wiped from our database by machine classification unless confirmed by someone.
At that time the whole model was trained for that specific content. Later it got into other types of spam, for which we trained different models.
Overall, the machine's actions were logged, and content/users/blogs would get labeled and given bad marks.
They would be displayed on a report page until someone made the final decision; the whole time, the user would be shadow banned (shadow banning didn't help, though) and their content would not be published.
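Pulling the flow described above together, a minimal sketch might look like the following; the endpoint, field names, thresholds and the `db` helper object are all hypothetical, not the platform's actual API.

```python
# Hedged sketch of the described moderation gate: classify over REST, keep
# flagged posts as drafts, log the verdict, and queue for human review.
import requests

CLASSIFIER_URL = "http://ml-service.internal/classify"   # hypothetical internal endpoint

def moderate_post(post, db):
    resp = requests.post(CLASSIFIER_URL, json={"text": post.body}, timeout=5)
    result = resp.json()   # e.g. {"label": "spam", "confidence": 0.97}

    # Always record the verdict for later auditing and retraining.
    db.insert("moderation_log", post_id=post.id,
              label=result["label"], confidence=result["confidence"])

    if result["label"] == "spam":
        post.status = "draft"                # nothing irreversible: keep it as a draft
        db.queue_for_human_review(post.id)   # a person makes the final call
    else:
        post.status = "published"
    db.save(post)
```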
Thanks for the detailed response! And nice to hear how much you've managed to keep humans involved in the process. I used to work on a content review automation system for a big company, so it's always fun to hear about how others handle similar cases.
And there's a lot of overlap between how that system acted and what you're describing. It makes me wonder if there's space for a company that offers this sort of model training + content tagging + review tooling capability as a service, or if there's too much variation in what "good" and "bad" input is to make it generalizable.
I'm interested in why these people were doing this. Were they hoping to get non-tech-savvy people that were searching for computer help? I guess that's a good audience of unwitting users to attempt to hack, but was the goal to get them to submit to one of the remote tech support scams? Were they embedding malware into your blogging platform, or getting ad revenue out of this somehow?
> Were they hoping to get non-tech-savvy people that were searching for computer help?
Yes.
They would create these posts and get onto search results quickly (the platform does pretty good SEO optimization out of the box), and they would write good-quality posts as well.
They would also share these posts on some other websites, especially social media accounts.
We don't have Google Analytics or the like to see exactly where visitors came from. I noticed huge traffic to such pages by looking at the logs.
Our nginx log parser was alerting us about sudden spikes on certain blogs and on a pre-defined list of words we have.
That's when we noticed something was going on.
It didn't take more than a couple of hours (while working on the model) before we received email from the data center people about hosting phishing content, and not much longer before we received emails from some of those companies as well.
> Were they embedding malware into your blogging platform, or getting ad revenue out of this somehow?
No. On the blogging platform we have everything bleached out; nothing goes in without passing through sanitizers.
They would simply convince people to call those US numbers.
I actually called one of those numbers and yeah, it was one of those "customer support" operations in some other part of planet Earth, definitely not from the company he was pretending to be, and he very quickly asked me to install TeamViewer on my machine. I really wanted to let them access the Windows install in my VirtualBox and have some fun with them, but well, someone had to fix the moderation issue :D
We still keep the free plans even though there are abusers; that won't be a reason to retire them. So many people use them for legitimate reasons and keep their personal writing there.
1. Highly technical, because the flow was scripted to work with our website: bypassing captcha and email verification by using many different domains and email accounts, and also highly distributed via many IPs.
2. Non-technical, where they paid some people to do it manually, which doesn't seem likely given the way they walked through the many steps like a piece of cake.
However, our platform was/is a target due to several reasons:
1. Easy to register and start blogging.
2. Free plan with no hard limit.
3. Quick rankings due to SEO implementation out of the box.
4. Absence of any moderation before such attack.
And probably some other possible reasons that made their job easier and us a better target.
If I were this sort of scammer, I'd have the text being posted, and the escalation path for when it gets blocked, pretty well ironed out. Then you just try it on lots of platforms to find the ones that don't figure out how to solve the issue. So my guess would be they tried because they might as well, and then they hit the limit of how much effort they had put into their own generation systems and moved on to the next mark.
If you don't mind me asking, what sentence embeddings model (bert/roberta/etc) did you have the best luck with for your classifier? I like the quick retrain that can be done with an approach like this, though I have found that if you throw too many different SPAM profiles at a classifier it starts to degrade, and you might have to build multiple and ensemble them. The embedding backend can help a lot with that.
Basically, we pulled the database into a CSV file, and anything that was published before the bad content appeared was classified as HAM.
We had content that was OK, so it was marked as HAM, and then our new bad content was all marked as SPAM.
When it was deployed to production, for some hours HAM content got wrongly marked and the model got trained on it as well, which caused a lot of confusion, but the problem was taken care of once the model was properly tuned and it became safer to let it run automatically.
I'm curious why you'd roll your own at all when there are moderation services available. Did you just have a use case that didn't match anything on the market?
It's not that content pops up for everyone when someone posts something.
There's a Feed page where you can read what others you follow have published.
There's an Explore page where the latest content is visible without any filter or categorization. This is where such content would appear, but only blogs older than 7 days would show up there (we have removed that delay in recent versions).
Basically no one noticed them.
We did disclose the issue we were dealing with to some of the platform's old users when they complained about their posts not getting published. That was the first issue, in the first 15 minutes: the machine learning model classifying wrongly because it had been fed mixed content (where spammers had mixed bad content with good content from those exact blogs).
Other than several bloggers reporting that their posts wouldn't go through as expected, no one else got affected, and I hope no people were lured by those scammers while their content was published on our platform.
If you host blobs for free, somebody is going to use you as their host. Even if you just hosted audio, I'm sure somebody will quickly come along with a steganography tool to hide their content on your site (and use your bandwidth).
Similarly, if you make compute power available, people will use you to mine cryptocurrency. Even if all you host is text, somebody will come along to be abusive. When you put a computer on the Internet, it's open to the entire world, including the very worst people.
If you're hosting a community, start from the beginning by knowing who your community is and how they will tell you who they are. If the answer is "everybody", then know what everybody means -- it means some people won't want to be there, because some people will make life hard for them.
It's no longer 1991, when you could assume that such people wouldn't find you. They will find you -- for money, or the lulz. You have to plan for that on day 1. You can't fix it after the fact.
Yeah, there's an entire category of "idea guys" who don't get this. They repeatedly try to crack the code on a truly moderation-free or purely crowd-moderated platform, and it never, ever, ever works.
It almost always boils down to a poor understanding of how humans work (usually some sort of "homo economicus") or how computers work (usually some sort of "AI magic wand").
Generally they'll want to make a half-baked social media network without understanding that you need to pay for things like hosting, or a programmer's time. I've made the mistake of writing code for these folks.
Guaranteed they'll never appreciate it, and this includes non-profit coding groups. Never-ending scope creep, vague requirements, etc.
My rule is: unless you're one of my best friends, I simply will not build your project for you. However, the few times I have built something for a friend, I found the experience very rewarding. It can be good to develop with someone else who can give you feedback, so you actually know you're building something someone would like.
I still theorize crowd-moderated platforms are possible, as long as there's really good gate-keeping.
My bet is some real-world tie, one which is time consuming and expensive to create. From there it should be possible to create moderation tools that keep the rest going.
An example of a real world tie would be a trust network that requires status with in-person communities and local businesses. And not just "accept the hot chick friend request," but an explicit "I'm staking my reputation by saying this person is real."
Slashdot’s meta-moderation system worked well for a long time. One set of people could make moderation decisions directly on content, and then another unrelated set of people would review the moderation decisions and support or revert them.
It was all tied to karma and permissions in ways I can’t quite remember. But essentially there was no way for a motivated bad-faith group to both moderate and meta-moderate themselves, and the incentives marginalized bad faith actors over time.
You had limited resources to use (5 moderation points at a time, for example), at least in theory. Having moderation itself be moderated was a great idea, though, and I think it should come back in some form.
Moderation is labor and you get what you pay for. Which is not that crowd-moderation cannot work, but that for good crowd-moderation you still have to treat it as a labor pool, have a very good idea of how you are incentivizing/paying for it, and what "metrics/qualities" those incentives are designed to optimize for.
(In some cases it actually is far cheaper to pay a small moderator pool a good wage than to pay an entire community a bad wage to "crowd-moderate" if you actually test the business plan versus alternatives.)
Whatever happened to Something Awful? Are they still around?
They charged a one-time $10 fee to access their forums. If you got banned, you could pay $10 to get a new account. It made being a total dick expensive. I've heard it get called the Idiot Tax.
It's like StackOverflow + Reddit + Wikipedia. Section 3.1 is what makes the concept fairly unique. Most moderation systems require known moderators; the proposed system uses random selection. Eligible moderators could require a minimum reputation to further reduce the possibility of bad actors. Using something like Slashdot's interaction limit may be helpful.
Webs of trust are awesome, but call for a lot of investment from users. I think they are more viable if they are independent of any one website, so all that effort isn't flushed down the drain if the guy who owns the domain goes incommunicado.
It depends on the scale. I can vouch that crowd-moderation works fine for a small forum (~ 1000 members) which I am part of. And there is no karma system. You get to report posts (3 reports mean that the post is deleted), and warn users (24-hour ban after 3 "active" warnings, and then it scales up to a permanent ban after 15 "active" warnings). Warnings become "inactive" after a month.
It also depends on the threat model. If the community is the target of a harassment campaign coordinated by external actors, then you might need additional tools, or people dedicated to the job. However, this won't necessarily solve the problem, as external actors could double down, and moderators can lose their minds (suspicion of a troll behind every post, abuse of power, absence of control over the moderators, possible presence of a spy/agitator among the moderation team, etc.). I won't name the forum and the community, but I have a specific one in mind. It does not help that it is a source of information for gaming media, which means that it is often linked to in press articles, which attracts much attention from all kinds of people.
That being said, I get back to the subject: user-generated content on platforms (and not just forums). If the goal is to reach a large scale, then I fully agree with you.
> You get to report posts (3 reports mean that the post is deleted), and warn users (24-hour ban after 3 "active" warnings, and then it scales up to a permanent ban after 15 "active" warnings). Warnings become "inactive" after a month.
Sounds like I could do some damage there by signing up with three accounts?
In practice, this has not happened yet, and it has been 3 years since the forum inception.
One obstacle which I forgot to mention is that an account cannot report posts or warn other members unless the account is 3 months old *and* the account has created at least 300 posts. Both conditions have to be met. I guess it is a sufficient hindrance for most Internet trolls to forget about the forum if they had no intention to take part in the community in the first place.
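Just to make the described policy concrete, here is a small sketch encoding those rules (eligibility of 3 months and 300+ posts, warnings expiring after a month, 3 active warnings for a 24-hour ban, 15 for a permanent one); the function names are illustrative only.

```python
# Hedged sketch of the forum's described warning/ban policy.
from datetime import datetime, timedelta

def can_moderate(account_created: datetime, post_count: int, now: datetime) -> bool:
    # Both conditions must hold: account at least ~3 months old AND 300+ posts.
    return now - account_created >= timedelta(days=90) and post_count >= 300

def active_warnings(warning_dates, now: datetime) -> int:
    # Warnings become "inactive" after a month.
    return sum(1 for w in warning_dates if now - w < timedelta(days=30))

def sanction(warning_dates, now: datetime) -> str:
    active = active_warnings(warning_dates, now)
    if active >= 15:
        return "permanent ban"
    if active >= 3:
        return "24-hour ban"
    return "no action"
```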
Yeah, I will agree with you that there can be some smaller-scale systems that work fine. That said, in these cases, in my experience, it's always a few key "hero" mods who are just very committed to volunteering to keep things cleaned up.
Without actually hiring people, it's hard to get that level of commitment, and just as you are saying, as soon as the work gets hard enough (eg clever trolls that turn users against each other, or paranoid political crusaders who think the mods are in league with unseen forces), even the best volunteers end up quitting at the worst times.
In the long run I think the solution is just hiring moderators. It costs money, but if you want a job done well and consistently ya gotta pay.
Why can't you verify real identities? Can't you use phone numbers, photos of IDs, charging a credit card, verification of physical addresses, or invites to increase the difficulty of creating fake accounts?
Yes, there is an issue of increasing the difficulty of signing up for real users, but once accounts are tied to real identities, doesn't that allow crowd moderation?
Somehow authenticating every user with a "real ID" doesn't help much unless you engage in content moderation.
A system like that would be complex, costly, a major barrier for growth, and would likely still be vulnerable to fraud. You probably wouldn't have much opportunity to take legal action against abusers, even if you could identify them. Plus, safely storing the user info needed to make a system like that work would be a huge liability.
And at the end of the day you still have to moderate the platform to identify abuse and take action against abusers. But if you use "real IDs" that probably won't be a problem, because you'll have no users anyway.
lobste.rs is tiny enough to be irrelevant, and already virtually unusable because it's unwilling to ban people who are unpleasant without being unambiguous rulebreakers.
That swings to the other side of "you either die an MVP or build content moderation": people are not going to submit real ID for a random project. They've only just started implementing this on Youtube ""age verification"" because they were made to, and Facebook only did it as an arbitrary after-the-fact hammer. It causes all sorts of problems (what do you regard as valid? What about deadnames?).
Twitter and lots of other sites do phone number verification which is less onerous but far easier to spoof.
And of course the biggest, highest profile moderation challenge involves people whose identities are known but nonetheless are toxic to the community. Including the "final boss" of content moderation challenges, Donald Trump.
I agree. You build the MVP then as you need content moderation you start requiring more onerous proof of identity. The only goal is to make ban evasion more difficult.
For everything else, just ask the community what they want. If they don't like Donald Trump in the conversation, then he's gone. Donald can then attempt to find a community (subreddit) that accepts him. That community can be quarantined or banned if people really don't like it.
Thank you for the feedback. I strongly suspect I'm wasting my time :(
I agree that you'd need to securely store the personal information but as it's only used during account sign up it could be an entirely separate system.
It's definitely a huge drawback that you'd risk such important information.
Absolute worst case! You're going to end up DMCA'd by the entire music industry.
> It's no longer 1991, when you could assume that such people wouldn't find you.
Even back in the nineties there was abuse... but the internet was so much smaller, and it was possible to manually ban them. Except on USENET. The labour of dealing with spam fell to a small number of people, one of whom wrote this astonishing rant: https://www.eyrie.org/~eagle/writing/rant.html
(and partly disowned it, but I think he was right first time)
This is the key distinction. If you charge money, from the beginning, most of your content moderation woes go away.
At Transistor.fm we host podcasts and charge money for it (starting at $19/month). We've had very few problems with questionable content.
We're a counterpoint to the narrative here: small (4 full-time people), profitable, and calm.
> Even if you just hosted audio
Most DMCA takedown requests these days are handled through the big podcast directories (Spotify, Apple Podcasts). We haven't had to write/implement any fingerprinting tech.
Charging also gives a (potential) communications path with the customer, through the payments processor. This can be useful for resolving account issues, which are otherwise a nightmare when there's no identification whatsoever.
But any payment also means that you're now fighting against "free" for growth. Free may come either from some massive monopoly who can offer their alternative as a loss-leader, or as a subsidised offering on the back of another monetisation scheme (usually advertising), where revenue potential increases with platform scale. Yes, you'll have fewer issues, but you'll also always be on the short end of the growth stick.
This is a rephrasing of another reply to your comment, hopefully with a different and clarifying emphasis.
> We're a counterpoint to the narrative here: small (4 full-time people), profitable, and calm.
I hope your model wins in the end, but let's face it - the internet makes available a global market at nearly no marginal cost per user. Network effects and all that on top of that premise.
You can carve out a niche but the serious money will be spent on scale.
> If you host blobs for free, somebody is going to use you as their host. Even if you just hosted audio, I'm sure somebody will quickly come along with a steganography tool to hide their content on your site (and use your bandwidth).
This feels more like a theoretical example than something that has actually happened. Do you have any examples of steganography being used as bandwidth redirection/hosting?
I couldn't find the article I had read a few years back, but I remember this sort of thing being used to host content on Facebook, Wikipedia, Reddit, etc, before they cracked down on it.
I did this when I was in high school for fun. I found a poorly designed comment system somebody had built and used it to transfer files around, hidden in gibberish comments. Maybe not the most common thing, but more common than you would expect.
Eventually someone will come around and build a "Google docs FUSE" tool that stores arbitrary data in your system. I suspect this is the main reason Google switched docs to actually count against your usage. For normal users you can still store hundreds of millions of documents but it's pointless to encode data in it over just uploading it to drive.
I'll never forget having to be a moderator for a somewhat popular forum back in the day and oh man did I learn how a few people can make your life hell.
One thing not mentioned much in these discussions is the poor moderators. Having to look at all that stuff, some of which can be very disturbing or shocking (think death, gore, etc., as well as the racy things), really takes a toll on the mind. The more automation, the less moderators have to deal with, and then it's usually the tamer middle-ground content.
> I'll never forget having to be a moderator for a somewhat popular forum back in the day and oh man did I learn how a few people can make your life hell.
I was also a mod for a popular gaming forum way back in the day. It was pretty miserable looking back.
Personally, for me, the extreme/shocking content wasn't the biggest issue. That stuff was quick and easy to deal with. If you saw that type of content you just immediately deleted it and permanently banned the account. Quick and easy.
What was a lot harder were the toxic users that just stuck around. Not doing anything bad enough to necessarily warrant a permanent ban, but just a constant stream of shitty behavior. Especially sometimes when the most toxic users were also some of the most popular users.
> What was a lot harder were the toxic users that just stuck around. Not doing anything bad enough to necessarily warrant a permanent ban, but just a constant stream of shitty behavior. Especially sometimes when the most toxic users were also some of the most popular users.
What people find out, again and again, is that you just ban those users. Don't need an excuse. Just ban them. Even if they are popular. Your community will be much better once you do.
I have done this for years and it just works. You know these users give off a bad vibe and that others are put off by it. Just remove them and ignore the complaints from the user. They can find another space that accepts them. I don't even bother writing rules lists because they are pointless.
I do give warnings out first but usually that does nothing to change behavior anyway.
That was also my experience moderating a medium-sized city subreddit. Bigger problems were easily dealt with. Toxicity was a lot harder to deal with, especially when it's so easy to create a throwaway account. I quit when one user decided to target me personally, and kept evading bans to cause more grief.
All of this crap, and your reward is more complaints, more demands.
If u/gonewild can manage user verification, then anyone can.
Doubly so when surveillance capitalists like facebook and NSA already have (shadow) profiles for every person, living and dead.
Facebook absolutely already knows the true identity of each and every troll. Not verifying account creation is a convenient fiction, willful ignorance, allowing their outrage machine to profit. "lalala", hands over ears, "i can't hear you!"
> What was a lot harder were the toxic users that just stuck around.
In about a year as a mod on a semi-busy political forum, the trickiest situations always seemed to involve two users, neither generally horrible but both continually stepping over the line in their interactions with each other. And each had their own highly motivated allies, so any action would ignite a new firestorm of complaints about biased moderators. What a nightmare. Probably part of why that site doesn't exist any more.
BTW, that's also where I learned some rules of effective moderation. Unfortunately, finding a forum where moderators know how to moderate is hard. Far more often, they fall into a pattern of ruling on technicalities instead of considering what will actually improve discourse, and they always end up getting manipulated by the community's worst members to drive out better ones.
> What was a lot harder were the toxic users that just stuck around. Not doing anything bad enough to necessarily warrant a permanent ban, but just a constant stream of shitty behavior. Especially sometimes when the most toxic users were also some of the most popular users.
This is the problem with community guidelines being the be-all and end-all. Hard rules are great for catching insults or slurs. They're not so great for dealing with actual abuse or inciting very bad ideas.
There's a reason white supremacists and literal nazis (yes, really with the salutes, genocide fantasies, Jewish conspiracy theory and all) have shifted from using obvious language to dogwhistles and "just asking questions". Erosion is a much more powerful force than a few direct impacts.
If you want to moderate a community, you need to have a plan for dealing with toxic individuals, not just language. We tend to imagine "hackers" and (foul-mouthed) "trolls", but I find Molly's archetypes a lot more thought-provoking: https://twitter.com/mollyclare/status/1254886822779502593?la...
I think community moderation is a problem we tend to run into the Dunning-Kruger effect with, because it seems like something we have an intuitive understanding of even if we have zero experience actually doing it and have never learned what works and what doesn't.
> Hard rules are great for catching insults or slurs.
Bingo. Hard rules encourage brinksmanship. There's always a class of "picador" users who will poke and prod and provoke just up to the line where the rules are, then flag the response. A moderator too wrapped up in rule by technicality (or too lazy to look at context) will then come down as harshly as they can on the author of the flagged comment, and give the picador a total pass. Problem is, the picador does this again and again and again, never making a positive contribution, while their targets are often chosen precisely for their prominence. Guess which one is encouraged to continue their behavior, and which one is encouraged to go away. Has the "moderator" helped to improve discourse on the site, or helped to ruin it?
We had some of the crazy people track us down and call in bomb/death threats to our office building.
So many thought we were in collusion with a specific forum moderator (out of a million forums) and got so incensed. And this was in the early 2000s, which we think was a saner time.
A close friend of mine is a primary contributor to an extremely popular console emulator. He learned quickly to author under an alias which he keeps secret – even from most of our friend group.
It's bizarre that he has to keep this real love of his, which he's devoted hundreds and hundreds of hours to, so close to his chest.
But sadly The Greater Internet Fuckwad Theory holds true today.
> Especially sometimes when the most toxic users were also some of the most popular users.
If the "toxic" users were the most popular, how do you know you were not the "toxic" one instead? If the community is supporting the "toxic" material, how could it be "toxic"?
In my experience, it's not the toxic users that are the problem. It tends to be the toxic mods. You can ignore toxic users. You really can't ignore toxic mods.
I also have a problem with the term "toxic". It ultimately means "something I don't like". Mods should never ban "toxic" content. They should ban illegal and perhaps non-pertinent content. But that's just my opinion.
There have been a bunch of articles lately about the horrors that Facebook moderators have to pore through. FB has been forced to pay $MMs to some of them for mental health: https://www.bbc.com/news/technology-52642633
> Having to look at all that stuff, some of which can be very disturbing or shocking
Yup, was the designated person to report all child porn for our photo-sharing website. It was horrific. Some of those images still haunt me today, they were so awful. And the way the reporting to the NCMEC[0] server worked, you had to upload every single image individually. They did not accept zip files or anything at the time. It was a giant web form that would take about forty image files at once.
Even without seeing that stuff, seeing a constant stream of bad behaviors with the probably-good behavior filtered out can subtly change your priors about people - it makes you start thinking people suck more in general, kind of like how watching news where they show the worst of the worst makes one trust people less.
I definitely used to notice this after some time working on our moderation queues.
> I'll never forget having to be a moderator for a somewhat popular forum back in the day
Similar experience, though I'll say that the worst was dealing with other teenagers that threatened suicide when you banned them. That always took a lot of effort to de-escalate and was a complete drain on personal mental health.
I could deal with porn, shock images, and script kiddie defacements, but having people threaten to kill themselves was human and personal. It hurt, especially when the other person was legitimately having a personal crisis.
I still think about some of these people and wonder if they're okay.
Several years ago a popular gaming forum with a significant teenage audience I used to read had declared a simple policy toward threats of suicide. If you were threatening to kill yourself, do it, and stop messaging the mods, they are not here to talk you down from a ledge. It seemed pretty effective.
Most of these threats weren't serious, just some problematic teen looking for attention. Showing them that some strangers don't give a shit about them or their antics could be a real eye-opener.
It always blew people's minds when I told them that 50% of the engineering time at reddit was spent on moderating. What's interesting though is that we didn't even have any moderation for the first year or so, because the community would just downvote spam.
It wasn't until we got vaguely popular that suddenly we were completely overwhelmed with spam and had to do something about it.
What blows my mind is that only 50% is spent moderating.
At some point the engineering work can be something approximating “done” (or should get asymptotically close to it), and the health of your platform ultimately rests a lot more on the quality of the community than on any particular technical project.
What is engineering doing that it takes up as much person-hours as moderation? Running A/B tests to tweak the “try the app” dialog?
(Yes, I’m bitter… I deleted my Reddit account years ago and I’m still lamenting the loss of what the site used to be.)
I saw this happen to a couple subreddits. I don't blame reddit for that (other than selection of default subs). It's just the natural progression as a subreddit grows that it turns to memes and constant recycled content. Heavy moderation is the only way to stop that.
Roblox is working on a "moderation" system that can ban a user within 100ms after saying a bad word in voice. But their average user is 13 years old.
Interestingly, Second Life, the virtual world built of user-created content, does not have this problem. Second Life has real estate with strong property rights. Property owners can eject or ban people from their own property. So moderation, such as it is, is the responsibility of landowners. Operators of clubs ban people regularly, and some share ban lists. Linden Lab generally takes the position that what you and your guests do on your own land is your own business, provided that it isn't visible or audible beyond the parcel boundary. This works well in practice.
There are more and less restrictive areas. There's the "adult continent", which allows adult content in public view. But there's not that much in public view. Activity is mostly in private homes or clubs. At the other extreme, there's a giant planned unit development (60,000 houses and growing) which mostly looks like upper-middle class American suburbia. It has more rules and a HOA covenant. Users can choose to live or visit either, or both.
Because it's a big 3D world, about the size of Greater London, most problems are local. There's a certain amount of griefing, but the world is so big that the impact is limited. Spam in Second Life consists of putting up large billboards along roads.
Second Life has a governance group. It's about six people, for a system that averages 30,000 to 50,000 concurrent connected users. They deal mostly with reported incidents that fall into narrow categories. Things like someone putting a tree on their property that has a branch sticking out into a road and interferes with traffic.
There's getting to be an assumption that the Internet must be heavily censored. That is not correct. There are other approaches. It helps that Second Life is not indexed by Google and doesn't have "sharing".
The biggest reason IMO for moderation in the first place is because if you don't block/censor some people, they will block/censor others. Either by spamming, making others feel intimidated or unwelcome, making others upset, creating "bad vibes" or a boring atmosphere, etc.
So in theory, passing on moderation to the users seems natural. The users form groups where they decide what's ok and what's banned, and people join the groups where they're welcome and get along. Plus, what's tolerable for some people is offensive or intimidating for others and vice versa: e.g. "black culture", dark humor.
If you choose the self-moderation route you still have to deal with legal implications. Fortunately, I believe what's blatantly illegal on the internet is more narrow, and you can employ a smaller team of moderators to filter it out. Though I can't speak much to that.
In practice, self-moderation can be useful, and I think it's the best and only real way to allow maximum discourse. But self-moderation alone is not enough. Bad communities can still taint your entire ecosystem and scare people away from the good ones. Trolls and spammers make up a minority of people, but they have outsized influence and even more outsized coverage from the news etc. Not to mention they can brigade and spam small good communities and easily overwhelm moderators who are doing this as volunteers.
The only times I've really seen moderation succeed are when the community is largely good, reasonable, dedicated people, so the few bad people get overwhelmed and pushed out. I suspect Second Life is of this category. If your community is mostly toxic people, there's no form of moderation which will make your product viable: you need to basically force much of your userbase out and replace them, and probably overhaul your site in the process.
It sounds like Second Life lived long enough to build content moderation, pushed the work of content moderation onto its users, and, in a hilarious psychological trick worthy of Machiavelli, made the users think they own a piece of something (they don't) so that what other users do on "your" land is up to you. My job would also love it if I paid them to work there instead of the other way around.
The Internet must be heavily censored to be suitable for mainstream consumption and the tools described make Second Life sound like no exception.
> You either die an MVP or live long enough to build content moderation
> made the users think they own a piece of something (they don't)
True, land in Second Life is really a transferable lease. It is an asset, though. You can resell it to someone else. Second Life makes most of its money from "tier charges", which work like property taxes and are a fixed amount per square meter. Land resales and rentals, though, are a free market. "Content moderation" is a minor part of using land in Second Life. You usually build something on the land and do something with it.
You have to be present (as an avatar) to make trouble. You can be a jerk in a virtual world, but you have to do it "in person". Most of the social pressures of real life work. Troublemakers can be talked to before things reach the ejection stage.
Many of the problems on forums come from being able to post blind, with no interaction during posting. What you posted persists, and is amplified by "sharing" and search engines. Virtual worlds lack that kind of amplification. You can gather a crowd, if you wish, but they don't have to stay around.
It's not perfect. There are still jerks. But being a jerk in a virtual world does not scale. Space is what keeps everything from being in the same place.
In both your original comment and in this followup, which makes the point explicit, I find it ironic that SL have addressed one of the key challenges of "spaceless" cyberspace ... by reinventing the concept of space. Which, as you say, "keeps everything from being in the same place".
That's not a criticism. There's a considerable degree of respect. There's some irony in that spacelessness is one of the purported advantages of the online world. It also makes me wonder what, if any, other options might exist, because effective tools for combatting abuse seem scarce, and those that do exist either brittle or capricious.
For some reason I remember everyone's behavior on old-school message boards as much better than on modern social media. Sure, you had your degenerate boards, but you could just not go there. Moderation and censorship will always exist, but they seem to work better when they are more locally applied.
I remember it being a huge mix! It really depended on what boards you were on. There were boards I was a member of in 2003 that had very strong moderation and they were great!
I was also on some basically unmoderated boards and saw some stuff I wish I didn't see.
I think this is more indicative of the communities you were a part of than the actual behavioral norms of people at the time.
Echoing the same sentiment Counter Strike was exactly the same. There was an insane diversity of servers. Some were literally labeled Adult Content and way on the other extreme some were 'christian' where saying the word "shit" would get you banned.
A true sense of authority keeps everyone at bay; if a mod goes rogue, it all collapses. But when a mod can be held accountable for their actions, everyone acts as a community and holds the peace. A transparent modlog could really make a community.
Clans were more than just a bunch of mates playing a game. They were free, open communities where everyone was treated with respect regardless of who you were. Q3Arena was my first FPS at 13 and I fell in love with just the community spirit.
Organized clan wars between X and Y, joining rival clan servers just to poke around and have fun, are days which are now lost. It's the same experience as inserting a VHS cassette and hitting play, knowing you were going to get a real feel of an experience.
I may have hit the tequila a bit too hard tonight and this really hits hard, but I do wonder if the same experience will ever make a comeback.
I think this is because old school message boards had a sense of community between existing users and there wasn't a large influx of users at any single point. If one person comes in and starts running amok it's easy to just ignore them, tell them off or ban them. But now there's less persistent forums that people are a part of, so there's a lack of community standards that people just naturally gravitate towards. There's no overall community between, for example, people who comment on youtube, so any youtube comment section is just whoever happens to stumble across it.
Having run a platform of a million or so of those, this is somewhat true. But there were spammers posting across many communities, and the forums whose mods left got littered. We had to set those forums to automatically switch to requiring moderator approval to post, which at least preserved the board's history, but was a pain for a moderator if they returned.
> Second Life has real estate with strong property rights.
Property rights is the wrong framing. We know this from the social sciences. The actual solution is just ownership.
This is the problem with public spaces in big cities compared to close-knit smaller communities: if it "belongs to everyone" it doesn't belong to anyone, if it is owned by the city/state/government, it's not actually owned by the people.
I'm not saying framing it as individual property rights doesn't work, I'm just saying it's too narrow an interpretation and a bad analogy, e.g. because property rights can allow for layers of indirection which erode this effect whereas ownership can be shared and still maintain the effect (although a cynic would then frame it as "peer pressure").
> Roblox is working on a "moderation" system that can ban a user within 100ms after saying a bad word in voice. But their average user is 13 years old.
Reminds me of Xbox Live more than 10 years ago. They banned the word "Gay" since (by their estimate) it was used as a slur by 98% of users.
But there was a two percent population that simply used it legitimately. [0]
It's really funny: every time a "we don't censor" platform pops up catering to the American right, they speed-run going from "moderation == censorship" to "we're moderating our platform" in record time. Turns out moderation is really important to making a platform for a community.
If you create a platform with absolutely zero censorship, you become a repository for child porn. I participated in Freenet many years ago because I liked its ideas (And thought it would have been a nice way to pirate games without my ISP being able to know), but it got a reputation for being used for CP, and I promptly deleted it, because I want no part in that.
If you merely censor illegal content, you will become a home for disinformation and ultra right-wing conspiracies. See Parler.
In either case, and especially the first, you're likely to get kicked off your hosting platform and get a lot of attention from the government.
I don't think it's possible to create a "we don't censor" platform without hosting it in some foreign country that doesn't care about US laws and also hiding that you're the one who runs it.
I think the more general rule is that if your platform is advertising itself as an alternative to an existing dominant platform but that doesn't censor "X", the kind of people who'll flock to you will mostly be those who value X over any other interaction on the dominant platform.
Most moderates don't feel restricted enough in their speech to use Twitter even if they may have some racist ideas they don't feel safe stating there, but literal white supremacists who want to talk about the superiority of the white race all day long will love your alternative that doesn't censor them.
But if those white supremacists now all flock to your platform you likely now have a platform mostly consisting of white supremacists. This will naturally limit who else wants to join your platform because if they don't like being around tons of white supremacists they don't stick around long enough to build a counterbalance.
This can also happen after you think you've already established a more nuanced community as with your example of Freenet.
It turns out that while some people may seek opportunities to talk about X, others will not want to share space with discussions about X, often because of what it means to their own safety but also sometimes just from fear of association.
If you have a fun social network and it has a prospering community of nazis in it, you're now the nazi network, no matter how much of your community is about other things than being a nazi.
Gab may be a more useful point of reference, but what's hilarious about right-wing platforms is that their censorship is fine; it's other people's censorship that's the problem. (Similar to how immigrants having fake papers is wrong, but having a fake vaccination card is sticking it to the man.) Go post some pro-vaccine or pro-mask-mandate things and see how long you last before being deplatformed.
I saw the author had moved to either a self-hosted domain or Ghost, but didn't want to risk hugging them to death... however unlikely it is that anyone would click on my random link.
Reading your comment history I would hypothesize you generally follow a coherentist/constructivist school of thought. I'm drawing this from repeated efforts to ensure alignment and clarity of actual meaning, especially in cases where vagueness in the language used by others might induce misunderstanding by someone who only gave a cursory read to a concept. You value whole concepts rather than individual facts and seem to approach things with the idea that your understanding grows over time through ongoing experience - i.e., understanding is constructed.
I would highlight this quote: "I instantly started learning very useful features reading the GNU docs. (I still need to fully internalise those). Yes, the full manual is very much better than the manpage."
No one has ever actually taken me up on this before...
> If you merely censor illegal content, you will become a home for disinformation and ultra right-wing conspiracies. See Parler.
What do you count as disinformation and why is it a problem? If you disagree with something you can ignore it and move on, or engage with it and respond with your own counter-argument. It doesn't seem like a problem that reduces the viability of the entire platform. It is also strange to me that you seem to think a lack of censorship only favors "ultra right-wing" conspiracies. I saw a lot of disinformation about policing being spread throughout 2020 without much evidence. Those who pushed those narratives did not face any moderation for their misinformation. I recall as well when Twitter, Medium, and others banned discussions of the lab leak theory. The pro-moderation crowd unwittingly aided in the CCP's avoidance of accountability and smeared some very rational speculation as disinformation. I don't think I want anyone - whether the government, powerful private companies, or biased moderators - to become the arbiters of permitted opinions.
> In either case, and especially the first, you're likely to get kicked off your hosting platform and get a lot of attention from the government.
It's also curious to me that you mention Parler, because January 6th was organized more on other platforms than on Parler. Silicon Valley acted in unison against Parler because they share the same political biases among their leaders and employees, and because they share that degree of monopolistic power (https://greenwald.substack.com/p/how-silicon-valley-in-a-sho...). The darker part of this saga is that sitting members of the US government pressured private companies (Apple, Google, Amazon) to censor the speech of their political adversaries by banning Parler (https://greenwald.substack.com/p/congress-escalates-pressure...), in what can only be called an abuse of power and authority. When tech companies are facing anti-trust scrutiny and regulatory pressure on other issues, why wouldn't they seek favor by doing the incoming government's bidding and deplatforming Parler? I feel like the actions observed in the Parler saga are less about moderation and more about bias and power.
It is a problem because you'll get thrown out by your hosting and other service providers if you don't moderate your content; so if you want to keep running your service, not moderating is simply not a practical option. That is why Parler is mentioned, they are a demonstration that it's not practical to keep operating without accepting a duty to moderate (as Parler did eventually) even if you try really, really hard.
And while there are a lot of conspiracies, all of which will be on your site if you don't moderate, most of them will be tolerated by others but it's the ultra right-wing conspiracies / nazis / holocaust deniers that will cause your service threats of disconnection; so you'll either start moderating or get your service killed in order to protect them.
I understand you don't want anyone - whether the government, powerful private companies, or biased moderators - to become the arbiters of permitted opinions; however, you don't really get to choose (and neither do I); currently there are de facto arbiters in this world.
I won't get into the argument of "Well then who is the arbiter of truth?" because honestly, I don't have an answer. It can't be the government for obvious reasons, but it also can't be private corporations, and certainly can't be the general public. That leaves...nobody. Maybe a non-profit organization, but even those could easily be corrupted.
> and why is it a problem?
Nearly 700,000 US deaths from COVID so far, a number that continues to rise due to anti-vax disinformation convincing people to not vaccinate.
Disinformation is literally killing people by contributing to the continued spread of a pandemic. It's absolutely insane to me that you would genuinely ask why disinformation is a problem.
Just because I don't have a solution to a problem doesn't mean the problem doesn't exist.
> If you disagree with something you can ignore it and move on, or engage with it and respond with your own counter-argument.
If this was an effective approach, Tucker Carlson would have been off the air ages ago, QAnon would have been dismissed as a crackpot by everybody, and disinformation wouldn't be a problem.
> Disinformation is literally killing people by contributing to the continued spread of a pandemic. It's absolutely insane to me that you would genuinely ask why disinformation is a problem.
I wonder if you ever considered the possibility that not everything that goes against your current beliefs is disinformation.
> I wonder if you ever considered the possibility that not everything that goes against your current beliefs is disinformation.
Speaking only for myself, I try to consider such possibilities constantly. Unlike what passes for Trump-era conservatives, I don't like to be wrong. If you correct a mistaken belief of mine, I'll thank you for doing me the favor.
Difficulty: Your faith's no good here, nor is your money. You'll need to bring evidence.
I'm open to having my mind changed, but it takes rigorous testing and peer review. I'm not going to have my mind changed by some talking head on the TV or some dude in sunglasses recording a video from his truck.
If I get COVID, I'm not going to demand Ivermectin just because Joe Rogan told me to.
But you're dodging the issue at hand. COVID has killed nearly 700,000 Americans and over 4.5M people worldwide. Vaccines slow the spread of COVID and are safe. This is all verifiable information. Claims that the vaccine is ineffective or has deadly side-effects are disinformation. Not because I disagree, but because extensive study and peer review points to those claims being false.
When you disagree with scientific studies that have undergone extensive peer review, it's not disagreement. You're just wrong.
If you don't mind answering another question, why the hate for Tucker Carlson? I don't follow him but people have complained about him so much that I've now seen a few videos to see what the fuss is about. I didn't see anything wrong within those few clips I saw (maybe an hour's worth) - it didn't seem any different from any other mainstream news in that one side of the argument was being presented, with a lot of conviction. But I did not see misinformation. I am sure there's some non-zero amount of misinformation that can be found from scouring his clips, but that's true for anyone and any source, and I certainly don't think he should be "off air" for it. I can't help but think that a lot of the character attacks against him are simply made because he's a prominent and successful voice on the "other side", and his effectiveness is a risk to political adversaries.
As someone who wants to see Tucker Carlson off the air, do you see your position on the matter differently? Are there conservative voices you support being platformed, and what makes them different for you?
It's not a good sign when the "News" network you work for has to go to court to argue that no reasonable person would take you seriously.[1]
Just over the past year, Carlson has peddled such obvious falsehoods as claiming the COVID-19 vaccines don’t work, the Green New Deal was responsible for Texas’ winter storm power-grid failure, immigrants are making the Potomac River "dirtier and dirtier,” and that there’s no evidence that white supremacists played a role in the violent Jan. 6 Capitol riots. [2]
We don't need this guy on the public airwaves. He should get a blog... that is, if he can find a hosting provider who will tolerate his views.
> If you disagree with something you can ignore it and move on, or engage with it and respond with your own counter-argument.
I sympathize with a lot of what you wrote (and didn't downvote it), but the Trump era has highlighted a serious problem with your specific point above. The marginal cost of bullshit is zero. It takes basically no effort to post more of it, while it always takes at least a small amount of effort to debunk it.
Worse, the bullshitter usually has the first-mover advantage. To claim the initiative, they only have to post a new thread predicated on far-right propaganda or conspiracy theories or hijack an existing one. Once lost, the rhetorical high ground is difficult and time-consuming to reclaim. As soon as you argue with the shitposter, they effortlessly shift their role from aggressor to victim, as some would suggest is happening in this very conversation.
I've always maintained that the antidote to bad speech is more speech. A few years ago I would have died on this hill at your side. But principles that don't work in practice are useless... and this one, having been tested, simply doesn't work in practice. The sheer quantity of bullshit has an ironclad quality all its own.
The consequences of a toxic media landscape at scale are felt by those considerably removed from the platform(s) themselves. Twitter cheered the Arab Spring protests. Facebook are attempting to distance themselves from their very active role in the Myanmar genocide. History shows that changes to media landscapes are often very disruptive socially and politically, often at a horrific cost of innocent lives.
How is that funny? Every platform has to block illegal content. Every platform wants to block low value content like spam. Many platforms want to block obscenity and pornography. None of this is in any way news to any of the platforms you’re alluding to.
The interesting distinction between platforms is not whether they moderate, but what lawful and non-abusive (of the platform itself) content they permit.
Edit: Child is incorrect. The vast majority of moderation on free speech platforms is criminal threats and other illegal speech.
The irony comes from the fact that their moderation almost always falls into two categories:
1. They have to moderate the content that got them kicked off the original platform, because it turns out nobody wants to buy ad space on a forum dedicated to why the Jews are responsible for all of the world's evils; and
2. They choose to moderate dissenting political opinions, which is just bald hypocrisy.
It's not illegal content I'm talking about. I'm specifically thinking of sites like Parler and Gab, which very loudly and specifically start out as anti-censorship of anything legal, aimed at members of the American right who feel they're being censored off the regular platforms. Then they quickly learn that no moderation means you'll get absolutely flooded with trolls who aren't fans of your chosen ideology and are willing to spam and troll you. That quickly pushes them to start actually moderating, the exact thing they were created in opposition to, because what they're actually mad about is specific moderation decisions, not the idea of moderation in general.
Exactly. They were never actually mad about moderation in general, they just didn't like getting their stuff modded off the platform. They're fine doing the same thing to others.
Indeed, all of these platforms have been "free speech" but banned pictures of naked women from their inception. They're not free speech in any way, they're just okay with racism.
Can't help but think back to W. Edwards Deming's distinction between after-the-fact efforts to "inspect" quality into the process -- as opposed to before-the-fact efforts to build quality into the process.
OP offers a first-rate review (strategy + tactics!) for the inspection approach.
But, the unspoken alternative is to rethink the on-ramp to content-creation privileges, so that only people with net-positive value to the community get in. That surely means a more detailed registration and vetting process. Plus perhaps some way of insisting on real names and validating them.
I can see why MVPs skip this step. And why venture firms still embrace some version of "move fast and break things," even if we keep learning the consequences after the IPO.
But sites (mostly government or non-profit) that want to serve a single community quite vigilantly, without maximizing for early growth, do offer another path.
Absolutely this. Don't build the ship and then run around plugging leaks --- plan out the ship well enough to prevent leaks in the first place.
This is hard, and rare, because it requires predicting how all sorts of different people are going to interact with the community. Traditionally, this hasn't been something that the people who start software companies are particularly interested in, or good at. And a laser focus on user growth only compounds the problem.
Maybe when the Internet was new. But whether you count the Internet's birth in the 1980s with the original cross-continent and cross-country links, or around the first dot-com boom and bust in 2001, or with the iPhone in 2007, we know how "the Internet" is going to interact with "the community". We knew this back in 2016 when Microsoft released their "AI" chatbot on Twitter, Twitter taught it to be a racist asshole in less than 24 hours†, and the Internet collectively said duh. Of course that was going to happen.
Anyone who's started a new community these days knows they have to start with a sort of code of conduct. That's non-negotiable these days. Would it be better if platforms like Discord did more to address the issue? Absolutely.
You're totally right it isn't easy - but the Internet's a few decades old by now and we know what's going to happen to your warm cosy website that allows commenting. The instant the trolls find it, you either die an MVP or live long enough to build content moderation.
If your blog required a “real ID” to post content rather than allowing anonymous comments, would we have the same problem? The premise of the GP (and one I share) is that the internet’s content moderation problems are symptoms of default anonymity. Twitter is default anonymous, so nobody’s reputation is at stake when they teach a neural net hooked up to the Twitter firehose to be a racist asshole.
In my experience, Facebook comment threads (while still awful at times) are very different from e.g. YouTube or Tumblr or Twitter comments. Sure they still devolve and become a mess sometimes, but from what I remember there was noticeably less "hard" trolling (i.e. 4chan-style derogatory) on Facebook. People still troll lightly on Facebook, but not so much in the explicitly derogatory and assholish ways found in communities where expendable identities exist. In any event, because people use real IDs on Facebook, we can and do impose consequences. Remember when everybody used their first and middle names so jobs wouldn't see their underage party pics with substance use?... And then when everyone's parents and grandparents joined, people just stopped posting that stuff altogether and Facebook "grew up".
In this case, a mix of both is required. While you absolutely must plan ahead and implement as many safeguards as you can prior to launch, that's simply the beginning, and it is incredibly naive to think that all the leaks can be prevented. (Or, honestly, that really any aspect of a community can be perfectly master-planned in advance.) To operate anything like a UGC platform is to be eternally engaged in a battle against ever-evolving and increasingly clever methods someone will come up with to exploit, sabotage, or otherwise harm your platform.
This is totally fine -- you just need to acknowledge this and try not to drop the ball when things seem like they're running smoothly. Employing every tactic at your disposal from the very beginning should be viewed as a prerequisite, one that will start you off in a strong position and able to evolve without first having to play catch-up.
A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.
In HN articles where we discuss social media moderation there's often this idea that "they shouldn't be doing this at all". But I think for most companies and users ... they won't like what a completely moderation free site looks like.
So here we are with this painful problem.
I kinda wish there was an imaginary "real person with an honest identity" type system that did exist where we could interact without the land of bots and dishonest users and so forth. But that obviously brings its own issues.
> I kinda wish there was an imaginary "real person with an honest identity" type system that did exist where we could interact without the land of bots and dishonest users and so forth. But that obviously brings its own issues.
That sounds like it could be done in a way that isn't terrible. As a user, you sign up with an identity provider by submitting personal documents and/or doing an interview to prove that you're a real person.
Then, when you sign up to an app/service, you login with the ID provider and come up with a username for that service.
The ID provider does not give the website operator any of your personal information; they just verify that you exist (and that you log in through their secure portal).
The identity providers could further protect privacy by automatically deleting all of your personal documents from their databases as soon as the verification process is complete. They could also have a policy to not store any logs, such as the list of services you’ve signed up for.
This could still be gamed (e.g., a phone scammer tricking someone into getting verified with a provider to obtain a valid ID), but it'd make things much harder and costlier.
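To make the flow concrete, here's a rough sketch of the moving parts I'm imagining; all of the names (IdentityProvider, VerificationToken, Service) are made up for illustration, not any real API:

```python
import secrets
from dataclasses import dataclass

@dataclass(frozen=True)
class VerificationToken:
    """Opaque proof that the ID provider vetted a real person.
    Carries no personal information, just a random value the
    provider can later confirm it issued."""
    value: str

class IdentityProvider:
    def __init__(self):
        self._issued = set()  # tokens this provider has vouched for

    def verify_person(self, documents) -> VerificationToken:
        # Check the documents / run the interview, then discard them.
        # (Per the idea above, nothing identifying is retained.)
        token = VerificationToken(secrets.token_urlsafe(32))
        self._issued.add(token.value)
        return token

    def is_valid(self, token: VerificationToken) -> bool:
        return token.value in self._issued

class Service:
    """An app that only needs to know 'this is a vetted person'."""
    def __init__(self, id_provider: IdentityProvider):
        self.id_provider = id_provider
        self.users = {}

    def register(self, username: str, token: VerificationToken) -> bool:
        if not self.id_provider.is_valid(token):
            return False
        self.users[username] = token.value  # no personal data stored
        return True

# Usage: the service never sees documents, only the opaque token.
idp = IdentityProvider()
token = idp.verify_person(documents=["passport.pdf"])
forum = Service(idp)
print(forum.register("throwaway123", token))  # True
```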
Am I missing anything obvious that would make this a terrible idea?
Your idea becomes useless for deterrence from the "automatically deleting" point. The only benefits of having users who are a "real person with an honest identity" accrue when people either can't make many accounts (so banning a user actually bans them instead of simply making them get a new account) or when you can identify them in cases of fraud or child sexual abuse material or some such.
So at the very least the "identity provider" absolutely needs to keep a list of all real identities offered to a particular service; otherwise the bad actors will just re-verify some identity as many times as needed.
But if you give up the "hard privacy" requirement, then technically it's possible. It would also mean that the identity provider would sometimes get subpoenas to reveal these identities.
Yeah, but making it difficult to create multiple accounts increases the power of a ban. It may not catch a CSAM distributor if the ID provider doesn't keep logs, but it will make it easier to prevent CSAM (and other illegal stuff and spam) on a website/app since it's much harder for those people to create multiple accounts.
> So at the very least the "identity provider" absolutely needs to keep a list of all real identities offered to a particular service; otherwise the bad actors will just re-verify some identity as many times as needed.
The website operator is the one independently banning the ID, not the ID provider. Sure, someone could create multiple IDs on multiple ID providers to dodge those bans, but that’s harder to do, and there could also be a fee charged for each verification request.
I could see this also providing ID services for certain niches, like if you want to create a forum for registered nurses only, you could allow registrations only from ID providers that check those credentials. (Or give verified users a special badge, etc). The damn thing could be gamified.
I just worry that, in the real world, this idea will devolve into the dystopian nightmare that is China's social credit system, no matter how hard anyone tries to prevent it. All it would take is one well-placed “think of the children!” argument to justify invasive tracking and data collection/retention.
…but man, if this world didn't suck so bad, a system like that could really help clean up the internet.
Well, this sounds very similar to PKI and domain registrars, so the issues are pretty much the same.
But like PKI, we could have ID providers delegated by higher authorities; the roots could be governments, which could delegate to smaller entities, and so on. Not too different from passports or licences.
Governor Dao has a better working system for this already.
Authenticated anonymous identities. They are currently targeting the NFT space, but this tech could also be applied to online communities.
https://authentication.governordao.org/
Without any background info, that looks like a joke (or scam) site. It asks you to record yourself repeating “The best things in life will always be free”, like some demon is going to kill you in 7 days if you don't share this email, lmao.
Idk what Governor Dao is, but using blockchain tech for identification sounds interesting. However, the tech is seemingly always associated with scammers and criminals, so idk how successful it would be as the backend for a system based on trust and sharing sensitive data.
The thing I was thinking of when it comes to the ID provider is the potential power an ID provider has... if say you wanted to be validated, but still have some anonymity in some cases ... or just if they decide to invalidate you for any given reason.
Granted all very hypothetical stuff, I'd give it a spin for sure.
I suspect that even if we could we would wind up with a disturbing "reverse Turing test". The dark suggestion that there isn't any difference between us and other malfunctioning machine learning arrangements trained by huge data sets. They may be objectively homo sapiens with honest identities devoted to something which makes them act indistinguishable from a bot.
> they won't like what a completely moderation free site looks like.
So you say, but we've never actually had a chance to see one. We have seen content moderation slippery-slope its way to highly opinionated censorship... every time it's been tried.
There used to be a fair chance that you'd stumble on CSAM on 4chan. Without filtering and aggressive moderation, that's what ends up happening (yes 4chan did have moderation to delete the stuff back then and dish out IP bans, but it wasn't fast enough to save people from seeing those things)
One alternative to moderation is to make people pay to post. If you set the price high enough, you'll dissuade most spammers and some other species of asshole. Unfortunately, people who hate moderation also tend to hate paying for online services.
True, this keeps the community small and those work really well without moderation.
The moment it becomes a bit more popular, shit hits the fan. I am still for a light approach. Banning spammers, trolls, and illegal porn posters is a separate issue for me, but many platforms began to delete unpopular opinions.
The most heavy-handedly moderated places are very often the most toxic ones; I think they are far worse than the free-for-all examples. Granted, that might not work for the masses.
This is false. 4chan, some early reddit, voat, and a number of other sites.
Moreover, that's the point: it's as impossible to have a healthy anonymous forum available to the world as it is to have a large society with no laws or government that isn't dystopian.
No moderation = porn, gore and nazi discussion. Period.
"Light" moderation often = bad-faith trolls taking advantage of your moderation.
HN does a good job of moderating for civility and effort of the post, rather than ideology.
Here's the thing: Talking about controversial topics, or debating with people who think differently than you, takes a lot of effort online, because there are so many trolls and others baiting you into defending a position without putting any effort to explain their own.
So yeah, heavy moderation is an unfortunate necessity in some forums.
I don't see it as an unfortunate thing for private spaces. It's very natural, and realistically the span of opinions where you can get productive dialogue going between people is not infinite. You're never going to get any useful discussion from, say, anarchists and neo-nazis talking to each other.
It's not just reddit, though - if you're going to start moderating, you have a meta-problem to address which is what to do about opinionated moderators.
It also puts the moderator under pressure. Users will start issuing demands about what you should ban. In the worst case, law enforcement will too.
I don't really understand why people waste so much time getting something removed instead of just reading something else, but there are whole communities that live for banning others by now.
In a larger legal scope platforms are under heavy scrutiny by moral busybodies right now.
All judges are opinionated. They're humans. I don't see why you're trying to deny humanity to judges.
Read the above and reconsider if your argument makes sense. It stands to reason that if you have any position of "power," even if it's just "random internet moderator," you should at least try to be fair, consistent, and reasonable.
The problem is where the line is drawn. Arguments against explicitly illegal content (like CSAM) or unhelpful content (like spam) are used to justify content moderation that goes far beyond. Moderation on the biggest social media platforms (like Facebook, Twitter, TikTok, etc.) includes more than just basic moderation to make the platform viable. They include a number of elements that are more like censorship or propaganda. These platforms ultimately bias their audience towards one set of values/ideologies based on the moderation policies they implement. And given that most of these companies are based in highly-progressive areas and/or have employee bases that are highly-progressive, it is pretty clear what their biases are. This present reality is unacceptable for any society that values free and open discourse.
Propaganda and bias are ultimately subjective notions. Suggest that kings are no different from the rest of us today aside from what their parents did to seize power and it would be slammed as wicked propaganda denying their divine right. Heck just suggest that disfavored groups are people!
I think the issue is more that these are for-profit corporations who care almost exclusively about advertising revenue. Now, if 10% of your 'product' (i.e., your users) is driving away the other 90% of your product, and you can no longer sell them to your advertisers, well? I'm guessing it's more a business decision; they've decided that eliminating 'fringe views' is best for business.
It's kind of like an advertising dystopian situation, where you want this homogenized audience (preferably rather dumbed down) who will eagerly buy whatever is advertised. And, you don't want to scare any of them away.
Notice this view is not partisan or ideological, it's just pure capitalism, how can we wring the most value out of this setup for the shareholders, that's all that matters. The end result is the current mess we now have, in which nobody but the advertiser is all that happy.
It’s not covered in this post, but IP infringement is another moment in life when content moderation becomes necessary. You have to be above a certain scale for large IP owners to notice or care, but if you’re growing and allow users to upload media, you’ll eventually need to start handling DMCA requests at minimum.
Also worth noting that the infamous Section 230 is what allows companies to take these sort of best-effort, do-the-best-you-can approaches to content moderation without fear of lawsuit if they don’t get it perfect.
We built a service [0] to cover copyright infringement and other forms of abuse, including CSAM and IBSA. While in the US the DMCA is enough, the EU has a new law [1] that goes way beyond the requirements of the previous law (the E-Commerce Directive, which is analogous to the DMCA).
What is infamous about it? People have to lie constantly to attack it and create outright alternative universes with distinctions that don't really exist.
stream.new seems really cool. However there is no account button to see all of your video URLs, or a download option for the video. If there was, I would probably make it my default (not sure if that is what you want)
Right now the lightweight utility aspect of stream.new feels right, but if we continue to build it out as a standalone free product, then adding the concept of an "account" with saved videos makes a ton of sense.
One thing that wasn't obvious to me is, why did you care about uploads of NSFW? As I understood it, you want to become Imgur of video. Imgur only became so big because they allowed NSFW stuff.
Not involved with this project, but there's a couple big reasons most would care about this.
* Child porn and similar content that is a level beyond simply "NSFW"
* Uploaders of NSFW stuff are always in need of a new platform they haven't been kicked off yet, and newer platforms are likely to be dominated by this type of content. Unless you want your platform to gain a reputation as the place for mostly NSFW content, you probably don't want this.
* Porn is almost always posted in violation of copyright.
* Hosting porn opens you up to legal issues if you can't verify that everyone involved is adult and consenting.
* Payment processors, hosting companies and other service providers you rely on usually have strict policies excluding porn.
And that's just addressing legal pornography, not other "NSFW" content like child abuse, animal abuse, general violence or gore. If you run a large enough public user generated content service people will use it to distribute illegal content or flood it with jihadist execution videos to ruin someone's day.
If a site is specifically intended for, or is incidentally used in the course of, professional work, then NSFW content can literally get your users fired if it is seen, logged, or otherwise detected.
That turns out to shift your user demographics in the medium-to-longer term, and not in a direction that's generally compatible with high-quality and engaging content and interactions.
Where NSFW content is permitted, it should be specifically tagged as such, and not presented unless users specifically opt in to it.
Great write up! Curious if you/Mux have ever considered offering content moderation alongside content hosting? Seems like most platforms do one or the other, but I imagine you could charge quite a premium if you offered both in tandem.
Another particular headache as you get bigger is that humans are better for accuracy, but less consistent. No matter how specific you think your rules are, there will be edge cases where it isn't clear what side they're on. Different human moderators may rule differently on the same content, or even the same moderator at a different time of day. When users find these edge cases, inevitably somebody will get upset that you blocked X but not Y.
And then you have to keep the actual abusive users from figuring out a way to leverage moderator X usually approving their just barely over the line images if they ever figure out how your approval requests are routed.
Interesting article. I wonder how many UGC platforms got away with a third-party solution to help their moderation. Feels like a core part of the business that directly impacts whether it sinks or floats.
> It’s the dirty little product secret that no one talks about.
Hmmm. I’d say it’s the first thing people have in mind when UGC comes on the table. A bit like how nobody thinks lightly of storing credit card info; that’s part of the culture at this point, I think.
The opt-in-to-see-comments and user-based-approval method has been tried before. It's just pre-banning, and it always fails to gain steam on websites because it achieves what banning aims to achieve: it places a long artificial delay on time-sensitive posts. Many better places with more rewarding interactions exist to achieve the same end.
Something tells me that these roast-posts (or "this shouldn't be in anyone's feed"-posts) are not going to attract any interesting talk. There is no fun in shouting moral disapproval in an empty chamber.
Agreed. I don't want to gain steam though. I want to facilitate meaningful, one-on-one, anonymous discussion, through minimal interactions with my website. Showing the resulting discussions to the general public only comes second.
Flame wars, abuse, etc. only happen if there are other people watching. Why would anyone try to abuse someone about whom they have next to zero info, for no personal gain, when no one can witness their triumph anyway?
I was replying to your assertion that ‘Flame wars, abuse etc. only happen if there are other people watching’. DMs are an example of a major abuse vector where other people are not watching.
I understand. However, on my platform there isn't even a username, so it is very hard to follow people around. Also, there is a one-reply-per-node limit and a per-user comment limit per day to cap the volume.
Anyone must be able to take a small amount of abuse to survive this world. I can't eliminate abuse but I can increase the friction of abuse and make abuse less satisfying or rewarding. That's all I want to do.
There are genuinely mean people, and people financially or psychologically incentivized to be mean. Most, if not all, mean people I encountered in my life belong to the second group. Maybe you live a much more interesting life than mine. I dunno.
You will get dominated by bot-generated spam at the very least. The bots don’t care that their content will be seen by one person, they just hammer every form they find, and the recipients will be overwhelmed having to delete this, and they will criticise your platform.
As well as bots, you will get people manually hitting it. They’ll often use software customised to spam you.
There is zero chance this model survives without any sort of moderation.
I am flattered. Let's say my platform somehow gets popular, and bots are hitting it for no financial reason at all; they still get the one-reply-per-node limit. Also, I have a global per-user replies limit per day.
If you have an orchestrated army of bots hitting me, then I would get DDoS'ed, just like everyone else. However, that has nothing to do with content moderation at all, right?
You don't need to be popular or flattered. If you are linked from anywhere and in search engines, they will find you. And they will reply to every node they can. And they'll create new user accounts to submit more.
I've gradually built my setup to filter based on words and URL fragments, block IPs, use honeypots, shadowban, etc., and they still get through. Take away all content moderation and I would be overrun to the point that my site was useless to my users. Those sorts of methods are the easy bits too; dealing with grey-area trolls is the worst.
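For the curious, the layers stack up roughly like the sketch below (heavily simplified, with made-up word lists, URL fragments, and IPs; the real setup has far more lists and special cases):

```python
import re

BAD_WORDS = {"viagra", "casino"}            # illustrative only
BAD_URL_FRAGMENTS = ("bit.ly/", ".xyz/")     # illustrative only
BLOCKED_IPS = {"203.0.113.7"}
SHADOWBANNED_USERS = {"spammer42"}

def check_submission(user, ip, text, honeypot_field):
    """Return 'reject', 'shadowban', or 'accept' for a new post."""
    # 1. Honeypot: a hidden form field that humans never fill in.
    if honeypot_field:
        return "reject"
    # 2. Blocked IPs.
    if ip in BLOCKED_IPS:
        return "reject"
    # 3. Word and URL-fragment filters.
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    if words & BAD_WORDS:
        return "reject"
    if any(frag in text for frag in BAD_URL_FRAGMENTS):
        return "reject"
    # 4. Shadowban: accept the post but show it only to its author.
    if user in SHADOWBANNED_USERS:
        return "shadowban"
    return "accept"
```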
Anyone in this world has to be able to take a small amount of abuse to survive. I am not trying to eliminate abuse, I am trying to take away the positive feedback loop that amplifies injustice.
It is like email. People can spam you. But if they know nothing about you, the spam never gets acknowledged, each spam attempt incurs some small delay and inconvenience for them, and they can never reap any gain, then eventually they will stop.
I considered doing a moderation as a service startup a couple years ago. I didn't end up doing it because came to the conclusion that global communities aren't the future. I think platforms supporting more silo'd communities that make and enforce their own rules are how it will look. Discord and Twitch use this model, and while they have their problems, the problems look quite different from the ones outlined here.
Silos work really well. Discord may have literal terrorist / hate groups on it, but as a user I do not see any of it and the platform only contains a group of my friends. There is no marketing, no spam, only content your friends have posted. And then discord can sit in the background deleting ToS violating groups while I don't see or care about any of it.
> platforms supporting more silo'd communities that make and enforce their own rules are how it will look
Those individual communities need help enforcing their own rules.
I would happily pay a couple bucks per months for a bot that could warn and ban people on my discord server.
I agree that this is the future. I even feel US politics has suffered from being hoisted onto a global stage. When everyone on the internet (the globe) can weigh in on or muck with the politics of your smaller community (country or state), you’re going to get into situations that make it hard to practically make decisions and run a country. One of the foundational principles of the US is the ability to justifiably oppress minority factions for the good of the majority, but checked by systems of power distribution so that it’s not simply mob rule, and limited so as not to impinge on a set of inalienable rights afforded to all citizens. Yet on the global theatre the assumption is that minority opinions now take precedence over the majority. And what’s worse, 100 people screaming on Twitter now has the same impact as 200,000 marching on Washington (to be clear, 200k marching on Washington is significant and should matter; 200 people on Twitter should not).
So what? Well, now, when we need to oppress minority factions more than ever in the face of a public health crisis and tell people sorry, suck it up, you live in America, where the majority says to mask up and get vaccinated if you want to be in public, we “for some reason”, at critical moments in curbing the spread of the pandemic, fumble around for months on end because a few anti-vaxxers all of a sudden have infinite civil liberties and a global platform (note: one that they didn't have when we solved previous public health crises). My fear is that we’ve become a society of “piss off, I can do what I want” rather than one of calculated and ideally minimized oppression.
I also don’t understand as a society why we have to hold platforms accountable for content. If the problem is a bunch of illicit material showing up, implement KYC requirements so that individuals are exposed legally to the consequences of posting illegal material. Anonymity is a tool/privilege to be used, not abused, and distinctly not a fundamental human right in the US. Make the default less anonymous (but still private, that is something we’re supposed to care about constitutionally) and I suspect a lot of content moderation problems go away.
> 100 people screaming on twitter now has the same impact as 200,000 marching on Washington
It doesn't, though, unless at least one of those 100 has a giant following; but “one person with a media megaphone is louder than 100,000 without” isn't new, it's older than radio competing with newspapers.
I believe a group took a look at how viral topics and memes emerged on twitter and simply having something like 50-100 people participate in your hashtag or retweet some content was enough to land you on trending. Their conclusion was that twitter disproportionately represents reality. Wish I could recall where I read this but it definitely made its way through HN in the last year or two.
And anecdotally, we see people de-platformed and/or removed from their jobs because some company’s HR department got wind of a Twitter stink and the company made the calculation that the feelings of a few people on Twitter are so meaningful that it warrants terminating an employee. Unless that employee was seriously not pulling their weight, there’s no way appeasing those people on Twitter is more valuable than the work your employee is doing plus the cost of hiring a new one. Take that Boeing guy, retired military, who was removed from his position because of a 15-year-old post about how he disagreed with putting women in combat roles for biological reasons…
Point is people’s reality is shaped by what trends on the internet and what trends on the internet is at global scale when you run a global platform/community. This global context is good in some regards e.g. the proliferation of cultural exchange and appropriation but can also be somewhat of a nightmare to manage when trying to solve logistical problems like moderating content or writing laws. I am dubious we should be entertaining “local” politics on global platforms e.g. twitter. Seems like an impedance mismatch to me.
While I think you make good points, my reasoning is far more simple: people just enjoy communities that feel smaller, more intimate, and police themselves. This is a human nature thing regardless of whether we're talking digital or physical worlds.
Our MVP needed content moderation. We put a database tool up and immediately, our first user started using it to create a public-facing database of his porn collection. It was... quite the collection.
As someone working on a platform with content moderation as a core feature, it is so much work. I thoroughly understand why so many platforms ignore it for so long.
Thankfully, we have some nice tools these days. I use Google's Perspective API to automatically hold back text input for manual moderation, which takes a lot of the man hours out of it for my moderation team.
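Roughly, the integration looks like the sketch below. This is from memory, so double-check the current endpoint and field names against Google's docs; the 0.8 threshold is just an arbitrary cutoff for illustration, and the API key is a placeholder.

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def needs_manual_review(text: str, threshold: float = 0.8) -> bool:
    """Hold a comment back for human moderation if Perspective
    scores it as likely toxic."""
    body = {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(URL, json=body, timeout=5)
    resp.raise_for_status()
    score = resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
    return score >= threshold
```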
The rest is handled by the users of the platform themselves, and metrics about content reports to curtail abuse.
LOL at the title, I think the silver lining is your moderation becomes a barrier to entry/competitive advantage if done well and kept hidden. It's one of the last things in software that's hard to copy.
There is an untold story here which is how incredibly well HN is moderated. Don’t know how it can remain so good. I feel like the center cannot hold. The site seems to me hideously understaffed, yet they do a pretty much perfect job of moderating. Would love to know if it is all human, supplemented by ML, or what.
Ages ago on This Week in Tech, Leo Laporte and perennial guest John C. Dvorak discussed the notion (in the context of Yahoo Groups at the time) that nearly every hosted content platform starts default-open to maximize the size of their user base but eventually passes over the "horse porn event horizon." Sooner or later, a critical mass of users with specific predilections would find the forum and use it for communicating on their, uh, topic of choice. And the owner at that point had two options:
- Do nothing and let the forum stay as open as it had been
- Crack down on that content, so the platform could ever be ad-supported or supported via mainstream sponsorship or partnership, or be purchased by a bigger company
One interesting approach I heard about in this domain is the Trolldrossel (German for "troll throttle") by Linus Neumann from the CCC. He implemented a captcha for a comment server that would fail with a certain probability when it encountered certain keywords in the comment, even when the captcha was solved correctly.
While I have no notes about the effects, and the corresponding talk seems to have vanished from the internet, it supposedly worked quite well by forcing 'obscene' comments through additional rounds of captcha without revealing that this was the reason they failed, thus demotivating the submitter.
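As I understood the idea, the mechanism amounts to something like this (my own reconstruction, not Neumann's actual code; the trigger words and failure probability are placeholders):

```python
import random

TRIGGER_WORDS = {"idiot", "traitor"}   # whatever counts as 'obscene' for you
FAIL_PROBABILITY = 0.7                 # chance of a fake failure per attempt

def captcha_result(comment: str, solved_correctly: bool) -> bool:
    """Return True if the captcha 'passes'.

    Even a correctly solved captcha fails with some probability when
    the comment contains trigger words, so the author just sees yet
    another failed captcha and (hopefully) cools off or rewrites."""
    if not solved_correctly:
        return False
    words = set(comment.lower().split())
    if words & TRIGGER_WORDS and random.random() < FAIL_PROBABILITY:
        return False
    return True
```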
I think this is a really bad solution that only reinforces bad practices of hiding content. Even worse than direct censorship, it generates more problems than it solves.
When everyone knows about it, I agree. As long as it is not mentioned anywhere, and people are not writing hundreds of comments, it's just a failed captcha that appears more often when someone writes something in range, possibly forcing them to rethink their argument. It's no solution for everything, more an interesting thought.
It drives me absolutely nuts when I encounter a video platform upstart that has not adequately prepared (or prepared at all) for the inevitable onslaught of undesirable and illegal content that users will soon start uploading if the platform has really any traction at all. No UGC site/app is immune. Even when prepared, it is an eternal, constantly-evolving battle as users find more clever ways to try to hide their uploads or themselves. If you aren't ready for it at all, you may never be able to catch up. And while a lot of the undesired content could just be really annoying to get rid of, some is catastrophic -- a user uploading a single video of something like child porn that is publicly visible can be the death knell for the platform.
I’m going to go ahead and refute some of the counterarguments I’ve heard a million times over the years just to get it out of the way.
“It could be a while before it’s necessary.”
People seeking to upload and share unsavory content are constantly getting kicked off every other platform for doing so, and thus are always on the lookout for something new to try where they might be able to get away with it, at least for now. They are the earliest adopters imaginable.
“Just let users flag content”
Lots of issues here, but here’s a couple big ones.
1. You cannot afford something like child porn to be visible long enough to be flagged, or for it to be seen by anyone at all. If something like this gets uploaded and is visible publicly, you could be screwed. I worked on a video platform once that had been around a couple years and was fairly mature. One video containing child porn managed to get uploaded and be publicly visible for about one minute before being removed. It was a year before the resulting back-and-forth with federal agencies subsided and the reputation of the platform had recovered.
2. People uploading things like pirated content tend to do so in bulk. You might see people uploading hundreds of videos of TV shows or whatever. It may exceed legitimate uploads in the early days of a platform. You do not want to burden users with this level of moderation, and actually they aren’t likely to stick around anyway if good videos are lost in a sea of crap that needed to be moderated.
“We’ll just use (some moderation API, tool, etc.)”
Yes, please do, but I’m not aware of anything that works 100%. Even if you filter out 99% of the bad stuff, if the 1% that gets through is kiddie porn, say goodnight. These tools get better all the time, but users who are serious about uploading this kind of stuff also continue to find new and interesting ways to trick them. As recently as 2017 a pretty big video platform I worked on was only able to stop everything with a combination of automated systems as well as a team overseas that literally checked every video manually. (We built a number of tools that enabled them to do this pretty quickly.)
“Content shouldn’t be moderated”
Child porn? Hundreds of pirated episodes of Friends instead of legitimate user videos? (Even if you are pro-piracy, you don't want to pay to host and serve this stuff, and you don't want it to distract from legit original content from your users.) What about when some community of white supremacists gets wind of your new platform and their users bomb it with all their videos?
Do not take this stuff lightly.
EDIT: I've spent most of the last decade as an engineer working on UGC and streaming video platforms
> 2. People uploading things like pirated content tend to do so in bulk. You might see people uploading hundreds of videos of TV shows or whatever. It may exceed legitimate uploads in the early days of a platform. You do not want to burden users with this level of moderation
Not to mention that viewers aren't likely to flag the complete discography camrip of My Little Pony unless they're stupid or have an axe to grind (either against the IP that was uploaded, piracy in general, or the specific uploader). The viewers are often drawn to platforms specifically because they are flooded with piracy in their early days.
Exactly. Setting aside all legal concerns and whatever anyone's philosophy is about piracy or moderated content, you still have the enormous concern about what kind of community you are fostering and what kind of people you are attracting based on what content you allow to be surfaced.
All the Reddit alternatives are each an example of why the early community matters so much. Being a piracy haven is probably the "best" outcome in terms of community-building compared to all the other common fates of low-moderation websites in growth mode.
> It was a year before the resulting back-and-forth with federal agencies
You're blaming lack of content moderation and not a law enforcement system that holds you responsible for something you had no control over when it actually failed to do its own job in this case?
> a law enforcement system that holds you responsible for something you had no control over when it actually failed to do its own job in this case?
Investigating these issues is their job. They don’t show up assuming the site operator is the guilty party, but they do need their cooperation in collecting evidence so they can pursue the case.
It’s analogous to a crime being committed on your property. They don’t show up to charge the property owner for a crime someone else committed, but they do need access to the property and cooperation for their investigation.
We weren't held responsible, but it was still investigated and required our cooperation and was not the best use of our resources. Honestly, the public reputation part was far and away the more unfortunate consequence.
Trust me, I have numerous concerns around the legal issues and the chain of responsibility, but what choice do you have? Are you going to start a fight with them out of principle and hope this works out in your favor? While still devoting the time and energy to the video platform you set out to build in the first place?
I agree. My employer has a moderation product (for comments, usernames, etc): https://cleanspeak.com/
I don't work with it much, but from what I can see it's surprisingly complicated to filter out comments quickly without impacting user experience. I guess you know you've succeeded when the pottymouths join your platform :).
These kinds of things never work because people will come up with infinite dog whistle terms for blocked terms. Things like "egg plant", "13%", etc. You couldn't possibly block all of these things without blocking a lot of legitimate discussion.
Hmm. Tell that to our customers, who seem to think it effective enough to purchase. :)
I think that you are not aiming for 100% automated moderation with a tool like this. Rather, you are looking to screen out the obvious stuff as quickly as you can, and then escalate the iffy content up to humans. But I speculate a bit.
Since I'm not familiar with the Cleanspeak platform, happy to set up a call with you and our sales team. My contact info is in my profile. :)
egg plant is a dog whistle? It is the "official" name for my google account. I am Mr. plant. I don't associate anything with it aside from a vegetable I don't particularly like.
I think this article glosses over some big issues with user generated content: auto-moderation is extremely fickle, humans are expensive and if your platform is large enough people will upload horrifying stuff.
There have been several stories about the poor treatment of content moderators by companies like Google (YouTube) and Facebook but to reiterate: the work can be literally traumatizing. Content moderators will be exposed to gore, child abuse and worse, frequently.
If your startup will need content moderation, think about how you will tackle it and consider the human cost of your solution. There's a trend in tech to either use AI or "fake" AI by using underpaid contractors with minimal labor rights (often in low wage countries). If your content moderation will run on humans, consider the toll and make sure they are compensated adequately and have access to mental health aid. Or better yet: consider whether you need to support massive scale user generated content to begin with. Growth isn't everything.
Really, the future of content moderation is feeds published by site operators and volunteer moderators that individual readers can opt in to or out of for filtering.
Relying simply on a central authority to decide what you should be allowed to read is a system with utterly predictable failure modes (not the least of which is too much volume for the centralized mods).
I run an online marketplace. It's a constant battle against scammers putting up fake items to sell. While I do run "content moderation" to identify the scams, the fakes are identical or nearly identical to the real items, so content moderation isn't the solution for me. As other commenters point out, it's just a war of attrition, a cycle of escalation against a few bad actors.
The only effective method I have now is fingerprinting (i.e., invading users' privacy). Browsers are becoming more privacy oriented, so as time goes on fingerprinting will become less effective and more people will be scammed online. I don't think those who want privacy at all costs understand the trade-off.
In a few months I will move to a voluntary fingerprinting/identification scheme (like the GDPR cookie opt-in), where you identify yourself or you don't use my website... which may leave me as a "die an MVP" example.
That would replace the now [sufficiently abused] abuse reporting system.
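To make that concrete, a crude server-side version of the kind of fingerprint I mean might look like the sketch below (Python; the chosen signals and the /24 coarsening are illustrative, not my actual production setup):

    import hashlib
    import ipaddress

    def fingerprint(ip: str, user_agent: str, accept_language: str) -> str:
        """Hash a few request attributes into an ID that survives new accounts/emails."""
        network = ipaddress.ip_network(f"{ip}/24", strict=False)  # coarsen the IP a bit
        raw = "|".join([str(network), user_agent, accept_language])
        return hashlib.sha256(raw.encode()).hexdigest()[:16]

    seen_scammers = set()  # fingerprints of devices/networks already caught posting fakes

    def looks_like_repeat_offender(ip: str, ua: str, lang: str) -> bool:
        # Flag new listings coming from a fingerprint we've already banned.
        return fingerprint(ip, ua, lang) in seen_scammers

Real fingerprinting adds browser-side entropy (canvas, fonts, timezone), which is exactly the part that privacy features are eroding.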
What's going on? TPTB should fund these moderation companies ASAP; they could become a more centralized and hidden censorship tool: a SaaS that all companies at sufficient scale require for [lawful] moderation.
One old story I remember is about how Lego built their own virtual-world platform and then spent an enormous amount of developer time figuring out how to programmatically detect when user-created content turned out to be shaped like a penis.
It really frustrates me that this level of abuse is just accepted as a fact of nature.
It feels like watching masked bandits stick up a bank then walk away casually to the next bank down the street while the bank manager says "Drat! Too bad we didn't stop that one at the door. Oh well, at least they only got one register"
I know all the responses to this are going to be "mUh PoLicE sTaTe" but I really wish there was some system of accountability for breach of trust online.
I don't. The advantages are overwhelming; people have just forgotten to appreciate that there isn't some form of central information control.
We see a lot of moral panics and authoritarian ambitions right now, and the fact that information control is so hard to come by is a huge blessing. If we were more responsible here it might look different, but that isn't really the case and is probably unrealistic anyway.
For politics, clout or attention, some people would love to get rid of some other people, often just because of disagreement. It is good that they cannot. We see companies firing people because they aren't a good look. Thankfully that doesn't work with internet users.
If I get abuse online, defense is in most cases very trivial. Granted, that is a bit different for public personas; this behavior isn't restricted to online content, and it's why prominent people often have to employ PR. But that isn't a reason to worsen the net as a whole for everyone.
If I only want to consume curated content, there are countless venues to do that. Sadly and curiously, people show up in places where content isn't moderated and start to complain.
I believe you have misinterpreted (understandably, perhaps, given my comment's wording) what I meant by "abuse".
I meant "abuse" as in deliberate hostile misuse. Things like XSS, penetration scans, bruteforcing logins, phishing messages, etc.
Constant penetration/vulnerability scans are baseline background noise of the internet. The best-case-scenario is that the scans find no vulnerability.
The point I was trying to make is that I would prefer a world where existing online as something other than data-livestock felt a little safer.
I'm building a human content moderation service as an API for developers. It's not fully ready yet (expected next week) but people can sign up and start exploring the docs. I'd love to hear your feedback and any features you might want to see:
I should clarify that on the home page. Thanks for the feedback! We are planning to review images and flag them for objectionable content e.g. explicit, violent, etc.
Noodling on some thoughts about how technology works --- its fundamental mechanisms, not what it is --- I'd come up with a list of eight factors that seemed central. And then I thought for a few days and came up with a ninth: hygiene.
In building any technology, what comes to dominate after a time are the unintended or unwanted consequences. This is particularly true of any networked system, in which as the network grows, various cost functions emerge and begin to overwhelm the value prospects.
I suspect that those cost functions are relatively constant per node but also tend to increase over time. The result is that small networks can grow quickly (the large positive per-node values dominate the costs), but as the node count rises, marginal value decreases (see the Tilly-Odlyzko refutation of Metcalfe's law: v = n log(n)), and the cost functions increase, both as bad actors are attracted to the system and as interactions between nominally good-faith actors become increasingly high-friction.
The consequence is that a "no-moderation" or "light-moderation" policy works really well, until it suddenly doesn't.
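A back-of-the-envelope sketch of that crossover, using the n log(n) value form with completely made-up cost constants, shows the same "fine, fine, fine, suddenly underwater" shape:

    import math

    K_VALUE = 1.0        # value coefficient (invented)
    BASE_COST = 2.0      # fixed per-node cost (invented)
    COST_GROWTH = 0.002  # extra per-node cost that scales with network size (invented)

    def net_value_per_node(n: int) -> float:
        value = K_VALUE * n * math.log(n)          # Tilly-Odlyzko-style value
        cost = n * (BASE_COST + COST_GROWTH * n)   # costs creep up with scale
        return (value - cost) / n

    for n in (10, 100, 1_000, 10_000, 100_000):
        print(f"n = {n:>7,}: net value per node = {net_value_per_node(n):+8.2f}")

With these toy numbers, per-node value peaks in the hundreds of nodes, goes negative in the low thousands, and keeps falling from there --- roughly the point at which "light moderation" stops being viable.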
This has parallels to other network-based structures, most notably cities. Without sewage, crime-control, public-health, waste-removal, or pollution-control systems, cities can grow fairly rapidly, up to some maximum viable size. After that point, not only do already-extant issues become intolerable (or more specifically: unsurvivable), but new harms co-evolve with the cities themselves and emerge within them. (See Kyle Harper's The Fate of Rome for a wonderful exploration of this concept.)
For online UGC platforms, this presents as various forms of activity that directly attack the trust and safety of the platform itself. Ultimately, growth of the platform is often limited by these effects.
Incidentally, a new professional organisation formed last year to address these issues specifically, the TSPA (Trust and Safety Professional Association). It's still finding its feet, though it's begun organising both a training curriculum and a library of practices and references on the issues, processes, considerations, and legal factors involved.
At this point I'm surprised companies aren't blanket-requiring phone numbers so they have something a little more concrete to ban, much less going whole hog and demanding government-issued ID like drivers' licenses or something.
Indie Hackers recently went invite-only in their community because of this spam cat-and-mouse game. It's a really lame thing to have to spend effort fighting.
Because if you don't do any content-moderation your site will turn into a horrid wasteland that most of your users don't want to be on, thus defeating your purpose if your purpose is making money or any other reason to attract users.
If your purpose is being 8chan, you'll be good.
> Slashdot is a very peculiar case of near no spam filtering, yet very good user content moderation
So, you're saying they have very good user content moderation. Which means they have content moderation. Sounds like it's human (rather than automated), and they have unpaid volunteers doing it. That's a model that works for some. A model of... content moderation.
Reddit rule of thumb: the downvoted comments are far more likely to introduce a new idea than the upvoted ones. However, it's like comparing the likelihood of dying in a car crash instead of a plane wreck: both events are rare enough as not to be worth considering in planning your daily life.
> What about not treating users like kiddies needing supervision?
There are two kinds of "users" here - writers and readers. As a reader, in this world with the humans we've got, I do not want to be subjected to everything someone wants to write. It makes a platform unusable for readers who want to read something that isn't a troll, spam, or propaganda.
The trick is to do that without crimping the users who are writers...
Here's a thought. Have a platform where identity is verified. Users can post publicly or within their circle. Any illegal or fraudulent content can be handled by the legal system due to the lack of anonymity. Beyond that, let users form groups for topics like reddit.
It incentivizes homogeneity and greatly decreases the amount of discussion on controversial topics. If posts are permanent, tied to your identity, and potentially subject to legal punishments, people with minority opinions become much more skittish.
That could be considered only a risk if the moderation is bad, but bad moderation becomes more likely over time due to feedback loops. An optimally permissive moderator will risk inviting an overly strict moderator due to that permissiveness. An overly restrictive moderator will not. There is a greater likelihood of moderation becoming increasingly restrictive over time because the moderation narrows the pool of moderators.
What I'd like is something identical to Facebook (preferably hosted in a box on my desk) that I invite all my friends to, and they have the ability to invite people to. 2nd gen is probably far enough.
Anyone cuts up rough, I go by their house and make them miserable.
You lose out on the network effect with this solution.
If someone gets to pick who's allowed onto a network, then they won't bother using it. Maybe your friends will join since they are allowed to add _their_ friends, but those friends of friends wouldn't bother because most of their friends can't join, meaning your friends won't bother either unless they really want to talk to you specifically.
It would work for groups where most people know each other, but there are already options for that that allow users to be in groups that aren't all owned by the same person e.g. Discord, Signal, Facebook.
A total user base of 50 or so would be fine. The trick is to find a group where everyone (mostly) wants to interact. An extended family could probably work.
Probably acceptance. You can create that platform, but I wouldn't want to use it personally.
But I see room for that. I would prefer it for kids until they reach a certain age and can decide for themselves. Granted, I would have a lot of requirements for the identity provider. This requires a lot of trust, and I don't really see many candidates on the horizon here.
But universally for the net it would be a huge setback. It would end up like TV and interactivity isn't the only reason it got less popular.
i.e. illegal or fraudulent content is clearly owned by the user irongoat, and you know who irongoat is (at least you know their email, IP address) but no-one on your site knows who irongoat is.
If the content is bad enough, authorities will get in touch for you to tell them what email address irongoat is associated with.
But as long as the content is OK, irongoat can say what he/she wants, with no PII visible.
I like the general idea, but one place where it falls down is free-press situations: for example, if you are a whistleblower or a dissident. Another example is if you are a battered spouse and want to discuss it without the batterer being able to identify you.
We tried that with "real ID" policies. It just made people commit to their shittiness openly. Not to mention they can always repudiate it. Even if we go with full-fledged cryptography, the opsec will fail at scale.
1. Assholes will be assholes under their real names.
2. The politically, economically, and/or socially disadvantaged are further disadvantaged under such regimes. Yonatan Zunger, chief architect of Google+, put that quite clearly: "In practice, the forced revelation of information makes individual privilege and power more important. When everyone has to play with their cards on the table, so to speak, then people who feel like they can be themselves without consequence do so freely -- these generally being people with support groups of like-minded people, and who are neither economically nor physically vulnerable. People who are more vulnerable to consequences use concealment as a method of protection: it makes it possible to speak freely about controversial subjects, or even about any subjects, without fear of harassment."
3. Laws and regimes may change. Ask the inhabitants of Germany circa 1933, or those of Afghanistan circa 2021, among numerous other examples.
4. Groups themselves can lead to and engage in destructive activities. See the case of Reddit's increasing problems with brigading and other hostile activity amongst subreddits, resulting in numerous such subreddits being banned.
5. The consequences of a toxic media landscape at scale are felt by those considerably removed from the platform(s) themselves. Twitter cheered the Arab Spring protests. Facebook are attempting to distance themselves from their very active role in the Myanmar genocide. History shows that changes to media landscapes are often very disruptive socially and politically, often at a horrific cost of innocent lives.
This is a solved problem now that you can ask for a small Bitcoin payment over the Lightning Network. There's no way to game that. We deployed it on our group chat and it stopped 100% of the spam.
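For the curious, the gate can be as dumb as the sketch below. The `LightningClient` wrapper here is hypothetical rather than any particular node's real API (lnd, Core Lightning, etc.), so treat it as pseudocode with imports:

    import time

    POST_PRICE_SATS = 100  # illustrative anti-spam fee

    class LightningClient:
        """Stub for whatever LN backend you actually run; methods are placeholders."""
        def create_invoice(self, amount_sats: int, memo: str) -> str: ...
        def is_paid(self, invoice: str) -> bool: ...

    def accept_post(ln: LightningClient, author: str, body: str) -> bool:
        """Only publish once the invoice settles; spam now costs real sats per message."""
        invoice = ln.create_invoice(POST_PRICE_SATS, memo=f"post by {author}")
        deadline = time.time() + 600          # give the poster ten minutes to pay
        while time.time() < deadline:
            if ln.is_paid(invoice):
                print(f"publishing post from {author}")   # real code would store it
                return True
            time.sleep(5)
        return False                          # unpaid posts are silently dropped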
Okay, so disregarding that label, most platforms will have a target or focus or niche (well, the ones that want a chance of surviving, anyway) and they will thus be very wise to tailor the rules around that, and create the conditions ideal for fostering that type of content.
For instance, if you were starting, say, a TikTok-esque video app but for super-quick tutorial videos, wouldn't it make sense for upload criteria to require it be some sort of tutorial, stay within some time limit, and probably not be just a gratuitous video of a bunch of people having sex? Call it whatever you want -- "NSFW" is just a shortcut, a heuristic that most people understand the meaning of regardless of whether it is actually safe or unsafe at their place of work. But there can be no denying that platforms/communities serving some interest or demographic will have their own unique requirements, their policies and standards will reflect this, and very often this will preclude "NSFW" content.
A lot of people get bent out of shape about this and view these sorts of policies solely as some sort of censorship issue, but many fail to realize that most of the time it's just about creating the ideal conditions for the community/platform to take hold.
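To be concrete about the hypothetical tutorial-video app above, the "policy" half of that can be as small as a validation function like this (thresholds invented for the example):

    from dataclasses import dataclass

    @dataclass
    class Upload:
        duration_seconds: float
        category: str        # e.g. "tutorial", "vlog", "other"
        nsfw_score: float    # 0..1 from whatever classifier you run

    def meets_upload_criteria(u: Upload) -> bool:
        # The niche defines the rules: short, on-topic, not gratuitous.
        return (
            u.category == "tutorial"
            and u.duration_seconds <= 90
            and u.nsfw_score < 0.2
        )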
Isn’t this the opposite of “listen to your users”? If you built a platform and you think it’s for x, and your users use it to do y, isn’t y what you should be focusing on, in the sense of Lean Startup/Build Something People Want?
Just because you are legally entitled to do something doesn't mean it's good for your users or society: e.g. the widespread practice of IAPs, or Apple's censorship of the App Store.
Freenet/LBRY/Tor hidden sites all exist (and get used all the time) and it's 100% not required there at all. I hope at some point this weird moralization of nudity will stop.
Have you gone on darknet sites? They have moderation too, or else they get filled with CP and terrorist propaganda just like every other service. I guess that's "fine" if you're anonymous and don't think the FBI will find you. But if you're running a business on the clearnet there's a real name and address and there will be real life consequences. The FBI gets interested real fast if you don't moderate posts that encourage terrorist acts.
Even porn sites need moderation. Trying to stop sexual abuse and child pornography isn't weird moralization.
> and it's 100% not required there at all
I'm not super familiar with the darkweb, but I assume that darkweb platforms also have active moderation, even if it's only to keep griefers out. Pornography is not the only use case for moderation.
>Trying to stop sexual abuse and child pornography isn't weird moralization.
How does moderation of content prevent sexual abuse or CP? If anything I'd argue it creates more, because those that seek the images instead have to produce their own if they cannot find them.
The article is about business running platforms with UGC.
While free forums on the darknet might get away with a tad more lax policies, if you’re a registered business hoping to make any money you won’t have a choice but to moderate in some way. At the very least it will be to follow your country’s laws, and more often than not your clients will require you to do so.
I mean outside of what the law prescribes - that's a whole different topic, with more grey area than not. If it's what the law prescribes, then I'm not contesting it.
> you won’t have a choice but to moderate in some way
This is the way moderation has been done for the past 10-15 years, but does it have to be? Why couldn't a platform provide user-level controls over what people see instead of making those decisions for them? Early forum software actually did a decent job of this, and I remember building phpBB extensions that enabled more user-level control. Even with this you can go from super granular down to just a couple of primary options. It becomes a tagging/filtering mechanism on behalf of the client.
Edit: UGC platforms may also discover that there's some value in seeing which filtering options their users actually use.
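A rough sketch of what that client-side model looks like - the platform only tags, each reader decides what to hide (names and fields invented for the example):

    from dataclasses import dataclass, field

    @dataclass
    class Post:
        author: str
        body: str
        tags: set = field(default_factory=set)   # e.g. {"nsfw", "politics", "meme"}

    @dataclass
    class UserPrefs:
        hidden_tags: set = field(default_factory=set)

    def visible_posts(posts, prefs: UserPrefs):
        """Filter on the reader's behalf instead of deleting for everyone."""
        return [p for p in posts if not (p.tags & prefs.hidden_tags)]

    feed = [
        Post("alice", "conference recap", {"tech"}),
        Post("bob", "tasteless joke", {"nsfw"}),
    ]
    print(visible_posts(feed, UserPrefs(hidden_tags={"nsfw"})))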
To be clear, just following legal requirements is no simple task in most countries, and it might already require a significant moderation effort depending on how motivated your users are.
> does it have to be?
In most cases moderation is less about what users want or don’t want to see, and more about what you want your platform to be.
For instance people are OK with product suggestions when they go on Amazon, but if your job posting site becomes an endless stream of Amazon links you’ll want to curb that. And perhaps your users find interesting products in all these links, but from your perspective it will kill your business (except if you pivot into becoming a product listing site of course)
The internet itself is unmoderated in any useful sense for content, yet it has lived longer than most of these cheesy "moderated" products that seek to impose their morality on you.
It looks like you're getting downvoted, but I think this is a good point and worth thinking about.
I believe one key difference here is group identity perception. If you like thinking in business terms, you could say "branding".
Facebook, Reddit, HN, Twitter, etc. all must care about content moderation because there is a feedback loop they have to worry about:
1. Toxic content gets posted.
2. Users who dislike that content see it and associate it with the site. They stop using it.
3. The relative fraction of users not posting toxic content goes down.
4. Go to 1.
Run several iterations of that and if you aren't careful, your "free" site is now completely overrun and forever associated with one specific subculture. Tumblr -> porn, Voat -> right-wing extremism, etc.
Step 2 is the key step here. If a user sees some content they don't like and associates it with the entire site it can tilt the userbase.
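You can watch that loop run away with a toy simulation - every parameter below is invented and only meant to show the shape of the dynamic:

    users_ok, users_toxic = 9_800.0, 200.0
    LEAVE_RATE = 0.5      # chance an ordinary user who hits toxic content quits
    TOXIC_GROWTH = 1.10   # the subculture grows 10% a month once it has a home

    for month in range(1, 25):
        toxic_share = users_toxic / (users_ok + users_toxic)
        users_ok -= users_ok * toxic_share * LEAVE_RATE   # step 2: they stop using it
        users_toxic *= TOXIC_GROWTH                       # steps 3-4: the loop feeds itself
        if month % 6 == 0:
            print(f"month {month:2d}: toxic share = "
                  f"{users_toxic / (users_ok + users_toxic):.0%}")

With these toy numbers the toxic share stays under roughly 10% for the first year and then balloons past 40% of the remaining user base by month 24 - slow, then all at once, which matches how these sites tend to be described in hindsight.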
The web as a whole avoids that because "the web" is not a single group or brand in the minds of most users. When someone sees something horrible on the web, they think "this site sucks" not "the web sucks".
Reddit is an interesting example of trying to thread that needle with subreddits. As far as I can tell, Reddit as a whole isn't strongly associated with porn, but there are a lot of pornographic subreddits. During the Trump years, it did get a lot of press and negative attention around right-wing extremism because of The_Donald and other similar subreddits, but it has been able to survive that better than other apps like Gab or Voat.
There are still many many thriving, wholesome, positive communities on Reddit. So, if there is a takeaway, it might be to preemptively silo and partition your communities so that a toxic one doesn't take down others with it.
I personally see "plausible deniability" as the cynical but actual distinction for what gets people to share blame - not actual affiliations or whose servers it runs on. Any number of objectionable sites run on AWS, and you basically need to be an international scandal or violate preexisting terms to get booted - take the merchants selling malware to governments: Amazon's policies didn't care whether it was legal, just whether you were doing it unauthorized. A wise move when international law is really more like the Pirate Code.
The interlinking between the pages themselves and common branding are what create the associations. Distributed Twitter alternatives like Mastodon can even share the same branding, but it's on a per-network basis and complex enough to allow some "innocent" questionable connections.
The internet is very moderated, on the contrary, in terms of UGC.
Traditional, non-social, websites have single or known-group authors. When one of them is defaced or modified we call it "hacking" not "unmoderated content." We assume NASA's site has NASA-posted content. We assume Apple's site has Apple-posted content.
Sites with different standards for what they'd publish have been around for decades (for gore, for porn, etc) but many of these still exist in a traditional curated-by-someone fashion, or are more open to UGC but still have some level of moderation.
The internet is not moderated in any useful sense for content. Drug markets like white house market, and before that silk road, have persisted for years. Tor and other darknet websites host content that is nearly universally disdained by governments and even most individuals - content so heinous that I hesitate to even name it here (you and I both know some examples).
> We assume NASA's site has NASA-posted content. We assume Apple's site has Apple-posted content.
Trust in identity is not the same thing as useful moderation of content. That's useful moderation of identity.
>Sites with different standards for what they'd publish have been around for decades (for gore, for porn, etc) but many of these still exist in a traditional curated-by-someone fashion, or are more open to UGC but still have some level of moderation.
Those sites _choose_ to moderate their content, that doesn't exclude others that don't.
>The internet is not moderated in any useful sense for content. Drug markets like white house market, and before that silk road...
You mean the Silk Road that the US government "moderated" out of existence, along with other Tor marketplaces over the years? The same ones that suggest White House Market's existence is also likely to be limited?
I suppose in the sense that Gabby Pettito was moderated off the internet, Ross Ulbricht was moderated off of the internet and into a cage permanently for the heinous crime of facilitating voluntarily peaceful trade. Tor marketplaces were definitely not gone for years, the same content just moved under new banners. You can literally find the same content and more on WHM today as you did under Ulbricht's banner before he was kidnapped by government thugs.
>I suppose in the sense that Gabby Pettito was moderated off the internet, Ross Ulbricht was moderated off of the internet and into a cage permanently for the heinous crime of facilitating voluntarily peaceful trade.
Oh hello, strawman.
>Tor marketplaces were definitely not gone for years, the same content just moved under new banners. You can literally find the same content and more on WHM today as you did under Ulbricht's banner before he was kidnapped by government thugs.
And the only reason that happens is by virtue of Tor making it difficult to track the source of those sites and their operators. That doesn't mean that "moderators" (governments, etc.) aren't putting forth their best efforts to track them down and shut them down. It is nearly inevitable that WHM will see a similar fate to Silk Road, AlphaBay, DarkMarket, etc.. They're being shut down as quickly as they can be.
Glad to know you finally admit that being kidnapped by a 3rd party is not really what most of us think of as "moderation", and thus that you have made a straw man. Although in the strict sense I guess it is true that moderation could merely mean some 3rd-party entity came along and violently kept me from communicating. If you don't like me posting cat pictures on reddit, you could crack my skull or lock me in a cage and steal my PC, and you would have "moderated" me, but I wouldn't call that reddit moderation.
... wow. Talk about going from 0-100 entirely too fast.
I was talking specifically about sites such as Silk Road and others being taken offline (which is exactly what you were talking about, too), not once did I mention his arrest nor did I allude to it. Glancing at your username, I seem to recall previous comments from you in threads about drug use being legalized. On the broad topic of drug legalization - again - you and I agree, but you would do well to prevent your biases from creeping in and causing you to misunderstand posts and/or lash out at others.
I apologize, maybe you are not familiar with the details of the silk road. Ross Ulbricht was the administrator and creator of the silk road, allegedly. It's quite probable that without his arrest, it would have persisted even if on newly acquired hardware. I would argue his arrest was integral in these violent thugs "moderating" silk road away like the mob "moderates" away their competition.
Instead, after his arrest the content ended up on new platforms rather than the Silk Road platform.
> biases from creeping in and causing you to misunderstand posts and/or lash out at others.
Yes my bias is in complete, unrestricted free speech. Every single piece of content, regardless of how damaging or vulgar anyone thinks it is and regardless of if it portrays even the worst of crimes. I admit I am colored by that bias.
> lash out at others.
What are you talking about? You feel attacked because your poorly constructed argument was laid open. Your case is pretty clear. Even if the system of the internet has no useful filter of content (whether that is true or not), if a third party such as DEA comes along and decides to seize equipment and throw the operator in jail, you consider that content moderation. And I'm willing to admit from a practical perspective, that could be considered a form of moderation by a violent third party.
---------------
Edit due to waiting on timeout to reply below:
His arrest goes hand in hand with the shutdown. It was integral. You can't say you weren't mentioning Ulbricht's arrest when that arrest WAS, in part, the takedown of Silk Road. The very fact that you said you weren't speaking of the arrest led me to say you "may not be familiar" (note the uncertain wording - I did not speak in certainties, however much your bias keeps you from seeing that).
> ...and then angrily respond to them as such.
I think you're projecting. If there's any anger, it must be yours.
>Yeah, again, you're injecting your own biases as you create assumptions about my comments
Your comment appeared to be a rebuttal of my statement that "The internet itself is unmoderated in any useful sense for content." If it wasn't actually a rebuttal but an agreement, then I apologize for misunderstanding - I didn't realize you were actually supporting that argument.
>See how I used "moderated" in quotes in my very first response? That suggests that I'm using the term rather loosely.
>If something's illegal - even if you and I think it shouldn't be - then it's typically going to be removed at some point, even if it takes a while because something like Tor makes it difficult. And in that sense, yes, the internet is "moderated" for that content. That's all I've said/argued, and I truly don't understand how that is so difficult for you to grasp.
The illegal content has only progressively proliferated since the advent of the internet, and we've yet to see an effective mechanism to moderate the content of the internet as a whole. Virtually every category of content has not only not been removed but increased.
>That's all I've said/argued, and I truly don't understand how that is so difficult for you to grasp.
Yes, and I'm arguing that this is incorrect; it hasn't been moderated. At best the content has passed from platform to platform, but no effective mechanism has managed to censor the internet as a whole.
Sometimes I wonder with all this speak of anger, misinterpretations, and clouded judgement is just you repeating to me what your own psychologist told you.
Yeah, again, you're injecting your own biases as you create assumptions about my comments, rather than stopping to ask what I mean before you fly off the handle. See how I used "moderated" in quotes in my very first response? That suggests that I'm using the term rather loosely.
All I've said was that that's how illegal content is moderated on the internet - it is removed. Silk Road was removed, AlphaBay was removed, DarkMarket was removed, many others have been removed, and many more will continue to be removed even if Tor makes that a slow process. At no point did I bring up whether or not I thought it was "right" to remove them, or to treat Ulbricht in that manner (again, you're assuming I don't know what happened). I said "moderating" with quotes, for lack of a better word.
If something's illegal - even if you and I think it shouldn't be - then it's typically going to be removed at some point, even if it takes a while because something like Tor makes it difficult. And in that sense, yes, the internet is "moderated" for that content. That's all I've said/argued, and I truly don't understand how that is so difficult for you to grasp.
>It's quite probable that without his arrest, it would have persisted even if on newly acquired hardware. ... Instead, after his arrest the content ended up on new platforms rather than the Silk Road platform.
For implying that I don't know what happened, you seem to be forgetting that other Silk Road staff started Silk Road 2.0 after his arrest, but that was also shut down.
>What are you talking about? You feel attacked because your poorly constructed argument was laid open.
Nope. You allow your biases to creep in to your poor interpretations of other people's comments, and then angrily respond to them as such. My initial response was simple, but your strongly held beliefs have clouded your responses.
The "internet" isn't liable, so moderate is in the form of transparent traffic shaping. When disruptions are small, costs are either absorbed in aggregate by infrastructure owners (and user attention) until traffic is literally moderated away with routing.
Maybe so (that sounds believable, anyway). What he's saying is that the other 1% is unmoderated because there's no central authority [1]. The problem here isn't that people will share bad things if you don't stop them, the problem is that you're in a position of being held responsible for something outside your control. If it's illegal, it should be reported (or found by law enforcement whose job it is to enforce the law) and if it's offensive, offer some user-side filtering.
[1] this is starting to change, though - Amazon took Parler offline completely at the hosting level. Although they eventually found another hosting provider, it's not unimaginable that in the near future, service providers will collaborate to moderate the underlying traffic itself.
I guess I would rather see a system that focuses on scanning all images for illegal content (presumably there are services where you can hash the image and check for known child porn images, for example), and focus on tagging all other images for certain things (like David Hasselhof's bare chest or whatever concerns your users). Give the users tools to flag images as illegal content, or for misapplied or missing tags, and the tools to determine which type of content they wish to see. Prioritize handling known illegal content found earlier, then user-flagged possibly illegal content, then missing or misapplied tags. Handle DMCA take down requests according to the letter of the law.
Let the users help you, and let them choose what they want to see. Use conservative defaults if you wish, but trying to guess what users might find objectionable and filtering that out ahead of time sounds like a losing proposition to me. They'll tell you what they don't like. When they do, make a new tag and start scanning for that with the AI goodness.
Of course, this is what I would like to see as a user. I'm probably an atypical user. And I'm not the person about to bet their life savings on a start-up, either, so take this with a grain of salt. I just wish that content providers would stop trying to save me from the evils of life or whatever their motivation is.
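For the "hash and check against known-bad images" piece, the plumbing on your side can be quite small. A sketch assuming the Pillow and imagehash packages - in reality the hash list comes from a vetted provider (PhotoDNA-style service), which this stub just fakes with an empty local set:

    from PIL import Image
    import imagehash

    KNOWN_BAD_HASHES = set()   # in reality: perceptual hashes supplied by a trusted provider
    MATCH_DISTANCE = 4         # max Hamming distance to count as a match

    def screen_image(path: str) -> str:
        h = imagehash.phash(Image.open(path))   # perceptual hash of the upload
        if any(h - bad <= MATCH_DISTANCE for bad in KNOWN_BAD_HASHES):
            return "block-and-report"     # known illegal content: remove and escalate
        return "tag-and-publish"          # everything else: publish, rely on tags/flags

Everything that passes would then flow into the tag-and-flag pipeline described above, with user reports feeding the review queue in that same priority order.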
[0]: https://www.gonevis.com