
Ask HN: Captcha Alternatives? - ev1
TLDR: I help with a gaming community-related site that is being targeted by a script kiddie. They are registering hundreds of thousands of accounts on our forums to 'protest' a cheating (aimbot) ban. They then post large ASCII art spam, giant shock images (the first one started after we blocked new accounts from posting [img]), the usual.

Currently we use a simple question/answer addon at registration time - it works against all untargeted bots and is just a little "what is 4 plus six" or "what is the abbreviation for this website" type of question. It's worked fine for years and we don't really get general untargeted spam.

I am somewhat ethically disinclined to use reCAPTCHA, and there are some older members that can't reasonably solve hCaptcha easily. The same goes for heavy fingerprinting and other privacy-invading methods. The site is also donation-run, so enterprise services that would block something like this (such as Distil) are both out of budget and out of ethics.

Is there a way I can possibly solve this? Negotiation is not really an option on the table; the last time one of the other volunteers responded at all, we got a ~150Gbps volumetric attack.

I've tried some basic things, like requiring cookie and JS support via middleware; they moved from a Java HTTP-library script to some kind of Selenium equivalent afterward. They also use a massive number of proxies, largely compromised machines being sold for abuse.
======
huhtenberg
* Allow new accounts, but hide messages from them until their posts are verified manually and the accounts are either approved or shadow-banned.

* Don't delete banned accounts and don't notify them in any way, but tag their IPs and cookies to auto shadow-ban any sock puppets, so that these don't even make it into an approval queue.

* Use heuristics to automate the approval process, e.g. if they looked around prior to registering, or if they took time to fill in the form, etc.

* Add a content filter for messages, including heuristics for an ASCII art as a first post, for example, and shadow-ban based on that.

* Hook it up to StopForumSpam to auto shadow-ban known spammers by email address / IP.

* Optionally, check for people coming from Tor and VPN IP, and act on that.

Basically, make it so that if they spam once, they will need both to change
the IP _and_ to clear the cookies to NOT be auto shadow-banned. You'd be
surprised how effective this trivial tactic is.

All in all, the point is not to block trolls and tell them about it, but to
block them quietly - to discourage and to frustrate.
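A minimal sketch of the quiet IP-and-cookie tagging in Python (the in-memory store and all names are illustrative; a real forum would persist these server-side):

```python
# Silent sock-puppet tagging: once an account is shadow-banned, remember
# both its IP and its cookie token. Any later signup matching either
# marker is shadow-banned automatically, with no visible error.

class ShadowBanTagger:
    def __init__(self):
        self.banned_ips = set()
        self.banned_cookies = set()

    def shadow_ban(self, ip: str, cookie: str) -> None:
        self.banned_ips.add(ip)
        self.banned_cookies.add(cookie)

    def is_sock_puppet(self, ip: str, cookie: str) -> bool:
        # The attacker must change IP *and* clear cookies to escape.
        return ip in self.banned_ips or cookie in self.banned_cookies

tagger = ShadowBanTagger()
tagger.shadow_ban("203.0.113.7", "cookie-abc")
print(tagger.is_sock_puppet("203.0.113.7", "fresh-cookie"))   # True: same IP
print(tagger.is_sock_puppet("198.51.100.9", "cookie-abc"))    # True: same cookie
print(tagger.is_sock_puppet("198.51.100.9", "fresh-cookie"))  # False: both changed
```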

~~~
ev1
We already tag/drop all Tor and VPN traffic. Cookies don't make much sense
because this looks like browser automation, not someone just swapping VPNs by
hand repeatedly.

For IP bans, they are now using illegitimately acquired or fraudulent IP space
(the guy is not intelligent enough for this, he's almost certainly just buying
proxies with in-game gold or some BS - but there is a criminal element here),
including what might involve significant hijacking of AT&T, CenturyLink,
Level3, and Windstream network resources.

(if you work at one of those places and are clueful, I would be very
interested in asking about this)

~~~
tdrp
By the way how do you detect VPN traffic? For tor we just pick up the list of
exit nodes but we've had trouble identifying VPN without using a 3rd party
API.

~~~
fomine3
I know of a site that just bans any IP with 443/tcp open. That catches web servers and possibly SSL-VPN servers, but it causes false positives.

~~~
soupsranjan
we are a stealth startup founded by former Paypal and Coinbase engineers. We
have a device intelligence product that can detect accesses from proxies/VPNs
without using any IP list. We can also detect the True OS that someone is
using - useful to detect emulators and script kiddies. Happy to chat if of
interest: info AT sardine.ai

------
laurieg
Other posters have given good advice on technical aspects. I'd like to add my
experience from moderating a large subreddit.

Focus on making it not fun to troll. Never acknowledge the disruption. Make all your countermeasures as silent as possible. Never address the script kiddie directly. Don't accidentally create a "leaderboard" or similar by publishing counts of bans/deleted posts, etc.

Eventually it just becomes a waste of time to scream into nothingness and they
will go elsewhere.

------
zaarn
A method once used by a German community was the troll throttle. The basic idea is that troll and spam content compresses better than average content.

So you run various compression algorithms against your community content to establish averages, medians, and other statistical baselines. Try both compressing each post individually and compressing it as one giant text corpus, counting the size growth a post generates by being added. These are your measurement points.

An incoming poster must solve a captcha to be able to post; however, the likelihood of the captcha being accepted is tied to the compressibility of the post.

A compressible post is likely to be spam or ASCII art. The captcha fails even if the answer was entered correctly. IIRC I used a relationship of 'min(1, sqrt(1/compress_factor)-1.05)'.

A non-compressible post is not only likely to pass the captcha; it might pass even if the answer was actually wrong.

The entire point is that it shifts balances. Trolls will have to submit their
posts a few times and resolve captchas, which slows them down. Making content
that does not compress well across a variety of compression algorithms,
especially if you also account for existing text corpus, is a very hard
problem to solve. They'd have to start to add crap to the post to bloat it up,
at which point you can counter with the next weapon.

Repeat all of the above, except instead of compression, you estimate entropy.
High entropy blocking means you can block messages containing compression
decoys.
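A rough sketch of the measurement side in Python, using zlib plus a Shannon-entropy estimate (function names and thresholds are illustrative and would need tuning against a real corpus; the pass-probability formula itself is quoted from memory above, so it is left out here):

```python
import math
import zlib

def compress_factor(text: str) -> float:
    """How many times smaller the text gets under DEFLATE; spam and
    ASCII art typically compress far better than ordinary prose."""
    raw = text.encode("utf-8")
    return len(raw) / max(1, len(zlib.compress(raw, 9)))

def shannon_entropy(text: str) -> float:
    """Bits per character; random-noise padding shows up as high entropy."""
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

ascii_art = ("#" * 40 + "\n") * 20
prose = "Negotiation is not really an option; the last reply earned us a volumetric attack."
print(compress_factor(ascii_art) > compress_factor(prose))  # ASCII art compresses far better
print(shannon_entropy(prose) > shannon_entropy(ascii_art))  # prose carries more entropy per char
```

Gating the captcha on both scores catches the two failure modes the parent describes: highly compressible spam on one side, random-noise padding on the other.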

~~~
0-_-0
The best part about this is that it's difficult to figure out how the detection mechanism works in order to counter it, and I really like the idea. But isn't it easy to just add random noise to a post to give it the right amount of entropy?

~~~
zaarn
It would be quite an art to generate a legible post that contains both ASCII
art and a consistent amount of entropy without containing too much entropy or
being too compressible.

And since posting it adds it to the compression dictionary, it's likely only going to work for a while before it gets compressed better.

You'd be fighting yourself more than the captcha. Plus, to figure out whether a post will go through, an attacker would now have to compress the post with X compression algorithms (plus, of course, the unknown text corpus of the long-term compressor) as well as estimate entropy via Y algorithms. You'll be wasting a lot of their CPU on these tasks.

Meanwhile, humans have no issue, because while English does compress somewhat, when we express unique ideas it's less compressible than when we do not.

~~~
zupa-hu
> it's likely only going to work for a while before it gets compressed better

Unlikely, random noise does not compress. It's practically always unique.

~~~
zaarn
But that triggers the entropy detector.

------
TomGullen
We’re a UK company, and we had an incredibly persistent spammer. He’d also send us threatening emails. His persistence and nastiness were draining, and quite frankly it was impressive how much time he was putting into it all.

I don’t know if it was a coincidence, but after some sleuthing I found his real name and filled out the online FBI tip-off form about his emails to us. He had a bad history and may have been on bail.

Stopped pretty promptly after that - guessing he got a phone call.

------
dougk16
You didn't mention email confirmation in the first place, but I figured I'd mention this for others. I recently ran into a similar situation and had the idea of registrants emailing ME a secret code I give them, instead of confirming that they received one. It's still technically automatable, but it would definitely throw a curveball at the bots. I confirmed with an Ask HN that this is a secure method:
[https://news.ycombinator.com/item?id=24116530](https://news.ycombinator.com/item?id=24116530)

------
SquareWheel
Not the answer you're looking for, but reCaptcha is probably your best option.

I attempted half a dozen mitigation strategies to prevent spam on one forum I
ran. I tried honeypots, questionnaires, other captchas, and proxying services
to block bots. They slowed the bots at best, but when there's a torrent of bad
actors it really doesn't matter if you slow them down 50%.

I finally installed reCaptcha and it solved the problem instantly. Not a
single bot has signed up in 6 months. I started getting suspicious that
signups were just broken, but I tested it and it was fine.

After that experience, I'm very much on team reCaptcha. I tried hCaptcha as
well (on a different project), but found it was much harder to solve.

~~~
tdrp
Also, after you hit a certain number of users the "bad actors" sometimes have
people behind them manually adjusting their bots' algorithms to match your
tricks.

------
alexnewman
HCaptcha founder here. I am sorry you had trouble solving captchas. Perhaps
your older members might have luck with
[https://www.hcaptcha.com/accessibility](https://www.hcaptcha.com/accessibility).

~~~
buixuanquy
Gosh, your captchas are really hard to solve, and I'm not even 25 years old yet. Sometimes I can't understand the question or what I should select, because the images show only a small part of the objects, and it keeps going and going; I have to solve 5-7 questions before I can access the target website.

~~~
zaarn
For me it's the opposite. hCaptcha questions are usually at least solvable, even if you get some parts wrong. Google usually wants me to solve 4-5 of the image questions, and sometimes I just get fully blocked without explanation and can't solve further captchas.

And that's on top of not having to give more data to Google.

------
obblekk
A client-side, GPU-bound challenge: the user doesn't do anything but wait for a spinner to load while the JavaScript solves a computationally hard puzzle.

It won't block all spammers, but it will increase the attacker's compute cost (even for Selenium) to the point where they'd have to get GPU instances, which will be too expensive for a script kiddie.

This is roughly what Cloudflare is doing when they say "verifying your browser".
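Cloudflare's exact check isn't public, but the cost-shifting idea can be sketched as a hashcash-style proof of work (a hash puzzle rather than a literal NP-hard instance; a GPU-resistant variant would swap SHA-256 for a memory-hard function). All names and the difficulty are illustrative:

```python
import hashlib
import os

DIFFICULTY = 12  # required leading zero bits; tune so browsers take ~1s

def make_challenge() -> bytes:
    # Server issues a random nonce with each signup form.
    return os.urandom(16)

def solve(challenge: bytes) -> int:
    # Client-side work: find a suffix such that SHA-256(nonce + suffix)
    # starts with DIFFICULTY zero bits (~2**DIFFICULTY hashes on average).
    suffix = 0
    while True:
        digest = hashlib.sha256(challenge + suffix.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0:
            return suffix
        suffix += 1

def verify(challenge: bytes, suffix: int) -> bool:
    # Server-side check: a single hash, so verification stays cheap.
    digest = hashlib.sha256(challenge + suffix.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0

ch = make_challenge()
s = solve(ch)          # expensive for the client
print(verify(ch, s))   # cheap for the server: True
```

The asymmetry is the point: solving costs thousands of hashes per account, verifying costs one.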

~~~
unnouinceput
How will this stop a bot that has an embedded browser in it? For the bot, it's just a thread that waits for the "cloudflare verification" to happen. Not that hard to bypass.

~~~
Kiro
It will cost server resources for the attacker. It's not just waiting, it
needs to actually solve the problem.

~~~
luckylion
Unless you make it very expensive (which would turn legitimate users away,
because they don't want to wait for a minute to sign up), that won't help, I
think. You tie up a core for 5 seconds, but the server has 12 of those, so
they can create 12 accounts every 5 seconds => still too much to handle in
moderation.

~~~
zaarn
It's 12 accounts per second instead of 600 per second if it only takes 1
second. Or 1200 if it takes half a second. That is still an improvement.

------
crazypython
Hey, my game will be in a similar situation. I'm looking into building a CAPTCHA that works by taking submissions from r/notinteresting, r/mildlyinteresting, and r/interestingasfuck, and asking the user to classify an image as not interesting, mildly interesting, or very interesting. We can distort, crop, and recolor the images to defeat reverse image search. That should be enough of a stopgap to stop them. Contact me via email (in my profile page) if you want to work together on that project.

~~~
aaronbasssett
So my script has a 1 in 3 chance of getting the CAPTCHA correct each time,
with just a random guess?

A regular 6 character alphanumeric CAPTCHA has a 1 in 56,800,235,584 chance as
a comparison…

~~~
cyphar
reCAPTCHA works in a similar way, they just ask you to determine which photos
in a set of 9-12 contain a particular object (2^12 at best). If you just asked
people to do the Reddit classification 7 or 8 times you'd get the same chance
of random guesses passing. The trick is to rate limit attempts.
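To put numbers on that (assuming fully independent challenges and a pure random guesser):

```python
def attempts_to_pass(k: int, options: int = 3) -> int:
    # Expected number of blind tries before one random guesser slips
    # through k chained 1-in-`options` challenges.
    return options ** k

print(attempts_to_pass(1))  # 3
print(attempts_to_pass(7))  # 2187, roughly 2^11
print(attempts_to_pass(8))  # 6561, beyond reCAPTCHA's ~2^12
```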

Personally my problem with this is that even with the basic categories
reCAPTCHA asks for I find it difficult to figure out whether certain edge
cases should count. I feel it could be more frustrating to have to guess
whether someone on Reddit found a particular image interesting or not.

------
mey
Some CDN services provide Bot detection. (as well as other DDoS mitigation
options).

[https://www.cloudflare.com/products/bot-management/](https://www.cloudflare.com/products/bot-management/)

[https://www.akamai.com/us/en/products/security/bot-manager.j...](https://www.akamai.com/us/en/products/security/bot-manager.jsp)

Edit: I didn't see your comment about budget. I expect Akamai may be out of reach; I'm not sure about Cloudflare's options. Most bot detection is going to need to fingerprint the behavior of the interaction with the site (captchas as well). If that data is handled correctly (not sold or made available to a third party, and destroyed after use), I believe it can be done ethically. Obviously my ethics are not yours.

~~~
ev1
"Bot Management" is enterprise only on CF, and Akamai is out of reach.

~~~
kennethkhaw
Hey there, I work for ipinfo.io and we have a privacy detection API
([https://ipinfo.io/developers/privacy](https://ipinfo.io/developers/privacy))
that includes VPN/proxy/Tor/hosting (bots, scrapers, etc.) flagging, which may help with your predicament. You can test how accurate we are by appending any nefarious IPs to our URL and seeing if we flag them correctly, e.g.
[https://ipinfo.io/5.62.18.115](https://ipinfo.io/5.62.18.115).

Not sure if that will be your magic bullet, but I'll be happy to have a chat and riff on this. I've talked to customers that are using our privacy detection API, and I remember a good subset using it for your sort of use case. At the very least, I can pass on some learnings, strategies, etc. to combat it.

Available anytime at ken@the-ipinfo-domain.

~~~
aww_dang
$50/mo for geolocation + ASN lookup?

~~~
kennethkhaw
That's our paid plan, but we do have a free tier! If you have something in mind and an interesting use case, hit me up and I can organise access for you.

------
kilburn
A more extreme approach that may or may not work for you is to make the
community invite-only.

Track the network of invites and shadow-ban linked accounts when you detect
the spammer popping up. The spammer will eventually run out of invitees.

You can combine this with "no invitation required" short periods, where you
make changes to the signup flow, spam detection, etc. and make the window
short enough for the spammer to not have the time to adjust their bots.

------
bawolff
Maybe try borrowing Wikipedia's IP ban list - Wikipedia gets a massive amount of spam, which makes it an easy resource for getting a list of evil IP addresses.

Their list is a combo of
[https://en.wikipedia.org/wiki/Special:GlobalBlockList](https://en.wikipedia.org/wiki/Special:GlobalBlockList)
and
[https://en.wikipedia.org/wiki/Special:BlockList?wpTarget=&wp...](https://en.wikipedia.org/wiki/Special:BlockList?wpTarget=&wpOptions%5B%5D=userblocks&blockType=sitewide&limit=50&wpFormIdentifier=blocklist)
and Tor (which is handled automatically). [There is also an API version in JSON format.]

------
tommica
If you can detect them as spammers then, instead of banning them, shadow-ban them, making their posts invisible to others, and slow down the server's responses for them.

There are also alternatives to reCAPTCHA that might be more ethical, for example [https://www.phpcaptcha.org/](https://www.phpcaptcha.org/) - there are some image-matching ones too, but I don't know any specific ones.

~~~
aww_dang
Perhaps embed some recursive/infinite XSL in an iframe or data src attribute.
There are also ways to create memory intensive HTML documents. This will
cripple their Selenium or other headless browser instance.

------
tdrp
Not sure why it's not mentioned, but in addition to technical mitigation, if you know the attacker's general info, then maybe you can also try other avenues such as law enforcement or legal claims.

It's more work as well, but when you whois some of the attacking machines you can find the abuse@ email for them and contact the provider. That puts the provider on notice if you later go with legal action.

------
MattGaiser
Is there a reason to not use hcaptcha for signup only? Older members are
already members, so all you are doing is applying it to the new people.

Or add 2FA with a text message for sign up. That is a lot harder to automate
and unless he is willing to spend a ton of money on extra phone numbers, he
should run out of them quickly.

------
laksdjfkasljdf
the only winning move against spam is to make it easier to clean up than it is to spam.

Captchas only benefit Google and the like, who couldn't care less about the community or content. Captchas make honest content (and spam cleanup) more expensive than the spam! It's a losing proposition that only looks good when you consider it in isolation.

Make honest content (and spam) easy, but cleaning up easier. Things like: every user can flag something; after a certain number of flags, also remove other content from the same IP (or the same bundle of users with a close registration-time window) automatically. And of course a feature for admins to automatically ban and erase content from users who wrongly flag honest content.

It's harder than captcha, but it is an actual solution. Captcha is lazy and
ableist.

~~~
ev1
This is exactly why I do not want to do captcha. I don't want more third party
analytics, tracking, or making our players solve endless captchas.

We are a gaming community that also has some older folk that we'd like to be accommodating toward. Most of the cheapo PHP captcha libraries don't support any form of accessibility, or if they do, it's via a vulnerability that allows instant solving.

Part of the problem is that we can moderate just fine, but we can't moderate
or look through hundreds of thousands of new registrations - just need
something to somehow get rid of 99% of the garbage automatically.

~~~
alexnewman
Hi, hCaptcha founder. For anyone who has trouble with captchas I'd recommend
checking out
[https://www.hcaptcha.com/accessibility](https://www.hcaptcha.com/accessibility)
. For the privacy obsessed (like me) our privacy pass option is pretty great
as well: [https://www.hcaptcha.com/privacy-pass](https://www.hcaptcha.com/privacy-pass).

------
awinder
How many people are you registering a day normally? I’m wondering if you shut off signups for a while and handle the inevitable attack, then if they can’t get back in they might move on. How much money and time do you think they (or you) are willing to commit, though? What a crappy tale :-(

------
niftylettuce
I'm working on [https://spamscanner.net](https://spamscanner.net), which will
be useful very soon for this with a public and free API (which will store zero
logs and adhere to same privacy as
[https://forwardemail.net](https://forwardemail.net)).

------
mpol
In my experience JavaScript filters work very well against spambots. For example, you could have 2 honeypot fields, 1 with a certain value, 1 empty. In JavaScript you switch their values, and on the server side you validate that the swap happened. Most spambots don't run JavaScript (yet). Another one could be a simple timeout: again, 2 fields with certain values. You count 1 down and the other up, and on server validation there should be a difference of more than 1.
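Here's one way the server side of both tricks might look; the field names and token are made up for illustration:

```python
# Field "hp_a" ships prefilled with a token and "hp_b" ships empty;
# client-side JavaScript swaps them before submit. A bot that never runs
# JS posts the form exactly as served and fails the check.

FORM_TOKEN = "s3cr3t"  # value rendered into hp_a when the form is served

def honeypot_ok(form: dict) -> bool:
    return form.get("hp_b") == FORM_TOKEN and form.get("hp_a", "") == ""

def timer_ok(form: dict, min_ticks: int = 2) -> bool:
    # Second trick: JS counts one field down and the other up while the
    # user fills the form; a minimum gap implies a human-scale delay.
    try:
        gap = int(form["t_up"]) - int(form["t_down"])
    except (KeyError, ValueError):
        return False
    return gap >= min_ticks

print(honeypot_ok({"hp_a": "", "hp_b": "s3cr3t"}))  # True: JS ran and swapped
print(honeypot_ok({"hp_a": "s3cr3t", "hp_b": ""}))  # False: posted as served
```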

For an example, check a WordPress plugin I made 2 years ago:
[https://wordpress.org/plugins/la-sentinelle-
antispam/](https://wordpress.org/plugins/la-sentinelle-antispam/)

There is also the slider thing on AliExpress that you could check out. I haven't looked into it, so I'm not sure how exactly it works.

~~~
ev1
These are targeted spambots, so they will run through it once by hand and check the request.

AliExpress uses heavy, extreme amounts of fingerprinting, including port-scanning your device and your internal network via <img onerror> tags and WebSockets. The slider part is the least of it.

~~~
whakim
Yeah, this is the trouble with client-side solutions. If it's worth their time
(for example, if there's a credit card field or something), the bad actor will
first take a look at the request as it's sent to the server and then they will
make requests that look similar. You can do some aggressive stuff with
fingerprinting like this example but honestly at a certain point captchas are
just going to save you a ton of hassle and the alternatives start to become
increasingly invasive too. And I say this as a person who strongly dislikes
captchas from both a privacy perspective and an end-user perspective.

------
heartbeats
Try requiring new accounts' first few posts to be manually approved. Then
he'll have to make enough quality posts to build up credibility first. This is
very difficult, especially for a script kiddie.

Alternatively, you can take away the instant gratification by adding a
cooldown of, say, three days for each created account. Then he'll have to
register them in bulk and hope the humans don't spot the patterns.

You could also try using Bayesian filtering, but you'd have to block the ASCII
art first.
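The Bayesian-filtering suggestion could start as small as a toy naive-Bayes word score like this (training data and labels are invented; a real deployment would train on the forum's own flagged posts):

```python
import math

def train(posts):
    # posts: list of (text, label) with label "spam" or "ham".
    counts = {"spam": {}, "ham": {}}
    totals = {"spam": 0, "ham": 0}
    for text, label in posts:
        for w in text.lower().split():
            counts[label][w] = counts[label].get(w, 0) + 1
            totals[label] += 1
    return counts, totals

def spam_log_odds(text, counts, totals):
    # Laplace-smoothed log-odds; > 0 leans spam, < 0 leans ham.
    score = 0.0
    vocab = set(counts["spam"]) | set(counts["ham"])
    for w in text.lower().split():
        p_spam = (counts["spam"].get(w, 0) + 1) / (totals["spam"] + len(vocab))
        p_ham = (counts["ham"].get(w, 0) + 1) / (totals["ham"] + len(vocab))
        score += math.log(p_spam / p_ham)
    return score

posts = [("free gold unban protest", "spam"),
         ("unban unban unban", "spam"),
         ("great match last night", "ham"),
         ("patch notes discussion thread", "ham")]
counts, totals = train(posts)
print(spam_log_odds("unban protest now", counts, totals) > 0)  # True
```

As the parent notes, ASCII art would need separate handling since it has almost no useful word tokens.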

------
2FA4spam
How about a simple out-of-band confirmation requirement for every account
signup?

“Thank you for registering. Please send an SMS to number XXX with code YYY to
activate your account.”

Kind of like a reverse 2FA.

~~~
mrweasel
I think that would kill signup rates, mostly because it's so different from
other solutions, but damn is it an interesting approach.

You may need to add something that prevents the same phone number from being
used for multiple signups.

------
hinkley
You could also try detecting Selenium, but that could be cat and mouse as
well:

[https://stackoverflow.com/questions/33225947/can-a-website-d...](https://stackoverflow.com/questions/33225947/can-a-website-detect-when-you-are-using-selenium-with-chromedriver#41220267)

Remember, the goal is to flag accounts for cheap bulk rejection, without
telegraphing to the attacker.

~~~
hinkley
> without telegraphing to the attacker.

To extend this thought: I recommend that you stop rolling out single mitigations one at a time. Debugging is always confounded by having multiple simultaneous errors causing the same problem; doing one fix at a time just lets him ladder up with you. Knock out some rungs.

Always do pairs or triplets from now on if you can.

------
NetToolKit
We at NetToolKit have been working on related problems for years and might
have two products that directly address what you are looking for.

We launched Shibboleth (a CAPTCHA service) about a year ago, and you can
select from a variety of different CAPTCHA types (including some non-
traditional types; different types have different strengths and fun factors):
[https://www.nettoolkit.com/shibboleth/demo](https://www.nettoolkit.com/shibboleth/demo)
There are a variety of options that you can set, and you can also review user
attempts to solve CAPTCHAs to see if you want to make the settings more or
less difficult.

Recently, we launched Gatekeeper (
[https://www.nettoolkit.com/gatekeeper/about](https://www.nettoolkit.com/gatekeeper/about)
) which competes against Distil and others, but without fingerprinting.
Instead, site operators can configure custom rules and draw on IP intelligence
(e.g. this visit is coming from Amazon AWS or this IP address has ignored ten
CAPTCHAs in two minutes), and Gatekeeper will indicate to your website how it
should respond to a request based on your rules. There's also other
functionality built in, such as server-side analytics. Some light technical
integration is required, but we're happy to help with that if need be.

As with all NetToolKit services, we have priced both of these services very
economically ($10 for 100,000 credits, each visit or CAPTCHA display using one
credit).

We would very much appreciate a conversation, even if it is only for you to
tell us why you think our solutions don't fit what you are looking for. I
would be happy to talk to you over the phone if you send me your phone number
via our contact form:
[https://www.nettoolkit.com/contact](https://www.nettoolkit.com/contact)

~~~
ev1
Yep, unfortunately usage based billing is not possible for us. We can't use
any usage based cloud services at all due to abuse and attacks - can't even
host a simple avatar or button images on S3 without someone trying to infinite
loop curl them to blow through budget abusively. On top of that, if you're
going to reverse proxy the site, your service will probably be hit repeatedly
with 300G+ attacks.

Do you have an email (ideally one that doesn't pipe into a ticket system)?
Maybe I can share some possible/creative attacks we've seen that you can
improve your service with, even if it's out of budget for us.

As a comparison note, Stackpath does 1mil requests for $10/m.

~~~
NetToolKit
Our contact form does not pipe into a ticketing system (it goes into
support@[our domain, available via our profile link], which is just a G Suite
email account that you can use to contact us directly).

I'd very much appreciate hearing your thoughts about attacks and understanding
what an effective solution would be. Thanks also for your note about Stackpath
-- we aren't a CDN, but Gatekeeper could help reduce bandwidth usage by
denying requests.

~~~
ev1
I mean that is the price for Stackpath WAF (captcha, rate limiting, etc) :)

------
HEHENE
This may run afoul of your "no privacy invading methods", but are you able to
implement email verification before new users can post? Then once they get
bored of trying to attack the site you can go and purge all accounts created
in the last n days that haven't been verified yet.

I run a gaming community with several thousand members and we regularly have
to fend off attacks on both the community (spam bots in Discord) and the game
servers themselves (targeted DDOS attacks usually in the 200-300Gbps range.)

From my experience, they tend to get bored and move on rather quickly, so oftentimes whatever we have to implement is temporary in nature and doesn't really affect the existing community much, if at all.

~~~
ev1
Email verification is already required and always has been.

He's cycling through handfuls of oddball throwaway/disposable providers, some catchalls. We block all known temporary email providers, but there are a few that are obscure/blackhat/let you point an MX record from any free dynamic DNS provider to enable abuse.

Another interesting thing is that after we blocked all known VPN provider
space, he switched to more "darknet" proxy providers that pretend to be
legitimate by having random eastern european dirty IP blocks announced on
Comcast/Verizon AS.

A human eyeball can detect them; they're all pretty obviously following a pattern like NameNameName or random letters, but I'm unsure how I'd write something to catch this in an automated fashion.
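For the NameNameName pattern specifically, an automated check could start with heuristics like these (regex and thresholds are invented for illustration and would need tuning against real registrations to keep false positives down):

```python
import re

# Three or more title-case words glued together, e.g. "BraveSilentWolf".
CAMEL_RUN = re.compile(r"^(?:[A-Z][a-z]+){3,}$")

def looks_generated(username: str) -> bool:
    if CAMEL_RUN.match(username):
        return True
    # "Random letters": an almost vowel-free handle is a cheap tell
    # for keyboard mash; only applied to reasonably long names.
    letters = [c for c in username.lower() if c.isalpha()]
    if len(letters) >= 8:
        vowel_ratio = sum(c in "aeiou" for c in letters) / len(letters)
        return vowel_ratio < 0.15
    return False

print(looks_generated("BraveSilentWolf"))  # True: glued title-case words
print(looks_generated("xkcdqwrtplmnb"))    # True: vowel-free mash
print(looks_generated("jonathan"))         # False
```

Flagged names would go to a quiet review queue rather than a visible rejection, per the "don't telegraph" advice elsewhere in the thread.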

Oddly, this actually started over ~2 months ago, and it just started again this week after a few weeks of no activity or attempts at all. Our complete VPN block resulted in no successful activity for 9 days.

He also periodically tries to re-register from the same home IP once a month, claiming to be a new account and asking why he's getting banned, etc.

~~~
heartbeats
You could whitelist the email providers, and require "strange" email providers to be approved by mods. The workflow would look like this:

1. Sign up with Gmail
2. Verify email
3. Account is instantly approved

1. Sign up with sharklasers
2. Verify email
3. "You're using a weird email provider. Mods will look at your account and see if it looks OK. If so, we'll approve it"
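A minimal sketch of that triage, with an illustrative allowlist (a real one would be much longer and maintained alongside a disposable-domain blocklist):

```python
# Domains here are examples only; the return values drive the two
# workflows above: instant approval vs. a moderator queue.
WELL_KNOWN = {"gmail.com", "outlook.com", "yahoo.com", "protonmail.com"}

def signup_route(email: str) -> str:
    domain = email.rsplit("@", 1)[-1].lower()
    return "auto-approve" if domain in WELL_KNOWN else "mod-queue"

print(signup_route("player@gmail.com"))   # auto-approve
print(signup_route("x@sharklasers.com"))  # mod-queue
```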

~~~
hinkley
Don’t telegraph that information, I think. Better perhaps for the automatic
approval to look like a fast human and the manual approval to look like a
slower one. A process doesn’t have to be manual to look manual. The goal here
is to reduce the cost per request.

------
bo1024
Sorry to hear you're dealing with this. I'm not in the field, but this is a
case where I would abstractly be tempted to use javascript blockchain mining
or similarly require some amount of useless computation by the browser during
signup.

------
Pick-A-Hill2019
Some good answers in the stuff others have posted (Especially the
accessibility one).

You don't provide many details of what you do and do not have at your disposal
in terms of skills, tech stack, access to log files etc so this is a non-
expert cut and paste from SO [1]. Yeah I know (StackOverflow) and it doesn't
even relate directly to your problem ....But if you read the long bit below it
might give you a bit of blue-sky thinking.

>> The next is determining what behavior constitutes a possible bot. For your
stackoverflow example, it would be perhaps a _certain number of page loads in
a given small time frame from a single user (not just IP based, but perhaps
user agent, source port, etc.)_

Next, you build the engine that contains these rules, collects tracking data,
monitors each request to analyze against the criteria, and flags clients as
bots. I would think you would want this engine to run against the web logs and
not against live requests for performance reasons, but you could load test
this.

I would imagine the system would work like this (using your stackoverflow
example): The engine reads a log entry of a web hit, then adds it to its
database of webhits, aggregating that hit with all other hits by that unique
user on that unique page, and record the timestamp, so that two timestamps get
recorded, that of the first hit in the series, and that of the most recent,
and the total hit count in the series is incremented.

Then query that list by subtracting the time of the first hit from the time of
the last for all series that have a hit count over your threshold. Unique
users which fail the check are flagged. Then on the front-end you simply check
all hits against that list of flagged users, and act accordingly. Granted, my
algorithm is flawed as I just thought it up on the spot.

If you google around, you will find that there is lots of free code in
different languages that has this functionality. The trick is thinking up the
right rules to flag bot behavior. <<
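The engine the quote describes might be sketched like this in Python (log format, keys, and thresholds are all illustrative):

```python
from collections import defaultdict

def flag_bots(log_entries, max_hits=100, window_seconds=60.0):
    """Fold log entries into per-client series (first seen, last seen,
    hit count) and flag clients whose rate exceeds the threshold.
    Each entry is (timestamp, ip, user_agent)."""
    series = defaultdict(lambda: [None, None, 0])  # key -> [first, last, hits]
    for ts, ip, user_agent in log_entries:
        key = (ip, user_agent)  # per the quote: not IP alone
        s = series[key]
        s[0] = ts if s[0] is None else s[0]
        s[1] = ts
        s[2] += 1
    flagged = set()
    for key, (first, last, hits) in series.items():
        if hits > max_hits and (last - first) <= window_seconds:
            flagged.add(key)
    return flagged

burst = [(i / 10, "203.0.113.7", "java/1.8") for i in range(500)]   # 500 hits in ~50s
human = [(i * 30.0, "198.51.100.2", "firefox") for i in range(5)]   # 5 hits in 2 min
print(flag_bots(burst + human))  # only the bursty (ip, ua) pair is flagged
```

As the quote suggests, this would run against web logs offline rather than in the request path, with the flagged set consulted on the front end.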

[1] [https://stackoverflow.com/questions/6979285/protecting-from-...](https://stackoverflow.com/questions/6979285/protecting-from-registration-bots)

------
issa
I've had a lot of luck with variants of a honeypot. Add a visually hidden field, and any time it is submitted with content, block the post. Super simple, and with some creativity it's hard for the bots to keep up.

~~~
birdman3131
Even better: take a current field, hide it, and put a replacement in for it (i.e. hide FirstName and add an FName field). It's more likely to be triggered.

------
boredatworkme
You have received some great suggestions so far.

One of the forums that I frequent has a "newbie" section, which is not visible
to full members or guests (who are not logged in). Whoever registers to the
website needs to get a predefined number of "Likes" on their posts. Not every post gets a "like" - only those that contribute to the discussion do (not everyone needs to agree; debates are welcome as long as they are civil).

This helps maintain the quality of the forum to an outside viewer and cuts out
a large amount of spam.

~~~
boredatworkme
Couldn't edit, so replying to my comment:

The newbie section comes with a limited number of posts per day - so for
example, I sign up today, I can post a maximum of 5 posts per day, and this
limit goes up as I accumulate "likes".

I'm not sure if this will help if there are no community moderators trying to
share a bit of the workload though.

------
miki123211
Be sure you do email verification before users are able to post. Block domains
of temporary email services (there are lists floating around GitHub, Google is
your friend). Only allow one account per address. Figure out what domains the
spammer is using to make email accounts. If you can, block them entirely, if
not, require manual approval just for those domains. Use the other suggested
techniques, like shadowbanning, etc. Consider requiring or allowing social
log-in or phone number verification.
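
A rough sketch of those address checks (the domain lists below are placeholders; in practice you'd load a maintained disposable-domain blocklist):

```python
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com"}  # illustrative only
MANUAL_REVIEW_DOMAINS = {"abused-provider.example"}           # domains the spammer favours

def classify_email(address: str, existing_addresses: set) -> str:
    """Return 'reject', 'review', or 'ok' for a registration address."""
    address = address.strip().lower()
    domain = address.rsplit("@", 1)[-1]
    if address in existing_addresses:   # one account per address
        return "reject"
    if domain in DISPOSABLE_DOMAINS:
        return "reject"
    if domain in MANUAL_REVIEW_DOMAINS:
        return "review"
    return "ok"
```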

~~~
phantom123
If you need a reliable & good API to do email verification, check out
[https://removebounce.com/](https://removebounce.com/)

------
Seb-C
It happened to me a long time ago. He was not only spamming and using SQL
injections to destroy my community, but also advertising his own competing
website.

When I also started to build scripts and destroyed his own website, he
basically realized the harm he was doing, apologized and stopped.

Reminds me of the good old days when you could trap script kiddies on MSN.

"I think you are lying and not capable of hacking my computer. I'm waiting for
you, my IP is 127.42.196.8"

------
freitasm
Cloudflare perhaps with a firewall rule that blocks bots over a certain
threshold? It may fall under fingerprinting if you need to know that.

------
ve55
I mentioned quite a few alternatives to ReCAPTCHA that often work in
situations like yours here: [https://nearcyan.com/you-probably-dont-need-recaptcha/](https://nearcyan.com/you-probably-dont-need-recaptcha/)

Some of the best solutions include very minimal/quick captchas, or simple
checks for things like JavaScript support.

~~~
ev1
Thanks for the article! In this case, this is custom spam from someone that
has spent a few hours looking at the network tab in devtools. The bad actor is
running headless browsers after we started doing basic cookie and JS checks;
we added some additional JS checks for basic things like whether it's an
800x600 window or similar - this stopped the spam for a few days until he
figured it out.

~~~
ve55
That's good, at least if they're not too skilled sometimes the ratio of time
spent playing cat vs. mouse with them isn't too bad, since it takes them a
long time to bypass measures.

You could also look into things on other layers rather than the application
level, for example, maybe the IPs they're using all come from similar VPN
providers or services?

------
paulintrognon
Loosely related discussions, for reference:
[https://news.ycombinator.com/item?id=23089599](https://news.ycombinator.com/item?id=23089599)
[https://news.ycombinator.com/item?id=20058697](https://news.ycombinator.com/item?id=20058697)

------
phenkdo
I think we might need more creative solutions to the problem of spam
<everything> including online reviews, posts, phone calls. Some kind of PKI id
verification system that every user should sign up with. Sure, that will turn
off a lot of users, but the trade-off is that only the true enthusiasts will
participate.

------
Rotten194
Can you enable hcaptcha with a whitelist for known-good accounts? Not ideal
but might annoy them enough to give up.

~~~
ev1
Unfortunately, all (ok, the only) hcaptcha plugins available basically just
let you enter a sitekey, with no other configuration possible.

~~~
GoblinSlayer
Simply don't add hcaptcha to the page for approved accounts or ignore absence
of a solved captcha on the server in submitted data. If you can't whitelist
approved accounts, then you have a problem.

------
vmception
> Negotiation is not really an option on the table, the last time one of the
> other volunteers responded at all we got a ~150Gbps volumetric attack

that's hilarious, have you tried trolling them back like just enjoying their
company? start saying pool's closed and things like that

------
codegeek
Could new accounts be barred from posting until they get manually approved? Or
is that not feasible due to the volume you have? I would make sure that the
first few posts by a new account are not automatically approved/shown to
anyone.

------
ehutch79
Temporarily charge a dollar to create a new account?

Also, do you have cloudflare in front of you?

------
methodmi
Do you know if they are using headless browsers? If so, we can block them
without fingerprinting and without captchas.
[https://methodmi.com](https://methodmi.com)

~~~
coolspot
> MMI’s primary implementation is through the use of a JavaScript tag which
> interrogates the device to which the ad was delivered for the presence of a
> graphical processing unit (GPU). The JavaScript creates a Canvas Element in
> HTML5, which allows access to WebGL (Web Graphics Library), a JavaScript API
> for rendering interactive graphics within any compatible web browser without
> the use of plug-ins. An IVT classification decision is made based on the
> results of the rendering capability of the device.

How can this not be emulated by a headless browser?

------
aoqooqoqoq
Try using a proof-of-work system. It won’t differentiate between humans and
bots but it’ll significantly slow down the pace they can register accounts,
and it is completely transparent to legitimate users.
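
One common shape for this is a hash-preimage puzzle: the server issues a random challenge, the client burns CPU finding a nonce, and the server verifies with a single hash. A minimal sketch (the leading-zero-bits scheme and difficulty values here are one arbitrary choice among many):

```python
import hashlib
import itertools
import os

def make_challenge() -> str:
    """Server: issue a fresh random challenge per registration attempt."""
    return os.urandom(8).hex()

def solve(challenge: str, difficulty: int = 12) -> int:
    """Client: find a nonce whose SHA-256 digest has `difficulty`
    leading zero bits. Cost grows as 2**difficulty on average."""
    target = 1 << (256 - difficulty)
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify(challenge: str, nonce: int, difficulty: int = 12) -> bool:
    """Server: one hash to check, regardless of how long solving took."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty))
```

In a real deployment the solving would happen in the browser (JS/WebAssembly) and the server would also make each challenge single-use.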

------
dawnerd
Are you sure they’re not posting directly to the registration endpoint and
bypassing the signup form? We just had this problem with spammers in China and
QQ emails. Adding a nonce helped dramatically.
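
One way to build such a nonce is an HMAC-signed timestamp embedded in the rendered form, so direct POSTs that never loaded the page fail. A sketch (a production version should also make each nonce single-use):

```python
import hashlib
import hmac
import secrets
import time

SECRET = secrets.token_bytes(32)  # server-side secret, never sent to the client

def issue_nonce() -> str:
    """Embed this in a hidden field when the signup form is rendered."""
    ts = str(int(time.time()))
    sig = hmac.new(SECRET, ts.encode(), hashlib.sha256).hexdigest()
    return f"{ts}.{sig}"

def check_nonce(nonce: str, max_age: int = 600) -> bool:
    """Reject POSTs with a missing, forged, or stale nonce."""
    try:
        ts, sig = nonce.split(".", 1)
    except ValueError:
        return False
    expected = hmac.new(SECRET, ts.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    return int(time.time()) - int(ts) <= max_age
```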

~~~
ev1
We have a number of required fields that are generated in client-side JS and
not present in the underlying HTML. Our best guess based on logs is someone
walking through signup with devtools open, and then trying to replicate the
requests. A day or two of reprieve whenever some funky change is made, like
making the form element default to action="/a/url/that/bans/the/person" and
then when JS loads and it detects more than 1000 pixels worth of mousemoves,
replace the form action with the real signup endpoint.

It's a standard forum software with some plugins, all <form> elements are CSRF
protected with a random value, etc.

~~~
hinkley
I wonder if you could rotate mitigations to keep him off balance. Using
different ones for different IP ranges may complicate his ability to analyze
the code.

------
raverbashing
A JS tarpit (not sure this is the right name) might help

You add a JS snippet that does some work, but if you detect a bot you make it
do increasingly more work. Think bitcoin mining but not actually that

------
paxys
Do you have any problems with hCaptcha other than its accessibility? It sounds
like that is a much easier problem to solve than everything else people are
suggesting in this thread.

------
compsciphd
naive Q: how damaging would it be for you to stop accepting new accounts
temporarily? or put differently, how many legitimate accounts are created on a
daily basis? (single digits?)

------
sktguha
You could try Google reCAPTCHA v3, which does not require any user input and
runs in the background. Not 100% sure if it helps in your case, but I think it
should definitely help.

~~~
bo1024
This falls under fingerprinting and privacy-invasive methods.

------
renewiltord
Recaptcha it and wait them out. He can't do it forever. Then unrecaptcha. Just
do a month. You'll lose some sign-ups and then you can go back to the old
thing.

~~~
encom
Aren't there captcha services that are less creepy than Googles?

~~~
johneth
hCaptcha[1] springs to mind, not that I know if they're more or less creepy
than Google.

[1] [https://www.hcaptcha.com/](https://www.hcaptcha.com/)

------
petre
Add e-mail verification and introduce a random 30-60 min delay before sending
the verification e-mails. Then you can cut out disposable e-mail domains and
so on.

------
rexfuzzle
Might be slightly unorthodox, but email the first post to a Gmail account from
a random address and see if it is marked as spam, only display the post if
not.

~~~
viraptor
Sorry, but that will backfire badly. The source IP / email / domain will be
blacklisted in general very quickly. Also, lots of targeted forum spam would
not count as "random email" spam.

------
GoblinSlayer
Can't he just do that 150Gbps thing to make you do anything? Also can't you
allow aimbots? You can keep them in a separate space and put them in a
separate leaderboard.

------
dorgo
[https://xkcd.com/810/](https://xkcd.com/810/)

