
Stop the Bots: Practical Lessons in Machine Learning - new_here
https://blog.cloudflare.com/stop-the-bots-practical-lessons-in-machine-learning/
======
hannes0x21
"we issued more than 660,000 challenges, of which only 0.32% were solved —
meaning our algorithm detected bots with a 99.68% rate of True Positives."

Isn't that a bit optimistic? The captcha might have driven people away.
Especially on mobile, I'm not too keen on clicking photo tiles containing
storefronts. Is there a way to detect false positives?
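
To put numbers on that worry, here is a rough sketch of the arithmetic in
Python. It assumes every solved challenge came from a human and that no bot
ever solves one; `human_abandon_rate` is a hypothetical parameter, not a
figure from the post.

```python
challenges = 660_000   # "more than 660,000 challenges", as quoted
solve_rate = 0.0032    # 0.32% solved, as quoted

solved = round(challenges * solve_rate)   # challenges that were solved
unsolved = challenges - solved            # challenges left unsolved


def bot_share(human_abandon_rate):
    """Estimate the share of challenges that went to bots.

    human_abandon_rate is the (hypothetical) fraction of challenged
    humans who gave up instead of solving the captcha.
    """
    # If only (1 - rate) of challenged humans solve, the number of humans
    # who were challenged is solved / (1 - rate).
    humans_challenged = solved / (1 - human_abandon_rate)
    return 1 - humans_challenged / challenges


for rate in (0.0, 0.5, 0.9):
    print(f"abandon rate {rate:.0%}: bot share {bot_share(rate):.2%}")
```

With no abandonment this reproduces the quoted 99.68%. If half of the
challenged humans walked away, the estimate only drops to about 99.36%, so the
headline number is not very sensitive to abandonment, but it cannot measure
abandonment either.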

------
penagwin
I know some people from Cloudflare frequent HN, so I'll ask here: how are you
guys getting the data to label?

You said you get it from the traffic you serve, but wouldn't this be a privacy
issue? If I host a WordPress blog and use Cloudflare, does that mean there's
the possibility of a human reviewing a login request, potentially revealing a
user's password?

(Disclaimer: I use Cloudflare for personal projects, and yes, I know Cloudflare
could be recording/MiTM everything anyway, and they need to MiTM to provide
their service. However, I generally trust them.)

~~~
jgrahamc
No, a human does not have access to your password.

For the purposes of machine learning we can do something like this: as a
request passes through us, see if it's a POST to /wp-admin or similar, then see
if the response is 200 or 302 (which would tell us if the login worked or not).
All of that is done by code, not people. Use that as a label, "good login" or
"bad login", then see if there are lots of "bad login" events for certain
characteristics and use that to predict what's a bot.
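
A minimal sketch of that labelling idea in Python, not Cloudflare's actual
code. The event fields (`method`, `path`, `status`, `ip`), the function names,
and the `/wp-login.php` path are all hypothetical. The mapping of 302 to a
successful login and 200 to a failed one is an assumption based on typical
WordPress behaviour; the comment above does not say which is which.

```python
from collections import defaultdict

# "/wp-admin or similar"; the second path is an assumption for illustration.
LOGIN_PATHS = ("/wp-admin", "/wp-login.php")


def label_request(method, path, status):
    """Return "good login", "bad login", or None for non-login traffic."""
    if method != "POST" or not path.startswith(LOGIN_PATHS):
        return None
    # Assumption: a redirect means the login worked, while a 200 means the
    # login form was served again after a failure.
    if status == 302:
        return "good login"
    if status == 200:
        return "bad login"
    return None


def bad_login_rates(events, key="ip"):
    """Fraction of labelled logins that were bad, grouped by one characteristic."""
    counts = defaultdict(lambda: {"good login": 0, "bad login": 0})
    for event in events:
        label = label_request(event["method"], event["path"], event["status"])
        if label is not None:
            counts[event[key]][label] += 1
    return {
        value: c["bad login"] / (c["good login"] + c["bad login"])
        for value, c in counts.items()
    }


events = [
    {"method": "POST", "path": "/wp-login.php", "status": 200, "ip": "203.0.113.7"},
    {"method": "POST", "path": "/wp-login.php", "status": 200, "ip": "203.0.113.7"},
    {"method": "POST", "path": "/wp-login.php", "status": 302, "ip": "198.51.100.4"},
    {"method": "GET", "path": "/index.html", "status": 200, "ip": "198.51.100.4"},
]
rates = bad_login_rates(events)
```

Only the method, path, and status code are inspected here; the request body,
and therefore the password, is never read. A high bad-login rate for some
characteristic would then be one input to a bot prediction, not a verdict on
its own.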

~~~
penagwin
Makes sense, thanks for replying! A lot of companies don't communicate much,
it's refreshing to have the CTO reply with a real response.

~~~
jgrahamc
:-) I have code that detects any mention of Cloudflare on HN and emails me
directly. Latency is a few seconds from post to email. Thanks for being a
customer.

~~~
ttul
Since you’re watching this space... the data you have on web site attacks
would be valuable in detecting phishing attacks too.

