
Google Faces The Slickest Click Fraud Yet - aresant
http://www.forbes.com/2010/01/12/google-click-fraud-tech-security-trafficsolar.html
======
eli
We don't sell anything directly on our websites, but I've been fighting a
massive botnet of click fraudsters who come to the site off an ad and then
visit a bunch of random other pages and submit any forms they encounter with
gibberish.

It's quite annoying. In addition to paying for bogus clicks, it throws off all
the analytics. And there are thousands of IPs involved (mostly in Asia).

~~~
dbz
I have an idea for you (aside from banning all of those annoying IPs). Try
mapping the courses the random clicks have. If you see a pattern, then once a
bot starts the pattern, have a form pop up asking it if it is human. If there
is no pattern, then you can still initiate a random form pop up asking if the
user is a bot. I think you will probably stop some of the bots.

Fine. I'll suggest banning IPs. If you find a form filled with random
characters- ban the bot. And then maybe- there might be software which can
tell you if certain text is a sentence. If that exists- run that software on
the posts, and if there is a large percentage of non sentences, then you can
either ban that IP or pop up a form asking for human confirmation! If the
gibberish is random sentences, you can try searching for common words. If you
find none, then you can auto have a form pop up. Gosh. Lots of work =/
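
A rough Python sketch of the "search for common words" idea above. The word list, the 0.1 threshold, and the function names are all assumptions made for illustration; a real check would need a much larger word list and tuning against actual form submissions.

```python
import re

# Tiny illustrative word list (an assumption); a real one would be far larger.
COMMON_WORDS = {
    "the", "a", "an", "and", "or", "to", "of", "in", "is", "it", "i", "you",
    "for", "on", "with", "my", "we", "this", "that", "please", "have", "be",
    "are", "can", "me",
}


def common_word_ratio(text):
    """Return the fraction of alphabetic tokens that are common English words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in COMMON_WORDS)
    return hits / len(tokens)


def looks_like_gibberish(text, threshold=0.1):
    """Flag text whose share of common words falls below the threshold."""
    return common_word_ratio(text) < threshold
```

Short inputs (a name, a one-word answer) will often be flagged too, so this is better suited to free-text fields, and to triggering a human check rather than an outright ban.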

I have lots of ideas; however, lots of them won't be easy to implement for
various reasons. Well, if you want more ideas- Feel free to ask me!

Man. I've always wanted to design anti-bad-things software. Makes me feel
superior?

~~~
eli
Yup, you've got the right idea.

I've managed to figure out a few peculiar things the bots do while crawling
combined with the fact that they seem to only use one of a handful of legit,
but very specific user agents. I've got some mod_security rules + a script
that combs logs and passes the IP to an iptables script to block them. Are you
familiar with mod_security? The default rules are a little too aggressive for
my liking, but it's a really fantastic tool.

Adding a CAPTCHA to all forms would be an easy solution, but it's just not
practical on many of our forms. I've had a good deal of success adding a
hidden form field (that is, one that's set to display:none by a CSS rule) and
then ensuring that it hasn't been tampered with when the form is submitted.
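
A minimal Python sketch of that hidden-field check, assuming the submitted form arrives as a dict of field names to string values. The field name `website_url` and the empty expected value are hypothetical, not details of the setup described here.

```python
# Hypothetical name of the field hidden from humans by a display:none CSS rule.
HONEYPOT_FIELD = "website_url"
# Value the page renders into the hidden field (assumed empty here).
HONEYPOT_EXPECTED = ""


def honeypot_tampered(form):
    """Return True if the hidden field is missing or its value was changed.

    A human never sees the field, so it should come back exactly as rendered.
    """
    return form.get(HONEYPOT_FIELD) != HONEYPOT_EXPECTED
```

A submission that trips the check can be dropped quietly, so the bot gets no signal that it was caught.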

The gibberish is typically random letters and numbers, but it's smart enough
to fill all numbers in fields expecting phone numbers and email addresses in
fields expecting emails.

The scale of the operation is daunting, though. My script is pretty
conservative (I really don't want to block legit users) and it still picks up a
few hundred new IP addresses every day. I expire the bans after a week or the
list would get unmanageable.
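
The week-long expiry could be tracked along these lines. This is only a sketch: the `BanList` class and its methods are hypothetical, and the step that actually adds or removes firewall rules is left out.

```python
from datetime import datetime, timedelta

BAN_DURATION = timedelta(days=7)  # bans lapse after a week


class BanList:
    """Track when each IP was banned so old bans can be dropped."""

    def __init__(self):
        self._banned_at = {}

    def ban(self, ip, now):
        """Record (or refresh) a ban for this IP at time `now`."""
        self._banned_at[ip] = now

    def expire(self, now):
        """Remove bans at least BAN_DURATION old; return the IPs to unblock."""
        expired = sorted(
            ip for ip, when in self._banned_at.items()
            if now - when >= BAN_DURATION
        )
        for ip in expired:
            del self._banned_at[ip]
        return expired

    def active(self):
        """Return the currently banned IPs in sorted order."""
        return sorted(self._banned_at)
```

Run daily, `expire` returns the addresses whose firewall rules can be removed, which keeps the list bounded even with a few hundred new IPs a day.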

And dealing with Google is very frustrating. Google says they've already
detected all fraudulent clicks and credited us. I think they're wrong, but I
don't see how I could possibly prove it.

~~~
dbz
Sadly, I'm not familiar with a lot of security tools. I've heard about that form
trick and I'm kicking myself for forgetting about it and not mentioning it;
however, you already knew so no harm there =p

Yeah. Captcha is an idea that I would have mentioned, but my ideas for
captchas are insane. When I start talking about all of the little things I
want to do I start sounding ridiculous. But I definitely suggest putting a
basic one on all of your forms. It's quite simple to make actually. PHP
provides all of the necessary "image creation" functions you would need.

Indeed. However, a smart bot will still not fill in a form box with display
off. My solution is to provide a blank field and literally tell the user "If
you fill out this box, then the submission of this form will be ignored" or
something of the like. Of course this won't stop bots which have been fine
tuned to "attack" your site. Okay. I'll stop ranting. I just love talking
about the subject.

The scale of the operation really is daunting. Sadly without setting up subtle
things all over the place and then scrutinizing the data, it becomes hard to
do anything at all. BTW that is a LOT of IP addresses. How many of those are
actually confirmed bots?

As for proving Google wrong: it would be very difficult. I'd love to help if I
could haha. I'll throw out a couple of ideas that come to mind seeing as I'm
talking about one of my favorite topics and well...you can't possibly stop me!

You can detect the site from which they just came. If you can confirm the
bot

(Yeah. Hard part, but someday if someone doesn't do it first- I'll make the
software necessary myself -and of course make it free. Open sourcing it would
be a tricky issue because then the enemy can read it! I don't like making a
smarter enemy.)

and put that together with the site, you can investigate where the clicks are
coming from. If they are coming from several sites with Google AdSense or
whatever, well. That can turn out to be pretty compelling evidence. (odds are)
Furthermore, a lot of those IP addresses will be from Asia. Pretty compelling
evidence. K. That was racist- but also a statistically proven fact =/. When
you stack up similarities, like maybe even browsing time on your site- you may
be surprised at the quality of the argument you can make. I mean. I am
assuming the attackers aren't like me. If I were an attacker I would spend
weeks forming the software so that I wasn't bested by someone like me. But
hey. Nobody is perfect. Find similarities. Get your money back! RAWR
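
To make the referrer idea concrete, here is a small Python sketch that counts which sites the flagged visitors arrived from. It assumes the logs have already been reduced to (IP, referrer URL) pairs and that a set of confirmed bot IPs exists; both inputs and the function name are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def referrers_of_flagged(hits, flagged_ips):
    """Count referrer hostnames for hits that came from flagged IPs.

    `hits` is an iterable of (ip, referrer_url) pairs.
    """
    counts = Counter()
    for ip, referrer in hits:
        if ip in flagged_ips:
            # An empty or unparseable referrer has no hostname.
            host = urlparse(referrer).hostname or "(none)"
            counts[host] += 1
    return counts
```

If a handful of ad-serving hosts account for most of the flagged hits, that list is the kind of similarity worth stacking up with browsing time and user agent.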

------
greg
This doesn't read so much like click fraud to me as hijacking a user's browser
to convert organic traffic into affiliate credit. The fraudulent clicks are
merely a cover story.

~~~
aresant
Click-fraud is the carrot - that's why these guys are doing it.

Google's best layer for detecting click-fraud is tracking click-to-buy ratio,
particularly for large advertisers.

This company has figured out how to make fake-clicks appear to have a good
click-to-buy ratio.

------
gyardley
There's got to be something more here. Redirects through seven partners - most
likely paying on net 30 terms and taking a cut of the revenue themselves for
brokering - would reduce the money torrent to a long-delayed trickle. Perhaps
the intermediary partners are placing cookies for targeting purposes and
therefore adding a bit to the revenue stream?

I'm also surprised that clicks are being generated from Google, of all places.
Faking the click from an affiliate of a comparison shopping site would be much
more lucrative (higher PPC due to higher purchase intent) and much less likely
to result in aggressive countermeasures (comparison shopping sites aren't
Google.)

------
ShabbyDoo
I worked for a site where we were considering going through an intermediary
for AdSense because we could earn a greater percentage of Google's take per
click than dealing with Google directly! It turns out that major partners can
cut such a good deal that they can pass on a better percentage even after
taking their own cut. Perhaps Google should consider reducing intermediaries
by paying out more fairly, even if doing so increases administrative costs.

------
robk
Cookie stuffing is an old trick. This seems like a way of defrauding AdSense
uniquely by juicing the conversion rate, which is easily doable if one views
the conversion code and puts some thought into this sort of method. But it
still seems easy enough for Google to catch over time as they track-back the
AdSense accounts.

