
Google Spam Heresy: The AdSense Paradox - wheels
http://blog.directededge.com/2011/01/06/google-spam-heresy-the-adsense-paradox/
======
revorad
From Larry and Sergey's original Google paper
(<http://infolab.stanford.edu/~backrub/google.html>):

 _Currently, the predominant business model for commercial search engines is
advertising. The goals of the advertising business model do not always
correspond to providing quality search to users. For example, in our prototype
search engine one of the top results for cellular phone is "The Effect of
Cellular Phone Use Upon Driver Attention", a study which explains in great
detail the distractions and risk associated with conversing on a cell phone
while driving. This search result came up first because of its high importance
as judged by the PageRank algorithm, an approximation of citation importance
on the web [Page, 98]. It is clear that a search engine which was taking money
for showing cellular phone ads would have difficulty justifying the page that
our system returned to its paying advertisers. For this type of reason and
historical experience with other media [Bagdikian 83], we expect that
advertising funded search engines will be inherently biased towards the
advertisers and away from the needs of the consumers._

Google solved the problem by turning it on its head, but now it looks like the
spammers have indirectly implemented what Google avoided.

------
hooande
The spamminess of a site is not proportional to its ad click through rates.
Strong web communities have the best click through rates (though they don't
usually generate the most ad revenue). Yahoo Mail has incredible click through
rates right now, and most people wouldn't classify it as being spammy.

The real question that google needs to answer is "Does this site exist to
serve ads, or do the ads exist to support the site?". This seems like a great
computer science problem to me, but I'm not sure if statistics will be the end
all solution. I think Google will need another quasi-social hack, ala
PageRank, to really solve it.

~~~
wheels
Sure -- I guess in general it wouldn't be click through rates, but click
through rates on pages arrived at via search traffic. Obviously a community
where members of a niche frequented a given site and clicked on related links
would be different. And you're also right that using the word _proportional_
was sloppy -- what was meant was correlated in assumedly meaningful ways.

Really this was the front end of the question of, "Could some application of
(negative) weighting based on the combination of search + adsense clicks help
to remove some of the spamminess that is currently being lamented on the
interwebs?"
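
A minimal sketch of what such a negative weighting could look like, in Python. Everything in it is an assumption made for illustration: the function name `demoted_score`, the 10% threshold, the linear penalty, and the 50% maximum penalty are hypothetical and do not describe anything Google is known to do.

```python
def demoted_score(base_score, search_visits, ad_clicks,
                  ctr_threshold=0.10, max_penalty=0.5):
    """Reduce a page's ranking score when the ad CTR of visitors who
    arrived via search is unusually high.

    All parameters are hypothetical tuning knobs, not real values.
    """
    if search_visits == 0:
        return base_score  # no search traffic, so no evidence either way

    ctr = ad_clicks / search_visits
    if ctr <= ctr_threshold:
        return base_score  # ordinary CTR, leave the score alone

    # Penalty grows linearly from 0 at the threshold to max_penalty at 100% CTR.
    excess = (min(ctr, 1.0) - ctr_threshold) / (1.0 - ctr_threshold)
    return base_score * (1.0 - max_penalty * excess)


if __name__ == "__main__":
    print(demoted_score(1.0, search_visits=1000, ad_clicks=30))   # ordinary page
    print(demoted_score(1.0, search_visits=1000, ad_clicks=550))  # suspicious page
```

Only clicks from search arrivals are counted, so a community site whose regulars click related ads would not be touched by this particular signal.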

------
rquirk
Google showing adverts on google.com/search is one thing, but I think the real
metric to look at is usage of the Google advert network on the sites
themselves. The more googlesyndication.com or doubleclick.net on a site, the
spammier it is.

The problem is that Google can't penalize a site for using these technologies.
And not because Google makes money from sites using these adverts, which they
do, but because it'd be an arms race. The guys taking advantage of SEO would
catch on and change advertisers. Google could penalize other advert networks
too (Bing say), but then they'd be open to accusations of anti-competitive
behavior.

~~~
bambax
It would be interesting to know the proportion of Google ad revenue between
displaying ads on its own properties (search, but also Gmail, etc.) and
networks of ads on external content.

The reason why spam exists is because people make money from it. If AdSense is
the problem, how much would it hurt to just drop it?

Certainly another system would appear, but then Google could discriminate
against any website with ads on it without being accused of being unfair to
competitors.

~~~
zone411
69% and 31%. It's in their SEC filings.

If Google discriminated against websites with ads on them, their SERP quality
would drop.

~~~
bambax
Thanks for the info.

But how would SERP quality drop if pages without ads received a boost?

Many high quality sites have ads, but not all. Yet all spam sites have ads (or
else what's the point of spamming).

So by discriminating against pages with ads, you eliminate spam at the
(possible) cost of slightly hurting legitimate sites with ads.

This should be testable.

~~~
notahacker
ExpertsExchange (sells subscriptions) or StackOverflow (funded by ads) ?

~~~
bambax
Good point, but ExpertsExchange has ads too. It also behaves in a spammy way,
burying the answers at the very end of the page. That should be detectable
behavior.

~~~
stcredzero
I bet there are a lot of things that spam pages have in common with regard to
things like page placement.

So, if Google gets better at quantifying "spammy-ness" there will be an
initial drop in spam effectiveness, but then spammers would catch on and
emulate the behavior of good pages.

Eventually, the spammers' behavior would be indistinguishable from genuine
posters.

<http://xkcd.com/810/>

------
zemaj
Don't really think that makes sense. Ad placement, site design, ad relevancy
etc. all dominate the reasons why people click ads. I don't think you're
going to get much information about content relevancy out of the CTR as you
can't hold those other features constant.

If Google is looking for how useful a page is, they're better off just looking
at whether people go back to their search results and click on another site
after a short delay.
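
A rough sketch of that signal in Python, assuming a click log where each entry records the result URL, how long the user stayed, and whether they came back to the results page. The tuple layout, the function name `short_click_rates`, and the 10-second cutoff are all hypothetical.

```python
from collections import defaultdict


def short_click_rates(events, max_dwell_seconds=10.0):
    """Return, per URL, the fraction of clicks where the user returned to
    the search results within max_dwell_seconds.

    events: iterable of (url, dwell_seconds, returned_to_results) tuples.
    The log format and the cutoff are assumptions for illustration.
    """
    clicks = defaultdict(int)
    quick_returns = defaultdict(int)
    for url, dwell_seconds, returned_to_results in events:
        clicks[url] += 1
        if returned_to_results and dwell_seconds <= max_dwell_seconds:
            quick_returns[url] += 1
    return {url: quick_returns[url] / clicks[url] for url in clicks}


if __name__ == "__main__":
    log = [
        ("useful.example", 140, False),
        ("useful.example", 4, True),
        ("spammy.example", 3, True),
        ("spammy.example", 6, True),
    ]
    print(short_click_rates(log))
```

A high rate suggests searchers did not find what they wanted on that page, whatever its ad layout looks like.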

~~~
al_james
I think it makes sense in extreme cases. For a 'real' page (e.g. with useful
content and ads that are secondary to the main content) CTRs are unlikely to
be that high. From experience I would say it's hard to get > 10% CTR in such a
site.

However, for 'spam pages' (e.g. those that have little information and ad
links in primary positions) it's common to have much higher CTRs. The prime
examples are those sites that present AdSense links in with navigational links
so it's not clear you are clicking on an ad. CTRs of >50% are not uncommon.

~~~
getsat
This is correct. If you don't present the exact information someone is looking
for but provide links that look like they will, they will click those links.

------
jacquesm
Any search engine that capitalizes on a new and so far unused element of the
content that it indexes (in the case of google the links) and that becomes
successful because of this will cause the destruction of the very thing that it
capitalizes on.

So, for altavista, that was meta keywords and on page text, for google it is
the link structure of the web.

------
leif
I find the assumption that being on the correct site precludes me from
clicking through to ads or vice versa faulty.

------
Tichy
I've been amused by that for a while - in effect, Google is trying to destroy
its own business model. With perfect searches, there would be no need for
ads.

~~~
chopsueyar
Just kill Adsense, and the problem would quickly be solved.

~~~
Tichy
How would they make money?

~~~
chopsueyar
Adwords

------
rmc
This would be a temporary fix. Spammy sites would move to a non-AdSense
platform, and in Google's eyes would appear to be totally ad-free.

------
thailandstartup
I wonder if there would be a way to split Google up into multiple companies to
deal with the (multiple) conflicts of interest it has vis-à-vis providing both
search and buying/selling advertising. I doubt it would be good for Google
stock holders, but good for the market in general.

~~~
DennisP
If you did that, how would the search company make money, if it's not involved
with advertising?

~~~
thailandstartup
I imagine they'd still make their revenue from advertising - they'd have an
adsense bar. This would put the search part of the business on the same
footing as other sites selling advertising space.

I'm imagining a world where there are multiple adsenses (ad brokers) competing
effectively with each other for space, driving out supernormal profit for
brokers, and returning the bulk of the advertising revenues to the publishers
rather than to the middleman (currently Google). With more revenue for the
publishers, more publishers come online and drive out supernormal profits for
publishers too. That would be a more efficient market.

~~~
chopsueyar
Sounds like CPM...or tv advertising.

 _With more revenue for the publishers, more publishers come online and drive
out supernormal profits for publishers too._

We've been seeing that for a long time...In the mid 90s porn affiliate stuff
would pay tens of dollars per click, from banners!

*By "publishers", you mean people that make websites with content that have ads.

------
DanielBMarkham
Couple points to add to the discussion.

You really don't want folks clicking your ads because they can't find what
they want, you want them clicking your ads because they found what they want,
and now they are on to a new search. A good way to check for this is to use a
Google search box on your page. If folks are going back to Google for another
search, they probably didn't get what they want the first time. You can also
provide trackable links. So if on my <http://facebook-login-help.com> page you
really wanted facebook login, I provide a link. Then I can track folks that
are in the wrong place (But I haven't hooked up the links yet because the
traffic was so low). It's in the publisher's best interest to provide lots of
targeted information and be clear and honest about what they are doing.
Another good metric is looking at time on site. If the people are only there
for 5 seconds, you got a bad visit. If they are there 2 minutes, you must be
doing something right.
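
A small Python sketch of the time-on-site check, using the 5-second and 2-minute figures above as cutoffs. The function names and the "unclear" middle bucket are assumptions added for illustration.

```python
def classify_visit(seconds_on_site, bad_cutoff=5, good_cutoff=120):
    """Label a visit by its duration. Cutoffs follow the 5 s / 2 min
    rule of thumb; the 'unclear' bucket in between is an assumption."""
    if seconds_on_site <= bad_cutoff:
        return "bad"
    if seconds_on_site >= good_cutoff:
        return "good"
    return "unclear"


def summarize_visits(durations):
    """Count how many visits fall into each bucket."""
    counts = {"bad": 0, "unclear": 0, "good": 0}
    for seconds in durations:
        counts[classify_visit(seconds)] += 1
    return counts


if __name__ == "__main__":
    print(summarize_visits([3, 5, 40, 130, 600]))
```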

An ideal page is where somebody is looking for X, finds it, then realizes that
they need to purchase Y in order to do X. You've provided them necessary
information; the advertiser and ad network are providing them the next piece of
information on their journey.

If you ask me, the sad fact of the matter is that most folks have no idea what
they are searching for. Let me give you an example. I created a site that is
related to financial matters. People search for this stuff and come by to
visit. I try to provide them useful information.

Now I know from looking at my logs that most people who click through the
ads are going on to Quickbooks, which is tangentially related to the search
term but not really. It's as if people were searching for "muffler
installation tips" and then went on to click on an ad for car insurance. It's
kinda-sorta related. I guess? But at least I can measure it. But I'm really
confounded as to the relationship that's happening between the site and the
ad. But at least it's something.

Metrics are very important. Without an ad click or some other event, I'm left
to assume that you have changed your mind or just didn't know what the heck
you were doing and were just thrashing around.

Contrast this targeted approach to my blog, which is just a bunch of stuff
thrown together. I get folks all the time coming in on random searches --
bikinis, monkeys, ocelots -- things I've mentioned in a blog entry about
something else. But they visit, stay a short time, then leave. If you ask me,
these are the people who are truly not getting anything useful at all. At
least with the targeted content the publisher (presumably) is trying to help
the searcher. These are just random combinations of search terms and content.

Couple other things: first, Google limits the numbers of ads you can put on
the page to 3. I personally think 3 is pushing it -- my first sites only had
one ad per page. Google was kind enough to email me and tell me that I could
put up to three. So lately I've gone ahead and put three basically because I'm
lazy: I don't want to have to go back and update the site later. Plus it's
easier to design for three right off the bat than try to put them in later.

But 1 ad per page and 3 ads per page have completely different CTRs, even
though it's the same user, same search, and same types of ads. So unless the
publisher is breaking the 3-limit, I think there's a lot of room for
misdiagnosis here.

Second, the content provider has no control over ad availability or scheduling
from the ad network. So if you write a wonderful series of articles on polar
bears, and there are no polar bear advertisers, you're going to get a bunch of
ads that are Public Service Announcements, or are for polar bear ice cream
bars, or something else that's a poor fit -- which also decreases your CTR.

I'm not sure any of that helps, but at least it's a bit more information. I
know the intent was to look at this from the searcher's perspective, but I
think the only way to work towards a solution is to look at problems from
multiple angles, and content providers (at least good ones) should have the
same goals as searchers.

~~~
chopsueyar
Why can't you target your content to specific affiliate marketing promotions?

I've found the adsense ads on my sites are not contextual, but based on the
search history of the user (according to Google).

Quick question: You track how many people actually want the facebook login
page, but what do you do with that info? Specifically, how do you use that
knowledge to improve the user's search experience and expectation?

~~~
DanielBMarkham
Interesting. Depends on search history? That might explain why it's so hard to
fathom from this end.

As far as the login, I'm in a bit of a quandary. For the mis-typings, at some
point on the page I tell them they have mistyped their search, the correct way
to type it, and provide a link to where they really wanted to go. I feel like
this is how I would want to be treated if I made the same mistake.

As I see those links being clicked, the idea is to move them higher or lower
on the page depending on their use. This should get the user to where they
want to go faster. But -- I also think it's important to correct the user's
spelling. I don't want them coming back because they keep making simple
mistakes. And I'm not happy with a page beginning with "Hey bub! You can't
spell!" so there's probably an upper limit to how high I would put the "I
really wanted Facebook you moron!" links. Maybe about halfway from the top?
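
A sketch of that reordering in Python: links are sorted by click count, but the "you wanted the other site" links are held out of the top half of the page. The function name `order_links`, the `capped` set, and the 0.5 fraction are hypothetical, and link names are assumed to be unique.

```python
def order_links(links, clicks, capped=frozenset(), top_fraction=0.5):
    """Order links by click count, most clicked first, while keeping any
    link in `capped` out of the top `top_fraction` of the page.

    If there are not enough uncapped links to fill the top part, capped
    links move up to fill the gap. All names here are illustrative.
    """
    ranked = sorted(links, key=lambda link: clicks.get(link, 0), reverse=True)
    first_allowed = int(len(ranked) * top_fraction)

    # Fill the protected top slots with the most-clicked uncapped links.
    uncapped = [link for link in ranked if link not in capped]
    head = uncapped[:first_allowed]

    # Everything else follows in click order, capped links included.
    placed = set(head)
    tail = [link for link in ranked if link not in placed]
    return head + tail


if __name__ == "__main__":
    clicks = {"facebook": 90, "tips": 40, "faq": 10, "about": 5}
    print(order_links(["about", "tips", "facebook", "faq"], clicks,
                      capped={"facebook"}))
```

With the sample data the heavily clicked "facebook" link lands at the halfway point instead of first, which matches the "about halfway from the top" limit.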

------
chopsueyar
What percentage of AdWords revenue is AdSense responsible for?

