
How My Popular Site was Banned by Google - kbrower
http://kbrower.posterous.com/banned-from-adwords-and-google-search-in-less
======
Matt_Cutts
You have an autogenerated web site that consists of practically nothing other
than affiliate links to Amazon. You can make an infinite number of
autogenerated pages on your site, e.g.

<http://www.filleritem.com/index.html?q=hacker+news>

<http://www.filleritem.com/index.html?q=31.69>

<http://www.filleritem.com/index.html?q=teen+sex>

and each autogenerated page consists of literally hundreds of affiliate links
stuffed with keywords for unrelated products.

When Google's webspam team takes action on websites in our websearch index, we
can pass that information on to the ads group so they can check for
violations. But it's a one-way street: we can send the ads team signals or
information about spammers or other violations of our quality guidelines, but
the ads team doesn't send information over to the quality/webspam team.

~~~
kbrower
Hi Matt,

Thanks for responding.

The point of the site is to find items of a particular price that qualify for
free shipping on Amazon.com. If you want I will give you access to my google
analytics to show you that this is a site people want and use.

I guess I should not let people link to the search results page?

<http://www.filleritem.com/index.html?q=hacker+news> and
<http://www.filleritem.com/index.html?q=anything+that+is+not+a+number>
return the same thing as if you did a search for $0. I will fix the bug so it
returns an error. These are pages that no one has ever linked to as far as I
know.

~~~
Matt_Cutts
My advice would be to use robots.txt to block out the autogenerated pages;
when users search for e.g. a long-tail phrase and then land on a page that's
nothing but affiliate links with a lot of keywords, they tend to complain to
us.
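In concrete terms, that advice usually means a robots.txt at the site root.
This is only a sketch, assuming the autogenerated pages all share the `?q=`
query pattern shown above (Googlebot supports the `*` wildcard):

```
# Hypothetical robots.txt for a site like filleritem.com:
# block crawling of the autogenerated search-result URLs,
# leave the root page and other static pages crawlable.
User-agent: *
Disallow: /*?q=
```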

Users would be happier if they landed on the root page of your site or the
root page of <http://www.superfillers.com/> or
<http://www.filleritemfinder.com/> than if they landed on a deep page full of
links and unrelated products.

~~~
nirvana
<http://www.filleritemfinder.com/> is the first hit for "filler item" on
google right now.

A search for hacker news results in a page full of affiliate links, just as
the example you gave above. Only difference is that they didn't re-write the
URL.

filleritemfinder.com has no robots.txt that I was able to pull up.

So, filleritem.com, a google customer, was blocked, but filleritemfinder.com,
doing the same thing, is the number one result.

Further, shouldn't this kind of advice be given to people who appeal being
excluded from the index? Or should we all post to Hacker News when it happens
to us so that you can come explain directly?

I think 50% of the problem is the arbitrary picking of sites to block (and
it's not working, btw[1]) and 50% of it is that Google seems uninterested in
explaining or advising people when it happens to them.

[1] Been buying gear for a project lately, and so doing a lot of google
searches in the form of product-model-number review or product-name review.
Overwhelmed with spam sites, and mindless human generated spam sites like
dpreview.com, etc.

~~~
Matt_Cutts
When I do the search [filler item], one of the top results I see is
<http://www.makeuseof.com/tag/top-5-amazon-filler-item-finders-qualify-free-shipping/>
which shows five different services to fill this
information need, and that also has pictures. I do think that's a pretty
helpful result.

I mentioned filleritemfinder.com as a random example (there are many of these
services), but filleritemfinder.com appears to use AJAX to keep results on the
same page rather than making a new url.

"filleritem.com, a google customer, was blocked, but filleritemfinder.com,
doing the same thing, is the number one result."

The filleritemfinder.com site is not doing the same thing, because it's not
generating fresh urls for every possible search. But you're not really
suggesting that we should treat advertising customers somehow differently in
our search results, are you? The webspam team takes action regardless of
whether someone is an advertising customer or not.

"shouldn't this kind of advice be given to people who appeal being excluded
from the index?"

This advice is available to everyone in our quality guidelines. It sounds like
the site owner reached out to the AdWords team, which gave him clear guidance
that the site violated the ads policy on bridge pages. It sounds like the site
owner also filed a reconsideration request, and we replied to let the site
owner know that the reconsideration request was rejected because it was still
in violation of our policies. It doesn't look like the site owner stopped by
our webmaster support forum, at least that I could see in a quick look. At
that point, the site owner did a blog post and submitted the post to Hacker
News, where I was happy to reply.

~~~
OpenAlgorithm
If the site were to use another monetization model, e.g. advertising instead
of affiliate links, would it be less likely to be penalized?

~~~
Matt_Cutts
I recently talked for about a minute about the topic of "too much advertising"
that sometimes drowns out the content of a page. It was in this question and
answer session that we streamed live on YouTube:
<http://www.youtube.com/watch?v=R7Yv6DzHBvE#t=19m25s> . The short answer is
that we have been looking more at this. We care when a page creates a low-
quality or frustrating user experience, regardless of whether it's because of
affiliate links, advertising, or something else.

~~~
OpenAlgorithm
Thanks Matt, I took a look at the video and that answers my original question.

Also is there a preferred monetization model, e.g. do Google think
advertisements are more or less harmful to the user experience than affiliate
links, sponsored posts, etc?

Obviously across different models you can't just track the space taken up, so
is there some kind of metric that tracks the rate of content dilution via
monetization?

~~~
option1138
I know you were looking for an answer from Matt but I thought I would offer up
my opinion here (as you might have noticed, I love talking about this stuff).

Our models currently suggest that the presence of contextual advertising is a
significant predictive factor of webspam.

We use 10-fold bagging and classification trees, so it's not all that easy to
generalize. But I pulled one model out at random for fun.

The top predictive factor in this particular model is the probability outcome
of the bigrams (word pairs) extracted from the visible text on the page. Here
are a few significant bigrams:

relocation _companies products_ providing products _pure quality_ book
recruitment _website tickets_ our these _traffic representing_ clients today
_play tours_ high registry _repair rent_ properties wedding _portal printing_
canvas pr _human privacy_ protection providing _efficient way_ trade printing
_stationery prices_ everything website _daily_
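A toy version of that first step (not option1138's actual pipeline; just a
stdlib sketch of pulling bigram counts out of visible text) might look like:

```python
import re
from collections import Counter

def extract_bigrams(visible_text):
    """Lowercase, tokenize on word characters, and count adjacent word pairs."""
    words = re.findall(r"[a-z0-9']+", visible_text.lower())
    return Counter(zip(words, words[1:]))

# e.g. on a spammy-sounding snippet:
counts = extract_bigrams("Quality book printing at stationery prices")
# counts[("stationery", "prices")] == 1
```

A real classifier would feed these counts, weighted, into the bagged trees
described above; as noted, no single bigram carries any signal on its own.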

Next, this model looks for tokens extracted from the URL and particular meta
tags from the page. Similar to above, but I believe unigrams only. A few
examples follow. Please keep in mind that none of these phrases are used
individually... they are each weighted and combined with all other known
factors on the page:

offer review book more Management into Web Library blog Joomla forums

The model then looks at the outdegree of the page (number of unique domains
pointed to).

From there, it breaks down into TLD (.biz, .ru, .gov, etc.).
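These two features are easy to sketch (again a toy version, assuming the
page's outbound links have already been extracted):

```python
from urllib.parse import urlparse

def outdegree_and_tlds(links):
    """Outdegree = number of unique domains pointed to; also bucket them by TLD."""
    domains = {urlparse(link).hostname for link in links if urlparse(link).hostname}
    tlds = {d.rsplit(".", 1)[-1] for d in domains}
    return len(domains), tlds

links = ["http://example.com/a", "http://example.com/b", "http://spam.biz/offer"]
degree, tlds = outdegree_and_tlds(links)
# degree == 2, tlds == {"com", "biz"}
```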

The file gets pretty hard to decipher at this point (it's a huge XML file) but
contextual advertising is used as a predictive variable throughout.

Just from eyeballing it, it appears to be more or less as significant as the
precision and recall rate of high value commercial terms, average word length
(western languages only), and visible text length.

Based on what I'm looking at right now, my answer would be that sponsored
posts are going to be far more harmful to the user experience than
advertising.

Can't answer the rest of your question which I assume relates to the number of
ad blocks or amount of space taken up by ads... we don't measure it.

Edit: Just realized that Google will probably delist this page within 24
hours. Should've used a gif for those bigrams. Oh well ;-)

~~~
OpenAlgorithm
Thanks for your response, is the data you are viewing publicly available?

~~~
option1138
No...

------
mrbgty
From google's bridge page definition:

Not acceptable: "Websites that feature links to other websites while providing
minimal or no added functionality or unique content for the user. Added
functionality includes, but isn't limited to, searching, sorting, comparing,
ranking, and filtering"

The page in question provides the added functionality listed
(searching/filtering at least).

~~~
GeoFan49
The filleritem.com website is a very simple tool to search for items of a
certain price on Amazon. The purpose is to help select an item to fill the gap
in the shopping cart to qualify for free shipping. There is no reason for
Google to ban this site. What we MAY have here is an abuse of power by Google.

------
arkitaip
I can't judge if you broke the rules or not, but not showing up on Google's
SERPs is terrifying and a stark reminder of how dependent we've become on
Google traffic.

~~~
kbrower
I included an image of my AdWords account. The ad had been running for years
(albeit infrequently due to very low budget), and I had not touched it since
creating it.

I checked after this all happened, and there were several warnings from
AdWords, but nothing from Webmaster Tools. I simply ignored the AdWords
warnings as I was not actively using the account and figured they were just
disabling ads. If I had known these were the consequences, I would have just
deleted my AdWords account.

~~~
kposehn
Unfortunately, deleting an AdWords account won't remove the violation; it may
actually make it worse. Google will look back several years and disable ads
that haven't run for a very long time. Sometimes they even ban accounts that
are inactive or deleted!

It is very important to remember: when dealing with Google AdWords, the
definition of a violation changes with time. If you violated the current
policies in the past, you may still be banned. You take a great risk running
ads there, and any and all warnings you get should be dealt with immediately.

That said, seeing an exclusion from organic based on an AdWords suspension is
extremely alarming. That may be exactly what the DoJ is looking for in terms
of violations by Google.

~~~
thematt
_That said, seeing an exclusion from organic based on an AdWords suspension is
extremely alarming. That may be exactly what the DoJ is looking for in terms
of violations by Google._

According to Matt Cutts, it was the exact opposite: he got the organic
suspension _then_ the AdWords suspension.

~~~
kposehn
I saw, much less alarming now.

------
eulienohso
Google should really get some healthy competition.

filleritemfinder.com is reachable from Google, and as far as I can tell, has
the same functionality, with better design. Maybe some Google guru could
comment on whether a redesign would save you.

------
darksaga
Another great example of Google randomly removing sites based on flimsy or no
evidence. The fact they don't give you a chance to plead your case is pretty
scary.

My hunch is one of your competitors got the site taken down. I've had a few
clients who had their competitors file complaints with Google, or inform them
they were using black hat SEO to get on the first page of the SERPs. In one
case they were successful, but I had the site back up in less than 24 hours,
so it wasn't a huge deal.

It sounds like it might be an ongoing issue and they are not going to put your
site back in the SERPs for a while. I feel for you, brother.

~~~
infinity
But often there is a reason why a site vanishes from the Google index. We
would have to exclude all possible violations of the Google guidelines before
we could safely claim that this is a great example of random removal by
Google.

Edit: In a comment above Matt Cutts from Google gives an explanation.

------
trevin
Google's definition of a bridge page:
<http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=190435>

------
SODaniel
It's a website filled with autogenerated content and links, how are you
surprised that you got sandboxed?

------
mutt_cutts
To Matt Cutts and the search quality team: Why do sites like this rank for
every possible term with autogenerated results (such as filleritem.com)?

Nettkatalogen.no in Google.no ranks for about every possible term.

<http://www.google.no/search?pws=0&q=h%C3%A5ndverker+oslo>
<http://www.google.no/search?pws=0&q=snekker+trondheim>
<http://www.google.no/search?pws=0&q=r%C3%B8rlegger+troms%C3%B8>

(They use the "powered by Google logo". They use very aggressive phone sales
tactics and scam people. They buy links from newspapers.)

Is www.nettkatalogen.no violating Google Quality Guidelines? Or is this okay?

------
spxdcz
Dropping out of the organic results sounds like a result of the Panda update
(<http://searchengineland.com/official-google-panda-2-3-update-is-live-87230>)

My best guess is that you'll need to include some additional 'unique content'
on the site to get it back in the listings. At the moment the Google crawler
will see not a whole lot of text, quite a few JavaScript buttons, and some
links, and will probably flag it as low quality.

It may be a bit of a hassle, but may be worth adding a kind of
'interesting/unique filler items' blog to the site to increase the content to
links/js ratio.

------
xpose2000
<http://www.filleritem.com/index.html?q=teen+sex> This should have no results.
Other queries should have relevant results, but they don't. How can anyone
complain about his site being axed?

The guy needs to code a better site and I sure as hell would never use that so
he can get affiliate money.

Sorry, but you're fortunate this site has survived for so long.

~~~
codemonkey3k
If you look around, almost everything you click on the web is set up to make
someone money. It's none of Google's business how a website is making their
money as long as it's legal and follows FTC policies. Browsers make money
every time you use that little search box in the top right corner of Firefox.
People have a right to be compensated for their work. If the site provides a
good service, I'll use it. If you don't want people to make money off your
clicks, it would be best to close your browser and never open it again.

My site got hit by Panda (no manual penalty). After relaunching my site (which
I had luckily been working on for over a year), my bounce rate is 30% and my
direct link rate is 50%. Still, Google has me by the throat. My traffic is
almost dead. Panda doesn't care if you're solving a problem for users and
giving them the eye candy they want. It apparently just wants your site to be
a "unique, well-written" news site. People come to my site to browse vintage
items and find stuff that they didn't know existed in the niche. They usually
visit during work hours and late in the evening. Reading ease for these items
is on a college level (which Panda doesn't like). They don't want to read a
1000-word article; they usually just want to hear it, see it, maybe ask a
question or two about the item, and see who may be selling one. I offer that.

Panda's obvious Bayesian nature doesn't understand nuance. It only knows
napalm.

Niche markets are the ones getting the crap end of the stick. And it's funny,
niche markets used to be what made the web so great! Not anymore. Now it's
100% Googlized Walmart-ization.

And yeah, I'll be using www.filleritem.com when shopping Amazon. I didn't know
it existed until I read this post. I don't mind, transparently, giving someone
a few pennies for a good service. You do it all the time.

------
coffee
Yea that sucks, it's happened to me MANY times. The fact of the matter is,
it's Google's world and we're just living in it...

~~~
infinity
I don't think that it is as simple as that. There is some form of ecosystem on
the web: there are the content creators and website providers, there are the
searchers, and there are the search engine providers. Each of these has its
own interests and guidelines.

Searchers have quality guidelines too, though they are not explicitly
published by them on the web. If a search engine delivers low-quality results,
the searchers move on to the next search engine. A search engine without
searchers is hardly profitable.

Most of the quality guidelines of Google are just common sense; following them
will in many cases increase the usability and findability of a site.

If this has happened to you many times, you are likely doing something wrong.
I have never had one of my sites banned from Google.

~~~
coffee
> If this has happened to you many times, you are likely doing something
> wrong. I have never had one of my sites banned from Google.

Right. So. YOU'VE never personally had a site banned, therefore others who
have must be doing it wrong.

You know what, I've never been hit by a car while crossing the street,
therefore others who have must be crossing the street wrong.

I've never been in a plane crash, therefore others who have must be doing it
wrong.

------
SteveOllington
lol, poor Matt, always under fire from everybody who hasn't got the rankings
they want. It's simple really: Google tries to make it fair for everybody
(and granted it doesn't always work perfectly, which is why they continue to
make changes, it's evolution), and if some sites give themselves an unfair
advantage and are caught, then they're punished. How else could a search
engine work? If it were any different then all of our results would be greed
pages with no useful info and everybody would stop Googling stuff.

As somebody once quoted, "You can please some of the people, some of the time,
but you can't please all of the people, all of the time!"

~~~
option1138
This is a fair comment but I believe the issues stem more from the process
Google follows.

It is obvious that Google cannot communicate exact reasons why a site was
penalized as that would help spammers. However, there is nothing that prevents
them from adding a step to warn the offending website and give them a heads up
before the ban/penalty takes place, along with an explanation of the policy
that is/was being violated.

Most of these heads-ups would go ignored, some would not, and yes, it would
incur a support cost. However, the number of websites which are significantly
penalized isn't onerous... I believe fewer than 1,000 each year?

When a company has become the de facto gateway to the internet, I believe they
have a responsibility to webmasters. Google has lost a lot of goodwill over
the years because of these seemingly arbitrary penalties... Instituting such a
practice would be a worthwhile investment.

~~~
Matt_Cutts
option1138, I'm afraid you need to recalibrate your expectations of spam on
the web. Blekko made a site called Spam Clock that estimates 1 million spam
pages are created _every hour_ : <http://www.spamclock.com/> .

There are 200+ million websites out there. 1,000 spam sites would be a spam
rate of 5.0 × 10^-6. If you remember the days of Altavista before Google, the
actual rate of spam on the web is much higher. Here's one stat: I once heard a
search engine rep (not from Google) say that they had to crawl 20 billion
pages to find 1 billion non-spam pages.
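The arithmetic checks out:

```python
# 1,000 penalized sites against 200+ million websites
rate = 1_000 / 200_000_000
# rate == 5.0e-06, the figure above

# 20 billion pages crawled to find 1 billion non-spam pages
spam_fraction = (20e9 - 1e9) / 20e9
# i.e. 95% of crawled pages were spam
```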

So yes, we do tackle more than 1,000 websites a year. There's a ton of spam on
the web, and Google has to operate on the scale of the web (e.g. in 40
different languages) to tackle all that spam.

~~~
option1138
Matt,

You are of course correct. The fault is mine for miscommunicating... I find
myself becoming less self-editorial these days when I write on the web and
tend to think everyone is on the same page as I am.

I was actually referring to an informal study I did earlier this year. I
measured sites which were receiving an average of 50,000 or more visitors from
Google US search (organic) per month over a six month period. Then I compared
those with a similar set from a subsequent six month period to see which had
significantly dropped off in traffic and rankings. The purpose of this was to
estimate the number of _significant_ sites which were penalized over that
period of time. The final estimate came to about 700 sites/year which were
penalized. There are lots of uncontrolled variables here of course... but I
was looking for an "order of magnitude" answer simply for curiosity's sake.
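As a sketch of that comparison (the traffic data is invented, and the
drop-off cutoff is my assumption; only the 50,000-visitor threshold comes
from the description above):

```python
# Hypothetical average monthly organic visitors over two six-month windows.
period_1 = {"site-a.com": 80_000, "site-b.com": 55_000, "site-c.com": 120_000}
period_2 = {"site-a.com": 78_000, "site-b.com": 4_000,  "site-c.com": 115_000}

THRESHOLD = 50_000   # minimum avg monthly Google organic visitors in period 1
DROP_CUTOFF = 0.90   # a >=90% traffic drop counts as "penalized" (assumed)

penalized = [
    site for site, visits in period_1.items()
    if visits >= THRESHOLD
    and period_2.get(site, 0) <= visits * (1 - DROP_CUTOFF)
]
# penalized == ["site-b.com"]
```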

The 1 million spam pages created per hour were of course excluded from
consideration as they never received much traffic from Google in the first
place.

So just to clarify my earlier response, I am advocating for a policy that
would apply to websites exceeding a certain threshold of organic traffic for a
significant period of time.

------
learc83
For better or worse Google is the gateway to the internet. If your site isn't
on Google it is virtually invisible to most of the world.

Since Google has such an enormous position of power, they are really going to
have to do better with customer service (and by customer service, I mean
providing a way to talk to a human on their end).

I understand it's expensive, but if it doesn't get any better, eventually some
government somewhere (that has jurisdiction) is going to regulate them.

------
esrauch
Just out of curiosity, if only 30% of your results are from organic search,
why did your hits drop by more than 30% after you stopped getting organic
search hits?

~~~
kbrower
That is the Google Analytics for just organic traffic from Google. The site
gets ~1000 visitors a day.

------
general-marty
Universities should offer courses on Google policy. A thorough understanding
of such policies will be the primary determining factor for your success on
the web.

~~~
davidwhitehouse
I disagree with you there general-marty, I think the primary factor
determining your success on the web is a decent business plan and marketing
strategy. Without that you rely on free traffic from Google. Any company that
puts all its eggs in one basket (e.g. Google Organic) is skating on thin ice.

------
ajones05
wow, good karma +50 to Matt on this one

------
Bcvvg
Thanks Matt

------
yanw
They usually change policies, or enforce existing ones, before an algo update.
That doesn't prove any connection in your case, and it's not like they are
dependent on your ad expenditure.

------
rorrr
Your site looks like a totally spammy doorway page.

------
josefresco
Amazon probably complained to Google, and Google, despite already having the
rules on the books, decided to enforce them on this one site while leaving
many other low-traffic clone sites offering the same service alone. Call me
paranoid, or cynical but I've seen much worse from Google on the AdWords side
of their offerings.

~~~
Matt_Cutts
Nope, that didn't happen.

~~~
webcurious
This has been an amazing thread but there really is a pattern that is hard to
follow for webmasters. In the case of the original site, couldn't Google have
just de-indexed the "extra" pages but left the index page alone, since it is
indeed useful?

Also, how did his site get canned but others like sportslinkup.com, which is
an ebay affiliate spam site cloaked as a link directory, have over 7 million
indexed pages and 8 million indexed images (all hosted by ebay, not the sports
site) and it gets over half a million Google visitors per month (according to
compete.com)?!? I think the sports site is even scraping Google for keywords;
it's full of examples of what not to do, but it's been sailing along for years
with Google approval, or at least no automated detection.

The fine line between sites getting canned and sites getting MASSIVE traffic
for essentially the same thing is very confusing.

