

How We Have Attempted to Recover from Google Panda - ackkchoo
http://ericbjorndahl.tumblr.com/post/6603914556/how-we-have-attempted-to-recover-from-google-panda

======
WalterGR
I've run The Online Slang Dictionary (<http://onlineslangdictionary.com/>)
since 1996. On April 11, the date of one of the Panda updates, traffic from
Google dropped 20%. Between April 11 and a month ago, I made the following
changes:

* Correct spelling, grammar, and capitalization errors

* Where spelling "errors" are legitimate "slang terms", link to the definition pages for those slang terms (gonna, wanna, ain't, dat's, etc.)

* Remove unnecessary extra punctuation, e.g. example sentences ending in "???" or "!!!"

* Checked keywords on top landing pages for searches like "slang", "slang dictionary", "slang thesaurus"

* Remove unnecessary extra spaces in the middle of sentences

* Use complete sentences even where completely unnecessary

* Restructure entire /definition+of/word+goes+here directory structure to be /meaning-of/word-goes-here, since people search for "meaning of" more frequently than "definition of", and since Google seems to have stopped treating + as a word separator in some cases.

* Reworked meta descriptions and page titles

* Removed meta keywords. I know Google doesn't use them, but maybe they treat them as a negative indicator of site quality? Who knows?

* Completely re-designed site's front page

* Switched from Google Custom Search Engine to a custom search implementation, because I think I can provide better results, improve the user experience, and reduce search exit rate

* Delay-load Twitter widget to reduce page load time

* Use rel="canonical" on every page on the site, since Google is indexing the IP address of the site and showing it separately in SERPs

* Removed <priority> nodes from the sitemap, just in case Google knows better

* Fixed the "HTML suggestions" on GWT, such as short meta descriptions and duplicate meta descriptions

* Excluded directories via robots.txt like /word-of-the-day/, since the words of the day are already spidered elsewhere on the site, in case Google is giving me a duplicate content penalty of some kind
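
The directory restructuring in the list above (from /definition+of/word+goes+here to /meaning-of/word-goes-here) can be sketched as a small path-mapping function. This is only an illustration of the mapping, not the site's actual code: the function name `new_path` is hypothetical, and whether the old URLs were redirected to the new ones is not stated in the list.

```python
OLD_PREFIX = "/definition+of/"
NEW_PREFIX = "/meaning-of/"


def new_path(old_path):
    """Map an old '+'-separated path to the hyphenated scheme.

    Returns None for paths that are not under the old prefix.
    """
    if not old_path.startswith(OLD_PREFIX):
        return None
    # Split the term on '+' and drop empty pieces left by doubled separators.
    words = [w for w in old_path[len(OLD_PREFIX):].split("+") if w]
    return NEW_PREFIX + "-".join(words)


print(new_path("/definition+of/word+goes+here"))
```

A mapping like this is lossy for terms that already contain a hyphen, so a real migration would need a rule for those.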

I've also made plenty more changes in the past month, but I haven't written up
a list yet.

Since the Panda update, I've seen no improvement in traffic from Google to my
site.

~~~
ackkchoo
Hi Walter - Thanks for posting your site and describing the changes you've
made as well. It's interesting that you've focused on spelling/grammar errors
(despite it being a slang site), and still haven't noticed any improvements.

We have a lot of members from around the world, who may not have English as
their first language. We've thought about correcting some spelling/grammar
programmatically, but since we have a UGC site, and each review is tied to a
real person, it seems disingenuous to rewrite someone else's words.

I recently read an article about how Zappos uses Mechanical Turk to correct
grammar and spelling in user reviews
([http://www.techdirt.com/articles/20110428/02453114067/zappos...](http://www.techdirt.com/articles/20110428/02453114067/zappos-
uses-mechanical-turk-to-correct-spellinggrammar-errors-reviews-it-increases-
sales.shtml)), and one of the comments that stood out to me was:

"Grammar, punctuation and style are part of how we decide if a review is
credible. For Zappos to artificially inflate the credibility of reviews by
changing those qualities is a form of fraud IMO."

I don't know if I'd call it fraud, but it certainly artificially inflates the
credibility of reviews, and that's something (in addition to the rewriting
someone else's thoughts part) we don't feel comfortable doing.

(For your site, since it's a dictionary, correcting spelling and grammar seems
like a better fit, and I'm surprised that it hasn't yielded any improvements.)

------
spolsky
Because of the lucrative nature of affiliate payments and commissions in the
travel industry, one of the biggest sources of what Google considered bad
content was sites linking to hotels.

Look at this page: [<http://www.travbuddy.com/hotels>]. All those links to
"Hawaii Hotels" and "New York Hotels" look exactly like an old-school
linkfarm.

Ask yourself this: if you're Google, and somebody is typing in "Hawaii Hotels"
to do a search, is your Hawaii Hotels page really something they want as a top
result? I don't think it is. Your content is not really "canonical" for Hawaii
Hotels, it's just a page that lists a few reviews you've managed to collect.

It's all speculation on my part, since I have no inside information, but
essentially, I think you're sort of poisoning your own reputation at Google
with all those links.

~~~
ackkchoo
Hi Joel - Thanks for the feedback, that's really helpful and you do make a lot
of valid points, especially about travel affiliates. We definitely aren't
trying to rank for "Hawaii Hotels". Those links are intended as a sitemap to
help people find our review pages, many of which I believe do have
valuable, original content.

As a concrete example following the Hawaii theme, we used to rank pretty well
when someone searched for "Hale Koa Hotel" (a hotel in Hawaii).

Here is our page on it:

<http://www.travbuddy.com/Hale-Koa-Hotel-v189806>

It's got a few detailed reviews and candid photos from active members of the
site. Certainly not perfect, but I think it's helpful and that it provides a
different perspective. Right now it's ranked #55 on my browser, and here is a
sampling of the sites that rank above it:

#9
[http://old.armymwr.com/portal/travel/recreationcenters/hale_...](http://old.armymwr.com/portal/travel/recreationcenters/hale_koa_hotel.html)
(a site that just links to info from the official page)

#11 <http://www.frommers.com/destinations/oahu/H42763.html> (A trusted brand
but limited original information about the hotel and no photos)

#14 <http://www.dollar.com/Locations/gen.aspx?locationId=HNLC02> (no content
at all)

#15
[http://www.gohawaii.com/listing/Dining/90173401_HaleKoaHotel...](http://www.gohawaii.com/listing/Dining/90173401_HaleKoaHotelDinnerShow?mIslandId=oahu)
(no original information)

#17 [http://www.wedhawaii.com/hawaii-wedding-
locations/oahu/halek...](http://www.wedhawaii.com/hawaii-wedding-
locations/oahu/halekoahotel.html) (a wedding photo album site, but no specific
information about the hotel)

#20
[http://hotelandtravelindex.travelweekly.com/Destinations/Hon...](http://hotelandtravelindex.travelweekly.com/Destinations/Honolulu/Hotels/Hale-
Koa-Hotel-Military-Only-p1437225) (no original information)

#21
[http://gohawaii.about.com/library/gallery/blwaikiki_hotels_4...](http://gohawaii.about.com/library/gallery/blwaikiki_hotels_4.htm)
(1 photo, nothing else)

#25 [http://travel.aol.com/travel-guide/united-
states/hawaii/hono...](http://travel.aol.com/travel-guide/united-
states/hawaii/honolulu/hale-koa-hotel-military-only-hotel-detail-075094/)
(hardly any information)

#28 [http://www.maplandia.com/united-states/car-
rental/honolulu-h...](http://www.maplandia.com/united-states/car-
rental/honolulu-hale-koa-hotel-car-rental/)

...

I'm not arguing we should be #1, or even #10, but we're certainly better than
many of the sites between #1 and #55, and we've seen this pattern repeat
itself for most of the hotels and pages we have original and unique content
for. While we certainly don't have as many hotel reviews as TripAdvisor, we do
have more quality reviews than most other sites, and we have to start
somewhere :)

I guess a better question would be, if we have 75,000 reviews, what is the
best way of internally linking to them within our site WITHOUT looking like a
link farm? We're just trying to put our best content forward, so any
suggestions would be appreciated!

~~~
patio11
Outside of Panda, in terms of your goals for your business: do you really want
to be in the business of arbitraging existing travel brands like Hale Koa
Hotel and charging them (Hale Koa Hotel) money for customers who already knew
of them? That strikes me as not a great place to be in life, for a lot of
reasons. For one: what does Google need you for in that scenario? (Answer:
nothing after they debut Google Travel and give it 80% of the non-ad real
estate on the search page.) For another, you're in perpetual competition to be
in the top ~10 of the 1,000 sites who are adding equally little value, and
your best answer on why you should be there is guaranteed to be "OK so we're
meh-tastic but come on we're 2% better than our meh-tastic competition doing
the exact same thing we do."

~~~
ackkchoo
Hi Patrick - Our goal isn't to charge money to customers who already know
about existing brands, but to provide first-hand information ABOUT that brand
that customers wouldn't otherwise learn from other sites (including the brand
itself).

We've been focused on hotels in this specific example, but the same argument
could be applied to any travel destination. If you're planning a trip
somewhere, would you only want to get information from the official tourist
bureau and the hotels themselves, or would it also be beneficial to see candid
reviews, blogs, and photos, and interact directly with other travelers who
have been to the same place? I would argue that the value provided by that
information is worth a lot more than 2%.

The average traveler researches 2-5 sites before booking, and a large % of
them are motivated by a desire to read reviews
(<http://www.phocuswright.com/library/fyi/427>). Clearly there is some value
provided there, and if we can provide relevant and unique information that
leads to a sale, or profit off the sharing of such information through
unobtrusive advertisements, then I don't see any problem with that.

And if we can provide more useful information than our competition (many of
those sites listed above have almost no original information at all), then I
also see no problem with ranking higher in the search results.

That's not to say we are 100% there yet for every destination, but I believe
we do have a ton of valuable information, and our goal is to get to 100%.

------
franze
hi, the funny thing is: all big travel sites lost
[http://trends.google.com/websites?q=tripadvisor.com%2C+exped...](http://trends.google.com/websites?q=tripadvisor.com%2C+expedia.com%2C+travbuddy.com%2C+lonelyplanet.com&geo=all&date=all&sort=0)

i did not hear from a single big travel site that won in the so called google
panda update.

and to be honest, i understand it

the spammiest verticals are not only PPP (pills, porn, poker), it's PPPIT
(pills, porn, poker, insurances, travel). i worked in poker, i worked in
travel, i did insurances and yeah from an SEO perspective they are all very
disgusting verticals.

and from what it looks like: the whole travel segment got a hit. so yeah, you
can try to get out of "panda" but then you should realize that "panda" is
nothing you can get out of. it's not a "penalty", it's a new set of rules. try
what works, try what doesn't and then iterate. so simple.

and from what i see:
[http://tools.pingdom.com/?url=http://www.travbuddy.com/Hotel...](http://tools.pingdom.com/?url=http://www.travbuddy.com/Hotels-
Am-Walde-
Hahnenklee-v390237&treeview=0&column=objectID&order=1&type=0&save=true)

your root HTML already takes 900ms to get delivered, then you load about 162
additional dependencies. ... i would work on speed.
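
A rough way to count such dependencies yourself is to parse the root HTML and tally the tags that normally trigger another request. This is a minimal sketch using only the standard library; the class and function names are hypothetical, the sample markup is made up, and the count will not match a tool like Pingdom exactly because it ignores resources loaded from CSS or JavaScript.

```python
from html.parser import HTMLParser


class DependencyCounter(HTMLParser):
    """Collect URLs of tags that usually cause an extra HTTP request."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img", "iframe") and attrs.get("src"):
            self.urls.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            # Only stylesheets are fetched; rel="canonical" and similar are not.
            if "stylesheet" in (attrs.get("rel") or "").lower():
                self.urls.append(attrs["href"])


def count_dependencies(html):
    parser = DependencyCounter()
    parser.feed(html)
    return len(parser.urls)


# Made-up sample page: one stylesheet, one external script, one image.
sample = """
<html><head>
<link rel="stylesheet" href="/site.css">
<link rel="canonical" href="/page">
<script src="/app.js"></script>
<script>var inline = 1;</script>
</head><body><img src="/photo.jpg"></body></html>
"""
print(count_dependencies(sample))
```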

~~~
klbarry
Hope you don't mind if I ask you here, I can't find your email. I saw on your
twitter feed you recommended two great non-obvious resources for SEO - In the
Plex and schema.org. Any other resources (sites/books/etc.) you can recommend?

------
ackkchoo
I run a large site that was unexpectedly affected by Google's "Panda" update.
There has been a lot of talk on the subject, but most of it is FUD, and I
haven't seen many large sites lay everything out for discussion. This is the
first post (of many planned ones) about our experience with Google's "Panda"
update.

Hopefully it will generate some good discussion from those facing a similar
situation, and help some other people out.

------
elsewhen
The site should 410 removed pages instead of 404ing... 404 just means "not here
anymore" and google will wait to remove it from their index until they have
revisited the page a few more times over days/weeks.

A 410 on the other hand indicates that this page has been intentionally
removed, and google tends to act quicker on those.

~~~
elsewhen
Alternatively, they could 301 the removed pages to the most relevant parent
page if that would be a better user experience for those few who did navigate
to them. (in my experience 301ed pages also get removed from the index faster
than 404s)
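
The choice between 301, 410, and 404 described in these two comments can be sketched as a small lookup. This is an illustration under assumptions, not TravBuddy's code: the function name, the `removed` set, the `redirects` mapping, and the example paths are all hypothetical.

```python
from http import HTTPStatus


def status_for_missing_page(path, removed, redirects):
    """Return (status, redirect_target) for a path that no longer has a page.

    redirects: old path -> most relevant parent page (answered with 301)
    removed:   paths that were taken down on purpose (answered with 410)
    Anything else is simply unknown and gets a 404.
    """
    if path in redirects:
        return HTTPStatus.MOVED_PERMANENTLY, redirects[path]
    if path in removed:
        return HTTPStatus.GONE, None
    return HTTPStatus.NOT_FOUND, None


# Hypothetical data for illustration only.
redirects = {"/hotels/old-listing": "/hotels"}
removed = {"/reviews/deleted-review"}

for path in ("/hotels/old-listing", "/reviews/deleted-review", "/typo"):
    status, target = status_for_missing_page(path, removed, redirects)
    print(path, int(status), target)
```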

~~~
ackkchoo
Thanks, that's helpful information as well. I didn't know about the 410. We
used to 301 some pages, but ran into duplicate content issues where Google
would continually spider old 301'd URLs (even when there were no longer links
pointing to the original bad URL).

------
spiralganglion
Google's intention with search, to return exactly the most relevant
information for any query, seems a bit like striving for a grand unified
theory of physics. It's a noble ambition, but there are going to be a lot of
mistakes and missteps along the way; it's probably going to take an awful long
time to arrive there, if it ever happens; and when we do get there a lot of
people are going to be upset by it. I really appreciate that you've stepped
forward with both information about the effects of the changes and what you've
done to remedy the situation. I'm keen to see your followup posts, and I will
be keen to see what future changes Google implements and how they improve
things for you, me, and everyone.

But as time has shown, holding out hope for Google to do anything
_specifically_ helpful has been a disappointment in so many instances I've
lost count.

------
grok2
An observation based on mentions of "low quality" in Amit Singhal's blog post
at [http://googlewebmastercentral.blogspot.com/2011/05/more-
guid...](http://googlewebmastercentral.blogspot.com/2011/05/more-guidance-on-
building-high-quality.html) -- you mention that you have no-indexed thin
content pages, but is it possible that if Google sees that the volume of
"noindex" pages is high compared to the overall volume of pages on the site,
that Google still views your site as overall of "low-quality" (relative to the
other sites it indexes for the same keywords)?

Rather than noindex pages, I am thinking using robots.txt to prevent access to
these pages might be better -- Google can't perhaps then tell what to make of
these pages you've hidden via robots.txt. Just thinking...not really an expert
on this.
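
For anyone who wants to try the robots.txt route, the rules can be checked locally before deploying them. This is a minimal sketch with Python's standard `urllib.robotparser`; the `/thin-pages/` directory and the example.com URLs are hypothetical. Keep in mind that a page blocked this way is never fetched, so a noindex tag on that page will not be seen by the crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep all crawlers out of one directory.
rules = """\
User-agent: *
Disallow: /thin-pages/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

blocked = parser.can_fetch("Googlebot", "http://www.example.com/thin-pages/123")
allowed = parser.can_fetch("Googlebot", "http://www.example.com/reviews/123")
print("thin page crawlable:", blocked)
print("review page crawlable:", allowed)
```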

~~~
chocoheadfred
I think that makes a lot of sense and is worth a shot. As a test, it might be
worth removing (many) more pages to see whether your rankings come back.

------
coliveira
I think this is all nonsense. Google's work should be to separate good content
from bad content. Now, they are inverting the relationship and saying that web
masters should be responsible for handing that information to them.

This is wrong on several levels. First, Google starts to dictate what is
acceptable or not in their index, using tools like webmaster central and all
the "semantic web" talk -- Why should I care?

Second, it creates the incentives and the opportunity for bad guys to do well.
If you need to do all this overhaul of a web site, the only people willing to
do the work will be the very same ones that created content farms in the first
place. After all, they are the ones that make big bucks from Google, not the
after hours hobbyist that maintains a single web site.

~~~
extension
Google is in a tough position. If they keep their algorithms secret, they are
accused of not being transparent. If they reveal their algorithms, they get
gamed by spammers. If they reveal only vague hints and guidelines, they are
accused of manipulating content. And if they don't use ever more elaborate
algorithms, they get overrun with spam.

By becoming de facto ruler of the internet, Google has put themselves in a
position of outrageous power and responsibility. They are fighting a vicious
war with spammers and content farms and the rest of the web is caught in the
crossfire.

Personally, I agree with you that Google should not be telling people how to
run their sites. They should never have said a word about how their algorithm
works, or even what it's named, when it's updated, etc. They should tell
people "just make great web sites, searching them is our problem".

~~~
coliveira
I agree with that. But Google is weakening its position by creating these
extra semantic levels that only benefit people that are making a lot of money
in this game.

------
Tichy
What still confuses me is the duplicate content issue. Doesn't it make sense
to make some information reachable in different ways?

For example in a typical blog, the same text can be found via direct link,
latest articles, categories... How does one make that go away - and should it
be done?

~~~
Terretta
Canonical URL. When you reach the page via tag list or category, have it
rel=canonical the article URL instead of tag or category URL.

[http://googlewebmastercentral.blogspot.com/2009/02/specify-y...](http://googlewebmastercentral.blogspot.com/2009/02/specify-
your-canonical.html)
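
As a rough sketch of that idea: whichever listing a reader came through, the page emits one and the same canonical tag pointing at the article URL. The site address, the `/articles/` URL scheme, and the function name here are assumptions for illustration, not taken from any particular blog engine.

```python
from html import escape

SITE = "http://blog.example.com"  # hypothetical site


def canonical_tag(article_slug):
    """Build the <link rel="canonical"> tag for an article's single home URL."""
    url = "%s/articles/%s" % (SITE, article_slug)
    return '<link rel="canonical" href="%s">' % escape(url, quote=True)


# The same article reached three different ways gets the same canonical tag.
for reached_via in (
    "/articles/hello-world",
    "/category/news/hello-world",
    "/tag/intro/hello-world",
):
    slug = reached_via.rsplit("/", 1)[-1]
    print(reached_via, "->", canonical_tag(slug))
```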

------
terryjsmith
This is great. I work for a blog network that got hit similarly hard by both
Panda updates, and we've done a lot of the same stuff (removing and/or de-
indexing short/no page view content, improving our site map and linking
system), but scrapers continue to be our biggest issue, especially the ones we
don't syndicate to. These can rank higher than us in Google as well, and we
continue to try to find a way to flag them or submit removal requests without
having to do each one manually.

This is a great write up though, it's good to see that others are trying the
same things as we are.

------
jerrya
Hi Eric,

Have you tried approaching Google to discuss it?

~~~
ackkchoo
Hi Jerry - I don't know anyone who works in search quality at Google, and they
have said a few times that they aren't going to make any manual exceptions.
I've filed a "reconsideration request" with Google, and only got a boilerplate
response saying our site has "no manual penalties". Unfortunately, any spam
report, reconsideration request, or communication via the Webmaster Tools
interface seemingly goes to a black hole. I wouldn't even know who to contact
or how to contact them.

I can understand why they can't reveal too much about their algorithm, but at
the same time if they want to build a "healthier web ecosystem" I think they
have to do a better job of communicating with legitimate webmasters (this is
going to be the subject of a future post). There is so much uncertainty out
there that I think even legitimate site owners are afraid to speak out for
fear of being punished, when instead they should be working together with
Google to help improve things. It's like the attitude is: if you're doing
well, don't say anything or you might draw Google's scrutiny (especially
because you're probably doing something shady), and if you are doing badly
don't say anything either because Google will just punish you more (because
they have not told you why you have been punished in the first place). So
instead people are driven to anonymous postings on random forums where wild
misinformation spreads.

The intent of this post wasn't to focus on why our particular site wasn't
ranking well anymore in Google, but to try to see if there were other website
owners out there who feel comfortable coming out with their stories, and to
share anything they may have learned. Again, there is so little real
information coming from Google that most people I know have just been grasping
at straws, so it would be great to hear other people's stories. I recently
read about another seemingly legit site with lots of good content punished
([http://www.google.com/support/forum/p/Webmasters/thread?tid=...](http://www.google.com/support/forum/p/Webmasters/thread?tid=1981eab5c6140e68&hl=en)),
so there must be more of them out there.

~~~
brentdev
They've admitted that there's an algorithm and that sites can also be manually
'reviewed'... layman's terms... yeah they made exceptions and got called out
hardcore with this last panda release on it and at least they 'claim' that
they're working on fixing it. search google for panda site slap. you'll see
lists of all the big ones that went down...

~~~
narad
Matt Cutts has already confirmed that there won't be any manual changes.
[http://followmattcutts.com/2011/05/25/matt-cutts-confirms-
no...](http://followmattcutts.com/2011/05/25/matt-cutts-confirms-no-manual-
exceptions-in-panda-algorithm/)

~~~
jerrya
And Matt refers to this blog post by Amit Singhal to give the official word on
what Panda does. [http://googlewebmastercentral.blogspot.com/2011/05/more-
guid...](http://googlewebmastercentral.blogspot.com/2011/05/more-guidance-on-
building-high-quality.html)

That link answers one of my questions about Eric's article, which is why
TravBuddy took the steps they did take.

------
johng
I'd like to talk to you a bit about your site. Please send me an email, I
don't see an obvious contact form on your blog.... johng a t forum foundry d o
t com

