Google's Cat & Mouse SEO Game: Google's Collateral Damage (seobook.com)
21 points by pier0 on Mar 22, 2011 | 19 comments

What I'd like to see somebody from Google explain to me is why Google considers Wikipedia an example of a high-quality site (Cutts says so in http://www.wired.com/epicenter/2011/03/the-panda-that-hates-...), when Wikipedia users copied and regurgitated content from a website I own at least 4000 times (that's how many times it has been referenced on different Wikipedia pages; it has likely been copied thousands more times without being referenced). Wikipedia ranks #1 or #2 for almost all these articles, while my site is rarely among the top ten and has also been hurt in this latest update. How does this work?

The big issue here is that if you don't get other people to link to your stuff then Google has little way to distinguish quality vs non-quality, source vs copy, lower quality rewrite, etc.

While many people find push marketing tasteless and annoying (and perhaps a signal of poor product quality), the truth is that most successful sites have relied on the person launching them already having some combination of status + influence + distribution + connections, or they used push marketing for a while to build that following and awareness.

The easiest SEO answer to obscurity or lack of awareness (assuming the on-page SEO & site structure are solid) is to think territorially & dominate a niche where you can own the idea. This is a great article on that front. http://www.copyblogger.com/how-to-dominate-your-niche/

Right, I understand the technical reasons why Wikipedia outranks this site. Google started out terribly arrogant, thinking they were smarter than SEOs and that their search engine would be so hard to game that they made the PageRank values publicly visible. It went downhill for them quickly, necessitating the introduction of nofollow, which is why my site gets no benefit at all from any of these Wikipedia links. Wikipedia also benefits from all the other content on their site (some of which is admittedly good enough, especially in topics of interest to programmers).
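For anyone unfamiliar with the nofollow mechanism mentioned above, here is a minimal Python sketch of how a crawler might separate links that carry rel="nofollow" from ordinary ones. The class name, function name, and sample HTML are hypothetical and purely illustrative. It assumes, as the comment above does, that a nofollow link passes no ranking credit to its target; it is not a description of how Google actually processes links.

```python
from html.parser import HTMLParser


class LinkClassifier(HTMLParser):
    """Collects anchor hrefs, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel is a space-separated token list; the value may be missing.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def classify_links(html):
    """Return (followed, nofollowed) href lists for an HTML fragment."""
    parser = LinkClassifier()
    parser.feed(html)
    parser.close()
    return parser.followed, parser.nofollowed


# Hypothetical page: one external reference marked nofollow, one internal link.
sample = (
    '<p><a href="http://example.com/source" rel="nofollow">ref</a> '
    '<a href="/wiki/Other">internal</a></p>'
)
followed, nofollowed = classify_links(sample)
print("followed:", followed)
print("nofollowed:", nofollowed)
```

Under that assumption, only the links in the first list would count toward the target's ranking, which is the situation the comment describes for external references on Wikipedia.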

Here is another great article for understanding the Google / Wikipedia ecosystem http://www.johnon.com/399/google-las-vegas.html

It's not that Wikipedia is necessarily high quality; it's that users want to see relevant Wikipedia articles near the top of their search results. It's a really useful starting point for a wide variety of searches.

You might say the same thing about many other content farms. That's their whole purpose: to appear good enough for people who are not very familiar with the topic. Google specifically said that the goal of their update was to penalize "low-quality sites—sites which are low-value add for users, copy content from other websites" (http://googleblog.blogspot.com/2011/02/finding-more-high-qua...). As I said, Wikipedia did exactly this. Thousands of times for just this one site. Providing links at the bottom to where they copied their content from is not a high-value add. This update punished my high-quality site and helped Wikipedia.

Wikipedia has very decentralized interests, unlike single-owner websites. Google can trust it because things like link solicitation are hard to keep in Wikipedia articles, given its open-source nature.

The noncommercial nature of Wikipedia at least makes it somewhat more trustworthy than other sites with likely commercial motives.

This has been disproven many times. People with commercial and political and other motives can write Wikipedia's content, which can be much more damaging than clearly marked ads.

Also, since Wikipedia copied, regurgitated, and sometimes introduced errors in content from a commercial site (which you don't consider trustworthy), it is obviously not any more trustworthy.

I really like the infographic, but then again I'm kind of a visual thinker. Perhaps a more entertaining way to diagram it would be "The Land of SEO", which is really just the outline of Eurasia, and you could map Genghis Khan's great battles to seminal moments in SEO strategy changes. Brad Templeton's email on ARPAnet, about how sending a joke a week to a billion people who each had to pay a penny if they thought it was funny would make him a multi-millionaire, was fairly prophetic, with ad clicks or 'page views' standing in for sending fractions of a cent.

It will be interesting to see how this evolves over time. One of the things I don't like about the idea of 'tv unplugged' aka the Hulu Model, aka the Netflix model, is the idea that ads you don't like could follow you from channel to channel without you being able to switch. This evolution of Google's strategy between revenue, value, and antagonizing advertisers has great depth and complexity to it which will take time to play out.

Next up, the great 'unflowering' where all of the useful information gets locked up behind paywalls and the 'free' information you can get on the internet is about as useful as the 'free' information you can get from those papers that are free at the coffee shop.

I think there will always be some amount of great content accessible because brand & popularity are required to be able to charge, and there will always be someone who is either really hungry and willing to put the extra work in, or they are just doing what they love & delighted to share it.

Pieces may go behind a paywall (like our site or iTulip), but they both still share lots, and sites like KhanAcademy offer tons of great accessible content.

The big issue though is that in most markets people have to get a bit shook down before they find what they need. I remember thinking that search and ads were going to be huge when I bought Inktomi & DoubleClick stock during the last stock bubble. I of course got my head served on a platter on that, but it was cool to get into search a few years later & be right that time. The Google IPO gains made up for the losses on the earlier bets, but everything comes down to timing & then just sticking with something you believe in.

The first site you find in any category won't likely be the best one, just the one which is the most heavily marketed. But the same was true before there was a web.

The hard part with paid content is that the more common it becomes the harder it is for Google or other ad networks to take a big slice of the value chain. For that reason I see the move to paid online content being a slow one (outside of niche b2b sort of environments).

Valid points. I won't bore HN with my theories about what makes the information economy tick, but suffice it to say that the value chain will evolve as efficiencies in the market are developed.

"The Google IPO gains made up for the losses on the earlier bets, but everything comes down to timing & then just sticking with something you believe in."

Yes, but Google's performance in the last 5 years hasn't been stellar (from a stock perspective; what might have been dividends is instead being banked by the company, which is another rant). The tricky bit is understanding why Google and not Altavista or Yahoo? Not because they didn't have traction and penetration, but, I believe, because they didn't understand the economics of what they were selling.

Imagine that Zog the caveman starts getting trade goods for pies made out of mud. He's thrilled and others get into the mud pie business, but one guy realizes that the pies that people want are round and hard and so he also gets into the mud pie business but only makes his pies out of hard fired clay. He becomes the dominant mud pie seller and runs the other guys out of business. Could they have prevented it? Sure they could but they needed a better understanding of how and why customers valued their mud pies.

Google got there sooner and it gave them a tremendous advantage, but at the same time, to completely abuse the metaphor, they realized they had an elephant's tail and knew when to step aside when the crap came out. But they still struggle with elephantness, or at least they did 12 months ago :-)

Am I the only one that finds it hilarious that they used an infographic to describe the pitfalls of handling SEO spam?

The point wasn't just that Google has to handle spam, but also that they are stuck dealing with it even as they create/fund much of it ... and that their "solutions" for round 1 leave unforeseen exploits in round 2 or 3.

To put it as an equation...

increased weight on domain authority + rel nofollow + premium adsense feeds = Content farm problem

Their latest update left a couple other big exploits open as well.

As far as faulting using an infographic as a format goes, people are more receptive to them than textual articles.

Sure some people do exploitative crap about total junk & have spammy storyboards put together by total strangers for $20, but the above was a storyboard that came from someone who has watched how search has evolved over the past ~ decade, with literally 20,000+ hours of experience in the SEO game.

I could write a 9000+ word article like http://www.seobook.com/relevancy/ but generally the market is more receptive toward infographics. As a marketer it is generally easier to go with human nature than to try to fight it. That is marketing 101 ;)

And since people do exploit infographics for links, sure it can seem cutting edge to label anything in an infographic as 'spam' (or some such), but if you could find me another online document that has described everything on that page with better clarity & in a way that is easier to consume faster I would be quite surprised.

Right, you're talking about the pitfalls of handling SEO spam, e.g. unwittingly creating more spam vectors through your attempts to stop previous ones.

My point about infographics is that although they may indeed deliver a message effectively, they are infamous for being used as link bait to game SEO.

I love Aaron Wall's blog, but can't you guys quit being suckers for infographics?

How else would you suggest visually laying out the patterns? Would you like it in a flash file instead? :D

Prose might be a good alternative. Your infographic is laid out like a flowchart, but seems to be anything but. I found it extremely hard to follow, and would've preferred 500 words instead.

The tricky part was that many of the things were happening in parallel...so it is somewhat hard to connect it all together. I sorta tried to push it as best I could into a cohesive organization with themes like "link spam, adsense, content mills & domain authority" but search is pretty complex and I am sure we could have done a bit better...one problem with anything like this is determining how deep or nuanced to go with it. The more information you put on it the harder it is to keep it organized.
