
This has nothing to do with Reader. We were tackling a spammer and inadvertently took action on the root page of digg.com.

Here's the official statement from Google: "We're sorry about the inconvenience this morning to people trying to search for Digg. In the process of removing a spammy submitted link on Digg.com, we inadvertently applied the webspam action to the whole site. We're correcting this, and the fix should be deployed shortly."

From talking to the relevant engineer, I think digg.com should be fully back in our results within 15 minutes or so. After that, we'll be looking into what protections or process improvements would make this less likely to happen in the future.

Added: I believe Digg is fully back now.




If this happened to a less popular site, what chance would the site owner have of getting attention for the problem and getting it fixed?

-----


None.

Even if you know people within Google, there's so much fear of allegations of impropriety that employees are too afraid to even ask the appropriate team if there's a possible mistake that they should look at.

-----


Hey relix, it took an unfortunate chain of corner cases for this to happen, and for this situation it was actually more likely for the corner cases to hit a larger site rather than a less popular site.

In general, when a member of the webspam team directly applies a manual webspam action against a site, we also drop a note to the site owner at http://google.com/webmasters/ . That helps the site owner tell whether something is going on with manual spam vs. just algorithmic ranking. Then any site can do a reconsideration request at the same place or post in our webmaster forum at https://productforums.google.com/forum/#!forum/webmasters .

People like to scrutinize Google, so I've noticed that a "Google unfairly penalized me" blog post makes its way to us pretty often.

-----


Hi Matt,

That doesn't match my experience. Could you explain the penalty against onlineslangdictionary.com?

Showing citations of slang use[1] caused what appears to be an algorithmic penalty. The correlation between showing citations and the presence of a penalty is apparent:

http://onlineslangdictionary.com/static/images/panda/overvie...

Missing from those 3 charts is the one showing that citations were once again removed over 120 days ago, yet the penalty remains. It would appear that the algorithmic penalty was turned into a manual penalty.

I've followed all procedures including those listed in your comment, without resolution.

[1] By citations of slang use, I mean short (1-3 sentence) attributed excerpts of published works, shown within the appropriate definitions, as evidence of the correctness of those definitions. All citations were gathered and posted by hand.

-----


Hi Walter, the only manual webspam action I see regarding onlineslangdictionary.com is from several years ago (are you familiar with a company called Web Build Pages or someone named Jim Boykin?), but that no longer applies here.

You're affected by a couple algorithms in our general web ranking. The first is our page layout algorithm. See http://googlewebmastercentral.blogspot.com/2012/01/page-layo... or http://searchengineland.com/google-may-penalize-ad-heavy-pag... for more context on that. In particular, comparing a page like http://onlineslangdictionary.com/meaning-definition-of/compy to a page like http://www.urbandictionary.com/define.php?term=Pepperazzi... , your site has much more prominent ads above the fold compared to Urban Dictionary.

Your site is also affected by our Panda algorithm. Here's a blog post we wrote to give guidance to sites that are affected by Panda: http://googlewebmastercentral.blogspot.com/2011/05/more-guid...

-----


Hi, this is Jim Boykin. I have no record of ever doing anything for onlineslangdictionary.com. I guess it "no longer applies here," but it's strange that you'd associate me with this website. Hmm, I wonder who else Google associates me with that I've had nothing to do with...

-----


P.S. One other quick thing. I saw you sending me tweets, but the tweets looked fairly repetitive, and you hadn't chosen a Twitter avatar. I get a lot of tweets from bots, and this looked fairly close to bot-like to me: https://twitter.com/mattcutts/status/315232934040846337/phot... That (plus the fact that the site had no current manual webspam actions, plus the fact that I wasn't sure what you meant by citations) meant that I didn't reply. Hope that helps.

-----


Makes sense. All the tweets were by hand. I tried to tweet every weekday but missed some, then eventually gave up.

I just didn't know what to do upon getting no feedback from you guys after posting to the Google Webmaster forums, filing reconsideration requests, contacting friends at Google, posting to and commenting on Reddit about it, commenting on HN about it, posting to Facebook, blogging, and tweeting about it, and putting a yellow box at the top of all pages on the site mentioning the penalty and linking to a page with the details.

Thanks again for talking with me about this. (I'd still like to hear about Web Build Pages / Jim Boykin and the rest - https://news.ycombinator.com/item?id=5444996 ...)

-----


After trying for 1 year and 11 months to get anyone from Google to talk to me about the penalty, I can't tell you how ecstatic I am that you responded. Thank you! I would have responded sooner but I wanted to deploy some changes to the site, and also I was floored by some of the things in your comment and didn't know quite how to respond.

I'm very happy to read that there's no current manual action against onlineslangdictionary.com.

> Hi Walter, the only manual webspam action I see regarding onlineslangdictionary.com is from several years ago...

Oh? I've never received a notice about having a penalty or about a penalty being removed. When was that penalty in place?

> (are you familiar with a company called Web Build Pages or someone named Jim Boykin?)

Nope. I hadn't heard of either until I read your comment. Why do you ask? Did he/they cause the manual penalty against my site and then cause the manual penalty to be removed? How?

> You're affected by a couple algorithms in our general web ranking. The first is our page layout algorithm. See <snip>. In particular, comparing a page like http://onlineslangdictionary.com/meaning-definition-of/compy to a page like http://www.urbandictionary.com/define.php?term=Pepperazzi..., your site has much more prominent ads above the fold compared to Urban Dictionary.

Interesting! I had read that post on the Webmaster Central Blog before, but never even considered that the layout algorithm was penalizing my site, for a few reasons. 1. The upper leaderboard ad + side wide skyscraper ad combination is so commonly used everywhere on the web. 2. I removed the leaderboard ad from the entire site from 11 May 2011 through 31 August 2011 and found it had no effect on the site's ranking. (It also had no effect on user behavior, such as bounce rate or time-on-site.) 3. My site isn't one of the sites "that go much further to load the top of the page with ads to an excessive degree or that make it hard to find the actual original content on the page."

I have removed all advertising from onlineslangdictionary.com, and also removed the yellow box at the top informing visitors why they no longer have access to the citations of slang use. (Better safe than sorry, I guess.) The page layout penalty should no longer be a problem.

(Since ads are no longer on my site, for reference, here are screenshots of those two URLs you linked, one from my site and one from urbandictionary.com:

    OSD: http://onlineslangdictionary.com/static/images/layout-algorithm/2013-03-24-osd.com-atf-def-of-compy.png
    UD: http://onlineslangdictionary.com/static/images/layout-algorithm/2013-03-24-ud.com-atf-def-of-pepperazzi.png
Interestingly, with my screen size, Urban Dictionary has more pixels above the fold dedicated to ads.)

> Your site is also affected by our Panda algorithm. Here's a blog post we wrote to give guidance to sites that are affected by Panda: http://googlewebmastercentral.blogspot.com/2011/05/more-guid...

I've read that article in the past, and gave it a re-read. I understand Panda is about penalizing low-quality sites.

High-quality dictionaries have citations of use from published sources. Citations prove the definitions are correct, provide real-world illustrations of proper usage, are just plain interesting, etc. Penalizing a dictionary for showing citations is like penalizing Wikipedia for having lots of numbered sentence fragments at the bottom of their articles. That's how they prove that their claims are factual.

onlineslangdictionary.com had around 5,000 citations of slang use, collected and added by hand. The presence/absence of citations on the site is the only thing I've found to correlate with the presence/absence of a penalty (http://onlineslangdictionary.com/static/images/panda/overvie...).

Due to Panda, they were removed for non-authenticated users (including Googlebot) most recently starting 16 November 2012. They have been unavailable to authenticated users starting 8 March 2013. Because of a coding mistake on my part, they were visible for between 3 and 4 hours on 12 March 2013. (Basically: I accidentally inverted the logic of the 'if' statement that checks whether citations need to be removed (the answer should always be "yes") causing the code to not remove citations.) I fixed the bug as soon as I noticed it, and filed an updated reconsideration request.
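For illustration only, here is a minimal Python sketch of the kind of inverted-condition bug described above. It is not the site's actual code; the function names, the `hide_citations` flag, and the sample data are all hypothetical.

```python
def render_entry(definition: str, citations: list[str], hide_citations: bool) -> str:
    """Intended behavior: append citations only when they are NOT being hidden."""
    parts = [definition]
    if not hide_citations:
        parts.extend(citations)
    return "\n".join(parts)


def render_entry_buggy(definition: str, citations: list[str], hide_citations: bool) -> str:
    """Same function with the condition accidentally inverted."""
    parts = [definition]
    if hide_citations:  # bug: the 'not' was dropped, so hiding now shows citations
        parts.extend(citations)
    return "\n".join(parts)


# Hypothetical sample data.
definition = "compy: a computer"
citations = ['"Fire up the compy." (hypothetical source)']

correct_page = render_entry(definition, citations, hide_citations=True)
buggy_page = render_entry_buggy(definition, citations, hide_citations=True)
```

With `hide_citations=True`, the correct version returns only the definition, while the buggy version still includes the citation text. A one-line regression test on that flag would catch the inversion before deployment.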

The citations are gone. All content on the site is 100% original. It's got the only real, free slang thesaurus on the web. There are other unique features. I don't know what Panda would be penalizing the site for.

I started The Online Slang Dictionary in 1996, and have been working on it full-time for the past 6 years. My goal is to create the "Wiktionary of Slang" - not a flash-in-the-pan made-for-AdSense site. I was delighted with the site's ranking Between The Penalties: from 8 days after I first removed the citations until 3 days after I put them back on the site (13 November 2011 until 9 October 2012).

It would be awesome to have the chance to once again compete on a level playing field with other slang websites. I'd love to have the time to implement the new features I've been dying to add, rather than spending time (over a year now) trying to guess why Google is penalizing the site and fixing those guesses - since my data shows that site growth is impossible with the penalties in place.

-----


Thanks Matt, that's good to know.

-----


You bet. I thought https://news.ycombinator.com/item?id=5422855 was a pretty good example of that. I'd never heard of Amy Wilentz, but her blog post made it our way via several channels and made the front page of HN today. Admittedly it was a bad fact to get wrong, but I think it still demonstrates that blogging about Google doing something suboptimally can get a fair amount of attention.

-----


A similar issue (removed from search results, but remained in index) happened to my site due to a DNS issue last year, and I had to perform some magic steps+ in webmaster tools and we re-appeared in search within a day or so.

+ no magic, I just don't remember what exactly I had to do.

-----


Star Trek TOS, "The Ultimate Computer": Captain James T. Kirk: "And how long will it be before all of us simply get in the way?"

-----


So was it the well-known "missing robots.txt kills a site" problem (as described in the original article) or a manual action gone astray?

When you're looking at protections, you need to ask what happens if this is a mom-and-pop site and not a well-known SV company with high-level Google contacts.

-----


I can't speak for Google or Matt Cutts. But it sounds like he said it was a Google bug.

-----


Hey Matt, if it's a problem with one link alone, then just take action on that one page. That could be considered action; anything more is overreaction. In the name of quality, one should not take the whole site down for a day. This action of yours is completely wrong.

-----


Why not take the 'nuclear launch' method? Every time someone's about to de-list [large number] of pages, a second human has to confirm it.

(This is presuming humans were involved at all)
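
A rough Python sketch of the two-person confirmation idea above. Everything here is an assumption made for illustration (the `DelistRequest` class, the `BULK_THRESHOLD` value, the reviewer names); nothing in the thread says how Google's tooling actually works.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff: actions touching at least this many pages need a second person.
BULK_THRESHOLD = 1000


@dataclass
class DelistRequest:
    requested_by: str
    page_count: int
    approvals: set[str] = field(default_factory=set)

    def approve(self, reviewer: str) -> None:
        # The requester cannot act as their own second key.
        if reviewer == self.requested_by:
            raise ValueError("requester cannot approve their own action")
        self.approvals.add(reviewer)

    def can_execute(self) -> bool:
        # Small actions go through directly; bulk actions need one other approver.
        if self.page_count < BULK_THRESHOLD:
            return True
        return len(self.approvals) >= 1


single_link = DelistRequest(requested_by="alice", page_count=1)
whole_site = DelistRequest(requested_by="alice", page_count=50_000)
```

Here `single_link.can_execute()` is true immediately, while `whole_site.can_execute()` stays false until someone other than "alice" calls `approve`. The threshold is the main design choice: too low and every routine action needs two people, too high and a site-wide mistake slips through.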

-----


Is this a stunt to put Digg.com on the map again?

-----


I would imagine it was more like:

Apply actions to (tick one) :

[] Individual Link [] Individual Page [] Individual Keyword [] Whole Domain

[SUBMIT]

And the manual reviewer was drunk on Google Juice :)

-----


I see so many flaws with this process, especially for small business owners who may not be able to get the attention that Digg just got.

-----


@Matt How can 1 bad link on a site take down a whole site? Is this normal?

-----



