Ask HN: Why is Google purging all search results for Syrian Arab News Agency?
51 points by teamgb on Sept 14, 2013 | 19 comments
The website sana.sy of the Syrian Arab News Agency has been 'censored' or 'purged' from Google search results. Even after clicking through 10 pages of results, not a single one links to Sana.sy. Contrast with the results from DuckDuckGo and Bing where it's the top result.

Google: https://www.google.com/search?q=syrian+arab+news+agency

DuckDuckGo: https://duckduckgo.com/?q=syrian+arab+news+agency

Bing: http://www.bing.com/search?q=syrian+arab+news+agency

This test was conducted based on a post about the Syrian War (http://www.moonofalabama.org/2013/09/a-short-history-of-the-war-on-syria-2006-2014.html#more) which has been posted on HN here (https://news.ycombinator.com/item?id=6387286).

Technically it's not purged, 178000 pages are in index: https://www.google.com/search?q=site:sana.sy

Perhaps they're downranked, but anyway the first two results on your link lead to the Wikipedia and Facebook pages of that agency, both of which have a direct link to the site in question.

http://sana.sy/robots.txt - oops, no robots.txt. Maybe it's just a result of poor SEO.

Not having robots.txt just results in Google defaulting to their reasonable crawl settings. For most sites, a hastily configured robots.txt usually results in problems rather than SEO help.
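
To make that concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. It only shows how the standard library interprets rules, not how Googlebot actually behaves. The `can_crawl` helper and the page path are hypothetical, and an empty rule list stands in for a missing robots.txt.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_lines, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parses in-memory lines; no network access
    return parser.can_fetch(user_agent, url)


# Hypothetical page URL, used only for illustration.
PAGE = "http://sana.sy/eng/index.html"

# An empty rule set stands in for a missing robots.txt: nothing is disallowed.
no_rules = can_crawl([], "Googlebot", PAGE)

# A hastily written file that blocks every crawler from the whole site.
hasty_rules = can_crawl(["User-agent: *", "Disallow: /"], "Googlebot", PAGE)

print(no_rules, hasty_rules)
```

With no rules the page is crawlable, while the two-line "hasty" file shuts every crawler out of the whole site, which is the kind of self-inflicted problem described above.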

The indexed pages are also missing most of the 'important' landing pages and other big pages, so most likely some sort of automated spam detection was triggered.

My ISP seems to time out on the domain, but sshing into various servers and curling from there returns the page.
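
If anyone wants to compare vantage points more systematically, a rough sketch along these lines could be run from each network. It uses only the Python standard library. The `probe` function and its result labels are my own invention for illustration, and the `opener` parameter exists only so the function can be exercised without touching the network.

```python
import socket
import urllib.error
import urllib.request


def probe(url, timeout=10, opener=urllib.request.urlopen):
    """Classify one fetch attempt as 'ok', 'http_error', 'timeout' or 'unreachable'."""
    try:
        with opener(url, timeout=timeout) as response:
            return "ok", response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 403 or 503).
        return "http_error", err.code
    except urllib.error.URLError as err:
        # urlopen wraps connection failures; a connect timeout appears as err.reason.
        if isinstance(err.reason, socket.timeout):
            return "timeout", None
        return "unreachable", None
    except socket.timeout:
        # Timeouts while reading the response are raised directly.
        return "timeout", None
```

Calling `probe("http://sana.sy/")` from a home connection and from a few remote servers would show whether the timeouts follow the network or the site. It says nothing about what Google's crawler sees.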

Just saying, the absence of `robots.txt` can be a sign of not really caring about SEO.

Not sure about SEO; otherwise, how would SANA be the top result for Bing, DuckDuckGo, Yandex and... (cue drumroll) Lycos!

I'm not sure either. But search engines generally rank very differently. I use Google, Yandex and DDG, and they often give completely different output: Google is generally much better, but it often lacks sites that show up in Yandex and the others.

To quote from the Wikipedia article about SANA:

  Up until November 2012, SANA's website was hosted in Dallas, Texas by the United States company SoftLayer. Due to sanctions related to the Syrian civil war, which make this hosting illegal, the SoftLayer company was obliged to terminate its hosting responsibilities with SANA.

I have no idea about the exact legal powers of international (unilateral?) sanctions in the US, but is it possible that Google de-listed SANA because of legal issues?

I'm curious what sanctions prevent hosting a Syrian website. http://damascus.usembassy.gov/sanctions-syr.html claims there are only 3 sanctions: no US goods (basically) can be exported to Syria, something against the commercial bank, and denying Syria access to the US financial system.

How can you host a website without paying for it? Just like fighting against "pirate" websites, just in reverse this time.

If there were any legal issues, why are Bing and DuckDuckGo still returning SANA in their results? This is all very strange.

I asked the crawl/indexing team about this, and it looks like sana.sy hasn't let Google crawl the site since August. It's unclear whether it's deliberate vs. something like timeouts. So it's nothing on Google's side. No webspam-related issues or anything like that, either.

In fact, if sana.sy had registered for Google's free webmaster console at google.com/webmasters/, they would have gotten automatic alert emails about the high level of errors we get when trying to crawl the site.

Technically it may have been purged by blackhat SEO, because Google's search result ranking algorithm is actually poor and can easily be gamed by anyone who has 100 dollars in her pocket.

Just because Bing/DDG SERPs were not gamed by "SEO" activity doesn't mean it can't be done, or that those search engines' algorithms are any better than Google's.

I think it's obvious. That's our reality unfortunately.

As far as I know, this is an unprecedented action for Google in the US. Just because it fits your narrative does not make it obvious!

I have a website where not only did Google remove it, they removed all pages that linked to it, and all future pages that linked to it. You could search on the domain and get zero results. So they have the capability to do anything. (The website is a DMV licensed traffic school for traffic tickets). The website doesn't have to be anything too special for them to purge it.

That sounds like a manual webspam action. There's a thing for it under Google Webmaster Tools.

I'm going to go out on a limb here and say that there's more to the story than what you have stated. There are numerous reasons why your site may have been removed. It could have been shady SEO tactics such as link building or invalid meta data, which are a big no-no for Google. You should definitely check your Google Webmaster Tools to see what's up.

It was a classic: I bought about 20 links. But as soon as the webspam action happened I removed them (through text-link-ads.com). I was on top of it and cleaned it up, but I have resubmitted it for re-approval 3 times over 1.5 years now with no luck.

I broke the rules but Viagra sites get better treatment.

The problem was that 80% of the customers came from lists given out by courthouses. Customers would type the web address into Google search instead of the address bar.

Once the website disappeared from the Internet, the only way customers could find the site was through Google AdWords, where spending went from $150 a day to $500 a day. A total win for Google I guess!

It was impossible to spend $500 a day forever, so I put the website (the same website, no changes) onto a new domain and now it spends about $200 a day on Google AdWords. Much better. I seem to have received the worst possible webspam action for really very little. Consider that other sites buy thousands of links; my 20 links were small potatoes that I thought would fly under the radar.

The whole incident cost about $100,000+ in sales, mostly from customers who, if they just knew how to use the address bar, would have made it to the website.

When I see other websites have issues with Google, I know it doesn't take a whole lot to bring a great deal of Google issues upon them. Google, if it wants to, can just remove them or send them to page 200, and since Google gets 80% or so of searches and people don't know how to use the address bar, the website is going to have to buy AdWords in order to stay online or change domains (assuming it is not an on-page issue).

Did you ever contact them regarding this? They have webmaster tools for this.

By the time it's really obvious most people will already be plugged into the Google Matrix with no recollection of anything :)
