
Spam sites in Google News - gnewssheriff
http://googlenewsspam.com/index_.php
======
Animats
One of the sites mentioned, "wtnnews.com" ("West Texas News"), is a
straightforward content farm running Google Adsense ads. Here's their home
site, "wtnmedia.com", "A global business-to-business media company where you
can forge high level client relationships through the web, print, and events."
Their advertising kit shows a base ad rate of $1,295 for 7,500 impressions of
a banner ad, for a Cost per Thousand of about $173. Nobody would pay that; typical
CPTs today are around $1. So that's not their revenue source.
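
The rate arithmetic can be checked with a minimal sketch. The helper name `cost_per_thousand` is only illustrative; the two figures are the ones quoted from the advertising kit.

```python
def cost_per_thousand(price_usd: float, impressions: int) -> float:
    """Return the cost per 1,000 impressions for a flat-rate ad buy."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return price_usd / impressions * 1000


# Figures quoted from the advertising kit: $1,295 for 7,500 impressions.
cpt = cost_per_thousand(1295, 7500)
print(f"${cpt:.2f} per thousand impressions")
```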

The use of the business addresses of other businesses is unusual. That's
identity theft. Our SiteTruth system checks out the address of the business
found on the web site, and it brings up StreetView pictures of auto repair
shops for some of those sites. Lots of sites don't give an address, or use
some mail drop, but use of fake street addresses is rare. It tends to draw tax
collector and law enforcement attention.

Google's anti-spam efforts have never been very aggressive. They filter out
some of the worst offenders, but don't try hard to get rid of bottom-feeder
sites like these. If Google didn't put spam sites in news results, how would
they get revenue? Google News itself has no ads, and many of the top news
outlets have their own advertising systems that don't involve Google.

~~~
a3n
"... where you can _forge_ high level client relationships ..."

Great double entendre.

------
danso
I used to work on a news site and the process to get listed on Google News was
very manual, at least compared to how you think the rest of Google operates.
You had to have a physical address and phone number (though I don't think the
two attributes are actually tested, i.e. sending a postcard via snailmail).
You also had to configure your pages for the spider so that if your site
delivers wire content (e.g. Reuters, AP) or does aggregation, the spider would
ignore those pages, and only pick up your original news. The first time we
submitted, the response we got seemed to indicate that someone had looked over
our sitemap and saw aggregated news/wire reports, and so we had to resubmit.
The process took a few weeks, IIRC.

That said, bigger news sites were probably whitelisted...it seems like the
whitelist could be done algorithmically...but the number of _new_
authoritative news sites, in theory, changes so slowly, that there's probably
not really a need to do that.

~~~
eli
You left out the most bizarre part of the process -- the insane constraints on
URLs. For example:

 _" The URL for each article must contain a unique number consisting of at
least three digits. For example, we can't crawl an article with this URL:
[http://www.google.com/news/article23.html](http://www.google.com/news/article23.html).
We can, however, crawl an article with this URL:
[http://www.google.com/news/article234.html](http://www.google.com/news/article234.html).
"_

(Source:
[https://support.google.com/news/publisher/answer/68323?hl=en](https://support.google.com/news/publisher/answer/68323?hl=en))
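
As a rough sketch of how a publisher might pre-check URLs against that rule: this assumes "a unique number consisting of at least three digits" means a run of three or more consecutive digits in the URL path. That reading, and the function name, are assumptions. It is not Google's actual crawler logic, and it does not check uniqueness.

```python
import re
from urllib.parse import urlparse

# Assumption: three or more consecutive digits anywhere in the path qualifies.
_THREE_DIGITS = re.compile(r"\d{3,}")


def has_three_digit_number(url: str) -> bool:
    """Return True if the URL path contains a run of at least three digits."""
    path = urlparse(url).path
    return bool(_THREE_DIGITS.search(path))


# The two example URLs from the quoted guideline.
for url in (
    "http://www.google.com/news/article23.html",
    "http://www.google.com/news/article234.html",
):
    print(url, has_three_digit_number(url))
```

Only the path is checked, so digits in a port number or hostname don't count by accident.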

~~~
declan
When I worked at CNET, we were (and still are) indexed by Google News despite
not having three-digit numbers in URLs.

That's because your excerpt from Google News' guidelines left off a very
important addendum: "Please note that this rule is waived with News sitemaps."
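
For readers who haven't seen one, here is a hedged sketch of generating a News sitemap. The element names follow the sitemap-news 0.9 schema as I understand it, so check Google's current publisher documentation before relying on them. The function name, the publication name, and the example article are all made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"


def build_news_sitemap(articles, publication_name, language="en"):
    """Build a News sitemap string from dicts with loc, published, title keys."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("news", NEWS_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for article in articles:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = article["loc"]
        news = ET.SubElement(url, f"{{{NEWS_NS}}}news")
        pub = ET.SubElement(news, f"{{{NEWS_NS}}}publication")
        ET.SubElement(pub, f"{{{NEWS_NS}}}name").text = publication_name
        ET.SubElement(pub, f"{{{NEWS_NS}}}language").text = language
        ET.SubElement(news, f"{{{NEWS_NS}}}publication_date").text = article["published"]
        ET.SubElement(news, f"{{{NEWS_NS}}}title").text = article["title"]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical article whose URL has no three-digit number in it.
print(build_news_sitemap(
    [{
        "loc": "https://example.com/news/some-story/",
        "published": "2015-01-12",
        "title": "Some Story",
    }],
    publication_name="Example News",
))
```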

~~~
bushido
Actually the big reason why it may have worked back in the day for CNET was
because it was CNET.

Though these days a news sitemap suffices.

I would personally still add the 3+ digit unique number to the permalinks and
eventually if necessary remove it once rankings improve overall on Google
News.

Some Back Story

==================

I consulted for a Movie News and Review website for about 6 years. We had news
sitemaps and would show up on Google News once in a while, but one of the
small yet significant changes that we made that increased ranking on Google
News was when we implemented adding a unique number to urls.

That was implemented back around 2009-2010.

We stopped using those unique numbers in the URL around 2013, mostly because
the numbers were becoming pretty darn big. We noticed a small downtrend in
Google News traffic which stabilized in a couple of months.

For anyone using WordPress, unique numbers can be added using a permalink
structure such as:

    
    
       /%postname%-%post_id%/
    

If you do decide to use it, don't forget to set up 301 redirects.
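
A minimal sketch of the redirect bookkeeping, assuming the old structure was plain `/%postname%/`. That old structure, the function names, and the sample posts are assumptions for illustration. The resulting map would still have to be fed to whatever serves the 301s.

```python
def new_permalink(slug: str, post_id: int) -> str:
    """Mirror the /%postname%-%post_id%/ permalink structure."""
    return f"/{slug}-{post_id}/"


def build_redirects(posts):
    """Map each old /%postname%/ path to its new path, to be served as a 301."""
    return {f"/{slug}/": new_permalink(slug, post_id) for slug, post_id in posts}


# Hypothetical (slug, post_id) pairs.
redirects = build_redirects([("some-movie-review", 1234), ("box-office-news", 1240)])
for old_path, new_path in redirects.items():
    print(f"301 {old_path} -> {new_path}")
```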

~~~
franze
which would not fulfill (the basically deprecated) three digit spec for the
first 99 posts (maybe even the first 100).
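
a quick sketch of that count, assuming the rule only needs the id itself to have three or more digits (the helper name is made up). wordpress hands out ids from the same sequence to pages, attachments and revisions, so the number of affected posts is likely smaller than the number of affected ids.

```python
def meets_three_digit_rule(post_id: int) -> bool:
    """Return True if the post id alone has at least three digits."""
    return len(str(post_id)) >= 3


# Ids from 1 to 200 that would fall short of the three-digit requirement.
short_ids = [i for i in range(1, 201) if not meets_three_digit_rule(i)]
print(len(short_ids), short_ids[0], short_ids[-1])  # prints: 99 1 99
```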

also it makes the URL longer, less user friendly and you have to deal with a
URL migration that you might have to revert at some point (i.e.: the example
you mentioned)

additionally: as it is mentioned that the 3 digit requirement is "waived" with
a google news sitemap it is a strong indicator that this is a crawling, not a
ranking directive.

my 2 cents: don't do it, as it would be a clear violation of the golden URL
rule a.k.a. "Don't overdo the f###### URLs!"

------
tokenadult
There is a lot of good information in the post submitted here, which should be
actionable for Google if Google cares about the quality of news results. I'll
note for the record that I have been using Google News as a news aggregator
since the beginning of its existence, and since I almost always use Google News in a
logged-in condition, Google News responds to the way I have trained it about
my news interests by mostly showing me stories from established news
organizations all around the world, and not from spammy linkfarms.

My current gripe about Google News is that it pushes far too many low-quality
or too local news outlets into the Editor's Picks section of my view of Google
News, and even if I click repeatedly as those sources display "Personalize
this news source" to display it rarely, the same podunk local TV station or
trade magazine website will keep displaying in that section over and over and
over. (I have already complained to Google through Google News feedback
channels about this behavior. It should be possible to mark a source as NEVER
appearing in the "Editor's Picks" section and have that selection be
implemented until the user affirmatively turns it off.)

On the whole, I like Google News. But for sure if a site has eyeballs,
spammers will try to grab those eyeballs, so the price of freedom from spam is
eternal vigilance.

------
alternatives
We need alternatives to Google. We can't have one company deciding what gets
seen on the WWW.

It is a great search engine, Google changed the world. But now we need
alternatives. We can't have their editorial staff dictate what should and
should not be seen on the WWW.

If you look closely at the top search results, they're mostly big spenders in
Adwords. Ebay, Amazon, Expedia, TripAdvisor, Yelp, Answers.com and several
others. It's blatant conflict of interests. There is no way those search
results are "organic". Basically the WWW has become controlled and curated by
Google and every site that gets seen must conform or be destroyed.

We need alternatives to Google. We need them urgently.

~~~
a3n
Then stop using Google. Don't use their search, news or email, and don't be
logged in to Google.

There are a handful of good alternatives. Use one. Use multiple.

Until people stop saying "yeah, but those other sites don't give me the same
result as Google" then it's going to always be Google.

It's really up to you.

~~~
_delirium
> There are a handful of good alternatives. Use one. Use multiple.

For search at least, unfortunately, the alternatives are pretty questionable.
I've been using DuckDuckGo experimentally for a few months now, which is
mostly Bing as the backend, with some DDG-specific add-ons and tweaks. I find
myself having to use the !g command to rerun the search in Google somewhere
around 40% of the time. Some of this is site owners causing the problem: a
surprising number of sites block all crawlers in robots.txt, but then
whitelist Googlebot and only Googlebot. But even leaving aside those sites,
DDG seems to miss a lot of results, and the quality of the first-page results,
at least for how I search, is consistently lower.
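
To illustrate the robots.txt pattern I mean, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example URL are made up, not copied from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: everyone is blocked except Googlebot.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def allowed(user_agent: str, url: str = "https://example.com/some-page") -> bool:
    """Return True if this robots.txt lets the given crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


for bot in ("Googlebot", "bingbot", "DuckDuckBot"):
    print(bot, allowed(bot))
```

A crawler that honors a file like this can never index the site, no matter how good its ranking is.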

If you try searching in a language other than English the differences are even
larger. I suspect that's because some of the infrastructure and datasets used
for Google Translate are also used by search in some form, while Bing doesn't
have a similarly solid multilingual stack.

I'd like to see more competition in search, but the barrier to entry to
produce a good full-web search engine seems quite high.

------
graupel
The funny thing is that as a large publisher, it's not links on
news.google.com that we really care about, it's landing in universal Google
search one-boxes as a result of being indexed in Google News - that's where
the traffic really comes from.

~~~
Destitute
Yea, there's a lot of sites that take advantage of this like The Christian
Post who target high search keywords for their "news" that has no relevance to
the actual term.

Search for "Stream NFL games" and "Watch NFL football online free" and other
variants and see them pop up for everything with no mention of actually
streaming games online (obviously).

See: [http://www.christianpost.com/news/carolina-panthers-vs-
seatt...](http://www.christianpost.com/news/carolina-panthers-vs-seattle-
seahawks-live-stream-free-watch-nfl-2015-playoffs-football-online-fox-tv-
start-time-preview-132376/)

~~~
notadoc
Many top sites do this with any popular news term or keyword; it's an
incentive to create content farms. Madness.

------
eli
Google News has always given off this vibe like it's being administered by
someone as their side project. Very curious for one of the most popular news
websites in the world.

~~~
tymekpavel
Without going into too much detail, this is essentially true. The site is
built on top of algorithms that were developed years ago, and there are some
runbooks to help SREs keep the site running. However, there isn't much
investment going on. Occasionally they'll have one engineer or an intern add a
feature, but it's not permanently staffed.

~~~
Scoundreller
Would this change if they started running ads on it?

I guess the question is, is the lack of investment because it's a cost-centre,
or for some other reason?

~~~
larrys
One of the reasons is that Google doesn't seem to want to hire the "b" team.

And the "a" team I'm guessing doesn't want to work in a backwater place with
no glory like "google news".

You know they only hire the best and the brightest and all of that. I'm sure
even the person who gets hired as a janitor is a cut well above the average
janitor. [1]

The funny part is there are plenty of people who would die to work at Google
on anything. But they aren't the type (total conjecture here btw) that would
ever pass the google interview process.

[1] I've always thought this was an interesting paradox. That is someone gets
a job as a maid in the White House and works near the President let's say. So
she/he must have something going for them to get that type of job. But yet
they are still a maid in the white house. You would think if you are able to
land that job you would have risen above that job.

~~~
UrMomReadsHN
>You would think if you are able to land that job you would have risen above
that job.

What an awful thing to say. Some people like doing that type of work. There is
nothing wrong with that. I don't think our society is destined for greatness
when we devalue important work.

As an aside, my aunt did a lot of jobs and always went back to being a cleaner
because that's what she likes to do.

------
jmarbach
I too have found some serious quality control issues with Google News. Some
news sites are redirecting their mobile viewers (knowingly or not) to the app
store instead of giving them the article that they clicked on. I wrote about
this experience here: [http://jmarbach.com/google-news-growth-hack-
exposed](http://jmarbach.com/google-news-growth-hack-exposed)

------
thezach
I reached out to Emily Linnert, one of the real journalists whose photo is
being used on one of the sites, and she's replied. She (or her station) is
taking legal action.

[https://twitter.com/TheZach/status/554450915319492608](https://twitter.com/TheZach/status/554450915319492608)

------
stevenh
I noticed that one of the scam sites is currently listed for sale on Flippa.
On their listing, they even openly boast that the best way to use the site is
to use its Google News status to scam Google's search results:

[https://flippa.com/3500230-google-news-approved-site-
with-14...](https://flippa.com/3500230-google-news-approved-site-
with-143-951-uniques-mo-making-2-132-mo)

Here's a mirror, in case it gets modified or deleted:
[https://archive.today/Jzubr](https://archive.today/Jzubr)

~~~
gnewssheriff
Update: Site has been cataloged and added:

[http://googlenewsspam.com/index_.php](http://googlenewsspam.com/index_.php)

Thanks for your help.

------
zaroth
Where's Matt Cutts when you need him? [1]

[1] - [https://www.mattcutts.com/blog/on-
leave/](https://www.mattcutts.com/blog/on-leave/)

------
kumarski
Thank you for doing this.

------
everydaypanos
+1 for exposing the fake writer profiles. Very amusing!

