
Blogwashing - sessy
https://eggonomy.com/blogs/news/blogwashing
======
nicoburns
Honestly, I feel like this is Google's fault. It shouldn't be prioritising new
content as much as it does. The best articles are often "classics" from
several years ago that haven't changed because they haven't needed to.

~~~
ThrowAway54769
How is this Google's fault? They’re just following the market. People want
“modern” information. They don’t want old stuff from two years ago.

~~~
yowlingcat
What people say or indicate they want and what people need can be two very
different things. The long term potential backlash from solving for the former
should be weighed against the short term gains, but that's tricky!

------
geocrasher
If I wrote an article in 2015 and now freshen it up with more relevant links,
better writing, or some other thing, is it still published in 2015? Should I
re-date it? Am I "gaming" the system by doing that? Or am I notifying my
readers (all 3 of them!) that there is fresh content?

This is actually a real question from me, and I'd love feedback. I look back
on my stuff from a few years ago and the blog posts need fixing up, be it for
SEO, for my more up-to-date 'voice', or because I switched to Gutenberg and
removed janky slider plugins that haven't been updated since 2014. Should I
re-date? Or leave them as-is?

~~~
dorfsmay
Often dates are extremely relevant; for example, a 10-year-old hardware review
won't be as relevant as a new one. On the other hand, blogs don't have to be
static: some programming technique might still be relevant but need updating
for a new version of the language.

Solution: put both a "first published" and a "last updated" date, or use a
wiki that makes revisions available online, with a way to link to the
different revisions.
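
For instance, a minimal sketch of exposing both dates as schema.org structured
data (the post details here are invented; datePublished/dateModified are the
usual article date properties):

```python
import json

# Hypothetical post metadata; in practice this comes from the blog's front matter.
post = {
    "headline": "Some programming technique",
    "datePublished": "2015-06-01",  # first published
    "dateModified": "2019-11-13",   # last updated
}

# schema.org Article markup exposing both dates, so crawlers and readers
# can tell the original publication apart from the latest revision.
json_ld = {"@context": "https://schema.org", "@type": "Article", **post}

print('<script type="application/ld+json">')
print(json.dumps(json_ld, indent=2))
print("</script>")
```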

~~~
mroche
A variant of that, but something I'd like to see is listing the initial
publication date and the current revision/update status (including the initial
one). Then, at the bottom of the post, put a summary of each update so readers
know how the post has evolved.

------
pjc50
Not a particularly detailed post and, as it says, it's an old SEO trick.

But if people want an answer as to why the blogosphere is dead and
everything's on centralised silos: this is why. Any decentralised system that
doesn't take spamfighting into account from the beginning will drown under it
as soon as it becomes popular.

~~~
bagacrap
Is the blogosphere dead, though? I see links on this site to tons of blogs
previously unknown to me. The community has moderated the spam out. Thanks guys!

~~~
rchaud
It is "dead" in the sense that you are more likely to comment on the post on a
third party site like HN or Reddit. A decade ago the conversation would be
happening on the author's site, in the comments section. That died off because
bots and click farms flooded comment boxes with spam and chased away
legitimate visitors.

Authors of popular blogs didn't want to handle comment moderation, and once
they started encouraging discussion directly on FB, Reddit or HN, it was game
over.

------
mfer
The "freshness" issue is one that causes problems and subtly sets priorities.

Problems come up when looking for older content on purpose. A lot of older
content is still very relevant today. I know people who could not find
something they knew existed using Google; they switched to Bing and it came
right to the top. The difference was the way newer content was prioritized.

Google doing this sets priorities. It says that newer is more important. Is
that true? Many would argue it's not. Google says it is and "advises" people
where to go based on that.

I find both of these points worth considering.

------
zafiro17
Prioritizing new content is going to exacerbate a lot of already existing
problems with the WWW. I won't echo the already good list of reasons why old
content is relevant. But I will point out it will usher in a new era of re-
showing you old content that has been repurposed "as new" via minor edits. And
it's going to also help convince a growing audience of web surfers that only
the latest/greatest is worth knowing about. That in turn puts additional
pressure on content producers to obsess endlessly over SEO tricks and
gimmicks. In sum, this worsens the user's experience by prioritizing things
that benefit Google. And this, of course, should be no surprise to anybody who
has watched Google's business decisions over the past 4 or 5 years.

~~~
stebann
Yes, exactly that. Also, Google doesn't take action when you report something,
or they just dismiss the reports: spam blogging, spam results when you're
looking for references, badly translated news sites copying content from
legitimate sources, and the list goes on.

------
rchaud
This is 100% Google's fault. They discourage both "old content" and "thin
content", so guess what happens?

Content writers create "Ultimate Beginner's Guide to X in 2019" articles and
just update the post title and "Last updated" metadata each year.
Nobody's going to create a brand new guide to building muscle or whatever if
the core information doesn't change year to year.

------
antjanus
I can't be the only one thinking that:

1. it's possible the author double-checks the content each year and re-dates
so that visitors know it still applies

2. the author updates the article to be current and re-dates it

It's a little weird to say this is "blogwashing". It's pretty common (for me
at least) to check the date of an article when it's a tutorial so I know
whether it's current or not. And I've seen authors append a "changelog" to the
end of the article so you know it's up to date.

------
nickjj
How does everyone feel about only showing the updated date on the blog post
itself but keeping the real / original published time in the meta tags?

I do that on my site mainly to keep things less cluttered. Every post has an
"Updated on November 12th 2019 in #docker #flask" line at the top of the post
and that date is either the original published date or the last time I updated
the content in the post, but the meta tags are always the correct values (ie.
I don't refresh the published date with the updated date).
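
As a rough sketch, keeping the original date in the markup while the body only
shows the update could look something like this (the helper function is made
up; the Open Graph article:* property names are the standard ones):

```python
from datetime import date

def date_meta_tags(published: date, updated: date) -> str:
    """Emit both dates for crawlers: the original publish date stays intact,
    while the modified date reflects the latest content update."""
    return "\n".join([
        f'<meta property="article:published_time" content="{published.isoformat()}" />',
        f'<meta property="article:modified_time" content="{updated.isoformat()}" />',
    ])

# Example: post first published in 2018, content last touched in November 2019.
print(date_meta_tags(date(2018, 3, 4), date(2019, 11, 12)))
```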

But now it's making me think I should include both the "Posted on" date as
well as a separate "Updated on" date in the presentation of the page itself to
be crystal clear. My only concern is that it will take up more vertical space
on the page, because I can't fit all of that on one line cleanly; I would have
to break the dates and tags onto two lines. For example, this is what a
current line looks like:
[https://nickjanetakis.com/blog/make-your-static-files-production-ready-with-flask-static-digest](https://nickjanetakis.com/blog/make-your-static-files-production-ready-with-flask-static-digest)

~~~
markosaric
On some of my sites, I only list the updated date too. I prefer to keep it
simple for the average visitor. The reason is that these topics are more
time-sensitive, so I regularly update all the posts to list the latest
information, advice, etc. Having the original date there might give a bad
impression to some visitors, who may think the content is old and outdated.

------
codingslave
Search is just getting worse and worse; it's a problem that probably nobody can
solve with the current paradigm. I think in a few years there will be a
significant opening in the search market for an algorithm that manages to
structure information differently.

~~~
tambourine_man
Or, some public database of the crawled web where we can apply our own
algorithm.

I've always dreamed of a grep for the web, for instance. Trying to Google for
code is a pain, even when quoted/verbatim.

~~~
istjohn
This exists: commoncrawl.org
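
For a rough approximation of that, Common Crawl exposes a CDX index API you can
query per crawl (the crawl ID below is just an example and changes with every
release):

```python
import json
import urllib.request
from urllib.parse import urlencode

# Ask Common Crawl's CDX index which captures exist for a URL pattern.
# "CC-MAIN-2019-47" is an example crawl ID; pick a current one from commoncrawl.org.
params = urlencode({"url": "example.com/*", "output": "json"})
index_url = f"https://index.commoncrawl.org/CC-MAIN-2019-47-index?{params}"

with urllib.request.urlopen(index_url) as resp:
    for line in resp:
        record = json.loads(line)
        # Each record points at the WARC file holding the raw capture,
        # which you could fetch and grep locally.
        print(record["timestamp"], record["url"])
```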

------
ddevault
Another thing I see a lot is scammers scraping my blog's RSS feed and re-
publishing my articles on their site, then filling it with adware. Sometimes
the page they show to googlebot is completely unrelated to the page you get
when clicking through.

~~~
imgabe
I thought Google heavily penalized your site if the human version was
different from the Googlebot version.

~~~
kevingadd
How do they detect that? Presumably human review, which can't possibly cover
every malicious page on the internet. I assume if you report the site they
queue it up to be scanned by a human, unless their solution is just to have
versions of googlebot that are harder to detect - possible, but if someone is
already going out of their way to trick googlebot, I don't know how well this
would work in practice.

As a starting point, your not-googlebot needs to spider sites differently from
googlebot (so it can't be detected by traffic analysis), imitate average user
hardware well (GPU acceleration and high GPU performance, but a more
realistically slow network, slower CPU hardware, etc.), use network addresses
that aren't obviously Google's, and imitate user behavior (plausible input
events, scrolling, etc.). This is within Google's capabilities, but it's
definitely an undertaking, and SEO types could eventually identify its
strategies.

~~~
ses1984
> How do they detect that?

Easy: their crawler has a Googlebot user agent. Then they sample some number
of links with a human-like user agent, diff the output, and plug the diff into
some algorithm to assess the score.
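
A naive sketch of that (illustrative user agents; the "score" here is just a
plain similarity ratio, not whatever Google actually uses):

```python
import difflib
import urllib.request

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"

def fetch(url: str, user_agent: str) -> str:
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8", errors="replace")

def cloaking_score(url: str) -> float:
    """Similarity between the page served to a Googlebot UA and to a browser UA.
    Values well below 1.0 suggest the site serves the crawler something different."""
    bot_view = fetch(url, GOOGLEBOT_UA)
    human_view = fetch(url, BROWSER_UA)
    return difflib.SequenceMatcher(None, bot_view, human_view).ratio()

print(cloaking_score("https://example.com/some-article"))
```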

------
ravivyas
Actually, a lot of sites keep posts updated with new information, which is
useful. I believe that makes it harder for Google to figure out who is gaming
SEO and who is providing actual value.

~~~
zo1
How so? Could they not ML-detect the main body/content of the post and do a
simple diff against previous "crawls" of the same content? You would think
this gets picked up and somehow "resolved", because it's essentially a
_duplicate_ post or entry in their crawled-pages DB.
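
Something along those lines, at least. A toy version, assuming the main body
text has already been extracted from the page (the threshold and names are
made up):

```python
import difflib

def looks_redated(old_body: str, new_body: str,
                  old_date: str, new_date: str) -> bool:
    """Flag a post whose visible date moved forward while the body barely changed.
    Dates are ISO strings, so plain string comparison orders them correctly."""
    similarity = difflib.SequenceMatcher(None, old_body, new_body).ratio()
    return new_date > old_date and similarity > 0.95

# Example: the "2019 guide" is the old "2018 guide" with a bumped date.
print(looks_redated("Ultimate guide to X, 2018 edition ...",
                    "Ultimate guide to X, 2019 edition ...",
                    "2018-01-05", "2019-01-06"))
```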

~~~
lolc
> Could they not > ML-detect the main body/content > a simple diff > previous
"crawls" > the same content

Yeah, I could do that in an afternoon :^)

------
WA
What? I thought I read somewhere in Google's webdev docs that they know when
an article was first created vs. updated. Not sure if this "old trick" still
improves the ranking of an article, but sure enough, it baits people into
clicking on a link.

I noticed it a couple times myself. Stuff that's obviously an older article
appears in the SERPs as if it was published a few days ago.

~~~
sct202
It could be a self-fulfilling thing: if the CTR on a search listing increases
(because it appears newer to the user), that listing generally moves up
because of the higher CTR.

------
jakobegger
Another great trick is just increasing version numbers in your blog posts!
People look for content relevant to their version number, so you should just
make sure you have copies of your content with all the version numbers people
might search for!

And with version numbers, you are not limited to dates in the past, you can
even write articles about the future!

Here's a brilliant example:
[https://gorails.com/setup/ubuntu/20.04](https://gorails.com/setup/ubuntu/20.04)

How to set up Rails on Ubuntu 20.04, which will be released in April next
year. You can already read the guide today! Some of the links might not work
yet, because obviously you can't download Ubuntu 20.04 yet, but once it's
released, those guys are bound to be the first ones who had a guide out!

~~~
cardiffspaceman
But won't their article be old in April, and get less viewership than the one
some other author is holding back until day-of?

------
vinaypai
Interesting that the text in the search bar in the screenshot, "AWS Pinpoint
Alternatives", doesn't actually line up with the text at the bottom, which
says "Searches related to Send with SES — AWS Pinpoint Alternative", and that
you only see 5 pages of results. The text at the bottom should repeat the
search phrase verbatim.

I get completely different results on Google if I actually search for the
phrase in the search bar, with no sign of the blog in question. I see zero
evidence that the scummy SEO tactic actually works and a lot more evidence of
a faked "Google" screenshot.

------
wbillingsley
Frankly, I don't see a problem with this if it is done consciously rather than
automatically. *

Usually, I'm interested in currency, not recency. If, say, a technical article
was written in 2015, I don't exactly care that it was _written_ in 2015 but do
care very much whether it's outdated today or not. APIs change, etc. If the
blogger has re-dated the article, that suggests they believe it is still
current, which is useful information to me.

(* - Caveat: no, I've never redated a blog article myself. But I am only a
very infrequent blogger anyway.)

------
lessname
That's not just a thing done by blogs. Several "news" sites like the (German)
t3n.de do this, as do sites like cnet.com etc., mostly with content like "The
best CMS as of {insert year here}", so mostly things you can use years later.
However, it doesn't help if the library isn't receiving any updates anymore. I
don't think it's just to get a better search ranking, but also to manipulate
users (they want to use up-to-date software, etc.).

~~~
flagpack
I've been working for an online news network in Germany for several years.
They hire people (mostly students) to do nothing but update their old articles
on at least a weekly basis, even to the point where articles become almost
unreadable because of so many useless and non-contextual updates. But it works
in terms of ad revenue from Google traffic.

------
jellicle
Vast numbers of people are currently employed writing/updating articles with
titles like "What You Need To Know About Cats In Mid-November 2019".

This is Google's fault.

~~~
UserIsUnused
And it's a 5-year-old article where only the title has been edited. You can
still see the comments from 3 years ago.

------
stebann
The old way of sharing is in danger when big players like Google don't fight
back against spam and black-hat SEO. I try to do my searches on different
search engines so that older content (sometimes well-established articles on
programming) and meaningful new content can both surface. For the past four or
five years, if you search for something on Google, sponsored content goes
first, and sometimes it's just brand propaganda.

------
PretzelFisch
I don't really see the harm in re-dating a blog post to keep it high in search
results. There is a lot of information that, once published, stays relevant
and informative for years; getting dinged on SEO because it is "stale" says
more about Google than about the blog authors. Also, they only have an image,
which doesn't tell you if a link's URL was updated in an edit.

------
myrryr
Oh gods, I hate this. When I'm looking for how to get something done on a
framework that has undergone big changes, I need to look up articles from
within the last 6 months.

But this bullshit makes that really hard.

------
s_gourichon
Some Blog

November 13th 2422

This week, Groaar and Mrumfm have been experimenting with a new invention. We
are considering calling it "wheel". Will keep you informed.

Comments

This is old news. Our tribe has been using it for eons.

------
mikorym
> It's almost 2020... Google knows about this

Google also knows about vertical search [1] and actively destroys anyone who
pops up with a good new algorithm and hopes for a startup.

[1]
[https://en.wikipedia.org/wiki/Vertical_search](https://en.wikipedia.org/wiki/Vertical_search).
I am pretty sure there has been an HN front-page article about a couple with a
vertical search startup that was legally and practically destroyed by Google.

~~~
jessriedel
First, does this have anything to do with the OP article? Second, care to link
to anything to substantiate your claims?

~~~
mikorym
> what does this have to do with OP

OP made the point that Google does not want to address this problem and may
even facilitate it (maybe passively). However, they have actively prevented
progress on search engines that may be more difficult to SEO engineer/hack
(because of their specificity) and particularly have prevented people with
good vertical search algorithms from building a business (read: actively
sabotaged their business). [1]

In any case, Google is also known to quash any other small projects that it
feels challenge it. [2] [3] [4]

[1] [https://www.nytimes.com/2018/02/20/magazine/the-case-against-google.html](https://www.nytimes.com/2018/02/20/magazine/the-case-against-google.html) and [https://news.ycombinator.com/item?id=16420004](https://news.ycombinator.com/item?id=16420004)
[2] [https://news.ycombinator.com/item?id=19553941](https://news.ycombinator.com/item?id=19553941) (web browser)
[3] [https://news.ycombinator.com/item?id=18566929](https://news.ycombinator.com/item?id=18566929) (person's idea taken after an interview)
[4] [https://news.ycombinator.com/item?id=19124324](https://news.ycombinator.com/item?id=19124324) (business)

References [2]–[4] are not specifically important; there are easily a dozen
such complaints if you just search for "google" on HN.

~~~
jessriedel
Wouldn't leaving a persistent flaw in Google search results which could be
avoided by using a vertical search engine be an example of Google assisting
vertical-search start-ups?

~~~
mikorym
Maybe in theory, but if you look in particular at the reference labelled [1],
it seems like they prefer to act as though it doesn't exist, and if other
people try to get vertical search going, they simply block their startups.

------
dbatten
Am I the only one who still bristles when people say blog but mean blog post?
Or is this generally accepted now?

~~~
cmsd2
what's a blog? oh you mean a weblog. /s

~~~
dbatten
I'm assuming your point is that words get shortened for convenience and that
that's OK. That's a fair point. It's also true that language doesn't always
evolve in ways that make sense, and I get that.

With that being said, there's a huge difference between shortening "weblog"
into "blog" and shortening "blog post" into "blog":

First, when "weblog" was shortened to "blog," "blog" didn't already mean
something (and certainly didn't mean anything in the relevant context). When
"blog post" got shortened to "blog," "blog" already had a meaning - AND it
already had a meaning _in the context of the internet_. One of these leads to
confusion, one of them doesn't.

Second, when "weblog" was shortened to "blog," we didn't already have a
shorthand way of saying "weblog." But we've been shortening "blog post" to
"post" basically since the beginning. There was no reason to shorten it to
"blog" also. "Post" was just fine.

I'd argue that a more fair comparison would be if, after using "blog" for a
while, we decided to shorten "weblog" into "web" instead. It would have been
silly, because "web" already meant something, and because we already had a
shorthand version of "weblog" (i.e., "blog") - so why did we need another?

But I guess your sarcasm and the downvotes answer my question anyway. The
internet has accepted "blog" as meaning "blog post." I might as well get on
board.

