
Synthetic Content - pchojecki
https://medium.com/swlh/synthetic-content-9cf5838d8e80
======
ccheney
About 12 years ago I hacked together a "black hat SEO" website using synthetic
content. I picked a niche (baking/recipes) and cobbled together a PHP script
to generate content based on a keyword list I put together. I found a simple
Markov chain text generator and had it spit out a web page for each keyword in
my list. There were hundreds of static HTML pages full of keyword-rich
content. The search engines loved it.
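
For anyone who hasn't seen one, a generator like that can be sketched roughly as follows. This is not the original PHP script, just a minimal Python illustration of the idea; the seed corpus, the keyword list, and the function names (`build_chain`, `generate`, `render_page`) are all made up for the example.

```python
import html
import random
from collections import defaultdict


def build_chain(corpus: str, order: int = 1) -> dict:
    """Map each run of `order` words to the words seen right after it."""
    words = corpus.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain: dict, length: int, rng: random.Random) -> str:
    """Random-walk the chain; jump to a random state on a dead end."""
    state = rng.choice(sorted(chain))
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (e.g. the last word of the corpus): restart elsewhere.
            state = rng.choice(sorted(chain))
            out.extend(state)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out[:length])


def render_page(keyword: str, chain: dict, rng: random.Random,
                length: int = 40) -> str:
    """One static HTML page per keyword, with the keyword in title and h1."""
    title = html.escape(keyword)
    body = html.escape(generate(chain, length, rng))
    return (f"<html><head><title>{title}</title></head>"
            f"<body><h1>{title}</h1><p>{body}</p></body></html>")


# Hypothetical seed text and keyword list, purely for illustration.
CORPUS = ("bake the pie until golden then cool the pie before "
          "serving the pie with cream")
KEYWORDS = ["apple pie", "sweet potato pie"]

rng = random.Random(0)
chain = build_chain(CORPUS)
pages = {kw: render_page(kw, chain, rng) for kw in KEYWORDS}
```

With a first-order chain like this the output only has to be locally plausible, one word to the next, which is why the resulting pages read as keyword-heavy filler rather than coherent prose.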

I traded backlinks through a website (whose name escapes me) where you'd place
a JavaScript snippet on the page and it would load 10 random links to other
websites. I think you could also pick which sites you wanted your link to
appear on and they'd pay a certain price per month to keep it there. It was
all about boosting PageRank in Google through a keyword link.

The idea was to create a website for search engines, not for humans. It was
about gaming the search engines and, as a side effect, creating organic traffic
to the site where a human would then be bombarded by Google AdSense. The ads
were "relevant" to my generated content and keywords.

It only took Google about a month or so to blacklist the domain. I think I
made about $400 in ad revenue in just a few weeks. IIRC, the threshold to
pay out your AdSense revenue was $1k and I never reached it.

All in all, it was a fun little project that furthered my interest in web
development.

------
merlincorey
Maybe I've just been following computer generated content too much, but it was
fairly obvious to me when reading the WallStreetHack generated website that it
was in fact generated content.

For example, the following paragraph seems to me to display a clear synthetic
generation because the phrasing is very repetitive and the context is never
really expanded, even though we have several sentences (which a human writer
would almost certainly have condensed):

> One of the ways that a creditor knows if you owe money is when you pay the
> balance. But, how does an insurance company know if someone has a medical
> condition or is in medical crisis? Well, it does this to see if you have
> insurance or are suffering from a medical emergency. In order to determine
> whether you have insurance, a medical professional performs a background
> check in order to determine whether any medical problems may exist. These
> background checks can show whether or not you are in medical crisis and
> whether or not payment of the debt is being considered.

~~~
Baeocystin
If I'd come across those sentences while browsing, I'd have assumed that it
was some low-grade SEO blogospam written by a Mechanical Turk participant.
That some Markov-chain-with-extra-steps generated it instead is of mild
interest.

But! I am also 100% certain that a lot of people, particularly non-technical
ones, wouldn't even think that it could have been machine-written, even as
kludgy as it is. I think there is very real reason to be concerned by this,
although I am at a loss as to what to do about it.

------
fitzroy
"No Bake Apple and Sweet Potato Pie" ... "bake in a 375º oven"

Have to say, I'm a bit disappointed with this recipe.

------
tudorw
"Some corn grains can actually work like magnets, helping us keep energy
levels the highest we can. "

Good to know!

~~~
Baeocystin
...we should get these bots to start generating ad copy for the high-end audio
world, they'd fit right in!

------
hayksaakian
It seems like we're close, but we're not there yet.

Read any of the articles on the two sites mentioned in the OP and they are
clearly nonsense.

~~~
ThinkingGuy
Assuming this technique were put into practice in the wild, by the time its
target realized that the content was garbage, it would already be too late:
the page (and any associated advertising/tracking content) would have already
loaded.

