
Deconstructing Pinterest’s reverse-image-search SEO growth hack - jenny8lee
https://www.rankscience.com/blog/pinterest-image-seo-growth-hack
======
smcameron
Back when I was an SRE at google for their web crawler, I thought they should
de-index pinterest (did not get any traction with this). I still think they
should de-index pinterest. I add "-site:pinterest.*" to all my image searches.

~~~
superasn
I wish we could do the same for Shutterstock and all these useless stock image
websites that have totally hacked Google image search yet Google won't do
anything about them for some reason.

~~~
nine_k
How do you search for previews of commercially available images then?

What I would like is an open protocol for media licenses. Some media-
license.txt file with names of images (and wildcards) and license info.

    
    
      hero.jpg CC0
      *.PNG PD # public domain
      # A free preview of a proprietary image.
      Teaser.jpg preview
    

Everything not marked is considered proprietary, like now by default.

Then sites like shutterstock could clearly mark their previews of real images,
and galleries of free images could mark their images as free to use, etc. This
could be reflected in the search engines' UI.
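
A minimal sketch of how a crawler might read such a file, assuming the format shown above: one `pattern license` pair per line, `#` starting a comment, and shell-style wildcards. The function names, the first-match-wins rule, and the case-sensitive matching are assumptions made for illustration; none of this is an existing standard.

```python
from fnmatch import fnmatchcase

# Anything not listed falls back to this, per the proposal above.
DEFAULT_LICENSE = "proprietary"


def parse_media_license(text):
    """Return a list of (pattern, license) rules in file order."""
    rules = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # blank line or comment-only line
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip malformed lines (assumption: ignore, don't fail)
        pattern, license_tag = parts
        rules.append((pattern, license_tag.strip()))
    return rules


def license_for(filename, rules):
    """Return the license of the first matching rule, else the default."""
    for pattern, license_tag in rules:
        if fnmatchcase(filename, pattern):  # case-sensitive wildcard match
            return license_tag
    return DEFAULT_LICENSE


EXAMPLE = """\
hero.jpg CC0
*.PNG PD # public domain
# A free preview of a proprietary image.
Teaser.jpg preview
"""

rules = parse_media_license(EXAMPLE)
print(license_for("hero.jpg", rules))
print(license_for("unlisted.gif", rules))
```

File names containing spaces or `#` would need quoting rules that this sketch does not attempt to define.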

~~~
zozbot234
Doesn't Google support this already as an advanced search feature?

------
xg15
Just mentioning that Pinterest used to boast on their engineering blog how
they designed the non-dismissable nag-screen [1] and how they increased
engagement by making their landing pages more confusing [2].

Not surprised they hack their way through search.

[1] [https://medium.com/pinterest-engineering/lessons-in-growth-a...](https://medium.com/pinterest-engineering/lessons-in-growth-and-increasing-signups-d6e7d8fef479)

[2] [https://medium.com/pinterest-engineering/lessons-in-growth-e...](https://medium.com/pinterest-engineering/lessons-in-growth-engineering-how-we-doubled-sign-ups-from-pin-landing-pages-1c0bc400cdb9)

~~~
neotek
If that's the stuff they're willing to openly brag about, it makes me wonder
what dirtier things they're doing behind closed doors.

------
alpha_squared
This is fascinating in a potentially terrible way. In summary: when an image
is posted, a reverse-image search is done and top results are scraped then
added to a "More like this" section for the post. This ranks highly because
it's _exactly_ what Google already associates with that image, except now all
in one page instead of across multiple pages.

Applying this same method to other content (blog posts/pages and video posts)
would, presumably, also work. The terrible part of all this is that it would
create more junk posts ranking higher and stratifying the word association of
that content with the respective search terms. New content could end up having
zero chance of _ever_ ranking highly because another factor of result ranking
is content age (older content is weighted more heavily).

Am I understanding all that correctly?

~~~
nullc
> Applying this same method to other content (blog posts/pages and video
> posts) would, presumably, also work.

I believe reddit recently started doing something like this for text and as a
result has made google searching for reddit posts essentially useless.

Basically when you view any post/thread while logged out there are a bunch of
other threads shown on the page that have related text, with all the text of
their posts included collapsed in the HTML.

The result is that when you search for the context of any reddit post you get
hundreds of results on vaguely similar topics which don't contain the
post that you're looking for (unless you log out and count hidden text
collapsed under other threads).

~~~
specialist
_"...all the text of [the related] posts included collapsed in the HTML"_

Huh. Facepalm.

Is there any way to exclude portions of content from indexing?

Like maybe inlining robots-directed pragmas:

    
    
      <span data-robots="noindex"> blah blah blah </span>
    

Or using meta tags to spec exclusions:

    
    
      <meta name="robots" content="noindex:.related-posts" />
    

FWIW:

[https://en.wikipedia.org/wiki/Robots_exclusion_standard#Meta...](https://en.wikipedia.org/wiki/Robots_exclusion_standard#Meta_tags_and_headers)
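
To make the first idea concrete, here is a rough sketch of what an indexer honoring it might do, assuming the hypothetical `data-robots="noindex"` attribute from the example above. That attribute is not something any crawler is known to respect; the class and function names are invented for illustration, and the depth counting assumes reasonably well-formed HTML (every non-void tag inside an excluded region is closed).

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class IndexableText(HTMLParser):
    """Collect text, skipping subtrees marked data-robots="noindex" (hypothetical)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0  # > 0 while inside an excluded subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth += 1  # nested element inside an excluded subtree
        elif dict(attrs).get("data-robots") == "noindex":
            self.skip_depth = 1  # entering an excluded subtree

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def indexable_text(html):
    """Return the whitespace-normalized text outside excluded subtrees."""
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


page = ('<p>Main post</p>'
        '<span data-robots="noindex"> blah <b>blah</b> blah </span>'
        '<p>Reply</p>')
print(indexable_text(page))
```

The selector-style meta tag variant would need a CSS selector engine on the crawler side, which is one reason an inline attribute seems like the simpler of the two ideas.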

~~~
nullc
I assumed that the practice was intentional to make reddit be more heavily
represented in google results.

Reddit staff doesn't actually use the site that much (or at least that was my
impression a couple years ago after having a meeting in the office and finding
that I knew a lot more about the meme-art that users sent them than their
staff did)-- so it wouldn't be shocking that they'd be indifferent to making
google search unusable for users.

------
stelonix
For some years now, Google Images has been pretty awful to navigate, all
thanks to Pinterest. I do hope Google is just unaware of the mentioned "hack",
because the other option is they do not care because Pinterest helps them
through ads.

Note: I have not checked whether Pinterest has Adsense since I use Adblock.
I'm going by some of the comments here.

A couple of days ago I was trying to find better image search websites
because google removed the feature to find images of an exact size. I was
brought to Bing, but it also lacks that feature. It kept me up that night
thinking how to make a better service and whether I'd have to scrape Google,
which made me think of webcrawlers, site indexing... It's turtles all the way
down, I can't imagine a solo developer (or startup) pulling it off. I know
there are decentralized search attempts, but anything decentralized simply
does not work for 2020.

So I ask here on HN, how could a power-user friendly http image search service
work without depending on big corps? Is it just impossible and we need to keep
praying billionaire CEOs will listen to us, or is it simply something no one has
done yet?

~~~
mthoms
The first problem is monetization. Do you think people would pay a yearly
subscription fee for a better image search? I'm not sure they would, but
maybe.

------
kristopolous
They speak so positively about flooding the internet with useless garbage
results

------
mdoms
This doesn't make a lot of sense to me, and it doesn't seem to be backed by
any evidence, either in the original Twitter thread or in the linked blog post.

I think far more likely what's happening is Pinterest is actively crawling the
web for images and surrounding text. They will then perform their own reverse
image search on _their own_ database, and add this context text to their
existing copy of the image, or create a new one if one is not present.

Why would they rely on a user submitting a picture then go to the trouble of
reverse searching it? I think they are much more proactive than that.

~~~
stevesearer
A lot of business accounts sign up to activate Rich Pins, which adds a few
additional features when people pin from your site.

When you are signed up for this they will scrape the page the pinned image
comes from to collect some extra info on the page to show along with the pin.

I’m not sure if they do this when you are not signed up for Rich Pins,
though.

------
ryanb
Hey folks, I'm the author here and CEO of RankScience (YC W17). AMA about this
growth hack or SEO in general!

~~~
cj
In your opinion, is this a growth hack you'd suggest one of your customers
try? Or does it cross a line in your opinion?

"SEO Optimization" is difficult because it's such an imperfect science (or
perhaps not, based on your company name!) What are a couple examples or
anecdotes of SEO "hacks" / optimizations that are least obvious or things most
people wouldn't think of, that have had the most significant impact on SEO?

~~~
ryanb
I think it crosses the line and I'm surprised Google hasn't issued them a
warning, or maybe they have and we just haven't heard about it.

Organic search is still where 80% of the clicks are in search. (only 20% go to
paid - though this is changing)

These are easy wins and are considered "white hat" and legitimate tactics:

* Title tag optimization with SEO A/B testing - increase CTR from Google by 10-40%

* Using NLP Content Optimization to increase content relevancy

* Optimize internal linking structure of your site so Google can easily understand your site hierarchy and spread value to important pages

* Build natural backlinks by creating interesting content that people want to engage with and link to (like this post!)

[https://www.rankscience.com/coderwall-seo-split-test](https://www.rankscience.com/coderwall-seo-split-test)

------
stevesearer
Pinterest does a lot of SEO stuff that is interesting.

For one of my boards that ranks #1 in Google I've found that the page Google
indexes is quite a bit different than the one I see as a logged-in user.

One of the differences is that they display the text content associated with
the pin. This is also used as the image alt text, but then appended with a
bunch of keywords.

They also link to other people's boards which have names related to the images
so it looks like "tags", but I have the feeling it is probably a mix of
keyword stuffing/linking to other content for Google to follow.

The page title is also adjusted to include something like, "237 Best ________
images in 2020" followed by the board name.

------
blinding-streak
There are certain sites like Wikipedia that improve the web's utility for all
who use it, small and giant companies alike. Google search is improved by
Wikipedia's existence.

And then there are sites like Pinterest, that degrade the web's utility for
the vast majority of people on Earth. It actually harms Google's search
experience (obviously image search) and frustrates those of us who try to benignly
browse images. Why does Google let Pinterest get away with its user-hostile
approach?

~~~
nine_k
An obvious question: does Pinterest buy a lot of Google's ad services?

~~~
foobarian
Another obvious question: does Google pay for the valuable free content they
get from Wikipedia?

~~~
blinding-streak
"The licenses Wikipedia uses grant free access to our content in the same
sense that free software is licensed freely. Wikipedia content can be copied,
modified, and redistributed if and only if the copied version is made
available on the same terms to others and acknowledgment of the authors of the
Wikipedia article used is included (a link back to the article is generally
thought to satisfy the attribution requirement; see below for more details).
Copied Wikipedia content will therefore remain free under an appropriate
license and can continue to be used by anyone subject to certain restrictions,
most of which aim to ensure that freedom."

From
[https://en.wikipedia.org/wiki/Wikipedia:Copyrights](https://en.wikipedia.org/wiki/Wikipedia:Copyrights)

~~~
not2b
Besides, Wikipedia isn't an ad-supported site, so Google's practice of
including their text in the search result doesn't cost them ad revenue. You
could argue that it means people will see the banner requesting donations less
often, but that's compensated for by Google's own donations to them. So while
I have definite concerns about some of Google's other practices, that one is
no big deal.

------
jwegan
Hey, I run the Growth team at Pinterest. I just wanted to comment on the
article to be very clear and say we have never scraped Google search results
either currently or at any time in the past.

~~~
wolrah
I have two questions for you then.

First, if you insist that you have not scraped data then can you offer an
alternative explanation for the data being presented here?

Second, it's pretty clear that the majority of at least the tech community
hates what you do with the regwall. Do you not know how bad your site's
behavior makes GIS for those of us who aren't interested in joining it? Or do
you just not care?

~~~
notacoward
I'm guessing that they use one or more third parties that do indeed scrape
Google results but provide plausible deniability in the process. And I'd look
very closely at the employment histories of principals at those nominal
"third" parties. It's not that uncommon in this industry to find companies
with exactly one customer, staffed entirely by ex-employees of that one
customer.

~~~
shostack
Interesting. Can you cite any examples if it isn't uncommon?

------
solarkraft
"growth hack" is a great way of saying spam.

------
boto3
To me the most interesting bit in the article is they managed to scrape Google
results at scale.

------
musicale
> "If so, it's the greatest SEO scam of all time."

Pretty much.

Hopefully Google will fix the bugs that allow search results to be ruined by
Pinterest spam.

------
person_of_color
Pinterest goes in the PiHole.

